RUTE follows a modular pipeline that combines THz time-domain spectroscopy with image processing to address challenges in cultural heritage preservation. The THz time-domain system operates in reflection mode: THz pulses are reflected at the paper-air gaps between pages, and the detector records them as a mixture of pulses. The RUTE pipeline has four primary stages: 1) data collection; 2) THz pulse separation; 3) image restoration; and 4) evaluation of results and pipeline tuning.
- Dataset creation: We prepare well-known ancient inks, such as iron-gall ink, following the ingredients, methods, and mechanisms described in historical treatises, such as medieval texts. We then use these inks to create mockups that are scanned with the THz system. We also collect close-range images with a hyperspectral camera to investigate the behavior of different inks in the near-infrared spectral region.
- Separating information corresponding to different pages of a closed book: The detector of the THz system records a mixture of THz pulses that correspond to multiple paper-air gaps. We separate this mixture into individual pulses, each corresponding to a single page of the closed book. Some inks behave in a distinctive way in the THz domain, which we can use to further enhance the image of a single page.
- Image restoration: Our team uses image restoration to improve the quality of THz time-domain and hyperspectral images of ancient manuscripts. This process removes unwanted noise and blur, resulting in sharper and more readable text. We achieve this by exploiting the known properties of hyperspectral data together with prior knowledge about text-containing images.
- Evaluation and interpretation of results: At RUTE, we work closely with palaeographers and archivists to evaluate and interpret the results of our image restoration process. The evaluation process is subjective and is carried out by the experts themselves. We start with the preliminary results after the signal separation step and then continue with cycles of results evaluation and pipeline tuning after image restoration. Tuning the pipeline is an iterative process and requires constant input and supervision from an archivist to ensure that the final results are of high quality.
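
The pulse separation stage can be illustrated with a small, idealized sketch. It is not the RUTE separation method, which is not specified here. It assumes that the reflections from successive paper-air gaps arrive well separated in time, so that simple peak detection and windowing is enough. The function names, the Gaussian pulse shape, and all numeric values are hypothetical choices made for illustration. Real traces contain overlapping, dispersed pulses and would likely need deconvolution or a model-based approach.

```python
import math


def synth_trace(n, centers, amps, width):
    """Synthetic stand-in for one measured THz trace: a sum of Gaussian pulses.

    Each (center, amp) pair mimics the reflection from one paper-air gap.
    """
    return [
        sum(a * math.exp(-0.5 * ((i - c) / width) ** 2)
            for c, a in zip(centers, amps))
        for i in range(n)
    ]


def find_peaks(trace, threshold, min_distance):
    """Return indices of local maxima of |trace| that exceed `threshold`.

    Candidates are accepted greedily from strongest to weakest, and a
    candidate is dropped if it lies closer than `min_distance` samples
    to an already accepted peak.
    """
    mag = [abs(v) for v in trace]
    candidates = [
        i for i in range(1, len(mag) - 1)
        if mag[i] >= threshold and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]
    ]
    kept = []
    for i in sorted(candidates, key=lambda k: -mag[k]):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


def split_windows(trace, peaks, half_width):
    """Cut one time window around each peak: one window per assumed page."""
    return [trace[max(0, p - half_width): p + half_width + 1] for p in peaks]


# Hypothetical example: three reflections with decreasing amplitude.
trace = synth_trace(300, centers=[60, 140, 220], amps=[1.0, 0.6, 0.35], width=5.0)
peaks = find_peaks(trace, threshold=0.1, min_distance=20)
windows = split_windows(trace, peaks, half_width=15)

# One amplitude per page for this scan position; repeating this for every
# position of a raster scan would give one amplitude image per page.
page_amplitudes = [max(abs(v) for v in w) for w in windows]
```

In this sketch the threshold and minimum peak distance play the role of tuning parameters, similar in spirit to the settings that are adjusted during the evaluation and pipeline tuning cycles described above.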