Machine Learning Projects

Active Projects 

  • Detection of Archaeological Sites on Earth Observation Data

    Sub-surface or hidden Cultural Heritage (CH) sites can be discovered through Earth Observation (EO) data from a variety of sensors (e.g., hyperspectral, multispectral, LiDAR) by identifying and analyzing anomalies or traces on bare soils, crops or vegetation that could be connected to the presence of archaeological deposits beneath them. This project aims to develop a self-supervised deep learning architecture to automatically identify sub-surface CH sites using the wealth of unlabeled EO data produced every day by airborne and spaceborne sensors. A key feature of these novel methods is the integration of GIS-based information into the learning process, which reduces false positives and consequently increases the reliability of the detections.

    Collaborations: European Space Agency (ESA), Italian Space Agency (ASI). 
    Linked funded project: Cultural Landscapes Scanner (CLS)

  • Identification of Looting Activities on Earth Observation Data

    Illegal excavation of archaeological sites to collect historical material culture and introduce it into the illicit antiquities market (‘looting’) is a pressing problem on a global scale. Under favorable circumstances, looting can be exposed in Earth Observation (EO) data by detecting changes that have occurred between two or more consecutive EO images of a time series. The project focuses on the design of a semi-supervised change detection pipeline composed of an unsupervised deep neural network, which learns a low-dimensional representation of EO images, followed by a semi-supervised graph neural network that detects changes related to looting activities. As an alternative, an optimal transport-based approach will be developed to monitor pillaging activities (past and ongoing) through the analysis of time series of LiDAR data as part of the MSCA-IF OPTIMAL project.

    Collaborations: European Space Agency (ESA), Italian Space Agency (ASI). 
    Linked funded project: OPTIMAL
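
The optimal transport idea mentioned for LiDAR time series can be illustrated with a minimal sketch. It is not the OPTIMAL project's method: it assumes each LiDAR patch is reduced to a list of elevation samples, compares the two acquisition dates with the one-dimensional Wasserstein-1 distance (for equal-size samples with uniform weights, the mean absolute difference of the sorted values), and flags patches above a threshold. The function names, the sample elevations and the threshold are all hypothetical.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    With uniform weights and equal sizes, the optimal transport plan
    matches the sorted values one to one.
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    if not a:
        raise ValueError("samples must be non-empty")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def flag_changed_patches(epoch_a, epoch_b, threshold):
    """Indices of patches whose elevation distribution moved more than threshold."""
    return [
        i
        for i, (patch_a, patch_b) in enumerate(zip(epoch_a, epoch_b))
        if wasserstein_1d(patch_a, patch_b) > threshold
    ]


# Hypothetical elevations (metres) for two patches at two acquisition dates.
before = [[10.0, 10.1, 10.0, 10.2], [12.0, 12.1, 12.0, 12.1]]
after = [[10.0, 10.1, 10.0, 10.2], [12.0, 11.2, 11.0, 12.1]]

# The threshold is an assumed value chosen only for this illustration.
changed = flag_changed_patches(before, after, threshold=0.2)
print(changed)
```

In this toy input only the second patch, where part of the surface has dropped by about a metre, exceeds the threshold. A real pipeline would need co-registered point clouds and a threshold calibrated against sensor noise.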

  • Automatic Transcription of Medieval Handwritten Manuscripts

    The automatic transcription of document images gives cultural heritage experts a more efficient way to access the content of historical documents and extract meaningful information, while also making it easier to search through large sets of pages. Historical documents frequently suffer from severe degradation, which makes their transcription a challenging task. This project aims to develop a novel handwriting recognition model for historical documents, taking advantage of recent deep-learning approaches combined with conventional computer vision techniques.

  • A Context-aware Approach for Historical Document Analysis

    The transcription of handwritten historical texts of cultural value is necessary to make their content widely available. Many Handwritten Text Recognition (HTR) models have been developed for this task, but even cutting-edge methods are far from perfect. Reasons include deterioration of the writing medium (caused by handling as well as creases, scratches, ink stains, and bleed-through) and the variability of the handwriting. This project aims to define a new end-to-end transcription model, further enhanced by the ability to exploit contextual information. Contextual information is a widely used but variously interpreted concept in computer vision. Here it means a labeling approach that also considers the relationship between the item to be classified and the information in its neighborhood. Defining this concept precisely is crucial to obtaining accurate classification models. Our approach focuses on defining a new model capable of achieving consistent labeling in HTR.
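
As a rough illustration of what taking the neighborhood into account can mean for labeling, the sketch below adds a weighted share of each neighbor's class scores to a position's own scores before choosing the label. This is a generic neighborhood score smoothing, not the project's model; the function name, the `neighbor_weight` parameter, the class names and the scores are all hypothetical.

```python
def contextual_labels(scores, neighbor_weight=0.5):
    """Pick one label per position using its own and its neighbors' scores.

    scores: list of dicts mapping label -> score, one dict per position
            in a left-to-right sequence (e.g., patches along a text line).
    """
    labels = []
    for i, own in enumerate(scores):
        combined = dict(own)
        # Add a weighted contribution from the immediate left and right neighbors.
        for j in (i - 1, i + 1):
            if 0 <= j < len(scores):
                for label, score in scores[j].items():
                    combined[label] = combined.get(label, 0.0) + neighbor_weight * score
        labels.append(max(combined, key=combined.get))
    return labels


# Hypothetical per-patch scores: the middle patch is ambiguous on its own.
patch_scores = [
    {"text": 0.9, "stain": 0.1},
    {"text": 0.45, "stain": 0.55},
    {"text": 0.8, "stain": 0.2},
]

independent = [max(s, key=s.get) for s in patch_scores]
with_context = contextual_labels(patch_scores)
print(independent)
print(with_context)
```

Labeled independently, the middle patch is classified as a stain. With its neighbors' scores taken into account it is labeled as text, consistent with the patches around it.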

  • Ancient Cuneiform Tablet Image Analysis 

    Despite the considerable number of cuneiform tablets already discovered, and new ones continuously being excavated or identified in museums all over the world, many remain unstudied because of the inherent difficulty of their analysis, a hard and time-consuming activity even for domain experts. The aim of this project is to apply advanced machine learning and computer vision techniques to develop automatic methods for analyzing images of ancient cuneiform tablets and processing their structures, contents, and visual properties.

  • Layout Extraction from Historical Document Images

    Layout extraction is a key pre-processing step in document image analysis that segments the input image into its homogeneous regions, i.e., blocks of text, side notes, drawings, and tables. This step facilitates subsequent procedures in document image analysis applications such as optical character recognition and automatic transcription. This project aims to develop a model for layout extraction in historical documents. Unlike modern documents, historical manuscripts often lack a regular layout or a structured text arrangement. To tackle this problem, the project focuses on developing a learning-based approach that generalizes to a wide range of documents with various layouts.
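
To see why irregular layouts are a problem, it helps to look at a conventional baseline. The sketch below is a horizontal projection profile, a classical technique and not the learning-based approach this project develops. It assumes a binarized page given as rows of 0 (background) and 1 (ink) and splits it into bands of consecutive rows that contain ink. The function name, the `min_ink` parameter and the toy page are hypothetical.

```python
def text_line_bands(binary_image, min_ink=1):
    """Return (first_row, last_row) bands of consecutive rows containing ink.

    binary_image: list of rows, each a list of 0 (background) or 1 (ink).
    min_ink: minimum number of ink pixels for a row to count as inked.
    """
    bands = []
    start = None
    for r, row in enumerate(binary_image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = r                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, r - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band running to the bottom edge
        bands.append((start, len(binary_image) - 1))
    return bands


# Toy 6x5 page with two well-separated "text lines".
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
]

bands = text_line_bands(page)
print(bands)
```

This works only when regions are separated by empty horizontal gaps. Side notes beside the main text, skewed lines or drawings that overlap text rows merge into a single band, which is the kind of failure that motivates a learned model.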