FacsimMu dataset, Images, Ink Restauration Studio, Google Colab notebook

This page contains materials used for the paper “Comparing Shapes, Not Noise: Human–Machine Clustering of Handwritten Greek Characters from Papyri”, submitted to the IEEE Workshop on Historical Handwriting Analysis, Natural Language Processing and Knowledge Graphs (HHA-NLP-KG), to be held in Venice, Italy (hybrid event), September 7–9, 2026:

  • The FacsimMu_Dataset, composed of small images of individual letters ("cliplets") divided into two subsets (original and redrawn), with a dataset description and metadata.
  • Full-size images included in the article (Images).
  • Ink Restauration Studio, a standalone web application to be opened in Google Chrome or any modern browser, with a description of how to use it.
  • Google Colab Notebook that allows the user to run the same experiments as described in the article (with the provided dataset) or similar ones, with notes on fine-tuning the code (forthcoming).
  • Visual outputs based on DINOv2, UMAP and HDBSCAN, as well as similarity scores, heatmaps, etc. (DINOv2_Output_20260505).
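For readers unfamiliar with the similarity scores and heatmaps mentioned in the last item, the following is a minimal sketch of how a pairwise similarity matrix can be computed from image embeddings. It assumes cosine similarity, which this page does not confirm as the metric used in the paper, and it uses made-up three-dimensional toy vectors in place of real DINOv2 embeddings. The function names `cosine_similarity` and `similarity_matrix` are illustrative only and are not taken from the notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities; entry [i][j] compares items i and j."""
    return [[cosine_similarity(u, v) for v in embeddings] for u in embeddings]


# Toy vectors standing in for real image embeddings (hypothetical data).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]

matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

A matrix of this kind is what a heatmap would typically visualise: each cell is coloured by the similarity between two cliplets, so groups of similar letter shapes appear as bright blocks.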