AlphEpMu dataset and Stylistic Similarities in Greek Papyri based on Letter Shapes

This page contains material used for the paper "Stylistic Similarities in Greek Papyri based on Letter Shapes: A Deep Learning approach", published in the 2nd Int. Workshop of Computational Paleography (ICDAR 2023); see the article here. It provides:

  • The AlphEpMu dataset, composed of small images of individual letters, or "cliplets", divided into three subsets (alphas, epsilons and mus)
  • Description of AlphEpMu in CSV format
  • Appendix A, describing AlphEpMu*, a dataset that was used in the experiments described in the article
  • Appendix B, describing AlphEpMu-72, a subset of 72 papyri selected for style comparison (with metadata on date, provenance, etc.), and its cliplet counts
  • Similarity scores obtained on AlphEpMu-72 for the three letter categories, separately and merged
  • Code on GitHub
  • Other resources (heatmaps of similarity scores per letter category, visualization tool for clusters)

Manuscripts are referred to by their Trismegistos number (

Images are visible on PalEx, but note that some tags regarding the preservation state (bt 1, 2, etc.) may have changed since the AlphEpMu dataset was extracted.
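
As a rough illustration of how the CSV description might be used, the sketch below counts cliplets per papyrus and letter category, similar in spirit to the cliplet counts given in Appendix B. The column names (tm_number, letter, filename) and the sample rows are assumptions made for this example; they are not taken from the actual AlphEpMu file, whose header may differ.

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical column names: adjust them to the real CSV header.
PAPYRUS_COLUMN = "tm_number"
LETTER_COLUMN = "letter"


def count_cliplets(csv_file):
    """Return a Counter mapping (papyrus id, letter category) to a cliplet count.

    Each row of the CSV is assumed to describe exactly one cliplet.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        counts[(row[PAPYRUS_COLUMN], row[LETTER_COLUMN])] += 1
    return counts


# Made-up rows for illustration only; they do not come from AlphEpMu.
sample = StringIO(
    "tm_number,letter,filename\n"
    "TM_A,alpha,cliplet_1.png\n"
    "TM_A,alpha,cliplet_2.png\n"
    "TM_A,epsilon,cliplet_3.png\n"
    "TM_B,mu,cliplet_4.png\n"
)

counts = count_cliplets(sample)
for (papyrus, letter), n in sorted(counts.items()):
    print(papyrus, letter, n)
```

To run this on the real description file, the in-memory sample would be replaced by a file opened with open(path, newline=""), and the two column constants set to match the actual header.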