AlphEpMu dataset and Stylistic Similarities in Greek Papyri based on Letter Shapes

This page contains material used for the paper "Stylistic Similarities in Greek Papyri based on Letter Shapes: A Deep Learning approach", published at the 2nd Int. Workshop on Computational Paleography (ICDAR 2023); see article here. The material comprises:

  • The AlphEpMu dataset, composed of small images of individual letters, or "cliplets", divided into three subsets (alphas, epsilons and mus)
  • Description of AlphEpMu in CSV format
  • Appendix A, describing AlphEpMu*, a dataset that was used in the experiments described in the article
  • Appendix B, describing AlphEpMu-72, a subset of 72 papyri selected for style comparison (with metadata on date, provenance, etc.), and its cliplet counts
  • Similarity scores obtained on AlphEpMu-72 for the three letter categories, separately and merged (forthcoming)
  • Code on GitHub
  • Other resources (heatmaps of similarity scores per letter category, visualization tool for clusters - forthcoming)

Manuscripts are referred to by their Trismegistos number (

Images are visible on PalEx, but note that some tags regarding the preservation state (bt 1, 2, etc.) may have changed since the AlphEpMu dataset was extracted.

Modifications to the currently available material compared with that used for the paper:

- 23.10.2023: to avoid confusion, we replaced the TM number mistakenly given as 60847* with the correct one, 60515, in the dataset, CSV files and appendices.
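
For anyone holding a copy of the CSV description downloaded before this correction, the same substitution could be applied locally. The following is only a minimal sketch, not part of the released code: the column name `tm`, the helper `correct_tm_numbers`, the sample rows and the second TM number in them are all hypothetical, and the actual column layout of the AlphEpMu CSV files should be checked first.

```python
import csv
import io

# Correction announced on 23.10.2023: 60847 was a mislabel for 60515.
TM_CORRECTIONS = {"60847": "60515"}


def correct_tm_numbers(rows, column="tm", corrections=TM_CORRECTIONS):
    """Return copies of the rows with mislabeled TM numbers replaced.

    `column` is an assumed column name; adapt it to the real CSV header.
    """
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        value = row[column].strip()
        row[column] = corrections.get(value, value)
        fixed.append(row)
    return fixed


# Hypothetical two-row sample standing in for an older CSV copy.
sample = "tm,letter\n60847,alpha\n61234,mu\n"
rows = list(csv.DictReader(io.StringIO(sample)))
fixed_rows = correct_tm_numbers(rows)
```

Rows whose TM number is not listed in the mapping are passed through unchanged, so the function can be run safely on files that already contain the corrected number.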