Hell-Date (EGRAPSA Hellenistic Dated Papyri Dataset)

This dataset comprises 187 images of 155 papyri from the Hellenistic period (3rd to 1st c. BCE, more precisely from -310 to -3) that are precisely dated (within two years).

For each papyrus, the following identifiers are used:

The Hell-Date.zip archive contains the following files:

  • data.csv lists the 187 images and gives, for each image, a standard name, the location, collection name, inventory number, and the link to the online file.
    • Names are standardised across the csv as TMnumber_checklistAbbreviation. Some papyri appear in more than one image; in that case, the name contains additional information to distinguish the images (e.g., two fragments of the same papyrus preserved in different collections, or the recto and verso of the same papyrus);
    • A Python script is provided with the .csv to automate the download process.
  • metadata.csv contains metadata for each image. Each column of the file represents the following metadata:
    • image_name: name of the file for the image of the papyrus;
    • checklist: checklist identifier of the papyrus (usual way to refer to the papyrus in papyrology);
    • TM: TM number as unique identifier of the text;
    • Year post: terminus post quem, i.e. the year before which the papyrus cannot have been written;
    • Year ante quem: terminus ante quem, i.e. the year after which the papyrus cannot have been written;
    • Production Nome (supposed): the geographical region where the papyrus was written;
    • Function: the type of document (e.g. a contract or a letter); this field may be a comma-separated list.
  • downloader.py automatically downloads all the images of the dataset, fetching each one from its original archive.
  • How to download the dataset.pdf briefly describes the procedure for downloading the images with the downloader.py script.
  • Requirements.txt lists the Python environment requirements needed to run the script correctly.
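
The metadata.csv layout described above can be read with the Python standard library. The following is a minimal sketch, not part of the archive: it assumes that the column headers are spelled exactly as listed above, that years are stored as negative integers for BCE dates, and that the part of a name before the first underscore is the TM number. The helper names (split_image_name, parse_functions, load_metadata) and the two sample rows are hypothetical and only serve as illustration.

```python
import csv
import io


def split_image_name(image_name):
    """Split a 'TMnumber_rest' name into (TM number, remainder).

    Assumes the TM number is everything before the first underscore.
    """
    tm_part, _, rest = image_name.partition("_")
    return int(tm_part), rest


def parse_functions(cell):
    """Turn the Function cell (possibly a comma-separated list) into a list."""
    return [item.strip() for item in cell.split(",") if item.strip()]


def load_metadata(handle):
    """Read metadata rows from an open text file or file-like object."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "image_name": row["image_name"],
            "tm": int(row["TM"]),
            "year_post": int(row["Year post"]),
            "year_ante": int(row["Year ante quem"]),
            "functions": parse_functions(row["Function"]),
        })
    return rows


# Hypothetical sample rows (made-up values, not taken from the dataset).
SAMPLE = '''image_name,checklist,TM,Year post,Year ante quem,Production Nome (supposed),Function
12345_P.Example.1,P.Example 1,12345,-250,-249,Arsinoites,"contract, letter"
12346_P.Example.2_recto,P.Example 2,12346,-120,-120,Arsinoites,letter
'''

records = load_metadata(io.StringIO(SAMPLE))
for record in records:
    print(record["image_name"], record["year_post"], record["functions"])
```

To use it on the real file, pass an open handle instead of the sample string, for instance `load_metadata(open("metadata.csv", newline="", encoding="utf-8"))`; the encoding is an assumption and may need adjusting.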

Some caveats concerning the images:

Concerning provenance, most documents come from Egypt, but there are a few outliers from the Near East.

The chronological coverage is balanced, with around 50 papyri per century over the period considered (3rd to 1st c. BCE); only the earliest decades are not covered, and the 250s BCE are overrepresented.
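
The distribution per century and per decade can be checked from the year columns of metadata.csv. The sketch below is only an illustration: it assumes that a negative year such as -250 stands directly for 250 BCE (no astronomical year numbering), and the list of years is a made-up placeholder rather than data from the archive. The function names century_bce and decade_bce are hypothetical.

```python
from collections import Counter


def century_bce(year):
    """Return the BCE century of a negative year, e.g. -250 -> 3 (3rd c. BCE).

    Assumes -300..-201 is the 3rd century, -200..-101 the 2nd, and so on.
    """
    if year >= 0:
        raise ValueError("expected a negative (BCE) year")
    return (abs(year) - 1) // 100 + 1


def decade_bce(year):
    """Return the BCE decade of a negative year, e.g. -258 -> 250 (the 250s)."""
    return abs(year) // 10 * 10


# Placeholder years for illustration only (not taken from the dataset).
years = [-258, -251, -250, -163, -101, -100, -57]

per_century = Counter(century_bce(y) for y in years)
per_decade = Counter(decade_bce(y) for y in years)

for century in sorted(per_century, reverse=True):
    print(f"century {century} BCE: {per_century[century]} papyri")
print("most frequent decade:", per_decade.most_common(1)[0][0])
```

Counting papyri rather than images would require grouping rows by TM number first, since some papyri have more than one image.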

Users of this dataset must comply with the licenses provided by the various websites that give access to the images. Please take note that some of them do not allow reuse, or commercial reuse, of the images, and that credits are mostly required. By using this dataset, you confirm that you have read and understood the following licenses: