The DataCons Project: An Open-access Dataset of Late Roman Consular Dates

The DataCons Project provides an open-access dataset of late Roman consular dating formulae spanning from CE 284 to 541. It aggregates consular materials from publications worldwide. Currently, the dataset contains over 4,800 documents, penned in three distinct scripts, originating from ten ancient regions, and categorised by material type and textual content. The present release of the dataset (2.0.0) focuses on the Latin and Greek documentation from CE 476 to 526, includes nearly 880 items in total, and pertains exclusively to papyri and inscriptions. Periodic updates are anticipated to expand its temporal coverage. The dataset is tailored to support interdisciplinary research involving consular material, and the project will soon be accompanied by the launch of an online relational database.


Dosi. Journal of Open Humanities Data. DOI: 10.5334/johd.130

2. Data Entry and Structuring: Data were manually entered into an Excel document, organised in tabular form, and sorted by key attributes. This process included integrating hyperlinks to other databases' records and images.
3. Data Normalisation and Enhancement: Basic Excel operations were performed to normalise data, generate unique identifiers for each row, and separate the GPS coordinates into latitude and longitude columns. Texts are presented in original and restored forms following the Leiden convention, supplemented with photographs of the original texts or scholarly reconstructions and transcriptions.
4. Documentation and Metadata Management: Although the dataset does not include a critical apparatus for the dating of each document (which will be addressed in future publications), the accompanying documentation on Zenodo provides an overview of the rationale behind the dating of each document (when it differs from previous scholarly opinions), as well as a description of each column.
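The normalisation in step 3 was carried out in Excel. As an illustration only, the same two operations (generating a unique identifier per row and splitting a combined GPS value into latitude and longitude) can be sketched in Python. The column name `GPS`, the identifier pattern `DC-0001`, and the sample rows with their approximate coordinates are assumptions made for this sketch, not the dataset's actual conventions.

```python
import csv
import io

# Hypothetical raw rows: the "GPS" column name and the coordinate values
# are illustrative assumptions, not taken from the DataCons dataset.
RAW = '''Place,GPS
Roma,"41.8931, 12.4828"
Oxyrhynchus,"28.5361, 30.6556"
'''


def normalise(rows, prefix="DC"):
    """Assign a sequential identifier and split "lat, lon" into two numeric fields."""
    records = []
    for number, row in enumerate(rows, start=1):
        lat_text, lon_text = (part.strip() for part in row["GPS"].split(","))
        records.append(
            {
                "ID": f"{prefix}-{number:04d}",  # hypothetical identifier format
                "Place": row["Place"],
                "Latitude": float(lat_text),
                "Longitude": float(lon_text),
            }
        )
    return records


records = normalise(csv.DictReader(io.StringIO(RAW)))
for record in records:
    print(record)
```

A row whose coordinate field is empty or malformed would raise an error here, which may be preferable to silently writing a wrong location.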

SAMPLING STRATEGY
The sampling objective was to comprehensively gather all material from CE 476-526. Tokens that were securely dated within this timeframe received priority during the collection process, ensuring a focus on information pertinent to each attribute for this specific period.

QUALITY CONTROL
The palaeography of each formula was confirmed using original documents, photographs, or scholarly editions to ensure accuracy. Attribute information underwent multiple rounds of verification by the author to maintain reliability. Additionally, findspot identification and GPS coordinates were validated using authoritative sources such as the Barrington Atlas (Talbert & Bagnall, 2000), Trismegistos Places (Verreth, 2008) and/or Pleiades (Elliott, 2021), ensuring geographical precision.

(3) DATASET DESCRIPTION
In version 2.0.0, the dataset comprises 39 data attributes, represented as columns. These include: a unique identifier; Possible Year(s); Year Assigned; Year Assigned (m.l.d.e.app.); Class; Carrier; Language; Type; Text; Evidence (i.e., where the evidence has been sourced, with the main bibliographic notes); Formula (full titulary); Formula (Full titulary, simplified); Formula (simplified); Day; Month; Indiction; Other supporting dating elements; Errors; Latitude; Longitude; Place; Region; Macro-Region; Image; and fifteen additional columns linking to external databases. The dataset comprises virtually all published material from CE 476 to 526, including 614 stone and 264 papyrus formulae. These records range from single to multi-year dates, such as 461 or 482, or 4th to 6th c. In detail, over 500 dates are consular, nearly 200 are post-consular, and the remainder are of uncertain type. They are mainly in Greek and Latin, with a few bilingual scripts. These formulae date texts in six groups, plus an uncertain category: 1) funerary; 2) administrative; 3) legal/fiscal; 4) monumental; 5) military; 6) 'other' (mostly private); and 7) uncertain.
Maps and graphs presented in Figures 1 to 6 illustrate that the distribution of these texts varies significantly by region, time, language, and type, reflecting the cultural complexities and varying conditions of preservation. Notably, although the dataset covers 10 ancient regions and over 190 findspots, more than 50% of the evidence is from the Italian diocese, with 40% specifically from Rome. Finally, an analysis of 726 texts (about 80% of the total) reveals that over 70% of this sample shows abbreviations or variations in the titulary, misspellings, and other non-linguistic irregularities (i.e., omitted, wrong or conflicting dating elements within the formula).
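Distributions of the kind described above (by carrier and by decade) can be tallied directly from the CSV. The following is a minimal sketch using only the Python standard library. The column names `Year Assigned` and `Carrier` come from the attribute list above, but the sample rows and the value spellings (`stone`, `papyrus`) are invented for illustration and do not reproduce actual records.

```python
import csv
import io
from collections import Counter

# Invented sample rows; only the column names follow the dataset description.
SAMPLE = """Year Assigned,Carrier,Language
476,stone,Latin
482,papyrus,Greek
489,stone,Latin
503,stone,Greek
526,papyrus,Greek
"""


def tally(rows):
    """Count records per carrier and per decade of the assigned year."""
    by_carrier = Counter()
    by_decade = Counter()
    for row in rows:
        by_carrier[row["Carrier"]] += 1
        year = row["Year Assigned"].strip()
        # Rows without a single numeric year are left out of the decade tally.
        if year.isdigit():
            by_decade[int(year) // 10 * 10] += 1
    return by_carrier, by_decade


by_carrier, by_decade = tally(csv.DictReader(io.StringIO(SAMPLE)))
print(dict(by_carrier))
print(dict(sorted(by_decade.items())))
```

To run the same tally on the published file, the in-memory sample could be replaced by the downloaded CSV opened with `open(path, encoding="utf-8-sig", newline="")`. The `utf-8-sig` codec is chosen on the assumption that the file may begin with a byte-order mark, which that codec skips if present.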

OBJECT NAME
DataCons Dataset version 2.0.0

FORMAT NAMES AND VERSIONS
CSV UTF-8

DATASET CREATORS
Marco Dosi

LANGUAGE
Ancient Greek, Latin, English

REPOSITORY NAME
Zenodo

(4) REUSE POTENTIAL
The dataset offers an in-depth exploration of the late Roman world, with a particular focus on the consulship as both an institution and a dating system. Scholars using this resource can delve into a wide range of themes, including the socio-cultural and political roles of the consulship, its historical significance and decline, and its varied adoption as a dating system across societal levels and regions. The dataset facilitates detailed analysis of movement dynamics, spatial and geographical considerations, and the perception and management of time in the late Roman period. It also sheds light on aspects of late Roman communications and bureaucratic operations, contributing to a comprehensive understanding of these complex topics. Additionally, the dataset is instrumental in examining the geopolitical dimensions of consular dissemination and their implications for West-East relations during periods like CE 476-526. By centralising material formerly dispersed across numerous publications, the dataset supports both quantitative and qualitative research, aligning with the digital trend in epigraphy, papyrology and related fields like palaeography and linguistics (Berti, 2019; Bodel, 2012; Liuzzo, 2019; Stokes, 2009, 2015; Xella & Zamora, 2018). Integration with platforms such as Trismegistos (Depauw & Gheldof, 2014), Papyri.info, Pleiades (Elliott, 2021), and similar projects will further enrich existing resources and deepen our understanding of specific texts, periods and regions. The ability to closely examine the full text of each document via hyperlinks facilitates a detailed analysis of linguistic shifts across different sources.

LIMITATIONS
The current dataset acknowledges certain areas in need of improvement. One primary shortcoming is the omission of detailed discussions on the dating of the formulae. This is obviously a major limitation, but at this stage the dataset aims to function as a checklist more than a critical edition of the entire corpus. Additionally, while the dataset offers valuable insights into the demographics of these regions and periods, it currently lacks a detailed, structured categorisation of the individuals referenced.
As mentioned, the dataset provides only the original text of the dating protocol rather than the full text of each document, and these texts have not been fully encoded. To mitigate these constraints, the dataset includes hyperlinks to external databases where complete and encoded texts are available.
I am committed to continually enhancing the dataset and addressing these shortcomings in future updates. In the meantime, users should remain cognisant of these limitations, which might restrict the scope of their analysis.
Consular dating typically involved two types of formulae: post-consular and consular formulae. Post-consular formulae, rendered as post consulatum + N. et N. (gen.) in Latin and μετὰ τὴν ὑπατείαν + N. καὶ N. (gen.) in Greek, dated events relative to the most recent consulship known at the place of writing. In contrast, consular formulae, which referenced the names of the consuls in office at the time of writing, were normally expressed as N. (et) N. consulibus or consulatu + N. (et) N. (gen.) in Latin, and as ὑπατείας + N. καὶ N. (gen.) or (ἐν) ὑπατείᾳ + N. καὶ N. in Greek. Most texts primarily provide the dating protocol, namely the consular or post-consular date, day and month designations, and occasionally other dating systems like the indiction, and the place of writing.

11 papyri.info (last accessed: 28 November 2023). Papyri.info combines resources from several papyrological databases, including the Advanced Papyrological Information System (APIS), Duke Databank of Documentary Papyri (DDbDP), Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens (HGV), and Bibliographie Papyrologique (BP); Pleiades, https://pleiades.stoa.org (last accessed: 28 November 2023) and Trismegistos (TM), https://www.trismegistos.org (last accessed: 28 November 2023).
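The Latin and Greek patterns for consular and post-consular formulae given above lend themselves to a simple keyword test. The sketch below is a rough illustration, not the procedure used to build the dataset: it assumes fully written-out formulae, whereas real texts are often abbreviated, misspelt, or restored with editorial brackets and would need far more careful handling. The function names `fold` and `classify` are hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip accents and breathings, then case-fold, so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Markers taken from the formula patterns described in the text.
# Post-consular markers are tested first because they contain the consular stems.
POST_CONSULAR = tuple(fold(m) for m in ("post consulatum", "μετὰ τὴν ὑπατείαν"))
CONSULAR = tuple(fold(m) for m in ("consulibus", "consulatu", "ὑπατεία"))


def classify(formula: str) -> str:
    """Return 'post-consular', 'consular', or 'uncertain' for a written-out formula."""
    folded = " ".join(fold(formula).split())  # collapse runs of whitespace
    if any(marker in folded for marker in POST_CONSULAR):
        return "post-consular"
    if any(marker in folded for marker in CONSULAR):
        return "consular"
    return "uncertain"


for example in (
    "post consulatum N. et N.",
    "N. et N. consulibus",
    "μετὰ τὴν ὑπατείαν N. καὶ N.",
    "ἐν ὑπατείᾳ N. καὶ N.",
):
    print(example, "->", classify(example))
```

Removing diacritics before matching is a deliberate simplification: it lets one stem cover ὑπατείας, ὑπατείᾳ and ὑπατείαν, at the cost of ignoring distinctions that a palaeographical study would want to keep.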

Figure 1 Overall geographical distribution of consular dates on stone and papyrus, CE 476-526. Colours indicate the material type, while size represents the density of evidence.

Figure 2 Linguistic evidence distribution map, CE 476-526. Colours show the language of the formula. Size shows the density of evidence.

Figure 3

Figure 4 Regional, linguistic and quantitative complexity of the evidence. Colour shows the language of the formula. The evidence sourced from Hispaniae, North Africa, and Britannia is exclusively in Latin.

Figure 5 Chronological distribution by decade. Colour shows the macro-region.

Figure 6 Quantity of material by language and region. Colour shows the region.