A Dataset Describing the Manufacturing of Stone Tools Over 3 Million Years

This dataset summarizes the complexity and variability of tool-making sequences spanning the past 3 million years of hominin evolution. Each of the 155 entries in the dataset represents one technology, or one set of technologies, reported in the literature and coded for the presence or absence of each of 33 possible procedural units, or tool-making techniques. The data were generated by coding published descriptions of technologies following the standards in a codebook. In total, 100 archaeological sites were sampled, in addition to five non-human primate tool-making behaviors and five technologies produced in controlled flintknapping experiments. This is one of the primary datasets developed over the course of the Leakey Foundation-funded project "Estimating the reliability of stone tools in reconstructing cultural relationships in prehistory". This dataset should be useful to researchers interested in studying technological variability at large spatio-temporal scales.

Chipped stone tools are the only source of information about technological change spanning the entirety of hominin evolution [1,2]. This is because chipped stone artifacts have been essential components of the human niche and preserve very well, making them the most ubiquitous artifacts in the archaeological record [3]. Stone tools are also useful for systematically measuring technological change because all stone tool production occurs under the same basic set of physical constraints [4,5]. The diversity of chipped stone technologies found in the past 3.3 million years is based, in large part, on iterations of the same operation, flake removal, chained together in different ways [6][7][8]. The decisions made while reducing cores and making tools leave diagnostic evidence [9]. By studying the waste products from flaking events, and through comparison to experimentally produced artifacts, archaeologists reconstruct the kinds of actions that were taken by tool-makers in the past, how flakes were removed, and how the convexities of the core were managed across tool-making sequences. Reconstructions of tool-making sequences are one of our primary sources for understanding prehistoric behavioral variation [10][11][12]. Stone tools also tend to serve the same set of functions that were in demand across hominin evolution: cutting, scraping, and perforating [13]. Thus, by focusing on lithic technology we may also hold constant many of the basic functions and physical constraints in a way not possible if we had to track changes across different classes of technologies, such as stone tools, metals, basketry, and household architecture.
This paper reports on a dataset describing the presence or absence of stone tool-making techniques reported in published descriptions of technologies. In total, there are 155 entries in the dataset. Each technology in this recording system is broken up into sets of discrete actions, or "procedural units", following the initial definition by Perreault et al. [14]. Procedural units are discrete, mutually exclusive manufacturing steps that can be chained together in the production of technologies (Table 1). The 33 procedural units include steps involved in core preparation (cresting, centripetal preparation, platform rejuvenation), the tools used to produce flakes (pressure flakers, anvils, hard hammers), and the nature of retouch (abrupt retouch, burination). The procedural unit system has been applied in other studies focused on quantifying technological variability [15,16]. Of the 155 entries in the dataset, 145 record procedural units reported from archaeological assemblages. An additional five entries describe extant primate tool-making behaviors, and another five describe the procedural units observed in experimentally produced assemblages.
The dataset was assembled for two main scientific purposes. The first purpose was to explore the timing and tempo of the evolution of cumulative culture in the hominin lineage. The second was to evaluate the relationship between similarities in lithic technologies and shared historical ancestry. Addressing both issues required collecting data of a large spatio-temporal scope. The resulting dataset should, however, be useful for other kinds of questions about technological variability within other theoretical frameworks.

SPATIAL COVERAGE
The dataset includes portions of Africa, Western Eurasia, North America, and Oceania. The sampled sites span between +69.7 and -46.3 latitude, and between -169.6 and +170.5 longitude (Figure 1).

TEMPORAL COVERAGE
The temporal coverage of the dataset extends from ~3.3 mya to the 19th century A.D. The dates included in the dataset are based on published dates, though here they are reported as a range instead of a probability distribution. Most assemblages fall within the Late Holocene (Figure 2).

SAMPLING STRATEGY
We surveyed the literature for descriptions of lithic technology from dated archaeological contexts in Africa, Eurasia, Greenland, Sahul, Oceania and the Americas, from the earliest archaeological record through the late Holocene. To be included in the database, an archaeological context had to be accompanied by a detailed description of the lithic technology, including discussions about how cores were managed, and illustrations of debitage, cores, and retouched elements. In some cases, individual elements, such as detailed illustrations or in-depth descriptions, were absent. However, as long as another element was detailed enough to compensate, the text was included in the sample. An idealized example of a stone tool description that would be selected for coding is the description of blade technology at Kathu Pan 1 by Jayne Wilkins and Michael Chazan [17]. The paper includes in-depth descriptions of artifacts, diagnostic pieces that verify the practice of particular procedural units, abundant illustrations to show the range of variability in cores, core trimming pieces, and blades, as well as discussion about the kinds of techniques for which there is no or weak evidence. Most coded descriptions of lithic assemblages are less detailed than the Wilkins and Chazan paper, though they tend to have many of the same elements.
For each tool-making sequence identified, the 33 procedural units listed in Table 1 were coded as present or absent based on the standards outlined in a codebook described below. In cases where more than one technology was described in a text, each distinct technology was coded, and an additional row was included describing all procedural units present across all technologies reported. This latter row type is distinguished from the single-chain entries by a "no" entry in the "Single.Chain" column. In total, 62 individual tool-making sequences were coded, and 83 assemblages were coded with all procedural units described as present across all technologies in the assemblage.
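The row structure described above can be illustrated with a minimal Python sketch. It uses a small made-up table rather than the published file. Only the "Single.Chain" column name and its "no" value come from the text; the "Entry" and "Site" columns, the three unit columns, the "yes" value, and the 1/0 coding of presence and absence are assumptions made for illustration.

```python
import csv
import io

# Toy stand-in for the real CSV. Only the "Single.Chain" column name is taken
# from the text; every other column name and the 1/0 coding are hypothetical.
TOY_CSV = """Entry,Site,Single.Chain,Cresting,Core.rotation,Backing
A1,Site A,yes,1,1,0
A2,Site A,yes,0,1,1
A3,Site A,no,1,1,1
B1,Site B,yes,0,1,0
"""

# Hypothetical subset of the 33 procedural unit columns.
UNIT_COLUMNS = ["Cresting", "Core.rotation", "Backing"]


def load_rows(text):
    """Parse the CSV text and convert presence/absence values to integers."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        for unit in UNIT_COLUMNS:
            row[unit] = int(row[unit])
    return rows


def unit_count(row):
    """Number of procedural units coded as present in one row."""
    return sum(row[unit] for unit in UNIT_COLUMNS)


def combine(rows):
    """A unit is present in the combined row if it is present in any row."""
    return {unit: max(row[unit] for row in rows) for unit in UNIT_COLUMNS}


rows = load_rows(TOY_CSV)

# Split single tool-making sequences from rows that pool several technologies.
single = [r for r in rows if r["Single.Chain"] == "yes"]
combined = [r for r in rows if r["Single.Chain"] == "no"]

# In this toy table, the pooled row for Site A equals the union of its
# single-chain rows.
site_a_union = combine([r for r in single if r["Site"] == "Site A"])
```

Keeping the two row types apart matters for any count of procedural units per entry, because a pooled row will generally list at least as many units as any single sequence from the same text.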
An additional set of extant primate technologies was also coded. These are based on published descriptions of observations of primate behavior, rather than only on the artifacts themselves. Finally, a set of technologies "invented" over the course of two flintknapping experiments, one by Snyder et al. [18] and one by Moore and Perston [19], was also coded. These, similarly, were coded based on descriptions of the artifacts produced, as well as the written observations of the flintknapping actions themselves.
Table 1 Procedural units and their short definitions. In this study, all procedural units reported, or observable based on illustrations and tables of artifact types, were described as present. This included procedural units belonging to separate reduction sequences that have likely been time averaged.
We developed a codebook to outline the standards by which any given procedural unit was to be counted as present or absent, to help ensure reproducibility of the data collection process and to help minimize between-coder error. This codebook follows the structure of those developed at the Centers for Disease Control and Prevention for processing interview transcripts, and serves to prevent coders from applying their own heuristics to the coding process [20]. The codebook includes a definition of each code (for example, the definition of débordante employed), explicit inclusion and exclusion criteria, and written phrases, terms, and example illustrations that would be typical and atypical evidence sufficient to code the procedural unit as present. The codebook also has examples of evidence that is close, but not sufficient, to code the technique as present.

CONSTRAINTS
The dataset is constrained by its reliance on published, detailed descriptions of stone tool reduction sequences. As such, its temporal and spatial coverage is patchy. However, future research effort could certainly expand the size of the dataset, as there are far more reports of sufficient detail to code. Another constraint is the lack of information about assemblage size. This is important for future studies that should investigate the relationship between the number of procedural units reported in an assemblage and the size of that assemblage. Rarer procedural units may only be identified in particularly rich assemblages, for example. Furthermore, reporting biases may also skew the data in unpredictable ways. For example, most papers describing technological variation tend to focus on more "derived" aspects of assemblages. In an Acheulean assemblage, the focus may be on the handaxes and cleavers, with less attention paid to variability in things like pebble cores. Consequently, some technologies, like bipolar percussion, are likely under-represented in the literature relative to how common the practice likely was in prehistory.

DATASET DESCRIPTION
THE PROCEDURAL UNIT DATASET
Procedural units obfus.csv - A CSV file outlining the presence or absence of procedural units coded based on published descriptions of assemblages.
R code procedural unit dataset.R - A simple R code file to load the procedural unit data and perform some simple subsetting and plotting of the data.
Sources.docx - A Word document outlining the texts that were coded.

THE PROCEDURAL UNIT CODEBOOK
Procedural unit codebook.pdf - This is the codebook used to determine whether any given procedural unit was present or absent based on a reading of published descriptions of stone tool technologies.

CREATION DATES
This dataset was developed between 2017 and 2021.

REUSE POTENTIAL
The dataset represents the most spatially and temporally expansive attempt to quantify variability in chaînes opératoires, which are notoriously difficult to handle quantitatively. This means that the dataset has the potential to address several kinds of questions using different kinds of statistics that cannot normally be applied to chaînes opératoires. The data also provide opportunities to measure spatio-temporal patterning in reduction sequences [16].
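As one illustration of the kind of statistic that presence/absence coding makes possible, the sketch below computes a Jaccard similarity between two technologies. The Jaccard index is a common choice for binary data and is used here only as an example; it is not stated to be the measure used by the authors. The two technologies are invented, though their unit names are taken from Table 1.

```python
def jaccard(a, b):
    """Jaccard similarity: shared units divided by units present in either set."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Two hypothetical technologies, each listed as the set of units coded present.
tech_x = {"Hard hammer percussion", "Core rotation", "Retouch"}
tech_y = {
    "Hard hammer percussion",
    "Core rotation",
    "Cresting",
    "Soft hammer percussion",
}

# 2 shared units out of 5 distinct units in total.
similarity = jaccard(tech_x, tech_y)
```

Because the index ignores units absent from both technologies, it does not reward two simple technologies for jointly lacking the same rare techniques, which may or may not be desirable depending on the question asked.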

LIMITATIONS
Large blocks of space and time are missing. For example, there are no data points from South America, Asia, or West Africa. This is partly because the stringent coding criteria require a certain style of stone tool study, which may be less common in some research traditions than in others, and partly because of linguistic barriers.

Table 1 (continued)
7. Cresting: Cresting to shape core face during initial steps of core preparation.
8. Débordante shaping: Convexities maintained through flakes along lateral margins of core face.
9. Overshot flaking: Invasive flake removals that clip or remove the distal margin of the core.
10. Kombewa flaking: Removal of flake from ventral surface of a flake.
11. Core tablet: Removal of core platform by striking flake into face.
12. Abrasion: Abrasion or grinding performed at any point in reduction sequence.
13. Trimming platform overhang: Removal of chips to modify area below platform.
14. Hard hammer percussion: Use of hard hammer.
15. Support core with hand: Support of core by hand.
16. Use of an anvil: Use of an anvil.
17. Core rotation: Rotation of core.
18. Soft hammer percussion: Use of a soft hammer.
19. Indirect percussion: Use of a punch to remove flakes.
20. Flaking through pressure: Removal of flakes through application of pressure on core platform.
21. Hammer dressing: Modification of a piece through pecking.
22. Invasive flaking: Removal of non-cortical flakes that extend beyond the midpoint of the piece.
23. Ochre use: Use of ochre.
24. Asphalt use: Use of asphalt.
25. Retouch: Retouch of flake or core tool (unifacial only).
26. Backing: Retouch forms an abrupt, scraper-like margin.
27. Notching: Retouch forms round concavity.
28. Burination: Removal of spalls along the margins of flakes.
29. Tanging: Retouching base of piece to form a tang.
30. Tranchet: Rejuvenation of core tool by striking a flake across the edge.
31. Bifacial retouch: Retouch on both faces of a flake or core-tool.
32. Invasive retouch: Retouch that extends to the midline of a tool.
33. Pressure flaked retouch: Pressure flaking retouch.

Figure 1 Assemblages sampled in this dataset. The dataset contains 145 data entries across 100 archaeological assemblages, in addition to five non-human primate technologies and five experimentally produced technologies.

Figure 2 The age distribution of all archaeological assemblages sampled in this dataset.