Open-FF

Open-FF Data Dictionary


This file was generated on May 17, 2025
from data repository: openFF_data_2025_05_14.

Sponsored by FracTracker Alliance


Description of the contents of the final data files generated by Open-FF from the FracFocus data

Acceptable use of FracFocus data

One requirement for using the FracFocus data is stipulated on the FracFocus website:

"Downloaded data may be aggregated or combined with other datasets, but the FracFocus data may not be altered in any way."

Please read the entire "Terms of use" at http://fracfocus.org/data-download.

This project maintains the original FracFocus data as reported in the bulk download. The original field names are kept; all of them begin with an upper-case letter and can be identified that way. Fields generated by this project or pulled from external data sources begin with a lower-case letter. For example, CASNumber is the original field and bgCAS is the generated field. There are two exceptions: DTXSID and MI_inconsistent begin with an upper-case letter but are NOT original FracFocus fields.
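The naming convention above can be checked mechanically. The following is a minimal sketch; the helper name `is_original_fracfocus_field` is hypothetical and is not part of Open-FF. It relies only on the rule and the two exceptions stated above.

```python
# Hypothetical helper (not part of Open-FF): classify a field name using the
# naming convention described above.

# The two upper-case names that are NOT original FracFocus fields.
NON_FRACFOCUS_EXCEPTIONS = {"DTXSID", "MI_inconsistent"}


def is_original_fracfocus_field(field_name: str) -> bool:
    """Return True if the name follows the convention for original FracFocus fields."""
    if field_name in NON_FRACFOCUS_EXCEPTIONS:
        return False
    # Original FracFocus field names begin with an upper-case letter.
    return field_name[:1].isupper()


print(is_original_fracfocus_field("CASNumber"))  # True
print(is_original_fracfocus_field("bgCAS"))      # False
print(is_original_fracfocus_field("DTXSID"))     # False
```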

In the zipped bulk download from FracFocus, a data dictionary is provided in the 'readme.txt' file. (The zipped download is stored in the /sources or /data directory, and we rename it 'currentData.zip'.) The readme.txt file gives some information about many of the fields; however, it is written for the SQL database version of the bulk download, not the CSV version that we use in this project. Further, some important fields are not mentioned in readme.txt; they are described below. In the descriptions of all fields below, we cite the FracFocus text from a June 2021 bulk download.

Descriptions of fields in the output data sets

Explanation of columns in the table below
Column: what it is
fieldName: The name of the field (column) in the data set. Capitalized field names come from the original FracFocus download (with the two exceptions noted above); lower-case names are generated by Open-FF.
tables: The Open-FF internal tables (those used to construct the output data sets) that contain this field.
FracFocus description: The description of the (original) field given by FracFocus in the readme.txt file of the bulk download.
Open-FF description: Our description of the field.
source: Whether the field is a direct copy of the original FracFocus data, is generated by Open-FF, or is pulled from an external data set.
Num: The number of non-empty values in the field.
Unique: The number of unique values (including NaN) in the field.
Data_type: The Python/pandas data type of the field.
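As an illustration of how the Num, Unique, and Data_type columns could be computed, here is a minimal pandas sketch. The function name `summarize_fields` and the sample data are hypothetical, and "non-empty" is assumed to mean "not NaN/None"; the actual Open-FF code may count these differently.

```python
import pandas as pd


def summarize_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Build one summary row per column: Num, Unique, and Data_type."""
    rows = []
    for name in df.columns:
        col = df[name]
        rows.append({
            "fieldName": name,
            # Assumption: "non-empty" means not NaN/None.
            "Num": int(col.notna().sum()),
            # dropna=False counts NaN as one of the unique values.
            "Unique": int(col.nunique(dropna=False)),
            "Data_type": str(col.dtype),
        })
    return pd.DataFrame(rows)


# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    "bgCAS": ["7732-18-5", "7732-18-5", None],
    "PercentHFJob": [88.0, 10.5, 1.5],
})
summary = summarize_fields(sample)
print(summary)
```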

Carrier detection sets

Among the filters below, s1 finds the majority of water carriers. However, there is no single set of criteria that can be used to identify the water carrier record(s) for all FracFocus disclosures. Therefore the other filters are employed to catch many other disclosure patterns without needing to curate each by hand.

Set name, description, and criteria to be detected:
s1: Primary filter; most recent disclosures are detected with this.
- Only one record whose Purpose is "carrier" (or related)
- bgCAS is '7732-18-5'
- at least 50% PercentHFJob
- total % of disclosure is 95% < x < 105%
s2: More than one record as the carrier. Covers situations where, for example, there are two water records (fresh and produced) and where other chemicals are also labeled as part of the carrier. It is important to include all water carrier records to avoid underestimating carrier mass.
- More than one record whose Purpose is "carrier" (or related)
- at least one bgCAS is '7732-18-5'
- total of water records is at least 50% PercentHFJob
- total % of disclosure is 95% < x < 105%
s3: No carrier records labeled, but a clear water record with a typical percentage.
- bgCAS is '7732-18-5'
- at least 40% PercentHFJob
- IngredientName contains the phrase "including mix water"
- total % of disclosure is 95% < x < 105%
s4: Like s3, but the CAS number is missing; still an obvious water record.
- CASNumber is empty
- at least 60% PercentHFJob
- IngredientName contains the phrase "including mix water"
- total % of disclosure is 95% < x < 105%
s5: Like s1, but no carrier records are labeled.
- bgCAS is '7732-18-5'
- at least 50% PercentHFJob
- total % of disclosure is 95% < x < 105%
s6: CASNumber missing, but a clear carrier label.
- bgCAS is ambiguousID
- single record with a carrier Purpose
- IngredientName is either "carrier" (or related) or has "water" in it
- TradeName has "water" in it
- 50% < PercentHFJob < 100%
- total % of disclosure is 95% < x < 105%
s7: Like s1, but for "salted" water. Note that even though the record is labeled with the salt CAS number, the predominant mass is water.
- Only one record whose Purpose is "carrier" (or related)
- bgCAS is either '7447-40-7' (KCl) or '7647-14-5' (NaCl)
- at least 50% PercentHFJob
- total % of disclosure is 95% < x < 105%
s8: Common pattern in the older disclosures (incl. SkyTruth archive).
- bgCAS is ambiguousID or '7732-18-5'
- IngredientName is MISSING
- Purpose is "unrecorded purpose"
- TradeName has either "water" or "brine"
- can be one or two records in each disclosure
- 50% < sum of PercentHFJob of these records < 100%
- total % of disclosure is 95% < x < 105%
s9: Common pattern in the older disclosures (incl. SkyTruth archive).
- bgCAS is ambiguousID or '7732-18-5'
- IngredientName is MISSING
- Purpose is one of the standard carrier words or phrases
- TradeName has either "water" or "brine"
- can be one or two records in each disclosure
- 50% < sum of PercentHFJob of these records < 100%
- total % of disclosure is 95% < x < 105%
s10: A pattern seen in later disclosures: the carrier is only reported in the top part of the systems approach section under the "Listed Below" CASNumber. The actual PercentHFJob value isn't even reported in the PDF version, but is in the bulk download.
- CASNumber is "Listed Below"
- record has a carrier Purpose
- PercentHFJob > 50%
- TradeName has "water" in it
- total % of disclosure is 95% < x < 105%
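To make the filter logic concrete, the s1 criteria could be sketched in pandas roughly as follows. This is not the Open-FF implementation. The names `detect_s1`, `total_in_window`, and `CARRIER_WORDS` are hypothetical, the list of "related" carrier terms is not given here so only "carrier" is matched, and the sample disclosure is made up.

```python
import pandas as pd

WATER_CAS = "7732-18-5"
# Assumption: only the word "carrier" is matched; the full list of
# "related" Purpose terms is not specified in this document.
CARRIER_WORDS = ("carrier",)


def total_in_window(records: pd.DataFrame, low: float = 95.0, high: float = 105.0) -> bool:
    """Shared criterion: total % of the disclosure satisfies low < x < high."""
    total = records["PercentHFJob"].sum()
    return bool(low < total < high)


def detect_s1(records: pd.DataFrame) -> pd.DataFrame:
    """Return the single carrier record if the s1 criteria hold, else an empty frame."""
    empty = records.iloc[0:0]
    purpose = records["Purpose"].fillna("").str.lower()
    is_carrier = purpose.apply(lambda p: any(word in p for word in CARRIER_WORDS))
    carriers = records[is_carrier]
    if len(carriers) != 1:  # s1 requires exactly one carrier record
        return empty
    record = carriers.iloc[0]
    if record["bgCAS"] != WATER_CAS or record["PercentHFJob"] < 50.0:
        return empty
    if not total_in_window(records):
        return empty
    return carriers


# Made-up records for a single disclosure.
disclosure = pd.DataFrame({
    "Purpose": ["Carrier", "Proppant", "Friction reducer"],
    "bgCAS": ["7732-18-5", "14808-60-7", "64742-47-8"],
    "PercentHFJob": [88.0, 11.0, 0.5],
})
print(detect_s1(disclosure))
```

The other sets could follow the same shape, each swapping in its own record-level conditions while reusing the shared total-percentage check.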

Disclosures with detected problems for determination of water carrier ID

code description
0 Disclosure has no valid chemical records.
1 TotalBaseWaterVolume is empty or 0 gallons.
2 None of the chemical records have non-zero PercentHFJob.
3 The sum of PercentHFJob values for valid CAS records is larger than the limit (105%).
4 The sum of PercentHFJob values for all records excluding SystemApproach is larger than the limit (105%).
5 PercentHFJob of all "proppant" records is greater than 50% (not used after v16)
6 The sum of PercentHFJob values for all records is less than 90% - a partial disclosure
7 PercentHFJob of nitrogen or carbon dioxide records is greater than 50% (so the carrier will be smaller) (not used after v16).
8 PercentHFJob of chlorine dioxide records is 100% (it is typically an additive to the water, not a replacement) (added 3/2023, after v16).
9 PercentHFJob of a non-water carrier record is too large (>50%) (added 3/2023, after v16).
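A few of these codes can be illustrated with a small pure-Python sketch. The function `detect_problem_codes` and the `valid_cas` flag are hypothetical, "no valid chemical records" is simplified to "no records at all", and missing PercentHFJob values are assumed to be None. How Open-FF orders or combines overlapping codes is not stated here, so this sketch simply reports every code that applies among 0, 1, 2, 3, and 6.

```python
def detect_problem_codes(total_base_water_volume, records,
                         upper_limit=105.0, lower_limit=90.0):
    """Return the subset of problem codes 0, 1, 2, 3, 6 that apply.

    `records` is a list of dicts with a "PercentHFJob" value (None if missing)
    and a hypothetical boolean "valid_cas" flag.
    """
    codes = []
    if not records:
        codes.append(0)  # simplification: no records at all
    if not total_base_water_volume:
        codes.append(1)  # TotalBaseWaterVolume is empty (None) or 0
    percents = [r.get("PercentHFJob") or 0.0 for r in records]
    if records and not any(p > 0 for p in percents):
        codes.append(2)  # no record has a non-zero PercentHFJob
    valid_total = sum(p for r, p in zip(records, percents) if r.get("valid_cas"))
    if valid_total > upper_limit:
        codes.append(3)  # valid-CAS records sum above the limit
    if records and sum(percents) < lower_limit:
        codes.append(6)  # partial disclosure
    return codes


# Made-up example: missing water volume and only 40% of the job reported.
partial = [{"PercentHFJob": 40.0, "valid_cas": True}]
print(detect_problem_codes(0, partial))  # [1, 6]
```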