
Open-FF Data Dictionary
This file was generated on May 17, 2025.
Sponsored by FracTracker Alliance
Description of the contents of the final data files generated by Open-FF from the FracFocus data.
Pulling repo tables from: G:\My Drive\production\repos\openFF_data_2025_05_14\pickles
Acceptable use of FracFocus data
One requirement for using the FracFocus data is stipulated on the FracFocus website:
"Downloaded data may be aggregated or combined with other datasets, but the FracFocus data may not be altered in any way."
Please read the entire "Terms of use" at http://fracfocus.org/data-download.
The work in this project maintains the original FracFocus data as reported
in the bulk download. The field names used in the original are kept: all of
these original names begin with an upper-case letter and can be identified
in that way. Fields generated by this project or drawn from external data sources begin with a lower-case
letter (for example, CASNumber is the original field and bgCAS is the
generated field). Note that there are two exceptions: DTXSID and MI_inconsistent
begin with an upper-case letter but are NOT original with FracFocus.
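The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the Open-FF code base: the function name `field_origin` and the label strings are hypothetical, and only the two exceptions named in the text are handled.

```python
# Fields that start with an upper-case letter but, per the text above,
# are NOT original FracFocus fields.
GENERATED_EXCEPTIONS = {"DTXSID", "MI_inconsistent"}


def field_origin(field_name: str) -> str:
    """Classify a field name by the capitalization convention (hypothetical helper)."""
    if field_name in GENERATED_EXCEPTIONS:
        return "generated or external"
    # Original FracFocus names begin with an upper-case letter.
    if field_name[:1].isupper():
        return "FracFocus original"
    return "generated or external"


for name in ["CASNumber", "bgCAS", "DTXSID", "MI_inconsistent"]:
    print(name, "->", field_origin(name))
```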
In the zipped bulk download from FracFocus, a data dictionary is provided in the 'readme.txt' file. (This zipped download is in the /sources or /data directory, and we rename it 'currentData.zip'.) The readme gives some information about many of the fields; however, it is written for the SQL database version of the bulk download, not the CSV version that we use in this project. Further, some important fields are not mentioned in readme.txt; they are described below. In the descriptions of all fields below, we cite the FracFocus text from a June 2021 bulk download.
Descriptions of fields in the output data sets
Explanation of columns in the table below

column | what it is |
---|---|
fieldName: | The name of the field or column in the data set. All field names that are capitalized are from the original FracFocus downloaded data. Lower-case names are generated by Open-FF. |
tables: | Which of the Open-FF internal tables used to construct the output data sets contain this field. |
FracFocus description: | Description of the (original) field given by FracFocus in the bulk download file, readme.txt. |
Open-FF description: | Our description of the field. |
source: | Whether this field is a direct copy of the original FracFocus data, is generated by Open-FF, or is pulled from an external data set. |
Num: | The number of non-empty values in the field. |
Unique: | The number of unique values (counting NaN as one value) in the field. |
Data_type: | The Python/pandas data type for the field. |
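To make the Num and Unique columns concrete, here is a minimal pure-Python sketch of how such counts could be derived for one field. It is an illustration only: the helper name `summarize_field` is hypothetical, missing values are represented by `None` rather than pandas NaN, and the real Open-FF tables are presumably summarized with pandas.

```python
def summarize_field(values):
    """Return (num, unique) for one column supplied as a list.

    num    : count of non-empty values
    unique : count of distinct values, where "missing" counts as one
             additional value if any entry is missing
    """
    non_empty = [v for v in values if v is not None]
    num = len(non_empty)
    has_missing = num < len(values)
    unique = len(set(non_empty)) + (1 if has_missing else 0)
    return num, unique


# Hypothetical bgCAS column with one missing entry.
column = ["7732-18-5", "7732-18-5", None, "7647-14-5"]
print(summarize_field(column))
```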
fieldName, [tables] | FracFocus description | Open-FF description | source | Num | Unique | Data_type |
---|---|---|---|---|---|---|
Carrier detection sets:
Among the filters below, s1 finds the majority of water carriers. However, there is no single set of criteria that can be used to identify the water carrier record(s) for all FracFocus disclosures. Therefore the other filters are employed to catch many other disclosure patterns without needing to curate each by hand.
Set name | description | Criteria to be detected |
---|---|---|
s1 | Primary filter; most recent disclosures are detected with this | Only one record whose Purpose is "carrier" (or related); bgCAS is '7732-18-5'; at least 50% PercentHFJob; total % of disclosure is between 95% and 105% |
s2 | More than one record as the carrier; covers situations, for example, where there are two water records (fresh and produced) and where other chemicals are also labeled as part of the carrier. It is important to include all water carrier records to avoid underestimating carrier mass | More than one record whose Purpose is "carrier" (or related); at least one bgCAS is '7732-18-5'; total of water records is at least 50% PercentHFJob; total % of disclosure is between 95% and 105% |
s3 | No carrier records labeled, but a clear water record with a typical percentage | bgCAS is '7732-18-5'; at least 40% PercentHFJob; IngredientName contains the phrase "including mix water"; total % of disclosure is between 95% and 105% |
s4 | Like s3, but CAS number missing; still an obvious water record | CASNumber is empty; at least 60% PercentHFJob; IngredientName contains the phrase "including mix water"; total % of disclosure is between 95% and 105% |
s5 | Like s1, but no carrier records are labeled | bgCAS is '7732-18-5'; at least 50% PercentHFJob; total % of disclosure is between 95% and 105% |
s6 | CASNumber missing but clear carrier label | bgCAS is ambiguousID; single record with a carrier Purpose; IngredientName is either "carrier" (or related) or has "water" in it; TradeName has "water" in it; 50% < PercentHFJob < 100%; total % of disclosure is between 95% and 105% |
s7 | Like s1, but for "salted" water. Note that even though the record is labeled with the salt CAS number, the predominant mass is water | Only one record whose Purpose is "carrier" (or related); bgCAS is either '7447-40-7' (KCl) or '7647-14-5' (NaCl); at least 50% PercentHFJob; total % of disclosure is between 95% and 105% |
s8 | Common pattern in the older disclosures (incl. SkyTruth archive) | bgCAS is ambiguousID or '7732-18-5'; IngredientName is MISSING; Purpose is "unrecorded purpose"; TradeName has either "water" or "brine"; can be one or two records in each disclosure; 50% < sum of PercentHFJob of these records < 100%; total % of disclosure is between 95% and 105% |
s9 | Common pattern in the older disclosures (incl. SkyTruth archive) | bgCAS is ambiguousID or '7732-18-5'; IngredientName is MISSING; Purpose is one of the standard carrier words or phrases; TradeName has either "water" or "brine"; can be one or two records in each disclosure; 50% < sum of PercentHFJob of these records < 100%; total % of disclosure is between 95% and 105% |
s10 | A pattern seen in later disclosures: the carrier is only reported in the top part of the systems approach section under the "Listed Below" CASNumber. The actual PercentHFJob value isn't even reported in the PDF version, but is in the bulk download. | CASNumber is "Listed Below"; record has a carrier Purpose; PercentHFJob > 50%; TradeName has "water" in it; total % of disclosure is between 95% and 105% |
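As an illustration of how one of these sets might be applied, the sketch below tests a single disclosure against the s1 criteria. It is not the Open-FF implementation: the function names, the dict-based record layout, and the list of carrier-related Purpose words in `CARRIER_WORDS` are all assumptions made for the example, and the total-percentage criterion is read as "between 95% and 105%".

```python
# Hypothetical stand-in for 'Purpose is "carrier" (or related)'.
CARRIER_WORDS = ("carrier", "base fluid")
WATER_CAS = "7732-18-5"


def is_carrier_purpose(purpose):
    """True if the Purpose text contains one of the assumed carrier words."""
    text = (purpose or "").lower()
    return any(word in text for word in CARRIER_WORDS)


def matches_s1(records):
    """Check one disclosure (a list of record dicts) against the s1 criteria."""
    total = sum(r.get("PercentHFJob") or 0 for r in records)
    if not (95 < total < 105):
        return False
    carriers = [r for r in records if is_carrier_purpose(r.get("Purpose"))]
    if len(carriers) != 1:  # s1 requires exactly one carrier record
        return False
    carrier = carriers[0]
    return (carrier.get("bgCAS") == WATER_CAS
            and (carrier.get("PercentHFJob") or 0) >= 50)


# Made-up disclosure: water carrier, sand proppant, hydrochloric acid.
disclosure = [
    {"Purpose": "Carrier/Base Fluid", "bgCAS": "7732-18-5", "PercentHFJob": 88.0},
    {"Purpose": "Proppant", "bgCAS": "14808-60-7", "PercentHFJob": 11.5},
    {"Purpose": "Acid", "bgCAS": "7647-01-0", "PercentHFJob": 0.5},
]
print(matches_s1(disclosure))
```

A disclosure with two carrier-labeled records would fail this check and fall through to a later set such as s2.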
Disclosures with detected problems for determination of water carrier ID
code | description |
---|---|
0 | Disclosure has no valid chemical records. |
1 | TotalBaseWaterVolume is empty or 0 gallons. |
2 | None of the chemical records have a non-zero PercentHFJob. |
3 | The sum of PercentHFJob values for valid CAS records is larger than the limit (105%). |
4 | The sum of PercentHFJob values for all records excluding SystemApproach is larger than the limit (105%). |
5 | PercentHFJob of all "proppant" records is greater than 50% (not used after v16) |
6 | The sum of PercentHFJob values for all records is less than 90% - a partial disclosure |
7 | PercentHFJob of Nitrogen or Carbon Dioxide records is greater than 50% (so carrier will be smaller) (not used after v16) |
8 | PercentHFJob of Chlorine dioxide records is 100% (it is typically an additive to the water; not a replacement) (added 3/2023, after v16). |
9 | PercentHFJob of a non-water carrier record is too large (>50%) (added 3/2023, after v16). |
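A few of the codes above depend only on the water volume and the PercentHFJob values, so they can be sketched compactly. The example below covers codes 1, 2, and 6 only; the function name `problem_codes` and its inputs are hypothetical, and whether the actual pipeline reports several codes for one disclosure is not stated here, so returning a list is an assumption.

```python
def problem_codes(total_base_water_volume, percent_hf_job_values):
    """Return a list of problem codes (1, 2, 6 only) for one disclosure."""
    codes = []
    # Code 1: TotalBaseWaterVolume is empty or 0 gallons.
    if not total_base_water_volume:
        codes.append(1)
    # Keep only non-empty, non-zero percentages.
    present = [p for p in percent_hf_job_values if p]
    if not present:
        # Code 2: no record has a non-zero PercentHFJob.
        codes.append(2)
    elif sum(present) < 90:
        # Code 6: total is below 90%, i.e. a partial disclosure.
        codes.append(6)
    return codes


print(problem_codes(0, [60.0, 20.0]))
print(problem_codes(250000, [None, 0]))
print(problem_codes(250000, [88.0, 11.5, 0.5]))
```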