Fuzz introspector
For issues and ideas: https://github.com/ossf/fuzz-introspector/issues

Project functions overview

The following table shows data about each function in the project. The functions included in this table correspond to all functions that exist in the executables of the fuzzers. As such, there may be functions that are from third-party libraries.

For further technical details on the meaning of the columns in the table below, please see the Glossary.

Func name Functions filename Args Function call depth Reached by Fuzzers Runtime reached by Fuzzers Combined reached by Fuzzers Fuzzers runtime hit Func lines hit % I Count BB Count Cyclomatic complexity Functions reached Reached by functions Accumulated cyclomatic complexity Undiscovered complexity

Fuzzer details

Fuzzer: extract_text_to_fp_fuzzer

Call tree

The calltree shows the control flow of the fuzzer. It is overlaid with coverage information to show how much of the code the fuzzer can potentially reach is in fact covered at runtime. The report provides a link to a detailed calltree visualisation as well as a bitmap showing a high-level view of the calltree. For further information on these topics, see the glossary entries for the full calltree and the calltree overview.

Call tree overview bitmap:

The distribution of callsites by color is:
Color        Runtime hitcount  Callsite count  Percentage
red          0                 535             59.3%
gold         [1:9]             0               0.0%
yellow       [10:29]           0               0.0%
greenyellow  [30:49]           0               0.0%
lawngreen    50+               367             40.6%
All colors                     902             100%
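
The color buckets in the table above can be expressed as a small lookup. The following is a minimal sketch, not Fuzz Introspector's own code: the names `color_for_hitcount` and `truncated_percentage` are hypothetical, and truncation to one decimal place is an assumption inferred from the figures shown (367 of 902 is about 40.69%, which the report prints as 40.6%).

```python
import math

# Bucket boundaries copied from the table above: (lowest hitcount, color).
# Listed from the highest bucket down so the first match wins.
COLOR_BUCKETS = [
    (50, "lawngreen"),
    (30, "greenyellow"),
    (10, "yellow"),
    (1, "gold"),
    (0, "red"),
]


def color_for_hitcount(hitcount: int) -> str:
    """Return the report color for a callsite's runtime hitcount."""
    if hitcount < 0:
        raise ValueError("hitcount must be non-negative")
    for lower_bound, color in COLOR_BUCKETS:
        if hitcount >= lower_bound:
            return color
    raise AssertionError("unreachable: the 0 bucket matches every valid hitcount")


def truncated_percentage(count: int, total: int) -> float:
    """Share of `count` in `total` as a percentage, cut to one decimal.

    Truncation (rather than rounding) is an assumption that matches the
    numbers printed in this report.
    """
    return math.floor(1000 * count / total) / 10


if __name__ == "__main__":
    # Callsite counts for extract_text_to_fp_fuzzer, taken from the table above.
    print(color_for_hitcount(0), truncated_percentage(535, 902))
    print(color_for_hitcount(75), truncated_percentage(367, 902))
```

In this report every callsite falls into either the red or the lawngreen bucket, so each callsite was either never executed or executed at least 50 times.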

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked Calltree index Parent function Callsite Largest blocked function
217 565 pdfminer.pdffont.PDFFont._parse_bbox call site: 00565 pdfminer.pdffont.PDFCIDFont.__init__
96 171 pdfminer.pdftypes.decompress_corrupted call site: 00171 pdfminer.ccitt.ccittfaxdecode
41 21 pdfminer.converter.PDFConverter._is_binary_stream call site: 00021 pdfminer.converter.HTMLConverter.__init__
35 785 pdfminer.utils.choplist call site: 00785 pdfminer.pdfinterp.PDFResourceManager.get_font
21 880 pdfminer.converter.PDFLayoutAnalyzer.end_page call site: 00880 pdfminer.converter.HTMLConverter.close
19 489 pdfminer.encodingdb.name2unicode call site: 00489 .map
17 821 pdfminer.pdfinterp.PDFResourceManager.get_font call site: 00821 pdfminer.pdfinterp.PDFPageInterpreter.init_resources.get_colorspace
13 270 pdfminer.pdftypes.int_value call site: 00270 pdfminer.utils.apply_tiff_predictor
12 510 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00510 pdfminer.pdftypes.PDFStream.get_data
11 0 EP call site: 00000 pdfminer.high_level.extract_text_to_fp
5 350 pdfminer.psparser.literal_name call site: 00350 pdfminer.pdftypes.int_value
5 474 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00474 pdfminer.encodingdb.EncodingDB.get_encoding
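
Each row above follows the same flat layout: the number of blocked callsites, the calltree index, the parent function, the literal text `call site:` with a zero-padded index, and the largest blocked function. A minimal sketch for parsing such rows into records, for example to sort or filter them outside the report; the `FuzzBlocker` class and `parse_blocker_row` helper are hypothetical names and not part of Fuzz Introspector:

```python
import re
from typing import NamedTuple


class FuzzBlocker(NamedTuple):
    blocked_callsites: int
    calltree_index: int
    parent_function: str
    callsite: int
    largest_blocked_function: str


# Assumes function names contain no whitespace, as in the rows above.
ROW_PATTERN = re.compile(r"^(\d+) (\d+) (\S+) call site: (\d+) (\S+)$")


def parse_blocker_row(row: str) -> FuzzBlocker:
    """Split one flattened fuzz blocker row into its five fields."""
    match = ROW_PATTERN.match(row.strip())
    if match is None:
        raise ValueError(f"not a fuzz blocker row: {row!r}")
    blocked, index, parent, callsite, largest = match.groups()
    return FuzzBlocker(int(blocked), int(index), parent, int(callsite), largest)


if __name__ == "__main__":
    row = (
        "217 565 pdfminer.pdffont.PDFFont._parse_bbox "
        "call site: 00565 pdfminer.pdffont.PDFCIDFont.__init__"
    )
    print(parse_blocker_row(row))
```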

Runtime coverage analysis

Covered functions: 373
Functions that are reachable but not covered: 215
Reachable functions: 318
Percentage of reachable functions covered: 32.39%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so some functions covered at runtime may not be included in it. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted using static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places where coverage instrumentation is used.
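
The reported percentage is consistent with taking the reachable functions that are not flagged as uncovered and dividing by the number of reachable functions: (318 - 215) / 318 is about 32.39%. The sketch below reproduces that arithmetic. The formula is inferred from the numbers in this report rather than taken from Fuzz Introspector's source, and `reachable_coverage_percentage` is a hypothetical name.

```python
def reachable_coverage_percentage(reachable: int, reachable_not_covered: int) -> float:
    """Percentage of statically reachable functions that were covered at runtime.

    Assumed formula: (reachable - reachable_not_covered) / reachable,
    rounded to two decimals as in the report.
    """
    if reachable <= 0:
        raise ValueError("reachable must be positive")
    covered_and_reachable = reachable - reachable_not_covered
    return round(100 * covered_and_reachable / reachable, 2)


if __name__ == "__main__":
    # Figures for extract_text_to_fp_fuzzer from this report.
    print(reachable_coverage_percentage(318, 215))
```

Under this reading, only 103 of the 373 covered functions are among the statically reachable ones; the remainder were observed at runtime but are missing from the static call graph.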
Function name source code lines source lines hit percentage hit

Files reached

filename functions hit
/ 1
...pdfminer.six.fuzzing.extract_text_to_fp_fuzzer 11
pdfminer.high_level 16
pdfminer.image 2
pdfminer.converter 31
pdfminer.pdfdevice 5
pdfminer.pdfinterp 43
pdfminer.pdfpage 29
pdfminer.pdfparser 3
pdfminer.psparser 7
pdfminer.pdfdocument 54
pdfminer.casting 8
pdfminer.pdftypes 31
pdfminer.lzw 12
pdfminer.ascii85 7
pdfminer.runlength 5
pdfminer.ccitt 35
pdfminer.utils 17
pdfminer.layout 6
pdfminer.pdffont 74
pdfminer.encodingdb 17
pdfminer.cmapdb 29

Fuzzer: page_extraction_fuzzer

Call tree

Call tree overview bitmap:

The distribution of callsites by color is:
Color        Runtime hitcount  Callsite count  Percentage
red          0                 491             51.8%
gold         [1:9]             0               0.0%
yellow       [10:29]           0               0.0%
greenyellow  [30:49]           0               0.0%
lawngreen    50+               456             48.1%
All colors                     947             100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked Calltree index Parent function Callsite Largest blocked function
217 498 pdfminer.pdffont.PDFFont._parse_bbox call site: 00498 pdfminer.pdffont.PDFCIDFont.__init__
93 111 pdfminer.pdftypes.decompress_corrupted call site: 00111 pdfminer.ccitt.ccittfaxdecode
34 718 pdfminer.utils.choplist call site: 00718 pdfminer.pdfinterp.PDFResourceManager.get_font
19 422 pdfminer.encodingdb.name2unicode call site: 00422 .map
16 753 pdfminer.pdfinterp.PDFResourceManager.get_font call site: 00753 pdfminer.pdfinterp.PDFPageInterpreter.init_resources.get_colorspace
13 207 pdfminer.pdftypes.int_value call site: 00207 pdfminer.utils.apply_tiff_predictor
12 443 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00443 pdfminer.pdftypes.PDFStream.get_data
10 814 pdfminer.layout.LTLayoutContainer.analyze call site: 00814 obj0.is_voverlap
6 0 EP call site: 00000 pdfminer.high_level.extract_pages
5 286 pdfminer.psparser.literal_name call site: 00286 pdfminer.pdftypes.int_value
5 407 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00407 pdfminer.encodingdb.EncodingDB.get_encoding
5 413 pdfminer.encodingdb.EncodingDB.get_encoding call site: 00413 pdfminer.encodingdb.name2unicode

Runtime coverage analysis

Covered functions: 373
Functions that are reachable but not covered: 225
Reachable functions: 347
Percentage of reachable functions covered: 35.16%
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted using static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places where coverage instrumentation is used.
Function name source code lines source lines hit percentage hit

Files reached

filename functions hit
/ 1
...pdfminer.six.fuzzing.page_extraction_fuzzer 8
pdfminer.high_level 9
pdfminer.layout 68
pdfminer.utils 27
pdfminer.converter 12
pdfminer.pdfinterp 41
pdfminer.pdfpage 29
pdfminer.pdfparser 3
pdfminer.psparser 7
pdfminer.pdfdocument 47
pdfminer.pdftypes 31
pdfminer.lzw 12
pdfminer.ascii85 7
pdfminer.runlength 5
pdfminer.ccitt 32
pdfminer.pdfdevice 4
pdfminer.pdffont 75
pdfminer.encodingdb 17
pdfminer.cmapdb 29
pdfminer.casting 7

Fuzzer: extract_text_fuzzer

Call tree

Call tree overview bitmap:

The distribution of callsites by color is:
Color        Runtime hitcount  Callsite count  Percentage
red          0                 464             55.2%
gold         [1:9]             0               0.0%
yellow       [10:29]           0               0.0%
greenyellow  [30:49]           0               0.0%
lawngreen    50+               376             44.7%
All colors                     840             100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked Calltree index Parent function Callsite Largest blocked function
217 519 pdfminer.pdffont.PDFFont._parse_bbox call site: 00519 pdfminer.pdffont.PDFCIDFont.__init__
93 129 pdfminer.pdftypes.decompress_corrupted call site: 00129 pdfminer.ccitt.ccittfaxdecode
35 739 pdfminer.utils.choplist call site: 00739 pdfminer.pdfinterp.PDFResourceManager.get_font
19 443 pdfminer.encodingdb.name2unicode call site: 00443 .map
17 775 pdfminer.pdfinterp.PDFResourceManager.get_font call site: 00775 pdfminer.pdfinterp.PDFPageInterpreter.init_resources.get_colorspace
13 225 pdfminer.pdftypes.int_value call site: 00225 pdfminer.utils.apply_tiff_predictor
12 464 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00464 pdfminer.pdftypes.PDFStream.get_data
5 304 pdfminer.psparser.literal_name call site: 00304 pdfminer.pdftypes.int_value
5 428 pdfminer.pdffont.PDFSimpleFont.__init__ call site: 00428 pdfminer.encodingdb.EncodingDB.get_encoding
5 434 pdfminer.encodingdb.EncodingDB.get_encoding call site: 00434 pdfminer.encodingdb.name2unicode
4 254 pdfminer.utils.apply_png_predictor call site: 00254 .enumerate
4 374 pdfminer.pdfinterp.PDFPageInterpreter.process_page call site: 00374 pdfminer.pdfdevice.TagExtractor._write

Runtime coverage analysis

Covered functions: 373
Functions that are reachable but not covered: 195
Reachable functions: 294
Percentage of reachable functions covered: 33.67%
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted using static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places where coverage instrumentation is used.
Function name source code lines source lines hit percentage hit

Files reached

filename functions hit
/ 1
...pdfminer.six.fuzzing.extract_text_fuzzer 7
pdfminer.high_level 10
pdfminer.layout 8
pdfminer.utils 21
pdfminer.converter 11
pdfminer.pdfinterp 43
pdfminer.pdfpage 29
pdfminer.pdfparser 3
pdfminer.psparser 7
pdfminer.pdfdocument 54
pdfminer.casting 8
pdfminer.pdftypes 31
pdfminer.lzw 12
pdfminer.ascii85 7
pdfminer.runlength 5
pdfminer.ccitt 32
pdfminer.pdfdevice 4
pdfminer.pdffont 74
pdfminer.encodingdb 17
pdfminer.cmapdb 29

Analyses and suggestions

Optimal target analysis

Remaining optimal interesting functions

The following table lists functions that are optimal targets. Optimal targets are identified by finding the functions that, in combination, yield high code coverage.

Func name Functions filename Arg count Args Function depth hitcount instr count bb count cyclomatic complexity Reachable functions Incoming references total cyclomatic complexity Unreached complexity
pdfminer.converter.HTMLConverter.receive_layout.render pdfminer.converter 1 ['N/A'] 6 0 24 15 9 75 2 244 196
pdfminer.psparser.PSStackParser.nextobject pdfminer.psparser 1 ['N/A'] 3 0 15 13 8 48 0 167 127
pdfminer.pdfdocument.PDFStandardSecurityHandlerV5.authenticate pdfminer.pdfdocument 2 ['N/A', 'N/A'] 3 0 0 2 4 26 0 85 82
pdfminer.pdfinterp.PDFPageInterpreter.do_TJ pdfminer.pdfinterp 2 ['N/A', 'N/A'] 4 0 2 2 4 32 3 106 67
pdfminer.pdfinterp.PDFPageInterpreter.do_Do pdfminer.pdfinterp 2 ['N/A', 'N/A'] 6 0 8 3 4 41 0 131 52
pdfminer.cmapdb.CMapParser.do_keyword pdfminer.cmapdb 3 ['N/A', 'N/A', 'N/A'] 4 0 27 29 15 51 0 179 42
pdfminer.pdfdocument.PDFDocument.getobj pdfminer.pdfdocument 2 ['N/A', 'N/A'] 5 0 4 6 5 107 2 367 40
pdfminer.pdfdocument.PageLabels.labels pdfminer.pdfdocument 1 ['N/A'] 3 0 2 3 4 24 0 82 38

Implementing fuzzers that target the above functions will improve reachability such that it becomes:

Functions statically reachable by fuzzers: 48.0% (292 / 611)
Cyclomatic complexity statically reachable by fuzzers: 51.0% (1068 / 2111)

All functions overview

If you implement fuzzers for these functions, the status of all functions in the project will be:

Func name Functions filename Args Function call depth Reached by Fuzzers Runtime reached by Fuzzers Combined reached by Fuzzers Fuzzers runtime hit Func lines hit % I Count BB Count Cyclomatic complexity Functions reached Reached by functions Accumulated cyclomatic complexity Undiscovered complexity

Fuzz engine guidance

This section provides heuristics that can be used as input to a fuzz engine when running a given fuzz target. The current focus is on providing input that is usable by libFuzzer.

fuzzing/extract_text_to_fp_fuzzer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name

-focus_function=['pdfminer.pdffont.PDFFont._parse_bbox', 'pdfminer.pdftypes.decompress_corrupted', 'pdfminer.converter.PDFConverter._is_binary_stream', 'pdfminer.utils.choplist', 'pdfminer.converter.PDFLayoutAnalyzer.end_page', 'pdfminer.encodingdb.name2unicode', 'pdfminer.pdfinterp.PDFResourceManager.get_font', 'pdfminer.pdftypes.int_value', 'pdfminer.pdffont.PDFSimpleFont.__init__']
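
libFuzzer's -focus_function flag takes a single function name, so a list like the one above has to be split into one flag value per run. A minimal sketch, assuming the list is available as the Python-literal string shown above; `focus_function_flags` is a hypothetical helper and not part of Fuzz Introspector or libFuzzer. Note that -focus_function is an experimental libFuzzer option, and this report does not establish whether it has any effect on Python function names under Atheris.

```python
import ast
from typing import List


def focus_function_flags(list_literal: str) -> List[str]:
    """Turn a Python-literal list of names into one libFuzzer flag per name."""
    names = ast.literal_eval(list_literal)
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        raise ValueError("expected a list of function names")
    return [f"-focus_function={name}" for name in names]


if __name__ == "__main__":
    # First two entries of the list reported for extract_text_to_fp_fuzzer.
    reported = (
        "['pdfminer.pdffont.PDFFont._parse_bbox', "
        "'pdfminer.pdftypes.decompress_corrupted']"
    )
    for flag in focus_function_flags(reported):
        print(flag)
```

Each resulting flag would be passed to a separate fuzzing run, since only one focus function applies at a time.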

fuzzing/page_extraction_fuzzer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name

-focus_function=['pdfminer.pdffont.PDFFont._parse_bbox', 'pdfminer.pdftypes.decompress_corrupted', 'pdfminer.utils.choplist', 'pdfminer.encodingdb.name2unicode', 'pdfminer.pdfinterp.PDFResourceManager.get_font', 'pdfminer.pdftypes.int_value', 'pdfminer.pdffont.PDFSimpleFont.__init__', 'pdfminer.layout.LTLayoutContainer.analyze', 'pdfminer.psparser.literal_name']

fuzzing/extract_text_fuzzer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name

-focus_function=['pdfminer.pdffont.PDFFont._parse_bbox', 'pdfminer.pdftypes.decompress_corrupted', 'pdfminer.utils.choplist', 'pdfminer.encodingdb.name2unicode', 'pdfminer.pdfinterp.PDFResourceManager.get_font', 'pdfminer.pdftypes.int_value', 'pdfminer.pdffont.PDFSimpleFont.__init__', 'pdfminer.psparser.literal_name', 'pdfminer.encodingdb.EncodingDB.get_encoding']

Files and Directories in report

This section shows which files and directories are considered in this report. It is shown mainly because Fuzz Introspector may include more code in its analysis than is desired. The section helps identify whether too many files or directories are included, e.g. third-party code that may be irrelevant to the threat model. If too much is included, Fuzz Introspector supports a configuration file that can exclude data from the report. See the Fuzz Introspector documentation for more information on how to create a config file.

Files in report

Source file Reached by Covered by
[] []
pdfminer.encodingdb ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.pdfexceptions [] []
pdfminer.runlength ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
zlib [] []
binascii [] []
struct [] []
pdfminer.utils ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.layout ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.cmapdb ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
unittest [] []
pdfminer.ccitt ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
warnings [] []
itertools [] []
...pdfminer.six.fuzzing.page_extraction_fuzzer ['page_extraction_fuzzer'] []
pdfminer.psexceptions [] []
pdfminer.pdfcolor [] []
collections [] []
logging [] []
unicodedata [] []
pdfminer.lzw ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
base64 [] []
stringprep [] []
os [] []
gzip [] []
PIL [] []
pdfminer.pdfinterp ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
heapq [] []
hashlib [] []
io [] []
pdfminer.ascii85 ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer._saslprep [] []
pdfminer.pdfparser ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.pdftypes ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.pdffont ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
charset_normalizer [] []
pdfminer.converter ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.high_level ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
atheris [] []
typing [] []
pdfminer.pdfdocument ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.pdfdevice ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.psparser ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer [] []
array [] []
pdfminer.fontmetrics [] []
html [] []
contextlib [] []
pygame [] []
[] []
json [] []
...pdfminer.six.fuzzing.extract_text_to_fp_fuzzer ['extract_text_to_fp_fuzzer'] []
pdfminer.casting ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
pdfminer.arcfour [] []
pdfminer.data_structures [] []
pdfminer.glyphlist [] []
math [] []
pdfminer.image ['extract_text_to_fp_fuzzer'] []
pdfminer.settings [] []
pdfminer.latin_enc [] []
pdfminer.jbig2 [] []
importlib [] []
...pdfminer.six.fuzzing.extract_text_fuzzer ['extract_text_fuzzer'] []
re [] []
pdfminer.pdfpage ['extract_text_to_fp_fuzzer', 'page_extraction_fuzzer', 'extract_text_fuzzer'] []
cryptography [] []
fuzzing [] []

Directories in report

Directory