Fuzz introspector
For issues and ideas: https://github.com/ossf/fuzz-introspector/issues

Project functions overview

The following table shows data about each function in the project. The functions included in this table correspond to all functions that exist in the executables of the fuzzers. As such, there may be functions that are from third-party libraries.

For further technical details on the meaning of the columns in the table below, please see the Glossary.

Func name Functions filename Args Function call depth Reached by Fuzzers Runtime reached by Fuzzers Combined reached by Fuzzers Fuzzers runtime hit Func lines hit % I Count BB Count Cyclomatic complexity Functions reached Reached by functions Accumulated cyclomatic complexity Undiscovered complexity

Fuzzer details

Fuzzer: sanitize_fuzzer

Call tree

The calltree shows the control flow of the fuzzer. It is overlaid with coverage information to show how much of the code the fuzzer can potentially reach is actually covered at runtime. This section links to a detailed calltree visualisation and includes a bitmap giving a high-level view of the calltree. For further information, please see the glossary entries for full calltree and calltree overview.

Call tree overview bitmap:

The distribution of callsites in terms of coloring is
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 553 | 50.2%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 548 | 49.7%
All colors | - | 1101 | 100%
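
The colour buckets in the table above can be expressed as a small lookup. This is a minimal sketch that assumes the ranges are inclusive integer bounds exactly as printed; the function names are illustrative and are not part of fuzz introspector.

```python
from collections import Counter


def callsite_color(hit_count: int) -> str:
    """Map a runtime hit count to the colour bucket listed in the table."""
    if hit_count < 0:
        raise ValueError("hit count cannot be negative")
    if hit_count == 0:
        return "red"          # never executed at runtime
    if hit_count <= 9:
        return "gold"         # [1:9]
    if hit_count <= 29:
        return "yellow"       # [10:29]
    if hit_count <= 49:
        return "greenyellow"  # [30:49]
    return "lawngreen"        # 50+


def color_counts(hit_counts):
    """Count how many callsites fall into each colour bucket."""
    return Counter(callsite_color(h) for h in hit_counts)
```

For a fuzzer with 553 callsites that were never hit and 548 that were hit at least 50 times, color_counts returns the same two non-zero callsite counts as the table.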

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
69 | 61 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.__init__ | call site: 00061 | bleach._vendor.html5lib._inputstream.HTMLBinaryInputStream.__init__
54 | 863 | bleach.html5lib_shim.match_entity | call site: 00863 | bleach._vendor.html5lib.filters.alphabeticalattributes.Filter.__iter__
50 | 996 | bleach.sanitizer.BleachSanitizerFilter.sanitize_stream | call site: 00996 | bleach._vendor.html5lib.filters.optionaltags.Filter.__iter__
31 | 274 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.entityDataState | call site: 00274 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.consumeNumberEntity
27 | 920 | bleach.sanitizer.BleachSanitizerFilter.merge_characters | call site: 00920 | bleach._vendor.html5lib.filters.alphabeticalattributes.Filter.__iter__
19 | 309 | bleach._vendor.html5lib._trie.py.Trie.has_keys_with_prefix | call site: 00309 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.char
15 | 838 | bleach.sanitizer.Cleaner.clean | call site: 00838 | bleach._vendor.html5lib.serializer.HTMLSerializer.serialize
14 | 155 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk | call site: 00155 | bleach._vendor.html5lib._utils.isSurrogatePair
14 | 445 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__ | call site: 00445 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.charsUntil
13 | 46 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.__init__ | call site: 00046 | bleach._vendor.html5lib._inputstream.BufferedStream.seek
11 | 973 | bleach.sanitizer.BleachSanitizerFilter.sanitize_uri_value | call site: 00973 | normalized_uri.split
10 | 505 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__ | call site: 00505 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.charsUntil
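
Several rows in the table above share a parent function, so it can help to collapse the blockers into a list of distinct parents. The sketch below does this for the sanitize_fuzzer rows. Limiting it to the first ten rows is an assumption inferred from the "Fuzzer function priority" list later in this report, which the result happens to match; it is not documented behaviour of the tool, and the helper name is illustrative.

```python
# (callsites blocked, parent function) pairs copied from the table, in table order.
BLOCKERS = [
    (69, "bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.__init__"),
    (54, "bleach.html5lib_shim.match_entity"),
    (50, "bleach.sanitizer.BleachSanitizerFilter.sanitize_stream"),
    (31, "bleach._vendor.html5lib._tokenizer.HTMLTokenizer.entityDataState"),
    (27, "bleach.sanitizer.BleachSanitizerFilter.merge_characters"),
    (19, "bleach._vendor.html5lib._trie.py.Trie.has_keys_with_prefix"),
    (15, "bleach.sanitizer.Cleaner.clean"),
    (14, "bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk"),
    (14, "bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__"),
    (13, "bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.__init__"),
    (11, "bleach.sanitizer.BleachSanitizerFilter.sanitize_uri_value"),
    (10, "bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__"),
]


def unique_parents(blockers, top_n=10):
    """Return the parent functions of the first top_n rows, keeping first occurrences only."""
    seen = set()
    ordered = []
    for _blocked, parent in blockers[:top_n]:
        if parent not in seen:
            seen.add(parent)
            ordered.append(parent)
    return ordered
```

With the default top_n, unique_parents(BLOCKERS) yields nine names, starting with HTMLUnicodeInputStream.__init__ and ending with HTMLTokenizer.__iter__.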

Runtime coverage analysis

Covered functions: 287
Functions that are reachable but not covered: 195
Reachable functions: 296
Percentage of reachable functions covered: 34.12%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so at runtime some functions may be covered that the reachability analysis did not include. This is a limitation of our static analysis capabilities.
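
The percentage above is consistent with dividing the reachable functions that were covered (reachable minus reachable-but-not-covered) by the reachable functions. This is an assumption that reproduces the reported figure, not a confirmed description of the tool's internals, and the function name is illustrative.

```python
def reachable_covered_percentage(reachable: int, reachable_not_covered: int) -> float:
    """Percentage of statically reachable functions that were also covered at runtime."""
    if reachable <= 0:
        raise ValueError("reachable must be positive")
    covered_and_reachable = reachable - reachable_not_covered
    return round(100.0 * covered_and_reachable / reachable, 2)


# sanitize_fuzzer: 296 reachable functions, 195 of them not covered.
print(reachable_covered_percentage(296, 195))  # 34.12
```

Note that the "Covered functions" count does not enter this calculation, which is why it can differ from reachable minus reachable-but-not-covered.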
Function name source code lines source lines hit percentage hit

Files reached

filename functions hit
/ 1
...sanitize_fuzzer 3
bleach 2
bleach.sanitizer 51
bleach.html5lib_shim 33
bleach._vendor.html5lib.serializer 29
bleach._vendor.html5lib.html5parser 22
bleach._vendor.html5lib._tokenizer 95
bleach._vendor.html5lib._inputstream 62
bleach._vendor.html5lib._utils 2
bleach._vendor.html5lib._trie.py 3
bleach._vendor.html5lib._trie._base 2
bleach._vendor.html5lib.filters.alphabeticalattributes 11
bleach.linkifier 29
bleach._vendor.html5lib.filters.sanitizer 10
bleach._vendor.html5lib.filters.whitespace 9
bleach._vendor.html5lib.filters.optionaltags 5
bleach._vendor.html5lib.filters.inject_meta_charset 14

Fuzzer: linkify_fuzzer

Call tree

The calltree shows the control flow of the fuzzer. It is overlaid with coverage information to show how much of the code the fuzzer can potentially reach is actually covered at runtime. This section links to a detailed calltree visualisation and includes a bitmap giving a high-level view of the calltree. For further information, please see the glossary entries for full calltree and calltree overview.

Call tree overview bitmap:

The distribution of callsites in terms of coloring is
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 518 | 50.3%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 511 | 49.6%
All colors | - | 1029 | 100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
93 | 881 | bleach.sanitizer.BleachSanitizerFilter.sanitize_stream | call site: 00881 | bleach._vendor.html5lib.filters.whitespace.Filter.__iter__
45 | 757 | bleach.html5lib_shim.match_entity | call site: 00757 | bleach.linkifier.LinkifyFilter.extract_entities
31 | 310 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.processEntityInAttribute | call site: 00310 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.consumeNumberEntity
20 | 727 | bleach._vendor.html5lib.html5parser.HTMLParser.mainLoop | call site: 00727 | bleach._vendor.html5lib.html5parser.HTMLParser.mainLoop
19 | 345 | bleach._vendor.html5lib._trie.py.Trie.has_keys_with_prefix | call site: 00345 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.char
17 | 815 | bleach.sanitizer.BleachSanitizerFilter.merge_characters | call site: 00815 | bleach._vendor.html5lib.filters.inject_meta_charset.Filter.__iter__
15 | 47 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk | call site: 00047 | bleach._vendor.html5lib._inputstream.BufferedStream._readStream
14 | 65 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk | call site: 00065 | bleach._vendor.html5lib._utils.isSurrogatePair
14 | 510 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__ | call site: 00510 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.charsUntil
11 | 858 | bleach.sanitizer.BleachSanitizerFilter.sanitize_uri_value | call site: 00858 | normalized_uri.split
10 | 231 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__ | call site: 00231 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.charsUntil
9 | 557 | bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__ | call site: 00557 | bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.char

Runtime coverage analysis

Covered functions: 287
Functions that are reachable but not covered: 183
Reachable functions: 272
Percentage of reachable functions covered: 32.72%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so at runtime some functions may be covered that the reachability analysis did not include. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted by static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places where coverage instrumentation is used.
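
The condition behind this warning is a plain comparison of two of the counts reported for each fuzzer. The sketch below applies it to the figures in this report; the helper name and the dictionary layout are illustrative and are not taken from fuzz introspector.

```python
# (covered functions, reachable functions) as reported for each fuzzer.
REPORTED = {
    "sanitize_fuzzer": (287, 296),
    "linkify_fuzzer": (287, 272),
}


def fuzzers_with_coverage_warning(reported):
    """Return the fuzzers whose covered-function count exceeds the reachable count."""
    return [name for name, (covered, reachable) in reported.items() if covered > reachable]


print(fuzzers_with_coverage_warning(REPORTED))  # ['linkify_fuzzer']
```

Only linkify_fuzzer meets the condition (287 covered against 272 reachable), which matches the warning appearing in this section and not in the sanitize_fuzzer section.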
Function name source code lines source lines hit percentage hit

Files reached

filename functions hit
/ 1
...linkify_fuzzer 3
bleach 2
bleach.linkifier 37
bleach.html5lib_shim 33
bleach._vendor.html5lib.serializer 29
bleach._vendor.html5lib.html5parser 18
bleach._vendor.html5lib._tokenizer 93
bleach._vendor.html5lib._inputstream 25
bleach._vendor.html5lib._utils 2
bleach._vendor.html5lib._trie.py 3
bleach._vendor.html5lib._trie._base 2
bleach._vendor.html5lib.filters.inject_meta_charset 14
bleach._vendor.html5lib.filters.base 1
bleach._vendor.html5lib.filters.alphabeticalattributes 11
bleach._vendor.html5lib.filters.optionaltags 5
bleach.sanitizer 39
bleach._vendor.html5lib.filters.whitespace 9
bleach._vendor.html5lib.filters.sanitizer 37

Analyses and suggestions

Optimal target analysis

Remaining optimal interesting functions

The following table lists functions that are optimal targets. Optimal targets are the functions that, in combination, yield high code coverage.

Func name | Functions filename | Arg count | Args | Function depth | hitcount | instr count | bb count | cyclomatic complexity | Reachable functions | Incoming references | total cyclomatic complexity | Unreached complexity
bleach._vendor.parse.urljoin | bleach._vendor.parse | 3 | ['N/A', 'N/A', 'N/A'] | 5 | 0 | 4 | 12 | 8 | 36 | 0 | 126 | 114
bleach._vendor.html5lib.treebuilders.etree_lxml.TreeBuilder.insertRoot | bleach._vendor.html5lib.treebuilders.etree_lxml | 2 | ['N/A', 'N/A'] | 2 | 0 | 5 | 7 | 6 | 18 | 0 | 60 | 54
bleach._vendor.html5lib.treewalkers.base.NonRecursiveTreeWalker.__iter__ | bleach._vendor.html5lib.treewalkers.base | 1 | ['N/A'] | 2 | 0 | 8 | 13 | 8 | 16 | 0 | 57 | 54
bleach._vendor.parse.urlencode | bleach._vendor.parse | 6 | ['N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A'] | 5 | 0 | 6 | 9 | 7 | 19 | 0 | 69 | 54
bleach._vendor.html5lib.html5parser.getPhases.InBodyPhase.startTagA | bleach._vendor.html5lib.html5parser | 2 | ['N/A', 'N/A'] | 2 | 0 | 6 | 3 | 4 | 16 | 0 | 59 | 53
bleach._vendor.html5lib.treebuilders.etree_lxml.testSerializer.serializeElement | bleach._vendor.html5lib.treebuilders.etree_lxml | 2 | ['N/A', 'N/A'] | 2 | 0 | 15 | 12 | 8 | 15 | 2 | 58 | 46
bleach._vendor.html5lib._inputstream.EncodingParser.handleMeta | bleach._vendor.html5lib._inputstream | 1 | ['N/A'] | 3 | 0 | 0 | 10 | 7 | 18 | 0 | 70 | 45
bleach._vendor.html5lib._utils.moduleFactoryFactory.moduleFactory | bleach._vendor.html5lib._utils | 3 | ['N/A', 'N/A', 'N/A'] | 2 | 0 | 1 | 4 | 5 | 14 | 0 | 47 | 38
bleach._vendor.parse.parse_qsl | bleach._vendor.parse | 7 | ['N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A'] | 3 | 0 | 3 | 8 | 6 | 18 | 1 | 64 | 36

Implementing fuzzers that target the above functions will improve reachability such that it becomes:

Functions statically reachable by fuzzers: 28.9% (209 / 731)
Cyclomatic complexity statically reachable by fuzzers: 34.0% (888 / 2595)

All functions overview

If you implement fuzzers for these functions, the status of all functions in the project will be:

Func name Functions filename Args Function call depth Reached by Fuzzers Runtime reached by Fuzzers Combined reached by Fuzzers Fuzzers runtime hit Func lines hit % I Count BB Count Cyclomatic complexity Functions reached Reached by functions Accumulated cyclomatic complexity Undiscovered complexity

Fuzz engine guidance

This section provides heuristics that can be used as input to a fuzz engine when running a given fuzz target. The current focus is on providing input that is usable by libFuzzer.

/src/sanitize_fuzzer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag
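
No dictionary entries are listed for this fuzzer. If you want to supply your own, the sketch below writes tokens in the quoted-string form that libFuzzer dictionaries accept, escaping backslashes and double quotes and hex-encoding non-printable bytes. The token list is hypothetical and chosen only because the target parses HTML; the function names are illustrative.

```python
def format_dict_entry(token: bytes) -> str:
    """Render one token as a quoted libFuzzer dictionary entry."""
    out = []
    for byte in token:
        ch = chr(byte)
        if ch in ("\\", '"'):
            out.append("\\" + ch)         # escape backslash and double quote
        elif 0x20 <= byte <= 0x7E:
            out.append(ch)                # printable ASCII is written as-is
        else:
            out.append(f"\\x{byte:02X}")  # everything else becomes a \xHH escape
    return '"' + "".join(out) + '"'


def write_dictionary(path: str, tokens) -> None:
    """Write one dictionary entry per line to the given path."""
    with open(path, "w", encoding="ascii") as handle:
        for token in tokens:
            handle.write(format_dict_entry(token) + "\n")


# Hypothetical HTML-flavoured tokens; not produced by this report.
HYPOTHETICAL_TOKENS = [b'<a href="', b"&amp;", b"javascript:", b"<script>", b"\x00"]
```

Calling write_dictionary("DICT.file", HYPOTHETICAL_TOKENS) produces a file that can be passed with -dict=DICT.file.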


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name, where name is a single function from the list below:

-focus_function=['bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.__init__', 'bleach.html5lib_shim.match_entity', 'bleach.sanitizer.BleachSanitizerFilter.sanitize_stream', 'bleach._vendor.html5lib._tokenizer.HTMLTokenizer.entityDataState', 'bleach.sanitizer.BleachSanitizerFilter.merge_characters', 'bleach._vendor.html5lib._trie.py.Trie.has_keys_with_prefix', 'bleach.sanitizer.Cleaner.clean', 'bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk', 'bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__']
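
The line above is printed as a Python-style list, whereas the flag takes a single function name. The sketch below turns such a line into one candidate flag per function. The helper name is illustrative, and whether libFuzzer's focus logic recognises Python qualified names when run through Atheris is not stated in this report, so treat the output as candidates to try.

```python
import ast
from typing import List


def focus_flags(report_line: str) -> List[str]:
    """Expand '-focus_function=[...]' into one '-focus_function=name' string per entry."""
    prefix = "-focus_function="
    if not report_line.startswith(prefix):
        raise ValueError("expected a line starting with -focus_function=")
    # The remainder is a list literal of quoted names, so literal_eval parses it safely.
    names = ast.literal_eval(report_line[len(prefix):])
    return [prefix + name for name in names]
```

For example, focus_flags("-focus_function=['bleach.sanitizer.Cleaner.clean']") returns a one-element list containing -focus_function=bleach.sanitizer.Cleaner.clean.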

/src/linkify_fuzzer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name, where name is a single function from the list below:

-focus_function=['bleach.sanitizer.BleachSanitizerFilter.sanitize_stream', 'bleach.html5lib_shim.match_entity', 'bleach._vendor.html5lib._tokenizer.HTMLTokenizer.processEntityInAttribute', 'bleach._vendor.html5lib.html5parser.HTMLParser.mainLoop', 'bleach._vendor.html5lib._trie.py.Trie.has_keys_with_prefix', 'bleach.sanitizer.BleachSanitizerFilter.merge_characters', 'bleach._vendor.html5lib._inputstream.HTMLUnicodeInputStream.readChunk', 'bleach._vendor.html5lib._tokenizer.HTMLTokenizer.__iter__', 'bleach.sanitizer.BleachSanitizerFilter.sanitize_uri_value']

Files and Directories in report

This section shows which files and directories are considered in this report. It is included because fuzz introspector may take more code into account than is desired. Use it to check whether too many files or directories are included, e.g. third-party code that is irrelevant for the threat model. If too much is included, fuzz introspector supports a configuration file that can exclude data from the report. See the linked documentation for more information on how to create a config file.

Files in report

Source file Reached by Covered by
[] []
bleach._vendor.html5lib.filters.base ['linkify_fuzzer'] []
bleach._vendor.html5lib._ihatexml [] []
bleach._vendor.html5lib.treebuilders [] []
bleach.six_shim [] []
bleach._vendor.html5lib.filters.optionaltags ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach._vendor.html5lib.filters.alphabeticalattributes ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach._vendor.html5lib.treewalkers.etree [] []
bleach._vendor.parse [] []
bleach._vendor.html5lib.treewalkers.dom [] []
bleach.html5lib_shim ['sanitize_fuzzer', 'linkify_fuzzer'] []
...sanitize_fuzzer ['sanitize_fuzzer'] []
webencodings [] []
bleach._vendor.html5lib.filters.whitespace ['sanitize_fuzzer', 'linkify_fuzzer'] []
lxml [] []
itertools [] []
genshi [] []
bleach.linkifier ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach ['sanitize_fuzzer', 'linkify_fuzzer'] []
sys [] []
bleach.sanitizer ['sanitize_fuzzer', 'linkify_fuzzer'] []
unicodedata [] []
bleach._vendor.html5lib.filters.lint [] []
bleach._vendor.html5lib._tokenizer ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach._vendor.html5lib [] []
bleach._vendor.html5lib.treewalkers [] []
bleach._vendor.html5lib.filters.inject_meta_charset ['sanitize_fuzzer', 'linkify_fuzzer'] []
weakref [] []
bleach._vendor.html5lib.treewalkers.base [] []
bleach._vendor.html5lib.treewalkers.genshi [] []
bleach.css_sanitizer [] []
bleach._vendor.html5lib.treeadapters.sax [] []
bleach._vendor.html5lib._trie._base ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach._vendor.html5lib.html5parser ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach._vendor.html5lib._trie [] []
urllib [] []
bisect [] []
tinycss2 [] []
io [] []
bleach._vendor.html5lib.serializer ['sanitize_fuzzer', 'linkify_fuzzer'] []
warnings [] []
operator [] []
bleach._vendor.html5lib.filters [] []
re [] []
types [] []
bleach._vendor.html5lib._trie.py ['sanitize_fuzzer', 'linkify_fuzzer'] []
codecs [] []
bleach._vendor.html5lib.treebuilders.base [] []
bleach._vendor.html5lib.treebuilders.dom [] []
bleach._vendor.html5lib.constants [] []
xml [] []
bleach._vendor.html5lib.treebuilders.etree_lxml [] []
copy [] []
bleach._vendor.html5lib.treeadapters.genshi [] []
bleach.callbacks [] []
...linkify_fuzzer ['linkify_fuzzer'] []
bleach._vendor.html5lib.treewalkers.etree_lxml [] []
bleach._vendor [] []
collections [] []
bleach._vendor.html5lib.treeadapters [] []
atheris [] []
bleach._vendor.html5lib._utils ['sanitize_fuzzer', 'linkify_fuzzer'] []
bleach.parse_shim [] []
bleach._vendor.html5lib.treebuilders.etree [] []
bleach._vendor.html5lib.filters.sanitizer ['sanitize_fuzzer', 'linkify_fuzzer'] []
[] []
chardet [] []
bleach._vendor.html5lib._inputstream ['sanitize_fuzzer', 'linkify_fuzzer'] []

Directories in report

Directory