Fuzz introspector
For issues and ideas: https://github.com/ossf/fuzz-introspector/issues
Report generation date: 2025-07-11

Project overview: sacremoses

High level conclusions

Reachability and coverage overview

Functions statically reachable by fuzzers: 25.0% (18 / 71)
Cyclomatic complexity statically reachable by fuzzers: 28.9% (71 / 243)
Runtime code coverage of functions: 34.0% (24 / 71)

Warning: The number of functions covered at runtime is larger than the number of statically reachable functions. Anything covered at runtime is by definition reachable by the fuzzers, so the discrepancy is most likely a limitation of the static analysis.
In this case the count of functions covered at runtime is the true value, and it is what should be considered "achieved" by the fuzzers.
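
The check behind this warning can be sketched in a few lines. This is a minimal illustration and not Fuzz Introspector's own code: the helper name `compare_reachability` and its result keys are hypothetical, and the only inputs are the two counts reported above (18 statically reachable functions, 24 functions covered at runtime).

```python
def compare_reachability(static_reachable: int, runtime_covered: int) -> dict:
    """Compare a static reachability count with a runtime coverage count.

    Hypothetical helper for illustration; not part of Fuzz Introspector.
    """
    if static_reachable < 0 or runtime_covered < 0:
        raise ValueError("counts must be non-negative")
    # Everything covered at runtime is reachable by definition, so a runtime
    # count above the static count means the static analysis missed functions.
    missed_at_least = max(0, runtime_covered - static_reachable)
    return {
        "static_underapproximates": missed_at_least > 0,
        "missed_by_static_at_least": missed_at_least,
        # The runtime count is what the fuzzers actually achieved.
        "achieved": runtime_covered,
    }


# Counts taken from the overview above.
result = compare_reachability(static_reachable=18, runtime_covered=24)
print(result)
```

With the figures from this report, the difference is only a lower bound: at least 6 of the covered functions are absent from the static call graph.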

Use the project functions table below to query all functions that were not covered at runtime.

Project functions overview

The following table shows data about each function in the project. The functions included in this table correspond to all functions that exist in the executables of the fuzzers. As such, there may be functions that are from third-party libraries.

For further technical details on the meaning of the columns in the table below, please see the Glossary.

Func name | Functions filename | Args | Function call depth | Reached by Fuzzers | Runtime reached by Fuzzers | Combined reached by Fuzzers | Fuzzers runtime hit | Func lines hit % | I Count | BB Count | Cyclomatic complexity | Functions reached | Reached by functions | Accumulated cyclomatic complexity | Undiscovered complexity

Fuzzer details

Fuzzer: fuzz_normalizer

Call tree

The calltree shows the control flow of the fuzzer. It is overlaid with coverage information to show how much of the code a fuzzer can potentially reach is in fact covered at runtime. The report provides a link to a detailed calltree visualisation as well as a bitmap giving a high-level view of the calltree. For further information about these topics, please see the glossary entries for full calltree and calltree overview.

Call tree overview bitmap:

The distribution of callsites in terms of coloring is:
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 8 | 44.4%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 10 | 55.5%
All colors | | 18 | 100%
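
The bucketing in the table above can be sketched as follows. This is an illustration under stated assumptions rather than the tool's implementation: the function names are hypothetical, the bucket boundaries are copied from the table, the per-callsite hit counts in the example are invented (the report only gives bucket totals), and truncating percentages to one decimal is an inference from the reported figures.

```python
from collections import Counter

# (lowest hit count in the bucket, colour), copied from the table above.
COLOR_BUCKETS = [
    (50, "lawngreen"),
    (30, "greenyellow"),
    (10, "yellow"),
    (1, "gold"),
    (0, "red"),
]


def color_for_hitcount(hitcount: int) -> str:
    """Map a callsite's runtime hit count to its calltree colour."""
    if hitcount < 0:
        raise ValueError("hit count must be non-negative")
    for lower_bound, color in COLOR_BUCKETS:
        if hitcount >= lower_bound:
            return color
    raise AssertionError("unreachable: the red bucket starts at 0")


def color_distribution(hitcounts: list) -> dict:
    """Return {colour: (callsite count, percentage)} for a list of hit counts.

    Percentages are truncated to one decimal, which is an assumption that
    happens to match the figures printed in this report.
    """
    total = len(hitcounts)
    if total == 0:
        raise ValueError("at least one callsite is required")
    counts = Counter(color_for_hitcount(h) for h in hitcounts)
    return {
        color: (counts[color], (counts[color] * 1000 // total) / 10)
        for _, color in COLOR_BUCKETS
    }


# Invented hit counts that fall into the reported buckets:
# 8 callsites never hit, 10 callsites hit at least 50 times.
example_hitcounts = [0] * 8 + [75] * 10
distribution = color_distribution(example_hitcounts)
print(distribution)
```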

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
4 | 8 | sacremoses.normalize.MosesPunctNormalizer.__init__ | call site: 00008 | sacremoses.normalize.MosesPunctNormalizer.normalize
2 | 3 | ...fuzz_normalizer.TestOneInput | call site: 00003 | fdp.ConsumeIntInRange
2 | 14 | sacremoses.normalize.MosesPunctNormalizer.normalize | call site: 00014 | sacremoses.normalize.MosesPunctNormalizer.remove_control_chars
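
To triage blockers programmatically, the rows above can be read into records and ranked by the number of callsites they block. The sketch below is one possible way to do that, not a Fuzz Introspector API: the `FuzzBlocker` type and `parse_blocker_row` function are hypothetical names, and the parser assumes each row has the five fields shown in the header, optionally separated by `|`.

```python
import re
from typing import NamedTuple


class FuzzBlocker(NamedTuple):
    """One row of the fuzz blockers table (hypothetical record type)."""
    callsites_blocked: int
    calltree_index: int
    parent_function: str
    callsite: str
    largest_blocked_function: str


ROW_PATTERN = re.compile(
    r"^(\d+)\s+(\d+)\s+(\S+)\s+call site:\s+(\d+)\s+(\S+)$"
)


def parse_blocker_row(row: str) -> FuzzBlocker:
    """Parse a single table row; raises ValueError on unexpected input."""
    # Treat optional "|" column separators as plain whitespace.
    normalized = " ".join(row.replace("|", " ").split())
    match = ROW_PATTERN.match(normalized)
    if match is None:
        raise ValueError(f"unrecognised fuzz blocker row: {row!r}")
    blocked, index, parent, callsite, largest = match.groups()
    return FuzzBlocker(int(blocked), int(index), parent, callsite, largest)


# Rows copied from the fuzz_normalizer table above.
ROWS = [
    "4 8 sacremoses.normalize.MosesPunctNormalizer.__init__ call site: 00008 "
    "sacremoses.normalize.MosesPunctNormalizer.normalize",
    "2 3 ...fuzz_normalizer.TestOneInput call site: 00003 fdp.ConsumeIntInRange",
    "2 14 sacremoses.normalize.MosesPunctNormalizer.normalize call site: 00014 "
    "sacremoses.normalize.MosesPunctNormalizer.remove_control_chars",
]

# Largest blockers first; ties keep their original order.
blockers = sorted(
    (parse_blocker_row(row) for row in ROWS),
    key=lambda blocker: blocker.callsites_blocked,
    reverse=True,
)
for blocker in blockers:
    print(blocker.callsites_blocked, blocker.parent_function)
```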

Runtime coverage analysis

Covered functions: 35
Functions that are reachable but not covered: 10
Reachable functions: 14
Percentage of reachable functions covered: 28.57%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so some functions covered at runtime may be missing from it. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted by static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places that coverage instrumentation is used.
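
The reported percentage is consistent with dividing the reachable functions that were covered (reachable minus reachable-but-not-covered) by the number of reachable functions. The sketch below checks that reading against the figures printed for the four fuzzers in this report. The function name is hypothetical, and the formula is inferred from the numbers rather than taken from the tool's source.

```python
def reachable_coverage_percent(reachable: int, reachable_not_covered: int) -> float:
    """Percentage of statically reachable functions that were covered at runtime."""
    if reachable <= 0:
        raise ValueError("reachable must be positive")
    if not 0 <= reachable_not_covered <= reachable:
        raise ValueError("reachable_not_covered must be between 0 and reachable")
    covered_reachable = reachable - reachable_not_covered
    return round(100.0 * covered_reachable / reachable, 2)


# (reachable, reachable but not covered, percentage printed in this report)
REPORTED = {
    "fuzz_normalizer": (14, 10, 28.57),
    "fuzz_tokenizer": (31, 23, 25.81),
    "fuzz_split_xml": (13, 12, 7.69),
    "fuzz_detokenize": (23, 18, 21.74),
}

for fuzzer, (reachable, not_covered, printed) in REPORTED.items():
    computed = reachable_coverage_percent(reachable, not_covered)
    print(f"{fuzzer}: computed {computed}%, report says {printed}%")
```

Note that the "Covered functions" count (35) does not enter this calculation; it also includes functions covered at runtime that the static analysis never marked as reachable.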

Files reached

filename | functions hit
/ | 1
...fuzz_normalizer | 5
sacremoses.normalize | 8

Fuzzer: fuzz_tokenizer

Call tree


Call tree overview bitmap:

The distribution of callsites in terms of coloring is:
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 14 | 25.4%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 41 | 74.5%
All colors | | 55 | 100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
5 | 15 | sacremoses.corpus.NonbreakingPrefixes.words | call site: 00015 | sacremoses.tokenize.MosesTokenizer.has_numeric_only
3 | 7 | sacremoses.tokenize.MosesTokenizer.__init__ | call site: 00007 | sacremoses.corpus.NonbreakingPrefixes.words
2 | 3 | ...fuzz_tokenizer.TestOneInput | call site: 00003 | fdp.ConsumeIntInRange
1 | 35 | sacremoses.tokenize.MosesTokenizer.__init__ | call site: 00035 | sacremoses.tokenize.MosesTokenizer.penn_tokenize
1 | 44 | sacremoses.tokenize.MosesTokenizer.handles_nonbreaking_prefixes | call site: 00044 | sacremoses.tokenize.MosesTokenizer.isanyalpha
1 | 48 | sacremoses.tokenize.MosesTokenizer.isanyalpha | call site: 00048 | sacremoses.tokenize.MosesTokenizer.islower
1 | 51 | sacremoses.tokenize.MosesTokenizer.islower | call site: 00051 | re.search

Runtime coverage analysis

Covered functions: 35
Functions that are reachable but not covered: 23
Reachable functions: 31
Percentage of reachable functions covered: 25.81%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so some functions covered at runtime may be missing from it. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted by static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places that coverage instrumentation is used.

Files reached

filename | functions hit
/ | 1
...fuzz_tokenizer | 5
sacremoses.tokenize | 22
sacremoses.corpus | 6

Fuzzer: fuzz_split_xml

Call tree


Call tree overview bitmap:

The distribution of callsites in terms of coloring is:
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 4 | 17.3%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 19 | 82.6%
All colors | | 23 | 100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
3 | 9 | sacremoses.truecase.MosesTruecaser.split_xml | call site: 00009 | re.search
1 | 20 | sacremoses.truecase.MosesTruecaser.split_xml | call site: 00020 | xml_cognates.group

Runtime coverage analysis

Covered functions: 35
Functions that are reachable but not covered: 12
Reachable functions: 13
Percentage of reachable functions covered: 7.69%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so some functions covered at runtime may be missing from it. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted by static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places that coverage instrumentation is used.

Files reached

filename | functions hit
/ | 1
...fuzz_split_xml | 4
sacremoses.truecase | 8

Fuzzer: fuzz_detokenize

Call tree


Call tree overview bitmap:

The distribution of callsites in terms of coloring is:
Color | Runtime hitcount | Callsite count | Percentage
red | 0 | 16 | 37.2%
gold | [1:9] | 0 | 0.0%
yellow | [10:29] | 0 | 0.0%
greenyellow | [30:49] | 0 | 0.0%
lawngreen | 50+ | 27 | 62.7%
All colors | | 43 | 100%

Fuzz blockers

The following nodes represent call sites where fuzz blockers occur.

Amount of callsites blocked | Calltree index | Parent function | Callsite | Largest blocked function
12 | 21 | sacremoses.tokenize.MosesDetokenizer.tokenize | call site: 00021 | re.search
3 | 36 | sacremoses.tokenize.MosesDetokenizer.tokenize | call site: 00036 | re.search
1 | 16 | sacremoses.util.is_cjk | call site: 00016 | .ord

Runtime coverage analysis

Covered functions: 35
Functions that are reachable but not covered: 18
Reachable functions: 23
Percentage of reachable functions covered: 21.74%
NB: The sum of covered functions and functions that are reachable but not covered need not equal the number of reachable functions. The reachability analysis is an approximation, so some functions covered at runtime may be missing from it. This is a limitation of our static analysis capabilities.
Warning: The number of covered functions is larger than the number of reachable functions. This means that more functions are covered at runtime than were extracted by static analysis. This is likely a result of the static analysis component failing to extract the right call graph, or of the coverage runtime being compiled with sanitizers in code that the static analysis has not analysed. This can happen if lto/gold is not used in all places that coverage instrumentation is used.

Files reached

filename | functions hit
/ | 1
...fuzz_detokenize | 5
sacremoses.tokenize | 15
sacremoses.util | 2

Fuzz engine guidance

This section provides heuristics that can be used as input to a fuzz engine when running a given fuzz target. The current focus is on providing input that is usable by libFuzzer.

/src/fuzz_normalizer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag
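
The report lists no dictionary entries for this target. For readers who want to supply their own, a libFuzzer dictionary file holds one quoted token per line, with `\\`, `\"` and `\xNN` as escapes. The sketch below formats byte strings that way. The function names are hypothetical, and the example tokens are invented for illustration; they are not suggestions made by this report.

```python
def format_dict_entry(token: bytes) -> str:
    """Format one token as a quoted libFuzzer dictionary line."""
    if not token:
        raise ValueError("dictionary tokens must not be empty")
    parts = []
    for byte in token:
        char = chr(byte)
        if char == "\\":
            parts.append("\\\\")          # backslash -> \\
        elif char == '"':
            parts.append('\\"')           # double quote -> \"
        elif 0x20 <= byte < 0x7F:
            parts.append(char)            # printable ASCII is kept as-is
        else:
            parts.append("\\x%02X" % byte)  # everything else -> \xNN
    return '"' + "".join(parts) + '"'


def build_dictionary(tokens) -> str:
    """Join formatted entries into the text of a dictionary file."""
    return "\n".join(format_dict_entry(token) for token in tokens) + "\n"


# Invented tokens, for illustration only.
EXAMPLE_TOKENS = [b"&amp;", "\u00ab".encode("utf-8"), b'"']
dictionary_text = build_dictionary(EXAMPLE_TOKENS)
print(dictionary_text, end="")
```

The resulting text can be written to a file and passed through the `-dict=` flag named above.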


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name:

-focus_function=['sacremoses.normalize.MosesPunctNormalizer.__init__', '...fuzz_normalizer.TestOneInput', 'sacremoses.normalize.MosesPunctNormalizer.normalize']
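
libFuzzer's `-focus_function` flag takes a single function name, whereas the line above prints a Python-style list of candidates. A small sketch for splitting that line into one flag per candidate follows. The helper name `focus_function_flags` is hypothetical, and whether libFuzzer can match these Python-level names when the target runs under a Python fuzzing harness is not something this report establishes.

```python
import ast

FLAG_PREFIX = "-focus_function="


def focus_function_flags(report_line: str) -> list:
    """Turn a '-focus_function=[...]' report line into one flag per name."""
    if not report_line.startswith(FLAG_PREFIX):
        raise ValueError("line does not start with -focus_function=")
    # The value is printed as a Python list literal, so parse it safely.
    names = ast.literal_eval(report_line[len(FLAG_PREFIX):])
    if not isinstance(names, list) or not all(isinstance(n, str) for n in names):
        raise ValueError("expected a list of function names")
    return [FLAG_PREFIX + name for name in names]


# Line copied from the fuzz_normalizer guidance above.
REPORT_LINE = (
    "-focus_function=['sacremoses.normalize.MosesPunctNormalizer.__init__', "
    "'...fuzz_normalizer.TestOneInput', "
    "'sacremoses.normalize.MosesPunctNormalizer.normalize']"
)

flags = focus_function_flags(REPORT_LINE)
for flag in flags:
    print(flag)
```

Each resulting flag would be tried in a separate fuzzing run, since only one focus function can be given at a time.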

/src/fuzz_tokenizer.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name:

-focus_function=['sacremoses.corpus.NonbreakingPrefixes.words', 'sacremoses.tokenize.MosesTokenizer.__init__', '...fuzz_tokenizer.TestOneInput', 'sacremoses.tokenize.MosesTokenizer.handles_nonbreaking_prefixes', 'sacremoses.tokenize.MosesTokenizer.isanyalpha', 'sacremoses.tokenize.MosesTokenizer.islower']

/src/fuzz_split_xml.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name:

-focus_function=['sacremoses.truecase.MosesTruecaser.split_xml']

/src/fuzz_detokenize.py

Dictionary

Use this with the libFuzzer -dict=DICT.file flag


Fuzzer function priority

Use one of these functions as input to libFuzzer with the flag -focus_function=name:

-focus_function=['sacremoses.tokenize.MosesDetokenizer.tokenize', 'sacremoses.util.is_cjk']