FuzzBench: 2024-05-16-new-bug report

(experiment incomplete/still running...)

experiment summary

We show two different aggregate (cross-benchmark) rankings of fuzzers. The first is based on the average of per-benchmark scores, where the score represents the percentage of the highest reached median bug-coverage on a given benchmark (higher value is better). The second ranking shows the average rank of fuzzers, after we rank them on each benchmark according to their median reached bug-coverages (lower value is better).
By avg. score
fuzzer average normalized score
afl 100.0
aflsmart 100.0
honggfuzz 100.0
libafl 100.0
libfuzzer 100.0
mopt 100.0
aflplusplus 0.0
By avg. rank
fuzzer average rank
afl 1.00
aflsmart 1.00
honggfuzz 1.00
libafl 1.00
libfuzzer 1.00
mopt 1.00
aflplusplus 1.43
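
The two aggregate rankings described above can be sketched as follows. This is a minimal illustration, not FuzzBench's own code: the `medians` data and the helper names `average_normalized_score` and `average_rank` are hypothetical. Two assumptions are made: benchmarks whose best median is zero are skipped for the score (the ratio is undefined there), and tied fuzzers share the best (lowest) rank, which is consistent with several fuzzers showing an average rank of 1.00 in the table above.

```python
from statistics import mean

def average_normalized_score(medians):
    """Average, per fuzzer, of 100 * median / best median on each benchmark."""
    per_fuzzer = {}
    for by_fuzzer in medians.values():
        best = max(by_fuzzer.values())
        if best == 0:
            continue  # assumption: score undefined when nothing was reached
        for fuzzer, value in by_fuzzer.items():
            per_fuzzer.setdefault(fuzzer, []).append(100.0 * value / best)
    return {fuzzer: mean(scores) for fuzzer, scores in per_fuzzer.items()}

def average_rank(medians):
    """Average, per fuzzer, of its rank on each benchmark (1 is best)."""
    per_fuzzer = {}
    for by_fuzzer in medians.values():
        for fuzzer, value in by_fuzzer.items():
            # Assumption: ties share the lowest rank, so a fuzzer's rank is
            # 1 plus the number of fuzzers with a strictly higher median.
            rank = 1 + sum(other > value for other in by_fuzzer.values())
            per_fuzzer.setdefault(fuzzer, []).append(rank)
    return {fuzzer: mean(ranks) for fuzzer, ranks in per_fuzzer.items()}

# Hypothetical median bug coverage per benchmark and fuzzer.
medians = {
    "bench_a": {"fuzzer_x": 4, "fuzzer_y": 2, "fuzzer_z": 4},
    "bench_b": {"fuzzer_x": 1, "fuzzer_y": 2, "fuzzer_z": 0},
}

scores = average_normalized_score(medians)
ranks = average_rank(medians)
print(scores)
print(ranks)
```

In this toy data fuzzer_x and fuzzer_y tie on average score but differ on average rank, which is why the report shows both rankings.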
  • Critical difference diagram
    The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
    Note: If a fuzzer does not support all benchmarks, its ranking as shown in this diagram can be lower than it should be. Please check the list of supported benchmarks for the fuzzers you are interested in. The list may be specified in the fuzzer's README.md.
  • Median relative code-coverages on each benchmark

    Note: The relative coverage summary table shows each fuzzer's median performance relative to the experiment maximum, so the highest relative performance may not be 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      libafl mopt aflsmart libfuzzer honggfuzz afl aflplusplus
    FuzzerMedian 65.00 69.50 69.50 69.50 63.50 65.00 65.00
    FuzzerMean 58.36 58.21 58.14 58.14 56.00 55.23 55.23
    arrow_arrow-ipc-stream-fuzz_1a34a0 65.00 65.00 65.00 65.00 nan 65.00 65.00
    aspell_aspell_fuzzer_e8eb74 83.00 83.00 83.00 83.00 83.00 83.00 83.00
    assimp_assimp_fuzzer_4d451f 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    bloaty_fuzz_target_52948c 81.00 81.00 81.00 81.00 81.00 81.00 81.00
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef nan 0.00 0.00 0.00 0.00 0.00 0.00
    file_magic_fuzzer_2d5f85 78.00 78.00 78.00 78.00 nan 78.00 78.00
    grok_grk_decompress_fuzzer_9cd001 93.00 97.00 95.00 96.00 96.00 95.00 nan
    harfbuzz_hb-shape-fuzzer_17863b nan 74.00 74.00 74.00 74.00 74.00 74.00
    lcms_cms_transform_all_fuzzer_97d37d 3.00 3.00 3.00 3.00 3.00 3.00 3.00
    libaom_av1_dec_fuzzer_6e1848 91.00 91.00 91.00 91.00 91.00 91.00 91.00
    libpcap_fuzz_filter_98b0a2 53.00 53.00 53.00 53.00 53.00 53.00 53.00
    libxml2_xml_e85b9b 44.00 44.00 44.00 44.00 44.00 44.00 44.00
    php_php-fuzz-parser_0dbedb nan 95.00 96.00 95.00 96.00 nan 95.00
    systemd_fuzz-network-parser_288baf 51.00 51.00 51.00 51.00 51.00 51.00 51.00
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
  • Median relative bug-coverages on each benchmark

    Note: The relative coverage summary table shows each fuzzer's median performance relative to the experiment maximum, so the highest relative performance may not be 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      libafl honggfuzz afl aflsmart libfuzzer mopt aflplusplus
    FuzzerMedian 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    FuzzerMean 4.55 4.17 3.85 3.57 3.57 3.57 0.00
    arrow_arrow-ipc-stream-fuzz_1a34a0 0.00 nan 0.00 0.00 0.00 0.00 0.00
    aspell_aspell_fuzzer_e8eb74 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    assimp_assimp_fuzzer_4d451f 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    bloaty_fuzz_target_52948c 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef nan 0.00 0.00 0.00 0.00 0.00 0.00
    file_magic_fuzzer_2d5f85 0.00 nan 0.00 0.00 0.00 0.00 0.00
    grok_grk_decompress_fuzzer_9cd001 50.00 50.00 50.00 50.00 50.00 50.00 nan
    harfbuzz_hb-shape-fuzzer_17863b nan 0.00 0.00 0.00 0.00 0.00 0.00
    lcms_cms_transform_all_fuzzer_97d37d 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    libaom_av1_dec_fuzzer_6e1848 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    libpcap_fuzz_filter_98b0a2 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    libxml2_xml_e85b9b 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    php_php-fuzz-parser_0dbedb nan 0.00 nan 0.00 0.00 0.00 0.00
    systemd_fuzz-network-parser_288baf 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
  • Total unique bugs found on each benchmark
      Total libfuzzer mopt honggfuzz libafl afl aflsmart aflplusplus
    FuzzerSum 79 70 14 6 6 2 2 0
    arrow_arrow-ipc-stream-fuzz_1a34a0 0 0 0 nan 0 0 0 0
    aspell_aspell_fuzzer_e8eb74 2 0 2 0 2 0 0 0
    assimp_assimp_fuzzer_4d451f 61 61 0 0 0 0 0 0
    bloaty_fuzz_target_52948c 1 1 1 0 0 0 0 0
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef 1 0 0 1 nan 0 0 0
    file_magic_fuzzer_2d5f85 2 2 1 nan 0 0 0 0
    grok_grk_decompress_fuzzer_9cd001 4 2 2 2 4 2 2 nan
    harfbuzz_hb-shape-fuzzer_17863b 4 4 4 3 nan 0 0 0
    lcms_cms_transform_all_fuzzer_97d37d 0 0 0 0 0 0 0 0
    libaom_av1_dec_fuzzer_6e1848 0 0 0 0 0 0 0 0
    libpcap_fuzz_filter_98b0a2 0 0 0 0 0 0 0 0
    libxml2_xml_e85b9b 3 0 3 0 0 0 0 0
    php_php-fuzz-parser_0dbedb 1 0 1 0 nan 0 0 0
    systemd_fuzz-network-parser_288baf 0 0 0 0 0 0 0 0
    • Fuzzers are sorted by "FuzzerSum", highest on the left.
    • Green background = most unique bugs found.
    • Note: This table represents unique bugs found across all trials.
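
How the FuzzerMean and FuzzerMedian rows relate to the per-benchmark rows can be illustrated with the libafl column of the code-coverage table above. This is a minimal sketch, assuming that nan entries (benchmarks with no data for that fuzzer) are skipped; the helper name `summarize` is hypothetical and this is not FuzzBench's own code.

```python
import math
from statistics import mean, median

# libafl column of the "Median relative code-coverages" table, top to bottom.
libafl = [65.0, 83.0, 0.0, 81.0, math.nan, 78.0, 93.0, math.nan,
          3.0, 91.0, 53.0, 44.0, math.nan, 51.0]

def summarize(column):
    """Return (mean rounded to 2 decimals, median), ignoring nan entries."""
    present = [value for value in column if not math.isnan(value)]
    return round(mean(present), 2), median(present)

fuzzer_mean, fuzzer_median = summarize(libafl)
print(fuzzer_mean, fuzzer_median)  # 58.36 65.0
```

Both values match the FuzzerMean (58.36) and FuzzerMedian (65.00) entries for libafl, which suggests that nan entries are excluded rather than counted as zero.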

arrow_arrow-ipc-stream-fuzz_1a34a0 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 15.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 15.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0
    aflplusplus 0 10.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0
    aflsmart 0 20.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0
    libafl 0 20.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0
    libfuzzer 0 20.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0
    mopt 0 20.0 1516.0 0.0 1516.0 1516.0 1516.0 1516.0 1516.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
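
The Vargha-Delaney A12 measure referred to above can be computed directly from two samples of per-trial results. The following is a minimal sketch with made-up sample values; the function name `a12` and the data are hypothetical and not taken from this experiment.

```python
def a12(row_sample, col_sample):
    """Vargha-Delaney A12: probability that a value drawn from row_sample
    is larger than one drawn from col_sample, counting ties as half."""
    wins = sum(x > y for x in row_sample for y in col_sample)
    ties = sum(x == y for x in row_sample for y in col_sample)
    return (wins + 0.5 * ties) / (len(row_sample) * len(col_sample))

# Hypothetical per-trial coverage values for two fuzzers.
fuzzer_a = [5816, 5841, 5852, 5906]
fuzzer_b = [5743, 5779, 5809, 5851]

effect = a12(fuzzer_a, fuzzer_b)
self_effect = a12(fuzzer_a, fuzzer_a)
print(effect)       # 0.875
print(self_effect)  # 0.5
```

A value of 0.5 means neither sample tends to be larger. When every trial of both fuzzers reports the same value, all pairs are ties and A12 is 0.5.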

aspell_aspell_fuzzer_e8eb74 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 17.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    aflplusplus 0 17.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    aflsmart 0 20.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    honggfuzz 0 20.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    libafl 0 20.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    libfuzzer 0 20.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0
    mopt 0 20.0 2624.0 0.0 2624.0 2624.0 2624.0 2624.0 2624.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
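
The columns of the sample statistics tables (count, mean, std, min, quartiles, max) can be reproduced for a single fuzzer with the Python standard library. This is a minimal sketch on hypothetical per-trial values; it assumes the sample standard deviation and linearly interpolated quartiles, which is an assumption about how the report's tables were produced. The helper name `describe` is also hypothetical.

```python
from statistics import mean, quantiles, stdev

def describe(values):
    """Summary statistics for one fuzzer's per-trial measurements."""
    # method="inclusive" interpolates linearly between the observed values.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "count": float(len(values)),
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation (n - 1 divisor)
        "min": min(values),
        "25%": q1,
        "median": q2,
        "75%": q3,
        "max": max(values),
    }

# Hypothetical coverage values from four trials of one fuzzer.
trials = [7317, 7317, 7319, 7320]
stats = describe(trials)
for name, value in stats.items():
    print(name, value)
```

When all trials report the same value, as in several tables of this report, std is 0 and min, the quartiles, and max all equal the mean.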

assimp_assimp_fuzzer_4d451f summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 12.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 12.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
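
As a rough illustration of what a pairwise Mann-Whitney U test computes, the sketch below derives the U statistic and a two-sided p value using the normal approximation with a tie correction. The function name `mann_whitney_p` and the sample values are hypothetical; the report itself may rely on an exact method or a library routine, and the normal approximation is coarse for samples this small. Returning p = 1.0 when all values are identical is a convention chosen for this sketch.

```python
import math
from collections import Counter

def mann_whitney_p(x, y):
    """Return (U, two-sided p) via the normal approximation with tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U for x: number of (x, y) pairs won by x, counting ties as half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u = min(u1, n1 * n2 - u1)
    # Tie correction: sum of t^3 - t over groups of equal values.
    tie_term = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2.0) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical per-trial coverage values for two fuzzers.
fuzzer_a = [5816, 5841, 5852, 5906]
fuzzer_b = [5743, 5779, 5809, 5851]

u_stat, p_value = mann_whitney_p(fuzzer_a, fuzzer_b)
same_u, same_p = mann_whitney_p([1516] * 3, [1516] * 3)
print(u_stat, round(p_value, 3))
print(same_u, same_p)
```

With identical samples, which is the situation in the benchmarks above where every trial reports the same coverage, the test cannot detect any difference between fuzzers.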

bloaty_fuzz_target_52948c summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 17.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 17.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    aflplusplus 0 16.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    aflsmart 0 20.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    honggfuzz 0 20.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    libafl 0 20.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    libfuzzer 0 20.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0
    mopt 0 20.0 4364.0 0.0 4364.0 4364.0 4364.0 4364.0 4364.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

ffmpeg_ffmpeg_demuxer_fuzzer_7adeef summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

file_magic_fuzzer_2d5f85 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 12.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 17.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 12.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0
    aflplusplus 0 17.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0
    aflsmart 0 20.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0
    libafl 0 20.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0
    libfuzzer 0 20.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0
    mopt 0 20.0 1810.0 0.0 1810.0 1810.0 1810.0 1810.0 1810.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

grok_grk_decompress_fuzzer_9cd001 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 26100 4.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    aflsmart 26100 4.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    honggfuzz 26100 8.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    libafl 26100 16.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    libfuzzer 26100 16.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    mopt 26100 9.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    mopt 26100 9.0 5832.222222 50.291097 5743.0 5816.00 5841.0 5852.00 5906.0
    libfuzzer 26100 16.0 5771.062500 55.963641 5631.0 5743.25 5779.5 5809.75 5851.0
    honggfuzz 26100 8.0 5767.250000 15.078367 5745.0 5756.75 5772.5 5777.75 5784.0
    aflsmart 26100 4.0 5750.250000 33.029028 5704.0 5739.25 5758.5 5769.50 5780.0
    afl 26100 4.0 5749.750000 49.378639 5702.0 5728.25 5739.0 5760.50 5819.0
    libafl 26100 16.0 5539.500000 307.030726 4846.0 5337.75 5622.0 5786.00 5887.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

harfbuzz_hb-shape-fuzzer_17863b summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    mopt 0 20.0 7318.55 1.503505 7317.0 7317.0 7319.0 7320.00 7320.0
    libfuzzer 0 20.0 7318.50 1.538968 7317.0 7317.0 7318.5 7320.00 7320.0
    afl 0 18.0 7317.00 0.000000 7317.0 7317.0 7317.0 7317.00 7317.0
    aflplusplus 0 16.0 7317.00 0.000000 7317.0 7317.0 7317.0 7317.00 7317.0
    aflsmart 0 20.0 7317.75 1.332785 7317.0 7317.0 7317.0 7317.75 7320.0
    honggfuzz 0 20.0 7318.20 1.507874 7317.0 7317.0 7317.0 7320.00 7321.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

lcms_cms_transform_all_fuzzer_97d37d summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 14.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    aflplusplus 0 14.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    aflsmart 0 20.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    honggfuzz 0 20.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    libafl 0 20.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    libfuzzer 0 20.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0
    mopt 0 20.0 73.0 0.0 73.0 73.0 73.0 73.0 73.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libaom_av1_dec_fuzzer_6e1848 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 12.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 14.0 9123.142857 11.175916 9102.0 9112.75 9125.5 9131.75 9139.0
    aflplusplus 0 12.0 9124.583333 10.518022 9111.0 9116.00 9123.0 9133.50 9140.0
    aflsmart 0 20.0 9091.300000 15.204224 9068.0 9079.25 9089.0 9100.00 9121.0
    libfuzzer 0 20.0 9075.100000 14.909905 9044.0 9064.75 9080.0 9084.00 9098.0
    libafl 0 20.0 9080.150000 10.095726 9063.0 9072.50 9079.0 9088.50 9099.0
    mopt 0 20.0 9073.850000 18.658355 9034.0 9069.25 9076.5 9089.25 9100.0
    honggfuzz 0 20.0 9077.850000 15.638263 9047.0 9065.50 9076.0 9090.00 9106.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpcap_fuzz_filter_98b0a2 summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 13.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 16.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 13.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    aflplusplus 0 16.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    aflsmart 0 20.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    honggfuzz 0 20.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    libafl 0 20.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    libfuzzer 0 20.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0
    mopt 0 20.0 1625.0 0.0 1625.0 1625.0 1625.0 1625.0 1625.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxml2_xml_e85b9b summary

Figures: discovered bug coverage distribution; reached code coverage distribution; mean code coverage growth over time; mean bug coverage growth over time.
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 19.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 19.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 19.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    aflplusplus 0 19.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    aflsmart 0 20.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    honggfuzz 0 20.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    libafl 0 20.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    libfuzzer 0 20.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0
    mopt 0 20.0 8503.0 0.0 8503.0 8503.0 8503.0 8503.0 8503.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
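
To make the pairwise Mann-Whitney U test concrete, here is a hedged, dependency-free sketch. It assumes a two-sided test using the normal approximation with tie correction and no continuity correction; the report does not state which variant it uses, and the function name `mann_whitney_u` is hypothetical. The example mirrors the table above, where every trial reached 8503.0.

```python
import math
from collections import Counter


def mann_whitney_u(xs, ys):
    """Return (U for xs, two-sided p) via a tie-corrected normal approximation."""
    n1, n2 = len(xs), len(ys)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # U counts pairs where x beats y; ties contribute one half.
    u1 = sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)
    pooled = list(xs) + list(ys)
    if len(set(pooled)) == 1:
        # Every observation is tied, so the samples cannot be told apart.
        return u1, 1.0
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2.0) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u1, p


# Assumed per-trial values: 19 and 20 trials, all at 8503.0.
u, p = mann_whitney_u([8503.0] * 19, [8503.0] * 20)
print(u, p)
```

When all trials of both fuzzers reach the same value, as in this benchmark, no pair can be significantly different, so the sketch returns p = 1.0 instead of dividing by a zero variance.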

php_php-fuzz-parser_0dbedb summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 900 3.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    aflsmart 900 8.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    honggfuzz 900 15.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libfuzzer 900 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    mopt 900 20.0 0.05 0.223607 0.0 0.0 0.0 0.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the bug coverage distributions of a given fuzzer pair are significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflsmart 900 8.0 15507.875000 834.998963 14156.0 15453.50 15960.0 15974.50 15991.0
    honggfuzz 900 15.0 15889.800000 66.900993 15767.0 15861.00 15922.0 15929.00 15983.0
    aflplusplus 900 3.0 15838.333333 160.104133 15675.0 15760.00 15845.0 15920.00 15995.0
    libfuzzer 900 20.0 15747.250000 61.595348 15625.0 15721.75 15757.0 15779.50 15842.0
    mopt 900 20.0 15538.500000 477.551155 14156.0 15664.75 15697.0 15734.25 15785.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
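
The columns of the sample statistics tables can be recomputed from per-trial values. The sketch below is an illustration under assumptions: the helper `describe` is hypothetical, and the input (19 trials with 0 bugs, one trial with 1 bug) is inferred from the mopt bug coverage row above rather than taken from the raw data. It uses the sample standard deviation and linearly interpolated quartiles.

```python
import statistics


def describe(values):
    """Summarize one fuzzer's trials with the columns used in the tables."""
    # "inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": float(len(values)),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "median": median,
        "75%": q3,
        "max": max(values),
    }


# Assumed trial values, consistent with count 20, mean 0.05, max 1.0.
mopt_bug_trials = [0.0] * 19 + [1.0]
for name, value in describe(mopt_bug_trials).items():
    print(name, round(value, 6))
```

A single bug-finding trial out of 20 is enough to move the mean and standard deviation while leaving the median and quartiles at 0, which is why the median-based rankings can hide rare discoveries.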

systemd_fuzz-network-parser_288baf summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    mopt 0 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the bug coverage distributions of a given fuzzer pair are significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 0 18.0 1615.666667 5.749680 1609.0 1609.5 1616.5 1621.5 1622.0
    aflplusplus 0 20.0 1613.600000 4.694902 1609.0 1610.5 1612.0 1620.0 1622.0
    aflsmart 0 20.0 1614.250000 5.580747 1609.0 1609.0 1612.0 1620.0 1622.0
    libfuzzer 0 20.0 1614.800000 5.578153 1609.0 1609.0 1612.0 1620.0 1622.0
    mopt 0 20.0 1614.300000 5.478186 1609.0 1610.5 1612.0 1620.5 1623.0
    libafl 0 20.0 1614.100000 5.476265 1609.0 1609.0 1611.5 1620.0 1622.0
    honggfuzz 0 20.0 1613.450000 5.614596 1609.0 1609.0 1610.5 1620.0 1623.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
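
The plot footnote for this benchmark mentions a 95% confidence interval around the mean. As a rough sketch, such an interval can be approximated from the summary statistics alone. The normal approximation (z = 1.96) is an assumption here; the report does not say how the bands are computed, and plotting libraries often use bootstrapping instead. The helper `mean_ci95` is hypothetical, and the inputs are the afl code coverage row above.

```python
import math


def mean_ci95(mean, std, count, z=1.96):
    """Approximate a 95% confidence interval for the mean (normal approximation)."""
    if count < 2:
        raise ValueError("need at least two trials")
    half_width = z * std / math.sqrt(count)
    return mean - half_width, mean + half_width


# afl on this benchmark: count 18, mean 1615.666667, std 5.749680.
low, high = mean_ci95(1615.666667, 5.749680, 18)
print(round(low, 2), round(high, 2))
```

With fewer than about 30 trials, a Student's t critical value would give a slightly wider interval than z = 1.96.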

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.

Experiment Description:

None provided.