FuzzBench: SBFT'23 Final Evaluation report

15 Bug-based Benchmarks

Experiment Summary

Each benchmark contains one known bug. Fuzzers are evaluated on their ability to find an input that triggers the bug and causes a crash. We show two aggregate (cross-benchmark) rankings of fuzzers. The first ranking is based on the average of per-benchmark scores, where a score is the fuzzer's median bug coverage expressed as a percentage of the highest median bug coverage reached on that benchmark (higher is better); ties are broken by the average time taken to find the input. Triggering the bug repeatedly does not earn extra score. The second ranking shows the average rank of each fuzzer after ranking the fuzzers on each benchmark by their median bug coverage (lower is better).
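The two aggregate rankings can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not FuzzBench's implementation: the function names are hypothetical, the raw medians are a three-benchmark subset of the bug-coverage statistics in this report, and two conventions (a missing result scores 0, and tied fuzzers share the best rank of their group) are assumptions that appear consistent with the published averages. The time-based tie-break is omitted.

```python
# Raw median bug coverage per benchmark (subset of this report's tables).
# A fuzzer with no result on a benchmark simply has no entry.
MEDIANS = {
    "assimp": {"aflrustrust": 1.0, "pastis": 1.0, "aflsmart_plusplus": 0.5, "symsan": 1.0},
    "bloaty": {"aflrustrust": 1.0, "pastis": 1.0, "aflsmart_plusplus": 1.0},
    "ffmpeg": {"aflrustrust": 0.0, "pastis": 1.0, "aflsmart_plusplus": 0.0},
}
FUZZERS = ["aflrustrust", "pastis", "aflsmart_plusplus", "symsan"]


def normalized_scores(medians, fuzzers):
    """Per-benchmark score: percent of the best median reached on that benchmark."""
    scores = {}
    for benchmark, row in medians.items():
        best = max(row.values())
        scores[benchmark] = {
            # Assumption: a missing result, or a benchmark nobody solved, scores 0.
            f: (100.0 * row.get(f, 0.0) / best) if best > 0 else 0.0
            for f in fuzzers
        }
    return scores


def average_score(scores, fuzzers):
    """First ranking: mean of the per-benchmark scores (higher is better)."""
    return {f: sum(row[f] for row in scores.values()) / len(scores) for f in fuzzers}


def average_rank(scores, fuzzers):
    """Second ranking: mean per-benchmark rank (lower is better, rank 1 is best)."""
    totals = dict.fromkeys(fuzzers, 0)
    for row in scores.values():
        for f in fuzzers:
            # Assumption: tied fuzzers share the best rank of their group.
            totals[f] += 1 + sum(row[g] > row[f] for g in fuzzers)
    return {f: totals[f] / len(scores) for f in fuzzers}


scores = normalized_scores(MEDIANS, FUZZERS)
avg_score = average_score(scores, FUZZERS)
avg_rank = average_rank(scores, FUZZERS)
for f in sorted(FUZZERS, key=lambda name: -avg_score[name]):
    print(f"{f:18s} score={avg_score[f]:6.2f} rank={avg_rank[f]:.2f}")
```

Because the sketch uses only three benchmarks and four fuzzers, its output is not expected to match the full tables below.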
By avg. score
fuzzer  average normalized score  average extra time to find bugs (seconds)
pastis 53.33 0.0
aflrustrust 53.33 960.0
aflsmart_plusplus 50.00 1440.0
afl 46.67 4140.0
honggfuzz 46.67 5310.0
libafl_libfuzzer 46.67 5490.0
aflplusplusplus 40.00 7680.0
hastefuzz 40.00 8880.0
libfuzzer 40.00 9030.0
aflplusplus 40.00 9600.0
symsan 20.00 24720.0
learnperffuzz 6.67 32100.0
By avg. rank

fuzzer  average rank
aflrustrust 1.40
pastis 1.47
honggfuzz 1.80
afl 1.80
aflsmart_plusplus 2.00
libfuzzer 2.13
libafl_libfuzzer 2.13
hastefuzz 2.13
aflplusplusplus 2.13
aflplusplus 2.13
symsan 3.67
learnperffuzz 5.07
  • Critical difference diagram
    The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
    Note: If a fuzzer does not support all benchmarks, its ranking in this diagram can be lower than it should be, so check the list of supported benchmarks for the fuzzers you are interested in. The list may be specified in the fuzzer's README.md.
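For orientation, the Nemenyi critical difference is commonly computed as CD = q_alpha * sqrt(k * (k + 1) / (6 * N)), where k is the number of compared fuzzers and N the number of benchmarks. The sketch below assumes this standard formula. The function name is hypothetical, q_alpha has to be looked up in a table of critical values, and the inputs in the usage line are illustrative rather than this experiment's (3.164 is the commonly tabulated value for k = 10 at alpha = 0.05).

```python
import math


def nemenyi_critical_difference(q_alpha, num_fuzzers, num_benchmarks):
    """Average ranks that differ by more than this value differ significantly."""
    k, n = num_fuzzers, num_benchmarks
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Illustrative inputs: 10 fuzzers, 15 benchmarks, q_alpha for alpha = 0.05.
cd = nemenyi_critical_difference(q_alpha=3.164, num_fuzzers=10, num_benchmarks=15)
print(f"critical difference: {cd:.3f}")
```

Two fuzzers whose average ranks differ by less than the critical difference would be joined by a bar in the diagram.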
  • Median relative code-coverages on each benchmark

    Note: The relative coverage summary table shows each fuzzer's median performance relative to the experiment maximum, so the highest relative performance may be below 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      libafl_libfuzzer hastefuzz aflrustrust aflplusplusplus aflplusplus afl aflsmart_plusplus libfuzzer pastis honggfuzz symsan learnperffuzz
    FuzzerMedian 93.00 92.00 94.00 88.00 89.00 93.00 91.00 84.00 83.00 84.00 83.00 61.00
    FuzzerMean 86.67 85.27 85.13 85.00 83.60 83.20 82.13 81.13 76.53 74.07 56.60 55.87
    arrow_arrow-ipc-stream-fuzz_1a34a0 95.00 96.00 94.00 86.00 89.00 95.00 83.00 92.00 nan nan nan 59.00
    aspell_aspell_fuzzer_e8eb74 80.00 81.00 82.00 81.00 81.00 80.00 80.00 78.00 80.00 84.00 83.00 74.00
    assimp_assimp_fuzzer_4d451f 33.00 57.00 51.00 62.00 51.00 36.00 36.00 71.00 71.00 81.00 90.00 0.00
    bloaty_fuzz_target_52948c 98.00 81.00 95.00 76.00 90.00 96.00 96.00 71.00 84.00 93.00 nan 71.00
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef 81.00 61.00 56.00 85.00 72.00 58.00 59.00 65.00 83.00 82.00 nan 16.00
    file_magic_fuzzer_2d5f85 93.00 98.00 99.00 89.00 88.00 93.00 92.00 92.00 73.00 nan 71.00 71.00
    grok_grk_decompress_fuzzer_9cd001 87.00 96.00 94.00 97.00 97.00 95.00 95.00 93.00 97.00 96.00 96.00 86.00
    harfbuzz_hb-shape-fuzzer_17863b 99.00 96.00 89.00 95.00 95.00 96.00 96.00 84.00 95.00 96.00 89.00 77.00
    lcms_cms_transform_all_fuzzer_97d37d 84.00 82.00 76.00 70.00 59.00 59.00 56.00 67.00 68.00 66.00 nan 2.00
    libaom_av1_dec_fuzzer_6e1848 98.00 97.00 97.00 94.00 94.00 94.00 97.00 91.00 97.00 98.00 93.00 84.00
    libpcap_fuzz_filter_98b0a2 93.00 94.00 90.00 95.00 95.00 92.00 88.00 86.00 94.00 91.00 nan 54.00
    libxml2_xml_e85b9b 97.00 92.00 95.00 93.00 94.00 98.00 98.00 76.00 98.00 85.00 83.00 61.00
    mbedtls_fuzz_dtlsclient_7c6b0e 68.00 69.00 68.00 68.00 68.00 68.00 69.00 68.00 66.00 68.00 67.00 48.00
    php_php-fuzz-parser_0dbedb 96.00 95.00 95.00 96.00 96.00 96.00 96.00 95.00 98.00 98.00 92.00 89.00
    systemd_fuzz-network-parser_288baf 98.00 84.00 96.00 88.00 85.00 92.00 91.00 88.00 44.00 73.00 85.00 46.00
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
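The relative-coverage formula given for this table can be sketched as follows. The function name and the per-trial values are hypothetical; only the formula itself comes from the report. The example also shows why the best median can fall short of 100%: the median trial is usually below the single best trial that defines the experiment maximum.

```python
from statistics import median


def relative_coverages(trial_coverages, experiment_max_coverage):
    """Apply trial_relative_coverage = trial_coverage / experiment_max_coverage."""
    if experiment_max_coverage <= 0:
        raise ValueError("experiment_max_coverage must be positive")
    return [coverage / experiment_max_coverage for coverage in trial_coverages]


# Hypothetical per-trial coverage for one fuzzer on one benchmark.
trials = [2316, 2397, 2474, 2512, 2566]
# Best coverage reached by any trial of any fuzzer on that benchmark (hypothetical).
experiment_max = 2566

relative = relative_coverages(trials, experiment_max)
median_relative_percent = 100 * median(relative)
print(f"median relative coverage: {median_relative_percent:.2f}%")
```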
  • Median relative bug-coverages on each benchmark

    Note: The relative coverage summary table shows each fuzzer's median performance relative to the experiment maximum, so the highest relative performance may be below 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      aflrustrust pastis aflsmart_plusplus afl honggfuzz libafl_libfuzzer aflplusplus aflplusplusplus hastefuzz libfuzzer symsan learnperffuzz
    FuzzerMedian 100.00 100.00 50.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    FuzzerMean 53.33 53.33 50.00 46.67 46.67 46.67 40.00 40.00 40.00 40.00 20.00 6.67
    arrow_arrow-ipc-stream-fuzz_1a34a0 0.00 nan 0.00 0.00 nan 0.00 0.00 0.00 0.00 0.00 nan 0.00
    aspell_aspell_fuzzer_e8eb74 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 0.00 0.00
    assimp_assimp_fuzzer_4d451f 100.00 100.00 50.00 100.00 100.00 0.00 100.00 100.00 100.00 100.00 100.00 0.00
    bloaty_fuzz_target_52948c 100.00 100.00 100.00 100.00 0.00 100.00 0.00 0.00 0.00 0.00 nan 0.00
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef 0.00 100.00 0.00 0.00 100.00 100.00 100.00 100.00 0.00 100.00 nan 0.00
    file_magic_fuzzer_2d5f85 100.00 0.00 100.00 100.00 nan 0.00 100.00 100.00 100.00 100.00 0.00 0.00
    grok_grk_decompress_fuzzer_9cd001 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
    harfbuzz_hb-shape-fuzzer_17863b 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 0.00
    lcms_cms_transform_all_fuzzer_97d37d 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 nan 0.00
    libaom_av1_dec_fuzzer_6e1848 100.00 100.00 100.00 0.00 100.00 100.00 0.00 0.00 100.00 0.00 0.00 0.00
    libpcap_fuzz_filter_98b0a2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 nan 0.00
    libxml2_xml_e85b9b 100.00 100.00 100.00 100.00 100.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00
    mbedtls_fuzz_dtlsclient_7c6b0e 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    php_php-fuzz-parser_0dbedb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    systemd_fuzz-network-parser_288baf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
  • Total unique crashes found on each benchmark
    (Note that a unique crash does not imply a unique bug)
      Total honggfuzz pastis aflplusplusplus symsan aflrustrust hastefuzz libafl_libfuzzer aflplusplus libfuzzer aflsmart_plusplus afl learnperffuzz
    FuzzerSum 229 133 94 79 76 70 55 53 48 39 37 29 5
    arrow_arrow-ipc-stream-fuzz_1a34a0 0 nan nan 0 nan 0 0 0 0 0 0 0 0
    aspell_aspell_fuzzer_e8eb74 5 2 5 2 1 2 3 2 2 2 2 2 1
    assimp_assimp_fuzzer_4d451f 149 83 55 49 68 30 27 5 22 24 2 3 2
    bloaty_fuzz_target_52948c 1 1 1 1 nan 1 1 1 1 0 1 1 0
    ffmpeg_ffmpeg_demuxer_fuzzer_7adeef 21 7 7 12 nan 3 5 12 6 2 4 2 0
    file_magic_fuzzer_2d5f85 2 nan 0 1 0 1 1 1 1 1 1 2 0
    grok_grk_decompress_fuzzer_9cd001 3 2 2 3 2 2 2 1 2 2 2 2 1
    harfbuzz_hb-shape-fuzzer_17863b 8 5 4 4 2 5 6 8 4 3 6 6 1
    lcms_cms_transform_all_fuzzer_97d37d 8 2 2 1 nan 1 3 8 0 3 0 2 0
    libaom_av1_dec_fuzzer_6e1848 16 16 16 5 3 16 7 13 5 0 14 6 0
    libpcap_fuzz_filter_98b0a2 0 0 0 0 nan 0 0 0 0 0 0 0 0
    libxml2_xml_e85b9b 3 2 2 0 0 2 0 2 1 2 3 2 0
    mbedtls_fuzz_dtlsclient_7c6b0e 0 0 0 0 0 0 0 0 0 0 0 0 0
    php_php-fuzz-parser_0dbedb 3 3 0 1 0 0 0 0 1 0 2 1 0
    systemd_fuzz-network-parser_288baf 10 10 0 0 0 7 0 0 3 0 0 0 0
    • Fuzzers are sorted by "FuzzerSum", highest on the left.
    • Green background = most unique crashes found.
    • Note: This table counts unique crashes found across all trials.

arrow_arrow-ipc-stream-fuzz_1a34a0 summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
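As a rough illustration of what a 95% confidence interval around a mean looks like, the sketch below computes one from summary statistics with the Student t half-width. This is an assumption about method: the plotting library behind the report may derive its bands differently (for example by bootstrapping). The function name is hypothetical; the inputs are the hastefuzz code-coverage figures reported for this benchmark at the final snapshot.

```python
import math


def mean_confidence_interval(mean, std, count, t_critical):
    """Return (low, high) for the mean, using the Student t half-width."""
    half_width = t_critical * std / math.sqrt(count)
    return mean - half_width, mean + half_width


# hastefuzz on the arrow benchmark: mean, std, and count from this report.
# 2.093 is the two-sided 95% t critical value for 19 degrees of freedom.
low, high = mean_confidence_interval(
    mean=2459.55, std=69.794039, count=20, t_critical=2.093
)
print(f"95% CI: [{low:.1f}, {high:.1f}]")
```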
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflrustrust 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart_plusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    hastefuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    learnperffuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
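A minimal sketch of the A12 statistic follows, using its usual definition: the probability that a value drawn from the first sample exceeds one drawn from the second, with ties counted as half. The function name is hypothetical. The two samples are reconstructed from the aspell bug-coverage means reported in this document, assuming every trial covers either 0 or 1 bug.

```python
def vargha_delaney_a12(xs, ys):
    """Probability that a value from xs exceeds one from ys, counting ties as half."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    wins = sum(x > y for x in xs for y in ys)
    ties = sum(x == y for x in xs for y in ys)
    return (wins + 0.5 * ties) / (len(xs) * len(ys))


# aspell bug coverage, 20 trials per fuzzer, rebuilt from the reported means.
afl = [1.0] * 20                  # mean 1.00
symsan = [1.0] * 6 + [0.0] * 14   # mean 0.30

a12_afl_vs_symsan = vargha_delaney_a12(afl, symsan)
a12_symsan_vs_afl = vargha_delaney_a12(symsan, afl)
print(a12_afl_vs_symsan, a12_symsan_vs_afl)
```

A value of 0.5 means neither fuzzer tends to outperform the other, and the two directions always sum to 1.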
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
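The following sketch shows one way to obtain a two-sided Mann-Whitney U p value in pure Python, using the tie-corrected normal approximation without continuity correction. It is an illustration under those assumptions: library routines such as `scipy.stats.mannwhitneyu` offer exact and continuity-corrected variants, so their values can differ slightly. The function name is hypothetical, and the samples are rebuilt from the aspell bug-coverage means in this report, assuming each trial covers 0 or 1 bug.

```python
import math
from collections import Counter


def mann_whitney_u_p_value(xs, ys):
    """Two-sided p value via the tie-corrected normal approximation."""
    n1, n2 = len(xs), len(ys)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    counts = Counter(list(xs) + list(ys))
    # Midrank of each distinct value: average of the 1-based positions it occupies.
    midrank, position = {}, 0
    for value in sorted(counts):
        midrank[value] = position + (counts[value] + 1) / 2.0
        position += counts[value]
    u_x = sum(midrank[x] for x in xs) - n1 * (n1 + 1) / 2.0
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term)
    if variance <= 0:
        return 1.0  # all observations are identical, so no difference is detectable
    z = (u_x - n1 * n2 / 2.0) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


afl = [1.0] * 20                  # aspell, mean 1.00
symsan = [1.0] * 6 + [0.0] * 14   # aspell, mean 0.30
p_different = mann_whitney_u_p_value(afl, symsan)
p_identical = mann_whitney_u_p_value(afl, [1.0] * 20)
print(p_different, p_identical)
```

A pair is typically called significantly different when its p value falls below a chosen threshold such as 0.05.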
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    hastefuzz 82800 20.0 2459.55 69.794039 2316.0 2397.00 2473.5 2512.25 2566.0
    libafl_libfuzzer 82800 20.0 2438.90 34.129629 2336.0 2427.75 2439.0 2455.50 2510.0
    afl 82800 20.0 2433.90 30.755231 2369.0 2427.75 2438.0 2446.50 2504.0
    aflrustrust 82800 20.0 2392.80 52.156243 2306.0 2334.50 2419.5 2438.50 2453.0
    libfuzzer 82800 20.0 2373.65 45.375944 2296.0 2342.75 2380.5 2402.50 2444.0
    aflplusplus 82800 20.0 2327.45 38.558875 2300.0 2305.75 2309.0 2332.50 2446.0
    aflplusplusplus 82800 20.0 2210.05 49.650860 2025.0 2200.00 2220.5 2238.50 2255.0
    aflsmart_plusplus 82800 20.0 2123.70 95.438103 1822.0 2121.50 2150.0 2171.25 2225.0
    learnperffuzz 82800 20.0 1674.50 204.702223 1516.0 1516.00 1516.0 1872.00 2031.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

aspell_aspell_fuzzer_e8eb74 summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflplusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflplusplusplus 82800 20.0 0.80 0.410391 0.0 1.0 1.0 1.0 1.0
    aflrustrust 82800 20.0 0.85 0.366348 0.0 1.0 1.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    hastefuzz 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    honggfuzz 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    libafl_libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    libfuzzer 82800 20.0 0.80 0.410391 0.0 1.0 1.0 1.0 1.0
    pastis 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    learnperffuzz 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    symsan 82800 20.0 0.30 0.470162 0.0 0.0 0.0 1.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    honggfuzz 82800 20.0 3322.95 69.073169 3246.0 3275.00 3304.5 3333.25 3533.0
    symsan 82800 20.0 3336.35 172.236763 3192.0 3215.00 3267.5 3399.50 3892.0
    aflrustrust 82800 20.0 3216.85 92.621059 3029.0 3152.50 3227.0 3276.50 3356.0
    hastefuzz 82800 20.0 3275.95 204.932974 3024.0 3128.25 3190.0 3391.75 3754.0
    aflplusplus 82800 20.0 3252.85 128.214161 3139.0 3163.00 3179.0 3323.00 3552.0
    aflplusplusplus 82800 20.0 3224.30 204.567606 2969.0 3120.75 3175.0 3311.50 3826.0
    aflsmart_plusplus 82800 20.0 3141.70 6.105218 3132.0 3137.00 3141.0 3146.00 3153.0
    afl 82800 20.0 3139.15 6.611593 3116.0 3137.00 3140.0 3143.00 3149.0
    libafl_libfuzzer 82800 20.0 3135.10 12.540041 3106.0 3129.75 3134.0 3141.25 3155.0
    pastis 82800 20.0 3133.35 44.317366 3038.0 3103.75 3132.5 3160.25 3225.0
    libfuzzer 82800 20.0 3068.00 80.004605 2930.0 3005.75 3055.0 3124.00 3222.0
    learnperffuzz 82800 20.0 2902.40 75.345799 2625.0 2880.00 2911.0 2932.25 3013.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

assimp_assimp_fuzzer_4d451f summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.60 0.502625 0.0 0.0 1.0 1.00 1.0
    aflplusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    aflplusplusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    aflrustrust 82800 20.0 0.90 0.307794 0.0 1.0 1.0 1.00 1.0
    hastefuzz 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    honggfuzz 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    pastis 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    symsan 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.00 1.0
    aflsmart_plusplus 82800 20.0 0.50 0.512989 0.0 0.0 0.5 1.00 1.0
    learnperffuzz 82800 20.0 0.15 0.366348 0.0 0.0 0.0 0.00 1.0
    libafl_libfuzzer 82800 20.0 0.25 0.444262 0.0 0.0 0.0 0.25 1.0
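The columns of these sample-statistics tables can be reproduced with the standard library. The sketch below is an illustration, not the report's own code: the function name is hypothetical, and the sample is rebuilt from the libafl_libfuzzer row above on the assumption that each trial covers either 0 or 1 bug, so a mean of 0.25 over 20 trials implies 5 successful trials. Quartiles use linear interpolation between order statistics and the standard deviation uses the n - 1 denominator, which matches the values in that row.

```python
import statistics


def describe(sample):
    """Summary statistics in the column order used by the tables in this report."""
    ordered = sorted(sample)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": float(len(ordered)),
        "mean": statistics.mean(ordered),
        "std": statistics.stdev(ordered),  # sample standard deviation (n - 1)
        "min": ordered[0],
        "25%": q1,
        "median": q2,
        "75%": q3,
        "max": ordered[-1],
    }


# libafl_libfuzzer on assimp: 5 trials covered the bug, 15 did not.
trials = [1.0] * 5 + [0.0] * 15
summary = describe(trials)
for name, value in summary.items():
    print(f"{name:>6}: {value:.6f}")
```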

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    symsan 82800 20.0 3829.65 281.525969 3005.0 3712.50 3788.5 4029.00 4181.0
    honggfuzz 82800 20.0 3396.35 196.457113 3143.0 3245.25 3396.0 3501.00 3885.0
    libfuzzer 82800 20.0 2979.95 119.598220 2712.0 2932.75 2981.0 3060.50 3178.0
    pastis 82800 20.0 2900.35 231.003822 2544.0 2687.00 2970.0 3078.00 3228.0
    aflplusplusplus 82800 20.0 2606.00 176.928175 2286.0 2498.25 2615.0 2716.25 3012.0
    hastefuzz 82800 20.0 2441.40 172.700654 2120.0 2326.75 2423.0 2531.50 2866.0
    aflrustrust 82800 20.0 2115.50 273.123165 1689.0 1854.25 2151.0 2375.00 2485.0
    aflplusplus 82800 20.0 2173.10 159.644968 1981.0 2039.75 2133.5 2243.50 2524.0
    aflsmart_plusplus 82800 20.0 1526.95 82.045222 1376.0 1462.75 1545.5 1579.25 1722.0
    afl 82800 20.0 1499.55 101.539453 1254.0 1441.25 1518.0 1570.50 1638.0
    libafl_libfuzzer 82800 20.0 1361.80 420.583254 0.0 1152.25 1416.5 1647.00 1888.0
    learnperffuzz 82800 20.0 287.85 331.236324 0.0 0.00 0.0 603.25 748.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

bloaty_fuzz_target_52948c summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflrustrust 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    libafl_libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    pastis 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    aflplusplus 82800 20.0 0.45 0.510418 0.0 0.0 0.0 1.0 1.0
    aflplusplusplus 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    hastefuzz 82800 20.0 0.30 0.470162 0.0 0.0 0.0 1.0 1.0
    honggfuzz 82800 20.0 0.40 0.502625 0.0 0.0 0.0 1.0 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libfuzzer 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_libfuzzer 82800 20.0 6006.40 84.788530 5828.0 5964.25 6022.5 6076.25 6135.0
    afl 82800 20.0 5930.65 105.336691 5700.0 5857.25 5940.0 6007.50 6089.0
    aflsmart_plusplus 82800 20.0 5894.45 121.817584 5591.0 5856.50 5922.0 5945.25 6113.0
    aflrustrust 82800 20.0 5864.40 89.901349 5667.0 5818.75 5863.5 5919.50 6030.0
    honggfuzz 82800 20.0 5765.45 141.435674 5598.0 5663.00 5712.5 5833.75 6077.0
    aflplusplus 82800 20.0 5560.05 66.008353 5461.0 5521.00 5568.5 5598.00 5739.0
    pastis 82800 20.0 5208.90 122.218054 5078.0 5132.50 5183.5 5241.25 5540.0
    hastefuzz 82800 20.0 4965.70 143.219780 4684.0 4867.25 4988.5 5061.75 5199.0
    aflplusplusplus 82800 20.0 4700.40 125.806619 4488.0 4589.75 4713.0 4805.00 4872.0
    learnperffuzz 82800 20.0 4409.75 74.880167 4364.0 4364.00 4364.0 4430.75 4598.0
    libfuzzer 82800 20.0 4364.00 0.000000 4364.0 4364.00 4364.0 4364.00 4364.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

ffmpeg_ffmpeg_demuxer_fuzzer_7adeef summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 0.85 0.366348 0.0 1.00 1.0 1.0 1.0
    aflplusplusplus 82800 20.0 1.00 0.000000 1.0 1.00 1.0 1.0 1.0
    honggfuzz 82800 20.0 1.00 0.000000 1.0 1.00 1.0 1.0 1.0
    libafl_libfuzzer 82800 20.0 0.75 0.444262 0.0 0.75 1.0 1.0 1.0
    libfuzzer 82800 20.0 0.90 0.307794 0.0 1.00 1.0 1.0 1.0
    pastis 82800 20.0 1.00 0.000000 1.0 1.00 1.0 1.0 1.0
    afl 82800 20.0 0.20 0.410391 0.0 0.00 0.0 0.0 1.0
    aflrustrust 82800 20.0 0.15 0.366348 0.0 0.00 0.0 0.0 1.0
    aflsmart_plusplus 82800 20.0 0.20 0.410391 0.0 0.00 0.0 0.0 1.0
    hastefuzz 82800 20.0 0.35 0.489360 0.0 0.00 0.0 1.0 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplusplus 82800 20.0 20430.50 1565.344203 17060.0 20072.75 20589.5 21105.25 24113.0
    pastis 82800 20.0 20044.45 365.877816 19319.0 19850.50 20088.5 20221.00 20895.0
    honggfuzz 82800 20.0 19726.80 612.479998 18154.0 19625.25 19867.0 20104.75 20535.0
    libafl_libfuzzer 82800 20.0 19573.70 895.369615 17633.0 19122.50 19644.0 20292.75 20993.0
    aflplusplus 82800 20.0 17498.05 552.573832 16475.0 17194.25 17536.5 17716.50 18907.0
    libfuzzer 82800 20.0 15740.05 370.563792 14835.0 15579.25 15769.5 16000.75 16270.0
    hastefuzz 82800 20.0 14985.55 924.759683 13071.0 14135.75 14889.5 15761.50 16336.0
    aflsmart_plusplus 82800 20.0 14285.10 294.491337 13690.0 14185.75 14324.0 14448.50 14781.0
    afl 82800 20.0 13987.80 641.314571 12515.0 13761.75 14149.0 14360.50 14905.0
    aflrustrust 82800 20.0 13352.55 3849.710987 4372.0 11431.50 13586.0 16452.00 18781.0
    learnperffuzz 82800 20.0 2269.50 2109.999439 0.0 0.00 3881.5 4111.75 4422.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

file_magic_fuzzer_2d5f85 summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflplusplus 82800 20.0 0.55 0.510418 0.0 0.0 1.0 1.0 1.0
    aflplusplusplus 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    aflrustrust 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    hastefuzz 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    pastis 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    symsan 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflrustrust 82800 20.0 2496.40 12.542139 2472.0 2485.50 2498.0 2504.25 2521.0
    hastefuzz 82800 20.0 2479.10 19.512209 2433.0 2469.25 2489.0 2492.00 2502.0
    afl 82800 20.0 2360.90 9.930813 2332.0 2356.75 2362.0 2366.50 2375.0
    libafl_libfuzzer 82800 20.0 2393.05 67.296262 2321.0 2338.25 2351.5 2470.25 2505.0
    aflsmart_plusplus 82800 20.0 2345.70 9.194392 2332.0 2340.00 2344.0 2353.50 2366.0
    libfuzzer 82800 20.0 2331.70 8.163462 2319.0 2325.75 2329.5 2338.75 2344.0
    aflplusplusplus 82800 20.0 2260.75 44.048866 2203.0 2237.00 2251.0 2267.75 2385.0
    aflplusplus 82800 20.0 2259.95 64.353853 2190.0 2221.00 2242.0 2273.50 2411.0
    pastis 82800 20.0 1850.65 4.270770 1845.0 1848.25 1850.0 1852.75 1859.0
    learnperffuzz 82800 20.0 1899.75 118.996185 1810.0 1810.00 1810.0 2008.50 2104.0
    symsan 82800 20.0 1810.00 0.000000 1810.0 1810.00 1810.0 1810.00 1810.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

grok_grk_decompress_fuzzer_9cd001 summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    aflplusplus 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    aflplusplusplus 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    aflrustrust 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    hastefuzz 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    honggfuzz 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    learnperffuzz 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    libafl_libfuzzer 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    libfuzzer 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    pastis 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0
    symsan 82800 20.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplusplus 82800 20.0 6192.80 59.330120 6108.0 6146.25 6192.0 6231.50 6294.0
    pastis 82800 20.0 6164.55 46.052001 6001.0 6148.50 6166.0 6188.75 6242.0
    aflplusplus 82800 20.0 6162.45 56.716725 6082.0 6127.50 6152.0 6189.00 6335.0
    symsan 82800 20.0 5894.40 462.420793 4944.0 5765.25 6120.5 6187.50 6267.0
    hastefuzz 82800 20.0 6114.95 55.866829 6007.0 6089.00 6106.0 6140.75 6235.0
    honggfuzz 82800 20.0 6089.35 24.602685 6040.0 6073.00 6084.0 6108.25 6128.0
    aflsmart_plusplus 82800 20.0 6073.05 54.985620 5991.0 6026.25 6078.5 6093.50 6231.0
    afl 82800 20.0 6050.20 69.261214 5932.0 5998.50 6046.0 6102.50 6203.0
    aflrustrust 82800 20.0 5978.65 154.930978 5518.0 5927.25 5997.5 6086.50 6204.0
    libfuzzer 82800 20.0 5943.10 40.895116 5887.0 5916.50 5943.5 5966.75 6037.0
    libafl_libfuzzer 82800 20.0 5519.20 5.001053 5512.0 5515.75 5519.0 5522.25 5530.0
    learnperffuzz 82800 20.0 5553.15 79.865199 5490.0 5496.75 5500.5 5627.50 5724.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

harfbuzz_hb-shape-fuzzer_17863b summary

Figures: Discovered bug coverage distribution; Reached code coverage distribution; Mean code coverage growth over time; Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflplusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflplusplusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    aflrustrust 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    hastefuzz 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    honggfuzz 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    libafl_libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    libfuzzer 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    pastis 82800 20.0 1.00 0.000000 1.0 1.0 1.0 1.0 1.0
    symsan 82800 20.0 0.70 0.470162 0.0 0.0 1.0 1.0 1.0
    learnperffuzz 82800 20.0 0.05 0.223607 0.0 0.0 0.0 0.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_libfuzzer 82800 20.0 10485.05 57.207356 10352.0 10455.25 10483.0 10520.50 10571.0
    aflsmart_plusplus 82800 20.0 10190.05 40.783736 10093.0 10163.75 10180.5 10214.50 10269.0
    hastefuzz 82800 20.0 10169.80 46.813853 10096.0 10138.25 10170.0 10195.75 10261.0
    afl 82800 20.0 10155.75 56.914409 10048.0 10115.50 10160.5 10181.50 10249.0
    honggfuzz 82800 20.0 10150.50 50.514719 10077.0 10099.50 10160.0 10191.25 10227.0
    pastis 82800 20.0 10123.20 41.710279 10036.0 10096.50 10122.5 10152.00 10216.0
    aflplusplus 82800 20.0 10075.45 36.035764 10013.0 10045.75 10075.0 10108.00 10147.0
    aflplusplusplus 82800 20.0 10060.55 40.177861 9994.0 10027.00 10069.5 10077.75 10165.0
    symsan 82800 20.0 9448.85 89.870828 9282.0 9381.75 9435.0 9518.50 9629.0
    aflrustrust 82800 20.0 9107.00 754.933946 7910.0 8341.00 9427.0 9618.75 10302.0
    libfuzzer 82800 20.0 8950.80 134.508697 8672.0 8875.00 8947.5 9031.50 9240.0
    learnperffuzz 82800 20.0 8093.75 353.363221 7317.0 7937.75 8149.0 8288.25 8638.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
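The A12 effect size described above can be sketched in a few lines. The function name `vargha_delaney_a12` is hypothetical, and the per-trial samples are reconstructed from the bug coverage table (a mean of 0.70 over 20 trials implies 14 trials that found the bug), so treat them as an assumption rather than the raw data.

```python
def vargha_delaney_a12(xs, ys):
    """Probability that a value drawn from xs exceeds one drawn from ys.

    Ties count as half a win, so 0.5 means no difference between the samples.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


# Reconstructed bug coverage per trial (1 = bug found, 0 = not found).
afl = [1] * 20                     # mean 1.00 in the table
aflrustrust = [1] * 14 + [0] * 6   # mean 0.70 in the table

a12 = vargha_delaney_a12(afl, aflrustrust)
print(f"A12(afl, aflrustrust) = {a12:.2f}")
```

Swapping the arguments yields the complementary value, which is why the pairwise table is antisymmetric around 0.5.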

lcms_cms_transform_all_fuzzer_97d37d summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    aflplusplus 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    aflplusplusplus 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    aflrustrust 82800 20.0 0.40 0.502625 0.0 0.0 0.0 1.0 1.0
    aflsmart_plusplus 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    hastefuzz 82800 20.0 0.15 0.366348 0.0 0.0 0.0 0.0 1.0
    honggfuzz 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.45 0.510418 0.0 0.0 0.0 1.0 1.0
    libfuzzer 82800 20.0 0.15 0.366348 0.0 0.0 0.0 0.0 1.0
    pastis 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.0 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_libfuzzer 82800 20.0 2353.10 220.974850 1931.0 2266.00 2350.0 2530.25 2784.0
    hastefuzz 82800 20.0 2281.10 164.471690 1762.0 2215.75 2298.5 2370.50 2487.0
    aflrustrust 82800 20.0 2100.60 239.607442 1679.0 1896.00 2135.5 2238.25 2531.0
    aflplusplusplus 82800 20.0 1912.30 301.010246 1117.0 1851.75 1963.0 2112.00 2308.0
    pastis 82800 20.0 1853.85 150.481779 1400.0 1766.50 1895.0 1959.75 2056.0
    libfuzzer 82800 20.0 1865.95 182.349137 1529.0 1731.25 1884.5 1975.75 2162.0
    honggfuzz 82800 20.0 1761.55 264.876828 1180.0 1569.50 1838.5 1943.25 2181.0
    afl 82800 20.0 1622.10 227.987280 1019.0 1510.50 1658.0 1771.00 1966.0
    aflplusplus 82800 20.0 1659.20 178.379135 1270.0 1575.50 1655.5 1784.25 2028.0
    aflsmart_plusplus 82800 20.0 1505.35 302.290776 798.0 1486.75 1583.0 1690.75 1931.0
    learnperffuzz 82800 20.0 87.70 45.320032 73.0 73.00 73.0 73.00 228.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
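For readers who want to see how a pairwise p value of this kind can arise, here is a minimal sketch of a two-sided Mann-Whitney U test using the normal approximation with tie correction. It is not necessarily the implementation behind the report: details such as continuity correction or exact p values may differ. The function name `mann_whitney_u` is hypothetical, and the samples are reconstructed from the bug coverage table for this benchmark (aflrustrust mean 0.40 over 20 trials, aflplusplus mean 0.00).

```python
import math
from collections import Counter


def mann_whitney_u(xs, ys):
    """Return (U statistic of xs, two-sided p value).

    Uses midranks for ties and the tie-corrected normal approximation,
    without continuity correction.
    """
    n1, n2 = len(xs), len(ys)
    pooled = sorted(list(xs) + list(ys))

    # Assign each distinct value the average of the ranks it occupies.
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # ranks i+1 .. j, averaged
        i = j

    rank_sum = sum(midrank[x] for x in xs)
    u = rank_sum - n1 * (n1 + 1) / 2

    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        # All observations are identical: no evidence of a difference.
        return u, 1.0

    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, p


# Reconstructed bug coverage per trial (1 = bug found, 0 = not found).
aflrustrust = [1] * 8 + [0] * 12
aflplusplus = [0] * 20

u, p = mann_whitney_u(aflrustrust, aflplusplus)
print(f"U = {u}, p = {p:.4f}")
```

With binary outcomes the pooled sample is almost entirely ties, which is why the tie correction matters here: without it the variance would be overestimated and the p value inflated.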

libaom_av1_dec_fuzzer_6e1848 summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflrustrust 82800 20.0 0.75 0.444262 0.0 0.75 1.0 1.00 1.0
    aflsmart_plusplus 82800 20.0 0.95 0.223607 0.0 1.00 1.0 1.00 1.0
    hastefuzz 82800 20.0 0.60 0.502625 0.0 0.00 1.0 1.00 1.0
    honggfuzz 82800 20.0 0.90 0.307794 0.0 1.00 1.0 1.00 1.0
    libafl_libfuzzer 82800 20.0 0.65 0.489360 0.0 0.00 1.0 1.00 1.0
    pastis 82800 20.0 0.90 0.307794 0.0 1.00 1.0 1.00 1.0
    afl 82800 20.0 0.40 0.502625 0.0 0.00 0.0 1.00 1.0
    aflplusplus 82800 20.0 0.35 0.489360 0.0 0.00 0.0 1.00 1.0
    aflplusplusplus 82800 20.0 0.25 0.444262 0.0 0.00 0.0 0.25 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    libfuzzer 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    symsan 82800 20.0 0.20 0.410391 0.0 0.00 0.0 0.00 1.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_libfuzzer 82800 20.0 10456.60 264.029982 9954.0 10229.00 10601.5 10654.00 10703.0
    honggfuzz 82800 20.0 10557.40 101.410785 10260.0 10500.25 10578.5 10634.25 10697.0
    pastis 82800 20.0 10537.70 59.993947 10446.0 10494.50 10535.0 10590.00 10632.0
    aflsmart_plusplus 82800 20.0 10506.75 79.056858 10323.0 10478.50 10500.5 10572.25 10634.0
    hastefuzz 82800 20.0 10499.80 165.938035 10146.0 10383.75 10491.5 10621.75 10762.0
    aflrustrust 82800 20.0 10439.15 162.794438 10138.0 10315.25 10478.0 10526.25 10722.0
    aflplusplus 82800 20.0 10194.85 236.731977 9718.0 10003.75 10213.5 10382.50 10573.0
    aflplusplusplus 82800 20.0 10166.95 159.989301 9661.0 10124.75 10201.5 10273.50 10372.0
    afl 82800 20.0 10176.00 97.993018 10076.0 10129.25 10144.5 10197.75 10524.0
    symsan 82800 20.0 10132.90 116.539083 9928.0 10061.00 10108.0 10220.75 10373.0
    libfuzzer 82800 20.0 9810.55 58.383915 9733.0 9764.00 9800.0 9852.50 9928.0
    learnperffuzz 82800 20.0 9077.60 12.141274 9057.0 9072.75 9076.5 9084.00 9102.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpcap_fuzz_filter_98b0a2 summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflrustrust 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart_plusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    hastefuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    learnperffuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    pastis 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplusplus 82800 20.0 3448.30 73.160314 3339.0 3382.50 3442.5 3511.00 3586.0
    aflplusplus 82800 20.0 3418.35 45.550752 3331.0 3386.75 3417.5 3448.25 3507.0
    hastefuzz 82800 20.0 3414.45 79.349643 3292.0 3350.00 3380.5 3473.00 3547.0
    pastis 82800 20.0 3393.40 63.641595 3296.0 3349.50 3379.5 3424.75 3512.0
    libafl_libfuzzer 82800 20.0 3348.60 28.051456 3301.0 3324.25 3354.5 3369.25 3401.0
    afl 82800 20.0 3328.95 38.134768 3247.0 3309.25 3332.0 3350.00 3385.0
    honggfuzz 82800 20.0 3266.45 27.755464 3223.0 3248.75 3264.0 3282.50 3340.0
    aflrustrust 82800 20.0 3249.90 35.002105 3166.0 3232.25 3258.0 3277.00 3291.0
    aflsmart_plusplus 82800 20.0 3214.55 69.430370 3141.0 3164.50 3186.0 3292.00 3331.0
    libfuzzer 82800 20.0 3093.45 100.218958 2932.0 3002.50 3107.0 3174.00 3260.0
    learnperffuzz 82800 20.0 1921.10 122.437481 1625.0 1858.00 1958.0 1995.50 2070.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxml2_xml_e85b9b summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 1.00 0.000000 1.0 1.00 1.0 1.00 1.0
    aflrustrust 82800 20.0 0.75 0.444262 0.0 0.75 1.0 1.00 1.0
    aflsmart_plusplus 82800 20.0 1.00 0.000000 1.0 1.00 1.0 1.00 1.0
    honggfuzz 82800 20.0 0.75 0.444262 0.0 0.75 1.0 1.00 1.0
    libafl_libfuzzer 82800 20.0 0.75 0.444262 0.0 0.75 1.0 1.00 1.0
    pastis 82800 20.0 0.95 0.223607 0.0 1.00 1.0 1.00 1.0
    aflplusplus 82800 20.0 0.15 0.366348 0.0 0.00 0.0 0.00 1.0
    aflplusplusplus 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    hastefuzz 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    libfuzzer 82800 20.0 0.25 0.444262 0.0 0.00 0.0 0.25 1.0
    symsan 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 19718.45 212.021467 19378.0 19506.25 19761.5 19868.75 20069.0
    aflsmart_plusplus 82800 20.0 19667.25 214.119710 19252.0 19527.50 19688.0 19820.00 19970.0
    pastis 82800 20.0 19675.55 121.492246 19451.0 19603.25 19678.5 19751.50 19905.0
    libafl_libfuzzer 82800 20.0 19441.70 644.416676 17179.0 19405.00 19587.0 19793.00 20041.0
    aflrustrust 82800 20.0 18786.85 929.576829 16788.0 18892.50 19160.5 19313.25 19703.0
    aflplusplus 82800 20.0 18742.55 592.956684 16584.0 18563.50 18898.5 19055.25 19386.0
    aflplusplusplus 82800 20.0 18659.85 616.815188 16173.0 18650.75 18837.0 18934.50 19075.0
    hastefuzz 82800 20.0 18326.40 646.066837 16287.0 18012.00 18582.5 18744.00 19209.0
    honggfuzz 82800 20.0 17095.90 55.619856 16998.0 17067.50 17096.5 17124.75 17218.0
    symsan 82800 20.0 16618.95 486.249175 15602.0 16387.00 16713.5 16867.50 17569.0
    libfuzzer 82800 20.0 15525.50 550.380586 14618.0 15296.25 15409.5 15719.75 17195.0
    learnperffuzz 82800 20.0 12003.85 931.552257 9659.0 12039.00 12272.5 12482.00 13522.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

mbedtls_fuzz_dtlsclient_7c6b0e summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflplusplusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflrustrust 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    aflsmart_plusplus 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    hastefuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    honggfuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    learnperffuzz 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    libfuzzer 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    pastis 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    symsan 82800 20.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    hastefuzz 82800 20.0 2635.20 9.168596 2618.0 2630.75 2637.0 2638.00 2662.0
    aflsmart_plusplus 82800 20.0 2843.00 476.554741 2517.0 2615.75 2625.5 2638.75 3789.0
    aflplusplusplus 82800 20.0 2614.45 18.503129 2562.0 2607.00 2613.5 2625.50 2645.0
    afl 82800 20.0 2608.95 11.500458 2589.0 2602.75 2612.5 2616.50 2629.0
    libafl_libfuzzer 82800 20.0 2868.25 389.592077 2557.0 2589.25 2605.0 3149.25 3755.0
    honggfuzz 82800 20.0 2655.10 194.723258 2553.0 2582.00 2600.0 2611.00 3283.0
    aflrustrust 82800 20.0 2732.70 227.217054 2543.0 2566.25 2596.5 2915.25 3192.0
    libfuzzer 82800 20.0 2588.35 15.428187 2557.0 2580.25 2587.5 2592.25 2620.0
    aflplusplus 82800 20.0 2589.95 19.645476 2570.0 2573.75 2583.0 2605.00 2631.0
    symsan 82800 20.0 2626.75 246.799487 2500.0 2525.75 2556.0 2568.25 3354.0
    pastis 82800 20.0 2536.20 19.933046 2493.0 2524.25 2537.5 2550.75 2570.0
    learnperffuzz 82800 20.0 1823.00 0.000000 1823.0 1823.00 1823.0 1823.00 1823.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

php_php-fuzz-parser_0dbedb summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.15 0.366348 0.0 0.0 0.0 0.00 1.0
    aflplusplus 82800 20.0 0.05 0.223607 0.0 0.0 0.0 0.00 1.0
    aflplusplusplus 82800 20.0 0.20 0.410391 0.0 0.0 0.0 0.00 1.0
    aflrustrust 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    aflsmart_plusplus 82800 20.0 0.25 0.444262 0.0 0.0 0.0 0.25 1.0
    hastefuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    honggfuzz 82800 20.0 0.10 0.307794 0.0 0.0 0.0 0.00 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    libafl_libfuzzer 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    libfuzzer 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    pastis 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0
    symsan 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.00 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    honggfuzz 82800 20.0 17082.55 114.516593 16835.0 17003.75 17088.0 17156.75 17293.0
    pastis 82800 20.0 16966.35 74.106094 16784.0 16935.75 16958.0 16992.25 17119.0
    libafl_libfuzzer 82800 20.0 16760.00 24.094114 16722.0 16740.25 16758.0 16778.75 16799.0
    aflplusplus 82800 20.0 16754.90 50.301093 16668.0 16714.75 16748.5 16789.00 16873.0
    aflplusplusplus 82800 20.0 16738.85 95.821806 16568.0 16680.00 16704.5 16845.75 16910.0
    aflsmart_plusplus 82800 20.0 16670.15 65.302756 16551.0 16637.25 16666.5 16682.25 16842.0
    afl 82800 20.0 16649.55 49.663579 16574.0 16632.50 16644.0 16653.25 16824.0
    hastefuzz 82800 20.0 16580.35 43.343518 16510.0 16553.00 16574.0 16610.50 16677.0
    aflrustrust 82800 20.0 16481.60 43.711254 16438.0 16461.50 16468.0 16492.50 16644.0
    libfuzzer 82800 20.0 16456.35 59.187725 16374.0 16423.25 16455.0 16475.25 16597.0
    symsan 82800 20.0 16058.30 187.560545 15614.0 15982.50 16065.0 16194.25 16351.0
    learnperffuzz 82800 20.0 15441.45 123.038066 15034.0 15404.25 15459.0 15531.50 15582.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

systemd_fuzz-network-parser_288baf summary

Discovered bug coverage distribution
Reached code coverage distribution
Mean code coverage growth over time
Mean bug coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (bugs covered)
    Bug coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    afl 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    aflplusplus 82800 20.0 0.05 0.223607 0.0 0.0 0.0 0.0 1.0
    aflplusplusplus 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    aflrustrust 82800 20.0 0.05 0.223607 0.0 0.0 0.0 0.0 1.0
    aflsmart_plusplus 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    hastefuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    honggfuzz 82800 20.0 0.15 0.366348 0.0 0.0 0.0 0.0 1.0
    learnperffuzz 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libafl_libfuzzer 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    libfuzzer 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    pastis 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0
    symsan 82800 20.0 0.00 0.000000 0.0 0.0 0.0 0.0 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_libfuzzer 82800 20.0 3631.65 20.481763 3602.0 3616.75 3624.5 3649.00 3670.0
    aflrustrust 82800 20.0 3555.25 24.167998 3507.0 3535.75 3555.5 3568.50 3598.0
    afl 82800 20.0 3406.70 23.965106 3369.0 3388.00 3405.5 3421.25 3448.0
    aflsmart_plusplus 82800 20.0 3373.55 31.523550 3324.0 3353.25 3372.0 3400.00 3434.0
    libfuzzer 82800 20.0 2885.10 683.519984 1880.0 2039.25 3260.0 3424.00 3524.0
    aflplusplusplus 82800 20.0 3242.15 77.869172 3128.0 3169.25 3247.5 3289.25 3400.0
    symsan 82800 20.0 3133.95 21.597941 3092.0 3122.00 3130.0 3140.00 3196.0
    aflplusplus 82800 20.0 3121.25 34.228143 3058.0 3093.75 3119.5 3153.50 3179.0
    hastefuzz 82800 20.0 3120.55 33.254165 3066.0 3098.50 3111.0 3136.75 3186.0
    honggfuzz 82800 20.0 2771.35 259.213847 2562.0 2650.25 2701.5 2746.25 3505.0
    learnperffuzz 82800 20.0 1856.50 348.236535 1636.0 1669.00 1699.5 1728.00 2558.0
    pastis 82800 20.0 1617.60 4.881760 1611.0 1612.00 1620.0 1622.00 1623.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.
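As a starting point for such custom analysis, the sketch below recomputes one row of the sample statistics tables from per-trial values using only the Python standard library. The helper name `describe` is hypothetical, and the input is a reconstruction, not the raw data: the aflrustrust bug coverage row for libaom_av1_dec_fuzzer_6e1848 (mean 0.75 over 20 trials) implies 15 trials that found the bug. The quartiles assume linear interpolation between order statistics, which is consistent with the 25% value of 0.75 shown in that row.

```python
import statistics


def describe(values):
    """Summarize a sample with the same columns as the report's tables."""
    # method="inclusive" interpolates linearly between order statistics.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": float(len(values)),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "median": median,
        "75%": q3,
        "max": max(values),
    }


# Reconstructed per-trial bug coverage: 5 trials missed the bug, 15 found it.
row = describe([0] * 5 + [1] * 15)
print(" ".join(f"{name}={value:.6g}" for name, value in row.items()))
```

The same helper applies unchanged to code coverage samples, where the values are edge counts rather than 0/1 outcomes.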

The experiment was conducted using this FuzzBench commit: 614a601b50ed154b3514e0eb65c1a9690c47bbef

To reproduce this experiment run the following commands in your FuzzBench repo:
# Check out the right commit.
git checkout 614a601b50ed154b3514e0eb65c1a9690c47bbef
# Download the internal config file.
curl "https://storage.googleapis.com/fuzzbench-data/FuzzBench:%20SBFT'23%20Final%20Evaluation/input/config/experiment.yaml" > /tmp/experiment-config.yaml
make install-dependencies
# Launch the experiment using parameters from the internal config file.
PYTHONPATH=. python experiment/reproduce_experiment.py -c /tmp/experiment-config.yaml -e <new_experiment_name>


Experiment Description:

from cached data