FuzzBench: 2024-05-22-bases report

(experiment incomplete/still running...)

experiment summary

We show two different aggregate (cross-benchmark) rankings of fuzzers. The first is based on the average of per-benchmark scores, where the score represents the percentage of the highest reached median code coverage on a given benchmark (higher value is better). The second ranking shows the average rank of fuzzers, after we rank them on each benchmark according to their median reached code coverage (lower value is better).
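As a rough illustration of the two aggregate rankings just described, the sketch below computes both from a table of per-benchmark median coverages. The benchmark names, fuzzer names and coverage numbers are made up for the example, and the helper names `average_normalized_score` and `average_rank` are hypothetical. This is not the report generator's own code; in particular, the tie handling (tied fuzzers share the average of the ranks they span) is an assumption.

```python
from statistics import mean

# Hypothetical per-benchmark median coverage; NOT taken from this report.
medians = {
    "bench_a": {"fuzz_x": 100.0, "fuzz_y": 80.0, "fuzz_z": 50.0},
    "bench_b": {"fuzz_x": 45.0, "fuzz_y": 50.0, "fuzz_z": 40.0},
}


def average_normalized_score(medians):
    """Mean over benchmarks of 100 * median / best median on that benchmark."""
    scores = {}
    for per_fuzzer in medians.values():
        best = max(per_fuzzer.values())
        for fuzzer, coverage in per_fuzzer.items():
            scores.setdefault(fuzzer, []).append(100.0 * coverage / best)
    return {fuzzer: mean(values) for fuzzer, values in scores.items()}


def average_rank(medians):
    """Mean over benchmarks of each fuzzer's rank (1 = highest median).

    Assumption: tied fuzzers share the average of the ranks they span.
    """
    ranks = {}
    for per_fuzzer in medians.values():
        ordered = sorted(per_fuzzer.values(), reverse=True)
        for fuzzer, coverage in per_fuzzer.items():
            first = ordered.index(coverage) + 1           # best rank among ties
            last = first + ordered.count(coverage) - 1    # worst rank among ties
            ranks.setdefault(fuzzer, []).append((first + last) / 2)
    return {fuzzer: mean(values) for fuzzer, values in ranks.items()}


scores = average_normalized_score(medians)
ranks = average_rank(medians)
```

In this toy input `fuzz_x` leads on average score (95 against 90 for `fuzz_y`) while the two tie on average rank (1.5 each), which shows how the two rankings can order fuzzers differently.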
By avg. score
fuzzer average normalized score
libafl 94.75
libfuzzer 93.70
aflplusplus 89.38
centipede 68.95
aflfast 32.86
afl 21.46
By avg. rank
fuzzer average rank
aflplusplus 2.04
libafl 2.26
libfuzzer 2.61
centipede 3.74
afl 4.70
aflfast 4.70
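The average ranks above are the input to the Friedman/Nemenyi post-hoc comparison. As a hedged sketch, the Nemenyi critical difference is commonly computed as CD = q_alpha * sqrt(k(k+1) / (6N)) for k fuzzers and N benchmarks. The code below applies that formula to the ranks in this report. The critical values are the commonly tabulated ones for alpha = 0.05; the choice of alpha, the use of N = 23 benchmarks, and the function names are assumptions, since the report does not state them.

```python
import math

# Commonly tabulated Nemenyi critical values q_alpha for alpha = 0.05,
# indexed by the number of compared methods k.
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
               7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}


def nemenyi_critical_difference(num_fuzzers, num_benchmarks):
    """CD = q_alpha * sqrt(k * (k + 1) / (6 * N))."""
    q_alpha = Q_ALPHA_005[num_fuzzers]
    return q_alpha * math.sqrt(
        num_fuzzers * (num_fuzzers + 1) / (6.0 * num_benchmarks))


# Average ranks copied from the table above.
average_rank = {"aflplusplus": 2.04, "libafl": 2.26, "libfuzzer": 2.61,
                "centipede": 3.74, "afl": 4.70, "aflfast": 4.70}

# Assumption: all 6 fuzzers are compared over the 23 benchmarks in the report.
cd = nemenyi_critical_difference(num_fuzzers=6, num_benchmarks=23)


def significantly_different(fuzzer_a, fuzzer_b):
    """True if the two average ranks differ by more than the critical difference."""
    return abs(average_rank[fuzzer_a] - average_rank[fuzzer_b]) > cd
```

Under these assumptions the critical difference is about 1.57, so aflplusplus (2.04) and afl (4.70) would count as significantly different while aflplusplus and libafl (2.26) would not. The diagram in the report may be based on a different number of benchmarks or significance level, so treat these numbers as illustrative only.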
  • Critical difference diagram
    The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
    Note: If a fuzzer does not support all benchmarks, its ranking as shown in this diagram can be lower than it should be. Please check the list of supported benchmarks for the fuzzers you are interested in. The list can be specified in the fuzzer's README.md.
  • Median relative code-coverages on each benchmark

    Note: The relative coverage summary table shows each fuzzer's median performance relative to the experiment maximum. Thus the highest relative performance may not be 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      libafl aflplusplus libfuzzer aflfast afl centipede
    FuzzerMedian 98.00 98.00 95.00 94.00 97.50 90.00
    FuzzerMean 95.86 94.62 90.70 82.33 81.33 80.58
    bloaty_fuzz_target 98.00 98.00 91.00 94.00 95.00 nan
    curl_curl_fuzzer_http 98.00 97.00 91.00 nan nan nan
    freetype2_ftfuzzer 90.00 90.00 78.00 nan nan 57.00
    harfbuzz_hb-shape-fuzzer 99.00 98.00 95.00 96.00 97.00 nan
    jsoncpp_jsoncpp_fuzzer 98.00 99.00 100.00 98.00 98.00 99.00
    lcms_cms_transform_fuzzer 94.00 92.00 89.00 nan nan 40.00
    libjpeg-turbo_libjpeg_turbo_fuzzer 99.00 99.00 99.00 nan nan 99.00
    libpcap_fuzz_both 87.00 85.00 79.00 1.00 1.00 89.00
    libpng_libpng_read_fuzzer 95.00 95.00 96.00 nan nan 96.00
    libxml2_xml 98.00 99.00 98.00 nan nan 93.00
    libxslt_xpath 97.00 99.00 93.00 nan nan 94.00
    mbedtls_fuzz_dtlsclient 94.00 73.00 71.00 nan nan 69.00
    openh264_decoder_fuzzer nan 99.00 98.00 nan 98.00 96.00
    openssl_x509 99.00 99.00 99.00 99.00 99.00 99.00
    openthread_ot-ip6-send-fuzzer 88.00 88.00 76.00 70.00 nan 71.00
    proj4_proj_crs_to_crs_fuzzer 93.00 91.00 97.00 nan nan 10.00
    re2_fuzzer 99.00 99.00 99.00 nan nan 95.00
    sqlite3_ossfuzz 99.00 95.00 80.00 nan nan 64.00
    stb_stbi_read_fuzzer 95.00 94.00 90.00 nan nan 85.00
    systemd_fuzz-link-parser 98.00 99.00 72.00 91.00 nan nan
    vorbis_decode_fuzzer 98.00 99.00 99.00 98.00 nan 90.00
    woff2_convert_woff2ttf_fuzzer 98.00 nan 97.00 nan nan 90.00
    zlib_zlib_uncompress_fuzzer 95.00 nan 99.00 94.00 nan 95.00
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
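The relative coverage formula and the two summary rows of the table above can be sketched in a few lines of Python. The medians and the maximum are copied from the bloaty_fuzz_target sample statistics in this report, and the libafl column is copied from the table above. Treating the largest "max" value in the bloaty table as `experiment_max_coverage`, and skipping `nan` entries when averaging, are assumptions; the function names are ours.

```python
import math
from statistics import mean, median

nan = float("nan")

# Median coverage per fuzzer on bloaty_fuzz_target (from its statistics table).
bloaty_median = {"libafl": 6425.5, "aflplusplus": 6390.0, "afl": 6173.0,
                 "aflfast": 6106.5, "libfuzzer": 5950.5}
# Assumption: the experiment maximum is the largest "max" in that table.
bloaty_experiment_max = 6492.0


def relative_coverage_percent(coverage, experiment_max):
    """100 * trial_coverage / experiment_max_coverage."""
    return 100.0 * coverage / experiment_max


# Dividing by a positive constant keeps the ordering of trials, so the median
# relative coverage equals the median coverage divided by the maximum.
bloaty_relative = {fuzzer: relative_coverage_percent(cov, bloaty_experiment_max)
                   for fuzzer, cov in bloaty_median.items()}

# The libafl column of the table above, top to bottom (nan = no result).
libafl_column = [98, 98, 90, 99, 98, 94, 99, 87, 95, 98, 97, 94, nan, 99,
                 88, 93, 99, 99, 95, 98, 98, 98, 95]


def summarize(column):
    """Mean and median over the benchmarks that have a value."""
    present = [value for value in column if not math.isnan(value)]
    return mean(present), median(present)


fuzzer_mean, fuzzer_median = summarize(libafl_column)
```

For libafl this gives a mean of about 95.86 and a median of 98, matching the FuzzerMean and FuzzerMedian rows. The bloaty percentages (about 98.98, 98.43, 95.09, 94.06 and 91.66) agree with the bloaty_fuzz_target row once truncated to whole percent, which suggests the table truncates rather than rounds; that is an inference from the numbers, not something the report states.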

bloaty_fuzz_target summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
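The report does not say how the 95% confidence interval is computed. One simple possibility, shown here only as an assumption, is the normal approximation mean ± z · std / sqrt(count). The sketch applies it to the libafl row of this benchmark's sample statistics at the final snapshot; plotting libraries often use a bootstrap instead, and with 20 trials a t-based interval would be slightly wider.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, std, count, confidence=0.95):
    """Normal-approximation interval: mean +/- z * std / sqrt(count)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * std / math.sqrt(count)
    return mean - half_width, mean + half_width


# libafl on bloaty_fuzz_target: mean, std and count from the statistics table.
low, high = mean_confidence_interval(mean=6409.35, std=47.082542, count=20)
```

Under this approximation the interval is roughly 6388.7 to 6430.0 covered edges.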
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 124200 20.0 6409.350000 47.082542 6280.0 6382.75 6425.5 6438.5 6478.0
    aflplusplus 124200 17.0 6351.882353 96.503162 6191.0 6268.00 6390.0 6425.0 6492.0
    afl 124200 3.0 6210.333333 113.694034 6120.0 6146.50 6173.0 6255.5 6338.0
    aflfast 124200 4.0 6123.500000 72.702591 6056.0 6083.00 6106.5 6147.0 6225.0
    libfuzzer 124200 20.0 5954.550000 110.179984 5788.0 5863.75 5950.5 6008.5 6174.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
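To make the A12 measure concrete, here is a minimal sketch in plain Python. The report does not list per-trial values; the ones below were reconstructed from the afl (3 trials) and aflfast (4 trials) rows of the sample statistics above, assuming the quartiles use linear interpolation. The function name `vargha_delaney_a12` is ours, and the report generator's implementation may differ in details such as tie handling.

```python
def vargha_delaney_a12(xs, ys):
    """Probability that a value drawn from xs exceeds one drawn from ys.

    Ties count as one half, so 0.5 means neither sample tends to be larger.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


# Reconstructed from the summary statistics (an assumption, see above).
afl = [6120.0, 6173.0, 6338.0]
aflfast = [6056.0, 6092.0, 6121.0, 6225.0]

a12 = vargha_delaney_a12(afl, aflfast)
```

Here `a12` is 0.75: in 9 of the 12 trial pairs afl reached more coverage than aflfast.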

curl_curl_fuzzer_http summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 125100 20.0 10908.20 23.701210 10864.0 10900.50 10912.0 10926.50 10938.0
    aflplusplus 125100 12.0 10898.25 88.518231 10793.0 10851.50 10869.0 10928.75 11130.0
    libfuzzer 125100 20.0 10112.20 379.030564 9342.0 9985.25 10213.0 10373.50 10561.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

freetype2_ftfuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 125100 11.0 11471.090909 450.600589 10459.0 11360.0 11558.0 11696.00 12013.0
    libafl 125100 20.0 11516.650000 649.724498 10378.0 11027.5 11480.0 11819.75 12718.0
    libfuzzer 125100 20.0 10061.000000 636.180792 8656.0 9639.0 10037.0 10459.00 11083.0
    centipede 125100 16.0 7302.500000 190.245455 6957.0 7178.5 7307.5 7416.50 7686.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

harfbuzz_hb-shape-fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 124200 20.0 11056.20 175.419977 10391.0 11072.25 11101.0 11136.75 11166.0
    aflplusplus 124200 20.0 10931.50 114.853912 10504.0 10910.25 10959.5 10988.25 11041.0
    afl 124200 5.0 10869.20 23.552070 10835.0 10854.00 10884.0 10886.00 10887.0
    aflfast 124200 4.0 10736.50 39.314967 10689.0 10712.25 10741.5 10765.75 10774.0
    libfuzzer 124200 20.0 10642.85 62.224784 10506.0 10607.50 10632.5 10681.75 10759.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

jsoncpp_jsoncpp_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 124200 20.0 524.900000 0.307794 524.0 525.0 525.0 525.00 525.0
    centipede 124200 20.0 520.750000 1.996708 518.0 519.0 521.5 522.25 523.0
    aflplusplus 124200 17.0 519.941176 0.242536 519.0 520.0 520.0 520.00 520.0
    afl 124200 3.0 518.333333 1.527525 517.0 517.5 518.0 519.00 520.0
    aflfast 124200 2.0 517.000000 0.000000 517.0 517.0 517.0 517.00 517.0
    libafl 124200 20.0 517.350000 0.745160 517.0 517.0 517.0 517.00 519.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
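The columns of the sample statistics table can be reproduced from raw trial values. The sketch below does so for afl on this benchmark: with three trials, the min, median and max in the table are the three values themselves (517, 518, 520). It assumes the sample standard deviation (n - 1 in the denominator) and linearly interpolated quartiles, which is consistent with the numbers shown; `describe` and `quantile_linear` are our own helper names, not the report generator's.

```python
import math
from statistics import mean, median, stdev


def quantile_linear(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower])


def describe(values):
    """Summary statistics in the same column order as the table."""
    ordered = sorted(values)
    return {
        "count": float(len(ordered)),
        "mean": mean(ordered),
        # The sample standard deviation is undefined for a single trial.
        "std": stdev(ordered) if len(ordered) > 1 else float("nan"),
        "min": ordered[0],
        "25%": quantile_linear(ordered, 0.25),
        "median": median(ordered),
        "75%": quantile_linear(ordered, 0.75),
        "max": ordered[-1],
    }


afl_trials = [517.0, 518.0, 520.0]
stats = describe(afl_trials)
```

The result matches the afl row (mean 518.33, std 1.5275, quartiles 517.5 and 519.0). The single-trial case also explains why a row with count 1.0 shows NaN in the std column.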

lcms_cms_transform_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 125100 20.0 2115.350 61.015335 2031.0 2068.00 2117.0 2153.25 2228.0
    aflplusplus 125100 5.0 1995.800 307.873675 1468.0 2012.00 2090.0 2158.00 2251.0
    libfuzzer 125100 20.0 2000.900 91.810961 1810.0 1943.50 2019.0 2075.75 2131.0
    centipede 125100 16.0 1082.125 284.363588 782.0 839.25 905.0 1377.00 1444.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libjpeg-turbo_libjpeg_turbo_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 125100 11.0 2549.272727 2.686667 2545.0 2547.00 2551.0 2551.5 2552.0
    libfuzzer 125100 20.0 2549.550000 1.932411 2546.0 2549.75 2550.0 2550.0 2553.0
    centipede 125100 16.0 2547.187500 1.470544 2546.0 2546.00 2546.5 2549.0 2550.0
    libafl 125100 20.0 2545.500000 1.933091 2543.0 2544.75 2545.0 2546.0 2550.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpcap_fuzz_both summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    centipede 89100 20.0 2798.950000 645.005057 101.0 2851.75 2911.5 3006.75 3218.0
    libafl 89100 4.0 2807.250000 121.771302 2631.0 2784.00 2844.0 2867.25 2910.0
    aflplusplus 89100 1.0 2769.000000 NaN 2769.0 2769.00 2769.0 2769.00 2769.0
    libfuzzer 89100 15.0 2582.400000 87.076157 2459.0 2526.50 2598.0 2616.00 2760.0
    aflfast 89100 4.0 38.250000 5.500000 33.0 33.75 38.5 43.00 43.0
    afl 89100 17.0 36.941176 4.022912 33.0 34.00 34.0 42.00 43.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpng_libpng_read_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 126000 15.0 2022.933333 19.110456 2017.0 2018.00 2018.0 2018.00 2092.0
    centipede 126000 8.0 2015.750000 1.035098 2014.0 2015.00 2016.0 2016.25 2017.0
    aflplusplus 126000 20.0 2004.500000 2.704772 1998.0 2003.00 2005.0 2006.00 2010.0
    libafl 126000 20.0 2000.700000 20.246117 1974.0 1997.25 2000.5 2001.25 2079.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxml2_xml summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 126000 3.0 15772.666667 90.168361 15670.0 15739.50 15809.0 15824.00 15839.0
    libafl 126000 20.0 15630.300000 124.532008 15135.0 15633.00 15664.5 15682.00 15744.0
    libfuzzer 126000 20.0 15551.250000 74.194534 15448.0 15492.50 15554.0 15607.50 15688.0
    centipede 126000 18.0 14777.888889 102.809870 14568.0 14718.25 14767.5 14829.75 15046.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxslt_xpath summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 125100 12.0 11249.083333 86.757193 11086.0 11185.50 11280.0 11308.0 11361.0
    libafl 125100 20.0 11020.750000 69.967568 10844.0 11003.25 11031.5 11081.0 11102.0
    centipede 125100 16.0 10743.125000 101.075467 10563.0 10657.25 10742.5 10794.5 10948.0
    libfuzzer 125100 20.0 10639.000000 123.678275 10354.0 10591.75 10651.0 10726.0 10833.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

mbedtls_fuzz_dtlsclient summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 125100 20.0 3429.500000 367.875050 2745.0 3133.00 3588.5 3733.5 3811.0
    aflplusplus 125100 9.0 2789.888889 34.034705 2746.0 2759.00 2789.0 2822.0 2832.0
    libfuzzer 125100 20.0 2712.450000 26.083822 2655.0 2700.25 2711.0 2722.0 2788.0
    centipede 125100 20.0 2662.400000 25.948126 2624.0 2639.00 2661.0 2677.0 2729.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openh264_decoder_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 128700 20.0 9544.500000 39.896577 9441.0 9541.00 9554.0 9567.75 9592.0
    afl 128700 20.0 9544.000000 37.337719 9460.0 9527.75 9538.5 9558.00 9617.0
    libfuzzer 128700 14.0 9522.285714 42.376388 9401.0 9513.50 9530.0 9548.75 9569.0
    centipede 128700 9.0 9185.666667 277.917254 8469.0 9235.00 9264.0 9295.00 9389.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openssl_x509 summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 126900 9.0 5831.444444 6.043821 5821.0 5831.00 5833.0 5834.00 5841.0
    libfuzzer 126900 20.0 5832.550000 3.300319 5820.0 5832.00 5833.0 5833.50 5836.0
    libafl 126900 20.0 5828.450000 6.108450 5821.0 5823.75 5830.0 5830.00 5844.0
    centipede 126900 15.0 5827.733333 5.775152 5814.0 5824.50 5828.0 5831.50 5838.0
    afl 126900 2.0 5825.500000 0.707107 5825.0 5825.25 5825.5 5825.75 5826.0
    aflfast 126900 2.0 5819.500000 3.535534 5817.0 5818.25 5819.5 5820.75 5822.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
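A small worked example shows what the Mann-Whitney U test can and cannot conclude from very few trials. On this benchmark afl and aflfast have two trials each, so the min and max columns give their values directly. The sketch computes the U statistic and an exact two-sided p value by enumerating every way of splitting the pooled trials. This is an illustrative implementation with our own function names; a library routine such as `scipy.stats.mannwhitneyu` would normally be used, and the report may rely on a different variant of the test.

```python
from itertools import combinations


def u_statistic(xs, ys):
    """Number of (x, y) pairs with x > y, counting ties as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in xs for y in ys)


def exact_mann_whitney_p(xs, ys):
    """Two-sided exact p value over all relabellings of the pooled sample."""
    pooled = list(xs) + list(ys)
    n1, n2 = len(xs), len(ys)
    centre = n1 * n2 / 2.0                      # expected U under no difference
    observed = abs(u_statistic(xs, ys) - centre)
    as_extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), n1):
        chosen_set = set(chosen)
        first = [pooled[i] for i in chosen]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        if abs(u_statistic(first, second) - centre) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Trial values taken from the min and max columns (two trials each).
afl = [5825.0, 5826.0]
aflfast = [5817.0, 5822.0]

u = u_statistic(afl, aflfast)
p = exact_mann_whitney_p(afl, aflfast)
```

Every afl trial beats every aflfast trial (U = 4, the maximum), yet the exact two-sided p value is 1/3. With two trials per fuzzer no smaller p value is possible, so such a pair cannot reach a conventional 0.05 threshold however far apart the coverages are. The report does not state which threshold it uses for the green cells.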

openthread_ot-ip6-send-fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 125100 11.0 3410.454545 274.282833 3048.0 3081.50 3587.0 3598.50 3682.0
    libafl 125100 20.0 3583.600000 352.506275 3050.0 3347.25 3574.0 3882.00 4037.0
    libfuzzer 125100 19.0 3088.684211 16.227729 3046.0 3079.50 3091.0 3098.50 3118.0
    centipede 125100 20.0 2894.850000 92.664803 2745.0 2800.00 2906.0 2946.25 3086.0
    aflfast 125100 3.0 2858.333333 49.943301 2829.0 2829.50 2830.0 2873.00 2916.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

proj4_proj_crs_to_crs_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 126000 20.0 7816.600000 93.182560 7632.0 7782.00 7829.0 7855.25 7969.0
    libafl 126000 19.0 7488.263158 156.280674 7175.0 7420.50 7514.0 7581.00 7728.0
    aflplusplus 126000 8.0 7342.375000 221.857057 6993.0 7234.75 7296.5 7540.25 7631.0
    centipede 126000 15.0 822.000000 4.070802 816.0 820.00 821.0 823.50 832.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

re2_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 125100 20.0 2885.60 1.759186 2882.0 2885.00 2885.5 2887.00 2889.0
    aflplusplus 125100 10.0 2878.40 4.005552 2871.0 2877.25 2878.5 2881.50 2883.0
    libafl 125100 20.0 2861.95 7.708335 2847.0 2860.75 2863.0 2864.25 2881.0
    centipede 125100 20.0 2777.85 20.959296 2744.0 2766.25 2773.0 2790.00 2831.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

sqlite3_ossfuzz summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 126900 20.0 20579.200000 947.179614 16691.0 20671.75 20852.5 20937.25 21006.0
    aflplusplus 126900 3.0 20210.666667 186.551155 20062.0 20106.00 20150.0 20285.00 20420.0
    libfuzzer 126900 20.0 16966.250000 443.628208 16352.0 16582.50 16843.5 17359.50 17799.0
    centipede 126900 17.0 13657.529412 552.505443 12881.0 13326.00 13561.0 13842.00 15214.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

stb_stbi_read_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 126000 20.0 2192.500000 46.641409 2103.0 2188.0 2192.5 2203.25 2268.0
    aflplusplus 126000 2.0 2163.000000 73.539105 2111.0 2137.0 2163.0 2189.00 2215.0
    libfuzzer 126000 20.0 2073.050000 49.656160 2008.0 2025.0 2076.5 2105.50 2161.0
    centipede 126000 19.0 1962.473684 9.094372 1948.0 1957.5 1960.0 1966.00 1989.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

vorbis_decode_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 125100 20.0 1269.35 1.348488 1267.0 1268.75 1270.0 1270.00 1271.0
    aflplusplus 125100 10.0 1265.80 4.131182 1261.0 1263.25 1265.0 1266.75 1275.0
    libafl 125100 20.0 1252.85 3.407036 1248.0 1250.00 1252.5 1255.25 1260.0
    aflfast 125100 6.0 1250.00 6.603030 1238.0 1248.50 1251.5 1254.50 1256.0
    centipede 125100 20.0 1156.40 13.359010 1134.0 1145.00 1156.5 1164.00 1186.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

woff2_convert_woff2ttf_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl 126900 20.0 1184.850000 11.389168 1165.0 1178.50 1184.5 1194.0 1203.0
    libfuzzer 126900 20.0 1162.300000 44.251792 1066.0 1135.25 1175.5 1201.5 1208.0
    centipede 126900 18.0 1088.555556 12.486332 1067.0 1075.75 1093.0 1098.0 1105.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

zlib_zlib_uncompress_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libfuzzer 127800 20.0 468.6 4.638512 461.0 463.0 471.0 472.0 473.0
    centipede 127800 7.0 451.0 2.081666 447.0 451.0 451.0 451.5 454.0
    libafl 127800 15.0 450.6 6.231258 442.0 447.0 450.0 452.0 461.0
    aflfast 127800 20.0 447.7 13.913643 390.0 448.0 449.0 454.0 456.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.
