FuzzBench: 2023-09-04-libafl-2 report

experiment summary

We show two different aggregate (cross-benchmark) rankings of fuzzers. The first is based on the average of per-benchmark scores, where the score represents the percentage of the highest reached median code coverage on a given benchmark (higher value is better). The second ranking shows the average rank of fuzzers, after we rank them on each benchmark according to their median reached code coverage (lower value is better).
By avg. score
fuzzer                       average normalized score
libafl_fuzzbench_ngram4      85.90
aflplusplus                  81.27
libafl_fuzzbench_naive       80.27
libafl_fuzzbench_ngram8      79.45
libafl_fuzzbench_naive_ctx   79.19
libfuzzer                    17.37
afl                          12.11
honggfuzz                    11.76
centipede                    0.32
aflfast                      0.00
aflsmart                     0.00
eclipser                     0.00
fairfuzz                     0.00
mopt                         0.00
By avg. rank
fuzzer                       average rank
aflplusplus                  2.06
libafl_fuzzbench_ngram4      2.76
libafl_fuzzbench_ngram8      3.00
libafl_fuzzbench_naive       3.18
libafl_fuzzbench_naive_ctx   3.94
honggfuzz                    5.24
libfuzzer                    5.24
afl                          5.53
centipede                    5.65
aflfast                      5.71
aflsmart                     5.71
eclipser                     5.71
fairfuzz                     5.71
mopt                         5.71
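The two aggregate rankings above can be reproduced from a table of per-benchmark median coverage. The following is a minimal sketch, not the report generator's code: the benchmark names, fuzzer names, coverage values, and helper names are all hypothetical, and it assumes every fuzzer ran on every benchmark (how missing results are scored is not stated in this report).

```python
from statistics import mean

# Hypothetical median coverage per benchmark and fuzzer (not taken from this report).
medians = {
    "bench_a": {"fuzzer_x": 400.0, "fuzzer_y": 380.0, "fuzzer_z": 200.0},
    "bench_b": {"fuzzer_x": 900.0, "fuzzer_y": 1000.0, "fuzzer_z": 500.0},
}


def average_normalized_score(medians):
    """Mean over benchmarks of 100 * median / best median on that benchmark."""
    scores = {}
    for per_fuzzer in medians.values():
        best = max(per_fuzzer.values())
        for fuzzer, value in per_fuzzer.items():
            score = 100.0 * value / best if best else 0.0
            scores.setdefault(fuzzer, []).append(score)
    return {fuzzer: mean(values) for fuzzer, values in scores.items()}


def average_rank(medians):
    """Mean over benchmarks of the per-benchmark rank (1 is best).

    Tied fuzzers share the average of the ranks they occupy.
    """
    ranks = {}
    for per_fuzzer in medians.values():
        ordered = sorted(per_fuzzer.values(), reverse=True)
        for fuzzer, value in per_fuzzer.items():
            first = ordered.index(value) + 1   # best rank held by this value
            count = ordered.count(value)       # number of fuzzers tied on it
            ranks.setdefault(fuzzer, []).append(first + (count - 1) / 2)
    return {fuzzer: mean(values) for fuzzer, values in ranks.items()}


scores = average_normalized_score(medians)
ranks = average_rank(medians)
```

With these inputs fuzzer_x and fuzzer_y tie on average rank (1.5 each) while fuzzer_y leads on average score (97.5 versus 95.0), which illustrates how the two rankings can disagree.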
  • Critical difference diagram
    The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
    Note: If a fuzzer does not support all benchmarks, its ranking in this diagram can be lower than it should be, so check the list of supported benchmarks for the fuzzers you are interested in. The list may be specified in the fuzzer's README.md.
  • Median relative code-coverages on each benchmark

    Note: The relative coverage summary table shows the median relative performance of each fuzzer to the experiment maximum. Thus the highest relative performance may not be 100%.
    trial_relative_coverage = trial_coverage / experiment_max_coverage

      libafl_fuzzbench_ngram4 libafl_fuzzbench_naive aflplusplus libafl_fuzzbench_ngram8 libafl_fuzzbench_naive_ctx libfuzzer afl honggfuzz centipede aflfast aflsmart eclipser fairfuzz mopt
    FuzzerMedian 98.00 98.50 99.00 98.50 97.00 95.00 47.00 48.00 5.00 0.00 0.00 0.00 0.00 0.00
    FuzzerMean 93.80 93.71 93.57 93.00 92.50 80.67 50.67 48.67 5.00 0.00 0.00 0.00 0.00 0.00
    arduinojson_json_fuzzer 99.00 99.00 100.00 100.00 99.00 nan nan nan nan nan nan nan nan nan
    assimp_assimp_fuzzer 68.00 67.00 nan 61.00 65.00 nan nan nan nan nan nan nan nan nan
    astc-encoder_fuzz_astc_physical_to_symbolic 95.00 95.00 nan 95.00 95.00 nan nan nan nan nan nan nan nan nan
    botan_tls_server nan nan 48.00 nan nan 48.00 47.00 48.00 nan nan nan nan nan nan
    brotli_decode_fuzzer 99.00 99.00 nan 99.00 nan nan nan nan nan nan nan nan nan nan
    double-conversion_string_to_double_fuzzer 97.00 97.00 99.00 98.00 96.00 nan nan nan nan nan nan nan nan nan
    draco_draco_pc_decoder_fuzzer 70.00 75.00 92.00 72.00 70.00 nan nan nan nan nan nan nan nan nan
    dropbear_fuzzer-postauth_nomaths nan nan 96.00 nan nan 95.00 78.00 98.00 nan nan nan nan nan nan
    firestore_firestore_serializer_fuzzer 100.00 100.00 100.00 100.00 100.00 nan nan nan nan nan nan nan nan nan
    fmt_chrono-duration-fuzzer 98.00 98.00 99.00 94.00 97.00 nan nan nan nan nan nan nan nan nan
    guetzli_guetzli_fuzzer 97.00 99.00 99.00 96.00 97.00 nan nan nan nan nan nan nan nan nan
    icu_unicode_string_codepage_create_fuzzer 99.00 99.00 99.00 99.00 99.00 nan nan nan nan nan nan nan nan nan
    jansson_json_load_dump_fuzzer 99.00 99.00 99.00 99.00 99.00 nan nan nan nan nan nan nan nan nan
    libaom_av1_dec_fuzzer 96.00 96.00 97.00 nan 91.00 nan nan nan nan nan nan nan nan nan
    libcoap_pdu_parse_fuzzer 91.00 90.00 99.00 91.00 89.00 nan nan nan nan nan nan nan nan nan
    libhevc_hevc_dec_fuzzer 99.00 99.00 99.00 99.00 99.00 nan nan nan 5.00 0.00 0.00 0.00 0.00 0.00
    librdkafka_fuzz_regex 100.00 nan 84.00 99.00 99.00 99.00 27.00 0.00 nan nan nan nan nan nan
    • Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
    • Green background = highest relative median coverage.
    • Blue gradient background = greater than 95% relative median coverage.
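The relative coverage formula given in the note above can be illustrated with a short sketch. The function name and the trial values are hypothetical, and the sketch covers a single benchmark only; it is meant to show why the best fuzzer's median can fall below 100%, not to reproduce the report generator.

```python
from statistics import median


def median_relative_coverage(trials):
    """Median of trial_coverage / experiment_max_coverage, in percent.

    trials maps each fuzzer to the final coverage of each of its trials
    on one benchmark.
    """
    experiment_max = max(max(values) for values in trials.values())
    if experiment_max == 0:
        return {fuzzer: 0.0 for fuzzer in trials}
    return {
        fuzzer: 100.0 * median(v / experiment_max for v in values)
        for fuzzer, values in trials.items()
    }


# Hypothetical trial results (not taken from this report).
trials = {"fuzzer_x": [398, 400, 402], "fuzzer_y": [380, 390, 401]}
relative = median_relative_coverage(trials)
```

Here the experiment maximum (402) comes from a single fuzzer_x trial, so even fuzzer_x has a median relative coverage just under 100%.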

arduinojson_json_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 401.60 0.502625 401.0 401.00 402.0 402.00 402.0
    libafl_fuzzbench_ngram8 82800 20.0 401.75 0.444262 401.0 401.75 402.0 402.00 402.0
    libafl_fuzzbench_ngram4 82800 20.0 400.75 0.910465 399.0 400.00 401.0 401.25 402.0
    libafl_fuzzbench_naive_ctx 82800 20.0 400.05 1.145931 398.0 399.00 400.0 401.00 402.0
    libafl_fuzzbench_naive 82800 20.0 398.45 1.099043 397.0 398.00 398.0 399.00 401.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
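The Vargha-Delaney A12 value described above can be computed directly from two samples of final coverage values. This is a minimal sketch with a hypothetical function name and made-up sample values; it uses the textbook definition, in which ties count as half a win.

```python
def a12(xs, ys):
    """Probability that a value drawn from xs exceeds one drawn from ys.

    Ties contribute one half, so 0.5 means neither sample tends to be larger.
    """
    greater = sum(1 for x in xs for y in ys if x > y)
    ties = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * ties) / (len(xs) * len(ys))


# Hypothetical coverage samples for the row fuzzer and the column fuzzer.
row_fuzzer = [3, 4, 5]
column_fuzzer = [1, 2, 3]
effect = a12(row_fuzzer, column_fuzzer)
```

Values close to 1 mean the row fuzzer almost always reaches more coverage than the column fuzzer, and values close to 0 mean the opposite.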

assimp_assimp_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
Error: the following fuzzers do not have enough samples: libafl_fuzzbench_naive, libafl_fuzzbench_ngram4.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_fuzzbench_ngram4 9900 14.0 2671.285714 231.057245 2316.0 2476.50 2674.0 2729.75 3125.0
    libafl_fuzzbench_naive 9900 14.0 2587.071429 202.994681 2244.0 2420.25 2627.5 2756.50 2879.0
    libafl_fuzzbench_naive_ctx 9900 16.0 2529.125000 239.643033 2052.0 2369.25 2548.0 2653.00 3110.0
    libafl_fuzzbench_ngram8 9900 18.0 2426.833333 222.147076 2067.0 2307.75 2395.5 2536.75 2916.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
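The Mann-Whitney U test described above compares two samples through their ranks. The sketch below is a standard-library illustration that uses the normal approximation with a tie correction and no continuity correction; the function name and inputs are hypothetical, and the report's own tooling may rely on a statistics library whose exact p values differ for small samples.

```python
import math
from collections import Counter


def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, two-sided p value)."""
    n1, n2 = len(xs), len(ys)
    pooled = sorted(list(xs) + list(ys))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum = sum(rank_of[x] for x in xs)
    u1 = rank_sum - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p_value


u, p = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

When both fuzzers reach exactly the same coverage in every trial, the variance is zero and the sketch returns a p value of 1.0.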

astc-encoder_fuzz_astc_physical_to_symbolic summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_fuzzbench_ngram8 82800 20.0 492.5 2.685242 490.0 491.0 492.0 493.00 501.0
    libafl_fuzzbench_naive 82800 20.0 488.8 0.695852 488.0 488.0 489.0 489.00 490.0
    libafl_fuzzbench_naive_ctx 82800 20.0 489.0 0.973329 488.0 488.0 489.0 489.00 492.0
    libafl_fuzzbench_ngram4 82800 20.0 489.1 0.788069 488.0 489.0 489.0 489.25 491.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
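The pairwise unique coverage table described above amounts to a set difference for every pair of fuzzers. A minimal sketch, assuming each fuzzer's covered branches are available as a set of identifiers (the function name, fuzzer names, and branch ids are hypothetical):

```python
def pairwise_unique(covered):
    """Build {row: {column: count}} from {fuzzer: set of branch ids}.

    Each count is the number of branches covered by the column fuzzer
    but not by the row fuzzer.
    """
    return {
        row: {col: len(covered[col] - covered[row]) for col in covered}
        for row in covered
    }


# Hypothetical branch sets (not taken from this report).
covered = {"fuzzer_x": {1, 2, 3}, "fuzzer_y": {2, 3, 4, 5}}
table = pairwise_unique(covered)
```

The diagonal is always zero, and the table is generally not symmetric: here fuzzer_y covers two branches that fuzzer_x misses, while fuzzer_x covers one that fuzzer_y misses.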

botan_tls_server summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
Error: the following fuzzers do not have enough samples: afl, honggfuzz, libfuzzer.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    honggfuzz 3600 10.0 1365.1 188.481328 1171.0 1242.75 1252.5 1549.00 1671.0
    aflplusplus 3600 20.0 1333.2 203.259906 1179.0 1224.00 1252.0 1288.75 1792.0
    libfuzzer 3600 10.0 1217.0 34.062035 1176.0 1178.25 1241.5 1242.00 1248.0
    afl 3600 10.0 1239.9 6.172520 1235.0 1236.25 1237.5 1241.25 1255.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
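The error bands mentioned above show a 95% confidence interval around the mean code coverage. How the plotting code derives that interval is not stated in this report (it may, for example, use bootstrapping); the sketch below uses a normal approximation as one common way to obtain such an interval. The function name and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval around the sample mean."""
    center = mean(samples)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return center - half_width, center + half_width


# Hypothetical coverage of five trials at one point in time.
low, high = mean_confidence_interval([10, 12, 14, 16, 18])
```

The band narrows as the number of trials grows, so fuzzers with fewer completed trials will tend to show wider bands.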

brotli_decode_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_fuzzbench_naive 82800 20.0 900.85 3.183427 893.0 900.00 902.0 903.00 904.0
    libafl_fuzzbench_ngram4 82800 20.0 900.45 3.720144 893.0 900.00 902.0 902.00 904.0
    libafl_fuzzbench_ngram8 82800 20.0 899.90 2.511028 895.0 898.75 900.5 901.25 904.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
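The columns of the sample statistics table above (count, mean, std, min, quartiles, max) can be computed per fuzzer from the final coverage of its trials. This is a standard-library sketch with a hypothetical function name and made-up values; it assumes the sample standard deviation and linearly interpolated quartiles, which is what common data-analysis libraries report by default.

```python
from statistics import mean, quantiles, stdev


def describe(samples):
    """Summary statistics for the final coverage of one fuzzer's trials."""
    # method="inclusive" interpolates linearly between the observed values.
    q1, q2, q3 = quantiles(samples, n=4, method="inclusive")
    return {
        "count": len(samples),
        "mean": mean(samples),
        "std": stdev(samples),  # sample standard deviation (n - 1 divisor)
        "min": min(samples),
        "25%": q1,
        "median": q2,
        "75%": q3,
        "max": max(samples),
    }


# Hypothetical coverage values from five trials.
stats = describe([1, 2, 3, 4, 5])
```

A std of exactly 0, as seen in some rows of this report, means every trial of that fuzzer ended with the same coverage.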

double-conversion_string_to_double_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 507.75 1.831738 504.0 506.00 508.0 509.0 510.0
    libafl_fuzzbench_ngram8 82800 20.0 501.50 1.849609 495.0 501.00 502.0 502.0 504.0
    libafl_fuzzbench_naive 82800 20.0 497.30 2.473012 490.0 496.75 498.0 499.0 500.0
    libafl_fuzzbench_ngram4 82800 20.0 497.35 2.277464 493.0 496.00 497.5 498.0 503.0
    libafl_fuzzbench_naive_ctx 82800 20.0 494.00 2.615742 488.0 493.00 494.0 496.0 497.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

draco_draco_pc_decoder_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
Error: the following fuzzers do not have enough samples: aflplusplus.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 15.0 1850.066667 96.068775 1698.0 1779.00 1839.0 1931.50 1996.0
    libafl_fuzzbench_naive 82800 20.0 1476.400000 127.145751 1228.0 1377.50 1514.0 1540.75 1672.0
    libafl_fuzzbench_ngram8 82800 20.0 1402.550000 138.243292 972.0 1353.25 1440.5 1478.75 1629.0
    libafl_fuzzbench_ngram4 82800 20.0 1401.000000 159.328195 1047.0 1330.50 1413.0 1520.25 1594.0
    libafl_fuzzbench_naive_ctx 82800 20.0 1406.450000 108.526725 1211.0 1331.00 1402.5 1472.75 1589.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row
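The ranking by unique code branches described above needs two numbers per fuzzer: the total branches it covered and how many of those no other fuzzer covered. A minimal sketch, assuming each fuzzer's coverage is available as a set of branch identifiers (the function name, fuzzer names, and branch ids are hypothetical):

```python
def unique_branches(covered):
    """Map each fuzzer to (total branches, branches no other fuzzer covered)."""
    result = {}
    for fuzzer, branches in covered.items():
        # Union of everything covered by the other fuzzers.
        others = set().union(
            *(b for name, b in covered.items() if name != fuzzer)
        )
        result[fuzzer] = (len(branches), len(branches - others))
    return result


# Hypothetical branch sets (not taken from this report).
covered = {
    "fuzzer_x": {1, 2, 3},
    "fuzzer_y": {2, 3, 4, 5},
    "fuzzer_z": {3, 6},
}
summary = unique_branches(covered)
```

In this example fuzzer_y covers the most branches (4) and also has the most unique ones (2), while branch 3 is unique to nobody because all three fuzzers reach it.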

dropbear_fuzzer-postauth_nomaths summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    honggfuzz 3600 10.0 883.6 9.570789 860.0 881.25 885.5 889.0 895.0
    aflplusplus 3600 10.0 863.0 2.538591 858.0 861.00 864.5 865.0 865.0
    libfuzzer 3600 10.0 857.1 5.877452 841.0 857.00 858.5 860.0 861.0
    afl 3600 10.0 701.0 0.000000 701.0 701.00 701.0 701.0 701.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

firestore_firestore_serializer_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 287.0 0.000000 287.0 287.0 287.0 287.0 287.0
    libafl_fuzzbench_naive 82800 20.0 286.3 0.978721 285.0 285.0 287.0 287.0 287.0
    libafl_fuzzbench_naive_ctx 82800 20.0 287.0 0.000000 287.0 287.0 287.0 287.0 287.0
    libafl_fuzzbench_ngram4 82800 20.0 286.8 0.615587 285.0 287.0 287.0 287.0 287.0
    libafl_fuzzbench_ngram8 82800 20.0 286.9 0.447214 285.0 287.0 287.0 287.0 287.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

fmt_chrono-duration-fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 1089.80 3.607011 1081.0 1086.75 1091.0 1093.00 1095.0
    libafl_fuzzbench_naive 82800 20.0 1081.75 4.689013 1074.0 1078.00 1082.0 1085.00 1091.0
    libafl_fuzzbench_ngram4 82800 20.0 1075.60 4.018379 1067.0 1073.00 1075.5 1079.25 1081.0
    libafl_fuzzbench_naive_ctx 82800 20.0 1069.20 5.356747 1059.0 1065.00 1068.5 1073.00 1079.0
    libafl_fuzzbench_ngram8 82800 20.0 1036.95 13.236612 1013.0 1027.00 1036.0 1048.25 1057.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

guetzli_guetzli_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 1498.30 5.391026 1484.0 1495.00 1498.5 1502.25 1506.0
    libafl_fuzzbench_naive 82800 20.0 1486.50 8.312958 1474.0 1476.50 1492.0 1493.00 1493.0
    libafl_fuzzbench_ngram4 82800 20.0 1474.65 8.060789 1464.0 1467.75 1473.0 1479.75 1491.0
    libafl_fuzzbench_naive_ctx 82800 20.0 1475.85 7.975456 1466.0 1471.00 1472.0 1481.75 1490.0
    libafl_fuzzbench_ngram8 82800 20.0 1461.45 7.556628 1449.0 1457.75 1459.0 1464.25 1478.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

icu_unicode_string_codepage_create_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 1339.85 0.745160 1339.0 1339.0 1340.0 1340.0 1341.0
    libafl_fuzzbench_naive 82800 20.0 1338.45 1.538112 1333.0 1338.0 1339.0 1339.0 1340.0
    libafl_fuzzbench_ngram8 82800 20.0 1326.80 17.715886 1300.0 1301.0 1338.0 1339.0 1340.0
    libafl_fuzzbench_naive_ctx 82800 20.0 1324.55 18.259749 1296.0 1298.0 1336.0 1337.0 1338.0
    libafl_fuzzbench_ngram4 82800 20.0 1333.70 8.596817 1299.0 1335.0 1336.0 1337.0 1339.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

jansson_json_load_dump_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_fuzzbench_ngram8 82800 20.0 772.45 0.887041 771.0 772.00 773.0 773.0 774.0
    libafl_fuzzbench_ngram4 82800 20.0 768.80 1.641565 764.0 768.00 769.0 770.0 771.0
    aflplusplus 82800 20.0 767.80 1.472556 765.0 767.00 768.0 769.0 770.0
    libafl_fuzzbench_naive_ctx 82800 20.0 768.10 2.403944 761.0 767.00 768.0 769.5 771.0
    libafl_fuzzbench_naive 82800 20.0 766.55 1.637553 763.0 765.75 767.0 768.0 770.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

libaom_av1_dec_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 11048.45 295.341457 10282.0 10997.75 11104.5 11242.25 11352.0
    libafl_fuzzbench_naive 82800 20.0 10927.90 116.383169 10664.0 10855.75 10968.5 11002.50 11066.0
    libafl_fuzzbench_ngram4 82800 20.0 10880.30 144.171353 10500.0 10777.25 10927.0 10966.25 11094.0
    libafl_fuzzbench_naive_ctx 82800 20.0 10503.75 148.475366 10329.0 10383.25 10441.0 10619.75 10836.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

libcoap_pdu_parse_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 817.35 1.268028 815.0 817.00 817.0 818.00 821.0
    libafl_fuzzbench_ngram8 82800 20.0 757.90 0.447214 756.0 758.00 758.0 758.00 758.0
    libafl_fuzzbench_ngram4 82800 20.0 753.65 1.843195 750.0 752.75 754.0 755.00 756.0
    libafl_fuzzbench_naive 82800 20.0 745.45 1.050063 743.0 745.00 745.5 746.00 747.0
    libafl_fuzzbench_naive_ctx 82800 20.0 740.75 2.531382 735.0 739.75 741.0 742.25 745.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

libhevc_hevc_dec_fuzzer summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus 82800 20.0 10332.70 47.309841 10219.0 10331.50 10355.0 10360.00 10375.0
    libafl_fuzzbench_ngram4 82800 20.0 10335.65 10.011967 10322.0 10324.50 10337.0 10343.25 10349.0
    libafl_fuzzbench_ngram8 82800 20.0 10326.05 12.032740 10300.0 10319.00 10330.0 10333.25 10344.0
    libafl_fuzzbench_naive 82800 20.0 10320.75 7.806037 10303.0 10315.75 10321.5 10329.00 10329.0
    libafl_fuzzbench_naive_ctx 82800 20.0 10313.35 9.455018 10285.0 10308.00 10313.0 10319.50 10331.0
    centipede 82800 20.0 1260.30 1723.585892 555.0 555.00 557.0 557.00 5651.0
    aflfast 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    aflsmart 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    eclipser 82800 17.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    fairfuzz 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0
    mopt 82800 20.0 0.00 0.000000 0.0 0.00 0.0 0.00 0.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

librdkafka_fuzz_regex summary

Ranking by median reached code coverage
Reached code coverage distribution
Mean code coverage growth over time
* The error bands show the 95% confidence interval around the mean code coverage.
Error: the following fuzzers do not have enough samples: afl, aflplusplus, honggfuzz, libfuzzer, libafl_fuzzbench_naive_ctx.
  • Sample statistics and statistical significance (code coverage)
    Code coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    libafl_fuzzbench_ngram4 3600 16.0 383.375 0.957427 382.0 382.00 384.0 384.00 384.0
    libafl_fuzzbench_ngram8 3600 20.0 381.850 1.182103 380.0 381.00 382.0 382.25 384.0
    libfuzzer 3600 10.0 379.800 4.661902 367.0 381.00 381.5 382.00 382.0
    libafl_fuzzbench_naive_ctx 3600 2.0 380.500 0.707107 380.0 380.25 380.5 380.75 381.0
    aflplusplus 3600 10.0 288.400 110.639756 0.0 266.25 325.0 350.00 371.0
    afl 3600 10.0 169.600 184.713712 0.0 0.00 107.5 367.75 378.0
    honggfuzz 3600 10.0 83.400 143.665661 0.0 0.00 0.0 114.75 344.0

    Vargha-Delaney A12 measure
    The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
  • Unique code coverage plots
    Ranking by unique code branches covered
    Each bar shows the total number of code branches found by a given fuzzer. The colored area shows the number of unique code branches (i.e., branches that were not covered by any other fuzzers).
    Pairwise unique code coverage
    Each cell represents the number of code branches covered by the fuzzer of the column but not by the fuzzer of the row

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.

Experiment Description: None