FuzzBench: 2024-05-22-bases report (running)

experiment summary

We show two different aggregate (cross-benchmark) rankings of fuzzers. The first is based on the average of per-benchmark scores, where a score is the fuzzer's median reached code coverage expressed as a percentage of the highest median reached on that benchmark (higher is better). The second shows the average rank of fuzzers, after ranking them on each benchmark by their median reached code coverage (lower is better).
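As a rough illustration of the two aggregations, the sketch below recomputes them for just two benchmarks, using the median values from the bloaty_fuzz_target and jsoncpp_jsoncpp_fuzzer sample statistics tables in this report. The function names are hypothetical. Averaging only over benchmarks where a fuzzer has data, and giving tied fuzzers the mean of their positions, are assumptions of this sketch rather than confirmed FuzzBench behavior.

```python
from statistics import fmean

# Median reached coverage per fuzzer, copied from two sample statistics tables.
medians_by_benchmark = {
    "bloaty_fuzz_target": {
        "libafl": 6425.5, "aflplusplus": 6390.0, "afl": 6173.0,
        "aflfast": 6106.5, "libfuzzer": 5950.5,
    },
    "jsoncpp_jsoncpp_fuzzer": {
        "libfuzzer": 525.0, "centipede": 521.5, "aflplusplus": 520.0,
        "afl": 518.0, "aflfast": 517.0, "libafl": 517.0,
    },
}

def normalized_scores(medians):
    """Score = percentage of the best median on this benchmark."""
    best = max(medians.values())
    return {fuzzer: 100.0 * value / best for fuzzer, value in medians.items()}

def ranks(medians):
    """Rank 1 = highest median; tied fuzzers share the mean of their positions."""
    ordered = sorted(medians.items(), key=lambda kv: kv[1], reverse=True)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2.0
        for k in range(i, j + 1):
            result[ordered[k][0]] = shared
        i = j + 1
    return result

def average_over_benchmarks(per_benchmark_tables):
    """Average each fuzzer's values over the benchmarks where it has data."""
    collected = {}
    for table in per_benchmark_tables:
        for fuzzer, value in table.items():
            collected.setdefault(fuzzer, []).append(value)
    return {fuzzer: fmean(values) for fuzzer, values in collected.items()}

avg_score = average_over_benchmarks(
    normalized_scores(m) for m in medians_by_benchmark.values())
avg_rank = average_over_benchmarks(
    ranks(m) for m in medians_by_benchmark.values())

for fuzzer in sorted(avg_rank, key=avg_rank.get):
    print(f"{fuzzer}\tscore={avg_score[fuzzer]:.2f}\trank={avg_rank[fuzzer]:.2f}")
```

With only two benchmarks the numbers differ from the full tables in this report; the point is the shape of the computation, not the values.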

By avg. score

fuzzer	average normalized score
libafl	94.75
libfuzzer	93.70
aflplusplus	89.38
centipede	68.95
aflfast	32.86
afl	21.46

By avg. rank

fuzzer	average rank
aflplusplus	2.04
libafl	2.26
libfuzzer	2.61
centipede	3.74
afl	4.70
aflfast	4.70

Critical difference diagram

The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
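For intuition about the size of a critical difference, here is a minimal sketch of the standard Nemenyi formula, CD = q_alpha * sqrt(k(k+1) / (6N)), for k compared fuzzers and N benchmarks. The q_alpha values are the commonly tabulated critical values for alpha = 0.05. Using N = 23 assumes every fuzzer was ranked on all 23 benchmarks, which does not hold for this experiment, so the result is only indicative and the report's own computation may differ.

```python
import math

# Tabulated critical values q_alpha for the Nemenyi test at alpha = 0.05,
# keyed by the number of methods being compared.
Q_ALPHA_005 = {
    2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
    7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164,
}

def nemenyi_critical_difference(num_fuzzers, num_benchmarks):
    """Average-rank gap above which two fuzzers differ significantly."""
    q_alpha = Q_ALPHA_005[num_fuzzers]
    return q_alpha * math.sqrt(
        num_fuzzers * (num_fuzzers + 1) / (6.0 * num_benchmarks))

# 6 fuzzers and 23 benchmarks, as listed in this report (see caveat above).
cd = nemenyi_critical_difference(6, 23)
print(f"critical difference: {cd:.2f}")

# Compare two average ranks from the "By avg. rank" table.
gap = abs(3.74 - 2.04)  # centipede vs. aflplusplus
print("significant" if gap > cd else "not significant")
```

Under these assumptions the critical difference is roughly 1.57 ranks, so the 0.22 gap between aflplusplus and libafl would not count as significant while the 1.70 gap between aflplusplus and centipede would.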

Note: If a fuzzer does not support all benchmarks, its ranking as shown in this diagram can be lower than it should be, so check the list of supported benchmarks for the fuzzers you are interested in. That list may be specified in the fuzzer's README.md.

Median relative code-coverages on each benchmark

Note: The relative coverage summary table shows the median relative performance of each fuzzer to the experiment maximum. Thus the highest relative performance may not be 100%.
trial_relative_coverage = trial_coverage / experiment_max_coverage
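The formula can be sketched in a few lines. The three trial values are the afl trials on bloaty_fuzz_target (they follow from the min, median and max of that row in the sample statistics table). The experiment maximum of 6500 is a hypothetical value chosen for illustration, since the report does not state it.

```python
from statistics import median

def relative_coverages(trial_coverages, experiment_max_coverage):
    """Each trial's coverage as a percentage of the experiment maximum."""
    return [100.0 * c / experiment_max_coverage for c in trial_coverages]

afl_trials = [6120, 6173, 6338]        # afl on bloaty_fuzz_target
experiment_max_coverage = 6500         # hypothetical, not taken from the report

relative = relative_coverages(afl_trials, experiment_max_coverage)
median_relative = median(relative)
print(f"median relative coverage: {median_relative:.2f}%")
```

Because the denominator is the maximum over the whole experiment rather than the best median, even the best fuzzer's median usually stays below 100%.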

	libafl	aflplusplus	libfuzzer	aflfast	afl	centipede
FuzzerMedian	98.00	98.00	95.00	94.00	97.50	90.00
FuzzerMean	95.86	94.62	90.70	82.33	81.33	80.58
bloaty_fuzz_target	98.00	98.00	91.00	94.00	95.00	nan
curl_curl_fuzzer_http	98.00	97.00	91.00	nan	nan	nan
freetype2_ftfuzzer	90.00	90.00	78.00	nan	nan	57.00
harfbuzz_hb-shape-fuzzer	99.00	98.00	95.00	96.00	97.00	nan
jsoncpp_jsoncpp_fuzzer	98.00	99.00	100.00	98.00	98.00	99.00
lcms_cms_transform_fuzzer	94.00	92.00	89.00	nan	nan	40.00
libjpeg-turbo_libjpeg_turbo_fuzzer	99.00	99.00	99.00	nan	nan	99.00
libpcap_fuzz_both	87.00	85.00	79.00	1.00	1.00	89.00
libpng_libpng_read_fuzzer	95.00	95.00	96.00	nan	nan	96.00
libxml2_xml	98.00	99.00	98.00	nan	nan	93.00
libxslt_xpath	97.00	99.00	93.00	nan	nan	94.00
mbedtls_fuzz_dtlsclient	94.00	73.00	71.00	nan	nan	69.00
openh264_decoder_fuzzer	nan	99.00	98.00	nan	98.00	96.00
openssl_x509	99.00	99.00	99.00	99.00	99.00	99.00
openthread_ot-ip6-send-fuzzer	88.00	88.00	76.00	70.00	nan	71.00
proj4_proj_crs_to_crs_fuzzer	93.00	91.00	97.00	nan	nan	10.00
re2_fuzzer	99.00	99.00	99.00	nan	nan	95.00
sqlite3_ossfuzz	99.00	95.00	80.00	nan	nan	64.00
stb_stbi_read_fuzzer	95.00	94.00	90.00	nan	nan	85.00
systemd_fuzz-link-parser	98.00	99.00	72.00	91.00	nan	nan
vorbis_decode_fuzzer	98.00	99.00	99.00	98.00	nan	90.00
woff2_convert_woff2ttf_fuzzer	98.00	nan	97.00	nan	nan	90.00
zlib_zlib_uncompress_fuzzer	95.00	nan	99.00	94.00	nan	95.00

Fuzzers are sorted by "FuzzerMean" (average median relative coverage), highest on the left.
Green background = highest relative median coverage.
Blue gradient background = greater than 95% relative median coverage.
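The FuzzerMean and FuzzerMedian rows can be recomputed from a column of the table. The sketch below does this for the aflfast column, skipping nan entries. That the report skips nan values is inferred from the fact that this reproduces the published 82.33 and 94.00; it is not stated in the report itself.

```python
import math
import statistics

nan = float("nan")

# aflfast column of the relative coverage table, in row order
# (bloaty_fuzz_target first, zlib_zlib_uncompress_fuzzer last).
aflfast_column = [
    94.0, nan, nan, 96.0, 98.0, nan, nan, 1.0, nan, nan, nan, nan,
    nan, 99.0, 70.0, nan, nan, nan, nan, 91.0, 98.0, nan, 94.0,
]

# Keep only benchmarks where the fuzzer produced a value.
supported = [value for value in aflfast_column if not math.isnan(value)]

fuzzer_mean = round(statistics.fmean(supported), 2)
fuzzer_median = statistics.median(supported)

print(len(supported), fuzzer_mean, fuzzer_median)
```

The single 1.00 entry for libpcap_fuzz_both pulls the mean well below the median, which is worth keeping in mind when reading the column order.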

bloaty_fuzz_target summary

Ranking by median reached code coverage

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.
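The report does not say how the band is computed. One common approach is a percentile bootstrap of the mean at each snapshot, sketched here for a single snapshot. The four values are the aflfast trials on this benchmark, reconstructed from that fuzzer's row in the sample statistics table. The function name, resample count and seed are arbitrary choices for this sketch.

```python
import random
import statistics

def bootstrap_mean_ci(samples, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n))
        for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    low = means[int(tail * n_resamples)]
    high = means[int((1.0 - tail) * n_resamples) - 1]
    return low, high

aflfast_trials = [6056, 6092, 6121, 6225]  # reconstructed from the table row
low, high = bootstrap_mean_ci(aflfast_trials)
print(f"mean={statistics.fmean(aflfast_trials):.1f}  95% CI=({low:.1f}, {high:.1f})")
```

With only four trials the interval is wide, which is why bands for fuzzers with few completed trials should be read with caution.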

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	124200	20.0	6409.350000	47.082542	6280.0	6382.75	6425.5	6438.5	6478.0
aflplusplus	124200	17.0	6351.882353	96.503162	6191.0	6268.00	6390.0	6425.0	6492.0
afl	124200	3.0	6210.333333	113.694034	6120.0	6146.50	6173.0	6255.5	6338.0
aflfast	124200	4.0	6123.500000	72.702591	6056.0	6083.00	6106.5	6147.0	6225.0
libfuzzer	124200	20.0	5954.550000	110.179984	5788.0	5863.75	5950.5	6008.5	6174.0
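The columns of this table match the usual descriptive statistics with the sample standard deviation and linearly interpolated quartiles. As a check, the sketch below recomputes the afl row using only the standard library. The three trial values are inferred from the row itself: with a count of 3, the min, median and max are the data. The `describe` helper is a hypothetical name.

```python
import statistics

def describe(samples):
    """Summary statistics in the same column order as the table above."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, q2, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return {
        "count": float(len(samples)),
        "mean": statistics.fmean(samples),
        "std": statistics.stdev(samples),  # sample std; needs at least 2 trials
        "min": float(min(samples)),
        "25%": q1,
        "median": q2,
        "75%": q3,
        "max": float(max(samples)),
    }

afl_trials = [6120, 6173, 6338]  # inferred from the afl row
row = describe(afl_trials)
print("\t".join(f"{name}={value:.6f}" for name, value in row.items()))
```

The standard deviation is undefined for a single trial, which is why rows with a count of 1.0 elsewhere in this report show NaN in the std column.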

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.
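A minimal sketch of the A12 statistic: it is the fraction of (row trial, column trial) pairs in which the row fuzzer reaches more coverage, with ties counted as half. The trial values are reconstructed from the afl and aflfast rows of the sample statistics table above; for such small samples the quartiles determine them.

```python
def vargha_delaney_a12(row_trials, column_trials):
    """Probability that a random row trial beats a random column trial."""
    wins = sum(1 for x in row_trials for y in column_trials if x > y)
    ties = sum(1 for x in row_trials for y in column_trials if x == y)
    return (wins + 0.5 * ties) / (len(row_trials) * len(column_trials))

afl_trials = [6120, 6173, 6338]            # reconstructed from the table
aflfast_trials = [6056, 6092, 6121, 6225]  # reconstructed from the table

a12 = vargha_delaney_a12(afl_trials, aflfast_trials)
print(a12)  # 0.75
```

A value of 0.5 means neither fuzzer tends to win, and swapping the two arguments gives 1 minus the original value (here 0.25).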

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
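A minimal sketch of the test, using the normal approximation with a continuity correction and no tie correction. The report does not say which variant it uses, and for samples as small as these an exact test would be preferable, so treat the p value as approximate. The trial values are reconstructed from the afl and aflfast rows of the sample statistics table above.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, approximate two-sided p value)."""
    n1, n2 = len(xs), len(ys)
    # U counts the pairs in which x beats y; ties count as half a win.
    u1 = sum(1.0 if x > y else 0.5 if x == y else 0.0
             for x in xs for y in ys)
    mean_u = n1 * n2 / 2.0
    # Standard deviation of U under the null hypothesis (no tie correction).
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    # Continuity correction of 0.5, clipped so z is never negative.
    z = max(abs(u1 - mean_u) - 0.5, 0.0) / sd_u
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(z / math.sqrt(2.0))
    return u1, p_value

afl_trials = [6120, 6173, 6338]            # reconstructed from the table
aflfast_trials = [6056, 6092, 6121, 6225]  # reconstructed from the table

u_stat, p_value = mann_whitney_u(afl_trials, aflfast_trials)
print(f"U={u_stat}, p={p_value:.3f}")
```

With three and four trials the p value stays far above 0.05 even though afl wins 9 of the 12 pairings, which illustrates why pairs with few completed trials rarely show as significant in the table.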

curl_curl_fuzzer_http summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	125100	20.0	10908.20	23.701210	10864.0	10900.50	10912.0	10926.50	10938.0
aflplusplus	125100	12.0	10898.25	88.518231	10793.0	10851.50	10869.0	10928.75	11130.0
libfuzzer	125100	20.0	10112.20	379.030564	9342.0	9985.25	10213.0	10373.50	10561.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

freetype2_ftfuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	125100	11.0	11471.090909	450.600589	10459.0	11360.0	11558.0	11696.00	12013.0
libafl	125100	20.0	11516.650000	649.724498	10378.0	11027.5	11480.0	11819.75	12718.0
libfuzzer	125100	20.0	10061.000000	636.180792	8656.0	9639.0	10037.0	10459.00	11083.0
centipede	125100	16.0	7302.500000	190.245455	6957.0	7178.5	7307.5	7416.50	7686.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

harfbuzz_hb-shape-fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	124200	20.0	11056.20	175.419977	10391.0	11072.25	11101.0	11136.75	11166.0
aflplusplus	124200	20.0	10931.50	114.853912	10504.0	10910.25	10959.5	10988.25	11041.0
afl	124200	5.0	10869.20	23.552070	10835.0	10854.00	10884.0	10886.00	10887.0
aflfast	124200	4.0	10736.50	39.314967	10689.0	10712.25	10741.5	10765.75	10774.0
libfuzzer	124200	20.0	10642.85	62.224784	10506.0	10607.50	10632.5	10681.75	10759.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

jsoncpp_jsoncpp_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	124200	20.0	524.900000	0.307794	524.0	525.0	525.0	525.00	525.0
centipede	124200	20.0	520.750000	1.996708	518.0	519.0	521.5	522.25	523.0
aflplusplus	124200	17.0	519.941176	0.242536	519.0	520.0	520.0	520.00	520.0
afl	124200	3.0	518.333333	1.527525	517.0	517.5	518.0	519.00	520.0
aflfast	124200	2.0	517.000000	0.000000	517.0	517.0	517.0	517.00	517.0
libafl	124200	20.0	517.350000	0.745160	517.0	517.0	517.0	517.00	519.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

lcms_cms_transform_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	125100	20.0	2115.350	61.015335	2031.0	2068.00	2117.0	2153.25	2228.0
aflplusplus	125100	5.0	1995.800	307.873675	1468.0	2012.00	2090.0	2158.00	2251.0
libfuzzer	125100	20.0	2000.900	91.810961	1810.0	1943.50	2019.0	2075.75	2131.0
centipede	125100	16.0	1082.125	284.363588	782.0	839.25	905.0	1377.00	1444.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libjpeg-turbo_libjpeg_turbo_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	125100	11.0	2549.272727	2.686667	2545.0	2547.00	2551.0	2551.5	2552.0
libfuzzer	125100	20.0	2549.550000	1.932411	2546.0	2549.75	2550.0	2550.0	2553.0
centipede	125100	16.0	2547.187500	1.470544	2546.0	2546.00	2546.5	2549.0	2550.0
libafl	125100	20.0	2545.500000	1.933091	2543.0	2544.75	2545.0	2546.0	2550.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpcap_fuzz_both summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
centipede	89100	20.0	2798.950000	645.005057	101.0	2851.75	2911.5	3006.75	3218.0
libafl	89100	4.0	2807.250000	121.771302	2631.0	2784.00	2844.0	2867.25	2910.0
aflplusplus	89100	1.0	2769.000000	NaN	2769.0	2769.00	2769.0	2769.00	2769.0
libfuzzer	89100	15.0	2582.400000	87.076157	2459.0	2526.50	2598.0	2616.00	2760.0
aflfast	89100	4.0	38.250000	5.500000	33.0	33.75	38.5	43.00	43.0
afl	89100	17.0	36.941176	4.022912	33.0	34.00	34.0	42.00	43.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpng_libpng_read_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	126000	15.0	2022.933333	19.110456	2017.0	2018.00	2018.0	2018.00	2092.0
centipede	126000	8.0	2015.750000	1.035098	2014.0	2015.00	2016.0	2016.25	2017.0
aflplusplus	126000	20.0	2004.500000	2.704772	1998.0	2003.00	2005.0	2006.00	2010.0
libafl	126000	20.0	2000.700000	20.246117	1974.0	1997.25	2000.5	2001.25	2079.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxml2_xml summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	126000	3.0	15772.666667	90.168361	15670.0	15739.50	15809.0	15824.00	15839.0
libafl	126000	20.0	15630.300000	124.532008	15135.0	15633.00	15664.5	15682.00	15744.0
libfuzzer	126000	20.0	15551.250000	74.194534	15448.0	15492.50	15554.0	15607.50	15688.0
centipede	126000	18.0	14777.888889	102.809870	14568.0	14718.25	14767.5	14829.75	15046.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxslt_xpath summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	125100	12.0	11249.083333	86.757193	11086.0	11185.50	11280.0	11308.0	11361.0
libafl	125100	20.0	11020.750000	69.967568	10844.0	11003.25	11031.5	11081.0	11102.0
centipede	125100	16.0	10743.125000	101.075467	10563.0	10657.25	10742.5	10794.5	10948.0
libfuzzer	125100	20.0	10639.000000	123.678275	10354.0	10591.75	10651.0	10726.0	10833.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

mbedtls_fuzz_dtlsclient summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	125100	20.0	3429.500000	367.875050	2745.0	3133.00	3588.5	3733.5	3811.0
aflplusplus	125100	9.0	2789.888889	34.034705	2746.0	2759.00	2789.0	2822.0	2832.0
libfuzzer	125100	20.0	2712.450000	26.083822	2655.0	2700.25	2711.0	2722.0	2788.0
centipede	125100	20.0	2662.400000	25.948126	2624.0	2639.00	2661.0	2677.0	2729.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openh264_decoder_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	128700	20.0	9544.500000	39.896577	9441.0	9541.00	9554.0	9567.75	9592.0
afl	128700	20.0	9544.000000	37.337719	9460.0	9527.75	9538.5	9558.00	9617.0
libfuzzer	128700	14.0	9522.285714	42.376388	9401.0	9513.50	9530.0	9548.75	9569.0
centipede	128700	9.0	9185.666667	277.917254	8469.0	9235.00	9264.0	9295.00	9389.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openssl_x509 summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	126900	9.0	5831.444444	6.043821	5821.0	5831.00	5833.0	5834.00	5841.0
libfuzzer	126900	20.0	5832.550000	3.300319	5820.0	5832.00	5833.0	5833.50	5836.0
libafl	126900	20.0	5828.450000	6.108450	5821.0	5823.75	5830.0	5830.00	5844.0
centipede	126900	15.0	5827.733333	5.775152	5814.0	5824.50	5828.0	5831.50	5838.0
afl	126900	2.0	5825.500000	0.707107	5825.0	5825.25	5825.5	5825.75	5826.0
aflfast	126900	2.0	5819.500000	3.535534	5817.0	5818.25	5819.5	5820.75	5822.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openthread_ot-ip6-send-fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	125100	11.0	3410.454545	274.282833	3048.0	3081.50	3587.0	3598.50	3682.0
libafl	125100	20.0	3583.600000	352.506275	3050.0	3347.25	3574.0	3882.00	4037.0
libfuzzer	125100	19.0	3088.684211	16.227729	3046.0	3079.50	3091.0	3098.50	3118.0
centipede	125100	20.0	2894.850000	92.664803	2745.0	2800.00	2906.0	2946.25	3086.0
aflfast	125100	3.0	2858.333333	49.943301	2829.0	2829.50	2830.0	2873.00	2916.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

proj4_proj_crs_to_crs_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	126000	20.0	7816.600000	93.182560	7632.0	7782.00	7829.0	7855.25	7969.0
libafl	126000	19.0	7488.263158	156.280674	7175.0	7420.50	7514.0	7581.00	7728.0
aflplusplus	126000	8.0	7342.375000	221.857057	6993.0	7234.75	7296.5	7540.25	7631.0
centipede	126000	15.0	822.000000	4.070802	816.0	820.00	821.0	823.50	832.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

re2_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	125100	20.0	2885.60	1.759186	2882.0	2885.00	2885.5	2887.00	2889.0
aflplusplus	125100	10.0	2878.40	4.005552	2871.0	2877.25	2878.5	2881.50	2883.0
libafl	125100	20.0	2861.95	7.708335	2847.0	2860.75	2863.0	2864.25	2881.0
centipede	125100	20.0	2777.85	20.959296	2744.0	2766.25	2773.0	2790.00	2831.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

sqlite3_ossfuzz summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	126900	20.0	20579.200000	947.179614	16691.0	20671.75	20852.5	20937.25	21006.0
aflplusplus	126900	3.0	20210.666667	186.551155	20062.0	20106.00	20150.0	20285.00	20420.0
libfuzzer	126900	20.0	16966.250000	443.628208	16352.0	16582.50	16843.5	17359.50	17799.0
centipede	126900	17.0	13657.529412	552.505443	12881.0	13326.00	13561.0	13842.00	15214.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

stb_stbi_read_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	126000	20.0	2192.500000	46.641409	2103.0	2188.0	2192.5	2203.25	2268.0
aflplusplus	126000	2.0	2163.000000	73.539105	2111.0	2137.0	2163.0	2189.00	2215.0
libfuzzer	126000	20.0	2073.050000	49.656160	2008.0	2025.0	2076.5	2105.50	2161.0
centipede	126000	19.0	1962.473684	9.094372	1948.0	1957.5	1960.0	1966.00	1989.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

systemd_fuzz-link-parser summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
aflplusplus	124200	20.0	237.800000	3.053902	226.0	238.00	239.0	239.0	240.0
libafl	124200	5.0	236.800000	0.447214	236.0	237.00	237.0	237.0	237.0
aflfast	124200	14.0	219.071429	1.979288	215.0	218.25	219.0	221.0	221.0
libfuzzer	124200	17.0	193.000000	37.396524	156.0	156.00	174.0	234.0	238.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

vorbis_decode_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	125100	20.0	1269.35	1.348488	1267.0	1268.75	1270.0	1270.00	1271.0
aflplusplus	125100	10.0	1265.80	4.131182	1261.0	1263.25	1265.0	1266.75	1275.0
libafl	125100	20.0	1252.85	3.407036	1248.0	1250.00	1252.5	1255.25	1260.0
aflfast	125100	6.0	1250.00	6.603030	1238.0	1248.50	1251.5	1254.50	1256.0
centipede	125100	20.0	1156.40	13.359010	1134.0	1145.00	1156.5	1164.00	1186.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

woff2_convert_woff2ttf_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libafl	126900	20.0	1184.850000	11.389168	1165.0	1178.50	1184.5	1194.0	1203.0
libfuzzer	126900	20.0	1162.300000	44.251792	1066.0	1135.25	1175.5	1201.5	1208.0
centipede	126900	18.0	1088.555556	12.486332	1067.0	1075.75	1093.0	1098.0	1105.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

zlib_zlib_uncompress_fuzzer summary

Ranking by median reached code coverage

Reached code coverage distribution

Code coverage (linear)
Code coverage (log)

Mean code coverage growth over time

* The error bands show the 95% confidence interval around the mean code coverage.

Sample statistics and statistical significance (code coverage)

Code coverage sample statistics

fuzzer	time	count	mean	std	min	25%	median	75%	max
libfuzzer	127800	20.0	468.6	4.638512	461.0	463.0	471.0	472.0	473.0
centipede	127800	7.0	451.0	2.081666	447.0	451.0	451.0	451.5	454.0
libafl	127800	15.0	450.6	6.231258	442.0	447.0	450.0	452.0	461.0
aflfast	127800	20.0	447.7	13.913643	390.0	448.0	449.0	454.0	456.0

Vargha-Delaney A12 measure

The table summarizes the A12 values from the pairwise Vargha-Delaney A measure of effect size. Green cells indicate the probability the fuzzer in the row will outperform the fuzzer in the column.

Mann-Whitney U test

The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.

Experiment Description:

None provided.
