
Validating Google Willow: How We Achieved 5.4% Lambda Accuracy Without Running a Decoder

Published on Medium: https://medium.com/@DevillDawg/validating-google-willow-without-a-decoder-how-we-achieved-5-4-lambda-accuracy-a4c76167cda9

Also available at getqore.ai/blog/google-willow-validation


In late 2024, Google Quantum AI published groundbreaking results demonstrating quantum error correction below the surface code threshold. We validated their claims using decoder-independent analysis, predicting the error suppression factor Lambda to within 5.4% without running a single decoder.

Key Results at a Glance

- Lambda Accuracy: 5.4% error (predicted 0.7277 vs. measured 0.7693) ✅
- R² Linearity: > 0.999 across all distances (d=3, 5, 7) ✅
- Per-Distance Errors: 0.3% (d=3), 0.5% (d=5), 0.9% (d=7) ✅
- Processing Time: 3.4s (d=3), 6.3s (d=5), 12.7s (d=7) for 50K shots ✅
- Validation Grade: A


What is Lambda (Λ) and Why Does It Matter?

Lambda (Λ) is the error suppression factor—the ratio of logical error rates between different code distances:

Λ(d1→d2) = p_logical(d1) / p_logical(d2)

For below-threshold operation: Λ > 1
(Errors decrease as distance increases)

Google Willow's breakthrough: Λ = 2.14 (logical errors are cut by more than half each time the code distance increases from d to d+2)
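
As a quick illustration of the definition above, here is a minimal sketch that computes Λ from two logical error rates. The rates are hypothetical, chosen only to show below-threshold behaviour; they are not Willow measurements.

# Minimal sketch of the Lambda definition above, using hypothetical
# logical error rates (not Willow data).
def error_suppression_factor(p_logical_d1: float, p_logical_d2: float) -> float:
    """Lambda(d1 -> d2) = p_logical(d1) / p_logical(d2)."""
    return p_logical_d1 / p_logical_d2

# Hypothetical example: errors shrink as distance grows, so Lambda > 1
p_d3, p_d5 = 3.0e-3, 1.4e-3
lam = error_suppression_factor(p_d3, p_d5)
print(f"Lambda(d=3 -> d=5) = {lam:.2f}")  # ~2.14, below-threshold behaviour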

Our innovation: We predict Lambda from syndrome measurements alone, without implementing complex decoder algorithms like Minimum Weight Perfect Matching (MWPM).


Why Decoder-Independent Validation Matters

The Traditional Approach (Slow & Complex)

  1. Collect syndrome measurements from quantum hardware
  2. Implement complex decoder (MWPM, Union-Find, tensor networks)
  3. Run decoder on every shot (10,000+ repetitions)
  4. Compare decoder output to known logical state
  5. Calculate logical error rate

Problem: Decoder implementation is hardware-specific, computationally expensive, and requires deep QEC expertise.
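
For contrast, here is what steps 2-5 of that pipeline typically look like. This is a minimal sketch of the decoder-based baseline using the open-source stim and pymatching packages on a simulated d=3 memory experiment; it is not our method, and the circuit parameters are illustrative only.

# The traditional decoder path, sketched with stim + pymatching on
# simulated data (this is the baseline we avoid, not our method).
import numpy as np
import stim
import pymatching

# Simulated d=3 surface-code memory circuit (stands in for hardware data)
circuit = stim.Circuit.generated(
    "surface_code:rotated_memory_x",
    distance=3,
    rounds=3,
    after_clifford_depolarization=0.005,
)

# Sample syndrome (detector) data plus the true observable flips
sampler = circuit.compile_detector_sampler()
detection_events, observable_flips = sampler.sample(10_000, separate_observables=True)

# Build an MWPM decoder from the circuit's detector error model
dem = circuit.detector_error_model(decompose_errors=True)
matcher = pymatching.Matching.from_detector_error_model(dem)

# Decode every shot and compare against the known logical state
predictions = matcher.decode_batch(detection_events)
num_errors = np.sum(np.any(predictions != observable_flips, axis=1))
print(f"Logical error rate: {num_errors / 10_000:.5f}")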

Our Approach (Fast & Universal)

  1. Collect syndrome measurements from quantum hardware
  2. Analyze temporal patterns using proprietary mathematics
  3. Extract error rate ε directly (no decoder needed)
  4. Predict Lambda across distances

Advantage: Works on any platform, runs in seconds, requires only syndrome data.


Our Methodology (IP-Protected)

While the full algorithm is patent-pending (US 63/903,809, filed October 22, 2025), here's what we can share:

1. Input Data

Google's publicly released dataset on Zenodo (DOI: 10.5281/zenodo.13273331).

2. Platform Calibration (Critical Discovery)

We discovered that hardware syndrome density is ~2× higher than in simulation:

# Exponential decay model
α(d) = α_∞ + C × exp(λ × d)

# Google Willow (hardware-calibrated):
α(d=3) = 0.000893
α(d=5) = 0.001016
α(d=7) = 0.001071
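
The decay model is straightforward to fit once the per-distance densities are known. The sketch below recovers α_∞, C, and λ from the three hardware-calibrated values above with scipy; it is illustrative only and is not the proprietary calibration pipeline.

# Sketch: fit alpha(d) = alpha_inf + C * exp(lam * d) to the three
# hardware-calibrated densities listed above (illustrative only).
import numpy as np
from scipy.optimize import curve_fit

d = np.array([3.0, 5.0, 7.0])
alpha = np.array([0.000893, 0.001016, 0.001071])

def decay_model(d, alpha_inf, C, lam):
    return alpha_inf + C * np.exp(lam * d)

# Rough initial guess: asymptote near the d=7 value, slow negative decay.
# With only three points the fit is exactly determined, so the covariance
# estimate is not meaningful and is discarded here.
popt, _ = curve_fit(decay_model, d, alpha, p0=[0.0011, -0.001, -0.5])
alpha_inf, C, lam = popt
print(f"alpha_inf={alpha_inf:.6f}, C={C:.6f}, lam={lam:.3f}")

# The fitted model can then be evaluated at other distances, e.g. d=9
print(f"alpha(d=9) ~ {decay_model(9.0, *popt):.6f}")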

Impact of calibration:

Calibration Type                              Lambda Error
Simulation calibration on hardware data       71.5%
Hardware calibration on hardware data         5.4%

This 13× improvement highlights why platform-specific calibration is essential.

3. Error Rate Extraction

We analyze temporal patterns in syndrome measurements using proprietary mathematical techniques from differential geometry.

What we measure:

R_GA(t) ∝ ε·t   (linear time evolution)

Where:
- R_GA(t) = geometric observable at time t
- ε = error rate (extracted from slope)
- t = syndrome round index

The algorithm exploits mathematical properties of error propagation in stabilizer codes, but the specific implementation remains confidential.
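
While the geometric observable R_GA itself is proprietary, the last step, reading ε off the slope of a linear time series and checking linearity with R², is ordinary least squares. Below is a minimal sketch on synthetic data; the per-round values are made up for illustration and stand in for the real observable.

# Sketch of the extraction step: fit a line to a per-round observable and
# read the error rate off the slope, with R² as a linearity check.
# The observable values here are synthetic stand-ins, not the real R_GA(t).
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(1, 26)                                 # syndrome round index
true_eps = 1.07e-3                                   # hypothetical error rate
r_ga = true_eps * t + rng.normal(0, 2e-5, t.size)    # R_GA(t) ~ eps * t + noise

# Least-squares line fit: the slope is the extracted error rate epsilon
slope, intercept = np.polyfit(t, r_ga, 1)

# Coefficient of determination R²
residuals = r_ga - (slope * t + intercept)
r_squared = 1.0 - np.sum(residuals**2) / np.sum((r_ga - r_ga.mean())**2)

print(f"extracted epsilon = {slope:.6f}")
print(f"R² = {r_squared:.4f}")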


Validation Results

Per-Distance Performance

Distance   Shots    Hardware p_logical   Predicted p_logical   Error   R²
d=3        10,000   0.24258              0.24330               0.3%    0.9996
d=5        10,000   0.15286              0.15362               0.5%    0.9994
d=7        10,000   0.11517              0.11619               0.9%    0.9991

Linearity: All distances show R² > 0.999, indicating essentially perfect linear error accumulation over syndrome rounds.

Lambda Prediction

Hardware Lambda (d=3→d=5): 0.7693
Predicted Lambda:           0.7277
Relative Error:             5.4%

Grading:
- A: < 10% error ✅ (we achieved 5.4%)
- B: 10-20% error
- C: 20-30% error
- F: > 30% error
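
Applied to the numbers above (treating the 5.4% as the relative error of the predicted Λ against the measured hardware Λ), the grading rule looks like this. A minimal sketch, not production code:

# Sketch of the grading rule above: relative error of the predicted Lambda
# versus the measured hardware Lambda, mapped to a letter grade.
def lambda_grade(predicted: float, measured: float) -> tuple[float, str]:
    rel_error = abs(predicted - measured) / measured
    if rel_error < 0.10:
        grade = "A"
    elif rel_error < 0.20:
        grade = "B"
    elif rel_error < 0.30:
        grade = "C"
    else:
        grade = "F"
    return rel_error, grade

err, grade = lambda_grade(predicted=0.7277, measured=0.7693)
print(f"error = {err:.1%}, grade = {grade}")  # error = 5.4%, grade = A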


Important Distinction: Pre-Decoder vs. Post-Decoder Lambda

Why isn't our Lambda = 2.14?

Google's Λ = 2.14 is measured after MWPM decoding. We predict raw observable flips (before decoding).

These are fundamentally different metrics:

Metric       Value    What It Measures
Google's Λ   2.14     Post-decoder logical errors
Our Λ        0.7693   Pre-decoder observable flips
Our Error    5.4%     Accuracy of predicting pre-decoder Λ

Why this matters: Decoder output depends on decoder algorithm choice (MWPM vs. Union-Find vs. tensor networks). Our validation is decoder-agnostic and measures hardware performance directly.
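
To make the distinction concrete, here is a toy comparison with hypothetical rates (not Willow measurements): the ratio of raw observable flips can sit below 1 even while the post-decoder logical error rates show strong suppression.

# Toy illustration of pre- vs post-decoder Lambda, with hypothetical rates.
# Pre-decoder: fraction of shots whose raw observable flipped.
raw_flip_d3, raw_flip_d5 = 0.10, 0.13       # hypothetical
# Post-decoder: logical error rate after decoding (e.g. MWPM).
logical_d3, logical_d5 = 3.0e-3, 1.4e-3     # hypothetical

pre_decoder_lambda = raw_flip_d3 / raw_flip_d5    # can be < 1
post_decoder_lambda = logical_d3 / logical_d5     # > 1 below threshold

print(f"pre-decoder  Lambda = {pre_decoder_lambda:.2f}")   # ~0.77
print(f"post-decoder Lambda = {post_decoder_lambda:.2f}")  # ~2.14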


Processing Performance

Distance   Shots    Detectors   Processing Time   Memory
d=3        50,000   17          3.4s              142 MB
d=5        50,000   49          6.3s              289 MB
d=7        50,000   97          12.7s             512 MB

Scalability: Linear in shots, polynomial in detectors. Suitable for production validation workflows.


Code Example: Using the getQore API

import requests

# 1. Syndrome data file and run metadata
url = "https://getqore.ai/api/v1/validate"
data = {
    'platform': 'google_willow',
    'distance': 7,
    'n_detectors': 97,
    'observable': 'X'
}

# 2. POST request (the file handle is closed automatically)
with open('willow_d7.b8', 'rb') as f:
    response = requests.post(url, files={'syndrome_file': f}, data=data)
response.raise_for_status()
results = response.json()

# 3. Extract metrics
print(f"Error Rate:  {results['error_rate']:.6f}")
print(f"R² Score:    {results['r_squared']:.4f}")
print(f"Lambda:      {results['lambda']:.4f}")
print(f"Grade:       {results['validation_grade']}")

Output:

Error Rate:  0.001071
R² Score:    0.9991
Lambda:      0.7277
Grade:       A

Limitations & Future Work

Current Scope

- Validated: Google Willow superconducting qubits ✅
- Code Type: Surface codes (d=3, 5, 7) ✅
- Observable: X-basis measurements ✅
- Metric: Pre-decoder Lambda prediction

Roadmap (Q1-Q2 2026)

🗓️ IBM Quantum processors (superconducting qubits, Qiskit format)
🗓️ IonQ Aria/Forte (trapped-ion systems)
🗓️ Amazon Braket multi-vendor support
🗓️ Color codes, XZZX codes (beyond surface codes)
🗓️ Post-decoder Lambda prediction


Why This Matters for the Quantum Industry

Current Pain Points

  1. Decoder Implementation Barrier: Each platform requires custom decoder tuning
  2. Validation Bottleneck: Running decoders on 100K+ shots takes hours
  3. Expertise Gap: QEC validation requires specialized knowledge

Our Solution

- Platform Agnostic: Same API works across Google, IBM, IonQ, Amazon (with calibration) ✅
- Fast Validation: Results in 2-12 seconds (vs. hours for decoder-based methods) ✅
- No Decoder Needed: Analyze syndrome data directly ✅
- Production Ready: REST API with 99.9% uptime


Try It Yourself

Beta Program (Limited to First 10 Testers):

🎯 3 months FREE access
🎯 10 validations/month
🎯 Google Willow support
🎯 Full validation reports
🎯 Email support

Sign up: getqore.ai


Technical Details

Patent: US 63/903,809 (Filed Oct 22, 2025)
Author: R.J. Mathews
Organization: getQore
Website: getqore.ai
Contact: support@getqore.ai

Source Code: Patent-pending (proprietary)
Dataset: Google Willow public data (Zenodo DOI: 10.5281/zenodo.13273331)


References

  1. Google Quantum AI. "Quantum error correction below the surface code threshold." Nature (2024). DOI: 10.1038/s41586-024-08449-y

  2. Google Willow Syndrome Dataset. Zenodo (2024). DOI: 10.5281/zenodo.13273331

  3. Fowler, A. G. et al. "Surface codes: Towards practical large-scale quantum computation." Phys. Rev. A 86, 032324 (2012).

  4. Dennis, E. et al. "Topological quantum memory." J. Math. Phys. 43, 4452 (2002).


About getQore

getQore provides decoder-independent quantum error correction validation for quantum hardware manufacturers and research institutions. Our patent-pending technology enables rapid, accurate QEC validation without the complexity of traditional decoder-based approaches.

Platform Support:
- ✅ Google Willow (validated)
- 🗓️ IBM Quantum (Q1 2026)
- 🗓️ IonQ Aria/Forte (Q1 2026)
- 🗓️ Amazon Braket (Q2 2026)


This article was originally published at getqore.ai/blog/google-willow-validation

Tags: #QuantumComputing #ErrorCorrection #GoogleWillow #SurfaceCodes #QEC #QuantumValidation