DeepPractise

Randomized Benchmarking (Intro)

Track: Noise & Errors · Difficulty: Intermediate · Est: 14 min

Overview

Randomized Benchmarking (RB) answers a practical question:

  • “On average, how noisy are my gates when I run realistic sequences?”

RB is widely used because it can produce a robust, interpretable error estimate without reconstructing a full model of the device.

It is especially valued because it reduces sensitivity to state-preparation and measurement (SPAM) errors, which often dominate naive measurements.

Intuition

Why RB exists

If you try to estimate gate quality by running one circuit and comparing to an ideal result, you run into a problem:

  • state preparation may be imperfect
  • measurement may be imperfect
  • the circuit result may look bad even if gates are fine, or look good even if some gates are bad

RB uses randomness to “wash out” detailed structure and reveal an average decay trend.

Averaging over gate errors

The intuition is:

  • A random sequence of gates tends to spread errors around.
  • When you repeat this many times with different random sequences, you can estimate a typical error rate.

RB looks for a simple pattern:

  • as the sequence gets longer, performance decays

That decay slope is the key output.
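The decay intuition can be sketched with a toy Monte Carlo. This is a hypothetical one-bit model, not real RB: each "gate" errs (flips the bit) with a small probability eps, and we track how often we still end in the starting state as the sequence grows.

```python
import random

random.seed(7)

def survival(depth, eps=0.01, trials=20000):
    """Fraction of runs that end in the starting state after `depth` noisy gates."""
    hits = 0
    for _ in range(trials):
        state = 0
        for _ in range(depth):
            if random.random() < eps:  # each gate errs with probability eps
                state ^= 1             # toy error model: flip the bit
        hits += (state == 0)
    return hits / trials

for m in (10, 50, 200):
    print(m, round(survival(m), 3))
```

Longer sequences survive less often, decaying toward the fully random value 0.5; the rate of that decay encodes eps.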

RB vs tomography (high level)

  • Tomography tries to reconstruct the full behavior (very detailed, expensive).
  • RB tries to estimate an average error rate (coarser, cheaper, often more robust).

Formal Description

We define RB by its workflow and its output.

Basic RB procedure (conceptual)

  1. Choose a sequence length m.
  2. Pick a random sequence of m gates from a set with nice properties (often Clifford gates).
  3. Append a final “inverse” gate that would undo the sequence if everything were perfect.
  4. Run the circuit for many shots and estimate the probability of returning to the expected state.
  5. Repeat for many random sequences at the same length m.
  6. Repeat for several different lengths m.
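Steps 2–3 can be sketched for a single qubit with plain numpy. This is a minimal, noiseless sanity check (with a small stand-in gate set, not a full Clifford group): the inverse is just the conjugate transpose of the whole sequence, so |0⟩ should return with probability 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# A few single-qubit gates (a stand-in for a full Clifford set).
GATES = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "H": np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),
    "S": np.array([[1, 0], [0, 1j]], dtype=complex),
}

def rb_circuit(m):
    """Random length-m gate sequence plus the inverse that undoes it (step 3)."""
    names = list(rng.choice(list(GATES), size=m))
    total = np.eye(2, dtype=complex)
    for n in names:
        total = GATES[n] @ total
    inverse = total.conj().T  # unitary inverse of the whole sequence
    return names, inverse

# Noiseless run: apply the sequence, then the inverse, starting from |0>.
names, inv = rb_circuit(20)
state = np.array([1, 0], dtype=complex)
for n in names:
    state = GATES[n] @ state
state = inv @ state
print(abs(state[0]) ** 2)  # ≈ 1.0 without noise
```

With real noise on each gate, that return probability drops below 1 and keeps dropping as m grows, which is exactly the decay RB measures.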

What you plot:

  • success probability vs sequence length

Typical observation:

  • the success probability decays roughly exponentially with length

RB then fits that decay to extract an average error-per-gate estimate.
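The standard fit model is F(m) = A·pᵐ + B, where p is the decay parameter and A, B absorb SPAM errors; the fitted p is then converted to an average error per gate as r = (d − 1)(1 − p)/d with d = 2^(number of qubits). A minimal sketch of that conversion:

```python
def decay_model(m, A, p, B):
    """RB decay curve: predicted success probability at sequence length m."""
    return A * p**m + B

def error_per_gate(p, n_qubits=1):
    """Average error per gate from the fitted decay parameter p."""
    d = 2 ** n_qubits
    return (d - 1) * (1 - p) / d

print(round(error_per_gate(0.99), 6))     # one qubit
print(round(error_per_gate(0.99, 2), 6))  # two qubits
```

Note that the same p implies a larger error rate on more qubits, because d grows with the system size.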

What RB does tell you

RB is good at estimating:

  • an average error rate over the chosen gate set
  • a trend with depth

It is especially useful as a stable “health metric” you can track over time.

What RB does NOT tell you

RB does not automatically give:

  • a full noise model
  • which specific gate is the worst
  • a worst-case guarantee
  • a direct prediction for your exact algorithm circuit

Also, RB can hide some coherent or correlated error structure by averaging. That can be good (robustness) and also a limitation (it may miss specific failure modes).

Worked Example

Suppose you run RB with lengths m = 10, 20, 50, 100. You observe success probabilities like:

  • m = 10: 0.95
  • m = 20: 0.91
  • m = 50: 0.80
  • m = 100: 0.65

You fit a smooth decay curve and extract an average “error per gate” estimate.

Interpretation:

  • a longer random sequence is more likely to drift away from the ideal
  • the rate of decay summarizes typical gate noise

Even without exact fitting math, the key idea is: RB converts messy gate errors into a clean depth-dependent signal.
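For concreteness, a back-of-the-envelope fit of the numbers above (assuming the ideal single-qubit asymptotes A = B = 0.5, so F(m) = 0.5·pᵐ + 0.5, rather than a proper least-squares fit):

```python
# The data points from the worked example above.
data = {10: 0.95, 20: 0.91, 50: 0.80, 100: 0.65}

# With A = B = 0.5 assumed, each point can be inverted directly:
# F = 0.5*p**m + 0.5  =>  p = (2F - 1)**(1/m).
ps = [(2 * F - 1) ** (1 / m) for m, F in data.items()]
p = sum(ps) / len(ps)

r = (1 - p) / 2  # average error per gate for one qubit (d = 2)
print(round(p, 4), round(r, 4))  # roughly p ≈ 0.989, r ≈ 0.005
```

The four points yield nearly the same p, which is a quick consistency check that the data really does follow a single exponential decay.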

Turtle Tip

RB is less about reconstructing noise and more about summarizing it: “How fast does performance decay as I stack gates?”

Common Pitfalls
  • Treating RB as a guarantee for your algorithm. RB measures an averaged behavior over a specific gate set.
  • Forgetting RB is gate-set dependent. A different compilation (different gate set) can change the relevance.
  • Assuming RB pinpoints which gate is broken. Standard RB averages over the whole gate set; isolating a single gate requires variants such as interleaved RB.

Quick Check
  1. What plot does RB conceptually rely on?
  2. Why does RB use random sequences and an inverse gate?
  3. Name one thing RB does not tell you.

What’s Next

RB gives an interpretable average error estimate. Next we contrast it with a more detailed approach: Quantum Process Tomography, which attempts to reconstruct the full action of a process, but scales poorly.