Magic State Cultivation with Lattice Surgery - QEC Simulation Dataset

📋 Overview

This dataset contains quantum error correction (QEC) simulation data for magic state cultivation using lattice surgery protocols. The data is derived from the research paper "Efficient Magic State Cultivation with Lattice Surgery" and reports logical error rates, acceptance rates, and qubit-round costs for several quantum error correction schemes.
Magic state cultivation is a building block of fault-tolerant quantum computing: high-fidelity magic states are the resource that enables universal quantum computation. This dataset captures the performance of lattice surgery techniques applied to the Steane code merged with rotated surface codes.

🎯 Key Features

  • Single Unified File: One Parquet file with complete context in every row
  • 25 Simulation Runs: 5 runs for each of 5 experiment types
  • Multiple Error Correction Codes: Steane [[7,1,3]] code and rotated surface codes (distance 3, 5, 7)
  • Detailed Performance Metrics: Logical error rates, acceptance rates, qubit-round costs
  • Protocol Stage Analysis: Success rates for each stage (injection, stabilization, surgery, epilogue)
  • Complementary Gap Measurements: Novel decoder performance metrics for post-selection
  • ML-Ready Format: Denormalized structure ideal for analysis, visualization, and machine learning

📊 Dataset Structure

The dataset is a single denormalized Parquet file where each row represents one complete simulation run with all parameters, results, and code specifications.

Column Groups

🔑 Identification (4 columns)

Unique identifiers for each simulation run:
| Column | Type | Description |
| --- | --- | --- |
| experiment_id | string | Unique identifier for the experiment configuration |
| run_id | string | Unique identifier for this specific run |
| experiment_type | string | Type of simulation (see experiment types below) |
| run_number | int | Run number within the experiment (0-4) |

⚙️ Simulation Parameters (13 columns)

Configuration settings for the simulation:
| Column | Type | Description |
| --- | --- | --- |
| error_probability | float | Physical error rate per gate operation (0.0005-0.001) |
| num_shots | int | Number of Monte Carlo samples (50K-1M) |
| surface_distance | int | Code distance for surface code patches (3, 5, or 7) |
| initial_value | string | Initial logical state (Plus, Zero, or null) |
| syndrome_extraction_pattern | string | Pattern for syndrome measurements (XZZ, ZXZ, etc.) |
| perfect_initialization | bool | Whether perfect state initialization is used |
| with_heuristic_post_selection | bool | Enable heuristic post-selection |
| with_heuristic_gap_calculation | bool | Enable heuristic gap calculation |
| full_post_selection | bool | Apply full post-selection protocol |
| num_stabilization_rounds_after_surgery | int | Stabilizer rounds after surgery |
| num_epilogue_syndrome_extraction_rounds | int | Final syndrome extraction rounds |
| gap_threshold | float | Threshold for complementary gap acceptance |
| experiment_description | string | Human-readable description |

📈 Simulation Results (9 columns)

Outcome metrics from the simulation:
| Column | Type | Description |
| --- | --- | --- |
| num_valid_samples | int | Number of samples that passed checks |
| num_wrong_samples | int | Number of samples with detected errors |
| num_discarded_samples | int | Number of samples rejected by post-selection |
| logical_error_rate | float | Probability of logical error (wrong/total_accepted) |
| acceptance_rate | float | Fraction of samples accepted after post-selection |
| qubitrounds_cost | float | Resource cost in qubit-rounds |
| gap_value | float | Complementary gap measurement value |
| complementary_gap_detector_id | int | Detector ID for gap measurement |
| simulation_timestamp | datetime | Timestamp of simulation run |
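
As a worked illustration of how the rate columns could relate to the count columns, the sketch below recomputes them from raw counts. It assumes that the accepted samples are the valid plus the wrong ones, and that acceptance is measured against all three counts combined. The dataset does not spell out these definitions, so the helper name `rates_from_counts` and its formulas are hypothetical, and the counts are made up.

```python
import math


def rates_from_counts(num_valid, num_wrong, num_discarded):
    """Recompute rate columns from count columns (assumed definitions)."""
    accepted = num_valid + num_wrong          # assumption: accepted = valid + wrong
    total = accepted + num_discarded          # assumption: every shot is in one bucket
    if accepted == 0:
        raise ValueError("need at least one accepted sample")
    logical_error_rate = num_wrong / accepted
    acceptance_rate = accepted / total
    # Binomial standard error of the logical error rate estimate.
    std_err = math.sqrt(logical_error_rate * (1 - logical_error_rate) / accepted)
    return logical_error_rate, acceptance_rate, std_err


# Made-up counts, not taken from the dataset.
ler, acc, err = rates_from_counts(num_valid=89_910, num_wrong=90, num_discarded=10_000)
print(f"logical error rate = {ler:.2e} +/- {err:.1e}, acceptance = {acc:.3f}")
```

The standard error gives a quick sense of whether two runs differ by more than sampling noise; it shrinks with the number of accepted samples, not with `num_shots` alone.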

🔬 Quantum Code Specifications (8 columns)

Details of the quantum error correction code used:
| Column | Type | Description |
| --- | --- | --- |
| code_type | string | Type of quantum code (Steane, Surface_Rotated_d3/5/7, Merged) |
| num_data_qubits | int | Number of data qubits (7-49) |
| num_syndrome_qubits | int | Number of syndrome measurement qubits |
| code_distance | int | Code distance (3, 5, or 7) |
| num_logical_qubits | int | Number of encoded logical qubits (1 or 2) |
| num_x_stabilizers | int | Number of X-type stabilizer generators |
| num_z_stabilizers | int | Number of Z-type stabilizer generators |
| code_description | string | Human-readable code description |
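
The qubit counts quoted for the surface code patches follow the textbook formulas for a rotated surface code of odd distance d: d² data qubits and d² − 1 stabilizer generators, split evenly between X and Z type. The sketch below evaluates these formulas; the helper `rotated_surface_code_params` is hypothetical, and whether every row of the dataset (the Merged code in particular) follows these counts is an assumption to check against the data.

```python
def rotated_surface_code_params(distance):
    """Textbook qubit and stabilizer counts for a rotated surface code patch."""
    if distance < 3 or distance % 2 == 0:
        raise ValueError("expected an odd distance >= 3")
    num_data = distance ** 2
    num_stabilizers = num_data - 1            # one encoded logical qubit
    return {
        "num_data_qubits": num_data,
        "num_x_stabilizers": num_stabilizers // 2,
        "num_z_stabilizers": num_stabilizers // 2,
        "num_logical_qubits": 1,
    }


# The Steane [[7,1,3]] code has 7 data qubits and 3 + 3 stabilizer generators.
STEANE = {"num_data_qubits": 7, "num_x_stabilizers": 3,
          "num_z_stabilizers": 3, "num_logical_qubits": 1}

for d in (3, 5, 7):
    print(d, rotated_surface_code_params(d))

# Data qubits of a Steane block placed next to a d=3 surface patch.
merged = STEANE["num_data_qubits"] + rotated_surface_code_params(3)["num_data_qubits"]
print("Steane + d=3 surface data qubits:", merged)
```

The three distances give 9, 25, and 49 data qubits, and the Steane plus d=3 sum gives 16, matching the figures listed under the experiment types.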

🔄 Protocol Stage Performance (8 columns)

Aggregated success metrics for each protocol stage:
| Column | Type | Description |
| --- | --- | --- |
| injection_success_rate | float | Success rate for magic state injection (0.85-0.92) |
| stabilize1_success_rate | float | Success rate for first stabilization (0.92-0.97) |
| surgery_success_rate | float | Success rate for lattice surgery (0.88-0.95) |
| stabilize2_success_rate | float | Success rate for second stabilization (0.93-0.98) |
| epilogue_success_rate | float | Success rate for epilogue stage (0.94-0.99) |
| injection_qubits_used | int | Qubit count for injection stage |
| surgery_qubits_used | int | Qubit count for surgery stage |
| total_protocol_cost | float | Total resource cost across all stages |
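
If the five stages are treated as sequential and independent, an end-to-end success probability is the product of the per-stage rates. The sketch below applies that to the lower and upper ends of the ranges quoted in the table. Independence is an assumption made here for illustration, and the dataset does not state that `acceptance_rate` is computed this way.

```python
import math


def overall_success(stage_rates):
    """Product of per-stage success rates (assumes independent stages)."""
    return math.prod(stage_rates)


# Range endpoints from the table, ordered as
# injection, stabilize1, surgery, stabilize2, epilogue.
low = [0.85, 0.92, 0.88, 0.93, 0.94]
high = [0.92, 0.97, 0.95, 0.98, 0.99]

print(f"end-to-end success, low end:  {overall_success(low):.4f}")
print(f"end-to-end success, high end: {overall_success(high):.4f}")
```

Under this assumption the end-to-end success rate falls between roughly 0.60 and 0.82, noticeably lower than any single stage.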

🔧 Stim Circuit Definitions (2 columns)

Complete quantum circuit specifications in Stim format for full reproducibility:
| Column | Type | Description |
| --- | --- | --- |
| stim_circuit_definition | string | Complete quantum circuit in Stim format. Includes qubit coordinates, gate operations (RX, CX, M, MX, MPP), noise models (DEPOLARIZE1/2, X_ERROR, Z_ERROR), timing (TICK), error detection (DETECTOR), and observables. Ranges from 7,535 to 183,929 characters depending on circuit complexity. |
| stim_circuit_length | int | Character count of circuit definition (for reference) |

Stim Circuit Format:
  • QUBIT_COORDS: Physical qubit layout on 2D grid
  • Gates: RX (X-basis reset), R (Z-basis reset), CX (CNOT), MX/M (measurements), S/S_DAG (phase gates), MPP (Pauli product measurements)
  • Noise: DEPOLARIZE1/2, X_ERROR, Z_ERROR with error probability parameters
  • Timing: TICK markers separate time steps
  • Detectors: DETECTOR annotations mark syndrome extraction points
  • Observables: OBSERVABLE_INCLUDE defines logical measurements
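
To get a feel for the instruction mix in a `stim_circuit_definition` string, a minimal sketch using only the standard library is shown below. It counts instruction names line by line and does not expand `REPEAT` blocks. The helper `count_instructions` and the tiny sample circuit are illustrative and not taken from the dataset.

```python
from collections import Counter


def count_instructions(stim_text):
    """Count instruction names in Stim circuit text, one per line."""
    counts = Counter()
    for raw_line in stim_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line == "}":
            continue
        # The name ends at the first "(" (arguments) or at whitespace (targets).
        name = line.split("(", 1)[0].split()[0]
        counts[name] += 1
    return counts


# Hand-written two-qubit sample in Stim syntax.
SAMPLE = """
QUBIT_COORDS(0, 0) 0
QUBIT_COORDS(1, 0) 1
R 0 1
X_ERROR(0.001) 0 1
TICK
CX 0 1
DEPOLARIZE2(0.001) 0 1
TICK
M 0 1
DETECTOR(1, 0, 0) rec[-1]
OBSERVABLE_INCLUDE(0) rec[-2]
"""

for name, n in sorted(count_instructions(SAMPLE).items()):
    print(f"{name:20s} {n}")
```

With Stim installed, passing the same string to `stim.Circuit(...)` parses it properly and exposes properties such as `num_qubits`, `num_detectors`, and `num_observables`, which is the more robust route for real analysis.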

🧪 Experiment Types

The dataset includes 5 distinct experiment configurations:

1. Lattice Surgery with Complementary Gap (lattice_surgery_complementary_gap)

Standard lattice surgery protocol with complementary gap threshold for post-selection.
  • Code: Merged Steane + Surface (16 data qubits)
  • Distance: 3
  • Shots: 1,000,000
  • Focus: High-statistics study of gap-based post-selection
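
The idea of gap-based post-selection can be sketched on synthetic data: each shot carries a gap value, shots below a threshold are discarded, and the logical error rate is computed over the shots that remain. The rule "keep if gap is at least the threshold" is an assumption about the direction of the comparison, and the shot list below is hand-made rather than drawn from the dataset.

```python
def post_select(shots, gap_threshold):
    """Keep shots whose gap is at least the threshold (assumed rule).

    `shots` is a list of (gap, is_logical_error) pairs.
    Returns (acceptance_rate, logical_error_rate_among_kept).
    """
    kept = [wrong for gap, wrong in shots if gap >= gap_threshold]
    acceptance = len(kept) / len(shots)
    error_rate = sum(kept) / len(kept) if kept else float("nan")
    return acceptance, error_rate


# Hand-made (gap, is_logical_error) pairs: errors cluster at small gaps.
shots = [(0.5, True), (1.0, True), (2.0, False), (3.0, True),
         (4.0, False), (6.0, False), (8.0, False), (9.0, False)]

for threshold in (0.0, 2.0, 4.0):
    acc, err = post_select(shots, threshold)
    print(f"threshold={threshold:.1f}  acceptance={acc:.3f}  error rate={err:.3f}")
```

Raising the threshold trades acceptance for fidelity: in this toy list the error rate among kept shots drops as more low-gap shots are discarded.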

2. Lattice Surgery Error Detection (lattice_surgery_error_detection)

Error detection scenario with perfect initialization and full post-selection.
  • Code: Surface code distance 5 (25 data qubits)
  • Distance: 5
  • Shots: 100,000
  • Focus: Detection capabilities with larger code

3. Surface Code Complementary Gap (surface_complementary_gap)

Surface code patch with complementary gap measurement.
  • Code: Surface code distance 3 (9 data qubits)
  • Distance: 3
  • Shots: 500,000
  • Focus: Surface code performance with gap metric

4. Inject and Cultivate (inject_cultivate)

Magic state injection and cultivation protocol using Steane code.
  • Code: Steane [[7,1,3]] (7 data qubits)
  • Distance: 3
  • Shots: 50,000
  • Focus: Magic state injection and cultivation fundamentals

5. Surface Code Expansion (surface_code_expansion)

Surface code distance expansion from d=3 to d=7.
  • Code: Surface code distance 7 (49 data qubits)
  • Distance: 7
  • Shots: 100,000
  • Focus: Scaling behavior with increased distance

💡 Use Cases

Quantum Computing Research

  • Error Rate Analysis: Compare logical error rates across different codes and distances
  • Resource Optimization: Study tradeoffs between fidelity and qubit-round costs
  • Post-Selection Strategies: Evaluate effectiveness of complementary gap thresholds
  • Scaling Studies: Analyze how performance scales with code distance
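
For the resource tradeoff, one common way to compare post-selected protocols is the expected cost per accepted state: the cost of one attempt divided by the acceptance rate. The sketch below assumes that `qubitrounds_cost` is the cost of a single attempt and that attempts are independent; the dataset does not say whether the column already accounts for retries, so treat `cost_per_accepted_state` as a hypothetical helper with made-up inputs.

```python
def cost_per_accepted_state(qubitrounds_cost, acceptance_rate):
    """Expected qubit-rounds spent per accepted state.

    With independent attempts, the expected number of attempts until one is
    accepted is 1 / acceptance_rate (mean of a geometric distribution).
    """
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return qubitrounds_cost / acceptance_rate


# Made-up numbers: the same per-attempt cost at two acceptance rates.
for acceptance in (0.8, 0.25):
    expected = cost_per_accepted_state(1000.0, acceptance)
    print(f"acceptance={acceptance:.2f}  expected cost={expected:.0f} qubit-rounds")
```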

Education & Teaching

  • QEC Fundamentals: Understand relationships between physical and logical error rates
  • Code Comparisons: Compare Steane vs. surface codes in practical scenarios
  • Protocol Visualization: Examine stage-by-stage success rates in magic state cultivation

Machine Learning Applications

  • Predictive Modeling: Train models to predict logical error rates from parameters
  • Anomaly Detection: Identify unusual simulation runs or parameter configurations
  • Feature Importance: Determine which parameters most impact success metrics
  • Optimization: Use ML to suggest optimal experiment configurations

📥 Loading the Dataset

Python (Pandas)

import pandas as pd

# Load the dataset
df = pd.read_parquet("aqora://aqora/magic-state-cultivation-with-lattice-surgery/v0.0.0")

# Display basic info
print(f"Total runs: {len(df)}")
print(f"Columns: {len(df.columns)}")
print(f"Experiment types: {df['experiment_type'].nunique()}")

# View first few rows
print(df.head())

# Get summary statistics
print(df.describe())
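
After loading, it can be useful to confirm that the columns described above are actually present. The sketch below uses only the standard library and checks a subset of the documented column names; the helper `missing_columns` is hypothetical, and the demo list stands in for `df.columns`.

```python
# A subset of the column names documented in the tables above.
EXPECTED_COLUMNS = {
    "identification": ["experiment_id", "run_id", "experiment_type", "run_number"],
    "results": ["logical_error_rate", "acceptance_rate", "qubitrounds_cost"],
    "stim": ["stim_circuit_definition", "stim_circuit_length"],
}


def missing_columns(actual_columns, expected=EXPECTED_COLUMNS):
    """Return {group: [absent column names]} for groups with gaps."""
    actual = set(actual_columns)
    report = {}
    for group, names in expected.items():
        absent = [name for name in names if name not in actual]
        if absent:
            report[group] = absent
    return report


# Stand-in for df.columns with two columns deliberately left out.
demo_columns = ["experiment_id", "run_id", "experiment_type", "run_number",
                "logical_error_rate", "acceptance_rate", "stim_circuit_definition"]
print(missing_columns(demo_columns))
```

With a loaded DataFrame, `missing_columns(df.columns)` returns an empty dict when every listed column is present.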

📊 Analysis Examples

Example 1: Error Rate Distribution by Experiment Type

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load data
df = pd.read_parquet("aqora://aqora/magic-state-cultivation-with-lattice-surgery/v0.0.0")

# Create visualization
plt.figure(figsize=(12, 6))
sns.boxplot(data=df, x='experiment_type', y='logical_error_rate')
plt.xticks(rotation=45, ha='right')
plt.title('Logical Error Rate Distribution by Experiment Type')
plt.ylabel('Logical Error Rate')
plt.xlabel('Experiment Type')
plt.tight_layout()
plt.show()

# Print statistics
stats = df.groupby('experiment_type')['logical_error_rate'].describe()
print(stats)

Example 2: Cost vs Fidelity Analysis

import pandas as pd
import matplotlib.pyplot as plt

# Load data
df = pd.read_parquet("aqora://aqora/magic-state-cultivation-with-lattice-surgery/v0.0.0")

# Calculate fidelity (1 - error_rate)
df['fidelity'] = 1 - df['logical_error_rate']

# Scatter plot
plt.figure(figsize=(10, 6))
for exp_type in df['experiment_type'].unique():
    subset = df[df['experiment_type'] == exp_type]
    plt.scatter(subset['qubitrounds_cost'], subset['fidelity'], 
                label=exp_type, alpha=0.7, s=100)

plt.xlabel('Qubit-Rounds Cost')
plt.ylabel('Logical Fidelity (1 - error rate)')
plt.title('Fidelity vs Resource Cost Tradeoff')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Example 3: Code Distance Impact

import pandas as pd
import matplotlib.pyplot as plt

# Load data
df = pd.read_parquet("aqora://aqora/magic-state-cultivation-with-lattice-surgery/v0.0.0")

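# Group by code distance.
# Caveat: distance 3 covers several code types (Steane, merged, surface d=3),
# so 'first' reports the data-qubit count of only one of them.
distance_analysis = df.groupby('code_distance').agg({
    'logical_error_rate': ['mean', 'std'],
    'num_data_qubits': 'first',
    'acceptance_rate': 'mean'
}).round(6)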

print("Impact of Code Distance:")
print(distance_analysis)

# Visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

# Error rate vs distance
df.groupby('code_distance')['logical_error_rate'].mean().plot(
    kind='bar', ax=ax1, color='coral')
ax1.set_title('Average Logical Error Rate by Code Distance')
ax1.set_ylabel('Logical Error Rate')
ax1.set_xlabel('Code Distance')

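# Qubit count vs distance (first() picks one code type per distance;
# group by 'code_type' instead to see every code separately)
df.groupby('code_distance')['num_data_qubits'].first().plot(
    kind='bar', ax=ax2, color='skyblue')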
ax2.set_title('Data Qubits Required by Code Distance')
ax2.set_ylabel('Number of Data Qubits')
ax2.set_xlabel('Code Distance')

plt.tight_layout()
plt.show()

Example 4: Protocol Stage Analysis

import pandas as pd
import matplotlib.pyplot as plt

# Load data
df = pd.read_parquet("aqora://aqora/magic-state-cultivation-with-lattice-surgery/v0.0.0")

# Extract stage success rates
stages = ['injection_success_rate', 'stabilize1_success_rate', 
          'surgery_success_rate', 'stabilize2_success_rate', 
          'epilogue_success_rate']

stage_names = ['Injection', 'Stabilize 1', 'Surgery', 'Stabilize 2', 'Epilogue']

# Calculate means
stage_means = [df[stage].mean() for stage in stages]

# Create bar plot
plt.figure(figsize=(10, 6))
bars = plt.bar(stage_names, stage_means, color='steelblue', alpha=0.7)
plt.axhline(y=0.95, color='red', linestyle='--', label='95% threshold')
plt.ylabel('Success Rate')
plt.title('Average Success Rate by Protocol Stage')
plt.ylim(0.85, 1.0)
plt.legend()
plt.grid(True, alpha=0.3, axis='y')

# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{height:.3f}',
             ha='center', va='bottom')

plt.tight_layout()
plt.show()

print("\nStage-by-Stage Performance:")
for name, rate in zip(stage_names, stage_means):
    print(f"{name:15s}: {rate:.4f}")

🔬 Research Context

This dataset implements simulations based on:
Yutaka Hirano. "Efficient Magic State Cultivation with Lattice Surgery." arXiv preprint arXiv:2510.24615 (2025).
Key Innovations:
  • Complementary gap post-selection for improved magic state fidelity
  • Lattice surgery protocols for efficient multi-qubit operations
  • Integration of Steane and surface codes for optimized resource usage
Simulation Framework:
  • Primary Tool: Stim (v1.15 or later)
  • Decoder: PyMatching (minimum-weight perfect matching)
  • Language: Python 3.8+