MNISQ Fashion v1.0.0 – Quantum Computing Dataset

About this dataset version

MNISQ-fashion: Clothing (MNIST-Fashion) Quantum Circuit Subset

URL: https://aqora.io/datasets/leopla/mnisq-fashion

Total size: 1,860,000 circuits

This subset mirrors all MNIST-Fashion circuits released as MNISQ across three fidelity thresholds (f80, f90, f95), both encodings (DenseMatrix/Qulacs and Base/portable gates), and all official splits.

This is a convenience mirror for reproducibility: it introduces no new circuits and only consolidates the MNIST-Fashion domain of MNISQ in one place.

Coverage & counts

Per fidelity tier (f80/f90/f95):

Splits included: train_orig (60k) / base_train_orig (60k) / train (480k) / test (10k) / base_test (10k)
Sum per fidelity: 60k + 60k + 480k + 10k + 10k = 620k
Fidelity tiers: 3 → 3 × 620k = 1,860,000 circuits
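
The arithmetic above can be checked in a few lines. This is a minimal sketch; the names SPLIT_SIZES and FIDELITY_TIERS are illustrative and not part of the dataset or any Aqora API.

```python
# Split sizes per fidelity tier, copied from the list above.
SPLIT_SIZES = {
    "train_orig": 60_000,
    "base_train_orig": 60_000,
    "train": 480_000,
    "test": 10_000,
    "base_test": 10_000,
}
FIDELITY_TIERS = ("f80", "f90", "f95")

# Circuits in one fidelity tier, then across all tiers.
per_tier = sum(SPLIT_SIZES.values())
total = per_tier * len(FIDELITY_TIERS)

print(f"per tier: {per_tier:,}  total: {total:,}")
```

Comparing these numbers against row counts grouped by fidelity_bucket and split is a quick sanity check after download.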

Why two encodings?

DenseMatrix (Qulacs-optimized): highest-fidelity simulation via dense operators in Qulacs.
Base (portable gates): standard gate set for cross-tool compatibility (Qiskit, Cirq, PennyLane).
Use DenseMatrix for Qulacs performance studies; use Base for simulator/toolchain comparisons.

Data schema

All records ship with the following columns:

dataset: always "Fashion-MNIST"
split: train_orig | base_train_orig | train | test | base_test
is_base: true for the two base_* splits, else false
fidelity_bucket: "f80" | "f90" | "f95"
fidelity_min: 0.80 | 0.90 | 0.95
fidelity_value: exact fidelity for the example
filename: internal path (e.g., qasm/1234)
n_qubits: circuit width
has_dense_operator: true for DenseMatrix variants, false for Base
label: class label (0–9)
qasm: OpenQASM text of the circuit (some entries may be OpenQASM 3)
state_gz: optional gzipped amplitudes (one line per complex pair: real imag)
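
The column descriptions above imply a few cross-column invariants. The sketch below checks them on a single record given as a plain dict. The helper name check_record is hypothetical, and two of the checks are assumptions rather than documented guarantees: that fidelity_value is never below fidelity_min, and that has_dense_operator is always the opposite of is_base.

```python
import math

SPLITS = {"train_orig", "base_train_orig", "train", "test", "base_test"}
BUCKET_TO_MIN = {"f80": 0.80, "f90": 0.90, "f95": 0.95}


def check_record(rec):
    """Return a list of problems found in one record; empty means consistent."""
    problems = []
    split = rec["split"]
    if split not in SPLITS:
        problems.append(f"unknown split: {split!r}")
    # is_base should be true exactly for the base_* splits.
    if rec["is_base"] != split.startswith("base_"):
        problems.append("is_base does not match the split name")
    # fidelity_min should be the threshold of the bucket (tolerant of float32).
    expected_min = BUCKET_TO_MIN.get(rec["fidelity_bucket"])
    if expected_min is None or not math.isclose(
        rec["fidelity_min"], expected_min, abs_tol=1e-6
    ):
        problems.append("fidelity_min does not match fidelity_bucket")
    # Assumption: each circuit meets the threshold of its own bucket.
    if rec["fidelity_value"] < rec["fidelity_min"] - 1e-6:
        problems.append("fidelity_value is below fidelity_min")
    # Assumption: DenseMatrix and Base are mutually exclusive encodings.
    if rec["has_dense_operator"] == rec["is_base"]:
        problems.append("has_dense_operator should be the opposite of is_base")
    if rec["label"] not in range(10):
        problems.append(f"label out of range: {rec['label']!r}")
    return problems


# Made-up example record, for illustration only.
example = {
    "split": "base_test", "is_base": True, "fidelity_bucket": "f90",
    "fidelity_min": 0.90, "fidelity_value": 0.93,
    "has_dense_operator": False, "label": 7,
}
print(check_record(example))
```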

Quickstart (Python)

Namespace + version used below: leopla/mnisq-fashion, v1.0.0.
state_gz exists for DenseMatrix entries and is absent for Base entries.

1) Load a slice and grab `qasm` + `state_gz`

# --- Polars (lazy scan, then collect a small slice) ---
import polars as pl
from aqora_cli.pyarrow import dataset

lf = pl.scan_pyarrow_dataset(dataset("leopla/mnisq-fashion", "v1.0.0"))
df = lf.head(100).collect()  # limit rows before collecting; the full set is 1.86M circuits
qasm_str = df["qasm"][0]
sgz = df["state_gz"][0]  # None for Base entries

# --- pandas (eager: reads the whole dataset into memory) ---
import pandas as pd
df = pd.read_parquet("aqora://leopla/mnisq-fashion/v1.0.0")
qasm_str = df["qasm"].iloc[0]
sgz = df["state_gz"].iloc[0]  # NaN/None for Base entries

2) Decompress state_gz → NumPy complex vector (if present)

import gzip, numpy as np
state = None
if sgz is not None and not (isinstance(sgz, float) and np.isnan(sgz)):
    if isinstance(sgz, memoryview):
        sgz = sgz.tobytes()
    if isinstance(sgz, bytearray):
        sgz = bytes(sgz)
    text = gzip.decompress(sgz).decode("utf-8")
    pairs = [tuple(map(float, line.split())) for line in text.splitlines() if line.strip()]
    state = np.array([complex(r, i) for r, i in pairs], dtype=np.complex128)
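
To test the decoding logic without downloading anything, a round trip on synthetic data can help. The sketch below uses only the standard library and assumes the layout described in the schema (one "real imag" pair per line, gzipped). The helpers encode_state and decode_state are hypothetical names, and the encoder exists only to build test input; it is not how the dataset was produced.

```python
import gzip


def encode_state(amplitudes):
    """Serialize complex amplitudes as gzipped 'real imag' lines (assumed layout)."""
    text = "\n".join(f"{a.real!r} {a.imag!r}" for a in amplitudes)
    return gzip.compress(text.encode("utf-8"))


def decode_state(blob):
    """Parse a gzipped 'real imag' blob into a list of Python complex numbers."""
    # bytes() also accepts bytearray and memoryview inputs.
    text = gzip.decompress(bytes(blob)).decode("utf-8")
    return [
        complex(*map(float, line.split()))
        for line in text.splitlines()
        if line.strip()
    ]


# Synthetic two-amplitude state with unit norm (0.36 + 0.64).
original = [0.6 + 0j, 0.8j]
decoded = decode_state(encode_state(original))
norm = sum(abs(a) ** 2 for a in decoded)
print(decoded, norm)
```

On real entries, a squared norm close to 1 is a cheap check that the amplitudes were parsed in the right order and with the right column layout.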

3) Use the circuit in your favorite framework

Qiskit (QASM 2/3), PennyLane (QASM where available, otherwise StatePrep on the amplitudes), and Cirq (QASM 2) follow the same patterns as in mnisq.
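
Because some entries may be OpenQASM 3, it can help to inspect the version header before choosing a parser. The sketch below is one way to do that. qasm_major_version and load_with_qiskit are hypothetical helper names. The Qiskit branch relies on qiskit.qasm2.loads and qiskit.qasm3.loads, where the latter needs the optional qiskit-qasm3-import package. Whether a given entry parses depends on the gates it uses, so this is assumed to be most relevant for Base entries.

```python
import re
from typing import Optional

# Matches "OPENQASM 2.0;" as well as "OPENQASM 3;" at the start of a line.
_HEADER = re.compile(r"^\s*OPENQASM\s+(\d+)(?:\.\d+)?\s*;", re.MULTILINE)


def qasm_major_version(qasm: str) -> Optional[int]:
    """Return the major OpenQASM version from the header, or None if missing."""
    match = _HEADER.search(qasm)
    return int(match.group(1)) if match else None


def load_with_qiskit(qasm: str):
    """Parse a QASM string with Qiskit, choosing the parser from the header."""
    # Imported lazily so the version helper works without Qiskit installed.
    from qiskit import qasm2, qasm3

    if qasm_major_version(qasm) == 3:
        return qasm3.loads(qasm)
    # Assumption: entries without a version header are treated as OpenQASM 2.
    return qasm2.loads(qasm)


sample = 'OPENQASM 2.0;\ninclude "qelib1.inc";\nqreg q[1];\nh q[0];\n'
print(qasm_major_version(sample))
```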

Visualize an encoded image (28×28)

Use either the provided amplitudes (state_gz) or simulate from QASM, as shown in mnisq.
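
As a rough sketch of the amplitude route: a 28×28 image has 784 pixels, so an amplitude encoding needs at least 10 qubits (1,024 amplitudes). The code below assumes the first 784 amplitudes hold the pixels in row-major order and that their magnitudes correspond to intensities. Both are assumptions to verify against the mnisq examples, and amplitudes_to_image is a hypothetical helper name.

```python
def amplitudes_to_image(state, side=28):
    """Map the first side*side amplitudes to a side x side grid of magnitudes.

    Assumes row-major pixel order; values are rescaled so the peak is 1.0.
    """
    n_pixels = side * side
    if len(state) < n_pixels:
        raise ValueError(f"need at least {n_pixels} amplitudes, got {len(state)}")
    mags = [abs(a) for a in state[:n_pixels]]
    peak = max(mags) or 1.0  # avoid division by zero for an all-zero input
    return [
        [mags[row * side + col] / peak for col in range(side)]
        for row in range(side)
    ]


# Synthetic 10-qubit vector (not normalized) with two non-zero "pixels".
state = [0j] * 1024
state[0] = 1 + 0j      # row 0, col 0
state[29] = 0.5 + 0j   # row 1, col 1
image = amplitudes_to_image(state)
print(len(image), len(image[0]), image[0][0], image[1][1])
```

The resulting nested list can be passed to any image plotting routine that accepts a 2D array.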

Reproducibility & versions

Pin a version in your code and paper (e.g., v1.0.0) for exact reproducibility.
Use deterministic seeds across Python/NumPy/your QC library to align shuffles, inits, and simulator draws.
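
One possible way to centralize seeding is sketched below. seed_everything and shuffled_indices are hypothetical helper names; NumPy is seeded only if it is installed, and simulator seeds are library-specific, so they are left out here.

```python
import random


def seed_everything(seed: int) -> None:
    """Seed Python's global RNG and, if available, NumPy's legacy global RNG."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy is optional for this sketch
    np.random.seed(seed)


def shuffled_indices(n: int, seed: int) -> list:
    """Return a reproducible permutation of range(n) using a local RNG."""
    rng = random.Random(seed)  # local generator, independent of global state
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


seed_everything(1234)
first = shuffled_indices(10, seed=1234)
second = shuffled_indices(10, seed=1234)
print(first == second)
```

Using a local generator for data shuffles keeps them reproducible even when other code draws from the global RNG in between.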

Benchmarks (as reported in the MNISQ paper)

Quantum kernels: up to ~97% accuracy.
Classical sequence models (e.g., S4 on tokenized QASM): ~77%.

License & attribution

The original MNISQ dataset is released under CC BY-SA 4.0. Please comply with ShareAlike and attribution terms.
This Aqora entry is a repackaged mirror for convenience; credit the original authors and include the Aqora dataset URL + pinned version you used.

How to cite

Please include both citations: the original paper and the Aqora dataset entry you actually used (with version).

1) Original MNISQ paper

@misc{placidi2023mnisq,
  title         = {MNISQ: A Large-Scale Quantum Circuit Dataset for Machine Learning on/for Quantum Computers in the NISQ era},
  author        = {Placidi, Leonardo and Hataya, Ryuichiro and Mori, Toshio and Aoyama, Koki and Morisaki, Hayata and Mitarai, Kosuke and Fujii, Keisuke},
  year          = {2023},
  eprint        = {2306.16627},
  archivePrefix = {arXiv},
  primaryClass  = {quant-ph},
  doi           = {10.48550/arXiv.2306.16627},
  url           = {https://arxiv.org/abs/2306.16627}
}

2) Aqora dataset entry (pin your version)

@misc{aqora_mnisq_fashion,
  title        = {MNISQ-Fashion: Clothing (MNIST-Fashion) Quantum Circuit Subset (Aqora mirror)},
  howpublished = {\url{https://aqora.io/datasets/leopla/mnisq-fashion}},
  note         = {Aqora Datasets Hub. Please cite the pinned version you used, e.g., v1.0.0},
  year         = {2025},
  publisher    = {Aqora}
}

If your venue supports @dataset, you can switch the entry type accordingly.

Related datasets

mnisq (all domains)
mnisq-784 (digits)
mnisq-kuzushiji (Japanese characters)

Provenance

MNISQ was introduced by Placidi et al. (2023). This page scopes the MNIST-Fashion domain with all official splits and both encodings for convenience and reproducible benchmarking.