Josep Lumbreras, Ruo Cheng Huang, Yanglin Hu, Mile Gu, Marco Tomamichel (May 15 2025).
Abstract: We investigate work extraction protocols designed to transfer the maximum possible energy to a battery using sequential access to N copies of an unknown pure qubit state. The core challenge is to design interactions that balance two competing goals: charging the battery as much as possible with the qubit in hand, and acquiring more information from each qubit to improve energy harvesting in subsequent rounds. Here, we leverage the exploration-exploitation trade-off in reinforcement learning to develop adaptive strategies whose energy dissipation scales only poly-logarithmically in N. This represents an exponential improvement over current protocols based on full state tomography.