
Repetitive readout enhanced by machine learning


Published 4 February 2020 © 2020 The Author(s). Published by IOP Publishing Ltd
Citation: Genyue Liu et al 2020 Mach. Learn.: Sci. Technol. 1 015003, DOI 10.1088/2632-2153/ab4e24

2632-2153/1/1/015003

Abstract

Single-shot readout is a key component for scalable quantum information processing. However, many solid-state qubits with favorable properties lack the single-shot readout capability. One solution is to use the repetitive quantum-non-demolition readout technique, where the qubit is correlated with an ancilla, which is subsequently read out. The readout fidelity is therefore limited by the back-action on the qubit from the measurement. Traditionally, a threshold method is used, where only the total photon count serves to discriminate the qubit state, discarding all the information about the back-action hidden in the time trace of the repetitive readout measurement. Here we show that, by using machine learning (ML), one obtains higher readout fidelity by taking advantage of the time trace data. ML is able to identify when back-action happened, and correctly read out the original state. Since the information is already recorded (but usually discarded), this improvement in fidelity does not consume additional experimental time, and could be directly applied to preparation-by-measurement and quantum metrology applications involving repetitive readout.


Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Single-shot readout is a key component for scalable quantum information processing [1, 2], for its close connection to state initialization and fault-tolerant quantum error correction [3]. Indeed, it is one of the main deciding factors in the selection of potential qubits. Single-shot readout has been achieved in various physical qubit systems, ranging from neutral atoms [4–6], to trapped ions [7], superconducting qubits [8], and solid-state defect centers [9–16]. There are however situations where a candidate qubit has favorable coherence properties, but does not naturally come with single-shot readout capabilities. Examples include Al+ ions [17, 18] and room-temperature nitrogen-vacancy (NV) centers in diamond [12–16], where a closed optical cycle for readout is either lacking, or experimentally challenging. A solution to this problem is through repetitive quantum-non-demolition (QND) measurements [18].

In the repetitive QND protocol, a controlled-NOT (CNOT) gate is applied to correlate the qubit state to an ancilla, which is subsequently read out (figure 1(a)). If the readout operator commutes with the qubit's intrinsic Hamiltonian, in other words, if the readout is QND, one can repeat the above process multiple times to increase signal-to-noise ratio, until the desired fidelity is reached.


Figure 1. (a) Quantum circuit for repetitive quantum-non-demolition readout of the nuclear spin state $\left|{\psi }_{n}\right\rangle $, using the ancilla electronic spin ($\left|{0}_{e}\right\rangle $). Here we assume, for example, that the gate maps the $\left|{m}_{I}=0\right\rangle $ nuclear spin state to the NV $\left|{m}_{s}=0\right\rangle $ state and the $\left|{m}_{I}=+1\right\rangle $ state to the $\left|{m}_{s}=+1\right\rangle $ state. (b) A typical histogram of total photon numbers collected from repetitive readout, originating from bright (red, $\left|{m}_{I}=0,-1\right\rangle $) and dark (grey, $\left|{m}_{I}=+1\right\rangle $) states, generated by simulation. A threshold at the crossing point classifies future readout results in the threshold method. (c) Shallow neural network architecture of the MATLAB® Neural Net Pattern Recognition tool (nprtool), with sigmoid activation function and softmax output. nprtool only allows users to change the number of neurons in the hidden layer for high dimensional data. The ML input is the time trace of single photon detector clicks ${x}_{k}$ (at repetition k) in an individual repetitive readout experiment, and we take the cumulative sum ('cumsum') ${\bar{x}}_{i}={\sum }_{k=1}^{i}{x}_{k}$ of individual time traces before feeding the data to the neural network. W1 (W2) and b1 (b2) are the weights and bias of the hidden (output) layer, which are learnable parameters of the network. The output is the probability p1 (p2) of the state being dark (bright).


This protocol is also known as the repetitive readout technique widely adopted in NV research at room-temperature, where the nuclear spin state (here the 14N or a 13C) is repetitively read out with the help of the NV electronic spin [12, 19]. In its implementations so far, the spin state was determined by comparing the total photon number collected through all the repetitive readouts with a previously established threshold (figure 1(b)). The detected photon count numbers are thus divided into two classes, referred to as bright (dark) state of the qubit.

In this threshold method (TM), the readout infidelity can be evaluated from the overlap between the photon count distributions of bright and dark states. Two factors contribute to this overlap: inefficient optical readout [20], including photon shot noise and limited photon collection efficiency; and deviation from the QND condition. The first factor can be improved by embedding the emitter into photonic structures and by using better single photon detectors. The second factor imposes a more fundamental constraint. Indeed, if the readout operator does not fully commute with the system Hamiltonian, back-action from the measurement will eventually limit the number of photons that can be collected before quantum information is destroyed [21, 22].
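As an illustration of the TM, the following minimal Python sketch picks a threshold from labelled calibration counts and then classifies a shot by its total photon number. It assumes that the threshold is chosen by scanning integer values and maximizing the mean of the bright and dark correct-assignment fractions, which approximates the crossing point of the two histograms; the function names and the count values are hypothetical and are not taken from the simulation.

```python
def best_threshold(bright_counts, dark_counts):
    """Scan integer thresholds and return (threshold, fidelity).

    A shot is called bright when its total count is >= threshold.
    Assumption: the figure of merit is the mean of the bright and dark
    correct-assignment fractions.
    """
    best = (None, -1.0)
    for th in range(min(dark_counts), max(bright_counts) + 2):
        f_bright = sum(c >= th for c in bright_counts) / len(bright_counts)
        f_dark = sum(c < th for c in dark_counts) / len(dark_counts)
        fidelity = 0.5 * (f_bright + f_dark)
        if fidelity > best[1]:  # strict '>' keeps the lowest optimal threshold
            best = (th, fidelity)
    return best


def classify(total_count, threshold):
    """Threshold method: only the total photon number is used."""
    return "bright" if total_count >= threshold else "dark"


# Hypothetical calibration data: total photon counts per single-shot readout.
bright = [212, 205, 198, 220, 187, 201]  # 187 mimics a shot perturbed by a flip
dark = [150, 162, 158, 171, 190, 149]    # 190 mimics a shot perturbed by a flip
threshold, fidelity = best_threshold(bright, dark)
```

In this toy data set the dark shot with 190 counts ends up above the threshold and is misassigned, which is the kind of error the time trace can help to identify.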

To mitigate this effect, we propose to use the additional information carried by the measurement-induced state perturbation itself. Information about the perturbation is already recorded during typical experiments, in the form of the time trace of photon clicks from the repetitive readouts (figure 1(c)), but is usually discarded in the TM after extracting the total photon number. Identifying the perturbation and tracing back to the unperturbed original state using this information is the key to improving the fidelity of readout.

Unfortunately, finding an elegant analytical approach proves difficult: the photodynamics is intrinsically random, and the inefficient photon collection process yields noisy data, precluding a clean analytical treatment that would take advantage of the additional information. On the other hand, machine learning (ML) is designed to discover hidden data correlations, and it is widely used in classification problems [23]. It has been recently introduced in quantum information tasks to mitigate crosstalk in multi-qubit readout [24], to enhance quantum metrology [25, 26], to identify quantum phases of matter and phase transitions [27–29], to identify entanglement [30–32], and even to determine the existence of quantum advantage [33], to name a few. In particular, ML has shown success in the efficient interpretation of quantum state tomography (QST), by being robust to partial QST and state-preparation-and-measurement (SPAM) errors [32, 34–36].

In this work, we apply ML to state discrimination for the repetitive readout of the NV center. To design and evaluate the ML method, we use the full information from time trace data generated by quantum Monte-Carlo simulation. We tried different supervised ML methods and mainly focused on a shallow neural network realized using the MATLAB® Neural Net Pattern Recognition tool (nprtool). We observed a consistent increase in readout fidelity using ML over TM. The improvement in readout fidelity, albeit small, is robust over a parameter space that covers individual NV differences. One application of our results is in preparation-by-measurement: when one discards less trustworthy measurements, ML yields a more efficient initialization process than TM.

Since in our method the training labels are available in experiments with very high fidelity [12–16], it can be directly applied to current experiments. Together with the robustness of our method over NV photodynamic parameters, we expect that the improved readout fidelity can be achieved in experiments.

2. Repetitive readout model and simulation

We consider reading out the native 14N nuclear spin state through the electronic spin of NV center at room-temperature as an example. The NV center's ground state is an electronic spin triplet (S = 1), and can be optically polarized to the $\left|{m}_{s}=0\right\rangle $ state. The other two sublevels $\left|{m}_{s}=\pm 1\right\rangle $ have additional non-radiative decay channels under optical illumination, allowing optical readout of spin state by fluorescence intensity. The native 14N nuclear spin is a nuclear spin-1 (I = 1), and couples to the NV center through hyperfine interaction. 14N does not have optical readout, but it supports a CnNOTe operation (control on nuclear spin and NOT gate on electronic spin): $\left|{m}_{s}=0,{m}_{I}=+1\right\rangle \leftrightarrow \left|{m}_{s}=+1,{m}_{I}=+1\right\rangle $, and $\left|{m}_{s},{m}_{I}=0,-1\right\rangle \leftrightarrow \left|{m}_{s},{m}_{I}=0,-1\right\rangle $, which correlates the 14N to the NV state.

In the repetitive readout protocol, the NV starts in $\left|{m}_{s}=0\right\rangle $, and a CNOT gate correlates the nuclear spin state to NV. A green laser then reads out the NV state, while also repolarizing it back to $\left|{m}_{s}=0\right\rangle $. Under high magnetic field, where the NV and 14N energies are well separated, this process is approximately QND and can be repeated a few thousand times to accumulate signal, discriminating the bright $\left|{m}_{I}=0,-1\right\rangle $ (dark $\left|{m}_{I}=+1\right\rangle $) state of 14N in a single shot (figure 1). Still, the high magnetic field cannot fully eliminate back-action of the measurement on 14N, which is caused by the relatively strong excited state transverse hyperfine interaction ${A}_{\perp }({S}_{+}{I}_{-}+{S}_{-}{I}_{+})$. This perturbation causes flip-flops between NV and the 14N, destroying the quantum information. In the TM, this perturbation prevents us from continuing to accumulate useful signal and reduces the fidelity of state discrimination. ML, instead, as we find, can identify the majority of such flips and therefore improve the readout fidelity. Ultimately, the readout fidelity is limited by flips that occur very early during repetitive readout.
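To make the structure of a single-shot time trace concrete, the following Python sketch generates detector clicks from a deliberately simplified two-state toy model. It is not the 33-level model used in this work: the nuclear spin is reduced to a bright/dark flag, back-action is a constant flip probability per repetition, and each repetition yields at most one click. The function name and all numerical values are hypothetical.

```python
import random


def simulate_trace(start_bright, n_rep, p_flip, p_bright, p_dark, rng):
    """Return (trace, flipped) for one single-shot repetitive readout.

    trace   : list of 0/1 detector clicks, one entry per repetition
    flipped : True if the nuclear spin flipped at least once
    All probabilities are per repetition and are toy assumptions.
    """
    is_bright = start_bright
    flipped = False
    trace = []
    for _ in range(n_rep):
        # Back-action: with a small probability the nuclear spin flips.
        if rng.random() < p_flip:
            is_bright = not is_bright
            flipped = True
        # Photon detection depends on the current (possibly flipped) state.
        p_click = p_bright if is_bright else p_dark
        trace.append(1 if rng.random() < p_click else 0)
    return trace, flipped


rng = random.Random(0)  # fixed seed so the example is reproducible
trace, flipped = simulate_trace(True, 2000, 1e-4, 0.10, 0.07, rng)
total = sum(trace)  # the only quantity used by the threshold method
```

A flip changes the click probability for all later repetitions, so it appears as a kink in the cumulative count rather than as a change in any single repetition.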

We used simulated data to explore the effectiveness of ML in repetitive readout and to better analyze the source of improvement. To fully capture the photodynamics involved in the repetitive readout process, we employed a 33-level model, considering the NV electronic and 14N nuclear spins and the neutrally charged NV0 state. The model is described in more detail in the appendix. Most transition rates in the model were accurately measured from independent experiments [37–40] and we use values from Gupta et al [39]. The excited state NV-14N transverse hyperfine interaction strength and NV to NV0 (de)ionization rate at strong laser power were not precisely determined before, and therefore a reasonable range is explored to cover possible variations in individual NVs, based on the results from [12, 13, 41, 42].

In the simulation, we assumed an intermediate magnetic field of 7500 G typical for repetitive readout experiments, and a photon collection efficiency of 30%, standard with photonic structures like solid immersion lenses or parabolic mirrors on the diamond [43–45]. A perfect CNOT gate connecting $\left|{m}_{s}=0,{m}_{I}=+1\right\rangle \leftrightarrow \left|{m}_{s}=+1,{m}_{I}=+1\right\rangle $ was assumed. Correspondingly, the dark state is $\left|{m}_{I}=+1\right\rangle $, and the bright state is $\left|{m}_{I}=0,-1\right\rangle $.

We remark that it is possible to use the same protocol to read out 13C rather than 14N [13–16], given well-characterized hyperfine interaction strengths [46–49].

3. Neural network architecture

The network in nprtool is a two-layer feed-forward neural network (figure 1(c)). In all trainings, we used a data set of size 10 000 with a random portion of 15% for validation. The input data is the time trace of single photon detector clicks through the repetitive readout process (figure 1(c)). Because the total photon count is a good metric for state discrimination, we take the cumulative sum of the time trace of photon detection {xk} before feeding it to the neural network, ${\bar{x}}_{i}={\sum }_{k=1}^{i}{x}_{k}$. Out of the 10 000 traces, half are dark state $\left|{m}_{I}=+1\right\rangle $, while the other half are bright with a 1:1 ratio between $\left|{m}_{I}=0\right\rangle $ and $\left|{m}_{I}=-1\right\rangle $. After training, we used a test set of size 4000, which was generated in the same way as the training set but not used in training, to independently test the network. We performed Monte Carlo cross-validation, which typically repeated the aforementioned training process 10 times, and the average accuracy was used throughout this work. Error bars represent the standard error of the 10 results.

We found that approximately 12.5 neurons per 1000 repetitions was a good balance between the increase in fidelity and avoidance of overfitting.
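The data flow through such a network can be sketched in a few lines of Python. This is not nprtool and performs no training: the weights below are random placeholders, any input scaling that a toolbox may apply is omitted, and the names `hidden_size`, `sigmoid` and `forward` are hypothetical. The sketch only shows the cumulative-sum preprocessing, a sigmoid hidden layer sized by the rule of about 12.5 neurons per 1000 repetitions, and a softmax output over (dark, bright).

```python
import math
import random
from itertools import accumulate


def hidden_size(n_rep, per_thousand=12.5):
    """Hidden-layer width from the ~12.5 neurons per 1000 repetitions rule."""
    return max(1, round(per_thousand * n_rep / 1000))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(trace, w1, b1, w2, b2):
    """Cumulative sum -> sigmoid hidden layer -> softmax (p_dark, p_bright)."""
    x = list(accumulate(trace))  # x_bar_i = sum_{k<=i} x_k
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    m = max(logits)  # subtract the maximum for a stable softmax
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(1)
n_rep = 2000
n_hidden = hidden_size(n_rep)
trace = [1 if rng.random() < 0.1 else 0 for _ in range(n_rep)]
# Untrained placeholder weights; in practice they are learned from labelled traces.
w1 = [[rng.gauss(0, 1e-4) for _ in range(n_rep)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
w2 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(2)]
b2 = [0.0, 0.0]
p_dark, p_bright = forward(trace, w1, b1, w2, b2)
```

With this rule the hidden layer has 25 neurons for 2000 repetitions and 100 neurons for 8000 repetitions.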

4. Results

We first investigate the influence of the repetition number on readout fidelity. The fidelity F across this manuscript is defined as

$F=\tfrac{1}{2}\left({F}_{\mathrm{bright}}+{F}_{\mathrm{dark}}\right)\qquad (1)$

where Fbright and Fdark are the percentage of bright and dark states that are correctly read out, respectively.
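The bookkeeping behind the reported numbers can be sketched as follows. The Python sketch assumes that F is the unweighted mean of Fbright and Fdark, which coincides with the overall accuracy for a balanced test set, and that the quoted uncertainty is the sample standard deviation divided by the square root of the number of trainings. The function names and the example labels are hypothetical.

```python
import math
import statistics


def readout_fidelity(true_labels, predicted_labels):
    """Return (F, F_bright, F_dark) for labels given as 'bright'/'dark'.

    Assumption: F is the unweighted mean of F_bright and F_dark.
    """
    def fraction_correct(state):
        idx = [i for i, t in enumerate(true_labels) if t == state]
        return sum(predicted_labels[i] == state for i in idx) / len(idx)

    f_bright = fraction_correct("bright")
    f_dark = fraction_correct("dark")
    return 0.5 * (f_bright + f_dark), f_bright, f_dark


def mean_and_standard_error(values):
    """Mean and standard error over repeated trainings."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical example: 4 bright and 4 dark shots, one dark shot misassigned.
truth = ["bright"] * 4 + ["dark"] * 4
pred = ["bright"] * 4 + ["dark", "dark", "dark", "bright"]
F, F_bright, F_dark = readout_fidelity(truth, pred)
mean_F, se_F = mean_and_standard_error([0.97, 0.98, 0.99])
```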

The number of repetitions influences the readout fidelity in two ways: 1. A larger repetition number means more photons detected and better separation between the photon count distributions of the bright and dark states (figure 1(b)). 2. A larger repetition number, however, also implies a longer illumination time and a higher probability for the 14N nuclear spin to flip, due to the large transverse hyperfine interaction in the excited state, which mixes the photon count distributions of the two initially different states. As a result of these competing effects, there is an optimal repetition number Nopt for the TM. On the other hand, the readout fidelity from ML keeps improving as we increase the repetition number, even if the rate of increase slows down (figure 2(a)). At Nopt, we observed a 0.34% increase in fidelity with ML. Since the time trace input for ML is recorded in all experiments even when intended for TM, this improved fidelity does not consume additional experimental time. One can add more repetitions in the experiment, and harness a further increase of as much as 0.57% in readout fidelity (compared to TM at Nopt). The improvement at N > Nopt suggests that ML is not only more robust against 14N flips, but also extracts useful information from the flips. This is investigated in more detail later.


Figure 2. (a) Readout fidelity as a function of repetition number N in the repetitive readout. The fidelity from TM (grey) declines after Nopt = 2375 due to the increasing probability of 14N nuclear spin flips. The fidelity from ML keeps improving, although the rate of increase slows down. For each repetition number, we retrain the network and take the average fidelity over 10 trainings. Error bars are the standard error of the 10 training results and are smaller than the markers. Simulation parameters: {kion = 90β MHz, A⊥ = −50 MHz}. (b) Fidelity comparison of TM at its optimal repetition number Nopt, ML at Nopt, and ML at N = 8000 under different NV parameters. The values of Nopt were (from left to right): 2000, 2375, 2750, 3125 and 2750, respectively. Error bars are the standard error of 10 training results.


As mentioned earlier, the excited state transverse hyperfine interaction strength ${A}_{\perp }$ between NV and 14N, and the (de)ionization rate kion(kdeion) between NV and NV0 under strong illumination, have not yet been determined to satisfactory precision. We therefore explored a parameter range to cover realistic values one might encounter in experiment: ${A}_{\perp }=\{-30,-40,-50\}$ MHz and ${k}_{\mathrm{ion}}=\{70,90,110\}\times \beta $ MHz, where β is a unit-less value proportional to laser power. In the simulation, we choose β such that for any combination of parameters the NV would emit the same total number of photons in the bright state during repetitive readout. Comparisons of TM at Nopt, ML at Nopt and ML at N = 8000 are shown in figure 2(b) under different ${A}_{\perp },{k}_{\mathrm{ion}}$. The trend matches figure 2(a). ML consistently outperforms TM with both repetition numbers chosen.

To better understand how ML achieves higher fidelity, we take a closer look at cases where 14N experienced flip-flops in the excited state, which is a major limit to the TM fidelity. We find the neural network is able to extract information from the time trace input to recognize if a flip has occurred, and recover the original state. Such flips could bring the photon count across the threshold, yielding misclassification when using TM. This is shown in figure 3, where we plot the cumulative sum of the time traces in cases where flip(s) occurred. In figure 3(a), ML correctly assigns all these time traces to their original states, while TM looks only at the total photon count at the end and compares it to the threshold (dashed line), making ∼25% wrong decisions. In figure 3(b), we show instances when ML gave the wrong classification. We notice that in those cases, the 14N flip-flops happen at the very beginning, making the time traces indistinguishable from those of the opposite initial state with no flips. There is little hope in correctly reading out these states, posing an ultimate limit to the readout fidelity.


Figure 3. Cumulative number of photons as a function of read out repetitions. Each trace corresponds to one input to the neural network. All traces shown here experienced at least one 14N flip, and are (a) correctly or (b) wrongly assigned by ML. The larger number of traces in (a) (93.78% of the total number of traces considered) reflects the high fidelity of the ML readout. In contrast, the TM only looks at the final photon number and compares it to the threshold (dashed line), assigning roughly 25% in (a) and all in (b) to the wrong state. In the figures, red lines represent time traces starting in bright state, grey in dark state; the dashed line is the threshold for N = 8000.


Another important objective of ML is that of generalization. We explore this generalization power by testing the network R trained by {kion = 90β MHz, A⊥ = −50 MHz} on data generated with different parameters.

First, we test the network R on a different (de)ionization rate {kion = 110β MHz, A⊥ = −50 MHz}, obtaining a fidelity of 94.4(1)% from the network R, compared to 96.31(4)% from TM. We attribute this deteriorated performance of ML to the change in the photodynamics. Under the same condition, a different kion changes the relative distributions of bright and dark states. This change cannot be compensated by laser intensity, and makes the network R obsolete.

We then tested the network R on data of different transverse hyperfine strengths, A⊥ = {−40, −30} MHz. Intuitively, a small change in A⊥ does not change the photoluminescence pattern, but rather modifies the 14N flip-flop rate a little, which could be captured by the network, given its ability to recognize the occurrence of flip-flops. Indeed, we observed better fidelity from the network R on A⊥ = −40 MHz data than TM, and comparable fidelity to TM on A⊥ = −30 MHz, where the parameter has changed by 40% (table 1). Here we used Nopt of the test data for both ML and network R. These results indicate that, provided variations in the NV parameters are small, it is possible to use a fixed network R to directly read out any NV, without the need to run experiments to generate the training data.

Table 1. Robustness test of network R trained with {kion = 90β MHz, A⊥ = −50 MHz}. We compare the readout fidelities of test data with different A⊥ from TM, ML, and network R. The result from network R is better than TM when A⊥ is not changed too much.

| A⊥ (MHz) | TM fidelity | ML fidelity | Network R fidelity |
| --- | --- | --- | --- |
| −40 | 97.94(2)% | 98.20(4)% | 98.24(4)% |
| −30 | 98.67(2)% | 98.76(3)% | 98.66(4)% |

5. Application to initialization by readout

One scenario where even a modest increase in the fidelity can be beneficial is in state preparation-by-measurement [12–16]. In this widely adopted technique, to achieve a higher fidelity of state preparation with the TM, two distinct thresholds are set, ${N}_{\mathrm{dark}}\lt {N}_{\mathrm{th}}$ and Nbright > Nth, where Nth is the readout threshold. Measurements in between the two thresholds are discarded, as they cannot be assigned to either bright or dark state with enough confidence. This leads to a lengthier state preparation routine. In ML, the neural network assigns each input to a probability pbright (pdark) of the state being bright (dark). A final step compares pbright, pdark and classifies accordingly. To achieve a higher fidelity, we discard cases where $0.5-t\lt {p}_{\mathrm{dark}/\mathrm{bright}}\lt 0.5+t$, with an adjustable threshold t. We compare the state preparation fidelity from TM and ML, when discarding the same amount of data, and observe that ML maintains its advantage over TM, and scales more favorably than TM with the ratio of discarded measurements (figure 4). This enables preparing a high fidelity initial state more efficiently. We observed similar improvement from unsupervised learning (see appendix), agreeing with [50].
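The ML discarding rule can be illustrated with a short Python sketch. It assumes a two-class softmax output, so that pdark = 1 − pbright and the condition on either probability is equivalent, and it reports the plain fraction of correctly assigned retained shots rather than the balanced fidelity. The function name and the example probabilities are hypothetical.

```python
def prepare_by_measurement(p_bright_list, true_labels, t):
    """Discard shots with 0.5 - t < p_bright < 0.5 + t.

    Returns (discard_ratio, accuracy_on_kept_shots).
    Assumption: p_dark = 1 - p_bright (two-class softmax output).
    """
    kept = [(p, lab) for p, lab in zip(p_bright_list, true_labels)
            if not (0.5 - t < p < 0.5 + t)]
    discard_ratio = 1.0 - len(kept) / len(p_bright_list)
    if not kept:
        return discard_ratio, float("nan")
    # A kept shot is assigned to the more probable state.
    correct = sum((p >= 0.5) == (lab == "bright") for p, lab in kept)
    return discard_ratio, correct / len(kept)


# Hypothetical network outputs and true initial states.
p_bright = [0.98, 0.91, 0.60, 0.45, 0.08, 0.03, 0.55, 0.40]
labels = ["bright", "bright", "dark", "bright", "dark", "dark", "bright", "dark"]
ratio_all, acc_all = prepare_by_measurement(p_bright, labels, 0.0)
ratio_cut, acc_cut = prepare_by_measurement(p_bright, labels, 0.2)
```

In this toy example, discarding the half of the shots with the least decisive output removes both misassigned shots, raising the accuracy on the retained shots from 0.75 to 1.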


Figure 4. More efficient state preparation-by-measurement. The state readout fidelity increases after discarding less trustworthy measurements and this improves the state preparation. ML always outperforms TM and scales more favorably with the ratio of discarded data. The solid curves are a guide to the eye. Error bars are the standard error of 10 training results, and are smaller than the marker.


6. Conclusion and outlook

In conclusion, we have shown that ML techniques can exploit the hidden structure in the repetitive readout data of the NV center at room temperature to improve the state measurement fidelity. We used quantum Monte-Carlo simulation based on a 33-level NV model to generate data for machine learning, and found improved single-shot readout fidelity over the traditional threshold method, which can be attributed to the ability of ML to correctly classify a larger number of readout trajectories that are perturbed by the measurement process itself.

While we used simulations, generally the training process does not depend on knowledge of the model. In fact, the only information required is the label for the state ($\left|{m}_{I}=+1\right\rangle $ or $\left|{m}_{I}=0,-1\right\rangle $), which is readily available in experiments by discarding less trustworthy data [12–16]. One can then use this data to train a network specific to the NV of interest, and expect an increase in readout fidelity in all subsequent repetitive readout experiments, free of any additional experimental time (although at the cost of an increased computational time). Although individual NVs may have slightly different photodynamic parameters, they should be covered by the range we explored in this work, and therefore the improvement in fidelity is expected to be ubiquitous.

In addition, the off-the-shelf MATLAB® deep learning toolbox we employed greatly reduces the complexities in the neural network architecture, making this improvement easily reproducible and more accessible to experimentalists.

Though small, the increase in fidelity does not require any additional experimental time, and is readily compatible with experiments using repetitive readout of nuclear spins, including in quantum metrology [51–53] to improve sensitivity.

To further shed light on the bright/dark decisions that affect the ML readout fidelity, one could use decision tree learning instead of a neural network. This could potentially inform optimized readout protocols, with varying illumination times, or help further improve the neural network architecture. More broadly, ML could be applied to more complex systems, for example to help mitigate crosstalk of fluorescence signals in a solid-state register consisting of a few nearby NV or other color centers [24].

Acknowledgments

This work was supported in part by the NSF grant EFRI-ACQUIRE 1641064 and by Skoltech.

The data that support the findings of this study are openly available at https://doi.org/10.6084/m9.figshare.9924911.v1.

Appendix A.: NV model and quantum Monte-Carlo simulation

We used a 33-level model to fully describe the dynamics of NV-14N in the repetitive readout process. This model includes the spin-1 triplet ground and excited states, and singlet metastable state for NV, the spin-1/2 ground and excited states for NV0, and the nuclear spin-1 of 14N, as illustrated in figure A1. The transition rates directly related to the NV photoluminescence have been precisely determined and reported in various works [37–40], although with some significant variations. For the simulation we took the values from Gupta et al [39] listed in table A1.


Figure A1. The 33-level NV model used in our simulation, consisting of 11 electronic spin levels times 3 nuclear spin levels (level spacings not to scale). kr, k47(=k67), k57, k71(=k73), k72 and kion are incoherent transition rates connecting the corresponding energy levels. The optical transition rates kr between excited state and ground state are set equal for NV and NV0, and are assumed to be spin-conservative (the spin non-conservative part is <1% [37]). β is a dimensionless parameter given by the ratio of the laser power to the optical transition rate. ${k}_{(\mathrm{de})\mathrm{ion}}$ is the (de)ionization rate. We assume the (de)ionization happens in the excited state and follows the selection rules depicted by the brown arrows.


Table A1.  Transition rates used in the 33-level model.

| Transition rate | kr | k47 | k57 | k71 | k72 |
| --- | --- | --- | --- | --- | --- |
| Value (MHz) | 65.9 | 92.1 | 11.4 | 1.18 | 4.84 |

The exact (de)ionization mechanisms under 532 nm laser illumination have not yet been determined experimentally, nor has the (de)ionization rate at laser powers comparable to the saturation power (measurements under weak power can be found in [54–56]). Here we assume the (de)ionization kion(kdeion) occurs only in the excited states, and obeys the selection rules illustrated in figure A1. To maintain the experimentally determined 70/30 ratio [54] between the charge states, we set kdeion = 2kion. The ionization rate is proportional to the laser intensity, and is swept around kion ≈ 90β MHz, in accordance with [13].

When the magnetic field is applied along the NV-axis, the ground state NV-14N Hamiltonian has negligible effect on the repetitive readout, thus it is not considered in the numerical simulation. The NV excited state Hamiltonian reads:

Equation (A.1)

where S and I are the electronic and nuclear spin operators, Δes = 1.42 GHz is the zero-field splitting of the electronic spin, Q = −4.945 MHz the nuclear quadrupole interaction [57], and γe = 2.802 MHz/G and γn = −0.308 kHz/G the electronic and nuclear gyromagnetic ratios. The hyperfine interaction term is diagonal due to symmetry:

Equation (A.2)

where ${A}_{\parallel }=-40$ MHz was determined via ODMR experiments [58]. ${A}_{\perp }$ was believed to be similar to ${A}_{\parallel }$ and was recently measured to lie between −40 and −50 MHz [41].

The NV0 excited state Hamiltonian takes the form:

Equation (A.3)

with the hyperfine interaction term:

Equation (A.4)

The hyperfine interaction strengths were considered similar to those in the NV excited state [42], and we set ${C}_{\parallel }={C}_{\perp }=-40$ MHz.

To simulate repetitive readout experiments for both the training and testing data, we used the quantum Monte-Carlo method based on the aforementioned 33-level model. One challenge lies in the various time scales involved in the numerical simulation, from the electronic spin's fast oscillation $\omega \sim (2\pi )\cdot 10\,\mathrm{GHz}$, to the optical transition rates kij ∼ 100 MHz, to the flip-flop rate of the 14N nuclear spin $1/{T}_{1}^{n}\sim \mathrm{kHz}$. We mitigate this issue by employing the Born–Oppenheimer approximation [59] in our numerical simulation, and average out the fast oscillation at ω as follows.

We define δpmn as the transition probability from the state $\left|m\right\rangle $ to $\left|n\right\rangle $ in the time step δt. Starting from $\left|\psi (t=0)\right\rangle =\left|m\right\rangle $, we have

Equation (A.5)

Notice that $| \left\langle i| \psi (t)\right\rangle {| }^{2}$ is periodic with period 2π/ω, which is much smaller than the time step $\delta t\sim 1/{k}_{{ij}}$. Thus, we assume only the average effect of this oscillation is seen in each time step, and numerically find $\left\langle \tfrac{\delta {p}_{{mn}}}{\delta t}\right\rangle $. This allows us to efficiently perform the quantum Monte-Carlo simulation.
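Once the period-averaged rates are known, a single Monte-Carlo time step reduces to a weighted random choice. The Python sketch below shows one such step on a three-level toy system. It is only an illustration under stated assumptions: the level names, the rate values and the time step are hypothetical, the averaged rates are supplied as a precomputed table rather than derived from the Hamiltonian, and the step is first order, so the total exit probability per step must stay below one.

```python
import random


def mc_step(state, avg_rates, dt, rng):
    """Advance the system by one time step dt.

    avg_rates[m][n] is the period-averaged transition rate from m to n.
    The jump probability to n is rate * dt (requires sum_n rate * dt <= 1).
    """
    r = rng.random()
    cumulative = 0.0
    for target, rate in avg_rates.get(state, {}).items():
        cumulative += rate * dt
        if r < cumulative:
            return target
    return state  # no jump during this step


# Hypothetical toy rates in MHz (not the values of the 33-level model).
toy_rates = {
    "ground": {"excited": 10.0},
    "excited": {"ground": 60.0, "shelved": 10.0},
    "shelved": {"ground": 1.0},
}
dt = 0.005  # microseconds; largest exit probability is 70 * 0.005 = 0.35
rng = random.Random(0)
state = "ground"
visited = set()
for _ in range(10000):
    state = mc_step(state, toy_rates, dt, rng)
    visited.add(state)
```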

Appendix B.: Machine learning discussions

B.1. Recurrent neural network

The recurrent neural network (RNN) is a commonly used architecture specializing in time-series data, with the capability to capture the correlations within the time series. In the main text, we showed results obtained using a shallow neural network. In order to see if we gain by exploiting the correlations within the time series, we also tested the performance of an advanced recurrent neural network: long short-term memory (LSTM). Due to the nature of recurrent neural networks, the training process is very time-consuming and therefore not suitable for exploring multiple parameters in our model. To speed up the training process, we averaged the input time trace data over 100 realizations, to greatly reduce the training set dimension. This may have caused some loss of information. The result nevertheless consistently outperforms the TM and is comparable to the shallow neural network shown in the main text (see table B1). One remark is that we did not take the cumulative sum for the input data, because LSTM specializes in time series data and is able to recognize some quasi-periodic patterns.

Table B1. Comparison between the fidelity obtained through TM, ML and LSTM under different parameters. All training and testing were conducted at the Nopt of that set of parameters. Overall, the LSTM algorithm has similar performance compared with the shallow neural network.

| A⊥ (MHz) | kion (MHz) | TM fidelity | ML fidelity | LSTM fidelity |
| --- | --- | --- | --- | --- |
| −50 | 70β | 97.56(4)% | 97.86(7)% | 97.61(5)% |
| −50 | 90β | 96.98(4)% | 97.32(5)% | 97.40(2)% |
| −50 | 110β | 96.31(4)% | 96.71(5)% | 96.77(7)% |
| −30 | 90β | 98.67(2)% | 98.76(3)% | 98.44(3)% |
| −40 | 90β | 97.94(2)% | 98.20(4)% | 98.29(3)% |
| −50 | 90β | 96.98(4)% | 97.32(5)% | 97.40(2)% |

B.2. Unsupervised learning

In the main text we compared the enhanced fidelities of TM and supervised learning after discarding less trustworthy data. Another possibility is to use unsupervised learning [50]. This method is of interest because unsupervised learning does not require any well-labelled data. We implemented the k-means algorithm that classifies a given data set into k different groups.

We first use the TM readout to obtain a bright (dark) group of measurement trajectories. We then perform k-means on the bright (dark) group to further classify it into k subgroups. The fidelity increases when we discard the smallest subgroup. Compared to the TM, k-means gives better fidelity as shown in figure B1, because the unsupervised learning extracts some information about 14N flips through the hidden structures in time trace data, in agreement with [50]. Note that unlike TM or supervised learning, we cannot control the ratio of discarded data. Therefore, the fidelity defined by equation (1) is not available, and only the fidelity of dark state is shown. We also remark that in rare cases, k-means gives outlier results with fidelity much worse than TM.
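The procedure can be sketched in Python with a plain implementation of Lloyd's k-means algorithm. The sketch assumes that each trajectory is represented by its cumulative photon count sampled at a few repetition numbers; this representation, the function names and the example data are hypothetical. It takes a group of trajectories that the TM has already assigned to one state, splits it into k subgroups and discards the smallest.

```python
import random


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centre in squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(p, centers[j])))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def discard_smallest_subgroup(points, k, rng):
    """Cluster one TM group into k subgroups and drop the smallest one."""
    labels = kmeans(points, k, rng)
    sizes = [labels.count(j) for j in range(k)]
    smallest = min(range(k), key=lambda j: sizes[j])
    return [p for p, lab in zip(points, labels) if lab != smallest]


# Hypothetical dark-group trajectories: cumulative counts at 4 sample points.
regular = [[70, 140, 210, 280], [68, 139, 212, 279], [72, 141, 209, 282],
           [69, 138, 211, 281], [71, 142, 208, 278], [70, 139, 211, 280]]
flipped = [[70, 140, 240, 340], [69, 150, 250, 350]]  # slope change after a flip
kept = discard_smallest_subgroup(regular + flipped, 2, random.Random(0))
```

In this toy example the two trajectories with a change of slope form the smallest subgroup and are discarded. As in the text, the fraction of discarded data is set by the clustering and cannot be chosen freely.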


Figure B1. More efficient state preparation-by-measurement. Improved dark state readout accuracy after discarding less trustworthy readouts. Each diamond-shaped point represents an individual k-means test.
