CORRELATION DEFENSE FOR QUANTUM RANDOMNESS

New nonparametric methods for verifying and monitoring quantum randomness are developed, based on the ranged correlation function (RCF) and the sequence of ranged amplitudes (SRA). An RCF analysis was carried out on subsamples of different topology drawn from the raw data of a prototype quantum random number generator based on homodyne detection. The real system is shown to contain weak local regression relations, for which a robust criterion of significance can be introduced. A precise SRA identification of the statistics of long samples was carried out. The results extend the traditional entropy methods for analyzing useful randomness and open the way to new strict quantum quality standards and defenses for physical random number generators.


Introduction
Quantum randomness, as a phenomenon of modern quantum physics [1] and mathematical logic [2], has a very special status, which allows it to be classed among the effects that arise in partially deterministic complex systems [3]. Moreover, experimental physics has no ideal measuring technique capable of directly determining "ideal" quantum randomness. It is therefore necessary to develop stable methods for the nonparametric analysis of quantum time series [4,5] in order to establish fundamental criteria for the quantitative parameterization of quantum randomness.
In quantum communications [6] and in the generation of true random numbers [7,8], the quantitative assessment of the quality of the initial raw randomness is one of the central questions. Common methods for analyzing raw randomness in quantum systems often come down to identifying the empirical frequencies of long samples and calculating the autocorrelation function, and they are poorly suited to local analysis [9]. For this reason, we believe that new precise methods for the correlation analysis of time series [10] expand the possibilities of error correction at the post-processing stage in quantum information science and will make it possible to enhance the secrecy and defense of communications at the physical level.
In this paper, we develop new nonparametric methods for verifying and monitoring quantum randomness based on the ranged correlation function (RCF) and its analogues that use a sequence of ranged amplitudes (SRA). An RCF analysis was performed on subsamples of various topology drawn from the raw data of a prototype quantum random number generator based on homodyne detection. The data source in our work is an experimental prototype of a quantum random number generator based on the homodyne detection of vacuum fluctuations of laser radiation [8]. For the studied series, we propose a method for precise SRA identification of the statistics of short samples, and we find weak local regression relations, for which stable criteria of significance are introduced. The results significantly extend the traditional entropy analysis methods and promote the development of common quantum standards and defenses for physical randomness.

Ranging
Any time series $\{X_k\}$, $k = 1, 2, \ldots, N$, of real or complex numbers can be ranged, that is, sorted in decreasing (or increasing) order according to a chosen measure, to give a sequence of ranged amplitudes (SRA) of the form $\{x_n\}$, $n = 1, 2, \ldots, N$, where $n$ is the index in the SRA [10]. By this definition, the SRA $\{x_n\}$ is composed of exactly the same elements as the original sequence $\{X_k\}$; the SRA is therefore a noninvasive (without loss of information) quantitative statistical characteristic of a data sample [10]. The SRA is related to the distribution function by the following approximate relation (where $N$ is the sample size) [4,5,11,12]:
Note also that any statistical functions (statistical averages over the initial sample), even non-smooth and infinite ones, of a given sample $\{X_k\}$ and of its SRA $\{x_n\}$ strictly coincide. Mathematically, this can be written as a condition G[
We will consider this triple as the base of the ranged correlation functions (RCF). In what follows, all samples are by default mapped to the normalized scale $\{X_k \to (X_k - \min(X_k))/(\max(X_k) - \min(X_k))\}$. This normalization, the only mapping that does not destroy the structure of linear regression relations, makes it possible to define correctly the generalized correlation functions [10] on the domain of complex variables, which is necessary for identifying nonlinear regressions [3,9,10,12]. Normalization also makes it possible to select only one sample, $\{w_n\}$, among the three described SRA to verify redundant correlations on the basis of the RCF.
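The ranging and normalization steps can be illustrated with a minimal Python sketch. It rests on assumptions not confirmed by the text: relation (1) is not reproduced above, so the sketch uses the standard empirical form $F(x_n) \approx n/N$ for ascending order, and the function names and the Gaussian stand-in data are hypothetical.

```python
import random

def normalize(sample):
    """Map a real-valued sample onto [0, 1] via (X - min) / (max - min)."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        raise ValueError("a constant sample cannot be normalized")
    return [(x - lo) / (hi - lo) for x in sample]

def ranged_amplitudes(sample, decreasing=False):
    """Return the SRA: exactly the same elements, sorted by value."""
    return sorted(sample, reverse=decreasing)

def empirical_cdf_points(sample):
    """Pair each ascending ranged amplitude x_n with z_n = n / N.

    Assumed form of the SRA/distribution-function relation: F(x_n) ~ n / N.
    """
    sra = ranged_amplitudes(sample)
    n_total = len(sra)
    return [(x, (n + 1) / n_total) for n, x in enumerate(sra)]

random.seed(0)
# Stand-in data only; this is not output of the quantum generator.
raw = [random.gauss(0.0, 1.0) for _ in range(1000)]
normalized = normalize(raw)
sra = ranged_amplitudes(normalized)
points = empirical_cdf_points(normalized)
```

Because the SRA is only a reordering of the sample, any average taken over `sra` equals the same average over `normalized` up to floating-point rounding.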

RCF analysis of subsamples
A quantitative analysis of internal correlations and randomness can be implemented most effectively on subsamples of significantly different topology drawn from the initial sample, to which the quality criteria are then applied. For the original sample $X_j$ of size $2N$, $j = 1, 2, \ldots, 2N$, we selected four subsets of the same size $N$:
Note that the properties of these four subsamples are homogeneous with respect to time, and that the former two (1, 2) and the latter two (3, 4) each form a disjoint cover of the initial sample, with different topology. As a result, the pairs 1 & 2 and 3 & 4 can be considered independently. The pair 1 & 2 is constructed to reveal ultra-long correlations (typical, for example, of mathematical pseudo-random number generators), while the pair 3 & 4 makes it possible to "see" local linear regression relations, thereby extending the capabilities of the autocorrelation function for the initial sample of length $2N$.
The in-depth study of these four subsamples was motivated by an experimental observation: the Pearson correlation coefficients of the pair 1 & 2 ($R^2_{12} = 6.9 \cdot 10^{-4}$) and of the pair 3 & 4 ($R^2_{34} = 8.4 \cdot 10^{-3}$) differ by an order of magnitude, a statistically significant difference, for raw data of size $N = 10^6$ obtained on the prototype quantum random number generator based on homodyne detection. The local regression thus discovered raises the problem of stable criteria of significance for nonrandomness in the source sample. This problem can be addressed with the methods of SRA identification of statistics [3], which, as we have shown earlier, can be applied effectively even to short samples of quantum data [4,5].
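The comparison of the two pairs can be sketched in Python as follows. The definitions of the four subsets are not reproduced in this text, so the layout below is an assumption chosen only to match the description: pair 1 & 2 is taken as the first and second halves of the sample (long-range), and pair 3 & 4 as the odd- and even-indexed elements (local). The AR(1) stand-in data and all function names are hypothetical.

```python
import random

def pearson_r2(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def split_pairs(sample):
    """Assumed subsample layout for a sample of even length 2N."""
    half = len(sample) // 2
    s1, s2 = sample[:half], sample[half:]   # pair 1 & 2: two halves
    s3, s4 = sample[0::2], sample[1::2]     # pair 3 & 4: interleaved neighbours
    return (s1, s2), (s3, s4)

random.seed(1)
# Stand-in series with a weak dependence between neighbours (AR(1), coefficient 0.1).
data = [random.gauss(0.0, 1.0)]
for _ in range(19999):
    data.append(0.1 * data[-1] + random.gauss(0.0, 1.0))

(s1, s2), (s3, s4) = split_pairs(data)
r2_12 = pearson_r2(s1, s2)   # sensitive to long-range structure
r2_34 = pearson_r2(s3, s4)   # sensitive to neighbour-to-neighbour regression
```

With a purely local dependence in the stand-in data, `r2_34` comes out well above `r2_12`, which mirrors qualitatively the kind of gap reported for the two pairs.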

W statistics of the product of samples
Traditional methods for analyzing raw quantum randomness often reduce to identifying the empirical frequencies, or their histograms, for long samples [7,8], which introduces identification errors associated with the invasiveness of these methods. For raw data obtained by the homodyne detection of quantum randomness [8], the empirical frequencies are conventionally expected to be normally distributed.
However, a question immediately arises about the distribution of the cumulative frequency function (the discrete integral of the empirical frequencies) and, accordingly, about the SRA distribution associated with it through (1). The cumulative distribution function can be parameterized both by the error function (the integral of the normal distribution) and by the sum of two normal distributions that follows from the discrete integration of the normal distribution. This fundamental ambiguity stems from the discretization of the data, which leaves open the choice of empirical frequencies, cumulative frequencies, or the SRA for identifying the statistics. However, the traditional criteria for the significance of theoretical models in statistics [11] have been proved for the SRA. We therefore hold that, for the most correct identification of the statistics, the normality test should be understood as the proximity of the fitting function for the inverse of the SRA function (see (1)) to the error function $n(x) = A + B \cdot \mathrm{erf}((x - x_0)/dx)$ (hereafter, $dx$ is a fitting parameter and does not denote integration). Our calculations show that fitting the normalized SRA with the error function $n(x) = (1 + \mathrm{erf}((x - x_0)/dx))/2$ is more accurate than parameterizing the empirical frequency distribution with the Gaussian normal distribution. In what follows, we therefore use the normality conjecture in the sense of the error function for other ranged data as well.
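A minimal Python sketch of fitting the normalized SRA with $n(x) = (1 + \mathrm{erf}((x - x_0)/dx))/2$ is given below. It is not necessarily the fitting procedure used for the reported results: it swaps in a probit linearization, $x_n = x_0 + (dx/\sqrt{2})\,\Phi^{-1}(z_n)$, solved by ordinary least squares, and it assumes the plotting positions $z_n = (n - 1/2)/N$ to keep $z_n$ strictly inside $(0, 1)$. The function names and the stand-in data are hypothetical.

```python
import math
import random
from statistics import NormalDist

def fit_erf_to_sra(sample):
    """Estimate (x0, dx) in z = (1 + erf((x - x0)/dx)) / 2 from a sample."""
    sra = sorted(sample)
    n_total = len(sra)
    std_normal = NormalDist()
    # Probit of the assumed plotting positions z_n = (n - 1/2) / N.
    q = [std_normal.inv_cdf((n + 0.5) / n_total) for n in range(n_total)]
    mq, mx = sum(q) / n_total, sum(sra) / n_total
    slope = (sum((a - mq) * (b - mx) for a, b in zip(q, sra))
             / sum((a - mq) ** 2 for a in q))
    x0 = mx - slope * mq          # intercept of x against probit(z)
    dx = slope * math.sqrt(2.0)   # slope equals dx / sqrt(2)
    return x0, dx

def fit_accuracy(sample, x0, dx):
    """Coefficient of determination between z_n and the fitted erf model."""
    sra = sorted(sample)
    n_total = len(sra)
    z = [(n + 0.5) / n_total for n in range(n_total)]
    model = [(1.0 + math.erf((x - x0) / dx)) / 2.0 for x in sra]
    mz = sum(z) / n_total
    ss_res = sum((a - b) ** 2 for a, b in zip(z, model))
    ss_tot = sum((a - mz) ** 2 for a in z)
    return 1.0 - ss_res / ss_tot

random.seed(2)
# Stand-in Gaussian data with mean 0.5 and standard deviation 0.1.
sample = [random.gauss(0.5, 0.1) for _ in range(5000)]
x0, dx = fit_erf_to_sra(sample)
accuracy = fit_accuracy(sample, x0, dx)
```

For Gaussian data with standard deviation $\sigma$, the fitted width satisfies $dx \approx \sqrt{2}\,\sigma$, which gives a quick consistency check on the estimate.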
The main mathematical task of the traditional correlation statistical analysis of a pair of samples $\{X_k\}$, $\{Y_k\}$ is to identify the symmetric relations given by the sum-functions.
On the basis of the symmetric functions, one usually goes on to consider the variable-split sum-functions, which in the analysis of $G_{\mathrm{sym}}$ leads to independent consideration of the series $\{W_k\}$ and $\{R_k\}$ and, hence, of their SRA $\{w_n\}$ and $\{r_n\}$. In this paper, we restrict ourselves to the SRA $\{w_n\}$ built on the normalized series $\{X_k\}$ and $\{Y_k\}$. It is important to emphasize that the variance of this W statistics of the product of samples (the SRA $\{w_n\}$) is associated by construction with the standard Pearson criterion $R^2$ and greatly expands its capabilities.
Previously, we found the difference between $R^2_{12} = 6.9 \cdot 10^{-4}$ and $R^2_{34} = 8.4 \cdot 10^{-3}$ for two different subsample pairs from the raw data of the prototype quantum random number generator based on homodyne detection. To see the difference between the subsamples
The criterion for the significance of correlations in this case is not the single Pearson parameter $R^2$ but the two parameters $\{w_0; dw\}$ of the model error function. The sensitivity of the RCF analysis technique can also be increased by improving the model fitting function [3,10]: an additional degree of freedom $\theta$, a non-extensiveness parameter, is introduced into the error function in the form $z = A + B \cdot \mathrm{erf}((w - w_0)^{\theta}/dw)$, which corresponds to the presence of effective memory in the time series. However, accounting for additional fitting parameters with the standard methods of mathematical analysis is a nontrivial task in many practical situations, and the implementation of algorithms that account for non-extensiveness in statistical distributions requires a separate in-depth study [3,10].
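The two-parameter criterion $\{w_0; dw\}$ can be illustrated with a short Python sketch. Several points are assumptions: the exact definition of $W_k$ is not preserved in this text, so the sketch simply takes the element-wise product of the normalized samples, and it replaces a full least-squares fit with a quantile estimate that uses the fact that the model $z = (1 + \mathrm{erf}((w - w_0)/dw))/2$ equals $1/2$ at $w = w_0$ and $(1 + \mathrm{erf}(1))/2$ at $w = w_0 + dw$. The names and the stand-in data are hypothetical.

```python
import math
import random

def normalize(sample):
    """Map a sample onto [0, 1] via (X - min) / (max - min)."""
    lo, hi = min(sample), max(sample)
    return [(x - lo) / (hi - lo) for x in sample]

def product_sra(xs, ys):
    """Ascending SRA of the element-wise product of two normalized samples."""
    return sorted(a * b for a, b in zip(normalize(xs), normalize(ys)))

def erf_parameters(sra):
    """Quantile estimate of (w0, dw) for z = (1 + erf((w - w0)/dw)) / 2."""
    n_total = len(sra)

    def quantile(z):
        # Ranged amplitude whose relative rank n / N is closest to z from above.
        return sra[min(n_total - 1, int(z * n_total))]

    w0 = quantile(0.5)
    dw = quantile((1.0 + math.erf(1.0)) / 2.0) - w0
    return w0, dw

random.seed(3)
# Stand-in data: one independent pair and one weakly regressed pair.
x = [random.gauss(0.0, 1.0) for _ in range(4000)]
y_indep = [random.gauss(0.0, 1.0) for _ in range(4000)]
y_corr = [0.3 * a + random.gauss(0.0, 1.0) for a in x]

sra_indep = product_sra(x, y_indep)
sra_corr = product_sra(x, y_corr)
params_indep = erf_parameters(sra_indep)
params_corr = erf_parameters(sra_corr)
```

Comparing `params_indep` with `params_corr` gives two numbers per pair rather than the single Pearson value, which is the sense in which the criterion is multi-parameter.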

Angle analysis of randomness
An additional method for the correlation analysis of randomness on the basis of symmetric functions is to consider the distributions of the angles $\{\varphi_k = \arcsin(2W_k^2/R_k^2)/2\}$ and the radii $\{R_k = (X_k^2 + Y_k^2)^{1/2}\}$ constructed from the centered data $\{X_k \to X_k - \langle X_k \rangle\}$. The SRA distribution $\{r_n; z = n/N\}$ is normal in the sense of the error function, with a fit accuracy of about 0.9999, and the angle distribution $\{\varphi_n; z = n/N\}$ has the character of a uniform distribution ($\varphi_n \cong n/N$) with almost the same accuracy. These distributions are too close to the ideal to reveal potential regression links, so we used a more subtle criterion based on the SRA analysis of the discrete derivatives of the initial distributions, $\{\dot{\varphi}_k = \varphi_{k+1} - \varphi_k\}$ and $\{\dot{R}_k = R_{k+1} - R_k\}$, which characterize the heterogeneity of the angular distributions.
Fig. 2 shows the relative dependencies of the radius and angle velocities $\{r_{12,n}; \varphi_{12,n}\}$ and $\{r_{34,n}; \varphi_{34,n}\}$ for the two subsample pairs 1 & 2 and 3 & 4 of different topology (size $N = 10^6$). In both cases the structure of a normal distribution was obtained (with a fitting accuracy of 0.9995), but with different fitted parameters.
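The angle and radius construction can be sketched in Python as follows. One assumption is needed: the sketch takes $W_k^2 = X_k Y_k$, so that $2W_k^2/R_k^2 = 2X_kY_k/(X_k^2 + Y_k^2)$, which is the double-angle identity for a point with polar angle $\varphi_k$. The function names and the Gaussian stand-in data are hypothetical.

```python
import math
import random

def center(sample):
    """Subtract the sample mean from every element."""
    mean = sum(sample) / len(sample)
    return [x - mean for x in sample]

def angles_and_radii(xs, ys):
    """Return phi_k = arcsin(2 X_k Y_k / R_k^2) / 2 and R_k for centered data."""
    phis, radii = [], []
    for x, y in zip(center(xs), center(ys)):
        r2 = x * x + y * y
        if r2 == 0.0:
            continue  # the angle is undefined at the origin
        # Clamp against rounding so that asin always receives a valid argument.
        ratio = max(-1.0, min(1.0, 2.0 * x * y / r2))
        phis.append(0.5 * math.asin(ratio))
        radii.append(math.sqrt(r2))
    return phis, radii

def discrete_derivative(series):
    """Forward differences s_{k+1} - s_k."""
    return [b - a for a, b in zip(series, series[1:])]

random.seed(4)
# Stand-in pair of independent Gaussian samples.
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [random.gauss(0.0, 1.0) for _ in range(2000)]
phis, radii = angles_and_radii(xs, ys)
phi_velocity = discrete_derivative(phis)
radius_velocity = discrete_derivative(radii)
```

The angles produced this way lie in $[-\pi/4, \pi/4]$, and for independent, identically distributed centered Gaussian samples they are uniformly distributed on that interval, in line with the uniform character described above.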
Since the distribution of the angles in both pairs 1 & 2 and 3 & 4 has the structure of a uniform random variable, the series $\{\varphi_k; R_k\}$ can be represented in the split form $\{\mathrm{sign}(\varphi_k); \mathrm{abs}(\varphi_k); R_k\}$, where the randomness analysis can be carried out simultaneously.
For these sequences, the standard set of NIST cryptographic tests [13] can be applied in two different cases, yielding a p-value criterion of significance (a p-value > 0.01 indicates that the test has been passed). We used subsamples of length $N = 10^6$ and found that the part of the quantum randomness split off in this way satisfies the cryptographic criteria for true random bits and can be used immediately in quantum cryptography applications [6,8]. The results of the main NIST tests [13] are shown in Fig. 3.
The presented NIST tests use the standard notation: FREQUENCY: frequency (monobit) test; BLOCK FREQUENCY: frequency test within a block; RUNS: runs test; LONGEST RUNS: test for the longest run of ones in a block; RANK: binary matrix rank test; FFT: spectral test; UNIVERSAL: Maurer's universal statistical test; CUMULATIVE SUMS: cumulative sums test. Additional studies show that both sets of binary numbers, $\{\mathrm{sign}(\varphi_{12,k})\}$ for the subsamples 1 & 2 and $\{\mathrm{sign}(\varphi_{34,k})\}$ for the subsamples 3 & 4, successfully pass all randomness tests. Thus, we managed to present the initial randomness, through its subsamples, in the three-component form $\{\mathrm{sign}(\varphi_k); \mathrm{abs}(\varphi_k); R_k\}$, in which the first component is a true random variable, as verified by the NIST tests, and we showed that the criteria based on the SRA and RCF methods can be applied to the second and third components in parallel with the first. These circumstances are important for implementing an effective procedure for extracting the final binary randomness from the raw data of a physical random number generator: a procedure with a regulated structure that allows reliable statistical monitoring of internal security parameters and self-protection of the physical random number generator.
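The splitting of the sign component and its verification can be illustrated with the simplest test of the suite. The Python sketch below implements only the frequency (monobit) test as specified in NIST SP 800-22, with p-value $= \mathrm{erfc}(|S_n|/\sqrt{2n})$, where $S_n$ is the sum of the bits mapped to $\pm 1$; it is not a substitute for the full suite. The function names and the uniform stand-in angles are hypothetical.

```python
import math
import random

def sign_bits(series):
    """Map each nonzero value to a bit: 1 if positive, 0 if negative."""
    return [1 if v > 0 else 0 for v in series if v != 0]

def monobit_p_value(bits):
    """p-value of the NIST SP 800-22 frequency (monobit) test."""
    n = len(bits)
    s = sum(2 * b - 1 for b in bits)   # bits mapped to +1 / -1 and summed
    return math.erfc(abs(s) / math.sqrt(2.0 * n))

random.seed(5)
# Stand-in angles, uniform on [-pi/4, pi/4]; not generator output.
angles = [random.uniform(-math.pi / 4, math.pi / 4) for _ in range(10000)]
bits = sign_bits(angles)
p_value = monobit_p_value(bits)
passed = p_value >= 0.01   # significance level used by the NIST suite
```

A strongly biased sequence, such as a long run of identical bits, drives the p-value toward zero and fails the test, whereas a balanced sequence keeps it well above the 0.01 threshold.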

Conclusions
Developing general criteria for the quality of randomness remains a task for further investigation, given the absence of regression equations that clearly distinguish nonrandomness. Nevertheless, we can already present stable multi-parameter intermediate criteria based on the parameterization of the SRA curves, the product of the samples, and the SRA distributions of the angular variables of subsamples of different topology, which extend the traditional methods of analyzing quantum randomness. A significant advantage of the composite angular analysis with a splitting of the form $\{\mathrm{sign}(\varphi_k); \mathrm{abs}(\varphi_k); R_k\}$ is that high-quality binary randomness $\{\mathrm{sign}(\varphi_k)\}$ can be accurately separated from the initial data set and verified with the standard set of NIST tests [13].
Owing to their noninvasiveness, the SRA and RCF methods can also be applied to the identification of various signal sources [3,14,15] and of noise [10,16,17]. The demonstrated advantages of the ranged analysis open up new possibilities for a quantitative description of the quality of useful quantum randomness and for the introduction of universal quantum standards and defenses in the security of optical and quantum communications.