Randomness Evaluation with the Discrete Fourier Transform Test Based on Exact Analysis of the Reference Distribution

Hiroki Okada; Ken Umeno

arXiv:1701.01960·cs.CR·March 8, 2018

TL;DR

This paper derives a precise mathematical reference distribution for the DFT test in NIST SP 800-22, improving its reliability and sensitivity for evaluating randomness in cryptographic applications.

Contribution

It provides an exact chi-squared distribution for the power spectrum in the DFT test, replacing the previous numerical estimation method.

Findings

01

The proposed test is more reliable than the existing DFT test.

02

The new test is more sensitive in detecting non-random sequences.

03

Experimental results confirm the improved performance of the proposed method.

Abstract

In this paper, we study the problems in the discrete Fourier transform (DFT) test included in NIST SP 800-22 released by the National Institute of Standards and Technology (NIST), which is a collection of tests for evaluating both physical and pseudo-random number generators for cryptographic applications. The most crucial problem in the DFT test is that its reference distribution of the test statistic is not derived mathematically but rather numerically estimated; the DFT test for randomness is based on a pseudo-random number generator (PRNG). Therefore, the present DFT test should not be used unless the reference distribution is mathematically derived. Here, we prove that a power spectrum, which is a component of the test statistic, follows a chi-squared distribution with 2 degrees of freedom. Based on this fact, we propose a test whose reference distribution of the test statistic is…

Tables (6)

Table 1: Types of error ( $\mathcal{H}_{0}$ : null hypothesis = “generator is ideal”)

Judgment of $\mathcal{H}_{0}$	$\mathcal{H}_{0}$ is true	$\mathcal{H}_{0}$ is false
Reject	False positive (Type I error)	True positive
Fail to reject	True negative	False negative (Type II error)

Table 2: Test results for periodic sequences: passing rates $R_{I}$ and $R_{II}$ for each $T$ (a red cell means that $R_{I(II)}$ lies outside its significance interval)

Test	${\rm DFTT}_{{\rm present}}$		${\rm DFTT}_{{\rm pareschi}}$		${\rm DFTT}_{{\rm proposed}}$
Passing rate	$R_{I}$	$R_{II}$	$R_{I}$	$R_{II}$	$R_{I}$	$R_{II}$
$T = 100$	0.0	0.0	0.0	0.0	0.0	0.0
$T = 101$	0.5	0.9	0.8	1.0	0.0	0.0
$T = 102$	0.7	1.0	1.0	1.0	0.0	0.0
$T = 103$	0.9	1.0	1.0	1.0	0.0	0.0
$T = 104$	0.9	0.9	1.0	1.0	0.0	0.0
$T = 105$	0.7	1.0	1.0	1.0	0.0	0.0
$T = 106$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 107$	0.9	1.0	0.9	1.0	0.0	0.0
$T = 108$	0.7	1.0	1.0	1.0	0.0	0.0
$T = 109$	0.8	1.0	1.0	1.0	0.0	0.0
$T = 110$	0.9	1.0	0.9	1.0	0.0	0.0
$T = 111$	0.9	1.0	0.9	1.0	0.0	0.0
$T = 112$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 113$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 114$	0.9	1.0	1.0	1.0	0.0	0.0
$T = 115$	0.9	1.0	1.0	1.0	0.0	0.0
$T = 116$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 117$	0.9	1.0	1.0	1.0	0.0	0.0
$T = 118$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 119$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 120$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 130$	1.0	1.0	1.0	1.0	0.0	0.0
$T = 140$	0.9	1.0	1.0	1.0	0.0	0.0
$T = 150$	0.8	1.0	1.0	1.0	0.0	0.0

Table 3: Test results for existing pseudo-random number generators: passing rates $R_{I}$ and $R_{II}$ of each PRNG (a red cell means that $R_{I(II)}$ lies outside its significance interval)

Test	${\rm DFTT}_{{\rm present}}$		${\rm DFTT}_{{\rm pareschi}}$		${\rm DFTT}_{{\rm proposed}}$
Passing rate	$R_{I}$	$R_{II}$	$R_{I}$	$R_{II}$	$R_{I}$	$R_{II}$
AES-CTR	0.952	0.995	0.996	1.000	0.988	1.000
Mersenne-Twister	0.948	0.996	0.993	1.000	0.991	1.000
Xorshift	0.947	0.989	0.996	1.000	0.986	1.000
VSC 2.0	0.952	0.998	0.994	1.000	0.988	1.000
LCG	0.952	0.995	0.995	0.998	0.984	0.999
Micali-Schnorr	0.975	0.993	1.000	1.000	0.994	1.000
QCG-I	0.955	0.994	0.997	1.000	0.697	0.991
QCG-II	0.954	0.993	0.993	0.998	0.000	0.000
CCG	0.667	0.900	0.911	0.995	0.000	0.000

Table 4: Summary of the conclusions derived from experiments 1 and 2

Test	${\rm DFTT}_{{\rm present}}$	${\rm DFTT}_{{\rm pareschi}}$	${\rm DFTT}_{{\rm proposed}}$
Reliability	low	high enough	high enough
Sensitivity	high	low	definitely high

Table 5: The parameter sets for each test, and the numbers of $P$ - $value$ s generated by each test

Parameter	${\rm DFTT}_{{\rm present}}$	${\rm DFTT}_{{\rm pareschi}}$	${\rm DFTT}_{{\rm proposed}}$
$n$	100,000	100,000	4,000
$m$	1,000	1,000	25,000
Number of $P$ - $value$ s	$m=1000$	$m=1000$	$\frac{n}{2}-1=1999$

Table 6: Trade-off in the selection of $n$ in ${\rm DFTT}_{{\rm proposed}}$

$n$	small	large
Second-level test	Accurate	Erroneous
Distribution of $\frac{2}{n}|S_{j}(X)|^{2}$	Erroneous	Accurate

Equations (103)

$S_{j}(X)$

$c_{j}(X)$

$s_{j}(X)$

$|S_{j}(X)|^{2}=(c_{j}(X))^{2}+(s_{j}(X))^{2}.$

$P(|S_{j}(X)|<T_{0.95})$

$\therefore T_{0.95}$

$N_{1}=\#\left\{|S_{j}(X)|\ \middle|\ |S_{j}(X)|<T_{0.95},\ 0\leq j\leq\frac{n}{2}-1\right\}.$

$N_{1}\sim\mathcal{N}\left(0.95\frac{n}{2},(0.95)(0.05)\frac{n}{2}\right).$

$d=\frac{N_{1}-0.95\frac{n}{2}}{\sqrt{(0.95)(0.05)\frac{n}{2}}}.$

$1-\alpha-3\sqrt{\frac{\alpha(1-\alpha)}{m}}<\frac{m_{p}}{m}<1-\alpha+3\sqrt{\frac{\alpha(1-\alpha)}{m}}.$

$\chi^{2}$

$P_{T}={\rm igamc}\left(\frac{9}{2},\frac{\chi^{2}}{2}\right),$

$P_{T}\geq\alpha_{II}\ (:=0.0001),$

$N_{1}$

$d_{{\rm kim}}$

$N_{1}$

$d_{{\rm pareschi}}$

$\frac{2}{n}|S_{j}(X)|^{2}$

$\frac{2}{n}|S_{0}(X)|^{2}=2\left(\frac{\sum_{k=0}^{n-1}x_{k}}{\sqrt{n}}\right)^{2}.$

$\phi(t)$

$\therefore\log\phi(t)$

$E_{X}(\cdot):=\frac{1}{2^{n}}\sum_{X\in\{-1,1\}^{n}}(\cdot),$

$E_{x_{k}}(\cdot):=\frac{1}{2}\sum_{x_{k}\in\{-1,1\}}(\cdot).$

$\log\cos\left(\sqrt{\frac{2}{n}}ta_{k,j}\right)=-\frac{1}{n}a_{k,j}^{2}t^{2}-\frac{1}{3n^{2}}a_{k,j}^{4}t^{4}+O(t^{6}).$

$\therefore\log\phi(t)=-\frac{1}{n}\sum_{k=0}^{n-1}a_{k,j}^{2}t^{2}-\frac{1}{3n^{2}}\sum_{k=0}^{n-1}a_{k,j}^{4}t^{4}+O(t^{6}).$

$\sum_{k=0}^{n-1}a_{k,j}^{2}=\frac{n}{2},\quad\sum_{k=0}^{n-1}a_{k,j}^{2l}\leq n\ \ (l\in\{1,2,3,\dots\}),$

$\lim_{n\to\infty}\log\phi(t)=-\frac{1}{2}t^{2}.\quad\therefore\lim_{n\to\infty}\phi(t)=e^{-\frac{1}{2}t^{2}}.$

$Y$

$\psi(\mbox{\boldmath$t$})$

$\mbox{\boldmath$t$}=(t_{1},t_{2}),\,a_{k,j}=\cos\frac{2\pi kj}{n},\,b_{k,j}=\sin\frac{2\pi kj}{n}.$

$\log\psi(\mbox{\boldmath$t$})=\sum_{k=0}^{n-1}\log\cos\left(\sqrt{\frac{2}{n}}(a_{k,j}t_{1}+b_{k,j}t_{2})\right).$

$\log\cos\left(\sqrt{\frac{2}{n}}(a_{k,j}t_{1}+b_{k,j}t_{2})\right)$

$\sum_{k=0}^{n-1}a_{k,j}^{2}=\sum_{k=0}^{n-1}b_{k,j}^{2}=\frac{n}{2},\quad\sum_{k=0}^{n-1}a_{k,j}b_{k,j}=0,\quad\sum_{k=0}^{n-1}a_{k,j}^{l}b_{k,j}^{m}\leq n\ \ (l,m\geq 0),$

Full text

Abstract

In this paper, we study the problems in the discrete Fourier transform (DFT) test included in NIST SP 800-22 released by the National Institute of Standards and Technology (NIST), which is a collection of tests for evaluating both physical and pseudo-random number generators for cryptographic applications. The most crucial problem in the DFT test is that its reference distribution of the test statistic is not derived mathematically but rather numerically estimated; the DFT test for randomness is based on a pseudo-random number generator (PRNG). Therefore, the present DFT test should not be used unless the reference distribution is mathematically derived. Here, we prove that a power spectrum, which is a component of the test statistic, follows a chi-squared distribution with 2 degrees of freedom. Based on this fact, we propose a test whose reference distribution of the test statistic is mathematically derived. Furthermore, the results of testing non-random sequences and several PRNGs showed that the proposed test is more reliable and definitely more sensitive than the present DFT test.

H. Okada and K. Umeno are with the Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Kyoto, JAPAN.

Keywords: Computer security, random sequences, statistical analysis

1 Introduction

Random numbers are used in many types of applications, such as cryptography, numerical simulations, and so on. However, it is not easy to generate “truly” random number sequences. Pseudo-random number generators (PRNGs) generate the sequences by iterating some recurrence relation; therefore, the sequences are theoretically not “truly” random. The binary “truly” random sequence is defined as the sequence in which each element has a probability of exactly $\frac{1}{2}$ of being “0” or “1” and in which the elements are statistically independent of each other. It is also difficult to ascertain if the sequence is truly random; therefore, the randomness of the sequences is evaluated statistically.

NIST SP 800-22 [1, 2] is one of the famous statistical test suites for randomness that was used for selecting the Advanced Encryption Standard (AES) algorithm. NIST SP 800-22 consists of fifteen tests, and every test is hypothesis testing, where the hypothesis is that the input sequence is truly random; if the hypothesis is not rejected in all the tests, it is implied that the input sequences are random. Among the tests included in NIST SP 800-22, the DFT test is of the greatest concern to us. This test detects periodic features of a random number sequence; input sequences are discrete Fourier transformed, and the test statistic is composed of the Fourier coefficients. In 2003, Kim et al. [3, 4] reported that the DFT test and the Lempel-Ziv test in the original NIST SP 800-22 [1] have crucial theoretical problems. Regarding the DFT test, it is reported that the test statistic does not follow the expected reference distribution because of the problem that the DFT test regards Fourier coefficients as independent stochastic variables although they are not. Kim et al. numerically estimated the distribution of the test statistic with pseudo-random numbers generated with a PRNG and proposed a new DFT test with the estimated distribution. In 2005, Hamano [5] theoretically scrutinized the distribution of the Fourier coefficients in the original DFT test. However, he could not derive the theoretical distribution of the test statistic, but he did make the problems in the DFT test clearer. In 2005, because of these reports, in NIST SP 800-22 version 1.7, the Lempel-Ziv test was deleted, and the DFT test was revised according to the report of Kim et al. The DFT test has not subsequently been revised. In 2012, Pareschi et al. [6] reviewed three tests included in NIST SP 800-22, and they also numerically estimated the distribution of the test statistic. Consequently, they reported that the distribution estimated by Kim et al. is not sufficiently accurate. 
As stated above, several researchers have attempted to revise the DFT test. However, the distribution of the test statistic has still not been derived theoretically but rather numerically estimated.

In this paper, we review the problems in the DFT test, and we prove three facts, which are important for analyzing the reference distribution of the test statistic: Under the assumption that the input sequence is an ideal random number sequence, when $j\neq 0$ ,

•

The asymptotic distributions of both $\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$ are the standard normal distribution ( $\mathcal{N}(0,1)$ ) when $n\to\infty$ .

•

When $n$ is sufficiently large, $\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$ are statistically independent of each other.

•

The asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ is a chi-squared distribution with 2 degrees of freedom $(\chi_{2}^{2})$ when $n\to\infty$ .

Here, $X$ is an $n$ -bit binary sequence, $S_{j}(X)$ is the $j$ -th discrete Fourier coefficient of $X$ , and $c_{j}(X)$ and $s_{j}(X)$ are the real and imaginary parts of $S_{j}(X)$ ; all three are defined in Section 2. There is no information about these facts in NIST SP800-22, and, to the best of our knowledge, no researchers who have studied the DFT test have ever provided rigorous proofs. These facts are necessary for analyzing the reference distribution of the test statistic. Furthermore, we propose a new DFT test based on the fact that $\chi_{2}^{2}$ is the asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ . By comparing the results of several PRNGs, we show that our test is more reliable and definitely more sensitive than the present DFT test.

2 Discrete Fourier Transform Test

In this section, we explain the procedure of the original DFT test ( ${\rm DFTT}_{{\rm original}}$ ), released in 2001 [1], before the revision in 2005 [2]. We also explain the problems reported by several researchers [4, 5]. The focus of this test is the peak heights in the discrete Fourier transform of the sequence. The purpose of this test is to detect periodic features in the tested sequence that would indicate a deviation from the assumption of randomness. The intention is to detect whether the number of peaks exceeding the 95 % threshold is significantly different than 5 %.

2.1 The procedure of the original DFT test

1) The zeros and ones of the input sequence $E=\{\epsilon_{0},\cdots,\epsilon_{n-1}\}$ are converted to values of $-1$ and $+1$ to create the sequence $X=\{x_{0},\cdots,x_{n-1}\}$ , where $x_{i}=2\epsilon_{i}-1\ \ (i\in\{0,\dots,n-1\})$ . For simplicity, let $n$ be even.

2) Apply a discrete Fourier transform (DFT) to $X$ to produce Fourier coefficients $\{S_{j}(X)\}_{j=0}^{n-1}$ . The Fourier coefficient $S_{j}(X)$ and its real and imaginary parts $c_{j}(X)$ and $s_{j}(X)$ are defined as follows:

[TABLE]

3) Compute $\{|S_{j}(X)|\}_{j=0}^{\frac{n}{2}-1}$ , where

$|S_{j}(X)|^{2}=(c_{j}(X))^{2}+(s_{j}(X))^{2}.$

Because $|S_{j}(X)|=|\overline{S_{n-j}(X)}|$ , $\{|S_{j}(X)|\}_{j=\frac{n}{2}}^{n-1}$ are discarded.

4) Compute a threshold value $T_{0.95}=\sqrt{3n}$ . 95% of the values $\{|S_{j}(X)|\}_{j=0}^{\frac{n}{2}-1}$ are supposed to be $<T_{0.95}$ .

According to SP800-22, $\frac{2}{n}|S_{j}(X)|^{2}$ is considered to follow $\chi_{2}^{2}$ , and $T_{0.95}$ is defined by the following equation.

[TABLE]

Several researchers [4, 5] reported that this $T_{0.95}=\sqrt{3n}$ was incorrect, and it was accordingly revised as $T_{0.95}=\sqrt{-n\ln(0.05)}$ in the DFT test in the revised NIST SP800-22 [2].

5) Count

$N_{1}=\#\left\{|S_{j}(X)|\ \middle|\ |S_{j}(X)|<T_{0.95},\ 0\leq j\leq\frac{n}{2}-1\right\}.$

If $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=0}$ are mutually independent, then under the assumption of randomness, $N_{1}$ can be considered to follow $\mathcal{B}(\frac{n}{2},0.95)$ , where $\mathcal{B}$ is the binomial distribution.

According to the central limit theorem, when $n$ is sufficiently large, the approximation to $\mathcal{B}(n,p)$ is given by the normal distribution $\mathcal{N}(np,\,np(1-p))$ . Therefore, when $n$ is sufficiently large, under the assumption of randomness,

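$N_{1}\sim\mathcal{N}\left(0.95\frac{n}{2},(0.95)(0.05)\frac{n}{2}\right).$

6) Compute a test statistic

$d=\frac{N_{1}-0.95\frac{n}{2}}{\sqrt{(0.95)(0.05)\frac{n}{2}}}.$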

When $n$ is sufficiently large, under the assumption of randomness, the test statistic $d$ can be considered to follow $\mathcal{N}(0,1)$ .

7) Compute the $P$ - $value$ : $p={\rm erfc}\left(\frac{|d|}{\sqrt{2}}\right)$ .

If $p<\alpha$ , then conclude that the sequence is non-random, where $\alpha$ is a significance level of the DFT test. NIST recommends $\alpha=0.01$ [2]. Therefore, we also define $\alpha=0.01$ . If $p\geq\alpha$ , conclude that the sequence is random.

8) Perform 1) to 7) for $m$ sample sequences $\{X_{1},X_{2},\dots,X_{m}\}$ ; $m$ $P$ - $value$ s $\{p_{1},p_{2},\dots,p_{m}\}$ are computed.

9) (Second-level test I: Proportion of sequences passing a test)

Count the number of sample sequences for which $P$ - $value$ $\geq\alpha$ and define it as $m_{p}$ . Then, under the assumption of randomness, $m_{p}$ follows $\mathcal{B}(m,1-\alpha)$ , which approximates $\mathcal{N}(m(1-\alpha),m\alpha(1-\alpha))$ when $m$ is sufficiently large. Therefore, the proportion of sequences passing a test ( $=m_{p}/m$ ) approximately follows $\mathcal{N}\left((1-\alpha),\frac{\alpha(1-\alpha)}{m}\right)$ . The range of acceptable $m_{p}/m$ is determined using the significance interval defined as

$1-\alpha-3\sqrt{\frac{\alpha(1-\alpha)}{m}}<\frac{m_{p}}{m}<1-\alpha+3\sqrt{\frac{\alpha(1-\alpha)}{m}}.$

If the proportion falls outside of this interval, there is evidence that the data are non-random.

10) (Second-level test II: Uniform distribution of $P$ - $value$ s)

Uniformity may also be determined by applying a $\chi^{2}$ test and determining a $P$ - $value$ corresponding to the goodness-of-fit distributional test on the $P$ - $value$ s obtained for an arbitrary statistical test (i.e., the $P$ - $value$ of the $P$ - $value$ s). This is performed by computing

[TABLE]

where $F_{i}$ is the number of $P$ - $value$ s in sub-interval $i$ . A $P$ - $value$ $P_{T}$ is calculated such that

$P_{T}={\rm igamc}\left(\frac{9}{2},\frac{\chi^{2}}{2}\right),$

where igamc is the complementary incomplete gamma function. If

$P_{T}\geq\alpha_{II}\ (:=0.0001),$

the sequences can be considered to be uniformly distributed, where $\alpha_{II}$ is the significance level for $P_{T}$ .

11) If the set of $P$ - $value$ s $\{p_{1},p_{2},\dots,p_{m}\}$ passes both 9) and 10), the physical or pseudo-random number generators that generated the input sequences are concluded to be ideal.
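The first-level part of this procedure (steps 1 to 7) can be sketched in a few lines of Python. This is a minimal illustration, not the NIST reference implementation: the function names `dft_magnitudes` and `dft_test` and the `revised_threshold` flag are hypothetical, the DFT is a direct $O(n^{2})$ sum so that the block needs only the standard library, and the sign of the exponent is a convention that does not affect $|S_{j}(X)|$ .

```python
import cmath
import math


def dft_magnitudes(x):
    """Return |S_j(X)| for j = 0 .. n/2 - 1 using a direct O(n^2) DFT."""
    n = len(x)
    magnitudes = []
    for freq in range(n // 2):
        s = sum(xk * cmath.exp(-2j * math.pi * k * freq / n)
                for k, xk in enumerate(x))
        magnitudes.append(abs(s))
    return magnitudes


def dft_test(bits, revised_threshold=False):
    """Return (N_1, d, P-value) for one even-length sequence of 0/1 bits."""
    n = len(bits)
    x = [2 * b - 1 for b in bits]                       # step 1: map to -1/+1
    magnitudes = dft_magnitudes(x)                      # steps 2 and 3
    if revised_threshold:                               # step 4 (revised form)
        t95 = math.sqrt(-n * math.log(0.05))
    else:                                               # step 4 (original form)
        t95 = math.sqrt(3 * n)
    n1 = sum(1 for v in magnitudes if v < t95)          # step 5
    d = (n1 - 0.95 * n / 2) / math.sqrt(0.95 * 0.05 * n / 2)   # step 6
    p_value = math.erfc(abs(d) / math.sqrt(2))          # step 7
    return n1, d, p_value


# 0101...: the only spectral peak sits at j = n/2, which step 3 discards,
# so every retained magnitude is below the threshold and N_1 = n/2.
bits = [k % 2 for k in range(64)]
n1, d, p = dft_test(bits)
print(n1, round(d, 3), p >= 0.01)
```

For long sequences an FFT routine would replace the direct sum; the rest of the steps are unchanged. The alternating example also hints at why a single first-level $P$ - $value$ says little for short inputs: a strongly periodic sequence can still yield $p\geq\alpha$ .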

2.2 The fundamental problems of the original and present DFT tests

Kim et al. [4] and Hamano [5] reported the following:

•

The test statistic $d:=\frac{N_{1}-0.95\frac{n}{2}}{\sqrt{(0.95)(0.05)\frac{n}{2}}}$ does not follow $\mathcal{N}(0,1)$ ;

•

$N_{1}$ does not follow $\mathcal{N}\left(0.95\frac{n}{2},(0.95)(0.05)\frac{n}{2}\right)$ .

Furthermore, Kim et al., using Secure Hash Generator (G-SHA1) [2] as a PRNG, estimated that

[TABLE]

and ${\rm DFTT}_{{\rm original}}$ was revised according to this report of Kim et al. [2]; the present DFT test, denoted as ${\rm DFTT}_{{\rm present}}$ , has not been revised since then. Therefore, the reference distribution of the test statistic of ${\rm DFTT}_{{\rm present}}$ is not mathematically derived. Furthermore, Pareschi et al. reported that the numerical estimation is not sufficiently accurate; they numerically estimated that

[TABLE]

Moreover, Pareschi et al. proposed that the DFT test with this test statistic ( ${\rm DFTT}_{{\rm pareschi}}$ ) is more reliable. (The definition of the reliability of a test is discussed in Section 5.) Therefore, it can be considered that ${\rm DFTT}_{{\rm present}}$ still has errors. More fundamentally, the reference distributions of both ${\rm DFTT}_{{\rm present}}$ and ${\rm DFTT}_{{\rm pareschi}}$ are estimated with a PRNG, whose randomness should itself be evaluated with a randomness test; these tests should not be used unless the reference distribution is mathematically derived.

As stated in step 5) in Section 2.1, $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=0}$ are considered to be mutually independent. However, $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=0}$ are not mutually independent, and this problem is expected to be the main factor for why $N_{1}$ does not follow $\mathcal{N}\left(0.95\frac{n}{2},(0.95)(0.05)\frac{n}{2}\right)$ [4, 5]. Furthermore, before considering this problem, it is also necessary to ensure that $\frac{2}{n}|S_{j}(X)|^{2}$ follows $\chi_{2}^{2}$ . Although $\frac{2}{n}|S_{j}(X)|^{2}$ is considered to follow $\chi_{2}^{2}$ in step 4) in Section 2.1, there is no information about this in SP800-22, and no researchers studying the DFT test have ever provided rigorous proofs to the best of our knowledge. We provide a proof for the DFT test in Section 3.
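To make the role of the $\chi_{2}^{2}$ assumption in step 4) concrete, the following sketch evaluates $P(|S_{j}(X)|<T)$ for the original and the revised thresholds, taking for granted that $\frac{2}{n}|S_{j}(X)|^{2}$ follows $\chi_{2}^{2}$ , whose cumulative distribution function is $1-e^{-y/2}$ . The helper names `chi2_2_cdf` and `coverage` and the value $n=1000$ are illustrative assumptions.

```python
import math


def chi2_2_cdf(y):
    """CDF of the chi-squared distribution with 2 degrees of freedom."""
    return 1.0 - math.exp(-y / 2.0)


def coverage(threshold, n):
    """P(|S_j| < threshold), assuming (2/n)|S_j|^2 follows chi-squared(2)."""
    return chi2_2_cdf(2.0 * threshold ** 2 / n)


n = 1000
original = coverage(math.sqrt(3 * n), n)                # equals 1 - exp(-3)
revised = coverage(math.sqrt(-n * math.log(0.05)), n)   # equals 0.95
print(f"{original:.6f} {revised:.6f}")                  # 0.950213 0.950000
```

Under this assumption the original threshold $\sqrt{3n}$ covers slightly more than 95% of the values, which is the discrepancy that the revised threshold $\sqrt{-n\ln(0.05)}$ removes.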

3 The asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$

In this section, we analyze the asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ . From the definition of $|S_{j}(X)|$ in (1),

[TABLE]

When $j=0$ ,

$\frac{2}{n}|S_{0}(X)|^{2}=2\left(\frac{\sum_{k=0}^{n-1}x_{k}}{\sqrt{n}}\right)^{2}.$

Under the assumption that $X$ is an ideal random number sequence, $P(x_{k}=-1)=P(x_{k}=1)=\frac{1}{2}$ and $\{x_{k}\}_{k=0}^{n-1}$ are mutually independent, and $E[x_{k}]=0,V[x_{k}]=1$ . Therefore, as a consequence of the central limit theorem, when $n$ is sufficiently large, $\left(\frac{\sum_{k=0}^{n-1}x_{k}}{\sqrt{n}}\right)$ follows $\mathcal{N}(0,1)$ , and $\left(\frac{\sum_{k=0}^{n-1}x_{k}}{\sqrt{n}}\right)^{2}$ follows a chi-squared distribution with 1 degree of freedom $(\chi_{1}^{2})$ . Thus, $\frac{2}{n}|S_{0}(X)|^{2}$ does not follow $\chi_{2}^{2}$ .
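The $j=0$ case can be checked exactly for finite $n$ , because $\sum_{k}x_{k}=2B-n$ with $B$ binomially distributed. The sketch below (the function name `exact_moments_j0` is hypothetical) computes the exact mean and variance of $\frac{2}{n}|S_{0}(X)|^{2}$ . The mean is 2, the same as for $\chi_{2}^{2}$ , but the variance is $8-\frac{8}{n}$ , which approaches 8 (twice a $\chi_{1}^{2}$ variable) rather than the variance 4 of $\chi_{2}^{2}$ .

```python
import math


def exact_moments_j0(n):
    """Exact mean and variance of (2/n)|S_0(X)|^2 for n fair +/-1 bits."""
    total = 2 ** n
    mean = 0.0
    second = 0.0
    for ones in range(n + 1):
        prob = math.comb(n, ones) / total           # P(B = ones), B ~ Bin(n, 1/2)
        value = 2.0 / n * (2 * ones - n) ** 2       # (2/n) * (sum of x_k)^2
        mean += prob * value
        second += prob * value ** 2
    return mean, second - mean ** 2


mean, variance = exact_moments_j0(100)
print(round(mean, 6), round(variance, 6))
```

Comparing the variance with 4 is a quick numerical way to see that the $j=0$ term has to be excluded when $\chi_{2}^{2}$ is used as the reference distribution.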

In the following, we consider the case when $j\neq 0$ . Here, $\frac{2}{n}|S_{j}(X)|^{2}$ follows $\chi_{2}^{2}$ if the following is true:

•

Both $\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$ follow $\mathcal{N}(0,1)$ .

•

$\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$ are mutually independent.

In the following 2 subsections, we prove the following Theorem 1, Theorem 2 and Theorem 3:

[TABLE]

From the definition of $\chi_{2}^{2}$ , Theorem 3 can be proven by combining Theorem 1 and Theorem 2.

3.1 Proof of Theorem 1: The asymptotic distribution of $\sqrt{\frac{2}{n}}c_{j}(X)$

In this subsection, we prove Theorem 1. Hamano [5] showed that the average, variance, skewness, and kurtosis of $c_{j}(X)$ and $\mathcal{N}(0,\frac{n}{2})$ are the same. However, it cannot be proven that $\mathcal{N}(0,\frac{n}{2})$ is the asymptotic distribution of $c_{j}(X)$ based only on these four moments.

$\sqrt{\frac{2}{n}}c_{j}(X)$ is expressed as $\sqrt{\frac{2}{n}}c_{j}(X):=\sqrt{\frac{2}{n}}\sum_{k=0}^{n-1}x_{k}a_{k,j}$ , where $a_{k,j}=\cos\frac{2\pi kj}{n}$ . Under the assumption that $X$ is an ideal random number sequence, the characteristic function of $\sqrt{\frac{2}{n}}c_{j}(X)$ denoted by $\phi(t)$ is expressed as follows:

[TABLE]

where

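$E_{X}(\cdot):=\frac{1}{2^{n}}\sum_{X\in\{-1,1\}^{n}}(\cdot),$

$E_{x_{k}}(\cdot):=\frac{1}{2}\sum_{x_{k}\in\{-1,1\}}(\cdot).$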

Using the Taylor expansion about a point $t=0$ , we obtain

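$\log\cos\left(\sqrt{\frac{2}{n}}ta_{k,j}\right)=-\frac{1}{n}a_{k,j}^{2}t^{2}-\frac{1}{3n^{2}}a_{k,j}^{4}t^{4}+O(t^{6}).$

$\therefore\log\phi(t)=-\frac{1}{n}\sum_{k=0}^{n-1}a_{k,j}^{2}t^{2}-\frac{1}{3n^{2}}\sum_{k=0}^{n-1}a_{k,j}^{4}t^{4}+O(t^{6}).$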

Since

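$\sum_{k=0}^{n-1}a_{k,j}^{2}=\frac{n}{2},\quad\sum_{k=0}^{n-1}a_{k,j}^{2l}\leq n\ \ (l\in\{1,2,3,\dots\}),$

$\lim_{n\to\infty}\log\phi(t)=-\frac{1}{2}t^{2}.\quad\therefore\lim_{n\to\infty}\phi(t)=e^{-\frac{1}{2}t^{2}}.$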

Thus, $\mathcal{N}(0,1)$ is the asymptotic distribution of $\sqrt{\frac{2}{n}}c_{j}(X)$ . Likewise, it can be proven that $\mathcal{N}(0,1)$ is the asymptotic distribution of $\sqrt{\frac{2}{n}}s_{j}(X)$ .
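The limit in this proof can be observed numerically. Since the $x_{k}$ are independent and take the values $\pm 1$ with probability $\frac{1}{2}$ , the characteristic function is the product $\phi(t)=\prod_{k=0}^{n-1}\cos\left(\sqrt{\frac{2}{n}}ta_{k,j}\right)$ , which is the form whose logarithm is expanded above. The sketch below evaluates this product and compares it with $e^{-t^{2}/2}$ ; the function name `phi` and the choices $t=1$ , $j=5$ , $n\in\{64,4096\}$ are illustrative assumptions.

```python
import math


def phi(t, n, j):
    """prod_k cos(sqrt(2/n) * t * a_kj), with a_kj = cos(2*pi*k*j/n)."""
    scale = math.sqrt(2.0 / n)
    product = 1.0
    for k in range(n):
        a = math.cos(2.0 * math.pi * k * j / n)
        product *= math.cos(scale * t * a)
    return product


limit = math.exp(-0.5)                      # e^{-t^2/2} at t = 1
err_small = abs(phi(1.0, 64, 5) - limit)    # finite-n error for n = 64
err_large = abs(phi(1.0, 4096, 5) - limit)  # finite-n error for n = 4096
print(err_small > err_large)
```

The error shrinks as $n$ grows, in line with the $t^{4}$ term of the expansion carrying a factor of order $\frac{1}{n}$ .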

3.2 Proof of Theorem 2: Statistical independence of $\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$

In this subsection, we prove Theorem 2. Let us define a 2-dimensional stochastic variable $Y$ as the following equation:

[TABLE]

Under the assumption that $X$ is an ideal random number sequence, the characteristic function of $Y$ denoted by $\psi(\mbox{\boldmath$ t $})$ is expressed as follows:

[TABLE]

where

$\mbox{\boldmath$t$}=(t_{1},t_{2}),\,a_{k,j}=\cos\frac{2\pi kj}{n},\,b_{k,j}=\sin\frac{2\pi kj}{n}.$

Therefore,

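$\log\psi(\mbox{\boldmath$t$})=\sum_{k=0}^{n-1}\log\cos\left(\sqrt{\frac{2}{n}}(a_{k,j}t_{1}+b_{k,j}t_{2})\right).$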

Using the Taylor expansion about a point $\mbox{\boldmath$ t $}=\mbox{\boldmath$ 0 $}$ , we obtain

[TABLE]

Since

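$\sum_{k=0}^{n-1}a_{k,j}^{2}=\sum_{k=0}^{n-1}b_{k,j}^{2}=\frac{n}{2},\quad\sum_{k=0}^{n-1}a_{k,j}b_{k,j}=0,\quad\sum_{k=0}^{n-1}a_{k,j}^{l}b_{k,j}^{m}\leq n\ \ (l,m\geq 0),$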

we obtain

[TABLE]

Therefore, when $n$ is sufficiently large, the joint probability distribution function is described as follows:

[TABLE]

As we proved before, $\mathcal{N}(0,1)$ is the asymptotic distribution of both $Y_{1}$ and $Y_{2}$ . Thus, when $n$ is sufficiently large, the probability distribution functions of $Y_{1}$ and $Y_{2}$ are $f_{Y_{1}}(y_{1})=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y_{1}^{2}}{2}\right)$ and $f_{Y_{2}}(y_{2})=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y_{2}^{2}}{2}\right)$ , respectively. Therefore, when $n$ is sufficiently large, the following equation is obtained:

[TABLE]

This means that $\sqrt{\frac{2}{n}}c_{j}(X)$ and $\sqrt{\frac{2}{n}}s_{j}(X)$ are mutually independent when $n$ is sufficiently large.
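A numerical illustration of this factorization: the joint characteristic function is the product $\psi(\mbox{\boldmath$t$})=\prod_{k=0}^{n-1}\cos\left(\sqrt{\frac{2}{n}}(a_{k,j}t_{1}+b_{k,j}t_{2})\right)$ , and for large $n$ it should approach $e^{-t_{1}^{2}/2}e^{-t_{2}^{2}/2}$ , the product of the two marginal limits. The sketch below also checks the trigonometric sums used in the proof. The function names `trig_sums` and `psi` and the test point $(t_{1},t_{2})=(0.7,-1.2)$ are illustrative assumptions.

```python
import math


def trig_sums(n, j):
    """Return (sum a^2, sum b^2, sum a*b) for a = cos, b = sin of 2*pi*k*j/n."""
    angles = [2.0 * math.pi * k * j / n for k in range(n)]
    saa = sum(math.cos(w) ** 2 for w in angles)
    sbb = sum(math.sin(w) ** 2 for w in angles)
    sab = sum(math.cos(w) * math.sin(w) for w in angles)
    return saa, sbb, sab


def psi(t1, t2, n, j):
    """Joint characteristic function of (sqrt(2/n) c_j, sqrt(2/n) s_j)."""
    scale = math.sqrt(2.0 / n)
    product = 1.0
    for k in range(n):
        w = 2.0 * math.pi * k * j / n
        product *= math.cos(scale * (math.cos(w) * t1 + math.sin(w) * t2))
    return product


t1, t2 = 0.7, -1.2
factorized = math.exp(-t1 ** 2 / 2.0) * math.exp(-t2 ** 2 / 2.0)
gap = abs(psi(t1, t2, 4096, 5) - factorized)
print(trig_sums(64, 5), gap < 1e-3)
```

The vanishing cross sum $\sum_{k}a_{k,j}b_{k,j}=0$ is what removes the $t_{1}t_{2}$ term at second order, so the limit splits into a function of $t_{1}$ times a function of $t_{2}$ .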

4 The proposed DFT test

In Section 3, we proved Theorem 3, stating that $\frac{2}{n}|S_{j}(X)|^{2}(j\neq 0)$ follows $\chi_{2}^{2}$ when $n$ is sufficiently large. Therefore, if $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=1}$ are mutually independent, we can consider that $N_{1}$ follows $\mathcal{N}\left(0.95\frac{n}{2},(0.95)(0.05)\frac{n}{2}\right)$ . However, $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=1}$ are not mutually independent. Therefore, it is necessary to mathematically analyze the distribution of the test statistic $d$ under the condition that $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=1}$ are not mutually independent. Hamano [5] attempted to mathematically derive the distribution of the set $\{|S_{j}(X)|\}^{\frac{n}{2}-1}_{j=1}$ , but he could not do so, and we also could not derive this distribution. However, we rigorously proved that the asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ is $\chi_{2}^{2}$ , and we develop the new DFT test ( ${\rm DFTT}_{{\rm proposed}}$ ) based on this fact. The reference distribution of the test statistic of ${\rm DFTT}_{{\rm proposed}}$ is mathematically derived, whereas that of ${\rm DFTT}_{{\rm present}}$ is estimated with a PRNG. We explain the test statistic of ${\rm DFTT}_{{\rm proposed}}$ in the next subsection.
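One elementary way to see that the Fourier coefficients cannot be mutually independent is Parseval's identity: for every $\pm 1$ sequence, $\sum_{j=0}^{n-1}|S_{j}(X)|^{2}=n\sum_{k=0}^{n-1}x_{k}^{2}=n^{2}$ , so the power spectrum obeys a deterministic constraint. This is a standard fact offered here as an illustration; it is not the argument used in [4, 5]. The function name `power_spectrum` and the seed are assumptions made for the sketch.

```python
import cmath
import math
import random


def power_spectrum(x):
    """Return |S_j(X)|^2 for j = 0 .. n - 1 using a direct O(n^2) DFT."""
    n = len(x)
    return [abs(sum(xk * cmath.exp(-2j * math.pi * k * freq / n)
                    for k, xk in enumerate(x))) ** 2
            for freq in range(n)]


rng = random.Random(0)
x = [rng.choice((-1, 1)) for _ in range(64)]
total = sum(power_spectrum(x))
print(abs(total - 64 ** 2) < 1e-6)    # the total power is n^2 for any +/-1 input
```

Because the total is fixed, an unusually large $|S_{j}(X)|$ at one frequency forces smaller values elsewhere, which is one source of the dependence that the normal approximation for $N_{1}$ ignores.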

4.1 The procedure of the proposed DFT test

In the standard approach in NIST SP800-22, each sequence is analyzed separately; thus, $m$ sequences give $m$ $P$ - $value$ s. In contrast, ${\rm DFTT}_{{\rm proposed}}$ generates $\frac{n}{2}-1$ ( $n$ : length of a sequence) $P$ - $value$ s, which is more than the standard approach produces because $n$ is generally larger than $m$ . Since the number of $P$ - $value$ s should not be too large (see Section 5.3), before conducting ${\rm DFTT}_{{\rm proposed}}$ it is necessary to shorten the sequences and split them into a larger number of short sequences (see also Table 5), assuming that the input sequences are generated continuously by an RNG. Consequently, ${\rm DFTT}_{{\rm proposed}}$ is theoretically not appropriate for an isolated set of sequences.

The procedure of the proposed DFT test is described as follows:

1) The zeros and ones of the $m$ input sequences of length $n$ , $\{E_{i}=\{\epsilon_{0}^{i},\cdots,\epsilon_{n-1}^{i}\}\}_{i=1}^{m}$ , are converted to values of $-1$ and $+1$ to create the sequences $\{X^{i}=\{x^{i}_{0},\cdots,x^{i}_{n-1}\}\}_{i=1}^{m}$ , where $x^{i}_{j}=2\epsilon^{i}_{j}-1\ \ (j\in\{0,\dots,n-1\})$ . For simplicity, let $n$ be even.

2) Apply a discrete Fourier transform (DFT) to each $X^{i}$ to produce Fourier coefficients $\{S_{j}(X^{i})\}_{j=0}^{n-1}$ . The Fourier coefficient $S_{j}(X^{i})$ and its real and imaginary parts $c_{j}(X^{i})$ and $s_{j}(X^{i})$ are defined as follows:

[TABLE]

3) For all $j\in\{1,\dots,\frac{n}{2}-1\}$ , perform the Kolmogorov-Smirnov (KS) test [8, 9] on the empirical cumulative distribution function of $\{\frac{2}{n}|S_{j}(X^{i})|^{2}\}_{i=1}^{m}$ , denoted by $F_{m}^{j}(y)$ , against $\chi_{2}^{2}$ and compute the $P$ - $value$ $p_{j}$ . Here, the KS statistic $D_{m}^{j}$ and $p_{j}$ are defined as follows.

[TABLE]

where $H(y)$ is the cumulative distribution function of the Kolmogorov-Smirnov distribution:

[TABLE]

Note that $\frac{n}{2}-1$ $P$ - $value$ s $\{p_{1},p_{2},\dots,p_{\frac{n}{2}-1}\}$ are computed in this step, while the ${\rm DFTT}_{{\rm present}}$ computes $m$ $P$ - $value$ s.

4) Perform the second-level tests I and II defined in the original DFT test (see steps 9) and 10) in Section 2.1). If the set of $P$ - $value$ s $\{p_{1},p_{2},\dots,p_{\frac{n}{2}-1}\}$ passes both second-level tests I and II, the physical or pseudo-random number generator that generated the input sequences is concluded to be ideal.
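Steps 1) to 3) of the proposed procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the KS statistic is the usual two-sided supremum distance, the $P$ - $value$ uses the textbook asymptotic Kolmogorov series $1-H(z)=2\sum_{k\geq 1}(-1)^{k-1}e^{-2k^{2}z^{2}}$ with $z=\sqrt{m}D_{m}^{j}$ (truncated at 100 terms), and the names `scaled_power`, `ks_p_value` and `proposed_p_values` are hypothetical. The exact definitions used in the paper are in the omitted equations and may differ in detail.

```python
import cmath
import math
import random


def scaled_power(x, freq):
    """Return (2/n)|S_freq(X)|^2 for one +/-1 sequence x."""
    n = len(x)
    s = sum(xk * cmath.exp(-2j * math.pi * k * freq / n)
            for k, xk in enumerate(x))
    return 2.0 / n * abs(s) ** 2


def ks_p_value(samples):
    """Asymptotic KS P-value of samples against chi-squared with 2 d.o.f."""
    m = len(samples)
    distance = 0.0
    for i, y in enumerate(sorted(samples)):
        cdf = 1.0 - math.exp(-y / 2.0)                  # chi-squared(2) CDF
        distance = max(distance, (i + 1) / m - cdf, cdf - i / m)
    z = math.sqrt(m) * distance
    # Kolmogorov tail series, truncated; adequate for moderate m.
    tail = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
                     for k in range(1, 101))
    return min(1.0, max(0.0, tail))


def proposed_p_values(sequences):
    """One P-value per frequency j = 1 .. n/2 - 1, pooled over all sequences."""
    n = len(sequences[0])
    return [ks_p_value([scaled_power(x, freq) for x in sequences])
            for freq in range(1, n // 2)]


rng = random.Random(1)
random_set = [[rng.choice((-1, 1)) for _ in range(16)] for _ in range(50)]
repeated_set = [random_set[0]] * 50         # the same sequence 50 times
print(len(proposed_p_values(random_set)),
      all(p < 0.01 for p in proposed_p_values(repeated_set)))
```

The design point is that each $P$ - $value$ pools one frequency across all $m$ sequences, so a set of identical sequences collapses every empirical distribution to a single point and is rejected at every frequency. The small $n=16$ is only for speed; the $\chi_{2}^{2}$ approximation is asymptotic in $n$ .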

5 Experiments

In this section, we explain the experiments that we performed and the conclusions derived from their results. In these experiments, we compare the reliability and sensitivity of ${\rm DFTT}_{{\rm present}}$ and ${\rm DFTT}_{{\rm proposed}}$ . The reliability of tests means a low probability of false positives (type I error) (see Table 1), and the sensitivity of tests means a low probability of false negatives (type II error). Now, the null hypothesis of the tests ( $\mathcal{H}_{0}$ ) is that the “generator is ideal”. Therefore, a false positive (type I error) means an erroneous identification of an ideal generator as not random, and a false negative (type II error) means an erroneous identification of a generator that is not ideal as random. Comparing the probability of type I error and type II error, we can conclude which test is better.

For simplicity, in this experiment, we modify the significance interval of the second-level test I defined in (4) as follows:

[TABLE]

With this modified significance interval, the significance level of the second-level test I ( $:=\alpha_{I}$ ) is modified to be $\alpha_{I}=0.01$ .
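For reference, the unmodified second-level tests of Section 2.1 (steps 9 and 10) can be sketched as below. The three-sigma interval follows the normal approximation given in step 9; the ten equal-width bins for the uniformity test are an assumption inferred from the 9 degrees of freedom in ${\rm igamc}(\frac{9}{2},\cdot)$ ; and `chi2_9_sf` evaluates that function through the closed form of the chi-squared tail for odd degrees of freedom. All function names are hypothetical, and the modified interval used in the experiments is not reproduced here.

```python
import math


def proportion_interval(m, alpha=0.01):
    """Three-sigma acceptance interval for the passing proportion m_p / m."""
    half_width = 3.0 * math.sqrt(alpha * (1.0 - alpha) / m)
    return 1.0 - alpha - half_width, 1.0 - alpha + half_width


def chi2_9_sf(x):
    """Upper tail of chi-squared with 9 d.o.f., i.e. igamc(9/2, x/2)."""
    if x <= 0.0:
        return 1.0
    root = math.sqrt(x)
    series = root + root ** 3 / 3.0 + root ** 5 / 15.0 + root ** 7 / 105.0
    return (math.erfc(root / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * series)


def uniformity_p_value(p_values, bins=10):
    """P_T of the chi-squared goodness-of-fit test on a list of P-values."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    expected = len(p_values) / bins
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    return chi2_9_sf(chi2)


low, high = proportion_interval(1000)
even = [(i + 0.5) / 100 for i in range(100)]     # 10 P-values in every bin
print(round(low, 6), round(high, 6), uniformity_p_value(even))
```

With $m=1000$ and $\alpha=0.01$ the interval is roughly $(0.9806,\,0.9994)$ . A set of $P$ - $value$ s passes second-level test II when `uniformity_p_value` returns at least $\alpha_{II}=0.0001$ .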

5.1 Experiment 1: Test results for periodic sequences

In this experiment, we compare the sensitivity of ${\rm DFTT}_{{\rm present}}$ , ${\rm DFTT}_{{\rm pareschi}}$ and ${\rm DFTT}_{{\rm proposed}}$ . Sensitivity means a low false negative rate (low probability of type II error), i.e., a high true positive rate. Here, we compare the true positive rate of each test.

[TABLE]

Now, we define an $nm$ -length input sequence $\mathcal{X}_{n,m}$ as

[TABLE]

where

[TABLE]

We purposely create non-random (periodic) sequences from the $mn$ -length sequence $\mathcal{X}_{n,m}$ using the method described as follows:

[TABLE]

Therefore,

[TABLE]

This sequence is clearly non-random. Therefore, if the test does not reject $\mathcal{H}_{0}$ (the null hypothesis “generator is ideal”), the result is a false negative (type II error).

For each $T\in\{100,101,102,\dots,120,130,140,150\}$ , we use $10$ sets of an $mn$ -length ( $nm=100,000,000$ ) input sequence $\mathcal{X}_{n,m}$ generated by the Mersenne Twister algorithm [10] and convert them to non-random $mn$ -length sequences $\mathcal{X}_{n,m}^{T}$ . Table 5 in Section 5.3 shows the parameters $n$ and $m$ for each test. In Section 5.3, we explain why the parameters $n$ and $m$ for ${\rm DFTT}_{{\rm proposed}}$ are different from the other tests. Note that $mn$ is the same. Table 2, Fig. 1 and Fig. 2 show the passing rate $R_{I(II)}$ , which is defined as follows:

[TABLE]

Because we know that $\mathcal{X}^{T}_{n,m}$ is non-random, we know that $\mathcal{H}_{0}$ =FALSE, and the passing rate means a false negative rate in this experiment. Now, the significance levels of second-level tests I and II are $\alpha_{I}\ (=0.01)$ and $\alpha_{II}\ (=0.0001)$ (defined in (5)), respectively. Therefore, the significance intervals of $R_{I}$ and $R_{II}$ , based on the modified interval introduced at the beginning of this section, are described as follows:

[TABLE]

Therefore, if $R_{I}<0.991$ or $R_{II}<0.9992$ , we can conclude that the true positive rate is high, that is, that the test is sensitive.

As shown in Table 2, Fig. 1 and Fig. 2, $R_{I}$ and $R_{II}$ of ${\rm DFTT}_{{\rm proposed}}$ are all $0.0\%$ , whereas $R_{I(II)}$ of ${\rm DFTT}_{{\rm present}}$ and ${\rm DFTT}_{{\rm pareschi}}$ are not as low. From this table and the figures, we can conclude that ${\rm DFTT}_{{\rm proposed}}$ is more sensitive than the other tests.

5.2 Experiment 2: Test results for existing pseudo-random number generators

We use $1000$ sets of an $mn$ -length ( $mn=100,000,000$ ) $\mathcal{X}_{n,m}$ input sequence generated by

• AES Counter Mode (AES-CTR) [11],
• Mersenne Twister [10],
• Xorshift random number generator [12],
• Vector Stream Cipher 2.0 (VSC 2.0) [13],
• Linear congruential generator (LCG) [2],
• Cubic congruential generator (CCG) [2],
• Quadratic congruential generator I (QCG-I) [2],
• Quadratic congruential generator II (QCG-II) [2],
• Micali-Schnorr random bit generator [2].

VSC 2.0 is a stream cipher based on chaos theory, which was proposed by A. Iwasaki and K. Umeno [13]. We test these PRNGs using both the DFT and MS-DFT tests, and we compare the results. The parameter sets of $n$ and $m$ are the same as those in Table 5 in Section 5.3.

Now, the significance levels of second-level tests I and II are $\alpha_{I}:=0.01$ and $\alpha_{II}:=0.0001$ , respectively, and in this experiment, 1000 $mn$ -length sequences generated by each PRNG are tested. Table 3, Fig. 3 and Fig. 4 show the passing rate $R_{I(II)}$ , defined as follows:

[TABLE]

Now, the significance intervals (99%) of passing rates $R_{I}$ and $R_{II}$ are described as,

[TABLE]

respectively.
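To illustrate how an interval of this kind is typically obtained, the following is a minimal sketch. It does not reproduce the paper's modified interval; it assumes that each of the tested sequences passes independently with probability $1-\alpha$ under $\mathcal{H}_{0}$ and uses a plain normal approximation to the binomial. The function name `passing_rate_interval` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def passing_rate_interval(alpha, num_tests, confidence=0.99):
    """Normal-approximation interval for an observed passing rate.

    Assumption (not from the paper): each of `num_tests` tests passes
    independently with probability 1 - alpha under the null hypothesis.
    """
    expected = 1.0 - alpha
    # Two-sided quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(alpha * (1.0 - alpha) / num_tests)
    # A rate cannot leave [0, 1], so clip the interval.
    return max(0.0, expected - half_width), min(1.0, expected + half_width)


low, high = passing_rate_interval(alpha=0.01, num_tests=1000)
print(f"R_I interval under these assumptions: [{low:.4f}, {high:.4f}]")
```

Under these assumptions the interval narrows as the number of tested sequences grows, which is why a passing rate computed from few sequences, or at a very small significance level, is hard to interpret.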

In this experiment, $\mathcal{H}_{0}$ for each PRNG is defined as follows:

• $\mathcal{H}_{0}$ is TRUE (considered as random): AES-CTR, Mersenne-Twister, Xorshift, VSC 2.0, LCG (Define them as “good PRNGs”).

Because these PRNGs pass all the tests included in NIST SP800-22 [2, 13], we consider them as random in this experiment.

• $\mathcal{H}_{0}$ is FALSE (considered as non-random): Micali-Schnorr random bit generator, QCG-I, QCG-II, CCG (Define them as “bad PRNGs”).

Because these PRNGs are rejected by several tests included in NIST SP800-22 [2], we consider them as non-random in this experiment.

Under the assumption that this definition of $\mathcal{H}_{0}$ is appropriate, let us consider the sensitivity and reliability of ${\rm DFTT}_{{\rm present}}$ , ${\rm DFTT}_{{\rm pareschi}}$ and ${\rm DFTT}_{{\rm proposed}}$ . It is difficult to compare the reliability of the tests from Fig. 4. This is because $\alpha_{II}(=0.0001)$ is very small, whereas the number of sets of input sequences is only $1000$ . Therefore, in this experiment, we focus on Fig. 3 and derive the conclusion of this experiment as follows.

• Reliability: the $R_{I}$ of “good PRNGs” (AES-CTR, Mersenne-Twister, Xorshift, VSC 2.0, and LCG).

If the $R_{I}$ of “good PRNGs” lies inside its significance interval, we can conclude that the reliability of the test is sufficiently high.

As shown in Fig. 3, the $R_{I}$ of “good PRNGs” of ${\rm DFTT}_{{\rm proposed}}$ and ${\rm DFTT}_{{\rm pareschi}}$ lies inside its significance interval, whereas that of ${\rm DFTT}_{{\rm present}}$ is lower than the threshold. Therefore, we can conclude that the reliabilities of ${\rm DFTT}_{{\rm pareschi}}$ and ${\rm DFTT}_{{\rm proposed}}$ are sufficiently high. Moreover, we can conclude that the reliability of ${\rm DFTT}_{{\rm present}}$ is low.

• Sensitivity: the $R_{I}$ of “bad PRNGs” (Micali-Schnorr random bit generator, QCG-I, QCG-II, and CCG).

If the $R_{I}$ of “bad PRNGs” is lower than the threshold, we can conclude that the sensitivity of the test is high.

As shown in Fig. 3, except for the Micali-Schnorr random bit generator, the $R_{I}$ of “bad PRNGs” of ${\rm DFTT}_{{\rm proposed}}$ are definitely lower than those of the other tests. The $R_{I}$ of ${\rm DFTT}_{{\rm present}}$ are also low, but not as low as those of ${\rm DFTT}_{{\rm proposed}}$ , and the $R_{I}$ of ${\rm DFTT}_{{\rm pareschi}}$ are higher than the $R_{I}$ of ${\rm DFTT}_{{\rm present}}$ . Therefore, we can conclude that the sensitivity of ${\rm DFTT}_{{\rm proposed}}$ is definitely high, that of ${\rm DFTT}_{{\rm present}}$ is high, and that of ${\rm DFTT}_{{\rm pareschi}}$ is low.

These conclusions from the aforementioned experiment are summarized in Table 4. We can conclude that ${\rm DFTT}_{{\rm proposed}}$ is more reliable and definitely more sensitive than ${\rm DFTT}_{{\rm present}}$ .

5.3 Appropriate selection of $n$ and $m$

As shown in Table 5, the parameters $n$ and $m$ of ${\rm DFTT}_{{\rm proposed}}$ are different from those of the other tests. NIST SP800-22 recommends $n=1,000,000$ and $m=1,000$ [2] (in experiments 1 and 2, we defined $n=100,000$ and $m=1,000$ for ${\rm DFTT}_{{\rm present}}$ and ${\rm DFTT}_{{\rm pareschi}}$ to avoid excessive computation because we need $10$ and $1000$ $mn$ -length sequences, respectively). However, as we stated in Step 3) in Section 4.1, in the procedure of ${\rm DFTT}_{{\rm proposed}}$ , $\frac{n}{2}-1$ $P$-values are generated, whereas ${\rm DFTT}_{{\rm present}}$ and ${\rm DFTT}_{{\rm pareschi}}$ generate $m$ $P$-values.

Pareschi et al. reported that the number of $P$-values should not be too large because for extremely large numbers of $P$-values, the second-level tests always fail [15, 16]. Pareschi et al. recommended that, in the case that $n=2^{20}=1,048,576$ , for the frequency test included in NIST SP800-22, the number of $P$-values should be smaller than $4795$ . Therefore, in ${\rm DFTT}_{{\rm proposed}}$ , $n$ should not be too large (in ${\rm DFTT}_{{\rm present}}$ , $m$ should not be too large). However, as we proved in Theorem 3, $\chi_{2}^{2}$ is only the asymptotic distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ . Therefore, to keep the approximation error small, $n$ should be as large as possible. Thus, in ${\rm DFTT}_{{\rm proposed}}$ , the selection of the parameter $n$ is a trade-off between the error of the second-level test and the error of the distribution of $\frac{2}{n}|S_{j}(X)|^{2}$ (as shown in Table 6). Considering this trade-off, we defined the value of $n$ as shown in Table 5. The appropriate selection of $n$ and $m$ in ${\rm DFTT}_{{\rm proposed}}$ still needs to be analyzed more specifically.
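The role of the $\chi_{2}^{2}$ reference distribution can be illustrated with a small numerical sketch. It is not the procedure of Section 4.1: it only computes $\frac{2}{n}|S_{j}(X)|^{2}$ for $j=1,\dots,\frac{n}{2}-1$ and evaluates the $\chi_{2}^{2}$ tail probability $e^{-y/2}$ at each value. The sketch assumes that the input bits are mapped to $\pm 1$ and that $S_{j}(X)$ is the ordinary DFT (the sign convention does not affect $|S_{j}(X)|$); the function names and the small $n=256$ are hypothetical choices made to keep the naive $O(n^{2})$ transform fast.

```python
import cmath
import math
import random


def normalized_power_spectrum(x):
    """Return y_j = (2/n)|S_j(X)|^2 for j = 1, ..., n/2 - 1 (naive O(n^2) DFT)."""
    n = len(x)
    ys = []
    for freq in range(1, n // 2):
        s = sum(x_k * cmath.exp(-2j * math.pi * freq * k / n)
                for k, x_k in enumerate(x))
        ys.append(2.0 / n * abs(s) ** 2)
    return ys


def chi2_2_tail(y):
    """P(chi^2_2 >= y) = exp(-y/2): survival function for 2 degrees of freedom."""
    return math.exp(-y / 2.0)


n = 256
rng = random.Random(0)  # Python's `random` module is a Mersenne Twister
random_seq = [rng.choice((-1, 1)) for _ in range(n)]
periodic_seq = [1 if k % 4 < 2 else -1 for k in range(n)]  # period 4

ys = normalized_power_spectrum(random_seq)
print("values:", len(ys), "mean:", sum(ys) / len(ys))  # chi^2_2 has mean 2
tails = [chi2_2_tail(y) for y in normalized_power_spectrum(periodic_seq)]
print("smallest tail probability (periodic input):", min(tails))
```

For the periodic input, all power concentrates at $j=n/4$, so the tail probability there is vanishingly small, whereas for the pseudo-random input the values scatter around the $\chi_{2}^{2}$ mean of 2.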

6 Conclusion

In this paper, we have considered the DFT test included in the NIST SP800-22 statistical test suite for random number sequences. The most crucial problem in the present DFT test (denoted as ${\rm DFTT}_{{\rm present}}$ ) is that the reference distribution of its test statistic is not mathematically derived but is rather obtained by numerical estimation with a pseudo-random number generator; that is, the test for randomness itself relies on a pseudo-random number generator. Therefore, ${\rm DFTT}_{{\rm present}}$ should not be used unless the reference distribution is mathematically derived.

We proved that the asymptotic distribution of the power spectrum is $\chi_{2}^{2}$ , and based on this fact, we proposed a new DFT test denoted as ${\rm DFTT}_{{\rm proposed}}$ , whose distribution of the test statistic is mathematically derived.

Furthermore, although the appropriate selection of the parameters $n$ and $m$ for ${\rm DFTT}_{{\rm proposed}}$ still needs to be analyzed more specifically, the results of testing non-random sequences and several pseudo-random number generators showed that ${\rm DFTT}_{{\rm proposed}}$ is more reliable and definitely more sensitive than ${\rm DFTT}_{{\rm present}}$ , which is the current standard DFT test.

Biography

Hiroki Okada

received his BSc degree in informatics from Kyoto University, Japan, in Mar. 2014. He received his MSc degree in informatics from the Department of Applied Mathematics & Physics, Graduate School of Informatics, Kyoto University, Japan, in Mar. 2016. He joined KDDI Corp. in Apr. 2016.

Ken Umeno

received his BSc degree in electronic communication from Waseda University, Japan, in 1990. He received his MSc and PhD degrees in physics from the University of Tokyo, Japan, in 1992 and 1995, respectively. From 1998 until he joined Kyoto University as a Professor in 2012, he worked for Japan’s Ministry of Posts and Telecommunications in its Communications Research Laboratory (currently the National Institute of Information and Communications Technology). From 2004 to 2012, he was CEO and President of ChaosWare, Inc. He received the LSI IP Award in 2003 and the Telecom-System Awards in 2003 and in 2008. He holds 46 registered Japanese patents, 23 registered United States patents and more than 5 international patents in the fields of telecommunications, security, and financial engineering. His research interests include ergodic theory, statistical computing, coding theory, chaos theory, information security, and GNSS based earthquake prediction.

Bibliography

[1] J. Soto, et al., Special Publication 800-22, NIST, 2001.
[2] Special Publication 800-22 Revision 1a, NIST, 2010. http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
[3] S. Kim, K. Umeno, and A. Hasegawa, “On the NIST statistical test suite for randomness,” IEICE Technical Report, ISEC 2003-87, Dec. 2003.
[4] S. J. Kim, K. Umeno, and A. Hasegawa, “Corrections of the NIST Statistical Test Suite for Randomness,” Cryptology ePrint Archive, Tech. Rep. 2004/018, 2004.
[5] K. Hamano, “The distribution of the spectrum for the discrete Fourier transform test included in SP 800-22,” IEICE Trans. Fundamentals, vol. E88-A, no. 1, pp. 67-73, 2005.
[6] F. Pareschi, R. Rovatti, and G. Setti, “On Statistical Tests for Randomness included in the NIST SP 800-22 test suite and based on the Binomial Distribution,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 491-505, 2012.
[7] K. Hirose, “An inquiry report about test for pseudo random number generators - on the Discrete Fourier Transform test included in NIST SP 800-22,” 2005. http://www.cryptrec.go.jp/estimation/rep_ID0212.pdf
[8] M. A. Stephens, “Tests based on EDF statistics,” in R. B. D’Agostino and M. A. Stephens, eds., Goodness-of-Fit Techniques. Marcel Dekker, New York, 1986.
