Deep learning the holographic black hole with charge

Jing Tan; Chong-Bin Chen

arXiv:1908.01470·hep-th·August 6, 2019

Deep learning the holographic black hole with charge

Jing Tan, Chong-Bin Chen

PDF

TL;DR

This paper employs deep learning to reconstruct the Reissner-Nordström black hole metric in AdS space by training neural networks on boundary data and analyzing the effects of training parameters.

Contribution

It introduces a neural network approach to learn charged black hole metrics from boundary data, bridging holography and machine learning.

Findings

01

Successful reconstruction of RN black hole metric

02

Analysis of training parameter effects on learning process

03

Potential for applying deep learning to other holographic models

Abstract

We use the deep learning algorithm to learn the Reissner-Nordstr\"om(RN) black hole metric by building a deep neural network. Plenty of data is made in boundary of AdS and we propagate it to the black hole horizon through AdS metric and equation of motion(e.o.m). We label this data according to the values near the horizon, and together with initial data constitute a data set. Then we construct corresponding deep neural network and train it with the data set to obtain the Reissner-Nordstrom(RN) black hole metric. Finally, we discuss the effects of learning rate, batch-size and initialization on the training process.

Tables7

Table 1. Table 1: Parameters of each RN black holes.

Black holes	Q	k	Regularizer
RN1	0.5	0	$0.039 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{3.6} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$
RN2	0.5	1	$0.05 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{2.6} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$
RN3	0.5	-1	$0.0225 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{2.4} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$
RN4	0.25	0	$0.05 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{2.5} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$
RN5	0.25	1	$0.028 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{3} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$
RN6	0.25	-1	$0.0115 \sum_{n = 1}^{N - 1} {(η^{(n)})}^{2} {(h (η^{(n + 1)}) - h (η (n)))}^{2}$

	r	1.78	1.68	1.59	1.51	1.44	1.38	1.33	1.29	1.257	1.234
RN1	R	3.04501	3.07225	3.11558	3.18464	3.29514	3.47341	3.76487	4.25044	5.20848	7.249
	em	3.1193	3.1314	3.1612	3.2175	3.3147	3.4787	3.7586	4.2587	5.2273	7.2988
	MSE					0.00155

	r	1.192	1.155	1.121	1.0896	1.061	1.035	1.012	0.9918	0.9745	0.96
RN2	R	3.43207	3.49943	3.58454	3.69503	3.84011	4.03586	4.30142	4.67335	5.20802	6.01493
	em	3.5666	3.5894	3.6390	3.7211	3.8447	4.0238	4.2814	4.6571	5.2214	6.0699
	MSE					0.00347

	r	1.7645	1.725	1.689	1.656	1.626	1.599	1.575	1.554	1.5363	1.5216
RN3	R	3.14846	3.22101	3.31542	3.44016	3.60633	3.83039	4.13805	4.57167	5.19288	6.13992
	em	3.3945	3.4184	3.4706	3.5575	3.6889	3.8801	4.1559	4.5589	5.1649	6.0762
	MSE					0.01520

	r	1.589	1.545	1.502	1.463	1.427	1.394	1.3635	1.336	1.311	1.289
RN4	R	3.16346	3.20997	3.27395	3.35703	3.46822	3.61924	3.8329	4.14157	4.62383	5.44262
	em	3.4405	3.4675	3.5256	3.6208	3.7623	3.9644	4.2501	4.6576	5.2521	6.1104
	MSE					0.17626

	r	1.36	1.3	1.24	1.19	1.15	1.11	1.08	1.05	1.03	1.01
RN5	R	3.32611	3.38634	3.47534	3.58949	3.72965	3.95443	4.23485	4.7336	5.35848	6.69649
	em	3.5260	3.5493	3.5993	3.6819	3.8073	3.9928	4.2690	4.6936	5.3843	6.5640
	MSE					0.01189

Table 7. Table 2: Result display and MSE

	r	1.7	1.67	1.65	1.63	1.61	1.59	1.575	1.562	1.55	1.54
RN6	R	3.38016	3.51052	3.62847	3.78439	3.99875	4.31046	4.65601	5.09654	5.72376	6.58145
	em	3.8322	3.8897	4.0071	4.1890	4.4421	4.7764	5.2068	5.7551	6.4465	7.2612
	MSE					0.27903

Equations39

d s^{2} = - f (r) d t^{2} + \frac{d r ^{2}}{f ( r )} + r^{2} (d Ω_{n - 2}^{k})^{2},

d s^{2} = - f (r) d t^{2} + \frac{d r ^{2}}{f ( r )} + r^{2} (d Ω_{n - 2}^{k})^{2},

f (r) = k - \frac{8 π}{( n - 2 ) Ω _{n - 2}^{k}} \frac{2 GM}{r ^{n - 3}} + \frac{8 π}{( n - 2 ) Ω _{n - 2}^{k}} \frac{G Q ^{2}}{r ^{2 (n - 3)}} + \frac{r ^{2}}{L _{A d S}^{2}},

(d Ω_{n - 2}^{0})^{2}

(d Ω_{n - 2}^{0})^{2}

(d Ω_{n - 2}^{1})^{2}

(d Ω_{n - 2}^{- 1})^{2}

L = - \frac{1}{2} (\partial_{μ} ϕ)^{2} - \frac{1}{2} m^{2} ϕ^{2} - V (ϕ),

L = - \frac{1}{2} (\partial_{μ} ϕ)^{2} - \frac{1}{2} m^{2} ϕ^{2} - V (ϕ),

□ ϕ - m^{2} ϕ - \frac{δ V}{δ ϕ} = 0.

□ ϕ - m^{2} ϕ - \frac{δ V}{δ ϕ} = 0.

f π^{'} + (f^{'} + \frac{2 f}{r}) π - m^{2} ϕ - \frac{δ V}{δ ϕ} = 0,

f π^{'} + (f^{'} + \frac{2 f}{r}) π - m^{2} ϕ - \frac{δ V}{δ ϕ} = 0,

ϕ (r + Δ r)

ϕ (r + Δ r)

π (r + Δ r)

y (x^{(1)}) = f_{i} φ (W_{ij}^{(N)} φ (W_{j k}^{(N - 1)} \dots φ (W_{l m}^{(1)} x_{m}^{(1)}) \dots)) .

y (x^{(1)}) = f_{i} φ (W_{ij}^{(N)} φ (W_{j k}^{(N - 1)} \dots φ (W_{l m}^{(1)} x_{m}^{(1)}) \dots)) .

W^{(n)} = (1 \frac{m ^{2}}{f} Δ r 1 - (\frac{f ^{'}}{f} + \frac{2}{r}) Δ r),

W^{(n)} = (1 \frac{m ^{2}}{f} Δ r 1 - (\frac{f ^{'}}{f} + \frac{2}{r}) Δ r),

{φ_{1} (x_{1}) = x_{1} φ_{2} (x_{2}) = x_{2} + Δ r \frac{δ V ( x _{1} )}{f δ ϕ}

{φ_{1} (x_{1}) = x_{1} φ_{2} (x_{2}) = x_{2} + Δ r \frac{δ V ( x _{1} )}{f δ ϕ}

d η = \frac{d r}{f} .

d η = \frac{d r}{f} .

\partial_{η} π + R (η) π - m^{2} ϕ - \frac{δ V [ ϕ ]}{δ ϕ} = 0.

\partial_{η} π + R (η) π - m^{2} ϕ - \frac{δ V [ ϕ ]}{δ ϕ} = 0.

ϕ (η + Δ η) = ϕ (η) + Δ η π (η),

ϕ (η + Δ η) = ϕ (η) + Δ η π (η),

π (η + Δ η) = π (η) - Δ η (R (η) π (η) - m^{2} ϕ (η) - \frac{δ V ( ϕ )}{δ ϕ ( η )}) .

W_{(n)} = (1 Δ η m^{2} Δ η 1 - Δ η R (η^{(n)}))

W_{(n)} = (1 Δ η m^{2} Δ η 1 - Δ η R (η^{(n)}))

{φ_{1} (x_{1}) = x_{1}; φ_{2} (x_{2}) = x_{2} + Δ η \frac{δ V ( x _{1} )}{δ x _{1}} .

{φ_{1} (x_{1}) = x_{1}; φ_{2} (x_{2}) = x_{2} + Δ η \frac{δ V ( x _{1} )}{δ x _{1}} .

0 = F \equiv [\frac{2}{η} π - m^{2} ϕ - \frac{δ V ( ϕ )}{δ ϕ}]_{η = η_{f in}}

0 = F \equiv [\frac{2}{η} π - m^{2} ϕ - \frac{δ V ( ϕ )}{δ ϕ}]_{η = η_{f in}}

f (F) = [t anh (100 (F - 0.1) - t anh (100 (F + 0.1)) + 2] /2

f (F) = [t anh (100 (F - 0.1) - t anh (100 (F + 0.1)) + 2] /2

E = E_{1} + E_{r e g} = d a t a \sum ∣ y (\overline{x}^{(1)}) - \overline{y} ∣ + E_{r e g} .

E = E_{1} + E_{r e g} = d a t a \sum ∣ y (\overline{x}^{(1)}) - \overline{y} ∣ + E_{r e g} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

††thanks: E-mail address: [email protected]

Deep learning the holographic black hole with charge

Jing Tan1,2

Chong-Bin Chen1,2

1Department of Physics, Nanchang University, Nanchang, 330031, China

2Center for Relativistic Astrophysics and High Energy Physics, Nanchang University, Nanchang 330031, China

Abstract

Abstract

We use the deep learning algorithm to learn the Reissner-Nordström(RN) black hole metric by building a deep neural network. Plenty of data is made in boundary of AdS and we propagate it to the black hole horizon through AdS metric and equation of motion(e.o.m) We label this data according to the values near the horizon, and together with initial data constitute a data set. Then we construct corresponding deep neural network and train it with the data set to obtain the Reissner-Nordstrom(RN) black hole metric. Finally, we discuss the effects of learning rate, batch-size and initialization on the training process.

I Introducton

The holographic dualityMaldacena:1997re ; Gubser:1998bc ; Witten:1998qj , especially the so-called AdS/CFT correspondence, describes the equivalency between a $(d+1)$ -dimensional supergravitational system and a d-dimensional gauge quantum field. The holographic duality usually refers to the duality hypothesis proposed by Gerard ’t Hooft't Hooft:1993 and soon was given a precise string-theory interpretation by Leonard SusskindSusskind:1994vu . A particular example of this hypothesis was found by MaldacenaMaldacena:1997re , which proposed that a strongly coupled $4$ -dimensional Yang-Mills gauge theory would be dual to a weakly coupled $5$ -dimensional AdS supergravity. This implies that one can use the gravitational theory to deal with strongly coupled systems which perturbation theories cannot manage. On the contrary, we can learn about the information we care about on the gravitational field from the quantum field lived on the boundary of the gravitational system. The idea of the equivalence between gravity and quantum field provides an insight that the spacetime is possibly emerged from boundary theories. In other words, the gravitational system can be constructed if one know the information of the boundary, and figure out what equations it obeys.

On the other hand, deep learning Hinton ; Bengio ; LeCun is a field of machine learning that uses a cascade of multiple layers of nonlinear processing units for feature extraction and transformation and each successive layer uses the output from the previous layer as input L.Deng . This algorithm abstracts multiple levels of representation, and forms a hierarchy of data. Hinton proposed an algorithm called back propagation Rumelhart , which increases the process of error back propagation on the traditional algorithm. In this way, learning problems such as XOR, which were helpless in the perceptron modelRosenblatt Frank in the past, can be solved. In 2006, Hinton adopted unsupervised learning to train the algorithm layer by layer, and used supervised back propagation algorithm for tuning, which solved the problem of vanishing gradientHinton , it makes deep learning algorithm improved and it is basic of our deep neural network. At the same time, with the rapid improvement of computing power and the emergence of big data in recent years, deep leaning architectures such as deep neural network, deep belief network, is rapidly riding. Now they are widely used in signal and information processing, such as speech recognition, computer vision, natural language processing and machine translation.

A natural question raised immediately is wether there are some deep relations between holography and machine leaning. In other words, can one construct the gravity holographically from boundary systems through training the deep neural network? Before people realized this relation, someone have studied the connection between deep learning and the renormalization group of a tensor networkBeny:2013 ; Mehta:2014 . This is a support that AdS space can be emerged from deep learning because the so-called multiscale entanglement renormalization ansatz(MERA) network was regard as a discrete time slice from holographic point of viewSwingle:2009bg , and following study motivated by this was discussed in Gan:2017nyt . Recently Hashimoto et.al. have achieved the metric of an AdS black hole via deep neural network with boundary input dataHashimoto:2018ftp . Our work is base on Hashimoto:2018ftp and test it into the Reissner-Nordstrom(RN) metricReissner:1916 ; Weyl:1917 ; Nordstrom:1918 ; Jeffery:1921 , whose dual boundary theory should contain vector fields. For constructing a neural network the things we should know, as we said, are the input data(boundary value of fields and its derivative) and the equations they obey(equations of motion of these fields). To achieve it, one can obtain related equations of motion(e.o.m) from the gravitational action. After discretizing e.o.m in radial direction and map them into a neural network, the weights and active function will be determined and the network proceed availably after a coordinate transformation $d\eta=dr/\sqrt{f}$ . And then we produce data set(see Fig.3) and construct specific neural network of scalar field in new coordinate system. After selecting the appropriate regularization term and activation function, we implemented the neural network on TyTorch, which is is an open-source machine learning library for Python and succeed to learn the RN black hole metric through training the network, see Fig.7. In addition, if we consider effects of different parameters on training, we find that the loss will converge to same value as the training goes on, Fig.5. this means that the deficiencies of some parameter setting can be remedied by adding number of epoch.

II Metric reproduce of Reissner-Nordström-AdS black hole by Deep neural Network

Let us start with the $n$ -dimensional Reissner-Nordström-AdS(RNAdS) black hole

[TABLE]

where $L_{AdS}$ is AdS radius, $M$ and $Q$ are the mass and charge of the RNAdS black hole, respectively. The constant $k$ has values $k=0,1,-1$ depending on the topology is planar, spherical, or hyperbolic. The angular part $d\varOmega^{k}_{n-2}$ of the metric(1), for each $k=0,1,-1$ , is of the form

[TABLE]

$\varOmega^{k}_{n-2}$ denotes the area of the unit surface of each kind.

In this background, we consider a scalar field with Lagrangian density

[TABLE]

which yields the following equation of motion

[TABLE]

Substituting (1) we obtain

[TABLE]

where $\pi\equiv\phi^{\prime}(r)$ .

Our strategy is to provide a representation of scalar field equation in deep neural network. The emerged holographic direction( $\eta$ ) is discretized and regarded as the deep layers. Naively, to map the e.o.m to deep neural network, we can discretize the equations in radial direction by (8)

[TABLE]

Where $\Delta r$ is distance of adjacent points in discrete coordinate system, and $r^{(n)}\equiv(N-n+1)\Delta r$ , $N$ is total number of layers of deep neural network. In this black hole model the input data $x_{i}$ in each layer of the neural network are the scalar field $\phi$ and its derivative $\pi$ . Between the adjacent layers there are a linear transformation $x_{i}\to W_{ij}x_{j}$ and a nonlinear transformation $x_{i}\to\varphi(x_{i})$ which is known as the activation function. A conventional deep feedforward neural network can be constructed

[TABLE]

On $r=\infty$ , take the $\phi~{}and~{}\pi$ in (9) as input data $x_{m}^{(1)}$ put in neural network, and then propagate it to the output data on $r=0$ (black hole horizon) in accordance with equation (9) in each layer. The final layer $f_{i}$ is for summarizing all the component of $\varphi$ and map to the output data $y$ . From (9) it’s obvious to find the linear part in neural network is

[TABLE]

and the activation function at each layer is given by

[TABLE]

Where n is the total number of layers, this process is called forward propagation.

However, in this coordinate system, we find the metric function $f$ included in the weight function and the activation function if we use (11) and (12) to build a neural network. In that case, we will encounter some technical difficulties that we cannot overcome for the time being. To avoid that, we need a coordinate transformation to make coefficient of radial direction of metric in (1) not contain f . This transformation excluded $f$ from coefficient of second-order derivative of scalar field( $\partial_{r}\pi$ ) in (8), so we obtain a activation function without metric and build an effective neural network. Here is the transformation

[TABLE]

After integration we get the coordinate $\eta=G(r)$ and its inverse function $r=G^{-1}(\eta)$ . By use this function, we can turn above equations from coordinate $r$ to $\eta$ . It turns out in this new coordinate both weight function and activation function are independent of $f$ . But for the specific f given in $\eqref{rn_f}$ , we don’t have an analytic expression of integration (13). We have to appeal to numeric integrations. To verify our numerical validity, we compare our numerical results with the ones obtained analytically for Schwarzschild case, as plotted in (fig.1).

Let us turn to the Reissner-Nordström-AdS black hole metric reproduced by getting the points from the numerical integration with different charge. Using numerical integration, each value of $r$ reproduces a value of $\eta$ and the corresponding metric. Therefor we construct a metric function of $\eta$ , which called Reproduced Metric and is written as $R(\eta)$ . The metric constructed in this way is discrete by nature, this saves us much trouble in building neural network. Suppose we get infinite number of points from radius $r$ , we will get a continuous reproduced metric, then we can get the e.o.m immediately:

[TABLE]

Consider finite points case, we can get discrete e.o.m of scalar field:

[TABLE]

Where $\Delta\eta$ is distance of adjacent points in discrete coordinate system, and $\eta^{(n)}\equiv(N-n+1)\Delta\eta$ , $N$ , as before, is total number of layers of deep neural network. At the same time, we regard the reproduced metric as the target that our neural network learn, and use (10) as the neural network structure.

Mapping the discrete e.o.m into deep neural network we have weights

[TABLE]

and activation functions in each layer

[TABLE]

With these elements, we have the structure of the entire network (see Fig.2). The output of network will generate a loss function E. The information of the loss function will be propagated back by back propagation algorithm to calculate the gradient. Replicating this process layer by layer we obtain the emergent black hole metric.

III Methods

In the last section we have built a deep neural network model of Reissner-Nordström-AdS black hole. In this section we show more details to implement the network. For simplicity, consider 3+1 dimension spacetime in the unit $L=1$ , $m^{2}=-1$ and $V[\phi]=\frac{\lambda\phi^{4}}{4}$ (where $\lambda=1$ )111Actually, more generic $\phi^{n}$ ( $n>2$ is an integer) has the similar results.case, and set $N=10$ i.e. $10$ layers neural network, then $\eta_{ini}=1,\eta_{fin}=0.1$ and $\Delta\eta=-0.1$ .

III.1 Generating the data set

Training data is an indispensable part of deep learning. Following Hashimoto:2018ftp , our data are generated by propagating the e.o.m for a given metric. These data are treated as the data set of the neural network, which can be used to training the network as will see below. More specifically, we first get data randomly from ( $\phi\in[0,1.5],\pi\in[-0.2,0.2]$ ) in the location of $\eta=1$ (i.e. AdS boundary $\eta_{ini}$ ) as the initial response data. We then propagate data from $\eta_{ini}$ to $\eta_{fin}$ (i.e. black hole horizon) by using the e.o.m(14). As the last step, we use the boundary condition of the black hole horizon

[TABLE]

to divide the data into two categories labeled by 0 and 1. The positive answer data characterized by $\lvert F\rvert<0.1$ are labeled by 0, and the negative answer data characterized by $\lvert F\rvert>0.1$ is labeled by 1. We take the combination of initial data $(\phi(\eta(ini)),\pi(\eta(ini)))$ and corresponding label as a data point. Taking 1000 positive and 1000 negative data points forms the data set. This 2000 data satisfy the distribution as shown in Fig.3, which is comparable with the results of the Schwarzschild case in Hashimoto:2018ftp .

III.2 Training the Neural Network

We take the data obtained in the last subsection as a training set. The metric is expected to be obtained by training the network. Our training tool is PyTorch Ketkar ; Paszke , which is a powerful open source machine learning library of Python. We use it to construct a 10 layers deep neural network to train the data. The training data are imported into the deep neural network after divided into 200 batches. Then the neural network transports the data from input layer to the final ( $\eta_{fin}$ ). Following Hashimoto:2018ftp , choosing

[TABLE]

as the activation function is effective. Here $F$ is one of final layer neuron(in the limit $\eta_{fin}\to 0$ , equation (18) is equivalent to F $\equiv\pi(\eta_{fin})$ ). This function $y=f(F)$ is visualized in FIG.4.

So far, we construct a deep feedforward neural network describe by (10). The neural network transmits the data through whole network and generates the output data, which is the actual value. As a contrast, the label is called the target value. The difference between actual value and target value is the loss, which is described by the loss functionHashimoto:2018ftp

[TABLE]

Where $E_{1}~{}is~{}L^{1}-norm$ , $E_{reg}$ is regularizer and ${(\overline{x}^{(1)},\overline{y})}$ are the training set. The smaller loss means the closer proximity between the value given by neural network and target value.

IV Display of Parameters and Training Result

In order to reduce the loss, we optimize the neural network weights by Adam optimizer, that is a back propagation algorithm. Optimizer contains a parameter which controls convergence rate of the loss, which is called learning rate or stride. Generally, small learning rate leads to slow convergence. However, Excessive learning rate has a negative impact on convergence as well. In that case loss may over the bottom and get a suboptimal result. Additionally, choosing suitable batch size is also very important to give an ideal accuracy of the network during training. A larger batch size makes better use of the GPU parallelism, but compare to a smaller one, it require training the network for more epochs to achieve the same level of accuracy. You can set batch size to 1 if you can tolerate the training time. Their impacts are plotted in FIG.5. From these figures, we notice that by choosing appropriate learning rate and batch size, loss will converge to same value as long as the number of epoches is sufficient. And we find that too high value of learning rate in our neural network will make the loss cannot be convergent, like FIG.6. It turns out that the optimal learning rate is around 0.001.

Consider the training time and effect, we set learning rate to 0.001 and make batch size equal to 10 to train the neural network. According to experience, we find the appropriate regularization term values of different black holes. In FIG.7 we see that RN metric can be learned by training our neural network for any given charge $Q$ and topology $k$ as shown in TABLE I. In order to demonstrate the results of neural network better, in appendix material, we give out a comparison between the discrete metric which is from the neural network and the standard one, we also give out the mean square error and its change in the program progresses.

V conclusion

In this paper we apply the deep learning algorithm to black hole holography. We build a neural network to learn the RN-AdS metric of different topology. It turns out the expected metric can be obtained by training this network. It implies, according to AdS/CFT correspondence, that by learning the CFT boundary data, deep neural network reproduces the AdS metric even for spacetime with charge and different topology. This shed new insight into the possible relation between deep learning and the AdS/CFT correspondence. Technically, compared with the Schwarzschild case, RN black hole has some subtleties when build the neural network in terms of the equations of motion. We develop a numeric method to overcome this difficulty. As a last point, we also discuss the influence of different learning rate, batch size and epoch to the neural network. We find there is an optimal choice of these parameters.

Acknowledgements

The authors would like to acknowledge Ming-Yuan Luo and Tian-Wei Yan for helping construct early dataset and neural network. The author is also grateful to Prof. Fu-Wen Shu for the useful discussions and modification. This work was supported in part by Natural Science Foundation of China under Grants No.11465012 and 11665016,and 555 talent project of Jiangxi Province.

Appendix A Comparison and MSE

As a direct comparison of result, TABEL 2 contrasts between the discrete metric which is from the neural network and the standard case. In the table, R is the reproduce metric, the em is the emergent metric, and the MSE is the mean square error. In more details, we plot the evolution of the MSE, see Fig.8.

Bibliography22

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1(1) S. S. Gubser, I. R. Klebanov and A. M. Polyakov, “Gauge theory correlators from noncritical string theory,” Phys. Lett. B 428 , 105 (1998) [hep-th/9802109].
2(2) E. Witten, “Anti-de Sitter space and holography,” Adv. Theor. Math. Phys. 2 , 253 (1998) [hep-th/9802150].
3(3) J. M. Maldacena, “The Large N limit of superconformal field theories and supergravity,” Int. J. Theor. Phys. 38 , 1113 (1999) [Adv. Theor. Math. Phys. 2 , 231 (1998)] [hep-th/9711200].
4(4) G. ’t Hooft,, “Dimensional Reduction in Quantum Gravity,” [gr-qc/9310026].
5(5) L. Susskind, “The World as a hologram,” J. Math. Phys. 36 , 6377 (1995) [hep-th/9409089].
6(6) G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks.,” Science 313, 504 (2006).
7(7) Y. Bengio, Y. Le Cun, “Scaling learning algorithms towards AI,” Large-scale kernel machines 34 (2007).
8(8) Y. Le Cun, Y. Bengio, G. Hinton, “Deep learning,” Nature 521, 436 (2015).