Linear scaling conventional Fock matrix calculation with stored non-zero two-electron integrals
Alexander V. Mitin

TL;DR
This paper demonstrates that the conventional Fock matrix calculation, when combined with stored non-zero two-electron integrals and prescreening techniques, exhibits linear scaling with system size, supported by theoretical proof and numerical validation.
Contribution
The authors reformulate Fock matrix calculation using data compression and prescreening to achieve and verify linear scaling for large molecular systems.
Findings
Linear scaling begins from 2500 to 4000 basis functions.
Theoretical proof of asymptotic linear scaling of non-zero two-electron integrals.
Numerical validation confirms linear scaling in practical calculations.
Abstract
It was shown that the conventional Fock matrix calculation, which using stored non-zero two-electron integrals, density prescreening, and data compression methods possesses the linear scaling property with respect to the problem size. This result follows from the proven theorem, which reads that the total number of non-zero two-electron integrals scales asymptotically linearly with respect to the number of basis functions for large molecular systems. An analysis of the Fock matrix calculation with density and density difference prescreening shows that its linear scaling property arises due to the asymptotically linear scaling properties of the number of non-zero two-electron integrals and the linear scaling property of the number of leading matrix elements of the density matrix. The use of the density and the density difference prescreening in the Fock matrix calculation consequently…
| Int. class | Numb. of int. | Asympt. contr. to non-zero int. |
| 1c int. | Yes | |
| 2c int. | No | |
| 3c int. | No | |
| 4c int. | No |
| R(Å) | 1/ (a.u.) | ||
|---|---|---|---|
| 0.6 | 6.554181989463915E-1 | 5.482973740079550E-1 | 8.819620820666667E-1 |
| 1.0 | 4.980644757727490E-1 | 4.322272962735285E-1 | 5.291772492400000E-1 |
| 2.0 | 2.645470381932461E-1 | 2.489481693180657E-1 | 2.645886246200000E-1 |
| 4.0 | 1.322943123099960E-1 | 1.300701033377158E-1 | 1.322943123100000E-1 |
| 6.0 | 8.819620820666761E-2 | 8.752217465970426E-2 | 8.819620820666667E-2 |
| 60.0 | 8.819620820666759E-3 | 8.818934900256092E-3 | 8.819620820666667E-3 |
| 100.0 | 5.291772492400056E-3 | 5.291624316992965E-3 | 5.291772492400000E-3 |
| 600.0 | 8.819620820666760E-4 | 8.819613960273960E-4 | 8.819620820666667E-4 |
| 6000.0 | 8.819620820666761E-5 | 8.819620752062707E-5 | 8.819620820666667E-5 |
| 10000.0 | 5.291772492400056E-5 | 5.291772477581577E-5 | 5.291772492400000E-5 |
| 60000.0 | 8.819620820666761E-6 | 8.819620819980713E-6 | 8.819620820666667E-6 |
| Molec. | BF | 1.0E-4 | 1.0E-5 | 1.0E-6 | 1.0E-7 | 1.0E-8 |
|---|---|---|---|---|---|---|
| 609 | 11.4 | 13.9 | 19.8 | 25.2 | 29.7 | |
| 5009 | 5.8 | 10.9 | 19.1 | 27.9 | 36.4 | |
| 9409 | 4.6 | 10.0 | 18.7 | 28.5 | 38.3 | |
| 13809 | 3.9 | 9.4 | 18.4 | 28.8 | 39.5 | |
| 18209 | 3.5 | 9.0 | 18.1 | 29.0 | 40.4 | |
| 22609 | 3.2 | 8.6 | 17.9 | 29.2 | 41.1 | |
| 27009 | 2.9 | 8.4 | 17.7 | 29.3 | 41.6 |
| Molec. | BF | 1.0E-2 | 1.0E-3 | 1.0E-4 | 1.0E-5 | 1.0E-6 |
|---|---|---|---|---|---|---|
| 609 | 2.6 | 11.9 | 26.5 | 38.2 | 45.5 | |
| 1159 | 1.4 | 6.7 | 15.7 | 24.1 | 30.9 | |
| 1709 | 1.0 | 4.6 | 11.1 | 17.2 | 22.5 | |
| 2259 | 0.7 | 3.6 | 8.6 | 13.5 | 17.8 | |
| 2809 | 0.6 | 2.9 | 7.0 | 11.0 | 14.6 | |
| 3359 | 0.5 | 2.4 | 5.9 | 9.3 | 12.4 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpectroscopy and Quantum Chemical Studies · Advanced Chemical Physics Studies · Molecular Junctions and Nanostructures
Linear scaling conventional Fock matrix calculation
with stored non-zero two-electron integrals
Alexander V. Mitin
Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow Region, 141701, Russia;
Joint Institute for High Temperatures of RAS, Izhorskaya st. 13 Bd.2, 125412 Moscow, Russia
Abstract
It was shown that the conventional Fock matrix calculation, which using stored non-zero two-electron integrals, density prescreening, and data compression methods possesses the linear scaling property with respect to the problem size. This result follows from the proven theorem, which reads that the total number of non-zero two-electron integrals scales asymptotically linearly with respect to the number of basis functions for large molecular systems. An analysis of the Fock matrix calculation with density and density difference prescreening shows that its linear scaling property arises due to the asymptotically linear scaling properties of the number of non-zero two-electron integrals and the linear scaling property of the number of leading matrix elements of the density matrix. The use of the density and the density difference prescreening in the Fock matrix calculation consequently enforces this property. The conventional Fock matrix calculation with the stored non-zero two-electron integrals has been reformulated to the Fock matrix calculation with density or density difference prescreening by using the data compression technique to store the non-zero two-electron integrals and their indices. The numerical calculations with this method show that its linear scaling property begins from 2500 to 4000 basis functions in dependence on the basis function type in molecular calculations.
One-electron integrals, Two-electron integrals, Fock operator, Molecular orbitals, Fock matrix calculation
pacs:
31.15.Ar
I Introduction
It is well known that the Fock matrix calculation is one of the most computationally expansive step in the Hartree-Fock and the density functional theory methods Hartree (1928); Fock (1930); Hohenberg and Kohn (1964); Kohn and Sham (1965). Its efficient realization is mainly defined by explicit inclusion of the main hardware features of the used computer into developed numerical algorithm and program code. The main characteristic of computers since 1950th was significantly faster CPU in comparison with input/output (I/O) capabilities. For this type of computers have been developed efficient numerical algorithms of the direct Fock matrix assembly by calculating only Fock matrix changes for known density difference with subsequent construction of the full Fock matrix from these increments Almlöf et al. (1982); White and Head-Gordon (1994); Strain et al. (1996); Challacombe et al. (1996).
However, now, current trend in computer architecture consists in subsequent significant improvement of its I/O capabilities in comparison with growing of its CPU performance. This changes appears due to solid state drive (SSD) technology. This is log term trend because the limits of this technology is too far to be exhaust. For example, the third generation of Microsemi Adaptec SmartRAID controllers permits to build a disc system with I/O rate up to 6.6 GB/sec, while the advanced Seagate Nytro XP7200 PCIE add-in SSD has sequential sustained read performance equals to 10000 MB/s at capacity 7.7 TB. The efficient data compression technique Mitin (2002) permits to keep non-zero two-electron integrals mainly in one byte words. This means that the modern external disk systems can read the non-zero two-electron integrals from the external disk system to the main computer memory with rate from to integrals per second. On the other hand, such high speed of two-electron integrals calculations has not yet been demonstrated on single/double socket computers.
Therefore, a critical analysis and reinvestigation of the conventional method Fock matrix construction from the non-zero two-electron integral stored on an external disk system is an important problem, which can open new perspectives for a development of an efficient algorithm of Fock matrix calculation on computers with extended I/O capabilities.
In this connection, the scaling property of the total number of non-zero two-electron integrals was investigated and presented in Section II. Then, a theoretical analysis of the Fock matrix calculation with density and density difference prescreening are given in Section III. After that, the conventional Fock matrix calculation from the stored non-zero two-electron integrals with the density difference prescreening and the data compression was formulated and its linear scaling property was demonstrated in numerical calculations. These results are presented in Section IV.
II Linear scaling property of non-zero two-electron integrals
It is well known that the direct Fock matrix calculation Almlöf et al. (1982), which uses the density difference for an evaluation of the two-electron integrals with the fast multipole techniques White and Head-Gordon (1994); Strain et al. (1996) and the quantum chemical tree code Challacombe et al. (1996), super-linear scales with respect to the problem size. The scalability of such a calculation is noticeably lower than quadratic with respect to the problem size and in some cases it is close to linear dependence. The explanations of this phenomenon were also given in these publications. They are based on an numerical estimation of scaling property of the total number of non-zero two-electron integrals. Formally the total number of two-electron integrals scales as for a system with basis functions. But, only non-zero integrals are important in real calculations. In this connection, in Dyczmons (1973) it was argued that the total number of non-zero two-electron integrals () scales as . Later, without a proof, it was noted Ahlrichs (1974) that scales as . An estimation that scales as for integrals with absolute values greater , which has been reported in Strout and Scuseria (1995), is now widely accepted.
On the other hand, the small and large two-electron integrals are used differently in the modern direct algorithm of the Fock matrix calculations Almlöf et al. (1982). Only integrals, whose contributions to the Fock matrix are greater than predefined threshold, are calculated and used in calculations of the Fock matrix. This means that the integrals, which are small in absolute values, are calculated really, while those one, which absolute values are large, calculated often. Therefore, the scaling properties of the total numbers of non-zero small and large two-electron integrals have to be estimated separately first before investigation of the scaling property of the Fock matrix calculation on the number of basis functions.
Possible scaling property the number of non-zero two-electron integrals can be qualitatively understand by considering a classification of the two-electron integrals on four classes in accordance with the number of centers together with estimations of numbers of integrals in these classes presented in Table 1.
From this classification follows that can only have one of the following possible asymptotic dependencies on : the first, the second, the third, and the fourth power, because this dependence is defined by the asymptotic property of integrals, belonging to the corresponding classes, with respect to increasing the distances between their centers. Then, the results presented in work Strout and Scuseria (1995) show that at least the four-center integrals give no contribution to the total number of non-zero integrals. Otherwise, any non-zero part of four-center integrals has to give the fourth power dependence of the number of non-zero integrals due to highest power dependence of this class in comparison with others. The complete elimination of these integrals arises because an asymptotic value of any four-center two-electron integral with respect to increasing the distances between their centers is equal to zero. (Detailed consideration of asymptotic properties of the two-electron integrals is given below.) The asymptotic property of the two- and three-center integrals with respect to the distances between the centers is similar to that of the four-center integrals. Therefore, they also must have no influence on the asymptotic property of the number of non-zero two-electron integrals. The only what they can do is shifting the linear scaling behavior of the number of non-zero two-electron integrals towards to larger molecular systems. The one-center two-electron integrals do not depend on the distances between the centers. Therefore, their values are constants and, hence, they give asymptotically linear scaling dependence of the total number of non-zero two-electron integrals.
For this reason, the fractional asymptotic values of the power dependence of the number of non-zero two-electron integrals from the number of basis functions given in Strout and Scuseria (1995) obviously point out that the correct asymptotic values of these quantities have not been reached.
Now, consider the two-electron integral
[TABLE]
where is the -th basis function located at the nucleus of the first electron and is the distance between two electrons. To prove the mathematical statement about power dependence of on and, hence, on the size of a system, the two-electron integrals must be preliminary classified on the numbers of centers, and the dependencies of the values of these integrals on the distances between their centers in each class must be considered. After that, the theorem on power dependence of on can be proven.
Let us define a model system which is formed by the same atoms uniformly distributed in three-dimensional space with one basis function per each nucleus and which will be further used in the present consideration.
Altogether, there are four types of two-electron integrals: one-, two-, three-, and four-center integrals.
One-center two-electron integrals. These integrals do not depend on the distances between centers because all basis functions are located on the same center. Therefore, the values of these integrals are constants and are defined by the parameters of basis functions. There are only such integrals for the model system under consideration.
Two-center two-electron integrals. There are two sub-types of these integrals. In the first one
[TABLE]
the basis functions of each electron are located on the same center. Therefore, this integral is the two-electron integral for two density distributions. It is clear that the value of such an integral decreases with increasing the distance between the two centers. Probably, it is not possible derive an explicit formula for the value of such an integral. Though, for the distances between two centers which are larger than the sum of typical sizes of these distributions, an asymptotic formula for the value of this integral can be derived if it is noted that the integral in this case is a Coulomb potential energy between two density distributions. In Electrostatics it is well known that for distances larger than the radius of the charge density distribution, the distribution can be replaced by a corresponding point charge located in the center of the density distribution Pursell (1984). For Gaussian function, due to symmetry, such a center is a point where the function is located. The integral, in this case, is expressed by a simple formula
[TABLE]
where is the distance between the centers and and , are normalization factors of the corresponding basis functions.
The values of this two-center two-electron integral for some distances between two centers and the inverse values of these distances for cases of two and two Gaussian functions with exponents equal to 0.5 are presented in Table 2. The integrals were calculated using the Rys polynomial method King and Dupuis (1976). An analysis of this data shows that the asymptotic formula gives reasonable estimations of two-center two-electron integral values, which are especially good for the case of functions. In this case, the asymptotic and numerical values are in excellent agreement for distances larger than 2.0 Å.
In the second two-center two-electron integral
[TABLE]
the basis functions of each electron are located on the different centers, however, the centers are the same for each pair. The use of the Gaussian product theorem Shavitt (1963) transforms this integral into the one-center two-electron integral considered above with a prefactor which exponentially decreases with increase of the distance between the centers
[TABLE]
Here, and are exponents of Gaussian functions and , and are Gaussian functions with exponent located at a center between centers and .
Thus, the values of both types of two-center two-electron integrals decrease with increasing distances between the centers and, therefore, their asymptotic values are equal to zero in the limit of infinite distances between the centers.
Three-center two-electron integrals
[TABLE]
The Gaussian product theorem can be applied, in this case, to a couple of functions of the same electron located on different centers. In such a way, the tree-center two-electron integral can be transformed into a two-center two-electron integral with an exponential prefactor depending on the distance between the centers
[TABLE]
The two-center two-electron integral itself is exactly the integral which has been considered above. The value of this integral depends on the distance between and centers and decreases with its increase. The value of this integral also exponentially decreases with increasing the distance between centers and . Hence, the asymptotic value of this integral is equal to zero.
Four-center two-electron integrals. The double application of the Gaussian product theorem transforms this integral into a two-center two-electron integral with two prefactors exponentially depend on distances between the centers
[TABLE]
All three multipliers in this expression decrease with increasing distances between the centers. Therefore, the asymptotic value of this integral is zero.
Thus, only the one-center two-electron integrals do not depend on geometry and the number of these integrals is proportionally to . The values of all other integrals depend on the distances between centers and their asymptotic values are equal to zero. Therefore, in large molecules, where distances between the majority of atoms or the centers of two-electron integrals are large, the values of these integrals will be negligibly small and, hence, they will have no influence on the asymptotic property of the number of non-zero integrals. Thus, the following Theorem can be formulated:
Theorem**.**
The number of two-electron integrals , whose absolute values greater than , linearly scales on for large molecular systems.
Proof.
Consider the number of two-electron integrals for the model system, which can be presented in the following way
[TABLE]
where , when , and otherwise; and indices are ordered in usual way , , and ; if then . Define a tree-dimension manifold of radius , whose centrum is located on the nucleus. Then, for any small it is possible define such an that, when , then and otherwise. (The possibility of defining such follows from the consideration of two-electron integrals given above, where it was shown that the values of all two-, three-, and four-center two-electron integrals asymptotically decrease to zero with increasing the distances between the centers.) Now, we can see that summation for fixed in the last triple sum in (1) have to be done not over the full range of indices from 1 to , but only for parts of them, which are defined by manifolds . The radius of manifold only depends on , but not on . Therefore, the triple summation for fixed gives a constant which is defines by , but does not depend . Summation over all indices, from 1 to , occurs only when and, hence, . Thus, the quadruple summation in (1) reduces to a single summation of some constants depending on . This shows that the number of non-zero two-electron integrals scales linearly with respect to a number of basis functions when are small in comparison to the typical size of a large system. In the opposite case, a deviation from linear scaling will be observed for . ∎
Before considering the results of numerical examples, it needs to be noted that the values of two-center two-electron integrals of the first type only reduce as . Therefore, these integrals can be excluded from calculations only at extremely large distances between the centers. However, a number of these integrals formally scales only as in comparison with dependence for a total number of two-electron integrals. Hence, a contribution of these integrals to the total number of two-center integrals decreases with increase of the number of basis functions. For example, in the largest calculations presented below, with 31409 basis functions, the formal weight of these integrals in the total number of two-electron integrals is about of . This small contribution permits to conclude that the influence of these integrals on the scaling property of the total number of non-zero integrals will be practically invisible, even when taking into account that the weight of these integrals in the total number of non-zero integrals can increase with due to slower asymptotic convergence of them in comparison with other integrals.
It can be also noted that due to the formal dependency all these integrals are calculated during a time comparable with that one for a calculation of one-electron integrals. Therefore, a calculation of these integrals has no significant influence on the total time of two-electron integral calculation.
The numerical investigations of differential properties of require that the size of molecular systems increases with a constant increment, because only in this case the finite difference derivatives of evaluated at different can be correctly compared. This requirement can be realized only in one-, two-, and quasi three-dimensional molecular systems. The variations of or in numerical experiments can be easy simulated by counting for different values of two-electron integral cutoff thresholds.
In the present study, the numbers were counted for different two-electron integral cutoff thresholds for three systems. The first one was a linear chain of lithium atoms, where the number of atoms are varied from 10 to 1500. The second case was a two-dimensional cell of lithium atoms, while the last one was alanine polymers, where the number of alanine blocks varied from 10 to 790 and the number of atoms varied from 112 to 7912. The internuclear distances in the first two examples correspond to the distances in the solid lithium. The geometries of alanine polymers were optimized by the Tinker program Ren and Ponder (2003). The STO-3G basis Hehre et al. (1969) was employed in the calculations of the first two systems and the 6-31G basis Hehre et al. (1972) was used in the calculations of the alanine polymers. The number of contracted Gaussian type basis functions (CGTF) in the calculations of these systems varied from 50 to 7500, from 200 to 8000, and from 609 to 43509 correspondingly. The calculations have been performed using the program Mitin (1998) with an improved integral code.
The dependencies of on the number of basis functions for the different cutoff thresholds obtained for the first two systems in these calculations are presented on Fig. 1.
The dependencies of from on these figures are quite similar for both molecular systems. In the third case the dependence of on for low precision of integrals has been investigated in detail because the dependence of these integrals on defines the scaling property of the Fock matrix calculation method with the use of density or density difference prescreening. Indeed, in modern programs the two-electron integrals are calculated when an estimation of maximal integral in a block of shell integrals , where is the largest matrix element of the density matrix difference. The values of usually equals between and in calculations of large molecular systems Challacombe and Schwegler (1997). For such systems, the convergence criterion on density matrix difference has the same value. Therefore, large non-zero two-electron integrals defines the scaling property of the Fock matrix calculations.
The dependencies and its first finite differences on are presented on different figures due to significant deviations in number of non-zero integrals. These dependencies for equals to and are presented on Fig. 2a and Fig. 2b correspondingly.
Similar dependencies for integral cutoff precision of , , , and are presented on Fig. 3a and Fig. 3b, while dependencies for precision of , and , are given on Fig. 4 and Fig. 5, correspondingly.
An analysis of these results shows that, as expected, the scales linearly on for the low cutoff thresholds almost for all , while for higher thresholds the dependencies are still higher than linear. In particular, the scales linearly for any number of basis functions for integrals with absolute values larger than . For integrals with absolute values larger than the scales linearly for more than 20000 basis functions (see Fig. 2). The dependencies presented on Fig. 3a and Fig. 3b show that, from the practical point of view, scales almost linearly for integral precision and lower. For higher precision (Fig. 4 and Fig. 5) the linear dependence of the number of non-zero two-electron integrals is not observed in the presented calculations. However, the convex form of the dependence of on and the concave form of the dependence of the first finite difference derivatives on points that the power dependence of on is lower than the second order.
Thus, the results of the numerical investigations are in agreement with the theorem which says that a number of non-zero two-electron integrals scales asymptotically linearly with respect to a number of basis functions for large systems.
III Linear scaling Fock matrix calculations with density prescreening
There are two possible algorithms for the calculation of the two-electron part of Fock matrix with density prescreening for an evaluation of the contributions of non-zero two-electron integrals. The first one is based on using the total density in a prescreening procedure, while in the second algorithm the Fock matrix is formed from the calculated increments corresponding to the density differences Almlöf et al. (1982). In the simplest form, the algorithm with total density prescreening can be presented as follows
[TABLE]
and the algorithm using density difference prescreening has the following form
[TABLE]
The schemes of both algorithms explicitly present only the main steps, which are needed for the selection of two-electron integrals by using prescreening techniques and calculations of Fock and delta Fock matrices. The fast multipole type methods White and Head-Gordon (1994); Strain et al. (1996), the quantum chemical tree code Challacombe et al. (1996), and the conventional self-consistent-field (SCF) methods with density and density difference prescreenning Mitin et al. (2003) are described by these algorithms. The fast multipole methods and the quantum chemical tree code were realized as direct SCF methods proposed in Duke (1974) and then reintroduced in Almlöf et al. (1982) together with other improvements.
The Step 4 of Algorithm I and Step 5 of Algorithm II for the Fock and delta Fock matrix calculations are similar and are the most computationally demanding steps of the algorithms. Therefore, they define the scaling property of the Fock and delta Fock matrix calculations with respect to the problem size.
The scaling property of these steps can be revealed by considering distributions of non-zero two-electron integrals and leading density matrix elements on absolute values in dependence on the problem size. The results presented above show that the non-zero two-electron integrals with large absolute values scales linearly on , although the small integrals scale higher than linearly, however, below quadratic. This means that non-zero two-electron integrals can be considered as a distribution where the largest two-electron integrals or the core part of this distribution scale linearly with respect to the number of basis functions, while the remaining large part of the distribution scales higher than linear. It is also clear that the fractions of the largest two-electron integrals in such a distribution decrease when the number of basis functions increase.
This property of non-zero two-electron integral distributions in calculations of alanine polymers with 6-31G Hehre et al. (1972) basis is presented in Table 3. The number of basis functions changes from 609 to 27009 in these calculations. The normalized fractions (in %) of are given from third to seventh columns.
Presented results show that the fraction of small integrals increases, while that of large integrals decreases with increase of molecular system size. This property of non-zero two-electron integral distribution follows from the Theorem proven above. Indeed, the considered example corresponds to the case when the number of non-zero large integrals scales linearly on , whereas the number of non-zero small integrals scales higher than linear. For this reason, the fraction of large integrals reduces and the fraction of small integrals increases when increasing the number of basis functions.
It is well known that the number of leading matrix elements of a density matrix linearly scales with respect to the number of basis functions Kohn (1959). On the other hand, the total number of matrix elements of a density matrix is a quadratic function of . Therefore, the distributions of the density matrix elements and the number of non-zero two-electron integrals on absolute value must have similar properties. An example of such a distribution is presented in Table 4. Density matrices were obtained in Hartree-Fock calculations of alanine polymers with 6-31G basis. The presented results explicitly display that the fraction of the main matrix elements of density matrix reduces with increase of molecular system size.
The results presented above permit to consider the calculations at Step 4 and Step 5 of Algorithms I and II, correspondingly, as a generalized product of two distributions: the distribution of non-zero two-electron integrals and the distribution of density matrix elements. The properties of these distributions are similar, however, they differ in the number of elements. From the analysis of the scaling properties of these distributions follows that, similar to the scaling property of the non-zero two-electron integrals, the Fock and delta Fock matrix calculations at and only asymptotically linearly scale with respect to the number of basis functions without using the cutoff criterions. This means that their power dependencies on the number of basis functions are lower than quadratic, but higher than one.
The use of the cutoff criteria at Step 4 and Step 5 of Algorithms I and II enforces the linear scaling property of the Fock matrix calculations. Indeed, the calculations with the largest two-electron integrals and the largest density matrix elements and the calculations with small integrals and small density matrix elements scale differently with respect to the number of basis functions. The calculations with small values give only insignificant contributions to the two-electron part of the Fock or the delta Fock matrices. The use of the cutoff criteria at Step 4 and Step 5 excludes these calculations and preserves only those which give important contributions to the Fock or the delta Fock matrices. Thus, the use of the cutoff criteria transforms the asymptotically linear scaling Fock and delta Fock matrix calculations into the linear scaling ones even at those dimensions where the number of non-zero two-electron integrals is still a non-linear scaling function.
A comparison of Algorithm I and Algorithm II shows that one of the main differences between them arises due to different cutoff criteria. At the same time the difference between these criteria is defined by the fact that the density converges to a constant
[TABLE]
while the density difference converges to zero
[TABLE]
when self-consistent iterations converge. This means that the density difference cutoff criterion is stronger than the density cutoff criterion. Therefore, the scaling property of Algorithm II with density difference prescreening is better than that of Algorithm I. However, one can note that Algorithm I has better numerical stability than Algorithm II, because it is free from the elimination of meaning digits in real numbers which arises due to the difference operation in the latter one. This is the reason for frequently restarting Fock matrix calculations with Algorithm II in ab initio programs.
It needs emphasis that the cutoff criteria at Step 4 and Step 5 of Algorithm I and Algorithm II, correspondingly, do not perfectly linearize the Fock and the delta Fock matrix calculations. Therefore, deviations from the linear scaling behavior must be observed in calculations performed with these algorithms. The confirmations of this fact can be found, for example, in publications Strain et al. (1996); Challacombe et al. (1996). Thus, the Fock matrix calculations with GvFMM method of , ( to ) with 3-21G basis scales as the 1.35 power on the number of basis functions Strain et al. (1996). The quantum chemical tree code Challacombe et al. (1996) has shown the best power dependencies between 1.9 and 1.6 in calculations of water and -helices with 3-21G basis. The power dependencies displayed by both methods are definitely below quadratic. However, they are not linear, as it must be in exact linear scaling algorithms.
Thus, the linear scaling property of the Fock matrix calculation with density and density difference prescreening appears due to three factors: the asymptotically linear scaling property of the number of non-zero two-electron integrals, the linear scaling property of leading elements of the density matrix, and the use of the cutoff criterion for linearization of the Fock matrix calculation.
IV Linear scaling Fock matrix calculations with stored non-zero two-electron integrals, density difference prescreening, and data compression
The above consideration shows that any algorithm of Fock matrix calculation that can be reduced to Algorithms I or II should possesses the linear scaling property. In particular, the conventional algorithm of Fock matrix calculation with stored non-zero two-electron integrals can be reformulated as an algorithm with density (Algorithm I) or density difference (Algorithm II) prescreening Mitin et al. (2003) by using the data compression method Mitin (2002) to store non-zero two-electron integrals and their indices. This means that the Fock matrix calculations with stored two-electron integrals modified in such a way should possess the linear scaling property with respect to the number of basis functions.
The conventional Fock matrix calculation method with stored non-zero two-electron integrals, density difference prescreening and data compression has been realized in program Mitin (1998) and used for investigations of the scaling property of this method in calculations of alanine polymers , when the number of alanine blocks was varied from to . The calculations were performed with STO-3G Hehre et al. (1969), 3-21SP Mitin et al. (1996), and 6-31G Hehre et al. (1972) basis sets and parameter of the prescreening procedure was equal to in all cases. The calculations were performed on PC computer with Intel i7-2600K CPU running at 5.0 GHz. The time of two-electron part Fock matrix calculations in dependence on the number of basis functions for different cutoff precision of two-electron integral calculations are presented on Figures 6a through 6c for basis STO-3G, 3-21SP, and 6-31G correspondingly. The first finite differences for these dependencies, presented on Figures 7a through 7c, were constructed to display their power dependencies.
A consideration of the numerical derivatives presented on the last figures has to take into account that the density prescreening in the Fock matrix calculation with stored non-zero two-electron integrals is based on using a discrete classification of two-electron integrals on absolute values into five classes Mitin et al. (2003). Therefore, the time dependencies of the first numerical derivatives obtained with this algorithm and presented on Figure 7 are not as smooth as those which can be obtained with the density difference prescreening algorithm that uses the continuous classification of two-electron integrals on absolute values Almlöf et al. (1982).
The dependencies presented in Figures 6 and 7 show that the scaling behavior of the conventional Fock matrix calculation with stored non-zero two-electron integrals, density difference prescreening, and data compression becomes linear at about 2500 basis functions STO-3G basis and about 4000 basis functions for split-valence 3-21SP and 6-31G bases for low precision of two-electron integral calculations. When the two-electron integral cutoff precision increases, the linear scaling regime of the Fock matrix construction shifts to a higher number of basis functions. The first finite difference derivatives in Figures 7a, 7b, and 7c display this shift especially well.
Thus, numerical tests presented above show that the conventional Fock matrix calculation method with stored non-zero two-electron integrals, density difference prescreening, and data compression possesses the linear scaling property with respect to the number of basis functions. This result is in agreement with the conclusion of the theoretical analysis, given in the previous section, that any algorithm of Fock matrix calculation, which can be reduced to Algorithm I or Algorithm II, has to possess the linear scaling property with respect to the number of basis functions.
V Conclusion
The presented results show that the conventional Fock matrix calculation from the stored non-zero two-electron integrals with density prescreening and data compression possesses the linear scaling property. It follows from the proven Theorem, which says that the total number of non-zero two-electron integrals scales asymptotically linearly with respect to the number of basis functions for large molecular systems. An analysis of the Fock matrix calculations with density and density difference prescreening shows that the linear scaling property of this algorithm arises due to linear scaling properties of the number of non-zero two-electron integrals and the number of leading matrix elements of the density matrix. The density prescreening method reinforces this property, while the density difference prescreening further strengthens of it. An application of the data compression technique to store two-electron integrals and their indices permits the reformulation of the conventional Fock matrix calculations as a method with density or density difference prescreening. Therefore, the Fock matrix calculation from stored non-zero two-electron integrals with density or density difference prescreening and data compression also possesses the linear scaling property with respect to the number of basis functions. The numerical calculations demonstrate that this property begins at 2500 to 4000 basis functions in dependence on the basis function type in molecular calculations.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Hartree (1928) D. R. Hartree, Proc. Cambridge Phil. Soc. 24 , 89 (1928) . · doi ↗
- 2Fock (1930) V. Fock, Z. Phys. 61 , 126 (1930) . · doi ↗
- 3Hohenberg and Kohn (1964) P. Hohenberg and W. Kohn, Phys. Rev. 136 , B 864 (1964) . · doi ↗
- 4Kohn and Sham (1965) W. Kohn and L. J. Sham, Phys. Rev. 140 , A 1133 (1965) . · doi ↗
- 5Almlöf et al. (1982) J. Almlöf, K. Faegri, and K. Korsell, J. Comput. Chem. 3 , 385 (1982) . · doi ↗
- 6White and Head-Gordon (1994) C. A. White and M. Head-Gordon, J. Chem. Phys. 101 , 6593 (1994) . · doi ↗
- 7Strain et al. (1996) M. C. Strain, G. E. Scuseria, and M. J. Frisch, Science 271 , 51 (1996) . · doi ↗
- 8Challacombe et al. (1996) M. Challacombe, E. Schwegler, and J. Almlöf, J. Chem. Phys. 104 , 4685 (1996) . · doi ↗
