The Chemical Space of B, N-substituted Polycyclic Aromatic Hydrocarbons: Combinatorial Enumeration and High-Throughput First-Principles Modeling
Sabyasachi Chakraborty, Prakriti Kayastha, Raghunathan Ramakrishnan

TL;DR
This study exhaustively enumerates and models a vast chemical space of B, N-substituted polycyclic aromatic hydrocarbons, revealing potential photovoltaic applications through high-throughput first-principles calculations and trend analysis.
Contribution
It provides the first comprehensive enumeration and characterization of over 7.4 trillion B, N-substituted polycyclic hydrocarbons, combined with high-throughput DFT calculations for a subset of 33,000 molecules.
Findings
Identified a large fraction of molecules active in the solar spectrum range.
Discovered trends in structural stability and electronic properties.
Analyzed symmetry-controlled selectivity in synthesis yields.
Table 1: Number of unique compounds obtained by substituting pairs of inner C atoms with B, N pairs, listed per stoichiometry.

| Rings | PAH | Sites | C(X-2)B1N1 | C(X-4)B2N2 | C(X-6)B3N3 | C(X-8)B4N4 | C(X-10)B5N5 | C(X-12)B6N6 | Total^a | Generator cycle indices |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 2 | 1 | | | | | | 1 | {} |
| 3 | 2 | 4 | 3 | 3 | | | | | 6 | {} |
| 3 | 3 | 4 | 6 | 4 | | | | | 10 | {} |
| 4 | 4 | 6 | 5 | 18 | 4 | | | | 27 | {} |
| 4 | 5, 6 | 6 | 8 | 27 | 6 | | | | 82 | {} |
| 4 | 7 | 6 | 15 | 48 | 10 | | | | 73 | {} |
| 4 | 8 | 6 | 16 | 48 | 12 | | | | 76 | {} |
| 4 | 9 | 6 | 30 | 90 | 20 | | | | 140 | {} |
| 5 | 10 | 8 | 14 | 114 | 140 | 22 | | | 290 | {} |
| 5 | 11 | 8 | 17 | 119 | 150 | 24 | | | 310 | {} |
| 5 | 12–18 | 8 | 28 | 216 | 280 | 38 | | | 3,934 | {} |
| 5 | 19–24 | 8 | 56 | 420 | 560 | 70 | | | 6,636 | {} |
| 6 | 25–27 | 10 | 23 | 330 | 1,056 | 810 | 66 | | 6,855 | {} |
| 6 | 28–41 | 10 | 45 | 640 | 2,100 | 1,590 | 126 | | 63,014 | {} |
| 6 | 42–45 | 10 | 46 | 640 | 2,112 | 1,590 | 132 | | 18,080 | {} |
| 6 | 46–76 | 10 | 90 | 1,260 | 4,200 | 3,150 | 252 | | 277,512 | {} |
| 7 | 77 | 12 | 14 | 274 | 1,586 | 2,976 | 1,428 | 96 | 6,374 | {} |

Table 2: Number of unique compounds obtained by substituting pairs of peripheral (outer) C atoms with B, N pairs, listed per stoichiometry.

| Rings | PAH | Sites | C(X-2)B1N1 | C(X-4)B2N2 | C(X-6)B3N3 | C(X-8)B4N4 | C(X-10)B5N5 | C(X-12)B6N6 | C(X-14)B7N7 | C(X-16)B8N8 | Total^a | Generator cycle indices |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 8 | 14 | 114 | 140 | 22 | | | | | 290 | {} |
| 3 | 2 | 10 | 23 | 330 | 1,056 | 810 | 66 | | | | 2,285 | {} |
| 3 | 3 | 10 | 45 | 640 | 2,100 | 1,590 | 126 | | | | 4,501 | {} |
| 4 | 4 | 12 | 22 | 510 | 3,084 | 5,820 | 2,772 | 166 | | | 12,374 | {} |
| 4 | 5 | 12 | 33 | 765 | 4,620 | 8,730 | 4,158 | 246 | | | 18,552 | {} |
| 4 | 6 | 10 | 23 | 330 | 1,056 | 810 | 66 | | | | 2,285 | {} |
| 4 | 7, 8 | 12 | 66 | 1,500 | 9,240 | 17,370 | 8,316 | 472 | | | 73,928 | {} |
| 4 | 9 | 12 | 132 | 2,970 | 18,480 | 34,650 | 16,632 | 924 | | | 73,788 | {} |
| 5 | 10 | 14 | 46 | 1,533 | 15,030 | 52,710 | 63,108 | 21,126 | 868 | | 154,421 | {} |
| 5 | 11 | 12 | 33 | 765 | 4,620 | 8,730 | 4,158 | 246 | | | 18,552 | {} |
| 5 | 12–15, 18 | 14 | 91 | 3,024 | 30,030 | 105,210 | 126,126 | 42,112 | 1,716 | | 1,541,545 | {} |
| 5 | 16 | 14 | 92 | 3,024 | 30,060 | 105,210 | 126,216 | 42,112 | 1,736 | | 308,450 | {} |
| 5 | 17 | 12 | 66 | 1,500 | 9,240 | 17,370 | 8,316 | 472 | | | 73,928 | {} |
| 5 | 19, 20, 22–24 | 14 | 182 | 6,006 | 60,060 | 210,210 | 252,252 | 84,084 | 3,432 | | 3,081,130 | {} |
| 5 | 21 | 12 | 132 | 2,970 | 18,480 | 34,650 | 16,632 | 924 | | | 73,788 | {} |
| 6 | 25, 26 | 16 | 60 | 2,772 | 40,040 | 225,540 | 504,504 | 420,840 | 102,960 | 3,270 | 2,599,972 | {} |
| 6 | 27 | 14 | 46 | 1,533 | 15,030 | 52,710 | 63,108 | 21,126 | 868 | | 154,421 | {} |
| 6 | 28, 29, 34, 36 | 14 | 91 | 3,024 | 30,030 | 105,210 | 126,126 | 42,112 | 1,716 | | 1,233,236 | {} |
| 6 | 30, 31 | 12 | 66 | 1,500 | 9,240 | 17,370 | 8,316 | 472 | | | 73,928 | {} |
| 6 | 32, 33, 35, 37–45 | 16 | 120 | 5,488 | 80,080 | 450,660 | 1,009,008 | 841,120 | 205,920 | 6,470 | 31,186,392 | {} |
| 6 | 46–49, 62, 65, 66, 76 | 14 | 182 | 6,006 | 60,060 | 210,210 | 252,252 | 84,084 | 3,432 | | 4,929,808 | {} |
| 6 | 50–61, 63, 64, 67–75 | 16 | 240 | 10,920 | 160,160 | 900,900 | 2,018,016 | 1,681,680 | 411,840 | 12,870 | 119,522,398 | {} |
| 7 | 77 | 12 | 11 | 265 | 1,542 | 2,940 | 1,386 | 90 | | | 6,234 | {} |

Table 3: Number of unique compounds obtained by substituting pairs of C atoms at all sites (inner and outer, without restrictions) with B, N pairs, listed per stoichiometry.

| Rings | PAH | Sites | C(X-2)B1N1 | C(X-4)B2N2 | C(X-6)B3N3 | C(X-8)B4N4 | C(X-10)B5N5 | C(X-12)B6N6 | C(X-14)B7N7 | C(X-16)B8N8 | C(X-18)B9N9 | C(X-20)B10N10 | C(X-22)B11N11 | C(X-24)B12N12 | C(X-26)B13N13 | Total^b |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | 10 | 23 | 330 | 1,056 | 810 | 66 | | | | | | | | | 2,285 |
| 3 | 2 | 14 | 46 | 1,533 | 15,030 | 52,710 | 63,108 | 21,126 | 868 | | | | | | | 154,421 |
| 3 | 3 | 14 | 91 | 3,024 | 30,030 | 105,210 | 126,126 | 42,112 | 1,716 | | | | | | | 308,309 |
| 4 | 4 | 18 | 51 | 3,096 | 61,890 | 510,888 | 1,837,836 | 2,859,726 | 1,750,320 | 328,500 | 8,110 | | | | | 7,360,417 |
| 4 | 5 | 18 | 77 | 4,644 | 92,848 | 766,332 | 2,756,964 | 4,289,544 | 2,625,760 | 492,750 | 12,190 | | | | | 11,041,109 |
| 4 | 6 | 16 | 63 | 2,785 | 40,142 | 225,690 | 504,894 | 421,050 | 103,140 | 3,290 | | | | | | 1,301,054 |
| 4 | 7 | 18 | 153 | 9,216 | 185,640 | 1,531,908 | 5,513,508 | 8,577,408 | 5,250,960 | 984,870 | 24,310 | | | | | 22,077,973 |
| 4 | 8 | 18 | 154 | 9,216 | 185,696 | 1,531,908 | 5,513,928 | 8,577,408 | 5,251,520 | 984,870 | 24,380 | | | | | 22,079,080 |
| 4 | 9 | 18 | 306 | 18,360 | 371,280 | 3,063,060 | 11,027,016 | 17,153,136 | 10,501,920 | 1,969,110 | 48,620 | | | | | 44,152,808 |
| 5 | 10 | 22 | 116 | 11,055 | 373,110 | 5,597,460 | 40,739,328 | 149,382,156 | 274,364,760 | 240,075,990 | 88,915,400 | 10,671,738 | 176,484 | | | 810,307,597 |
| 5 | 11 | 20 | 98 | 7,352 | 193,984 | 2,205,812 | 11,641,224 | 29,103,760 | 33,258,880 | 15,592,270 | 2,310,220 | 46,448 | | | | 94,360,048 |
| 5 | 12–15, 18 | 22 | 231 | 22,000 | 746,130 | 11,192,940 | 81,477,396 | 298,755,072 | 548,725,320 | 480,140,430 | 177,827,650 | 21,340,704 | 352,716 | | | 8,102,902,945 |
| 5 | 16 | 22 | 232 | 22,000 | 746,220 | 11,192,940 | 81,478,656 | 298,755,072 | 548,729,520 | 480,140,430 | 177,830,800 | 21,340,704 | 352,968 | | | 1,620,589,542 |
| 5 | 17 | 20 | 190 | 14,580 | 387,600 | 4,409,580 | 23,279,256 | 58,200,240 | 66,512,160 | 31,179,150 | 4,618,900 | 92,504 | | | | 188,694,160 |
| 5 | 19, 20, 22–24 | 22 | 462 | 43,890 | 1,492,260 | 22,383,900 | 162,954,792 | 597,500,904 | 1,097,450,640 | 960,269,310 | 355,655,300 | 42,678,636 | 705,432 | | | 16,205,677,630 |
| 5 | 21 | 20 | 380 | 29,070 | 775,200 | 8,817,900 | 46,558,512 | 116,396,280 | 133,024,320 | 62,355,150 | 9,237,800 | 184,756 | | | | 377,379,368 |
| 6 | 25, 26 | 26 | 163 | 22,542 | 1,151,216 | 27,343,030 | 334,640,790 | 2,230,954,440 | 8,286,315,840 | 17,090,574,930 | 18,989,469,950 | 10,634,147,524 | 2,636,560,416 | 219,721,684 | 2,600,612 | 120,907,006,274 |
| 6 | 27 | 24 | 141 | 16,059 | 673,270 | 12,873,780 | 123,563,628 | 624,680,196 | 1,682,775,288 | 2,366,416,530 | 1,636,032,230 | 490,822,458 | 48,678,084 | 676,984 | | 6,987,208,648 |
| 6 | 28, 29, 34, 36 | 24 | 276 | 31,944 | 1,345,960 | 25,742,970 | 247,118,256 | 1,249,329,312 | 3,365,515,296 | 4,732,773,210 | 3,272,028,760 | 981,616,944 | 97,349,616 | 1,352,540 | | 55,896,820,336 |
| 6 | 30, 31 | 22 | 231 | 22,000 | 746,130 | 11,192,940 | 81,477,396 | 298,755,072 | 548,725,320 | 480,140,430 | 177,827,650 | 21,340,704 | 352,716 | | | 3,241,161,178 |
| 6 | 32, 33, 35, 37–41 | 26 | 325 | 44,928 | 2,302,300 | 54,681,770 | 669,278,610 | 4,461,874,560 | 16,572,613,200 | 34,181,059,770 | 37,978,905,250 | 21,268,222,976 | 5,273,104,200 | 439,431,356 | 5,200,300 | 967,253,756,360 |
| 6 | 42–45 | 26 | 326 | 44,928 | 2,302,432 | 54,681,770 | 669,281,580 | 4,461,874,560 | 16,572,631,680 | 34,181,059,770 | 37,978,939,900 | 21,268,222,976 | 5,273,120,832 | 439,431,356 | 5,201,224 | 483,627,173,336 |
| 6 | 46–49, 62, 65, 66, 76 | 24 | 552 | 63,756 | 2,691,920 | 51,482,970 | 494,236,512 | 2,498,640,144 | 6,731,030,592 | 9,465,511,770 | 6,544,057,520 | 1,963,217,256 | 194,699,232 | 2,704,156 | | 223,586,691,040 |
| 6 | 50–61, 63, 64, 67–75 | 26 | 650 | 89,700 | 4,604,600 | 109,359,250 | 1,338,557,220 | 8,923,714,800 | 33,145,226,400 | 68,362,029,450 | 75,957,810,500 | 42,536,373,880 | 10,546,208,400 | 878,850,700 | 10,400,600 | 5,561,704,201,450 |
| 7 | 77 | 24 | 49 | 5,411 | 224,626 | 4,292,790 | 41,190,876 | 208,237,164 | 560,936,856 | 788,825,460 | 545,356,070 | 163,616,810 | 16,228,212 | 226,150 | | 2,329,140,474 |

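Table 4: PAHs that share the same number of rings and substitution sites but yield distinct product distributions.

| Rings | Sites | PAH | Generator cycle indices | PAH | Generator cycle indices |
|---|---|---|---|---|---|
| Inner sites | | | | | |
| 4 | 6 | 7 | {} | 8 | {} |
| 5 | 8 | 10 | {2} | 11 | {} |
| 6 | 10 | 28–41 | {} | 42–45 | {} |
| Peripheral sites | | | | | |
| 5 | 14 | 12–15, 18 | {} | 16 | {} |
| All sites | | | | | |
| 4 | 18 | 7 | {} | 8 | {} |
| 5 | 22 | 12–15, 18 | {} | 16 | {} |
| 6 | 26 | 32, 33, 35, 37–41 | {} | 42–45 | {} |
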
| Compounds | IR, < 1.77 eV (%) | Visible, 1.77–3.09 eV (%) | UV, > 3.09 eV (%) |
|---|---|---|---|
| naphthalene | | | |
| C8B1N1H8 | 0.00 | 30.43 | 69.57 |
| C6B2N2H8 | 4.85 | 56.97 | 38.18 |
| C4B3N3H8 | 16.95 | 59.19 | 23.86 |
| C2B4N4H8 | 29.14 | 49.01 | 21.85 |
| B5N5H8 | 19.70 | 37.88 | 42.42 |
| larger rings | | | |
| 3 rings | 7.30 | 62.04 | 30.66 |
| 4 rings | 28.86 | 55.60 | 15.55 |
| 5 rings | 43.14 | 49.36 | 7.50 |
| 6 rings | 53.59 | 42.88 | 3.53 |
| coronene | 4.08 | 87.76 | 8.16 |

The Chemical Space of B, N-substituted Polycyclic Aromatic Hydrocarbons: Combinatorial Enumeration and High-Throughput First-Principles Modeling
Sabyasachi Chakraborty
Tata Institute of Fundamental Research, Centre for Interdisciplinary Sciences, Hyderabad 500107, India
Prakriti Kayastha
Tata Institute of Fundamental Research, Centre for Interdisciplinary Sciences, Hyderabad 500107, India
Raghunathan Ramakrishnan
Tata Institute of Fundamental Research, Centre for Interdisciplinary Sciences, Hyderabad 500107, India
Abstract
Combinatorial introduction of heteroatoms in the two-dimensional framework of aromatic hydrocarbons opens up possibilities to design compound libraries exhibiting desirable photovoltaic and photochemical properties. Exhaustive enumeration and first-principles characterization of this chemical space provide indispensable insights for rational compound design strategies. Here, for the smallest seventy-seven Kekulean-benzenoid polycyclic systems, we reveal combinatorial substitution of C atom pairs with the isosteric and isoelectronic B, N pairs to result in 7,453,041,547,842 (7.4 tera) unique molecules. We present comprehensive frequency distributions of this chemical space, analyze trends and discuss a symmetry-controlled selectivity manifestable in synthesis product yield. Furthermore, by performing high-throughput ab initio density functional theory calculations of over thirty-three thousand (33k) representative molecules, we discuss quantitative trends in the structural stability and inter-property relationships across heteroarenes. Our results indicate a significant fraction of the 33k molecules to be electronically active in the 1.5–2.5 eV region, encompassing the most intense region of the solar spectrum, indicating their suitability as potential light-harvesting molecular components in photo-catalyzed solar cells.
I Introduction
Heteroarenes containing a subset of B, N, O, P, and S atoms are versatile organic compounds exhibiting useful mechanical, optoelectronic, chemisorption and catalytic properties [Wong et al. (2006); Marcon, von Lilienfeld, and Andrienko (2007); Campbell et al. (2010); Jiang, Li, and Wang (2013); Al-Hamdani et al. (2014); Hashimoto et al. (2014); Gong et al. (2015); Stepien et al. (2016); Ito, Ozaki, and Itami (2017); Wang et al. (2017)]. The most extensively studied heteroarenes are either those based on polycyclic aromatic hydrocarbon (PAH) molecules or their two-/three-dimensional (2D/3D) periodic forms: graphene and graphite. Borazine and hexagonal boron-nitride (BN) are fully heteroatom-substituted arenes; while their popular names, inorganic benzene and inorganic graphite, are inspired by their isoelectronicity with organic counterparts, their properties contrast starkly with those of the organic compounds. The latter compound, due to its suitable band structure, plays a vital role in the design of graphene-based heterostructures [Dean et al. (2010)], while its partially hydrogenated form shows visible-light activity for photocatalyzed water splitting [Li, Zhao, and Yang (2013)]. Both at the molecular level and in the extended 2D domain, B, N-arenes exhibit unique physical and chemical properties [Dutta and Pati (2008); Ci et al. (2010); Yamijala, Bandyopadhyay, and Pati (2013)]. Due to the isoelectronic and isosteric relationships between C-C and B-N fragments, these compounds resemble their parent hydrocarbons in chemistry, a fact that has motivated the related synthesis endeavors of the past decades [Dewar, Kubba, and Pettit (1958); Dewar and Dietz (1959); Chissick, Dewar, and Maitlis (1960); Dewar and Dougherty (1964); Davies, Dewar, and Rona (1967); Fang et al. (2006); Jaska et al. (2006); Bosdet et al. (2007a); Yamamoto and Takimiya (2007); Bosdet and Piers (2009); Matsui et al. (2017)].
On the other hand, these heteroarenes have also been reported to exhibit mechanical and electronic properties differing from those of pure carbon and fully heteroatomic compounds [Miyamoto et al. (1994); Watanabe et al. (1996)]. Combinatorial diversity arising from site-specific atomistic substitutions in the arene framework, combined with the local polarity introduced by heteroatoms, gives rise to continuously distributed molecular properties in ranges desirable for a multitude of applications [Ghosh, Periyasamy, and Pati (2011); Morgan and Piers (2016)]. Yet another combinatorial scenario arises in extended arenes; for instance, nanotubules made of hexagonal BC2N sheets show a wide degree of anisotropic conductivity stemming from distinct ways of rolling the heteroaromatic sheet into tubules [Miyamoto et al. (1994)]. From a molecular perspective, introduction of heteroatoms in the arene framework serves as an invaluable alternative to coupling C atoms with functional groups, which in the case of aromatic arenes is thermodynamically amenable only at peripheral sites [Müller et al. (2014)].
Mathematical methods for enumerating molecular datasets have been thoroughly reviewed by Faulon (1992). Historically, isomer counting of cyclic compounds has been based on the enumeration theorem named after George Pólya [Pólya (1937); Freudenstein (1967); Polya and Read (2012)]. Exhaustive applications of this technique have been carried out by Lindsey et al. to enumerate macrocyclic compound libraries [Taniguchi, Du, and Lindsey (2011, 2013)] suitable for light-harvesting [Yuen et al. (2018)]. Baraldi et al. [Baraldi and Vanossi (2000); Baraldi, Fiori, and Vanossi (1999)] have applied Pólya's theorem and enumerated the stereoisomers of highly symmetric icosahedral topologies. Graph theoretical methods came to light with comprehensive enumerations of polycyclic hydrocarbons through the pioneering works of Dias (1985, 1986, 2007), Balaban (1985) and others [Lukšič and Pisanski (2007)]. Paton et al. have enumerated saturated acyclic alkanes by explicitly accounting for the instability of strained carbon skeletons [Paton and Goodman (2007)]. Shao et al. have applied subgroup decomposition of isomer permutation groups to enumerate a few fullerene cages [Shao, Wu, and Jiang (1996)] and applied the same technique to list the isomers of $\mathrm{B}_{24-m}\mathrm{N}_{m}$ [Shao and Jiang (1996)].
All the aforementioned works have been based on a non-constructive strategy, i.e., enumeration is based on closed-form algebraic expressions rather than explicit generation of molecular structures. As far as benzenoid hydrocarbons are concerned, a constructive strategy has been found to be of larger applicability, wherein explicit generation of structures or the corresponding graphs is required. For example, benzenoid hydrocarbons have been enumerated using the perimeter or the area of self-avoiding polygons on a hexagonal lattice [Gutman and Cyvin (1989); Cyvin, Brunvoll, and Cyvin (1992); Vöge, Guttmann, and Jensen (2002)]. Using this approach the benzenoid compounds with up to 50 hexagonal cells have been enumerated with a parallel algorithm [Jensen (2009)]. To date, one of the fastest approaches to constructively enumerate Kekulean benzenoid compounds, which are of interest to chemists, has been that of Brinkmann et al. [Brinkmann, Grothaus, and Gutman (2007)], which utilizes the idea of dual-graphs [Brinkmann, Caporossi, and Hansen (2002)]. A detailed account of the Kekulean structure as an important descriptor for benzenoid hydrocarbons can be found in the work of Cyvin et al. [Cyvin and Gutman (2013)].
B, N-arenes have been the subject of a diverse range of theoretical as well as combined theoretical/experimental studies. To begin with, already in the 1960s Hoffmann et al. had applied the extended-Hückel molecular orbital theory to a dozen or so heteroaromatic compounds, discussed the role of intramolecular non-bonded electrostatic interactions in these compounds and commented on their stability [Hoffmann (1964)]. Prior to that, Dewar had reported the synthesis of a hetero-phenanthrene with the C atoms at positions 9 and 10 replaced by a B, N pair [Dewar, Kubba, and Pettit (1958)]. Since then, a number of investigations have studied larger compounds ranging from PAHs [Neue et al. (2013); Wang et al. (2014); Long et al. (2016); Sanyal, Manna, and Pati (2014); Wang et al. (2015); Ishibashi et al. (2017); Fias, Chang, and von Lilienfeld (2018)] to fullerene cages [Zhu, Schmalz, and Klein (1997); Jensen (1993); Zhu et al. (1995); Seifert et al. (1997); Evangelisti (1997); Pattanayak, Kar, and Scheiner (2002); Balawender et al. (2018)] partly or completely enriched by B, N pairs.
The purpose of this paper is to non-constructively enumerate all possible unique compounds formed by substituting pairs of C atoms in PAHs comprising up to six rings with B and N atoms. We utilize the nuclear permutation groups that are isomorphic to the rotation groups of the PAH scaffolds and generate the corresponding pattern inventory. One of the aims of this study is to provide consolidated tabulations of B, N-substituted PAH (BN-PAH) compound frequencies, as a function of stoichiometry, symmetry and sites, to enable identification of statistical and combinatorial trends facilitating chemical space design strategies. As exemplars, we discuss (i) deviation of substitution patterns from commonly expected binomial/multinomial distributions; (ii) in the case of two or more PAHs of similar size (i.e. made of the same number of benzene rings) and symmetry (same point group), non-trivial frequency selectivities giving rise to distinct product distributions. We clarify the origin of both these effects using nuclear permutation groups. Furthermore, for a subset of 33,059 BN-PAH compounds, we present accurate geometrical features, energetics, optoelectronic properties, and inter-property correlations based on high-throughput density functional theory (DFT) calculations.
II Results and Discussion
II.1 Clar structures of Benzenoid Hydrocarbons
Benzenoid hydrocarbons (BHs) are a class of fusenes with two or more six-membered rings that are mutually condensed, i.e. each ring shares an edge with at least one other ring [Balaban and Harary (1968)]. The structures of BHs are typically isomorphic to those of hexagonal sub-lattices; however, helicenoid compounds, a class of BHs, adopt structures that are non-isomorphic to hexagonal sub-lattices [Brinkmann, Grothaus, and Gutman (2007)]. The valence electronic properties of BHs can be explained by Clar aromatic sextet theory [Clar (1964); Solà (2013)]. The essence of this theory is to derive a graphical representation of the electronic structure, which encodes bonding information from all the Kekulé structures (KSs) as well as the resonant hybrid one. For a given BH every KS can be represented as a Clar diagram composed of one or more aromatic π-sextets (hexagons inscribed with circles) and non-sextet carbon-carbon double bonds. A Clar diagram with the largest number of aromatic π-sextets is the so-called Clar structure (CS). It is important to consider the CS as a versatile chemical descriptor for benzenoid-Kekulean structures. Another such descriptor is the Wiener index [Nikolić and Trinajstić (1995); Vukičević and Trinajstić (2004)], which is a topological score mapping the molecular structure to various chemical properties. Algebraic representations of CSs have indeed been shown to capture more chemical information than a single KS [Randić (2004); Randić and Balaban (2018)]. Multiple Clar diagrams, where the π-sextets are located in adjacent rings, can be collectively represented by a single migrating CS [Solà (2013)]. The total number of π-sextets in the CS has been shown to correlate with bond-length variations, aromaticity and also the HOMO-LUMO gap of the BH [Solà (2013)]. Throughout this paper, all molecular cartoons are presented in the Clar form.
II.2 Constructive Enumeration of Benzenoid Hydrocarbons
KSs of PAHs with up to six benzene rings have been constructively enumerated using an in-house program. For a given number of rings, the program exhaustively generates all possible molecular structures by tiling (i.e. plane-filling) and ignores all the disconnected structures such as non-bonded molecular dimers. As an exemplary case, in Fig. 1 we have collected all the generated structures containing three rings. The resulting structures include benzenoid as well as non-benzenoid compounds. While the latter are excluded in this study, among benzenoids we have considered only those with a valid Kekulé formula. We note in passing that compounds which do not have a valid Kekulé formula, such as the one shown in Fig. 1, are non-aromatic as long as the system is charge neutral. It is worthwhile to note that the aforementioned procedure allows certain benzenoid structures with more rings to qualify; for instance, isochrysene, a formally 4-ring system (see Fig. 1, blue labels), can be generated using three benzene rings mutually connected at 1,2 sites in a non-benzenoid fashion (i.e. through sigma bonds). For a given number of rings, we exclusively collect all the benzenoid molecules with valid Kekulé formulae (see Fig. 1, green labels). The structures of all the resulting hydrocarbons comprising up to six benzene rings are displayed in Fig. 2. For 2-, 3-, 4-, 5-, and 6-ring compounds, our procedure has resulted in 1, 2, 6, 15, and 52 hydrocarbons, respectively. These numbers agree with those given by Brinkmann et al. [Brinkmann, Grothaus, and Gutman (2007)], who further enumerated the hydrocarbons with more benzene rings and reported 195, 807, 3,513, and 16,025 entries for 7-, 8-, 9-, and 10-ring compounds, respectively. We note that our plane-filling algorithm does not distinguish between the 6-ring compound hexahelicene (structure 45 in Fig. 2) and the peri-condensed 7-ring compound coronene, a.k.a. superbenzene (structure 77 in Fig. 2); both 3D structures are surjectively mapped to the same plane-filling 2D pattern. For this reason, we have included coronene as the sole 7-ring system in this work. It is important to note that helical compounds such as tetrahelicene (8 in Fig. 2), pentahelicene (13 in Fig. 2) and hexahelicene (45 in Fig. 2) are chiral, with two enantiomers having the same substitution pattern. For a given number of rings we found the number of hydrocarbon compounds in Fig. 2 to fully agree with a constructive enumeration performed using the program CaGe [Brinkmann et al. (2010)]. The main objective of this study is to group the seventy-seven PAHs shown in Fig. 2 according to their molecular symmetry group as well as available substitution sites, and to enumerate all possible unique compounds that can be formed by replacing pairs of C atoms in PAHs with B and N atoms.
II.3 Chemical Space Enumeration: Mathematical Formulation and Computation
Combinatorial enumeration of isomers, topologies, nuclear spin statistical weights, etc., can be efficiently performed using the generalized character cycle index (GCCI) [Balasubramanian (1985, 1992)] of the molecular symmetry group, which is a subgroup of the complete nuclear permutation inversion (CNPI) group [Bunker and Jensen (2006)]. The GCCI ($Z_G^{\Gamma}$) of a group $G$ corresponding to a particular irreducible representation $\Gamma$ is defined as

$$ Z_G^{\Gamma} = \frac{1}{|G|} \sum_{g \in G} \chi^{\Gamma}(g) \prod_{k} f_k^{\,c_k(g)}, \tag{1} $$

where $g$ goes over all elements of the permutation group $G$, $\chi^{\Gamma}(g)$ is the character of $g$'s matrix representation for the given irreducible representation, and $\prod_k f_k^{\,c_k(g)}$ corresponds to the cycle representation of element $g$, with $k$ running over the lengths of all the factor-cycles of $g$ and $c_k(g)$ counting the cycles of length $k$. In Eq. 1, $|G|$ denotes the group order and the figure counting series for cycle length $k$ is given by $f_k = a^k + b^k + \ldots$, with $a$, $b$, etc. being the objects to be permuted. For the special case where $\Gamma$ is the (fully) symmetric representation of $G$, for which $\chi^{\Gamma}(g) = 1$ for every $g$, one arrives at the Pólya enumeration formula [Pólya (1937); Polya and Read (2012)] that has been successfully employed for enumerating various chemical subspaces [Faulon, Visco Jr, and Roe (2005); Balaban (1991)]:

$$ Z_G = \frac{1}{|G|} \sum_{g \in G} \prod_{k} f_k^{\,c_k(g)}. \tag{2} $$

For the enumeration of BN-PAH chemical space, $f_k$ takes the form $f_k = \mathrm{C}^k + \mathrm{B}^k + \mathrm{N}^k$, where $k$ is the cycle length. Application of Eq. 2, for instance, to naphthalene (C10H8) with $G$ isomorphic to the point group $D_2$ results in the following pattern inventory

$$ Z_G = \mathrm{C}^{10} + 3\,\mathrm{C}^{9}\mathrm{B} + 3\,\mathrm{C}^{9}\mathrm{N} + 15\,\mathrm{C}^{8}\mathrm{B}^{2} + 23\,\mathrm{C}^{8}\mathrm{B}\mathrm{N} + 15\,\mathrm{C}^{8}\mathrm{N}^{2} + \ldots, \tag{3} $$

where each term $\mathrm{C}^{a}\mathrm{B}^{b}\mathrm{N}^{c}$ corresponds to the stoichiometry C$_a$B$_b$N$_c$ ($a, b, c \geq 0$ and $a + b + c = 10$). In all cases, the number of H atoms, not shown in Eq. 3, is 8 as in the parent hydrocarbon naphthalene. The coefficient of each term in Eq. 3 gives the number of unique compounds for that stoichiometry (i.e. the number of constitutional isomers). For example, the total number of naphthalene derivatives with stoichiometry C8B1N1H8 is 23, as given by the fifth term on the right side of Eq. 3. For asymmetric molecules, the permutation group is isomorphic to the point group $C_1$ and the GCCI reduces to the multinomial expansion

$$ (\mathrm{C} + \mathrm{B} + \mathrm{N})^{n} = \sum_{a + b + c = n} \frac{n!}{a!\,b!\,c!}\, \mathrm{C}^{a}\mathrm{B}^{b}\mathrm{N}^{c}, \tag{4} $$

where $n$ is the number of sites available for substitution and the summation goes over all combinations of non-negative $a$, $b$ and $c$ with the constraint $a + b + c = n$.
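As a worked illustration of the multinomial limit (an illustrative check added here, assuming a hypothetical asymmetric scaffold with $n = 10$ substitution sites), the number of isomers with $x$ B, N pairs is $n!/[(n-2x)!\,x!\,x!]$:

```latex
\begin{align*}
x = 1:&\quad \frac{10!}{8!\,1!\,1!} = 90, &
x = 2:&\quad \frac{10!}{6!\,2!\,2!} = 1260, \\
x = 3:&\quad \frac{10!}{4!\,3!\,3!} = 4200, &
x = 4:&\quad \frac{10!}{2!\,4!\,4!} = 3150, \\
x = 5:&\quad \frac{10!}{0!\,5!\,5!} = 252, &
\text{sum}:&\quad 90 + 1260 + 4200 + 3150 + 252 = 8952 .
\end{align*}
```

These are the values listed for the inner-site substitution of PAHs 46–76 in Table 1, whereas the symmetric ten-site scaffold naphthalene gives only 23 isomers for a single B, N pair.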
In the present work, we are solely interested in the enumeration of isosteric/isoelectronic compounds with an equal number of B and N atoms. For example, all naphthalene-based compounds are of the form $\mathrm{C}_{10-2x}\mathrm{B}_{x}\mathrm{N}_{x}\mathrm{H}_{8}$, where $x = 1, \ldots, 5$. For all seventy-seven hydrocarbons listed in Fig. 2, we have separately collected the pattern inventory for such substitutions of the inner C atoms (Table 1), outer C atoms (Table 2), and both inner and outer C sites without restrictions (Table 3). In these tables, we have also listed the cycle indices of the group generators. For naphthalene, in the case of substitution of the two inner C sites, the relevant permutation group has order two and its single generator exchanges the two sites. The cycle index set is then $\{2^1\}$, i.e., one cycle of length two. For the substitution pattern of the peripheral C atoms of naphthalene, the appropriate permutation group can be formed using a set of two generators, each consisting of four cycles of length two, $\{2^4, 2^4\}$ (see Table 2). The permutation group used for collecting the substitution pattern of all ten carbons of naphthalene is derived using a set of two generators with the corresponding cycle indices $\{2^5, 1^2 2^4\}$, i.e., a generator with five cycles of length two, and another with four cycles of length two that leaves the remaining two sites fixed. For all seventy-seven PAHs displayed in Fig. 2, the total number of compounds obtained by substituting all available C atoms (both inner and outer C sites without restrictions) is graphically presented in Fig. 3. The overall trend is that, for a given number of substitution sites, the maximal number of products is found for asymmetric hydrocarbon frameworks, which follow the multinomial distribution of Eq. 4.
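The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the four cycle types assumed for naphthalene (identity, two rotations made of five 2-cycles, one rotation made of four 2-cycles plus two fixed sites) are chosen to be consistent with the generators described in the text.

```python
from collections import Counter


def expand_cycle_type(cycle_lengths):
    """Expand prod_k (C^k + B^k + N^k) over the cycles of one permutation.

    Returns a Counter mapping (nC, nB, nN) exponents to coefficients.
    """
    poly = Counter({(0, 0, 0): 1})
    for k in cycle_lengths:
        nxt = Counter()
        for (c, b, n), coeff in poly.items():
            # Every site in a cycle must carry the same element.
            nxt[(c + k, b, n)] += coeff
            nxt[(c, b + k, n)] += coeff
            nxt[(c, b, n + k)] += coeff
        poly = nxt
    return poly


def pattern_inventory(group_cycle_types):
    """Average the expansions over all group elements (Polya/Burnside counting)."""
    total = Counter()
    for cycles in group_cycle_types:
        total.update(expand_cycle_type(cycles))  # Counter.update adds counts
    order = len(group_cycle_types)
    assert all(v % order == 0 for v in total.values())
    return {key: v // order for key, v in total.items()}


# Assumed cycle types of the four permutations acting on the ten C sites.
NAPHTHALENE_ALL_SITES = [
    [1] * 10,            # identity
    [2] * 5,             # rotation exchanging all sites pairwise
    [2] * 5,             # second such rotation
    [1, 1, 2, 2, 2, 2],  # rotation fixing the two inner C atoms
]

inventory = pattern_inventory(NAPHTHALENE_ALL_SITES)
bn_counts = {x: inventory[(10 - 2 * x, x, x)] for x in range(1, 6)}
print(bn_counts)                # {1: 23, 2: 330, 3: 1056, 4: 810, 5: 66}
print(sum(bn_counts.values()))  # 2285
```

The printed values match the naphthalene row of Table 3; passing a single identity element of length n instead reproduces the multinomial coefficients of the asymmetric case.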
II.4 Symmetry-controlled yield-pattern selectivity
For a given PAH, deviations from a multinomial distribution of B, N-substituted compounds increase with the order of the symmetry group of the hydrocarbon. Such a trend has been discussed in the past for the enumeration of gallium arsenide (Ga$_x$As$_y$) clusters [Balasubramanian (1988, 1992)]. In the case of symmetric PAHs, it is interesting to note that even when two (or more) PAHs share the same rotational group and comprise the same number of substitution sites, these compounds can lead to distinct GCCIs and therefore distinct product distributions. For example, see the two rows corresponding to PAHs 7 and 8 in Table 1. On the one hand, the symmetry groups of the respective compounds, chrysene and tetrahelicene (7 and 8 in Fig. 2), correspond to the same isomorphic point group $C_2$; on the other, their inner-substituted analogues result in different constitutional isomer distributions (see Table 1). Similar trends are also noted for larger PAHs; a few illustrative examples are on display in Fig. 4, while all such instances are collected in Table 4. Qualitatively, such a selectivity can be understood by considering the fact that, in the case of tetrahelicene (a cisoid fused system, structure 8 in Fig. 2), two C atoms lie on the principal axis, so their positions are invariant with respect to the $C_2$ rotation. In other words, the corresponding labels form an invariant subspace of the isomorphic permutation operation. In the case of chrysene (a transoid fused system, structure 7 in Fig. 2), the principal axis is normal to the molecular plane and passes through the geometric center of the molecule without coinciding with any of the C atoms, i.e., the invariant subspace is empty. So tetrahelicene, which has a larger invariant subspace, can be thought of as a permutationally less symmetric molecule than chrysene, as far as the inner C atom framework is concerned.
To gain a more quantitative appreciation of the aforementioned selectivity and, more importantly, to compute the actual distribution, one has to consider the corresponding nuclear permutation groups and inspect the cycle length structure of the group generators. Let us consider the GCCIs for the case of inner-site substitutions of the two PAHs chrysene (structure 7) and tetrahelicene (structure 8), with $f_k = \mathrm{C}^k + \mathrm{B}^k + \mathrm{N}^k$ the figure counting series for cycle length $k$:

$$ Z(\text{chrysene}) = \frac{1}{2}\left[ f_1^{6} + f_2^{3} \right], \qquad Z(\text{tetrahelicene}) = \frac{1}{2}\left[ f_1^{6} + f_1^{2} f_2^{2} \right]. \tag{5} $$

While expanding the expressions on the right side yields the pattern inventory, it is evident that the difference in the number of substituted products between the two PAHs arises from the generator cycle length structure. In Eq. 5, the first factor of the second term for tetrahelicene, $f_1^{2}$, indicates that two C atoms are invariant with respect to the corresponding symmetry element $C_2$. Typically, BN-PAH molecules are synthesized by starting with suitable precursor compounds; see, for example, Bosdet et al. (2007). In order to realize the symmetry-controlled yield-pattern selectivity proposed here, for any given PAH, synthesis strategies must statistically account for all possible substituted compounds.
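The effect of the generator cycle structure can also be checked by brute force, without any cycle-index algebra. The sketch below is an illustration under stated assumptions: the site labels are arbitrary, and only the cycle structure of the single non-trivial permutation (three 2-cycles for the chrysene-like case, two fixed sites plus two 2-cycles for the tetrahelicene-like case) is taken from the discussion above.

```python
from collections import Counter
from itertools import product


def count_bn_isomers(n_sites, swap):
    """Count symmetry-unique colourings with equal numbers of B and N.

    `swap` is a permutation of order two given as a tuple (site i -> swap[i]);
    the group is {identity, swap}. Returns {number of B, N pairs: count}.
    """
    assert all(swap[swap[i]] == i for i in range(n_sites))
    seen = set()
    counts = Counter()
    for colouring in product("CBN", repeat=n_sites):
        image = tuple(colouring[swap[i]] for i in range(n_sites))
        canonical = min(colouring, image)  # one representative per orbit
        if canonical in seen:
            continue
        seen.add(canonical)
        n_b, n_n = colouring.count("B"), colouring.count("N")
        if n_b == n_n and n_b > 0:
            counts[n_b] += 1
    return dict(sorted(counts.items()))


# Hypothetical labelling of the six inner sites; only cycle structure matters.
CHRYSENE_LIKE = (5, 4, 3, 2, 1, 0)       # three 2-cycles, no fixed site
TETRAHELICENE_LIKE = (0, 1, 3, 2, 5, 4)  # two fixed sites, two 2-cycles

print(count_bn_isomers(6, CHRYSENE_LIKE))       # {1: 15, 2: 48, 3: 10}
print(count_bn_isomers(6, TETRAHELICENE_LIKE))  # {1: 16, 2: 48, 3: 12}
```

Both results agree with the rows for PAHs 7 and 8 in Table 1: the two fixed sites add one extra isomer for a single B, N pair and two extra isomers for three pairs.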
II.5 High-throughput first-principles modeling of BN-PAH compounds
All the 7,453,041,547,842 (i.e. 7.4 tera) BN-PAH molecules enumerated in the previous sections feature an even number of electrons and are of closed-shell type. Past computational investigations of some of these compounds have demonstrated the reliability of density functional approximations (DFAs) for semi-quantitative prediction of their structures and electronic properties [Marcon, von Lilienfeld, and Andrienko (2007); Al-Hamdani et al. (2014); Ghosh, Periyasamy, and Pati (2011)]. It is important to note that while first-principles modeling of any single constituent of the BN-PAH dataset presents no conceptual challenge, the sheer size of this dataset will render any brute-force high-throughput computational endeavor aiming towards complete coverage impractical, even when depending on petascale computer facilities. For instance, geometry relaxation of a typical medium-sized organic molecule using even a semi-empirical method such as PM7 requires about 10 CPU seconds. Carrying out such calculations for all the 7.4 tera BN-PAH molecules would require over two million CPU years. Deploying thousands of CPU cores for this purpose can at most decrease this time by three orders of magnitude.
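The cost estimate above follows from simple arithmetic, sketched below. The 10 CPU seconds per molecule is the figure quoted in the text; the year length of 365.25 days and the ideal scaling over 1,000 cores are assumptions made for this illustration.

```python
N_MOLECULES = 7_453_041_547_842        # size of the enumerated chemical space
SECONDS_PER_MOLECULE = 10.0            # PM7 relaxation time quoted in the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

cpu_years = N_MOLECULES * SECONDS_PER_MOLECULE / SECONDS_PER_YEAR
# Assumption: perfect parallel scaling over 1,000 cores.
wall_years_1000_cores = cpu_years / 1000

print(f"{cpu_years:.3g} CPU years")
print(f"{wall_years_1000_cores:.0f} years on 1,000 cores")
```

Even with ideal scaling, the wall time on a thousand cores remains on the order of a few thousand years, which is why only a representative subset is modeled.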
In this first study, to gain rational insights into the stability and optoelectronic properties of BN-PAH compounds, we have performed high-throughput DFT modeling for only a representative subset. To this end, we restrict the number of molecules to a feasible size by considering all possible substitutions in naphthalene (row 1 of Table 3) and only single B, N pair substitutions in the remaining 76 PAHs (column 5 of Table 3), overall amounting to 33,059 (33k) compounds. For all these compounds we have performed geometry optimizations and collected various properties for the minimum energy structures; details of the DFT calculations are provided in Computational Methods. In the following, we report on the geometric features of the BN-PAH molecules and discuss deviations of the predicted structures from those expected solely on the basis of the formal hybridization of the unsubstituted PAH compounds. Then we present the distribution of the HOMO-LUMO gaps of the 33k compounds in the context of the solar spectrum. Finally, we comment on inter-property correlations between dipole moment, electronic gap and atomization energy, which are relevant to rational compound design strategies.
E.1. Deviations from ideal geometry
Formally, we understand the aromatic character of benzenoid species by Clar’s π-sextet rule Clar (1972); Feixas et al. (2008); Solà (2013), which predicts hydrocarbons with migrating sextets to contain bonds of equivalent lengths and a planar geometry, characteristic of an ideal $sp^2$ hybridization. On the other hand, compounds with fewer π-sextets show local aromaticity resulting in deviations from ideal structures. Further, trends in thermodynamic stability are also expected to correlate with the number of aromatic π-sextets: the larger this number, the greater the stability should be. However, it is important to note that all these assumptions that are expected to hold for PAHs need not hold for the hetero-compounds. To gain a first-principles understanding of key geometric features of both the PAHs and their hetero analogues, we have collected bond length distributions and out-of-plane deviations (OD) from DFT calculations in Fig. 5. The OD was obtained by finding a best-fit molecular plane through rigid body rotations, and then calculating the deviations along the normal ($z$-axis) in a root-mean-square fashion, $\mathrm{OD} = \sqrt{\sum_{i=1}^{N} z_i^2 / N}$, where $N$ is the number of atoms. In order to compare these structural features across various PAHs, we consider only those substituted compounds containing only one pair of B, N atoms, resulting in 30,797 (31k) BN-PAH compounds.
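The root-mean-square out-of-plane deviation described above can be sketched as follows. This is not the authors' implementation: instead of explicit rigid body rotations it uses the equivalent fact that the mean squared distance of the atoms from their least-squares plane equals the smallest eigenvalue of the covariance matrix of the centred coordinates. The function name `out_of_plane_deviation` and the closed-form eigenvalue formula for a symmetric 3x3 matrix are choices made for this illustration.

```python
import math


def out_of_plane_deviation(coords):
    """RMS distance of points from their least-squares plane (units of coords)."""
    n = len(coords)
    # The best-fit plane passes through the centroid, so centre the coordinates.
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    centred = [[p[k] - centroid[k] for k in range(3)] for p in coords]
    # Covariance matrix normalised by n; its smallest eigenvalue is the
    # mean squared deviation along the plane normal.
    a = [[sum(p[i] * p[j] for p in centred) / n for j in range(3)] for i in range(3)]
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off
    if p2 == 0.0:
        lam_min = q  # all three eigenvalues coincide
    else:
        # Trigonometric solution for the eigenvalues of a symmetric 3x3 matrix.
        p = math.sqrt(p2 / 6.0)
        b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
        det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
                 - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
                 + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
        r = max(-1.0, min(1.0, det_b / 2.0))  # guard against round-off
        phi = math.acos(r) / 3.0
        lam_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return math.sqrt(max(lam_min, 0.0))


# Hypothetical test geometries (not taken from the dataset):
tilted_plane = [(x, y, 0.5 * x + 0.2 * y) for x in range(3) for y in range(3)]
puckered_square = [(1, 0, 0.1), (0, 1, -0.1), (-1, 0, 0.1), (0, -1, -0.1)]
```

A strictly planar set of points gives a value that is zero up to round-off regardless of how the plane is oriented, whereas the puckered square, whose corners alternate 0.1 above and below the mean plane, gives 0.1 in the same length unit.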
First of all, let us inspect the CC bond distances of the hydrocarbons in Fig. 5A and note that the bond lengths vary between the values of a conventional CC double bond (about 1.34 Å) and a CC single bond (about 1.5 Å). In the case of BN-PAH compounds (see Fig. 5B), we note the CN double bond to display the widest distribution, followed by BN and CB double bonds. This variation of bond lengths implies that the heteroatoms introduce structural inhomogeneity, thereby weakening electron delocalization across the molecule. Such an effect may be understood via the electronegativity criterion alone, where one can classify the CN moiety to be electron-rich and the CB moiety to be electron-deficient compared to the CC fragment, resulting in a net gain in electron density on the CN fragments. Moreover, even though the BN fragment is isoelectronic and isosteric to the CC one, due to the larger ionic character, we find the average BN bond length to be larger than the average CC bond length (Fig. 5B).
Moving on to the OD of the hydrocarbons and their hetero counterparts, it is useful to note that such distortions signify the presence of strain. For clarity, it is worthwhile to recall the formal classification of benzenoid compounds based on the presence of various topological features (fissure, bay, cove and fjord) Dias (2005); Pogodin and Agranat (2002); see Fig. 6. The bay, cove and fjord regions have H atoms in close proximity, introducing strain in a strictly planar geometry. A cove may be thought of as comprising two proximate bays, while a fjord comprises three proximate bays. For a given number of C atoms, a strictly peri-condensed compound, approaching a more circular topology, must have a minimum number of external C atoms, hence fewer bays Dias (1990). Evidently, structures with a fjord region are more susceptible to OD, followed by ones with coves and lastly by those with bays.
The most striking feature in Fig. 5C is that 90% of the parent PAHs (70 out of 77) are perfectly planar, while a few do show moderate OD; such structures are those with fjords (13, 35, 45, 61, 64 & 67 in Fig. 2) and multiple coves (26 in Fig. 2). In contrast, in Fig. 5D, we note only about 66% of the hetero structures (about 20k out of 31k) to be planar. To further quantify, let us consider a threshold of Å. While only 9.1% of the PAHs have a larger with respect to this threshold, 19.3% of the B, N-substituted ones have Å.
These results show that B, N-substitution induces the molecules to distort from planar configurations. Loss of planarity also compromises the efficiency of these molecules as chromophores, or as components of singlet-fission systems. Singlet fission is a process by which an organic chromophore in an excited singlet state transfers its excess energy to a neighbouring chromophore in the ground state, resulting in two triplet states and thereby doubling the number of charge carriers Smith and Michl (2010). In inter-molecular singlet fission, especially in molecular solids, planarity of the constituent molecules is crucial to stabilize the crystal via stacking Bhattacharyya and Datta (2017). Out of the 31k BN-PAH compounds, 80.7% satisfy this prerequisite. However, in actual singlet-fission applications, the singlet-triplet energetics also play a vital role. Therefore, more efforts are needed for a better evaluation of the structure-energetics trade-off across the BN-PAH dataset.
E.2. Trends in electronic structure
Clar structures also provide information about the electronic energy level separations of BHs. In general, with increasing number of π-sextets in the Clar structure, the HOMO-LUMO transition is blue-shifted Solà (2013). It is interesting to note that this formal empirical rule is corroborated by TPSSh results as seen in Fig. 7A. Out of seventy-seven hydrocarbons, most have HOMO-LUMO gaps () in the visible region of the solar spectrum (see Fig. 7A). At the longer wavelength region near the spectral maximum, only about half-a-dozen elongated hydrocarbons are active. For the linear PAHs (naphthalene, anthracene, tetracene, pentacene and hexacene) the TPSSh values deviate from the experimental counterparts George and Morris (1968); Malloci et al. (2007, 2011) (i.e. experimental minus TPSSh) by 0.24, -0.69, 0.47, 0.60, and 0.57 eV, respectively, amounting to a mean signed error of 0.24 eV and a root mean square error of 0.54 eV. These error measures suggest that the trends in HOMO-LUMO gaps discussed above retain semi-quantitative accuracy, at least for the other PAH molecules. It may be noted that a number of solar cell applications have been based on the organic dye coumarin because of its very desirable electronic gap at 2.25 eV (552 nm) Mishra, Fischer, and Bäuerle (2009). However, none of the PAHs exhibit a gap in the yellow region of the spectrum, where the spectral energy density of the solar black-body radiation is maximum. It is important to note that for these transitions to be allowed, the corresponding transition dipole moment integrals must also be non-vanishing. Computation of such integrals must be done using time-dependent DFT; such efforts are beyond the scope of the present study.
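The two error measures quoted for the linear PAHs follow directly from the five listed deviations. The short sketch below recomputes them; the dictionary and function names are illustrative only.

```python
import math

# Experimental minus TPSSh HOMO-LUMO gap deviations (eV) as listed in the text.
DEVIATIONS_EV = {
    "naphthalene": 0.24,
    "anthracene": -0.69,
    "tetracene": 0.47,
    "pentacene": 0.60,
    "hexacene": 0.57,
}


def mean_signed_error(values):
    """Arithmetic mean of the signed deviations."""
    values = list(values)
    return sum(values) / len(values)


def root_mean_square_error(values):
    """Square root of the mean squared deviation."""
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))


mse = mean_signed_error(DEVIATIONS_EV.values())
rmse = root_mean_square_error(DEVIATIONS_EV.values())
print(f"mean signed error = {mse:.2f} eV, RMSE = {rmse:.2f} eV")
```

Because the anthracene deviation has the opposite sign, the mean signed error is much smaller than the root mean square error, which is why both measures are reported.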
Fig. 7B presents the HOMO-LUMO gaps of 2,285 (B, N)x-substituted isomers of naphthalene and Fig. 7C shows those of 30,797 (B, N)1-substituted isomers for all PAHs. In both figures, we observe the property distributions to follow roughly a Gaussian-type trend. Such statistical trends often imply that the character of the electronic excitation is preserved across all the constitutional isomers with the same stoichiometry Ramakrishnan et al. (2015).
In Table 5 we have collected the gap distribution across different regions of the solar spectrum for (B, N)1-substituted compounds and all possible isomers of naphthalene. We observe that for most of the classes, the majority of the molecules () lie in the visible region of the solar spectrum. An interesting correlation can be drawn based on works by Hoffmann et al. Hoffmann (1964); AlKaabi, Dasari, and Hoffmann (2012); Niedenzu and Dawson (2012); Zeng, Ananth, and Hoffmann (2014), where the authors explain the properties of a substituted PAH with BxNy units as the consequence of a perturbation of the parent hydrocarbon’s properties by heteroatoms. In Table 5 we note that for naphthalene, (B, N)1 substitution shifts the gap from predominantly UV to predominantly UV-visible; with increasing number of heteroatoms, the maximum shift to the visible region is noted for the stoichiometry C4B3N3H8.
When comparing the modulation of the gap by (B, N)1 substitution (Fig. 7C) in all the PAHs with that of (B, N)x substitution in naphthalene (Fig. 7B), one notes a similar spread of 0–4 eV in both cases with too few examples with eV. Overall, when comparing the PAH molecules with B, N-substituted ones, red-shifting of the gap arises due to quasi-degenerate valence MOs characteristic of a diradical-type system, while blue-shifting of the gap arises due to increase in the character (decrease in character) of the excitation. It may be worthwhile to relate this trend to the observation that a 2D BN-sheet is a wide-gap insulator due to lack of extended π-conjugation Nagashima et al. (1995).
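Sorting gaps into spectral regions, as done for Table 5, amounts to converting an energy in eV to a wavelength and comparing it with region boundaries. The sketch below is only an illustration: the visible window of 380 to 750 nm is an assumed convention and may differ from the boundaries used for Table 5, and the function names are hypothetical.

```python
HC_EV_NM = 1239.84  # Planck constant times speed of light, in eV nm


def gap_to_wavelength_nm(gap_ev):
    """Wavelength (nm) of a photon whose energy equals the gap (eV)."""
    if gap_ev <= 0:
        raise ValueError("gap must be positive")
    return HC_EV_NM / gap_ev


def spectral_region(gap_ev, visible_nm=(380.0, 750.0)):
    """Classify a gap as 'UV', 'visible' or 'IR' (assumed boundaries)."""
    wavelength = gap_to_wavelength_nm(gap_ev)
    if wavelength < visible_nm[0]:
        return "UV"
    if wavelength <= visible_nm[1]:
        return "visible"
    return "IR"


# Illustrative gaps only; none of these values is a computed result.
for gap in (5.0, 2.25, 1.0):
    print(gap, round(gap_to_wavelength_nm(gap)), spectral_region(gap))
```

With this conversion factor, the 2.25 eV gap quoted for coumarin corresponds to a wavelength of about 551 nm, in the visible window and close to the value cited in the text.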
E.3. Inter-property correlations
Rational chemical compound design based on high-throughput DFT computations often requires multi-property optimization. To this end, we explore some of the static ground state properties that are typically computed in a single-point calculation. Following the HOMO-LUMO gap, the next key property of interest is the thermodynamic stability of the molecules. For this purpose, we use the atomization energy per electron () as a measure. In addition, the ground state dipole moment (), which is routinely computed during single point calculations, contains information about spatial separations of partial charges. In the following, we briefly discuss the correlations between these three properties.
Pairwise inter-property correlations: vs , vs , vs are on display in Fig. 8. For all properties, the range is largest for the 6-ring compounds; there are 25k such molecules in the 31k set. The spread in the property values decreases gradually with the number of rings. A noticeable feature in Fig. 8A is that molecules with large exhibit small . We ascribe this relation to the fact that these molecules have the shortest B-N separations leading to strong bonding. We note in Fig. 8B that smaller values typically correlate with smaller as expected in the case of diradical-type molecules such as those with well-separated B and N centers. Fully-conjugated aromatic molecules show electronic gaps in the typical region of about 2-5 eV (see Fig. 8B) and these molecules are also found to be more stable with larger . Such an interpretation is also supported by the trends shown in Fig. 8C; molecules with longer diradical-type bonds show larger and smaller and vice versa. These trends are reminiscent of those noted in a previous high-throughput DFT study of 134k small organic molecules Ramakrishnan et al. (2014).
III Computational Methods
For all seventy-seven PAHs listed in the previous section, we have generated Cartesian coordinates using the program Avogadro Hanwell et al. (2012). With the same program, minimum energy structures were obtained by employing universal force-field (UFF) parameters Rappé et al. (1992). The resulting structures were used as templates for combinatorially generating the atomic coordinates of all the B, N-substituted molecules; permutationally redundant structures were eliminated by comparing the principal moments of inertia. In the case of naphthalene, we have generated all possible molecules where pairs of C atoms were substituted by the isoelectronic B, N atom pairs, resulting in 2,285 compounds; for the larger PAHs comprising more than two benzene rings, we have restricted the substitution to only a single pair of C atoms, which gave rise to 30,774 compounds. These numbers tally perfectly with those from Polya enumeration, as long as the Cartesian coordinates encode the molecular symmetry (see Table 3). For all 33,059 molecules, we have performed geometry optimization and electronic structure calculations at the Kohn–Sham density functional theory (DFT) level using the ORCA (version 4.0.1.2) suite of programs Neese (2012). In order to reach a high degree of quantitative accuracy in modeling the electronic excitation spectrum, one must perform linear-response time-dependent (LR-TD)-DFT calculations, preferably based on long-range corrected hybrid DFs and large basis sets. However, in the present study we only wish to provide qualitative and semi-quantitative insights into the stability and HOMO-LUMO gaps of the B, N-substituted PAHs. For this purpose, we limited our DFT explorations to the TPSSh Perdew et al. (1999, 2004) hyper-GGA functional, which has been shown to be applicable to model the electronic properties of organic and inorganic molecules Jensen (2008); Irfan et al. (2012); Zhang et al. (2013); El-Shishtawy et al. (2014); Liu et al. (2014), in combination with the split valence basis set def2-SVP Weigend and Ahlrichs (2005). In all calculations, we have used the resolution-of-identity technique to approximate the two-electron Coulomb and exchange integrals (RI-JK approximation) with the corresponding def2/JK Weigend (2008) auxiliary basis sets, along with Grid5-level integration grids to estimate the exchange-correlation energies using numerical quadrature.
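The naphthalene count quoted above can be cross-checked by brute force without any coordinates. The sketch below is not the authors' procedure (which compares principal moments of inertia of generated structures); it instead labels each of the ten carbon sites as C, B or N, keeps labelings with equal numbers of B and N, and merges labelings related by the four site permutations induced by the molecular symmetry. The site ordering and the names `GROUP` and `count_unique_isomers` are assumptions made for this illustration.

```python
from itertools import product

# Carbon sites indexed 0..9 in the order C1, C2, C3, C4, C4a, C5, C6, C7, C8, C8a.
# Each tuple lists, for site i, the site it is exchanged with under one operation.
GROUP = [
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9),  # identity
    (8, 7, 6, 5, 4, 3, 2, 1, 0, 9),  # C1<->C8, C2<->C7, C3<->C6, C4<->C5; C4a, C8a fixed
    (3, 2, 1, 0, 9, 8, 7, 6, 5, 4),  # C1<->C4, C2<->C3, C4a<->C8a, C5<->C8, C6<->C7
    (5, 6, 7, 8, 9, 0, 1, 2, 3, 4),  # C1<->C5, C2<->C6, C3<->C7, C4<->C8, C4a<->C8a
]


def count_unique_isomers(n_sites=10):
    """Number of symmetry-distinct (B, N)k labelings, keyed by k."""
    unique = {}
    for labels in product("BCN", repeat=n_sites):
        n_b = labels.count("B")
        if n_b == 0 or n_b != labels.count("N"):
            continue  # keep only substitutions by B, N pairs
        # Canonical representative: lexicographically smallest symmetry image.
        canonical = min(tuple(labels[g[i]] for i in range(n_sites)) for g in GROUP)
        unique.setdefault(n_b, set()).add(canonical)
    return {k: len(v) for k, v in sorted(unique.items())}


counts = count_unique_isomers()
total = sum(counts.values())
print(counts, total)
```

Under these assumptions the per-stoichiometry counts add up to 2,285, and the 23 single-pair isomers of naphthalene account for the difference between the 30,797 (B, N)1-substituted compounds discussed earlier and the 30,774 obtained from the larger PAHs.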
IV Conclusions
We have applied a combinatorial algorithm to enumerate all possible compounds obtained by substituting pairs of carbon atoms in the smallest seventy-seven polycyclic aromatic hydrocarbons containing 2-6 benzene rings with isoelectronic/isosteric B, N atom pairs. For a hydrocarbon with carbon atoms, maximal number of compounds are obtained when exactly carbon atoms are substituted by heteroatom pairs. The grand set of all the resulting compounds is eleven orders of magnitude larger (7,453,041,547,842, i.e. 7.4 tera) than that of the parent hydrocarbons. To facilitate large scale data-mining and discovery of combinatorial trends across the BN-PAH dataset, we have provided consolidated tabulations of molecular distributions according to symmetry, stoichiometry and sites. Furthermore, we show that hydrocarbons with the same number of carbon atoms and the same point group symmetry can lead to distinct yield-patterns, revealing a symmetry-controlled selectivity; we have rationalized this effect using the generalized character cycle indices (GCCIs). Our results based on B, N substitutions are also transferable when using other isovalent heteroatom pairs.
For a tiny fraction of the 7.4 tera set consisting of 33,059 (33 kilo) representative molecules, we have performed DFT calculations and analyzed structural and electronic features relevant for light-harvesting applications. For the unsubstituted hydrocarbons, we provide qualitative insights into the DFT-predicted properties using Clar’s valence electronic structure formulae. Replacing a couple of carbon atoms in the hydrocarbons with a heteroatom pair has been shown to perturbatively modulate all the properties while retaining the essential characteristics of the parent compounds. More importantly, our results indicate that combinatorial introduction of B, N atoms in the smallest polycyclic aromatic hydrocarbons gives rise to a library of compounds with HOMO-LUMO gaps spanning the entire solar spectrum, with a significant fraction exhibiting a HOMO-LUMO gap near the solar spectral maximum. This prompts us to suggest the suitability of the BN-PAH dataset for various light-harvesting and singlet fission applications Paci et al. (2006); Greyson et al. (2010); Zeng, Ananth, and Hoffmann (2014) or for designing materials with desirable exciton energetics Hill et al. (2000). Designing materials exhibiting low exciton binding energies, in order for the absorbed photon to generate maximum output voltage, has been a major theme in studies of organic photovoltaics. The HOMO-LUMO gaps reported in the present work correspond to the real gap (a.k.a. transport or fundamental gap), denoted in the relevant literature Bredas (2014) as or . There have also been studies Bäppler et al. (2014) addressing how to directly model the so-called optical gap, , which is the least energy required for the creation of a bound electron-hole pair; this excitation corresponds to the first (narrow) peak in the absorption/photo-luminescence spectra. The difference between the fundamental and optical gaps accounts for the exciton binding energy; the higher its value, the lower the charge photogeneration efficiency. As far as the unsubstituted PAHs are concerned, the exciton binding energy takes the value of 1 eV for naphthalene and 0.1–0.5 eV for pentacene Lanzani (2006). We hope the BN-PAH dataset along with the presented TPSSh HOMO-LUMO gaps could be of use in future high-throughput efforts towards screening materials with small exciton binding energies.
Recent ventures in comprehensive chemical space design have demonstrated their pivotal role in accelerating the discovery of novel compounds for a multitude of application domains Shoichet (2004); Tu et al. (2012); Balawender et al. (2013); Reymond (2015); Ramakrishnan et al. (2014); Fias, Chang, and von Lilienfeld (2018). Such Big Data efforts also enable rational benchmarking and parameterization of approximate computational chemistry methods in a data-driven fashion Kranz et al. (2018); Li et al. (2018). For more elaborate design studies based on the BN-PAH compound library, the results provided in the present work may be considered as a baseline.
V Supplementary Material
For all seventy-seven PAHs, complete pattern inventories for all possible B, N substitutions are collected. For the seventy-seven PAHs and 33,059 BN-PAH molecules, TPSSh/def2-SVP/RI-JK-level equilibrium structures and various electronic properties are also collected.
Acknowledgements.
We gratefully acknowledge Prof. Gunnar Brinkmann for providing the CaGe program, Prof. Ranjan Das and Dr. Vamsee Voora for useful discussions. PK is grateful to TIFR for Visiting Students’ Research Programme (VSRP) and junior research fellowships. RR and SC thank TIFR for financial support. All calculations have been performed using the helios computer cluster which is an integral part of the MolDis Big Data facility, TIFR Hyderabad (https://moldis.tifrh.res.in/).
References
- Wong et al. (2006) K.-T. Wong, T.-Y. Hwu, A. Balaiah, T.-C. Chao, F.-C. Fang, C.-T. Lee, and Y.-C. Peng, Org. Lett. 8, 1415 (2006).
- Marcon, von Lilienfeld, and Andrienko (2007) V. Marcon, O. A. von Lilienfeld, and D. Andrienko, J. Chem. Phys. 127, 064305 (2007).
- Campbell et al. (2010) P. G. Campbell, L. N. Zakharov, D. J. Grant, D. A. Dixon, and S.-Y. Liu, J. Am. Chem. Soc. 132, 3289 (2010).
- Jiang, Li, and Wang (2013) W. Jiang, Y. Li, and Z. Wang, Chem. Soc. Rev. 42, 6113 (2013).
- Al-Hamdani et al. (2014) Y. S. Al-Hamdani, D. Alfè, O. A. von Lilienfeld, and A. Michaelides, J. Chem. Phys. 141, 18C530 (2014).
- Hashimoto et al. (2014) S. Hashimoto, T. Ikuta, K. Shiren, S. Nakatsuka, J. Ni, M. Nakamura, and T. Hatakeyama, Chem. Mater. 26, 6265 (2014).
- Gong et al. (2015) Y. Gong, H. Fei, X. Zou, W. Zhou, S. Yang, G. Ye, Z. Liu, Z. Peng, J. Lou, R. Vajtai, B. I. Yakobson, J. M. Tour, and P. M. Ajayan, Chem. Mater. 27, 1181 (2015).
- Stepien et al. (2016) M. Stepien, E. Gonka, M. Żyła, and N. Sprutta, Chem. Rev. 117, 3479 (2016).
- Ito, Ozaki, and Itami (2017) H. Ito, K. Ozaki, and K. Itami, Angew. Chem. Int. Ed. 56, 11144 (2017).
- Wang et al. (2017) Y. Wang, H. Guo, S. Ling, I. Arrechea-Marcos, Y. Wang, J. T. López Navarrete, R. P. Ortiz, and X. Guo, Angew. Chem. Int. Ed. 56, 9924 (2017).
- Dean et al. (2010) C. R. Dean, A. F. Young, I. Meric, C. Lee, L. Wang, S. Sorgenfrei, K. Watanabe, T. Taniguchi, P. Kim, K. L. Shepard, and H. J, Nat. Nanotechnol. 5, 722 (2010).
- Li, Zhao, and Yang (2013) X. Li, J. Zhao, and J. Yang, Sci. Rep. 3, 1858 (2013).
- Dutta and Pati (2008) S. Dutta and S. K. Pati, J. Phys. Chem. B 112, 1333 (2008).
- Ci et al. (2010) L. Ci, L. Song, C. Jin, D. Jariwala, D. Wu, Y. Li, A. Srivastava, Z. Wang, K. Storr, L. Balicas, et al., Nat. Mater. 9, 430 (2010).
- Yamijala, Bandyopadhyay, and Pati (2013) S. S. Yamijala, A. Bandyopadhyay, and S. K. Pati, J. Phys. Chem. C 117, 23295 (2013).
- Dewar, Kubba, and Pettit (1958) M. Dewar, V. P. Kubba, and R. Pettit, J. Chem. Soc. (Resumed) , 3073 (1958).
- Dewar and Dietz (1959) M. Dewar and R. Dietz, J. Chem. Soc. (Resumed) , 2728 (1959).
- Chissick, Dewar, and Maitlis (1960) S. Chissick, M. Dewar, and P. Maitlis, Tetrahedron Lett. 1, 8 (1960).
- Dewar and Dougherty (1964) M. J. Dewar and R. C. Dougherty, J. Am. Chem. Soc. 86, 433 (1964).
- Davies, Dewar, and Rona (1967) K. M. Davies, M. J. Dewar, and P. Rona, J. Am. Chem. Soc. 89, 6294 (1967).
- Fang et al. (2006) X. Fang, H. Yang, J. W. Kampf, M. M. Banaszak Holl, and A. J. Ashe, Organometallics 25, 513 (2006).
- Jaska et al. (2006) C. A. Jaska, D. J. Emslie, M. J. Bosdet, W. E. Piers, T. S. Sorensen, and M. Parvez, J. Am. Chem. Soc. 128, 10885 (2006).
- Bosdet et al. (2007a) M. J. Bosdet, C. A. Jaska, W. E. Piers, T. S. Sorensen, and M. Parvez, Org. Lett. 9, 1395 (2007a).
- Yamamoto and Takimiya (2007) T. Yamamoto and K. Takimiya, J. Am. Chem. Soc. 129, 2224 (2007).
- Bosdet and Piers (2009) M. J. Bosdet and W. E. Piers, Can. J. Chem. 87, 8 (2009).
- Matsui et al. (2017) K. Matsui, S. Oda, K. Yoshiura, K. Nakajima, N. Yasuda, and T. Hatakeyama, J. Am. Chem. Soc. (2017).
- Miyamoto et al. (1994) Y. Miyamoto, A. Rubio, M. L. Cohen, and S. G. Louie, Phys. Rev. B 50, 4976 (1994).
- Watanabe et al. (1996) M. Watanabe, S. Itoh, T. Sasaki, and K. Mizushima, Phys. Rev. Lett. 77, 187 (1996).
- Ghosh, Periyasamy, and Pati (2011) D. Ghosh, G. Periyasamy, and S. K. Pati, Phys. Chem. Chem. Phys. 13, 20627 (2011).
- Morgan and Piers (2016) M. M. Morgan and W. E. Piers, J. Chem. Soc., Dalton Trans. 45, 5920 (2016).
- Müller et al. (2014) M. Müller, S. Behnle, C. Maichle-Mössmer, and H. F. Bettinger, Chem. Commun. 50, 7821 (2014).
- Faulon (1992) J. L. Faulon, J. Chem. Inf. Comp. Sci. 32, 338 (1992).
- Pólya (1937) G. Pólya, Acta mathematica 68, 145 (1937).
- Freudenstein (1967) F. Freudenstein, Journal of Mechanisms 2, 275 (1967).
- Polya and Read (2012) G. Polya and R. C. Read, Combinatorial enumeration of groups, graphs, and chemical compounds (Springer Verlag, 2012).
- Taniguchi, Du, and Lindsey (2011) M. Taniguchi, H. Du, and J. S. Lindsey, J. Chem. Inf. Model. 51, 2233 (2011).
- Taniguchi, Du, and Lindsey (2013) M. Taniguchi, H. Du, and J. S. Lindsey, J. Chem. Inf. Model. 53, 2203 (2013).
- Yuen et al. (2018) J. M. Yuen, J. R. Diers, E. J. Alexy, A. Roy, A. K. Mandal, H. S. Kang, D. M. Niedzwiedzki, C. Kirmaier, J. S. Lindsey, D. F. Bocian, and D. Holten, J. Phys. Chem. A 122, 7181 (2018).
- Baraldi and Vanossi (2000) I. Baraldi and D. Vanossi, J. Chem. Inf. Comp. Sci. 40, 386 (2000).
- Baraldi, Fiori, and Vanossi (1999) I. Baraldi, C. Fiori, and D. Vanossi, J. Math. Chem. 25, 23 (1999).
- Dias (1985) J. R. Dias, Acc. Chem. Res. 18, 241 (1985).
- Dias (1986) J. R. Dias, J. Mol. Struct. (THEOCHEM) 137, 9 (1986).
- Dias (2007) J. R. Dias, J. Chem. Inf. Model. 47, 707 (2007).
- Balaban (1985) A. T. Balaban, J. Chem. Inf. Comp. Sci. 25, 334 (1985).
- Lukšič and Pisanski (2007) P. Lukšič and T. Pisanski, J. Chem. Inf. Model. 47, 891 (2007).
- Paton and Goodman (2007) R. S. Paton and J. M. Goodman, J. Chem. Inf. Model. 47, 2124 (2007).
- Shao, Wu, and Jiang (1996) Y. Shao, J. Wu, and Y. Jiang, J. Phys. Chem. 100, 15064 (1996).
- Shao and Jiang (1996) Y. Shao and Y. Jiang, J. Phys. Chem. 100, 1554 (1996).
- Gutman and Cyvin (1989) I. Gutman and S. Cyvin, Introduction to the Theory of Benzenoid Hydrocarbons (Springer Verlag, 1989).
- Cyvin, Brunvoll, and Cyvin (1992) B. N. Cyvin, J. Brunvoll, and S. J. Cyvin, in Advances in the Theory of Benzenoid Hydrocarbons II (Springer Verlag, 1992) pp. 65–180.
- Vöge, Guttmann, and Jensen (2002) M. Vöge, A. J. Guttmann, and I. Jensen, J. Chem. Inf. Comp. Sci. 42, 456 (2002).
- Jensen (2009) I. Jensen, J. Stat. Mech. 2009, P02065 (2009).
- Brinkmann, Grothaus, and Gutman (2007) G. Brinkmann, C. Grothaus, and I. Gutman, J. Math. Chem. 42, 909 (2007).
- Brinkmann, Caporossi, and Hansen (2002) G. Brinkmann, G. Caporossi, and P. Hansen, Journal of Algorithms 45, 155 (2002).
- Cyvin and Gutman (2013) S. J. Cyvin and I. Gutman, Kekulé structures in benzenoid hydrocarbons, Vol. 46 (Springer Science & Business Media, 2013).
- Hoffmann (1964) R. Hoffmann, J. Chem. Phys. 40, 2474 (1964).
- Neue et al. (2013) B. Neue, J. F. Araneda, W. E. Piers, and M. Parvez, Angew. Chem. 125, 10150 (2013).
- Wang et al. (2014) X.-Y. Wang, F.-D. Zhuang, R.-B. Wang, X.-C. Wang, X.-Y. Cao, J.-Y. Wang, and J. Pei, J. Am. Chem. Soc. 136, 3764 (2014).
- Long et al. (2016) G. Long, X. Yang, W. Chen, M. Zhang, Y. Zhao, Y. Chen, and Q. Zhang, Phys. Chem. Chem. Phys. 18, 3173 (2016).
- Sanyal, Manna, and Pati (2014) S. Sanyal, A. K. Manna, and S. K. Pati, J. Mater. Chem. C 2, 2918 (2014).
- Wang et al. (2015) X.-Y. Wang, A. Narita, X. Feng, and K. Müllen, J. Am. Chem. Soc. 137, 7668 (2015).
- Ishibashi et al. (2017) J. S. Ishibashi, A. Dargelos, C. Darrigan, A. Chrostowska, and S.-Y. Liu, Organometallics 36, 2494 (2017).
- Fias, Chang, and von Lilienfeld (2018) S. Fias, S. Chang, and O. A. von Lilienfeld, J. Phys. Chem. Lett. (2018).
- Zhu, Schmalz, and Klein (1997) H.-Y. Zhu, T. Schmalz, and D. Klein, Int. J. Quantum Chem. 63, 393 (1997).
- Jensen (1993) F. Jensen, Chem. Phys. Lett. 209, 417 (1993).
- Zhu et al. (1995) H. Zhu, D. Klein, W. Seitz, and N. March, Inorg. Chem. 34, 1377 (1995).
- Seifert et al. (1997) G. Seifert, P. Fowler, D. Mitchell, D. Porezag, and T. Frauenheim, Chem. Phys. Lett. 268, 352 (1997).
- Evangelisti (1997) S. Evangelisti, Int. J. Quantum Chem. 65, 83 (1997).
- Pattanayak, Kar, and Scheiner (2002) J. Pattanayak, T. Kar, and S. Scheiner, J. Phys. Chem. A 106, 2970 (2002).
- Balawender et al. (2018) R. Balawender, M. Lesiuk, F. De Proft, and P. Geerlings, J. Chem. Theory Comput. (2018).
- (71) T. Allison, “NIST Polycyclic Aromatic Hydrocarbon Structure Index,” https://pah.nist.gov/, accessed on 2018-10-29.
- Sander, Wise et al. (1997) L. C. Sander, S. A. Wise, et al., Polycyclic aromatic hydrocarbon structure index (US Department of Commerce, Technology Administration, National Institute of Standards and Technology Gaithersburg, MD, 1997).
- (73) “Chemspider search and share chemistry,” http://www.chemspider.com/, accessed on 2018-10-29.
- Balaban and Harary (1968) A. Balaban and F. Harary, Tetrahedron 24, 2505 (1968).
- Clar (1964) E. Clar, in Polycyclic Hydrocarbons (Springer, 1964) pp. 119–125.
- Solà (2013) M. Solà, Front. Chem. 1, 22 (2013).
- Nikolić and Trinajstić (1995) S. Nikolić and N. Trinajstić, Croat. Chem. Acta 68, 105 (1995).
- Vukičević and Trinajstić (2004) D. Vukičević and N. Trinajstić, Bull. Chemist & Technoligist Macedonia 23, 113 (2004).
- Randić (2004) M. Randić, J. Chem. Inf. Comp. Sci. 44, 365 (2004).
- Randić and Balaban (2018) M. Randić and A. T. Balaban, Int. J. Quantum Chem. , e25657 (2018).
- Brinkmann et al. (2010) G. Brinkmann, O. D. Friedrichs, S. Lisken, A. Peeters, and N. Van Cleemput, MATCH Commun. Math. Comput. Chem 63, 533 (2010).
- Balasubramanian (1985) K. Balasubramanian, Chem. Rev. 85, 599 (1985).
- Balasubramanian (1992) K. Balasubramanian, J. Chem. Inf. Comp. Sci. 32, 47 (1992).
- Bunker and Jensen (2006) P. R. Bunker and P. Jensen, Molecular symmetry and spectroscopy (NRC Research Press, 2006).
- Faulon, Visco Jr, and Roe (2005) J.-L. Faulon, D. P. Visco Jr, and D. Roe, Rev. Comput. Chem. 21, 209 (2005).
- Balaban (1991) A. T. Balaban, Enumeration of isomers (Abacus Press-Gordon and Breach, New York, 1991) pp. 177–234.
- Balasubramanian (1988) K. Balasubramanian, Chem. Phys. Lett. 150, 71 (1988).
- Bosdet et al. (2007b) M. J. Bosdet, W. E. Piers, T. S. Sorensen, and M. Parvez, Angewandte Chemie International Edition 46, 4940 (2007b).
- Clar (1972) E. Clar, The aromatic sextet (John Wiley and Sons, London, 1972).
- Feixas et al. (2008) F. Feixas, E. Matito, J. Poater, and M. Solà, J. Comp. Chem. 29, 1543 (2008).
- Dias (2005) J. R. Dias, J. Chem. Inf. Model. 45, 562 (2005).
- Pogodin and Agranat (2002) S. Pogodin and I. Agranat, J. Org. Chem. 67, 265 (2002).
- Dias (1990) J. R. Dias, in Advances in the Theory of Benzenoid Hydrocarbons (Springer, 1990) pp. 123–143.
- Smith and Michl (2010) M. B. Smith and J. Michl, Chem. Rev. 110, 6891 (2010).
- Bhattacharyya and Datta (2017) K. Bhattacharyya and A. Datta, J. Phys. Chem. C 121, 1412 (2017).
- George and Morris (1968) G. George and G. Morris, J. Mol. Spectrosc. 26, 67 (1968).
- Malloci et al. (2007) G. Malloci, G. Mulas, G. Cappellini, and C. Joblin, Chem. Phys. 340, 43 (2007).
- Malloci et al. (2011) G. Malloci, G. Cappellini, G. Mulas, and A. Mattoni, Chem. Phys. 384, 19 (2011).
- Mishra, Fischer, and Bäuerle (2009) A. Mishra, M. K. Fischer, and P. Bäuerle, Angew. Chem. Int. Ed. 48, 2474 (2009).
- Ramakrishnan et al. (2015) R. Ramakrishnan, M. Hartmann, E. Tapavicza, and O. A. von Lilienfeld, J. Chem. Phys. 143, 084111 (2015).
- AlKaabi, Dasari, and Hoffmann (2012) K. AlKaabi, P. L. Dasari, and R. Hoffmann, J. Am. Chem. Soc. 134, 12252 (2012).
- Niedenzu and Dawson (2012) K. Niedenzu and J. W. Dawson, Boron-nitrogen compounds, Vol. 6 (Springer Science & Business Media, 2012).
- Zeng, Ananth, and Hoffmann (2014) T. Zeng, N. Ananth, and R. Hoffmann, J. Am. Chem. Soc. 136, 12638 (2014).
- Nagashima et al. (1995) A. Nagashima, N. Tejima, Y. Gamou, T. Kawai, and C. Oshima, Phys. Rev. Lett. 75, 3918 (1995).
- Ramakrishnan et al. (2014) R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. Von Lilienfeld, Sci. Data 1, 140022 (2014).
- Hanwell et al. (2012) M. D. Hanwell, D. E. Curtis, D. C. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison, J. Cheminformatics 4, 17 (2012).
- Rappé et al. (1992) A. K. Rappé, C. J. Casewit, K. Colwell, W. A. Goddard III, and W. Skiff, J. Am. Chem. Soc. 114, 10024 (1992).
- Neese (2012) F. Neese, Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 73 (2012).
- Perdew et al. (1999) J. P. Perdew, S. Kurth, A. Zupan, and P. Blaha, Phys. Rev. Lett. 82, 2544 (1999).
- Perdew et al. (2004) J. P. Perdew, J. Tao, V. N. Staroverov, and G. E. Scuseria, J. Chem. Phys. 120, 6898 (2004).
- Jensen (2008) K. P. Jensen, Inorg. Chem. 47, 10357 (2008).
- Irfan et al. (2012) A. Irfan, N. Hina, A. G. Al-Sehemi, and A. M. Asiri, J. Mol. Model. 18, 4199 (2012).
- Zhang et al. (2013) J. Zhang, Y.-H. Kan, H.-B. Li, Y. Geng, Y. Wu, Y.-A. Duan, and Z.-M. Su, J. Mol. Model. 19, 1597 (2013).
- El-Shishtawy et al. (2014) R. M. El-Shishtawy, A. M. Asiri, S. G. Aziz, and S. A. Elroby, J. Mol. Model. 20, 2241 (2014).
- Liu et al. (2014) X. Liu, Z. Cao, H. Huang, X. Liu, Y. Tan, H. Chen, Y. Pei, and S. Tan, J. Power Sources 248, 400 (2014).
- Weigend and Ahlrichs (2005) F. Weigend and R. Ahlrichs, Phys. Chem. Chem. Phys. 7, 3297 (2005).
- Weigend (2008) F. Weigend, J. Comp. Chem. 29, 167 (2008).
- Paci et al. (2006) I. Paci, J. C. Johnson, X. Chen, G. Rana, D. Popović, D. E. David, A. J. Nozik, M. A. Ratner, and J. Michl, J. Am. Chem. Soc. 128, 16546 (2006).
- Greyson et al. (2010) E. C. Greyson, J. Vura-Weis, J. Michl, and M. A. Ratner, J. Phys. Chem. B 114, 14168 (2010).
- Hill et al. (2000) I. Hill, A. Kahn, Z. Soos, and R. Pascal Jr, Chem. Phys. Lett. 327, 181 (2000).
- Bredas (2014) J.-L. Bredas, Materials Horizons 1, 17 (2014).
- Bäppler et al. (2014) S. A. Bäppler, F. Plasser, M. Wormit, and A. Dreuw, Phys. Rev. A 90, 052521 (2014).
- Lanzani (2006) G. Lanzani, Photophysics of molecular materials: from single molecules to single crystals (John Wiley & Sons, 2006).
- Shoichet (2004) B. K. Shoichet, Nature 432, 862 (2004).
- Tu et al. (2012) M. Tu, B. K. Rai, A. M. Mathiowetz, M. Didiuk, J. A. Pfefferkorn, A. Guzman-Perez, J. Benbow, C. R. Guimarães, S. Mente, M. M. Hayward, et al., J. Chem. Inf. Model. 52, 1114 (2012).
- Balawender et al. (2013) R. Balawender, M. A. Welearegay, M. Lesiuk, F. De Proft, and P. Geerlings, J. Chem. Theory Comput. 9, 5327 (2013).
- Reymond (2015) J.-L. Reymond, Acc. Chem. Res. 48, 722 (2015).
- Kranz et al. (2018) J. J. Kranz, M. Kubillus, R. Ramakrishnan, O. A. von Lilienfeld, and M. Elstner, J. Chem. Theory Comput. 14, 2341 (2018).
- Li et al. (2018) H. Li, C. Collins, M. Tanha, G. J. Gordon, and D. J. Yaron, J. Chem. Theory Comput. (2018).
- Wong et al. (2006) K.-T. Wong, T.-Y. Hwu, A. Balaiah, T.-C. Chao, F.-C. Fang, C.-T. Lee, and Y.-C. Peng, Org. Lett. 8, 1415 (2006).
- Marcon, von Lilienfeld, and Andrienko (2007) V. Marcon, O. A. von Lilienfeld, and D. Andrienko, J. Chem. Phys. 127, 064305 (2007).
- Campbell et al. (2010) P. G. Campbell, L. N. Zakharov, D. J. Grant, D. A. Dixon, and S.-Y. Liu, J. Am. Chem. Soc. 132, 3289 (2010).
- Jiang, Li, and Wang (2013) W. Jiang, Y. Li, and Z. Wang, Chem. Soc. Rev. 42, 6113 (2013).
- Al-Hamdani et al. (2014) Y. S. Al-Hamdani, D. Alfè, O. A. von Lilienfeld, and A. Michaelides, J. Chem. Phys. 141, 18C530 (2014).
- Hashimoto et al. (2014) S. Hashimoto, T. Ikuta, K. Shiren, S. Nakatsuka, J. Ni, M. Nakamura, and T. Hatakeyama, Chem. Mater. 26, 6265 (2014).
- Gong et al. (2015) Y. Gong, H. Fei, X. Zou, W. Zhou, S. Yang, G. Ye, Z. Liu, Z. Peng, J. Lou, R. Vajtai, B. I. Yakobson, J. M. Tour, and P. M. Ajayan, Chem. Mater. 27, 1181 (2015).
- Stepien et al. (2016) M. Stepien, E. Gonka, M. Żyła, and N. Sprutta, Chem. Rev. 117, 3479 (2016).
