Chinese Sunspot Drawing and Its Digitization-(I) Parameter Archives
G.H. Lin, X.F. Wang, S. Liu, X. Yang, G.F. Zhu, Y.Y. Deng, H.S. Ji, T.H. Zhou, L.N. Sun, Y.L. Feng, Z.Z. Liu, J.P. Tao, M.X. Ben, J. Lin, M.D. Ding, Z. Li, S. Zheng, S.G. Zeng, H.L. He, X.Y. Zeng, Y. Shu, X.B. Sun

TL;DR
This paper presents a comprehensive digitized dataset of Chinese historical sunspot drawings from 1925 to 2015, filling a long-term observational gap and supporting solar activity research.
Contribution
It is the first systematic digitization project of Chinese sunspot drawings, creating a valuable long-term data archive for solar physics research.
Findings
Constructed a dataset with scanned images and parameters from 1925 to 2015.
Filled a 90-year observational gap in Chinese solar data.
Provided detailed information for long-term solar cycle studies.
| Database | Period | References |
| A catalogue of sunspot observations covering the period 165 BC to AD 1684 | 165 BC-AD 1684 | Wittmann and Xu (1987) |
| University of Extremadura (UE, Spain) | 1610-2010 | http://haso.unex.es/haso/ |
| Greenwich Photoheliographic Results (GPR, Britain) | 1872-1976 | Howard et al. (1984); Howard (1991); Sivaraman et al. (1993); Willis et al. (2013) |
| Kodaikanal Solar Observatory (KK, India) | 1906-1987 | Howard et al. (1984); Howard (1991); Sivaraman et al. (1993); Ravindra et al. (2011) |
| Mount Wilson Observatory (MW, USA) | 1917-1985 | Howard et al. (1984); Howard (1991); Sivaraman et al. (1993); Ulrich et al. (2004) |
| Debrecen Photoheliographic Data (DPD, Hungary) | 1974-now | Baranyi et al. (2001); Baranyi et al. (2015) |
| Years | Description |
| 1937-1945 | Observation continued during the Japanese occupation. After the war, part of the data, the 16 cm objective lens and the photographic devices were plundered. |
| 1947 | Sunspot drawing resumed with a 4-inch lens and a 14.4 cm solar diameter on the projection board. |
| 1954 | Objective lens was updated to 15 cm, focal length 2.2 m. The projected solar diameter is the national uniform standard of 17.4 cm. |
| 1978-1983 | Observatory was rescinded. Observation stopped. |
| 1983-1988 | Supported by the Chinese Academy of Sciences, observation was resumed using a modified 20 cm aperture guide telescope with a 3.5 m focal length and a solar projection diameter of 17.4 cm. |
| 1988-now | 32 cm refracting telescope equipped with a sunspot fine-structure camera. The penumbral fibril resolution reaches 0.7′′. The penumbra of sunspots can be seen in the drawings. |
| Observatory/Station (Abbreviation) | Coordinates | Observation Years |
| Qingdao Observatory Station (QDOS) | N, E | 1925; 1947-1977; 1980; 1982-1989; 1991-1992; 1995-1996; 2000-2009; 2011-now |
| Purple Mountain Astronomical Observatory (PMO) | N, E | 1954-1963; 1965-1980; 1982; 1985-2011 |
| Yunnan Astronomical Observatory (YNAO) | N, E | 1957-2016; 2018-now |
| Sheshan Observatory Station (SSOS) | N, E | 1952-1964 |
| Beijing Planetarium (BJP) | N, E | 1979-1982; 1989-1999 |
| Nanjing University (NJU) | N, E | 1986-2002; 2004-2015 |
| Observatory/Station | Aperture | Focal Length | Projected Solar Diameter |
| QDOS | 32 cm | 350 cm | 17.4 cm |
| PMO | 20 cm | 350 cm | 17.4 cm |
| YNAO | 12.7 cm | 195 cm | 17.4 cm |
| NJU | 43 cm | 217 cm | 20 cm |
| Date | Order | Latitude | Longitude | Type | Group area | Largest-spot area | r | N |
| 19580107 | 43 | 21 | 49 | C | 0.4 | 0.4 | 70 | 4 |
| 19580107 | 44 | 16 | 48 | C | 1.4 | 0.7 | 68 | 5 |
| 19580107 | 45 | -19 | 40 | A | 0.2 | 0.2 | 58 | 2 |
| 19580107 | 47 | -22 | 33 | C | 0.8 | 0.7 | 52 | 5 |
| 19580107 | 48 | 4 | 14 | C | 0.6 | 0.5 | 24 | 4 |
| 19580107 | 50 | -17 | 1 | C | 1 | 0.9 | 20 | 3 |
| 19580107 | 51 | 11 | -2 | F | 9 | 6 | 22 | 64 |
| 19580107 | 52 | -28 | -38 | C | 1.2 | 1 | 60 | 4 |
| 19580107 | 53 | -17 | -40 | D | 3.6 | 2.3 | 55 | 9 |
| 19580107 | 54 | -11 | -56 | E | 10 | 4 | 70 | 21 |
| 19580107 | 55 | -16 | 5 | B | 0.1 | 0.1 | 21 | 4 |
| 19580107 | 56 | -38 | -13 | C | 0.6 | 0.4 | 52 | 6 |
| 19580107 | 57 | 14 | -31 | A | 0.1 | 0.1 | 50 | 2 |
| 19580107 | 58 | 26 | -68 | J | 0.8 | 0.8 | 82 | 1 |
| Part | Error Type | Error Date | Number of Errors | Sum of Errors | Total Errors |
| Printed | B0 | 19610630 | 1 | 3 | 6 |
| | | 19631105 | 1 | | |
| | | 19770803 | 1 | | |
| | P | 19691217 | 1 | 3 | |
| | | 19750621 | 1 | | |
| | | 19830528 | 1 | | |
| Handwritten | Group area | 19591201 | 1 | 3 | 29 |
| | | 19681109 | 1 | | |
| | | 19830228 | 1 | | |
| | Largest-spot area | 19680331 | 1 | 3 | |
| | | 19681109 | 2 | | |
| | r | 19680331 | 2 | 11 | |
| | | 19681109 | 1 | | |
| | | 19741010 | 1 | | |
| | | 19781208 | 1 | | |
| | | 19810128 | 2 | | |
| | | 19890707 | 1 | | |
| | | 19890624 | 2 | | |
| | | 19920225 | 2 | | |
| | Lati. | 19680331 | 1 | 4 | |
| | | 19681109 | 1 | | |
| | | 19810421 | 1 | | |
| | | 19810503 | 1 | | |
| | | 19931226 | 1 | | |
| | Long. | 19800611 | 1 | 2 | |
| | | 19890409 | 1 | | |
| | Num | 19730505 | 1 | 4 | |
| | | 19780411 | 1 | | |
| | | 19830528 | 1 | | |
| | | 20010517 | 1 | | |
| | Group Type | 19890624 | 1 | 2 | |
| | | 20040909 | 1 | | |
| Obs. | Years | Num. of Sunspot Drawings | Printed Part (Records) | Handwritten Part (Records) | Total Num. of Information |
| YNAO | 1957-2015 | 15752 | 15752 | 89262 | 1051422 |
| BJP | 1979-1982; 1989-1999 | 3764 | 3764 | 28647 | 170704 |
| PMO | 1954-1963; 1965-1980; 1982; 1985-2011 | 12508 | 12508 | 57570 | 626765 |
| QDOS | 1925; 1947-1977; 1980; 1982-1989; 1991-1992; 1995-1996; 2000-2009; 2011-2016 | 12634 | 12634 | 48424 | 395047 |
| SSOS | 1952-1964 | 2415 | 2415 | 11586 | 59471 |
| Obs. | Year | Dates (MMDD-MMDD) |
| YNAO | 1958 | 0208-0214; 1122-1201 |
| | 1959 | 0221-0301; 0524-0531 |
| | 1960 | 0523-0613; 0713-0721 |
| | 1963 | 0101-0109; 0608-0614; 1016-1022; 1226-0101 |
| PMO | 1956 | 0116-0123; 0127-0204; 0320-0402; 0605-0612; 0731-0807; 0824-0831; 0913-0925 |
| | 1957 | 0107-0118; 0131-0213; 0420-0503; 0727-0807; 1123-1130; 1206-1215 |
Chinese Sunspot Drawing and Its Digitization-(I) Parameter Archives
G.H. Lin(1,2), X.F. Wang(1,2), S. Liu(1,2), X. Yang(1,2), G.F. Zhu(1,2), Y.Y. Deng(1,2), H.S. Ji(3), T.H. Zhou(3), L.N. Sun(4), Y.L. Feng(5), Z.Z. Liu(5), J.P. Tao(5), M.X. Ben(5), J. Lin(5), M.D. Ding(6,7), Z. Li(6,7), S. Zheng(8), S.G. Zeng(8), H.L. He(8), X.Y. Zeng(8), Y. Shu(8), X.B. Sun(8)
1 Key Laboratory of Solar Activity, Datun Rd. 20A, Chaoyang District, Beijing, 100101, P. R. China
2 National Astronomical Observatories, CAS, Datun Rd. 20A, Chaoyang District, Beijing, 100101, P. R. China
3 Purple Mountain Observatory, CAS, 2 Beijing Xi Road, Nanjing, Jiangsu, 210008, P. R. China
4 Qingdao Observatory, Purple Mountain Observatory, CAS
5 Yunnan Astronomical Observatory, CAS, 396 Yanfangwang, Guandu District, Kunming, Yunnan, 650216, P. R. China
6 School of Astronomy & Space Science, Nanjing University, 22 Hankou Road, Gulou District, Nanjing, Jiangsu, 210093, P. R. China
7 Key Laboratory for Modern Astronomy and Astrophysics, Nanjing University, Ministry of Education, Nanjing 210023, P. R. China
8 College of Science, China Three Gorges University, Yichang 443002, P. R. China
Abstract
Based on Chinese historical sunspot drawings, a data set consisting of the scanned images and all their digitized parameters from 1925 to 2015 has been constructed. In this paper, we briefly describe the history of sunspot drawing in China. We then describe the preliminary processing steps, from the initial data (input to the scanning equipment) to the extraction of parameters, and finally summarize the general features of the data set. This is the first systematic project in the Chinese solar-physics community to digitize historical sunspot drawings. Our data set fills a historical gap of almost ninety years for a region that spans 60 degrees from east to west and 50 degrees from north to south and previously had no continuous and detailed digital sunspot observation information. As a complement to other sunspot observations in the world, our data set provides abundant information for research on long-term solar cycles and solar activity.
Sunspot, Solar Cycle, Sunspot Drawings, Digitalization, Physical Parameters, Big Data
1 Introduction
Past sunspot observations provide the most direct data on changes in solar activity. Conventional sunspot observations not only reveal the variation of the sunspots themselves but also offer hints for understanding other solar phenomena such as the solar magnetic field, solar rotation, and white-light flares. The formation of sunspots leads to the formation of active regions, which host the strongest magnetic fields in the solar photosphere and are formed by magnetic flux emerging from beneath the photosphere. Because of this magnetic field, an active region containing sunspots is the main location where violent eruptions such as solar flares can occur (Li, 2015). Intense solar activity can affect the Earth's magnetosphere and ionosphere; for example, telecommunication can be seriously hindered or even suddenly interrupted for a while, which poses a serious threat to the safety of high-technology systems on aircraft, ships, and satellites, as well as to telecommunication, facsimile, and so on.
The long-term temporal and spatial evolution of the sunspot distribution reflects the magnetic dynamo process in the solar interior. Research on the solar dynamo relies heavily on long-term historical data accumulated by different observers. It is therefore important to improve the time-zone coverage of the observations and the accuracy of the information extracted from historical observing data, both for understanding and predicting long-term solar behavior and for studying the Sun's influence on the solar-terrestrial space environment and on human activities (Tang, 2014; Tang, 2015).
Historical sunspot observations are spread widely over different longitudes of the globe. The major sunspot information databases are given in Table 1. In addition to the databases in Table 1, the World Data Center-SILSO (Vanlommel et al., 2004; Clette et al., 2014, http://www.sidc.be/silso/) provides total sunspot numbers (daily total sunspot number [1/1/1818 - now], monthly mean total sunspot number [1/1749 - now], yearly mean total sunspot number [1700 - now]) and the number of sunspot groups from 1610 to 2010. In this paper, we present our Chinese sunspot database, which to a great extent fills the longitudinal gap in long-term sunspot observations from 1925 to the present.
Ancient China attached importance to astronomical records because the Emperor's divine rights were held to be granted by Heaven. China appears to be the first country in the world to have recorded sunspots (traceable to around 364 B.C.; Leo, 2011) and has long-term textual records of sunspots; however, they were not scientifically quantified. The telescope was introduced to China between the Ming and Qing Dynasties, in 1622, by the missionary Johann Adam Schall von Bell, and observations with it played an important role in the fierce court dispute between the old calendar and the new one. From solar observations, people at that time recognized that sunspots "drift from east to west, 14 days along the diameter, the larger one reducing the Sun's luminosity" (Wang et al., 2008). China may hold many telescopic sunspot drawings, but at present only the largest, relatively comprehensive and systematic observations are gathered in our data set. This data set consists of observations from six stations, of which the following three contribute the vast majority of the data in our archives.
Qingdao Observing Station (QDOS) was the earliest in modern China to study sunspots with a telescope, but its sunspot drawings had a rather difficult history (see Table 2). Its first observation in the current archives was recorded by Gao Pingzi (1888–1970) on 1 May 1925. An example of an observation from the same year is shown in Figure 1. He used a 16 cm equatorial telescope left by the Germans and installed a projection board behind the eyepiece. Sunspots and plages were then drawn by hand on paper laid on this board, with the solar diameter adjusted to 18.2 cm. One drawing was made every day except when it was overcast.
Purple Mountain Astronomical Observatory (PMO) began sunspot drawings in 1934 with a 20 cm aperture refracting telescope made by the Zeiss company. In 1937, PMO was forced to move away from Nanjing, but in the same year the PMO staff set up an observatory on top of Fenghuang Mountain near the city of Kunming. Sunspot observations were made with an 8 cm refracting telescope, and relative sunspot numbers were published every half year. After victory in the war against Japan, most of the PMO staff moved back to Nanjing to rebuild the observatory, which had been left in ruins. In 1954, all sunspot drawing data in China were collected and analyzed at PMO. Regular daily sunspot drawing in China started from then, and in 1957 all sunspot drawings were required to use a solar disk image 17.4 cm in diameter.
Yunnan Astronomical Observatory (YNAO) developed from PMO's station on Fenghuang Mountain after the war with Japan. YNAO inherited its sunspot drawing tradition from PMO, and the continuity of its observations is the best in our data set, so YNAO contributes the largest part of our sunspot drawing archive. The sunspot drawings from YNAO have been evaluated by international colleagues. Its daily sunspot area data were found to have the smallest random error in the world, to be of good quality for supplementing the global sunspot data, and to fill time-zone gaps (Baranyi et al., 2001; Balmaceda et al., 2009).
Chinese Solar-Geophysical Data (CSGD, a printed journal) published, from 1971 to 2001, the daily relative sunspot numbers, areas, and predicted smoothed numbers, all of which came from the sunspot drawing records of Chinese observatories (Yan et al., 2018). Parts of CSGD are published online by NGDC (National Geophysical Data Center) at http://www.ngdc.noaa.gov/nndc.
In general, Chinese sunspot drawings contain almost all the physical information that could be obtained in that era. However, these materials were difficult to preserve completely because of their long history, humidity, decay, pest damage, relocations, etc. To make better use of the scientific value of these data, it is necessary to extract the information from the Chinese historical sunspot drawings completely, accurately, and reliably. Digitizing the sunspot drawings also makes the data available through network sharing and preserves the information for long-term use.
With the support of the National Basic Research Program of the Ministry of Science and Technology of China, the digitization of Chinese historical sunspot drawings commenced in May 2014. By the end of April 2018, we had finished digitizing the sunspot drawings from six Chinese observing stations and had completed the extraction of their parameters. The digitization went through the following steps: scanner selection, evaluation of the scanning results against the original images, automatic extraction of handwritten parameters, verification of the extracted parameters by computer program, and manual checking by sampling. Here we present part of the statistical results from the entire analysis; the details of the process are given in the following six sections.
2 Observation Data of Chinese Sunspot Drawing
Historically, there have been six main observing stations for sunspot drawings in China: PMO, YNAO, QDOS, Sheshan Observing Station (SSOS), Beijing Planetarium (BJP) and Nanjing University (NJU). Table 3 lists the Chinese sunspot drawing data, including the name, longitude and latitude of each station, the time range, and the number of images. The best data continuity is achieved by YNAO, with a time span of 62 years from 1957 to now (even though such traditional observations were later no longer supported by operational funds). The second best is PMO, with a time span of 57 years. At each station, one sunspot drawing is made on every sunny day. In China, the first sunspot drawing was made at QDOS in 1925. Unfortunately, the observations were soon interrupted by the Japanese invasion of Nanjing in 1937, and nearly two years of sunspot drawing data were lost. The observations were also interrupted several times by the social turmoil of that time. Regular daily sunspot drawing took place only after 1947. Since then, the observatory has been observing continuously. Table 3 shows that QDOS and PMO have different observation periods. However, because the observations complement one another, continuous sunspot observation data exist in China from 1947 to now. The main parameters of the sunspot drawing telescopes at all observing stations are listed in Table 4. In this paper, we mainly discuss the data processing of the sunspot drawings from PMO and YNAO as examples among the six observing stations. Figures 2 and 3 show the sunspot drawing telescopes of PMO and YNAO, respectively.
2.1 Processes of Sunspot Drawing
PMO, YNAO, and the other stations used the traditional projection method, in which a sunspot drawing is obtained by projecting an enlarged image of the Sun onto a projection plate. First, the preprinted sunspot observation record paper is fixed on the projection plate, and the directions (east, west, south and north) are determined accurately. Then the telescope is moved slowly so that the solar projection always overlaps the solar limb printed on the recording paper. Finally, the details of the sunspots are drawn accurately with a pencil. For example, following the projected image of the sunspots on the projection plate, the penumbra of each sunspot is first drawn with a hard pencil, and then the umbra is traced with a soft pencil. The western sunspots are drawn before the eastern ones, and the larger sunspots before the smaller ones. After the observation, other conventional observing information is recorded: the date of observation, Beijing time (standard time of 120 degrees east longitude), international standard time (UTC), and the Carrington rotation number. (The moment on November 9, 1853, when the prime meridian crossed the center of the solar disk is defined as the beginning of the first solar rotation, and the rotations have been numbered since then; the start and end dates and the number of each solar rotation can be found in astronomical almanacs.)
At the same time, the related parameters at the observation time are calculated from the astronomical almanacs: P (the position angle between the geocentric north pole and the solar rotational north pole, in the Carrington coordinates of the solar surface, positive toward the east and negative toward the west), B0 (heliographic latitude of the center of the solar disk at zero universal time on that day), L0 (heliographic longitude of the center of the solar disk at zero universal time on the observation day), and L (heliographic longitude of the center of the solar disk at the time of the observation). These data can be used to obtain heliographic coordinates, i.e. to calculate the latitude and longitude of the sunspots. The area of a sunspot group can also be measured: a special transparent glass plate is placed on the projection plate. Each square on the glass plate is 1 square millimeter, and the number of squares covered by each sunspot group is recorded (Li et al., 2016). The number of squares covered by a sunspot group can then be converted into the sunspot area by the corresponding calculation. Next, the sunspot group is numbered, and other information such as the coordinates and the type of the sunspot group is recorded. Finally, the number of sunspot groups, the number of sunspots, and the Wolf number are calculated for the southern hemisphere, the northern hemisphere, and the whole solar surface, respectively (Rue et al., 2012).
2.2 The Content of Sunspot Drawing
An example of a sunspot drawing is shown in Figure 4, in which the contents of the rectangular and oval boxes marked in red are described below. The rectangular boxes are printed at fixed positions on the observation paper. Rectangular box 1 contains:
Observing day number within the year.
Date of observation.
Beijing time.
Coordinated Universal Time (UTC).
Rectangular box 2 contains:
P angle: the position angle between the geocentric north pole and the solar rotational north pole measured eastward from geocentric north.
B0 : heliographic latitude of the central point of the solar disk.
L0 : heliographic longitude of the central point of the solar disk.
L: heliographic longitude of the center of solar disk at the time of the observation.
Rectangular box 3 contains various combinations:
g: number of sunspot groups.
gN: number of sunspot groups in the northern hemisphere.
gS: number of sunspot groups in the southern hemisphere.
gNS: total number of sunspot groups, gNS=gN+gS.
f: total number of sunspots.
fN: number of sunspots in the northern hemisphere.
fS: number of sunspots in the southern hemisphere.
fNS: total number of sunspots, fNS=fN+fS.
R: Wolf number.
RN: Wolf number in the northern hemisphere.
RS: Wolf number in the southern hemisphere.
RNS: total Wolf number, RNS=RN+RS.
K: the normalization coefficient of the relative sunspot number, which depends on the site, the instrument and the observer and varies with time. Here Rz published in Zurich was used as the standard, K = Rz/Ry (Ry=RNS). After Zurich stopped publishing, the international relative sunspot number Ri has been used as the standard.
K2: similar to K, but the normalization coefficient for the sunspot area.
The observer’s last name.
Rectangular box 4 contains:
Weather condition: e.g. thin cloud.
Seeing: e.g. 3. Their values range from 1 to 5, in which 1 is the best.
The oval boxes contain handwritten information at arbitrary positions on the drawing. Their contents describe the sunspots on the solar surface, including:
Sunspot group number.
Longitude, latitude.
Structure type of the sunspot group: e.g. CHI, BXI; the McIntosh classification is used (McIntosh, 1990).
Total area of the sunspot group, in units of millionths of the solar disk.
Area of the largest sunspot in the sunspot group, in units of millionths of the solar disk, referred to as the area of the largest sunspot.
r: linear distance between the center of mass of the sunspot group and the center of the solar disk, in units of mm.
For example, in Figure 4 the oval handwritten area for sunspot group No. 63 gives a latitude and longitude of -16.0° and +44.0°, sunspot group structure type CSI, a group area of 1.3, an area of the largest sunspot of 1.1, and a linear distance between the center of mass of the sunspot group and the center of the solar disk of r = 61.
3 Digital Processing
For the digital processing, we visited the World Data Center in Brussels, Belgium, and collaborated with Dr. Frederic Clette, who was converting Europe's historical sunspot drawings into digital files. Based on their recommended standards and many scanning trials, we established a set of rules to guide the repetitive scanning work.
3.1 Scanner selection
There are three basic requirements for the scanner. 1) Resolution: the smallest spot drawn on a sunspot drawing is about the size of a pencil tip (0.05 mm). To scan such a point clearly, at least 2 pixels are needed, which requires a scanner resolution of about 1000 ppi (pixels per inch). 2) Gray level: to distinguish the umbra from the penumbra effectively (the early drawings have been stored for a long time and are light in color), a scanner with the maximum gray level is required. 3) Width: A3. After careful investigation, the Zhongjing 1960XL scanner, which meets the above requirements, was chosen.
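The resolution requirement can be checked with a short calculation. The sketch below uses only the two figures quoted above (a 0.05 mm pencil dot sampled by at least 2 pixels); the function name is illustrative and not part of the project's software.

```python
MM_PER_INCH = 25.4

def required_ppi(feature_mm: float, pixels_per_feature: int) -> float:
    """Scanner resolution (pixels per inch) needed so that a feature of
    size feature_mm is covered by pixels_per_feature pixels."""
    pixel_pitch_mm = feature_mm / pixels_per_feature  # size of one pixel
    return MM_PER_INCH / pixel_pitch_mm

ppi = required_ppi(0.05, 2)
print(f"required resolution: {ppi:.0f} ppi")
```

The result is slightly above 1000 ppi, which is consistent with the round figure quoted in the requirement.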
3.2 Scanning Rules
The scanning rules were derived from two aspects: visual inspection and quality assessment. Drawings from different years and observers were selected and placed in the scanner. The main operation is to adjust the brightness and contrast; for example, when the gray-scale image is too bright or too dark, the brightness slider is dragged to change it. If the brightness is too high, the image looks white; if it is too low, the image is too dark. The brightness should be moderate. Other parameters can be adjusted locally in a similar way. The team members carried out visual inspection until everyone was satisfied, and then formulated the technical parameters and operational rules for scanning the sunspot drawings. Quality assessment was carried out by comparing the relative sunspot numbers from the digital and original data to uncover potential problems; this was verified repeatedly, and the preparation ended when the requirement (relative error 5%) was satisfied. After the scanner has been adjusted according to the above rules, the scanning process follows these specific rules: 1) Take an original sunspot drawing to be scanned, check it for folds, wrinkles and surface dust, and carefully clean and flatten the paper. 2) Put the original face down in the scanner, adjust its position, and align the paper with the corner of the scanner. Make sure that the north-south direction of the observation paper is perpendicular to the scanning direction, as shown in Figure 5. 3) To preserve all the details of the original observation, the image is saved in the 24-bit BMP (Bit Map Picture) color format, which results in a single image size of about 90 MB.
Although this format is more than 10 times larger than the commonly used JPG (Joint Photographic Experts Group) format, the original observations are recorded faithfully without compression distortion (Figure 4), which allows each image to be analyzed with great precision.
3.3 File naming and Storage
Since the data were produced by multiple stations, the file names and storage directories of all scanned observational data are specified to make database queries convenient. The file name is consistent with the basic rules of international astronomical naming, and the unified form is
<observatory>_<sd>_<year><month><date>_<hour><minute>_<observer>.bmp
Here, observatory is the source of the observational data (the abbreviations and corresponding full names of the Chinese observatories are given in the first column of Table 3), and sd is the type of observational data, indicating a sunspot drawing. The observation time is given as year, month, date, hour and minute (UTC). For example, for the sunspot drawing made by the YNAO observer whose last name is Ye at 5:40 on February 18, 1957, the file name after digitization is as follows:
Ynao_sd_19570218_0540_ye.bmp
After scanning, the file is stored on the server in the following folder: /1957/02/YNAO/
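The naming rule can be illustrated with a minimal sketch that builds and parses such file names. It is not the project's actual tooling: the function names and the regular expression are hypothetical, and the upper-case station folder is an assumption taken from the single example above.

```python
import re

# <observatory>_sd_<YYYYMMDD>_<HHMM>_<observer>.bmp (letters-only names assumed)
NAME_RE = re.compile(
    r"^(?P<observatory>[A-Za-z]+)_sd_"
    r"(?P<date>\d{8})_(?P<time>\d{4})_"
    r"(?P<observer>[A-Za-z]+)\.bmp$"
)

def build_name(observatory: str, date: str, time: str, observer: str) -> str:
    """Assemble a file name from its parts (date as YYYYMMDD, time as HHMM, UTC)."""
    return f"{observatory}_sd_{date}_{time}_{observer}.bmp"

def parse_name(filename: str) -> dict:
    """Split a file name into observatory, date, time and observer."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.groupdict()

def storage_dir(filename: str) -> str:
    """Folder /<year>/<month>/<STATION>/ for a file (upper-casing is assumed)."""
    info = parse_name(filename)
    year, month = info["date"][:4], info["date"][4:6]
    return f"/{year}/{month}/{info['observatory'].upper()}/"

name = build_name("Ynao", "19570218", "0540", "ye")
print(name)
print(storage_dir(name))
```

Keeping the date and time as zero-padded strings means that file names sort chronologically within a station.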
4 Automated Extraction of Parameters
Based on the scanned images and a convolutional neural network (CNN), identification software was developed to recognize the sunspot drawing parameters automatically, that is, the content of the rectangular boxes (printed parts) and oval boxes (handwritten parts) described in Section 2.2. Figure 6 shows the interface of the identification software, in which the content on the left is the recognition result for the image shown in the middle and covers the whole content of the rectangular and oval boxes. For a detailed introduction to this software, we refer the reader to an earlier paper (Zheng et al., 2016).
4.1 Parameter Record Format
The identified parameters are stored in the ".txt" document format (on the right of Figure 7) and the ".csv" table format (see Tables 5 and 6), which are available to users for further processing as required. In the TXT document format, each record of the printed part begins with '#', uses ',' as the separator and ends with ';'; '?' indicates that the information is unknown (either the handwriting has blurred over time and cannot be recognized, or there is no original information). There is no space around the separators. Each record of the handwritten part begins with ’’, uses ',' as the separator and ends with ';', and '?' again means that the information is unknown (for the same reasons), as shown on the right of Figure 7. An example of the records is shown in Figure 7, in which the information marked by red and green boxes on the drawing is recorded in the parts of the document marked by red and green rectangles, respectively.
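As an illustration of the TXT conventions for the printed part ('#' prefix, ',' separator, ';' terminator, '?' for unknown), a reader could parse one record as sketched below. The sample record and its field order are hypothetical; the actual field layout is the one shown in Figure 7.

```python
def parse_printed_record(line: str) -> list:
    """Split one printed-part record into its fields.

    Unknown values, written as '?', are returned as None.
    """
    line = line.strip()
    if not (line.startswith("#") and line.endswith(";")):
        raise ValueError(f"not a printed-part record: {line!r}")
    fields = line[1:-1].split(",")  # drop the leading '#' and trailing ';'
    return [None if field == "?" else field for field in fields]

# Hypothetical record; the real field order is not specified here.
sample = "#19570218,0540,?,12.5;"
print(parse_printed_record(sample))
```

Values are kept as strings so that a later validation step can inspect the number of digits and decimal places before any numeric conversion.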
5 Accuracy Verifications of Extraction Parameter
The number of sunspot drawings is large: YNAO alone has 15,752 sunspot drawings, which contain 1,051,422 records of information in the rectangular and oval boxes. To stay faithful to the original data and ensure the reliability of the digital data, the following verification methods were used to check the data, and an accuracy test was finally carried out.
5.1 Check the Accuracy of Values by Data Type
Each sunspot drawing contains different types of sunspot data. The length (number of digits) differs between types of values, but the format is uniform within each type. For example, in Table 5, the number, date of observation, Beijing time, UTC, gN, gS, gNS, fN, fS, fNS and so on are positive integers, whereas P, B0, L0 and so on are decimals. The numeric type of a value can be verified by counting its decimal places and checking whether that count is consistent with the standard length for values of its type.
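A type check of this kind could look like the following sketch. The field-to-type mapping lists only the fields named above, and the patterns are assumptions: they distinguish integers from decimals but do not enforce a particular number of decimal places, which is not specified here.

```python
import re

INTEGER_RE = re.compile(r"^\d+$")         # integer fields such as gN or fNS
DECIMAL_RE = re.compile(r"^-?\d+\.\d+$")  # signed decimals such as P, B0, L0

# Hypothetical mapping from field name to expected kind.
EXPECTED_KIND = {
    "gN": "integer", "gS": "integer", "gNS": "integer",
    "fN": "integer", "fS": "integer", "fNS": "integer",
    "P": "decimal", "B0": "decimal", "L0": "decimal",
}

def has_expected_type(field: str, text: str) -> bool:
    """True if the raw text of a field matches the format expected for it."""
    pattern = INTEGER_RE if EXPECTED_KIND[field] == "integer" else DECIMAL_RE
    return bool(pattern.match(text))

print(has_expected_type("gN", "3"))      # integer field, integer text
print(has_expected_type("P", "-12.31"))  # decimal field, decimal text
print(has_expected_type("B0", "7"))      # decimal field without decimal places
```

Values that fail such a check would be candidates for manual comparison with the original drawing.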
5.2 Abnormal Value Elimination
Most values recorded in the YNAO sunspot drawings are limited to specific ranges. For example, the year of observation is 1958-2015; the month ranges from January to December; the day ranges from 1 to 31 in the longer months, from 1 to 30 in the shorter months, and from 1 to 29 or 28 in February of leap or common years; the P angle varies from -26.31 to 26.31; B0 varies from -7.23 to 7.23; and seeing is an integer between 1 and 5. The different types of data are sorted separately and then checked to determine whether each value is valid and within its normal range. If not, the value is compared manually with the data recorded on the sunspot drawing of that day.
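The range checks listed above can be sketched as follows. The limits are the ones quoted in the text; the record layout (a dictionary with the keys `date`, `P`, `B0` and `seeing`) and the function names are assumptions made for the example.

```python
import calendar

P_MAX = 26.31   # |P| limit quoted in the text (degrees)
B0_MAX = 7.23   # |B0| limit quoted in the text (degrees)

def valid_date(yyyymmdd: str, first_year: int = 1958, last_year: int = 2015) -> bool:
    """Check year range, month range and day-of-month, including leap years."""
    if len(yyyymmdd) != 8 or not yyyymmdd.isdigit():
        return False
    year, month, day = int(yyyymmdd[:4]), int(yyyymmdd[4:6]), int(yyyymmdd[6:])
    if not (first_year <= year <= last_year and 1 <= month <= 12):
        return False
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month

def out_of_range_fields(record: dict) -> list:
    """Return the names of the fields whose values fall outside the normal range."""
    problems = []
    if not valid_date(record["date"]):
        problems.append("date")
    if abs(record["P"]) > P_MAX:
        problems.append("P")
    if abs(record["B0"]) > B0_MAX:
        problems.append("B0")
    if record["seeing"] not in range(1, 6):
        problems.append("seeing")
    return problems

print(out_of_range_fields({"date": "19590230", "P": 30.0, "B0": 0.0, "seeing": 6}))
```

Any record with a non-empty result would then be compared manually with the drawing of that day, as described above.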
5.3 The Correlation Verification
Correlation verification consists of two parts. The first is the conversion and comparison of related data within the printed part. The printed part of the YNAO sunspot drawings contains information such as Beijing time, UTC, gN, gS, gNS, fN, fS and fNS, and these quantities obey fixed relations. For example, the difference between Beijing time and UTC is about 8 hours, and the count for the whole visible disk is the sum of the counts for the northern and southern hemispheres, i.e., gN + gS = gNS. The same rule applies to fN, fS, fNS and to RN, RS, RNS. In addition, the number of sunspots, the number of sunspot groups and the Wolf number are related as follows:
RN = 10 gN + fN
RS = 10 gS + fS
RNS = 10 gNS + fNS
R = K (10 g + f), where R = RNS and f = fNS
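These relations lend themselves to an automatic consistency test. The sketch below is a minimal Python illustration: the dictionary keys follow the field names of Table 5, while the "HHMM" time format, the 5-minute tolerance and the function names are assumptions.

```python
def printed_part_inconsistencies(rec):
    """Return a list of relations that the printed-part values violate."""
    problems = []
    # Hemispheric sums: xN + xS must equal xNS for g, f and R.
    for p in ("g", "f", "R"):
        if rec[p + "N"] + rec[p + "S"] != rec[p + "NS"]:
            problems.append("%sN + %sS != %sNS" % (p, p, p))
    # Wolf-number relation for each hemisphere and for the full disk.
    for s in ("N", "S", "NS"):
        if rec["R" + s] != 10 * rec["g" + s] + rec["f" + s]:
            problems.append("R%s != 10*g%s + f%s" % (s, s, s))
    return problems


def minutes(hhmm):
    """Convert an 'HHMM' string (assumed format) to minutes after midnight."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])


def time_offset_ok(beijing, utc, tolerance=5):
    """Check that Beijing time is about 8 hours ahead of UTC."""
    diff = (minutes(beijing) - minutes(utc)) % 1440  # wrap around midnight
    return abs(diff - 480) <= tolerance
```

An empty list from `printed_part_inconsistencies` means that all six relations hold for the record.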
The second part compares the same quantities as recorded in the printed and handwritten parts. For example, the printed part gives the total numbers of sunspot groups and sunspots observed on a given day, while the handwritten part gives the details of each sunspot group, including the number of sunspots it contains. The total number of sunspot groups on a given day can therefore be obtained by counting the groups recorded in the handwritten part, and the total number of sunspots by adding up the numbers of sunspots recorded for each group. These totals are then compared with gNS and fNS, respectively. The same method also reveals whether a sunspot group in the handwritten part has been missed during identification.
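A minimal Python sketch of this cross-check is given below. The function name and the representation of the handwritten part as a list of per-group sunspot counts are assumptions made for illustration.

```python
def compare_with_handwritten(g_ns, f_ns, spots_per_group):
    """Compare printed totals with totals derived from the handwritten part.

    g_ns, f_ns       -- printed totals of groups and sunspots (gNS, fNS)
    spots_per_group  -- one sunspot count per handwritten group entry
    """
    return {
        "groups_match": len(spots_per_group) == g_ns,
        "spots_match": sum(spots_per_group) == f_ns,
        # A positive value suggests groups missed during identification.
        "missing_groups": g_ns - len(spots_per_group),
    }
```

For instance, a printed record with gNS = 3 and fNS = 11 agrees with handwritten groups containing 5, 4 and 2 sunspots, but not with only the first two of them.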
5.4 Verification of the Relevant Laws
In 1861, the British astronomer Richard Christopher Carrington discovered what is now known as Spörer’s law, which underlies the butterfly diagram (Hopkins, 1976; Carrington, 1969) and was later investigated in detail by the German astronomer Gustav Spörer. During a solar cycle, sunspots are distributed within latitudes of ±45°. Most of them appear in bands at latitudes of 15°-20° on both sides of the equator, and very few appear within 8° of the equator. At the beginning of a sunspot cycle, sunspots mostly appear at latitudes of 30° to 45°. As the cycle progresses, they appear at lower and lower latitudes, and the sunspot number reaches its maximum when the average latitude is about 15°. The average latitude then continues to decrease, and at the end of the cycle sunspots appear at about 7°. The sunspots of the new cycle then begin to appear at relatively high latitudes (Phillips, 1995).
The processed sunspot data are validated against these rules: the latitude of each sunspot group should lie between -45° and 45°, and the latitudes should follow the pattern described above. Records that do not are proofread manually. In addition, other parameters of a given sunspot group, such as its longitude and latitude, area and number of sunspots, change as the group forms and decays. For example, sunspots generally appear in the eastern hemisphere and disappear in the west, so the longitude variation of a sunspot group can be verified, and the latitudes can then be checked for conformity with Spörer’s law in each sunspot cycle. The size and number of sunspots in an individual group usually increase at first and then decrease, and this rule can also be used to verify the accuracy of the data.
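Two of these checks can be sketched in Python as follows. The function names are hypothetical, and the second function assumes that the positions of a group on successive days are given as central-meridian distances in degrees, negative in the east and positive in the west; this sign convention is an assumption, not something stated in the drawings.

```python
def latitude_in_band(latitude_deg, limit_deg=45.0):
    """Return True if a group latitude lies between -limit and +limit."""
    return abs(latitude_deg) <= limit_deg


def moves_east_to_west(positions_deg):
    """Return True if successive daily positions increase monotonically.

    positions_deg -- central-meridian distances of one group on successive
                     days (assumed convention: east negative, west positive).
    """
    return all(a < b for a, b in zip(positions_deg, positions_deg[1:]))
```

Groups that fail either test would be flagged for manual proofreading rather than corrected automatically.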
5.5 Data Test
A total of 160 sunspot drawings (about 1%) were selected by random sampling to test the errors of human recognition and input. The test scheme was as follows. A group of 10 members was divided into five teams, and each team was equipped with two computers, one to display the original image and the other to check the digitized data. The 160 randomly selected images contained 918 lines of sunspot-related information (i.e., the total number of lines of Table 5) and 10,864 valid data values. According to the test results, there were 6 errors in the fixed region (the printed part, as in Table 5) and 29 errors in the sunspot-related information (the handwritten part, as in Table 6), giving 35 errors in total. For these 160 samples, the error rate of the fixed region is 3.75% (6/160), and the error rate per line of sunspot-group information is 3.16% (29/918). In terms of valid data values, the error rate for the sample of 160 sunspot drawings is 0.32% (35/10864). The erroneous data are listed in Table 7.
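The three ratios follow directly from the counts given above, as the short Python sketch below shows. The variable and function names are chosen for illustration only.

```python
def error_rate_percent(errors, total):
    """Return the error rate as a percentage."""
    return 100.0 * errors / total


# Counts taken from the test described in the text.
fixed_region_errors = 6     # errors in the printed part
group_line_errors = 29      # errors in the handwritten part
total_errors = fixed_region_errors + group_line_errors

fixed_region_rate = error_rate_percent(fixed_region_errors, 160)  # per drawing
group_line_rate = error_rate_percent(group_line_errors, 918)      # per line
overall_rate = error_rate_percent(total_errors, 10864)            # per value
```

Note that the three rates use different denominators (drawings, lines and individual values), so they are not directly comparable with one another.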
6 Data Summary and Analysis
Decades of Chinese sunspot drawings were digitized, and their parameters were extracted and analyzed statistically, using human-computer interactive scanning and CNN-based handwritten character recognition. As shown in Table 8, the archive contains 47,073 images, whose digitized results are stored in the format described in Section 4.1. The printed part contains 47,073 valid records and the handwritten part contains 235,489.
Statistical distributions were computed for the digitized data of PMO and YNAO, the two stations with long observation records. Figure 8 shows the temporal evolution of the full-disk sunspot number based on the digitized sunspot drawings of YNAO (upper panel) and PMO (bottom panel), where the X-coordinate is the year and the Y-coordinate is the total number of sunspots directly observed with the sunspot drawing telescope on a given day. The red solid curve is the 180-day smoothed curve superposed on the original data. The sunspot drawings from YNAO and PMO together contribute the largest part of our data archive. At present, the pure full-disk sunspot counts, which do not include the group numbers, are the accurate statistics for this archive; the relative sunspot number R can be provided once the cross-calibration with the international relative sunspot number is finished.
Figure 9 shows the periodic variation of sunspots obtained from the digitized data of PMO and YNAO, where the X-coordinate represents time (year) and the Y-coordinate represents latitude. The result reproduces the butterfly diagram, which shows the variation of sunspot latitudes in the northern and southern hemispheres over the past 58 years. It is evident that the average length of the solar cycle is about 11 years and that the variation of the sunspot latitude is consistent with Spörer’s law. For example, at the beginning of the twentieth solar cycle (October 1964-June 1976), sunspots of the new cycle appeared at higher latitudes, while sunspots of the old cycle appeared at lower latitudes.
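The text does not specify how the 180-day smoothed curve of Figure 8 was computed. One common choice, shown here purely as an assumption, is a centered running mean over a series with one value per day; days without drawings would have to be filled or skipped before applying it.

```python
def running_mean(values, window=180):
    """Centered running mean; the window shrinks near the ends.

    With half = window // 2, each output value averages the samples
    from i - half to i + half inclusive, i.e. up to window + 1 values.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Other smoothing kernels would give slightly different curves, so this sketch should not be read as a reproduction of the figure.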
The data continuity of each station was also examined, and the results are shown in Figure 10. PMO and YNAO have the longest observation records, although the data for some months are missing. We counted the periods of more than seven days without sunspot drawings, and part of the results is given in Table 9. YNAO has 100 such intervals, whereas PMO has 283.
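Such gaps can be located from the list of observation dates alone. The following Python sketch is one possible way to do so; the function name is hypothetical, and a gap is interpreted here as more than seven consecutive calendar days without a drawing, which is an assumption about how the periods were counted.

```python
from datetime import date


def long_gaps(observation_dates, min_days=7):
    """Return (last_date_before_gap, first_date_after_gap, missing_days)
    for every gap with more than `min_days` days without a drawing."""
    ordered = sorted(set(observation_dates))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        missing = (later - earlier).days - 1  # days strictly in between
        if missing > min_days:
            gaps.append((earlier, later, missing))
    return gaps


# Hypothetical observation dates for illustration.
dates = [date(1980, 1, 1), date(1980, 1, 2), date(1980, 1, 12), date(1980, 1, 15)]
gaps = long_gaps(dates)
```

The number of intervals for a station is then simply `len(gaps)`.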
Sunspot drawings from various regions of China have been digitized. Software was developed to recognize and extract the valuable parameters, and the correctness of the results was verified at the same time. In addition, a detailed data consolidation scheme and several data verification methods were developed according to the types and characteristics of the data recorded in the images, and the accuracy of the parameters was checked again with these methods. Finally, the database of Chinese sunspot drawings and their parameters was established. A preliminary analysis shows that the data are consistent with the well-known patterns of sunspot behaviour. Digitizing the Chinese sunspot drawings and extracting their parameters not only preserves the historical Chinese sunspot observations permanently, but also gives the data continuity, integrity, complementarity and usability. In these four respects the work fills the gap in digital sunspot observations for this region and provides more detailed and richer data for the study of the long-term evolution of sunspots. The database is available at http://sun.bao.ac.cn/SHDA_data/.
At present, YNAO and QDOS continue to produce sunspot drawings, which will provide continuous ground-based sunspot observations for the Chinese time zone. A quality assessment of the Chinese sunspot drawings is under way, and the results will be published separately. A web-based English query interface will be released soon. As a next step, we will further improve the accuracy of the extracted parameters by checking them against the original sunspot drawings, and will produce an updated version of the parameter set.
7 Data Policy
The images and data from the Archive of Chinese Historical Sunspot Drawings (ACHSD) can be freely downloaded as public data. However, any public use of these data, whether web-based or in a printed publication, must include an explicit credit to the source: (ACHSD data/image, National Astronomical Observatories, CAS, Beijing, China).
We thank the reviewer for valuable suggestions and constructive criticism, which improved the clarity of the article. The work was funded by the National Science Foundation of China, Grant Nos. U1531247 and 11427901; the 13th Five-year Informatization Plan of the Chinese Academy of Sciences, Grant No. XXH13505-04; and the special foundation work of the Ministry of Science and Technology of China, Grant No. 2014FY120300. We would like to thank the predecessors who have been engaged in the observation of sunspots in China over the past decades.
References
- Balmaceda, L. A., Solanki, S. K., Krivova, N. A., Foster, S.: 2009, J. Geophys. Res. 114, A07104.
- Baranyi, T., Győri, L., Ludmány, A., Coffey, H. E.: 2001, Mon. Not. Roy. Astron. Soc. 323, 223.
- Baranyi, T., Győri, L., Ludmány, A.: 2015, IAU General Assembly 2257669, 2257669.
- Carrington, R. C.: 1969, Observations of the spots on the sun from November 9, 1853, to March 24, 1861, made at Redhill. [M]. Readex Microprint
- Clette, F., Svalgaard, L., Vaquero, J. M., Cliver, E. W.: 2014, Space Sci. Rev. 186, 35.
- Hopkins, J.: 1976, Chicago, University of Chicago Press, 174 p.
- Howard, R. F.: 1991, Special. Pub. 50, NASA, Washington, D.C. 451.
- Howard, R. F., Gilman, P. I., Gilman, P. A.: 1984, Sol. Phys. 136, 251.
