Datasets for quantifying association between short-term exposure to maximum temperature and heatstroke-related ambulance dispatches in Japan: A time-stratified case-crossover design
Keita Wagatsuma

TL;DR
This paper provides open-access datasets to study how short-term high temperatures relate to heatstroke ambulance calls in Japan.
Contribution
The paper introduces a comprehensive dataset using a case-crossover design to analyze temperature and heatstroke ambulance dispatches across Japan.
Findings
Daily records of heatstroke ambulance dispatches and temperature data are provided for 47 Japanese prefectures from 2015 to 2019.
The dataset supports location-specific and regional analyses of temperature-heatstroke associations.
The resource is intended for research and educational use in time-series analysis.
Abstract
This study provides a comprehensive collection of open-access daily time-series datasets designed to quantify the association between short-term exposure to maximum temperature and heatstroke-related ambulance dispatches (HSAD) utilizing a two-stage time-stratified case-crossover design. The datasets include daily records of HSAD, maximum temperature (°C), and relative humidity (%) for the summer months (i.e., June to September) from 2015 to 2019, spanning all 47 Japanese prefectures and covering the entire country. These datasets are well-suited for time-series regression analyses to estimate location-specific maximum temperature-HSAD associations in Japan. The availability of data from multiple locations also facilitates the of regional differences by aggregating location-specific associations within broader geographical areas. This resource is intended to support researchers,…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsClimate Change and Health Impacts · Thermoregulation and physiological responses · Thermal Regulation in Medicine
Specifications TableSubjectGlobal and Planetary Change; Health, Toxicology and Mutagenesis; Epidemiology.Specific subject areaQuantifying the association between short-term exposure to maximum temperature and heatstroke-related ambulance dispatches with a two-stage time-stratified case-crossover design.Type of dataRaw, Analysed, FilteredData collectionDaily data were collected during the summer season (i.e., June to September) from 2015 to 2019 for all 47 Japanese prefectures. These datasets included daily counts of heatstroke-related ambulance dispatches, as well as records of prefecture-specific maximum temperature (°C) and relative humidity (%).Data source locationAll 47 Japanese prefectures (i.e., Hokkaido, Aomori, Iwate, Miyagi, Akita, Yamagata, Fukushima, Ibaraki, Tochigi, Gunma, Saitama, Chiba, Tokyo, Kanagawa, Niigata, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Shizuoka, Aichi, Mie, Shiga, Kyoto, Osaka, Hyogo, Nara, Wakayama, Tottori, Shimane, Okayama, Hiroshima, Yamaguchi, Tokushima, Kagawa, Ehime, Kochi, Fukuoka, Saga, Nagasaki, Kumamoto, Oita, Miyazaki, Kagoshima, and Okinawa).Data accessibilityRepository name: ZenodoData identification number: 10.5281/zenodo.14560225Direct URL to data: https://zenodo.org/records/14560225
Value of the Data
1
- •This time-series dataset facilitates a comprehensive assessment of exposure-response relationships across multiple locations utilizing a two-stage time-stratified case-crossover design
- •The dataset also has educational value, illustrating their application to studies involving maximum temperatures and heatstroke-related ambulance dispatches through time-series regression models, distributed lag non-linear models, and meta-analysis
- •These examples offer foundational knowledge of time-series analysis in environmental epidemiology and enhance the reproducibility of research findings.
Background
2
Environmental epidemiology research frequently evaluates the short-term associations between ambient environmental stressors (e.g., temperature) and health outcomes (e.g., mortality) utilizing time-series data from multiple geographical locations [1]. In recent years, the two-stage time-stratified case-crossover design has become the prevailing analytical framework for these statistical assessments [2]. This approach comprises two sequential steps: first, location-specific exposure-response relationships are estimated while controlling for relevant potential confounders; second, the location-specific estimates are synthesized using meta-analytic techniques, integrating meta-predictors to capture location-specific characteristics [3]. The principal advantage of this case-crossover design lies in its ability to control for observed and unobserved time-invariant individual confounders, such as age, education, health care access, socio-economic status, and other related factors, by design [2].
This data article offers a contemporary introduction to the two-stage time-stratified case-crossover design for investigating environmental stressors and provides a practical guide to its application in environmental epidemiology. It features a detailed compilation of openly accessible time-series datasets tailored to quantify the association between short-term exposure to maximum temperature and heatstroke-related ambulance dispatches (HSAD) in Japan, as illustrated through a case study [[4], [5], [6]]. This resource is designed to support researchers, educators, and students in employing time-series analysis for research purposes and educational endeavours.
Data Description
3
Study setting
3.1
Situated between latitudes 26°N and 43°N and longitudes 127°E and 141°E, Japan is in the Western Pacific Region. The country consists of all 47 prefectures (i.e., Hokkaido, Aomori, Iwate, Miyagi, Akita, Yamagata, Fukushima, Ibaraki, Tochigi, Gunma, Saitama, Chiba, Tokyo, Kanagawa, Niigata, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Shizuoka, Aichi, Mie, Shiga, Kyoto, Osaka, Hyogo, Nara, Wakayama, Tottori, Shimane, Okayama, Hiroshima, Yamaguchi, Tokushima, Kagawa, Ehime, Kochi, Fukuoka, Saga, Nagasaki, Kumamoto, Oita, Miyazaki, Kagoshima, and Okinawa) and predominantly experiences a temperate climate with four distinct seasons, including hot, humid summers (June to August) and cold, dry winters (December to February) .
Epidemiological data
3.2
We collected daily HSAD counts data in the summer season (June to September) from 2015 to 2019 of all 47 Japanese prefectures from the Fire and Disaster Management Agency of the Ministry of Internal Affairs (FDMA) database (https://www.fdma.go.jp/). Heat stroke-related diagnoses were classified based on the International Classification of Diseases (ICD) 10 codes. More specifically, the following diagnostic categories were included and consolidated under heat stroke: “heatstroke and sun stroke” (T67.0), “heat syncope” (T67.1), “heat cramp” (T67.2), “heat exhaustion, anhidrotic” (T67.3), “heat fatigue, unspecified” (T67.5), “heat fatigue, transient” (T67.6), “heat edema” (T67.7), and “other effects of heat and light” (T67.8) [[4], [5], [6]]. Meteorological data, including daily maximum temperature (°C) and relative humidity (%), were obtained for the same period from a single monitoring station managed by the Japan Meteorological Agency (JMA) (https://www.jma.go.jp/jma/index.html). The dataset included calendar variables such as date, year, month, day of the week, and national public holidays. All data were anonymized, aggregated, sourced from publicly available databases, and exempt from ethical approval requirements.
Experimental Design, Materials and Methods
4
The time-series regression design is a widely used analytical method in environmental epidemiology for examining short-term associations between environmental exposures and health outcomes [1]. In this study, we quantified the association between short-term exposure to maximum temperature and HSAD in Japan. Specifically, we conducted a two-stage time-series analysis to model multi-location data, incorporating time-series datasets for all 47 Japanese prefectures [3]. In the first stage, we estimated the exposure-response relationship in each prefecture. We then pooled the prefecture-specific effect estimates to obtain an overall mean in the second stage using a meta-analysis. Between June and September from 2015 to 2020, a total of 300,528 HSAD cases were documented across 47 Japanese prefectures (Table 1). During this period, the mean daily HSAD cases began increasing in June (mean: 2.7 cases [standard deviation: 4.4 cases]), reached a peak in July (18.8 cases [31.4 cases]) and August (18.3 cases [28.2 cases]), and then declined in September (2.8 cases [7.3 cases]). The maximum recorded temperature during the study period occurred in August at 32.1°C, while the highest relative humidity was observed in September at 76.5%.Table 1. Annual summary statistics of daily counts of HSAD and meteorological variables in four Japanese prefectures during summer period (from June to September) in 2015–2019.Table 1. VariablesJuneJulyAugustSeptemberOverallHSAD (cases per day)2.7 (4.4)18.8 (31.4)18.3 (28.2)2.8 (7.3)10.6 (22.9)Maximum temperature (°C)26.4 (3.3)30.6 (3.7)32.1 (3.4)27.6 (3.4)29.1 (4.1)Relative humidity (%)74.2 (12.0)77.4 (9.2)74.3 (9.3)76.5 (10.4)75.5 (10.3)Notes: The cell values represent the mean, with the standard deviation shown in parentheses. Abbreviations: HSAD, heatstroke-related ambulance dispatches.
In the first stage, we adopted a time-stratified case-crossover study design to analyze prefectural data using a conditional quasi-Poisson regression model combined with a distributed lag non-linear model (DLNM) [2,7]. Adjustments for seasonality, long-term trends, and day of the week effects were achieved through time stratification defined by a three-way interaction term of year, month, and day of the week. This case-crossover design is useful for controlling for individual characteristics which are unlikely to change within the small-time window, such as demographic characteristics, e.g., sex and education and living habits, e.g., smoking and drinking [2]. To investigate the non-linear and multi-delayed relationship between maximum temperature and HSAD, we applied a cross-basis function within the DLNM framework. This cross-basis function utilized a natural cubic B-spline for the exposure-response association, with internal knots placed at the 25^th^, 50^th^, and 75^th^ percentiles of maximum temperature, and another natural cubic B-spline for the lag-response association, spanning a 7-day lag period with three equally spaced knots on a logarithmic scale [8]. Other potential confounding factors were also addressed by incorporating relative humidity, modeled as a natural cubic B-spline with three degrees of freedom, and by accounting for national public holidays.
In the second stage, a multivariate meta-analysis based on a random-effects model was used to pool the prefecture-specific estimates from the first-stage analysis [9]. We included the prefecture-specific latitude in the model as meta-predictor to investigate heterogenicity. To quantify heterogeneity, we calculated the I^2^ statistics and performed the Cochran's Q test. We estimated the relative risks (RRs) and corresponding 95% confidence intervals (CIs) using the median maximum temperature, i.e., 29.3°C, as the reference value. The two-tailed *p-*values less than 0.05 were considered statistically significant. All statistical analyses were performed using R version 4.1.0 (R Foundation for Statistical Computing, Vienna, Austria) with the packages “gnm”, “dlnm”, and “mixmeta”. The R code for analysing the example is available in the repository of Zenodo.
Fig. 1 illustrates the overall cumulative exposure-response curves between maximum temperature and HSAD across for four major Japanese prefectures. Overall, a J-shaped association was evident in all prefectures, including Hokkaido, Tokyo, Osaka, and Fukuoka, with elevated RR values for heat temperatures. Notably, Tokyo and Osaka demonstrated exceptionally high RR values. Conversely, in Fukuoka, the RR for heat temperatures showed a tendency to decrease at maximum temperatures exceedingly approximately 35°C. The corresponding graphs for all 47 Japanese prefectures are reported in the supplementary material (Fig. S1).Fig. 1. Pooled estimates of overall lag-cumulative associations between maximum temperature and HSAD with 95% confidence intervals (shaded grey) for A: Hokkaido, B: Tokyo, C: Osaka, and D: Fukuoka. Centred at the median value of maximum temperature at 29.3°C. Abbreviations: RR, relative risk; HSAD, heatstroke-related ambulance dispatches.Fig 1
Fig. 2 depicts the pooled estimates of overall cumulative exposure-response relationship between maximum temperature and HSAD over a 7-day lag period for all 47 Japanese prefectures. A J-shaped association was also evident nationwide. At the 95^th^ percentile of the maximum temperature distribution (i.e., 37.0°C), the heat effect yielded an RR of 9.7 (95% CI: 7.8–12.1) compared to 29.3°C (left panel in Fig. 2). A meta-analysis revealed substantial heterogeneity in the maximum temperature-HSAD associations across prefectures, with an I² of 82.2% and a highly significant Cochran's Q test (p-value < 0.001). To investigate this variability, latitude was included as a predictor in multivariate meta-regressions (Table 2). In a univariable model, latitude emerged as a significant modifier of the maximum temperature-HSAD associations, with northern prefectures showing significantly higher risks at elevated maximum temperatures (right panel in Fig. 2). The likelihood ratio test confirmed that latitude accounted for the observed heterogeneity in HSAD across prefectures (p-value < 0.001).Fig. 2. Pooled estimates of overall lag-cumulative associations between maximum temperature and HSAD with 95% confidence intervals in all 47 Japanese prefectures. Notes: The panel A displays the average heat-HSAD curve derived from the multivariate meta-analysis, incorporating only intercepts and no predictors. The panel B illustrates the effect modification by latitude, as predicted from the full multivariate meta-regression, based on the 10^th^ to 90^th^ percentile values of the prefecture-specific meta-variable. Abbreviations: RR, relative risk; HSAD, heatstroke-related ambulance dispatches.Fig 2. Table 2Random-effects meta-regression models in the second-stage analysis.Table 2. Meta-predictorAICBICI^2^ statistics (%)Cochran's Q test (p-value)LR test (p-value)Intercept only560.6606.079.6< 0.0001Latitude 525.3583.667.3< 0.0001< 0.0001Abbreviations: AIC, Akaike information criterion; BIC, Bayesian information criterion; LR, likelihood ratio.
This case study demonstrated the applicability of the time-series datasets to estimate the exposure-response association between environmental stressors and health outcomes using a two-stage time-stratified case-crossover design. Furthermore, the availability of time-series data from multiple geographical locations also enabled the exploration of geographical differences by pooling location-specific associations within geographical regions. Recent advancements in statistical techniques for time-series analysis in environmental epidemiology have led to the creation of various evaluation frameworks. For instance, the time-stratified design in this study can be expanded into a space-time-stratified case-crossover design, allowing multilevel data to be analysed in a single step as an alternative to the conventional two-stage design [2,10]. In contrast, when individual exposure measurements are available, the recently introduced case-time-series design offers a flexible and computationally efficient approach for addressing temporal dependencies while avoiding potential biases from between-area comparisons [11]. Traditionally, spatial and spatio-temporal modelling analyses have utilized Bayesian hierarchical models, such as the integrated nested Laplace approximation [12]. Although these methods provide a rigorous framework for accounting for spatial correlations and geographically varying risks, their high computational requirements often limit their applicability to large datasets or complex relationships. Selecting the appropriate statistical methodology based on the characteristics of the time-series data set and evaluating its applicability is essential for researchers. Nonetheless, simulation studies remain necessary to investigate the performance and suitability of various statistical models.
Limitations
To our knowledge, this study is the first to model the maximum temperature-HSAD associations between in Japan. However, several limitations should be acknowledged. First, the time-series analysis was conducted within the framework of an ecological study, which inherently limits causal inference. Second, the study's scope was confined to 47 Japanese prefectures, which restricts the generalizability of the findings to other regions or countries. Third, the lack of individual-level data prevented stratified analyses by age, sex, illness severity, or incident place, thereby precluding a comprehensive assessment of vulnerability among subgroups.
Ethics Statement
The author has read and follow the ethical requirements for publica on in Data in Brief and confirming that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.
CRediT Author Statement
Keita Wagatsuma: Conceptualization, Writing – original draft, Software, Methodology.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Bhaskaran K.Gasparrini A.Hajat S.Smeeth L.Armstrong B.Time series regression studies in environmental epidemiology Int. J. Epidemiol.4220131187119510.1093/ije/dyt 09223760528 PMC 3780998 · doi ↗ · pubmed ↗
- 2Tobias A.Kim Y.Madaniyazi L.Time-stratified case-crossover studies for aggregated data in environmental epidemiology: a tutorial Int. J. Epidemiol.53202410.1093/ije/dyae 020dyae 020PMC 1087975138380445 · doi ↗ · pubmed ↗
- 3Sera F.Gasparrini A.Extended two-stage designs for environmental research Environ. Health 2120224110.1186/s 12940-022-00853-z 35436963 PMC 9017054 · doi ↗ · pubmed ↗
- 4Hatakeyama K.Ota J.Takahashi Y.Kawamitsu S.Seposo X.Effect of the COVID-19 pandemic on heatstroke-related ambulance dispatch in the 47 prefectures of Japan Sci. Total Environ.768202114517610.1016/j.scitotenv.2021.145176 PMC 975255933736302 · doi ↗ · pubmed ↗
- 5Seposo X.Madaniyazi L.Ng C.F.S.Hashizume M.Honda Y.COVID-19 pandemic modifies temperature and heat-related illness ambulance transport association in Japan: a nationwide observational study Environ. Health 20202112210.1186/s 12940-021-00808-w 34857008 PMC 8637525 · doi ↗ · pubmed ↗
- 6Hatakeyama K.Seposo X.Heatstroke-related ambulance dispatch risk before and during COVID-19 pandemic: subgroup analysis by age, severity, and incident place Sci. Total Environ.821202215331010.1016/j.scitotenv.2022.153310 PMC 878465135085629 · doi ↗ · pubmed ↗
- 7Gasparrinia A.Armstrong B.Kenward M.G.Distributed lag non-linear models Stat. Med.2920102224223410.1002/sim.394020812303 PMC 2998707 · doi ↗ · pubmed ↗
- 8Tobías A.Madaniyazi L.Gasparrini A.Armstrong B.High summer temperatures and heat stroke mortality in Spain Epidemiology 34202389289610.1097/EDE.000000000000166137757878 · doi ↗ · pubmed ↗
