Latvian herbal medicines under the infrared lens: An FTIR-ATR dataset
Una Lote Vītoliņa, Jurģis Krūmiņš, Renāte Teterovska, Ance Bārzdiņa, Baiba Mauriņa, Dace Bandere, Agnese Brangule

TL;DR
This paper presents a dataset of infrared spectral fingerprints for various Latvian medicinal plants, enabling identification and quality control.
Contribution
The novel contribution is a comprehensive FTIR-ATR dataset for Latvian medicinal plants, facilitating chemical analysis and species classification.
Findings
The dataset includes 329 spectra from 19 medicinal plant species, suitable for chemical characterization and quality control.
Minimal sample preparation was used, involving only grinding the plants into fine powder.
The dataset supports multivariate analysis and chemometrics for spectroscopic data handling.
Abstract
As more individuals use plant-derived products as prophylactic and therapeutic remedies, the herbal medicine market continues to grow rapidly. This trend shows the need for more advanced analytical methods, such as spectral “fingerprinting” – creating spectrum with a distinctive pattern, which is further used as a unique identifier for a plant. The dataset was generated using Fourier-transform infrared spectroscopy with attenuated total reflectance (FTIR-ATR) to produce distinctive spectral fingerprints of Latvian medicinal plants: red and white clover, purple coneflower, cowslip, common jasmine, sandy everlasting, horse chestnut, pot marigold, chamomile, coltsfoot, common daisy, elderberry, cornflower, Jerusalem artichoke, fireweed, common and Midland hawthorn, small-leaved and large-leaved lime, yarrow, and meadowsweet. The samples were obtained from ten Latvian herbal medicine…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpectroscopy and Chemometric Analyses · Spectroscopy Techniques in Biomedical and Chemical Research · Metabolomics and Mass Spectrometry Studies
Specifications TableSubjectHealth Sciences, Medical Sciences & PharmacologySpecific subject areaThe study applies a non-destructive method, FTIR-ATR Spectroscopy, to obtain fingerprint spectra of Latvian medicinal plants.Type of dataPre-processedTable (.csv format)Data collectionThe dataset consists of 19 files (.csv format) containing 329 FTIR-ATR spectra and one table (.csv format) providing a summary of the names of medicinal plants in English, Latin, and Latvian.Herbal samples were selected from ten different Latvian producers, and one sample set was collected by researchers.FTIR-ATR spectra were taken directly with Thermo Fisher Scientific Nicolet iS5 with a diamond crystal, recorded from 500 to 3800 cm^−1^ with 16 scans and a resolution of 4 cm^−1^, optical velocity of 0.4747, and aperture of 100 %.Each plant sample has at least four replications.Baseline correction and normalization were done in SpectraGryph 1.2.16.1 [1] to the most intense band in 850–1850 cm⁻¹.Data source locationCountry: LatviaAffiliation: Riga Stradins University, Faculty of Pharmacy, LV-1007, Riga, LatviaData accessibilityRepository name: ZenodoData identification number: 10.5281/zenodo.16892975 [2]Direct URL to data: https://doi.org/10.5281/zenodo.16892975Related research articleNone
Value of the Data
1
- • Direct usability across software environments and training set development for machine learning FTIR-ATR spectra are provided in CSV format, making them directly compatible with statistical and chemometric software (e.g., ORIGIN, SIMCA). These data are well-suited for building training sets in machine learning applications such as plant authentication, classification, and quality control, thereby reducing the need for extensive new data collection.
- • Chemical fingerprinting for herbal medicine characterization The spectra can serve as chemical fingerprints for herbal medicine characterization, and combined with other analytical methods, such as high-performance liquid chromatography (HPLC), and thin-layer chromatography (TLC), they provide a more comprehensive view of the plant’s chemical composition, which can be used to confirm the herbal medicines authenticity and formulation.
- • Non-destructive analysis enabling sample preservation As a non-destructive technique, FTIR-ATR offers a broad overview of plant chemical composition, requiring only a small sample amount while leaving the material unaltered. This is important because the samples remain available for further analysis, verification, or future use.
- • Applications in phytochemical profiling The dataset supports advanced phytochemical profiling and authenticity assessment in herbal-derived products. Representing medicinal plants from Latvia, it provides a basis for comparing chemical profiles with those from other regions, as plant composition can vary with species, growth location, age, harvesting season, drying conditions, and other factors. This collection can therefore serve as a valuable reference database for comparative analysis.
Background
2
The use of herbal medicine continues to grow across Europe – the market was worth USD 65 billion in 2024 and is projected to reach USD 110 billion by 2033 [3]. This trend is driven by a combination of historical traditions and modern consumer preferences.
Despite their natural origin, herbal medicines can pose higher risks of adverse effects when contaminated or sourced from unknown origins [[4], [5], [6]]. With market growth, counterfeit and impure herbal medicines are increasingly prevalent [[7], [8], [9]]. For example, a study of herbal products sold for addiction treatment in Iran 74 % contained illegal substances [10].
The World Health Organization, the Food and Drug Administration of the USA, and the European Medicines Agency have approved spectral fingerprinting as a valid method for herbal medicine quality screening [[11], [12], [13]]. With large amounts of data, correlations between specific species can be found and the entire FTIR-ATR spectrum can be used as a fingerprint [14]. Moreover, combining FTIR-ATR with HPLC and TLC, a more comprehensive view can be given about the medicinal plants' chemical composition [15].
The selected medicinal herbs are well-known in traditional Latvian folk medicine for their therapeutic properties and are still used to this day [16].
Data Description
3
The dataset consists of 19 CSV files containing 329 FTIR-ATR spectra of Latvian medicinal plants, acquired in the spectral range of 500–3800 cm⁻¹.
For this study, products from ten different Latvian herbal medicine producers were selected, including two biologically certified producers (Producer no 2 and no 7). In addition, one sample set was collected directly by researchers to serve as a reference from a natural source.
The dried tea samples were purchased from reputable Latvian herbal medicine producers who collected the plants in 2019–2020 at their recommended harvest age according to the European Pharmacopoeia [17].
Table 1 provides a summary of the names of medicinal plants in English, Latin, and Latvian and part used.Table 1. List of medicinal plant species in Latin, English, and Latvian and part used.Table 1. Part usedLatinEnglishLatvianFlowerAchillea millefolium L.YarrowParastais pelašķisAesculus hippocastanum L.Horse chestnutParastā zirgkastaņaBellis perennis L.Common daisyIlggadīgā mārpuķīteCalendula officinalis L.Pot marigoldĀrstniecības kliņģerīteCentaurea cyanus L.CornflowerZilā rudzupuķeChamaenerion angustifolium (L.) SchurFireweedŠaurlapu ugunspuķeEchinacea purpurea L.Purple coneflowerPurpursarkanā ehinācijaHelianthus tuberosus L.Jerusalem artichokeBumbuļu topinambūrsHelicrysum arenarium MoenchSandy everlastingDzeltenā salmene (kaķpēdiņa)Jasminum officinale L.Common jasmineJasmīnsMatricaria recutita L.(syn. M. chamomilla L.)ChamomileSmaržīgā (ārstniecības) kumelītePrimula veris L.(syn. P. officinalis Hill)CowslipGaiļbiksīteSambucus nigra L.ElderberryMelnais plūškoksTilia cordata Mill.(syn. T. parvifolia Ehrh.)Small-leaved lime (linden)Parastā (ziemas) liepaTilia platyphyllos Scop.Large-leaved lime (linden)Platlapu (vasaras) liepaTrifolium pratense L.Red cloverSarkanais āboliņšTrifolium repens L.White cloverBaltais āboliņšFruitCrataegus laevigata (Poir.) DC.(syn. C. oxyacantha L.)Midland hawthornNogludinātā (divirbuļu) vilkābele (krustābele)Crataegus monogyna Jacq.Common hawthornVienirbuļa vilkābele (krustābele)LeafTussilago farfara L.ColtsfootMāllēpeFlower, leaf, stemFilipendula ulmaria (L.) Maxim.(syn. Spiraea ulmaria L.)MeadowsweetParastā vīgrieze
File names follow standardized format: Plant name (in both English and Latin)Producer_Measurement replicate number. For example, White clover_Trifolium repens_Pro1_1. Each producer is assigned with a unique identifier, such as Pro1, Pro2, and so on. All file names and their number of replicates and total samples are listed in Table 2.Table 2. List of CSV file names, producers, number of replications and total samples.Table 2. File nameProducersMinimal number of replicatesNumber of spectra for the medicinal plant1.CSV_FTIR_ATR_Red clover_Trifolium pratensePro1, Pro25112.CSV_FTIR_ATR_White clover_Trifolium repensPro1, Pro25113.CSV_FTIR_ATR_Purple coneflower_Echinacea purpureaPro2, Pro35114.CSV_FTIR_ATR_Cowslip_Primula verisPro1, Pro2, Pro35155.CSV_FTIR_ATR_Common jasmine_Jasminum officinalePro1556.CVS_FTIR_ATR_Sandy everlasting_Helicrysum arenariumPro3, Pro45107.CSV_FTIR_ATR_Horse chestnut_Aesculus hippocastanumPro1, Pro25108.CSV_FTIR_ATR_Pot marigold_Calendula officinalisPro1, Pro3, Pro4, Pro55209.CSV_FTIR_ATR_Chamomile_Matricaria recutitaPro1, Pro2, Pro3, Pro4, Pro6, Pro7, Pro854010.CSV_FTIR_ATR_Coltsfoot_Tussilago farfaraPro1, Pro351011.CSV_FTIR_ATR_Common daisy_Bellis perennisPro25512.CSV_FTIR_ATR_Elderberry_Sambucus nigraPro15513.CSV_FTIR_ATR_Cornflower_Centaurea cyanusPro1, Pro2, Pro951614.CSV_FTIR_ATR_Jerusalem artichoke_Helianthus tuberosusPro1, Pro251015.CSV_FTIR_ATR_Fireweed_Chamaenerion angustifoliumPro1, Pro2, Pro351516.CSV_FTIR_ATR_Hawthorn fruit_Fructus CrataegiPro15517.CSV_FTIR_ATR_Linden flower_Flos TiliaePro1, Pro2, Pro3, Pro452218.CSV_FTIR_ATR_Yarrow_Achillea millefoliumPro1, Pro2, Pro3, Pro10,Pro11.1, Pro11.244519.CSV_FTIR_ATR Meadowsweet_Filipendula ulmariaPro1, Pro2, Pro3;Pro11.1, Pro11.3, Pro11.4563
Table 3 shows the approximate geographical coordinates of the medicinal plant collection sites.Table 3. Geographical coordinates of the producers harvest sites.Table 3. ProducersRegion in Latvia; Approximate coordinates WGS84Pro1Vidzeme, 57.03° N, 25.61° E,Pro2Vidzeme, 56.89° N, 26.29° EPro3Zemgale, 56.60° N, 23.30° EPro4Producer has not specifiedPro5Vidzeme, 57.31° N, 25.35° EPro6Vidzeme, 57.31 ° N, 25.27° EPro7Vidzeme, 57.31° N, 25.35° EPro8Zemgale, 56.81° N, 24.25° EPro9Vidzeme, 56.86 ° N, 24.23° EPro10Producer has not specifiedPro11Vidzeme, 57.01° N, 24.55° E
As shown in Fig. 1, the FTIR-ATR full spectra of a single medicinal herb are grouped into one file, which includes only the producers from whom researchers have collected the sample.Fig. 1FTIR-ATR spectra of medicinal plants: (A) Cornflower (Centaurea cyanus) with all the samples from Producer no 1, 2, and 9; (B) Yarrow (Achillea millefolium), Elderberry (Sambucus nigra) and Fireweed (Chamaenerion angustigolium) samples from Producer no.1 and Common daisy (Bellis perennis) sample from Producer no.2.Fig 1
The fingerprint region is one of the most important parts of the spectrum. The most intense signals are observed between 850 and 1850 cm^−1^, as shown in Fig. 2.Fig. 2. Fingerprint region of medicinal plant FTIR-ATR spectra: (A) Cornflower (Centaurea cyanus) Producer no.1, sample no.1; (B) Common daisy (Bellis perennis) Producer no.2, sample no 1.Fig 2
Experimental Design, Materials and Methods
4
Sample collection and preparation
4.1
Each producer (Pro1–Pro11) is associated with a specific geographical region in Latvia, and the approximate coordinates (WGS84) of these regions are provided in Table 3.
Herbal samples were selected from ten different Latvian producers, including two biologically certified producers (producer no 2 and no 7) in a form of dried tea; in addition, one sample was collected by the researchers and dried under varying conditions: 35 °C, 58 °C, 70 °C, and ambient room temperature.
Dried tea samples were purchased from reputable Latvian herbal medicine producers, with plants collected in 2019–2020 at the harvest age specified by the European Pharmacopoeia [17].
All dried plants were ground into a powder and sifted through a 2 mm sieve. The resulting powders were stored at room temperature for further analysis.
Instrument setup and spectral data acquisition
4.2
Attenuated total reflection (ATR) spectroscopy spectra of all herbal samples were taken with Thermo Fisher Scientific Nicolet iS5 with a diamond crystal. Spectra were recorded from 500 to 3800 cm^−1^ with 16 scans and a resolution of 4 cm^−1^, optical velocity of 0.4747, and aperture of 100 %.
The Thermo Scientific Nicolet iS5 FT-IR spectrometer used in this study maintains wavenumber accuracy through an internal He–Ne laser calibration system. Instrument performance and wavenumber precision were regularly validated using a standard polystyrene film according to the ASTM E1421 validation practices and European Pharmacopoeia (2.2.24) [18] guidelines, to ensure consistent wavenumber precision and signal quality during spectral acquisition (Thermo Fisher Scientific, 2016) [19]. All spectra were collected under stable environmental conditions (temperature 23±1 °C, controlled humidity) to minimize variation caused by atmospheric CO₂ and water vapor.
Spectral pre-processing
4.3
Baseline correction and normalization of the FTIR spectra were performed using the academic freeware SpectraGryph 1.2.16.1. The spectra were normalized to the most intense band within the fingerprint region (850–1850 cm⁻¹).
Limitations
As shown in Table 2, there is considerable variation in sample size among species (e.g., five spectra for Common jasmine vs. 63 for Meadowsweet), which may affect the statistical robustness of results for species with fewer samples. This imbalance partly reflects limited material availability and the small number of producers for certain plants. Consequently, results for these species should be interpreted with caution.
Ethics Statement
The authors confirm that they have read and comply with the ethical requirements for publication in Data in Brief. This work does not involve human subjects, animal experiments, or any data collected from social media platforms.
CRediT Author Statement
Una Lote Vītoliņa: Writing – original draft preparation, Data curation, Visualization; Jurģis Krūmiņš: Writing – original draft preparation; Renāte Teterovska: Writing – review and editing, Resources; Ance Bārzdiņa: Writing – review un editing, Resources; Baiba Mauriņa: Writing – review and editing, Resources; Dace Bandere: Writing – review and editing, Supervision, Resources; Agnese Brangule: Supervision, Conceptualization, Methodology, Writing – review and editing.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1F. Menges, Spectragryph v 1.2.16 [software], Spectragryph-optical spectroscopy software, June 20, 2022. https://www.effemm 2.de/spectragryph/index.html.
- 2U.L. Vītoliņa, J. Krūmiņš, A. Brangule, A. Bārzdiņa, R. Teterovska, D. Bandere, B. Maurina, Latvian herbal medicines under the infrared lens: an FTIR-ATR dataset, Zenodo, 2025. 10.5281/zenodo.16892975. · doi ↗
- 3Market data forecast, Europe herbal medicine market size, share, trends & growth forecast report. https://www.marketdataforecast.com/market-reports/europe-herbal-medicine-market, 2024 (accesssed 26 August 2025)
- 4Vahid B.Severe complications of herbal medicines Crit. Care Shock 1012007410
- 5Ernst E.Cardiovascular adverse effects of herbal medicines: a systematic review of the recent literature Can. J. Cardiol.197200381882712813616 · pubmed ↗
- 6Ernst E.Serious psychiatric and neurological adverse effects of herbal medicines – a systematic review Acta Psychiatr. Scand.1082003839110.1034/j.1600-0447.2003.00158.x 12823164 · doi ↗ · pubmed ↗
- 7Phan D.T.A.Kongkaew C.Heinrich M.Dao T.C.M.Vo T.H.From 'traditional' remedies to 'modern' supplements: a systematic review and meta-analysis of pharmaceutical adulteration in weight-loss natural products Front. Pharmacol.162025159497510.3389/fphar.2025.1594975 PMC 1211960340444036 · doi ↗ · pubmed ↗
- 8Crighton E.Coghlan M.L.Farrington R.Hoban C.L.Power M.W.P.Nash C.Maker G.Toxicological screening and DNA sequencing detects contamination and adulteration in regulated herbal medicines and supplements for diet, weight loss and cardiovascular health J. Pharm. Biomed. Anal.176201911283410.1016/j.jpba.2019.11283431472365 · doi ↗ · pubmed ↗
