Development of a comprehensive dietary dataset for idiopathic granulomatous mastitis (IGM) patients and matched controls: protocol, implementation, and future directions for nutrition-based research
Sadaf Alipour, Sakineh Shab-Bidar, Bita Eslami, Ramesh Omranipour, Shiller HessamiAzar, Marzieh Orouji, Ramin Mansouri, Sara Alipour, Kiana Kimiaei-Asadi, Reyhaneh Aghajani

TL;DR
This study creates a detailed dietary dataset for idiopathic granulomatous mastitis patients and controls to explore how diet might influence the disease.
Contribution
The novel contribution is the development of a comprehensive dietary dataset for IGM patients and matched controls in Iran.
Findings
A dataset with 430,572 data points was generated from 608 IGM patients and 568 controls across 36 Iranian cities.
Dietary intake was assessed using a validated 147-item FFQ and Nutritionist IV software including traditional Iranian foods.
The dataset includes demographic, hormonal, and clinical variables alongside dietary patterns for future IGM research.
Abstract
Idiopathic granulomatous mastitis (IGM) is a rare chronic inflammatory disease with unclear etiology and substantial morbidity. While immunological, hormonal, and infectious triggers have been proposed, the role of dietary factors in IGM pathogenesis remains underexplored despite emerging associations between diet and inflammatory conditions. This study describes the development of a comprehensive dataset of dietary intake and nondietary variables from IGM patients and matched controls, enabling future research into potential diet–IGM associations. Between 2022 and 2024, a multicenter case‒control study enrolled 608 female patients with histologically confirmed IGM and 568 matched controls. Data included demographic, reproductive, and anthropometric measurements, hormonal profiles, and detailed dietary intake data evaluated via a validated 147-item food frequency questionnaire (FFQ).…
- —Deputy of Research of Tehran University of Medical Sciences, Tehran, Iran
Introduction
Idiopathic granulomatous mastitis (IGM) is a chronic inflammatory disease of the breast that was recognized in the 1970s [1], but many uncertainties remain regarding its etiology, triggering factors, and treatment. IGM is rare in most parts of the world but is more common in some areas and populations, such as Turkey, Iran, China, India, and the Hispanic population of the USA [2].
In addition to its protracted course and the challenges of long-term management, the disease frequently imposes significant morbidity, manifesting as disfiguring breast lesions and persistent fistulas. Several hypotheses have been proposed for its cause, of which immunological alterations, infectious causes, and hormonal triggers are the most likely [3]. While aspects of a woman’s reproductive life, as well as underlying diseases and exogenous hazards such as smoking, have been frequently explored [3], abnormalities arising secondary to breastfeeding problems and milk stasis have recently been highlighted as the main initiating mechanism of the disease [4–6]. Nevertheless, even in this scenario, stasis may lead to leakage into breast tissue, initiating a cascade of immunologic or autoimmune reactions that cause tissue inflammation [7, 8].
However, dietary factors have been largely overlooked. Although tissue inflammation and heightened immune responses are associated with nutrient intake [9, 10], research specifically examining these links in IGM remains limited. Existing studies have focused primarily on isolated dietary components, often based on anecdotal observations. For example, while one study suggested a connection between spicy food consumption and IGM [11], another found no such association [4]. Additionally, recent findings implicate food allergies in IGM pathogenesis [12]. Furthermore, differences in gut microbiota metabolites among IGM patients [13] suggest a potential dietary influence.
There is a substantial knowledge gap regarding the role of diet in IGM. To address this, we developed a comprehensive dataset of systematically collected dietary information from both IGM patients and a control group. This article outlines the dataset’s development protocol and highlights its key features.
Methods
Study design, settings, and participants
This project was approved by the Ethics Committee of Tehran University of Medical Sciences, Tehran, Iran (Ethics Code: IT.TUMS.IKHC.REC.1300.437). This was a multicenter case‒control study centered at the Breast Diseases Research Center (BDRC), Cancer Institute, TUMS, and was carried out from November 2022 to October 2024.
The case group (IGM group, IGG) comprised patients from breast and IGM clinics across Iran. The control group (CG) included women accompanying patients at these clinics, as well as women visiting dermatology, infectious disease, and pediatric clinics (mothers of sick children). These clinics were selected because their visitors belonged to the same healthcare settings as the cases, ensuring similar geographic, cultural, and socioeconomic backgrounds. They were also accessible and less likely to have diseases with dietary implications. Although every individual was asked about medical history and the other eligibility criteria before inclusion, we targeted a population that was more likely to fulfil the criteria. Participants in the CG were matched to IGG patients on age, sex (all female), and place of residence.
The IGM group (IGG) included female patients aged > 18 years with histologically confirmed IGM diagnosed within the past 6 months. The control group (CG) comprised women with no personal or first-degree family history of IGM. The exclusion criteria for both groups were liver/heart failure, diabetes, memory disorders, recent significant dietary changes (e.g., adoption of vegetarianism), a history of breast cancer, and current pregnancy or lactation.
Data collection
Personal and demographic information, as well as data about reproductive and past medical history, were collected through interviews conducted by trained staff, and anthropometric variables were measured for all participants. The validated Persian-language version of a comprehensive 147-item food frequency questionnaire (FFQ) [14] was completed in detail. Previous studies employing this FFQ have demonstrated acceptable validity and reliability [14]. The questionnaire presents each food item with a predefined standard portion size, and consumption frequency is recorded on a daily, weekly, monthly, or yearly scale. The participants in the two groups were asked to report their consumption frequency for each item based on their intake patterns during the last 12 months. The English version of the 147-item FFQ is shown in Additional File 1.
An electronic online version of both questionnaires (diet and non-diet data) was developed to facilitate the interviews and enhance precision. The interviewers completed the online questionnaire, and the output was directly received at the project base (BDRC).
After completion of the FFQs, reported consumption amounts for each food item were converted to grams via standardized measurement guides [15]. Nutrient intake values were then calculated for each participant via Nutritionist IV software (First Databank, San Bruno, CA, USA), which utilizes the USDA Food Composition Database supplemented with traditional Iranian food items where applicable [16].
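The frequency-to-grams step can be illustrated with a minimal sketch. It is not the study's actual procedure: the function name `grams_per_day`, the 30-day month, the 365-day year, and the 70 g portion are all assumptions made for illustration, and the real portion weights come from the measurement guides cited above [15].

```python
# Assumed number of days covered by each FFQ time scale.
# The 30-day month and 365-day year are illustrative choices, not from the paper.
DAYS_PER_SCALE = {"daily": 1, "weekly": 7, "monthly": 30, "yearly": 365}


def grams_per_day(times: float, scale: str, portion_grams: float) -> float:
    """Convert 'times per scale' of one standard portion to grams per day."""
    if times < 0 or portion_grams < 0:
        raise ValueError("times and portion_grams must be non-negative")
    if scale not in DAYS_PER_SCALE:
        raise ValueError(f"unknown time scale: {scale!r}")
    return times * portion_grams / DAYS_PER_SCALE[scale]


# Hypothetical item: a 70 g standard portion eaten three times per week.
example = grams_per_day(3, "weekly", 70.0)
```

Daily gram amounts of this kind are what a nutrient analysis program would then map to nutrient values through a food composition table.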
All the collected data together with the results of the nutrient value computations were entered into a statistical database platform specifically designed for this study. To verify the accuracy of data entry, a randomly chosen 10% of the data entered in each group were rechecked with their original forms and pertinent calculations.
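A per-group random recheck of this kind could be drawn as in the sketch below. It assumes that the sampling unit is the participant record, that the sample size is rounded up, and that records carry simple integer identifiers; the function name `recheck_sample` and the seed are hypothetical and are not taken from the study.

```python
import random


def recheck_sample(record_ids, percent=10, seed=None):
    """Draw a random sample of about `percent`% of the records, without replacement."""
    ids = list(record_ids)
    # Integer ceiling division, so that small groups still get at least one record.
    k = (len(ids) * percent + 99) // 100
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    return sorted(rng.sample(ids, k))


# Group sizes from the Results; the identifiers 1..n are placeholders.
groups = {"IGG": range(1, 609), "CG": range(1, 569)}
to_recheck = {name: recheck_sample(ids, seed=2024) for name, ids in groups.items()}
```

Sampling within each group separately, rather than from the pooled records, keeps the checked fraction equal in cases and controls.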
Results
The participants were recruited from 36 cities in Iran. After the exclusion of 10 participants from the IGG due to incomplete data, 608 and 568 participants were ultimately included in the IGG and CG, respectively. The list of nondiet variables of the dataset is shown in Table 1.
Table 1. Nondiet variables recorded in the dataset for participants in the two groups

| Row | Variable | IGM group | Control group |
|---|---|---|---|
| A | Demographic | | |
| 1 | Patient age | ✳ | ✳ |
| 2 | City of birth | ✳ | ✳ |
| 3 | Province of birth | ✳ | ✳ |
| 4 | City of residence | ✳ | ✳ |
| 5 | Province of residence | ✳ | ✳ |
| 6 | Education | ✳ | ✳ |
| 7 | Job | ✳ | ✳ |
| B | Reproductive | | |
| 8 | Age at menarche | ✳ | ✳ |
| 9 | Gravidity | ✳ | ✳ |
| 10 | Parity | ✳ | ✳ |
| 11 | Abortion Hx | ✳ | ✳ |
| 12 | Age at 1st pregnancy | ✳ | ✳ |
| 13 | BF duration* | ✳ | ✳ |
| 14 | Dominant BF side | ✳ | ✳ |
| 15 | Hx of Inf | ✳ | ✳ |
| 16 | Hx of Inf treatment | ✳ | ✳ |
| C | Anthropometrics | | |
| 17 | Weight | ✳ | ✳ |
| 18 | Height | ✳ | ✳ |
| 19 | Neck circumference** | ✳ | ✳ |
| 20 | Waist circumference** | ✳ | ✳ |
| 21 | Hip circumference** | ✳ | ✳ |
| D | Hormone use | | |
| 22 | Hx of OCP use | ✳ | ✳ |
| 23 | Duration of OCP use | ✳ | ✳ |
| 24 | Last OCP use | ✳ | ✳ |
| 25 | Hx of HRT | ✳ | ✳ |
| 26 | Duration of HRT | ✳ | ✳ |
| 27 | Last HRT | ✳ | ✳ |
| E | Disease | | |
| 28 | Hx of previous diseases | ✳ | ✳ |
| 29 | Type of IGM treatment | ✳ | |
| 30 | Recent IGM side | ✳ | |
| 31 | Specific comments | ✳ | ✳ |

BF breastfeeding, HRT hormone replacement therapy, Hx history, Inf infertility
* The sum of all periods of lactation
** Considered in participants who opted for the measurements
During the project, feedback indicated participant reluctance regarding neck, waist, and hip circumference measurements. Consequently, these measurements were made optional, resulting in substantial missing data for these variables (ranging from one-third to half of the cases, as shown in Table 1). This limitation was deemed acceptable, and participants were not excluded solely because of missing anthropometric data.
The variables were categorized as necessary, important, or useful (Table 2). Participants were excluded if they were missing data for more than one necessary variable, more than two important variables, or more than three useful variables. Since participants in both groups were selected based on their willingness and ability to complete dietary assessments, missing diet-related data were minimal.
Table 2. Classification of nondiet variables according to importance in the study

| Important | Necessary | Useful | Other* |
|---|---|---|---|
| Patient age | Age at menarche | City of birth | Neck circumference |
| Gravidity | Abortion Hx | Province of birth | Waist circumference |
| Parity | Age at 1st pregnancy | City of residence | Hip circumference |
| Weight | Hx of Inf | Province of residence | |
| Height | Duration of OCP use | Education | |
| Hx of previous diseases | Last OCP use | Job | |
| BF duration | Dominant BF side | Hx of HRT | |
| Hx of OCP use | Hx of Inf treatment | Duration of HRT | |
| | | Last HRT | |
| | | Type of IGM treatment | |
| | | Recent IGM side | |

BF breastfeeding, HRT hormone replacement therapy, Hx history, Inf infertility
* Variables that were only considered in participants who opted for the measurements and are missing in a large portion of participants
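The missing-data exclusion rule can be expressed as a short check. The sketch below is only an illustration: it assumes missing values are stored as `None`, the function name `is_excluded` and the snake_case variable names are hypothetical, and the class lists are a small subset of Table 2 rather than the full assignment.

```python
# Maximum number of missing variables tolerated per class (from the text above).
MAX_MISSING = {"necessary": 1, "important": 2, "useful": 3}

# Illustrative subset of Table 2; the names and the subset are assumptions.
CLASSES = {
    "necessary": ["age_at_menarche", "abortion_hx", "hx_of_inf"],
    "important": ["patient_age", "gravidity", "parity"],
    "useful": ["city_of_birth", "education", "job", "hx_of_hrt"],
}


def is_excluded(record, classes=CLASSES, limits=MAX_MISSING):
    """Return True if any class has more missing variables than its limit."""
    for cls, limit in limits.items():
        missing = sum(1 for var in classes.get(cls, ()) if record.get(var) is None)
        if missing > limit:
            return True
    return False


complete = {var: 1 for names in CLASSES.values() for var in names}
# Two missing necessary variables exceed the limit of one.
two_necessary_missing = {**complete, "age_at_menarche": None, "abortion_hx": None}
# Two missing important variables are still within the limit of two.
two_important_missing = {**complete, "gravidity": None, "parity": None}
```

Here `is_excluded(two_necessary_missing)` is true and `is_excluded(two_important_missing)` is false. The optional circumference measurements are deliberately left out of the check, in line with the decision not to exclude participants solely for missing anthropometric data.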
Overall, counting all the data, including general and system-related data (such as participant codes, durations of interviews, and participant and physician names), 430,572 data points were created and finalized for this project.
Discussion
We conducted a project to develop a comprehensive dataset of dietary information for IGM patients and a control group. The methods used for data collection and cleaning and the features of the dataset are presented in this manuscript.
Dietary composition is one of the most important modifiable environmental factors affecting human health. As a result, the relationships between diet and various diseases, including specific categories such as malignancies and inflammatory conditions, have been extensively studied. The association between breast cancer and nutrition has been widely explored, and benign breast disease (BBD) has also been investigated in this context. Aghababayan et al. compared dietary inflammatory index scores [17] and phytochemical indices [18] between women with and without BBD and reported a slightly positive association for the former and a negative association for the latter. Similarly, Tiznobeyk et al. [19] reported an inverse relationship between a healthy dietary pattern and BBD. More recently, Rastad et al. [20] examined multiple aspects of dietary habits in BBD. However, none of these studies included IGM. From a broader perspective, the link between diet and inflammatory conditions has been well established through numerous studies [9, 10, 21, 22].
The inflammatory nature of IGM, its probable link to autoimmunity, and the triggering effect of milk stasis in a subset of women collectively suggest that the association with dietary factors may be more substantial for IGM. Nevertheless, this association remains obscure because studies on the subject are scarce. Yurdacan et al. [12] compared serum IgG antibodies against 54 food allergens in 32 IGM patients and their matched controls. They reported significantly greater intolerance to lentils and curry in IGM patients and concluded that these factors may contribute to IGM pathogenesis. Among many other potential contributing factors, Zeng et al. [4] compared the consumption of spicy food in 594 cases and controls and reported no significant difference between the groups. Deng et al. [11] implemented a case management model in 152 patients with IGM to evaluate the factors affecting disease recurrence. One of the recommendations in the model was to abstain from “exciting” food, which comprised spicy foods and other items. While they reported that approximately 14.5% of the patients had consumed these items before they were affected, they did not examine whether adherence to the food protocol affected recurrence rates.
A link between the gut microbiota and breast inflammation has been reported in animal studies [23, 24], but a probable link with IGM has been considered in only one study. Dai et al. [13] compared the characteristics of gut microbiota metabolites (namely, short-chain fatty acids) in stool samples from 35 IGM patients and 26 healthy women. They found significant differences between the two groups, suggesting an association between gut microbiota dysbiosis and IGM. Although diet was not directly examined, these findings imply a dietary relationship, given the known association between the type of food intake and the gut microbiota [25, 26].
These preliminary findings highlight a profound knowledge gap, warranting systematic investigation into possible associations between dietary factors and IGM. Such an investigation depends on detailed data on the dietary intake of IGM patients and on comparison with a control group selected with precise criteria. Such comprehensive data were not available until now, and our project provides this essential foundation. We present a dietary dataset that covers 147 food items and quantifies consumption patterns in a substantial number of IGM patients and a rigorously selected control group. This unique resource lays the groundwork for effective investigations into the associations between diet and IGM. In addition, related factors, including BMI components and reproductive and hormonal variables, are included in the dataset, enabling adjustment for confounding factors.
To date, physicians managing IGM patients have either avoided dietary instructions or provided recommendations based on personal opinions. On the basis of the present dataset, authentic research may yield validated dietary advice for IGM patients instead of the current speculative approaches.
The present dataset represents the first dedicated dietary resource for IGM research. Key strengths of this study include the consistent data collection process, the use of a validated tool, and its clinical relevance.
Although we planned to attract international collaboration, we have not yet succeeded in this endeavor. However, our next planned projects will assess the associations between IGM and the dietary inflammatory index, vitamin D, B-group vitamins, and the healthy eating index. This dataset also opens avenues for additional research, and international collaborations would pave the way toward evidence-based dietary guidelines for IGM management.
Limitations
This study had some limitations. As with most research involving food frequency questionnaires, the potential for recall bias exists. Our choice of controls (companions of patients and outpatient visitors of selected clinics) may not fully eliminate the risk of selection bias. While community-based random sampling could theoretically reduce selection bias, this approach would have been logistically challenging in a multicenter study of this scale. Thus, the selected control group represented a practical and appropriate comparison population, according to our rationale explained in the Methods. Additionally, our population is limited to Iran, which may affect the generalizability of the findings to other populations.
Conclusion
This study presents the process of developing a large dietary dataset for IGM patients and a rigorously selected control group. It includes quantitative data on 147 food items collected through a validated tool, together with other related variables collected via standardized protocols. The dataset lays the groundwork for research on the associations between IGM and dietary details and may support the development of dietary guidelines for IGM patients.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary Material 1.
