Early detection of risks in child development in Spanish-speaking countries: content validity

Luis Felipe Llanos; María Martínez de Anguita

PMC · DOI:10.3389/fped.2025.1444104·March 5, 2025


Open Access

TL;DR

This study created a reliable screening tool for early detection of developmental risks in Spanish-speaking children to support timely interventions.

Contribution

A new culturally relevant screening scale with validated items for early developmental risk detection in Spanish-speaking primary education settings.

Findings

01

18 items with strong psychometric properties and content validity were identified through expert consensus.

02

The scale is designed to support early identification of developmental disorders in educational and clinical contexts.

03

Future validation in larger samples is needed to confirm reliability and generalizability.

Abstract

Early detection of developmental disorders such as ADHD, ASD, and LD is critical for improving educational outcomes and enabling timely interventions. This study aimed to develop a reliable, practical screening scale for Spanish-speaking children entering primary education, addressing gaps in early identification within the region. In total, 151 items were identified. Content validity was established using a qualitative methodology, with the help of 18 specialists in child therapy from various Spanish-speaking countries. The analysis centered on the relevance of each item. The validation process identified 18 items with strong psychometric properties. These items demonstrated high levels of agreement among experts and strong content validity, forming the foundation for a culturally relevant screening tool. The scale is designed to identify developmental risks early and support timely…

Linked entities

Diseases (3)

ADHD, ASD, developmental disorders

Tables (4)

Table 1. Items from the scales selected.

Scale	ADHD	ASD	LD	Total
CTRS-15	15			15
EDAH	10			10
VARS	18			18
CADBI	9			9
YAQ-I	8			8
ASQ-3		28		28
PEDS		8		8
Prodislex			30	30
Prodiscat			15	15
DDAMat			10	10
Total	60	36	55	151

Table 2. Suggested items identified from early detection tests in ADHD.

Item in English	Item in Spanish	Item source	Advising experts	Consensus among experts	X²	Average relevance
1. Constantly moves, is restless.	Se mueve constantemente, es intranquilo	EDAH	18	16	2	3.56
2. Easily distracted, shows limited attention span.	Se distrae fácilmente, muestra escasa atención	EDAH	18	16	2	3.56
3. Exhibits excessive motor activity.	Tiene excesiva inquietud motora	EDAH	18	17	1	3.50
4. Restless, always alert and in motion.	Es inquieto, siempre despierto y en movimiento	CTRS-15	18	17	1	3.50
5. Answers impulsively, even before hearing the complete question.	Responde precipitadamente, incluso antes de escuchar la pregunta completa	VARS	18	18	0	3.44
6. Cannot stay still.	No puede quedarse quieto	CTRS-15	18	18	0	3.38
7. Stands up in the classroom when expected to remain seated.	Se pone de pie en el aula cuando debiera permanecer sentado	VARS	18	16	2	3.38
8. Acts as if “driven by a motor” or seems “on the go” during classroom activities	Actúa como si “impulsado por un motor” o pareciera “en marcha” durante las actividades del aula	CADBI	18	16	2	3.38

Table 3. Suggested items identified from early detection tests in ASD.

Item in English	Item in Spanish	Item source	Advising experts	Consensus among experts	X²	P-value	Average relevance
1. Makes unusual sounds or noises when speaking.	Dice o emite sonidos al hablar	PEDS	18	15	3	0.000	2.69
2. Cannot walk on tiptoes for 4.5 m.	No puede caminar de puntillas 4.5 m	ASQ-3	18	15	3	0.000	2.56

Table 4. Suggested items identified from early detection tests in LD.

Item in English	Item in Spanish	Item source	Advising experts	Consensus among experts	X²	Average relevance
1. Has difficulty with mental calculations.	Tiene dificultad para el cálculo mental	DDAMat	17	15	2	3.50
2. Exhibits reading difficulties.	Presenta dificultades de lectura	Prodislex	18	15	3	3.50
3. Has trouble handling numbers and mathematical symbols.	Tiene dificultad para manejar números y símbolos matemáticos	DDAMat	18	17	1	3.44
4. Struggles to name mathematical quantities, numbers, symbols, and establish relationships.	Tiene dificultad para nombrar cantidades matemáticas, números, símbolos y establecer relaciones	DDAMat	18	15	3	3.38
5. Shows less reading fluency compared to the class group.	Tiene poca fluidez lectora en comparación con el grupo de clase	Prodiscat	18	15	3	3.38
6. Finds it difficult to interpret arithmetic operations.	Tiene dificultad para interpretar operaciones aritméticas	DDAMat	18	17	1	3.31
7. Struggles with solving problems that involve a certain degree of logical-mathematical reasoning.	Tiene dificultad en la resolución de problemas que impliquen cierto grado de razonamiento lógico-matemático	DDAMat	18	16	2	3.25
8. Omits or adds letters, syllables, or words (omissions and additions)	Omite o añade letras, sílabas o palabras (omisiones y adiciones)	Prodislex	18	15	3	3.25

Keywords

early detection; attention deficit; hyperactivity; autism; learning disorders


Taxonomy

Topics: Infant Development and Preterm Care · Child and Adolescent Psychosocial and Emotional Development · Child and Adolescent Health

Full text

Introduction

1

This study aimed to identify commonly used early detection questionnaires for ADHD, ASD, and LD that primary school teachers can administer without extensive training. We compiled a list of relevant scales for each disorder, which were then evaluated by a committee of experts. To ensure content validity, we selected scales carefully to avoid omitting essential dimensions or excessively including instruments.

Early detection of developmental risks is crucial for identifying biological, psychological, and social factors, as well as for recognizing developmental and learning disorders at an early stage (1–4). Such detection supports the implementation of targeted intervention programs, and the development of neuropsychological guidelines tailored to children's needs (5). In Spain, the National Institute of Statistics (6) reported that 16% of cases involving neuronal disorders do not receive early specialist care. Teachers play a key role in the early identification and referral of such cases, as noted by Mateu and Sanahuja (7).

Programs aimed at families and teachers to foster children's socialization and prosocial behavior have shown benefits, including improved self-esteem and reduced disruptive behaviors, emotional issues, hyperactivity, and aggression (8, 9). This highlights the importance of understanding how parents and teachers can contribute and what actions they should take. Engaging teachers in early detection has demonstrated significant improvements in subsequent treatments (10). Thus, equipping educators to address the needs of students with behavioral disorders is critical (11).

Neuropsychological evaluation plays a vital role in identifying neurodevelopmental disorders (NDD), employing specialized tools tailored to the child while considering their family and school environments (12). According to the DSM-V (2013), NDDs are categorized as (1) genetic disorders, such as Down syndrome or Rett syndrome; (2) environmental factors, like fetal alcohol syndrome; and (3) multifactorial conditions, including ADHD, autism spectrum disorders (ASD), and learning disorders (LD), which encompass intellectual disabilities, communication disorders, motor disorders, and tic disorders (13, 14).

NDDs affect approximately 20% of children and adolescents, with 4%–5% experiencing severe conditions (15). These children often encounter challenges such as fatigue, boredom, low self-esteem, motor control difficulties, and difficulties understanding instructions, all of which contribute to stress (12). Furthermore, DSM-classified mental disorders affect an estimated 12% of the general population, though many mental health issues in individuals under 18 remain underdiagnosed (2, 15).

Among NDDs, learning disorders (LD) are the most prevalent, affecting approximately 10% of school-aged children (16). ADHD is the second most common, impacting 3%–6% of this population (15, 17–19). Comorbidities are frequent, with individuals often exhibiting symptoms of multiple disorders (20). For instance, children with ASD show a significantly higher prevalence of comorbid disorders compared to their non-ASD siblings (21).

ADHD and the school

1.1

ADHD is a polygenic disorder with a neurological basis that chronically affects behavior, academic performance, and social relationships. According to the DSM-V criteria of the American Psychiatric Association (22), it primarily impacts three areas: attention, hyperactivity, and impulsivity. ADHD often co-occurs with other conditions, including oppositional defiant disorder, conduct disorder, depression, anxiety, and substance abuse. Due to its complexity, early diagnosis by medical, psychological, and educational professionals is essential (17).

Detection of ADHD typically involves neuropsychological evaluation and behavioral observation. Initial assessments often include psychometric tests using pencil-and-paper methods to evaluate attention and impulse control, establishing a baseline for subsequent interventions (23). ADHD symptoms generally emerge during preschool years and can significantly affect later academic performance. In Spain, there is a shortage of standardized scales adapted to the local language and culture for detecting ADHD in preschoolers, underscoring the importance of early detection for effective clinical and educational interventions (24).

Given its complexity, diagnosing ADHD requires a multidisciplinary approach. The process often begins with school-based detection through behavioral questionnaires and scales completed by parents and teachers. A diagnosis is confirmed when assessments converge, though discrepancies between evaluations are common and can complicate the process (25). Commonly used tools are behavior scales and questionnaires based on DSM-IV and DSM-V criteria, with training provided to parents and teachers to improve accuracy (26, 27).

ASD and the school

1.2

The diagnosis of autism spectrum disorder (ASD) is primarily clinical (28). The American Academy of Pediatrics recommends screening all young children at 9, 18, and 24–30 months to detect developmental delays (29). However, the complexity of ASD presents challenges for traditional psychometric instruments in accurately tracking developmental trajectories (12).

ASD assessments are generally recommended at 12 months, at 2 years, and again between 4 and 5 years. However, certain forms of ASD, such as Asperger syndrome, may not become evident until the child faces greater social demands, typically during the early years of primary school (30).

LD and the school

1.3

Learning disorders (LD) are neurological conditions that hinder children with normal intelligence from achieving academic success due to insufficient learning resources. These disorders affect skills such as reading, writing, arithmetic, and attention (31). LDs persist throughout life, with dyslexia and dyscalculia being the most common forms. Dyslexia, characterized by reading and writing difficulties, affects between 5% and 17% of the population (16).

Early detection and diagnosis facilitate the implementation of methodological intervention programs rather than content-based strategies (32). Proper guidance for families is essential, as addressing LD requires a multidisciplinary approach and collaboration among educational agents (16). This study focuses on students from various Spanish-speaking countries, with future research potentially extending findings to different cultural and linguistic contexts. However, as Griel and Elatia (33) suggest, changing the language of educational and psychological tests may interfere with results.

Dyscalculia, involving difficulties in mathematics, affects approximately 5% of the population. Notably, two-thirds of children with dyscalculia also present with another developmental disorder (16).

Debate between professional tests and early detection tools

1.4

There is considerable debate regarding the relevance of questionnaires and behavioral observation scales for detecting ADHD, given the extensive list of symptoms outlined in the DSM-V (26) and inconsistencies in results. School-administered questionnaires often reveal discrepancies between reports from teachers and family members (18). For example, in an ADHD screening test, interobserver agreement between parents and teachers was found to be low (Kappa = 0.28) (34). Similarly, neuropsychological tests conducted by professional examiners often fail to correlate with questionnaires completed by parents (35). Furthermore, neuropsychological tests alone may be insufficient to differentiate ADHD from other psychiatric disorders (36).

Early detection tools

1.5

Early Intervention (EI) aims to support children with developmental disorders or those at risk. According to the State Federation of Early Intervention Professional Associations (37), early detection is critical for maximizing these children's potential. EI is informed by theoretical models such as Bronfenbrenner's (38) general systems theory, Sameroff and Chandler's (39) transactional development approach, Guralnick's (40) evolutionary systems model, and King et al.'s (4) transdisciplinary theory. These models offer strategies, resources, and guidance for parents and environments to improve the functioning of both children and their families (41).

There is broad professional consensus that early detection and intervention significantly enhance overall prognosis, particularly in cases with a high risk of severe developmental outcomes (42).

Debate on the use of scales for teachers vs. parents

1.6

The natural diversity among individuals (43) presents challenges for teachers in accurately identifying students' needs. Teachers often easily recognize students with high cognitive abilities, focusing primarily on memorization and reproduction, but may undervalue students who exhibit creativity or defiance toward authority (43). Despite these difficulties, 15% of public-school students in the United States received special education services under the Individuals with Disabilities Education Act (IDEA) in the 2022–23 school year.

In Chile, attention and behavior questionnaires are more commonly used than neuropsychological tests, making them a prominent tool for identifying ADHD (44). Early detection questionnaires offer significant advantages in school settings: they are cost-effective and can be quickly administered by teachers (18). Research shows that teachers improve the accuracy and validity of their assessments with appropriate training (45, 46).

Educational institutions are increasingly providing support services during the early detection stages of learning disorders (47). The primary beneficiaries of these efforts are children who require special attention (1). Although symptoms vary across individuals, early detection remains essential to enable timely and effective interventions (48).

Methodology

2

To uphold the principle of content validity, the selection process ensured that the scales were neither too narrow—excluding important dimensions—nor excessively broad, including unnecessary instruments (49).

Instrument selection for early detection of ADHD

2.1

The selection of instruments for ADHD assessment was based on the scales identified in a study conducted in Chile by Carreño and Gatica (44). The authors identified the five most widely used questionnaires for ADHD evaluation: (a) Conners' Test (50), (b) Scale for the Assessment of Attention Deficit Hyperactivity Disorder (EDAH) (51), (c) Vanderbilt Assessment Rating Scale (VARS) for Parents and Teachers (52), (d) Child and Adolescent Disruptive Behavior Inventory (CADBI) – Teacher Report Version (53, 54), and (e) Young's ADHD Questionnaire (YAQ-I), informant version (36).

The selected ADHD tools

2.1.1

Teacher rating scale by Conners (CTRS-15). The scale is one of the most widely used tools for identifying childhood behavior problems, particularly ADHD. It has demonstrated high sensitivity and specificity, making it effective in distinguishing children with ADHD from those without it (55). The revised 15-item version (CTRS-15) by Purpura and Lonigan (56) selects five items from each subscale of the original CTRS-R. This streamlined version reduces the time required for teachers to complete it while retaining its ability to identify behavior problems. The 15 items are divided into three categories: 5 for inattention, 5 for hyperactivity/impulsivity, and 5 for oppositional behavior. Purpura and Lonigan (56) found the CTRS-15 to be psychometrically comparable to the original scale, while Gerhardstein et al. (57) confirmed its criterion validity, showing significant correlations with other ADHD measures.

Scale for the assessment of attention deficit hyperactivity disorder (EDAH). The scale is commonly used for evaluating ADHD in primary school students, with teachers completing it in 5–10 min. It includes five items each for hyperactivity and inattention, with the remaining items assessing behavioral problems. The EDAH enables structured teacher observations of a child's usual behavior, which are then analyzed to provide a global score and three standardized subscales (51).

Vanderbilt ADHD rating scales for parents and teachers (VARS). The scale has two versions, the Teacher (VADTRS) and the Parent (VADPRS) scales, designed for children and adolescents aged 6–12 years. These scales assess ADHD symptoms and other related behaviors. The VADTRS subscales cover inattention (items 1–9), hyperactivity (items 10–18), and additional areas such as disruptive behavior, anxiety-depression, academic performance, and school conduct. The factorial structure allows separate use of subscales for ADHD, conduct disorder (CD), and anxiety-depression disorder (58).

Child and adolescent disruptive behavior inventory (CADBI). The scale evaluates children and adolescents aged 3–18 years, with ratings provided by parents and teachers (53). Studies have validated its reliability and factorial structure across diverse samples from Brazil, Chile, Nepal, South Korea, Spain, Thailand, and the United States (59). CADBI comprises three subscales: behavior toward adults, behavior toward peers, and activity level at school. Each subscale includes 9 items related to ADHD. Its 8-point response scale facilitates ease of use, with scores ranging from 1 (“never in the last month”) to 8 (“10 or more times per day”).

Young ADHD questionnaire-I (YAQ-I). The scale has two versions: the self-reported YAQ-S and the informant-reported YAQ-I. Both versions include four subscales covering attention, hyperactivity, impulsivity, and emotional problems, with the YAQ-I adding 8 items specifically related to emotional issues (60, 61). The YAQ-I has demonstrated strong internal consistency across its subscales (36). Although most studies have focused on the YAQ-S version in adult students, results for this tool have been positive (62).

Excluded ADHD scales

2.1.2

Several notable scales were excluded despite their recognized utility in school settings. For instance, the SNAP-IV scale by Swanson, Nolan, and Pelham (63) was excluded due to significant correlations with the previously selected tools. Two studies demonstrated this overlap: a Brazilian study comparing SNAP-IV with Conners' Scale (64) and a Taipei study comparing SNAP-IV with both the Conners' and Vanderbilt Scales (65).

The Barkley ADHD Rating Scale (66), another robust instrument for assessing ADHD symptoms, was excluded due to potential cultural discrepancies in certain items or concepts. While it has shown convergent results with Conners' Rating Scale, ongoing research aims to improve its cultural sensitivity (67).

Other commonly used tests, such as the WISC-IV Working Memory Test (for intelligence), Continuous Performance Test (CPT), Five Digits Test (FDT), Stroop Test, and the Revised Perception of Differences Test (CARAS-R), were also excluded. These tools are more oriented toward neuropathological evaluations rather than early detection purposes (44).

Instrument selection for early detection of ASD

2.2

The selection of scales for early ASD detection prioritized practicality and minimal time requirements for primary care professionals. Zúñiga et al. (29) recommended two specific questionnaires for early ASD screening due to their ease of use, quick administration, and suitability for primary school teachers: a) the ASQ-3 (Ages and Stages Questionnaire) (68) and b) the PEDS (Parents' Evaluation of Developmental Status) (69).

The selected ASD scales

2.2.1

Ages and stages questionnaire (ASQ-3). The scale, validated by Squires and Bricker (70) for Latin populations in the USA and Chile, is designed for children aged 8–30 months. This developmental screening tool tracks progress using a parent-focused approach, making it user-friendly and the most widely used developmental screener. The ASQ-3 consists of 30 items divided into five subscales, each with six items: (a) Communication, (b) Gross Motor, (c) Fine Motor, (d) Problem Solving, and (e) Personal-Social. If a child's total score falls within a specific range, the test recommends further evaluation by a professional (68).

Parents' evaluation of developmental Status (PEDS). Glascoe (69, 71) developed the Parents' Evaluation of Developmental Status (PEDS) based on extensive research into the predictive value of parental concerns for identifying behavioral and developmental problems in children. The PEDS is a 10-item scale, including two open-ended questions, for children aged 0–8 years. It incorporates a “PEDS Interpretation Form”, which provides an algorithm to guide professionals in responding to test results. Glascoe demonstrated that parental concerns are strong predictors of developmental and behavioral issues, making the instrument highly reliable and standardized.

Excluded ASD scales

2.2.2

The M-CHAT, a widely used international standard for early autism detection in young children, was excluded from this study. Hardy et al. (72) and Beecham, as cited in Kong et al. (73), reported a high correlation between the M-CHAT and the ASQ-3. Additionally, Schonhaut et al. (74) highlighted that the M-CHAT lacks items addressing socio-emotional aspects.

ADOS-2, another commonly used tool for ASD diagnosis, was also excluded. While it is highly effective, its administration requires extensive training and specialized expertise in autism, making it less practical for general early screening purposes (75).

Instrument selection for early detection of LD

2.3

The selection of instruments for assessing learning disorders (LD) focused on widely recommended tools for detecting dyslexia and dyscalculia. These include (a) PRODISCAT, recommended by Bosch et al. (76), (b) PRODISLEX, used for detection in Spanish across the different educational cycles (77), and (c) Detection of Difficulties in Mathematics (DDAMat), specifically designed to identify early signs of dyscalculia (78).

The selected LD tools

2.3.1

PRODISLEX. The scale protocol for detecting and intervening in dyslexia during early childhood education assesses two language-related dimensions: Oral Comprehension and Expression (6 items), and Reading/Writing (27 items). Additionally, it evaluates areas such as mathematics, understanding of time, cognitive aspects, health, personality, and psychomotor coordination. Responses are recorded in a binary format (yes/no) (79).

PRODISCAT. Developed by the College of Speech Therapists, PRODISCAT is designed for the educational field to assist teachers in early dyslexia detection. It includes a general dimension and a specific dimension, with the latter focusing on literacy (15 items) and on other areas such as mathematics and school performance. Like PRODISLEX, PRODISCAT uses binary responses (yes/no) (80).

Test for the detection of difficulties in the field of mathematics (DDAMat). The scale consists of 10 items in Spanish, each with five response options: Never, Almost Never, Sometimes, Frequently, and Always. Based on teacher observations, it identifies potential difficulties in mathematics (78).

Excluded LD scales

2.3.2

Several well-known tools were excluded due to their limitations in early screening for primary education settings.

BADyG (Battery of Differential and General Aptitudes). This tool is widely used to assess multiple cognitive abilities, including those linked to learning, and is particularly effective in the differential diagnosis of dyslexia and dyscalculia. However, it was excluded for two reasons: it requires extensive training in educational psychology or neuropsychology for proper administration and interpretation, and its application time of 60–90 min is impractical for early screening purposes (81).

PROLEC-R battery (revised reading processes). This instrument is highly regarded for evaluating reading processes and related cognitive skills. However, it is primarily a diagnostic tool rather than a screening instrument. Its exclusion was based on the need for specialized training in neuropsychology and speech therapy for accurate administration and interpretation, and its requirement for individual administration for each student, as detailed in its application manual (82).

Other transversal, developmental, or regional scales considered and discarded

2.4

Several scales with multicultural or regional orientations were reviewed but ultimately excluded for specific reasons. Below are examples of the scales considered.

Child behavior checklist (CBCL). This questionnaire covers a wide range of emotional and behavioral problems and is adaptable to various cultural contexts. It has been standardized in multiple languages and is widely used in research, making it a classic tool in child behavior assessment. However, the CBCL was excluded due to its length (140 items for children aged 6–18 years), which limits its practicality for large-scale screening in primary school settings (83).

Bayley scales of infant and toddler development (BSID). The BSID is well-known for assessing general developmental progress, including identifying developmental delays. However, it was excluded because it focuses on children under 42 months, which falls outside the target population of this study (84).

Evaluación neuropsicológica infantil (ENI). The ENI is commonly used in Mexico for assessing specific learning disorders but has limited application in other Spanish-speaking countries and is considered outdated (85). Additionally, it requires advanced expertise in child neuropsychology for proper administration and interpretation, making it impractical for early screening purposes (86).

By excluding these tools, this research focused on instruments that are practical, broadly applicable, and suitable for early detection in primary school settings.

Items for validation process

2.5

To create the Early Detection Index of Risks in Child Development for Spanish-speaking countries, designed to identify potential developmental disorders in children entering primary education, items were compiled from the selected scales for ADHD, ASD, and LD. Table 1 summarizes the number of items included from each scale.
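As a quick consistency check on the item pool, the counts reported in Table 1 can be tallied per disorder. The sketch below simply copies those counts; the dictionary layout and the function name `pool_totals` are illustrative choices, not part of the study.

```python
# Item counts per source scale, copied from Table 1.
ITEM_POOL = {
    "ADHD": {"CTRS-15": 15, "EDAH": 10, "VARS": 18, "CADBI": 9, "YAQ-I": 8},
    "ASD": {"ASQ-3": 28, "PEDS": 8},
    "LD": {"Prodislex": 30, "Prodiscat": 15, "DDAMat": 10},
}


def pool_totals(pool):
    """Return the number of items per disorder and the grand total."""
    per_disorder = {disorder: sum(scales.values()) for disorder, scales in pool.items()}
    return per_disorder, sum(per_disorder.values())


per_disorder, grand_total = pool_totals(ITEM_POOL)
print(per_disorder)  # items per disorder
print(grand_total)   # size of the full pool sent to the experts
```

The totals agree with the 60 ADHD, 36 ASD, and 55 LD items, and the 151-item pool, described in the text.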

Expert committee

2.6

The recommendations of Muñiz (87) and Hernández-Sampieri et al. (88) were followed to ensure content validity, defined as an instrument's ability to accurately measure the intended constructs. Additionally, McGartland et al.'s (89) guidelines were adopted, which suggest involving 2–20 experts, with a minimum of 5 and at least two specializing in measurement and evaluation (90). Based on these criteria, 20 experts were invited to participate in the validation process.

Expert selection process

2.6.1

Experts were selected through convenience sampling, leveraging the researchers' personal and professional networks, as well as their affiliated universities. This approach facilitated access to individuals with expertise in psychology, education, and related fields. Initially, over 30 experts were approached and informed about the project. Some declined due to professional commitments, while others chose to self-exclude, citing either doubts about the project's objectives or concerns about their qualifications to contribute. Ultimately, 18 experts agreed to participate, fulfilling the study's criteria for academic and professional relevance.

Expert profiles

2.6.2

The final group of experts included: 10 psychologists, 2 educational psychologists, 1 child clinical psychologist, 1 pedagogue, 3 graduates in education, and 1 graduate in physical and technical sciences.

These professionals represented both public and private institutions, with 10 holding specialized degrees. Their professional experience averaged 14.7 years, with a minimum of 6 years.

Validation process

2.7

Each expert received a randomized list of 151 items without being informed of the scale from which each item originated. The experts were sensitized to the objective of the exercise and asked to classify each item, based on their expertise, as representing a typical behavior associated with one of the three studied disorders: ADHD, ASD, or LD. Experts were permitted to abstain from classifying an item if unsure. Importantly, no interaction or consultation between experts was allowed during the process.

Content validity assessment

2.7.1

To determine item eligibility for inclusion in the tool, content validity was assessed using the Chi-square goodness-of-fit method, which compared experts' classifications with the theoretical diagnosis (percentage of agreement). This approach followed the steps outlined by Vargas and Hernández (17).

For each item i (1 ≤ i ≤ 151), agreement between theory and practice was calculated using a variable Oij (observed), with eij as the expected value:

•Oij = 1: If the disorder identified by expert j matches the theoretical diagnosis of the item's source.
•Oij = 0: If there is no match.
•eij = 1: Constant, representing the expected value under perfect theoretical agreement.

The Chi-square goodness-of-fit statistic X^2^ for each item i, with degrees of freedom df = n−1, was calculated using Equation (1).

$$X_i^2 = \sum_{j=1}^{n} \frac{(O_{ij} - e_{ij})^2}{e_{ij}} \tag{1}$$
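A minimal Python sketch of this per-item calculation follows. It assumes the statistic takes the standard goodness-of-fit form, the sum over experts of (O − e)²/e, and that n is the number of experts who actually classified the item (abstentions are dropped). With e fixed at 1, each matching expert contributes 0 and each non-matching expert contributes 1, so the statistic equals the number of disagreeing experts. The function name and the use of `None` for an abstention are illustrative assumptions.

```python
def chi_square_item(expert_labels, theoretical_label):
    """Goodness-of-fit statistic for one item, with expected value 1 per expert.

    expert_labels: one label per expert ("ADHD", "ASD", "LD"), or None
                   when the expert abstained (assumed encoding).
    theoretical_label: disorder targeted by the item's source scale.
    """
    rated = [label for label in expert_labels if label is not None]
    observed = [1 if label == theoretical_label else 0 for label in rated]
    expected = 1  # constant expected value under perfect agreement
    x2 = sum((o - expected) ** 2 / expected for o in observed)
    return {
        "advising": len(rated),      # experts who classified the item
        "consensus": sum(observed),  # experts matching the theoretical label
        "x2": x2,
        "df": len(rated) - 1,
    }


# 18 experts, 16 of whom match the theoretical label.
full_panel = chi_square_item(["ADHD"] * 16 + ["LD"] * 2, "ADHD")

# 17 raters after one abstention, 15 of whom match.
with_abstention = chi_square_item(["LD"] * 15 + ["ADHD"] * 2 + [None], "LD")

print(full_panel)
print(with_abstention)
```

Under these assumptions the first call reproduces the pattern of the first row of Table 2 (18 advising experts, 16 in consensus, X² = 2) and the second that of the first row of Table 4 (17 advising experts, 15 in consensus, X² = 2).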

Item selection criteria

2.7.2

Items were selected based on the calculated X^2^ statistic and associated p-values:

•Items with p-value <0.05 were considered valid, indicating no significant evidence to reject them.
•Items with p-value ≥0.05 were excluded.

Additionally, experts rated the relevance of each item for early detection on a 0–4 scale: 4 = Essential, 3 = Desirable, 2 = Neutral, 1 = Not desirable, 0 = Not recommended.

The average relevance score across experts was used to prioritize no more than 8 non-dismissed items for each disorder, ensuring a concise tool. The 8-item limit was based on evidence that shorter scales reduce participant fatigue and improve response rates (91, 92).

Items that met theoretical agreement but lacked satisfactory consensus on relevance were also excluded.
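The selection rule can be sketched in Python as follows. The sketch takes the outcome of the agreement test as a precomputed flag and ranks the surviving items by mean relevance, keeping at most eight per disorder. The class `CandidateItem`, the function `select_items`, and the sample data are hypothetical; the additional exclusion for unsatisfactory consensus on relevance is not modeled because the text gives no numeric threshold for it.

```python
from dataclasses import dataclass


@dataclass
class CandidateItem:
    text: str
    disorder: str           # "ADHD", "ASD", or "LD"
    passed_agreement: bool  # outcome of the agreement test, computed elsewhere
    mean_relevance: float   # average expert rating on the 0-4 scale


def select_items(candidates, max_per_disorder=8):
    """Keep the highest-rated non-dismissed items, capped per disorder."""
    selected = {}
    ranked = sorted(candidates, key=lambda c: c.mean_relevance, reverse=True)
    for item in ranked:
        if not item.passed_agreement:
            continue  # dismissed by the agreement test
        bucket = selected.setdefault(item.disorder, [])
        if len(bucket) < max_per_disorder:
            bucket.append(item)
    return selected


# Hypothetical candidates; texts and values are placeholders, not study data.
candidates = [
    CandidateItem("item A", "ADHD", True, 3.56),
    CandidateItem("item B", "ADHD", True, 3.19),
    CandidateItem("item C", "ADHD", False, 3.90),
    CandidateItem("item D", "ADHD", True, 3.38),
    CandidateItem("item E", "LD", True, 3.50),
]

# A cap of 2 keeps the example small; the study used 8.
top = select_items(candidates, max_per_disorder=2)
print([c.text for c in top["ADHD"]])
print([c.text for c in top["LD"]])
```

Item C is dropped despite its high relevance because it did not pass the agreement test, and item B falls outside the cap, which mirrors how a well-rated item can still miss the final scale.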

Final discrepancies and consensus index

2.7.3

After statistical reduction and item selection, a second review was conducted to confirm the robustness of the proposed test. This review employed the consensus index method described by Perales (93). An index of consensus ≥0.80 was considered indicative of high content validity. This step ensured that the final items met stringent criteria for both statistical significance and expert agreement.

Results

3

Items derived from ADHD scales

3.1

From the five ADHD scales selected, a total of 60 items were compiled. After consulting the experts about which disorder each item was related to, only 5 items (8.3%) showed 100% agreement between theory and practice among all experts who provided opinions. Additionally, 24 items (40%) demonstrated over 80% agreement between theoretical assignment and expert evaluation. For the remaining items, there was a notable theoretical-practical discrepancy among the experts.

The 29 items with agreement levels exceeding 80% had a Chi-square statistic X^2^ <2, corresponding to a p-value of 0.00. Subsequently, within this group, 8 items were selected based on having the highest average score for the variable “Relevance” (ranging from 3.38 to 3.56 on a scale of 0 to 4). Table 2 presents the 8 items selected for early detection of ADHD, ranked by their relevance as rated by the experts.

Three of the selected items were derived from the EDAH scale, two from the CTRS-15, two from the VARS, and one from the CABDI. Notably, no items from the YAQ-I scale ranked among the top 8. The highest-ranked item from this scale was in position 15, with a p-value of 0.00 and an average relevance score of 3.19, falling below the threshold set by the top 8 selected items.

Items derived from ASD scales

3.2

From the two ASD scales selected, a total of 36 items were compiled. After consulting the experts about which disorder each item was related to, no items achieved 100% agreement between theory and practice among all experts. Only 2 items (5.6%) demonstrated an agreement of 83.3% between theoretical assignment and expert evaluation. For the remaining items, there was a notable theoretical-practical discrepancy among the experts.

The 2 items with agreement levels exceeding 80% had a Chi-square statistic X^2^ = 3, corresponding to a p-value of 0.00. Within this subset, the relevance scores for these two items were 2.69 and 2.56 (on a scale of 0–4). Table 3 presents the 2 items selected for early detection of ASD, ranked by their average relevance scores as rated by the experts.
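
As a consistency check on these figures, and assuming that all 18 experts classified the two items and that the statistic takes the standard goodness-of-fit form with eij = 1, an agreement of 83.3% corresponds to 15 matching classifications and 3 mismatches, which reproduces the reported value of the statistic:

$$
\frac{15}{18} \approx 0.833, \qquad
X_i^2 = \sum_{j=1}^{18} \frac{(O_{ij} - e_{ij})^2}{e_{ij}}
      = 15 \cdot \frac{(1-1)^2}{1} + 3 \cdot \frac{(0-1)^2}{1} = 3 .
$$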

One item originated from the PEDS scale and the other from the ASQ-3 scale. Notably, the average relevance scores assigned by the experts for items assessing early detection of ASD ranged between 2 (neutral) and 3 (desirable). This trend may be attributed to the fact that ASD diagnosis is primarily clinical in nature (28).

Items derived from LD scales

3.3

From the three LD scales selected, a total of 55 items were compiled. After consulting the experts about which disorder each item was related to, no items achieved 100% agreement between theory and practice among all experts. However, 18 items (32.7%) demonstrated over 80% agreement between theoretical assignment and expert evaluation. For the remaining items, there was a notable theoretical-practical discrepancy among the experts.

The 18 items with agreement levels exceeding 80% had a Chi-square statistic X^2^ <3, corresponding to a p-value of 0.00. Subsequently, within this subset, 8 items were selected based on their highest average scores for the variable “Relevance” (ranging between 3.25 and 3.50 on a scale of 0–4). Table 4 presents the 8 items selected for early detection of LD, ranked by their relevance as rated by the experts.

Five items were derived from the DDAMat scale, two from the Prodislex, and one from the Prodiscat. Notably, the experts prioritized items related to mathematical learning over reading, selecting 5 items for the former and 3 items for the latter. This highlights the emphasis placed on detecting mathematical learning difficulties in early education.

Discussion

4

To promote early detection, it is essential for students entering primary education to undergo evaluations by teachers and other educational figures. The tests presented in this study are specifically designed for students attending school. While primary education is mandatory in Spanish-speaking countries, pre-primary education is not universally compulsory. However, some nations have made notable progress in this area.

Through the expert validation process, we identified 18 items from the 10 most widely used, validated, and reliable psychometric scales. These items exhibited strong psychometric properties in terms of relevance and importance, providing a solid foundation for developing an early screening tool. Schools in Spanish-speaking regions could use this tool to refer children at risk of developmental disorders that affect educational performance to specialists at an early stage.

The 18 experts reached a consensus on the items to integrate into a single construct for the early detection of developmental risks in schoolchildren: 8 items from the ADHD scales, 8 from the LD scales, and only 2 from the ASD scales. This imbalance arises because ASD scales are typically designed for diagnosing children younger than primary school age, whereas this tool targets children entering primary school. However, certain forms of ASD, such as Asperger syndrome, often manifest during primary education when social demands increase. Therefore, we decided to retain these two ASD items for inclusion in the project.

Once reliability tests are conducted, this item base could become a highly effective tool for the early detection of developmental risks in schoolchildren. This scale holds promise for identifying students with potential neurodevelopmental challenges, enabling timely and appropriate medical and educational support.

Limitations

4.1

One significant limitation of this research was securing expert collaboration. The time commitment required and the international scope of the project led many individuals to decline participation during the initial stages. Participants were carefully selected to ensure diverse roles and professional backgrounds that could contribute meaningfully to the study. However, identifying suitable participants and maintaining their involvement proved challenging.

Another limitation was the use of convenience sampling for the expert committee, which may introduce biases. Despite this, the diversity of the experts' profiles—including years of experience, areas of specialization, and representation from both public and private institutions—strengthens confidence in the robustness of the validation process.

However, subjectivity in expert judgments, lack of weighting for expertise, limited applicability to constructs with certain boundaries, and challenges in generalizing results indicate that this procedure, while somewhat useful, should be supplemented with reliability tests. Validating the scale with larger, diverse samples is necessary to ensure its reliability, establish scoring cutoff points, refine item wording, provide clear administration guidelines, and confirm its generalizability across Spanish-speaking contexts, thereby offering a more comprehensive assessment of content validity.

Future avenues of research

4.2

We plan to implement the developed scale across a large, representative sample in Spanish-speaking countries, focusing on:

(a) Evaluating Reliability and Structural Validity: Conducting reliability assessments, including tests for convergent and discriminant validity, to measure internal consistency. Additionally, performing exploratory factor analysis (EFA) to examine the scale's dimensionality, ensuring alignment with the intended constructs.
(b) Establishing Cutoff Points for Categorical Classification: Comparing outcomes between experimental and control groups to identify the minimum scores that warrant professional evaluation. This approach aims to transform the one-dimensional scale into a practical categorical tool suitable for educational settings.

By leveraging a substantial and diverse sample from Spanish-speaking populations, these analyses aim to confirm the scale's robustness and generalizability, ensuring its applicability across various national contexts.

Bibliography

1. Fuller EA, Kaiser AP. The effects of early intervention on social communication outcomes for children with autism spectrum disorder: a meta-analysis. J Autism Dev Disord. (2020) 50:1683–700. doi: 10.1007/s10803-019-03927-z. PMID: 30805766; PMCID: PMC7350882
2. Costello EJ, Egger H, Angold A. 10-year research update review: the epidemiology of child and adolescent psychiatric disorders: I. Methods and public health burden. J Am Acad Child Adolesc Psychiatry. (2005) 44:972–86. doi: 10.1097/01.chi.0000172552.41596.6f. PMID: 16175102
3. Giambona P, Ding Y, Cho S, Zhang C, Shen Y. Parent perceptions of the effects of early intensive behavioral interventions for children with autism. Behav Sci. (2023) 13:1–19. doi: 10.3390/bs13010045. PMID: 36661617; PMCID: PMC9855042
4. King G, Strachan D, Tucker M, Duwyn B, Desserud S, Shillington M. The application of a transdisciplinary model for early intervention services. Infants Young Child. (2009) 22:211–23. doi: 10.1097/IYC.0b013e3181abe1c3
5. Martínez ÁC, Matamoros AM. Neuropsicología infantil del desarrollo: detección e intervención de trastornos en la infancia. Rev Iberoam Psicol. (2010) 3:59–68.
6. National Institute of Statistics [INE]. (2008). Encuesta de Discapacidad, Autonomía Personal y Situaciones de Dependencia 2008. Encuestas de discapacidades. Available online at: https://www.ine.es/dyngs/INEbase/es/operacion.htm?c=Estadistica_C&cid=1254736176782&menu=resultados&idp=1254735573175#_tabs-125473619471 (Accessed February 06, 2025).
7. Mateu L, Sanahuja A. Evaluación e intervención en TDAH y TND: un caso abordado en el contexto escolar. Rev Psicol Clin Ninos Adolesc. (2020) 7:52–8. doi: 10.21134/rpcna.2020.07.1.7
8. Sánchez VC, González BM. Comportamiento prosocial y agresivo en niños: tratamiento conductual dirigido a padres y profesores. Acta Investig Psicol. (2017) 7:2691–703. doi: 10.1016/j.aipprr.2017.03.005