# Confirming SPSS Results With ChatGPT-4 and o3-mini Models

**Authors:** Frederick Strale, Isaac Riddle, Bowen Geng, Blake Oxford, Malia Kah, Robert Sherwin

PMC · DOI: 10.7759/cureus.82005 · Cureus · 2025-04-10

## TL;DR

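This study compared SPSS with ChatGPT-4 and ChatGPT o3-mini for statistical analysis of behavioral healthcare data. ChatGPT-4 matched SPSS closely on basic tests, but both models misreported degrees of freedom or F-statistics in advanced analyses such as repeated measures ANOVA, factorial ANOVA, and MANOVA.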

## Contribution

The study evaluates the reliability of ChatGPT-4 and ChatGPT o3-mini in replicating SPSS statistical results using real behavioral healthcare data.

## Key findings

- ChatGPT-4 closely matched SPSS in basic statistics like central tendency and correlation.
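- ChatGPT o3-mini produced inflated correlation values, inflated multiple correlation coefficients, and inflated R-squared values in two-way ANOVA and multiple regression.
- Both models reported incorrect degrees of freedom in repeated measures ANOVA. ChatGPT-4 also miscalculated degrees of freedom in MANOVA, and ChatGPT o3-mini produced erroneous F-statistics in factorial ANOVA.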

## Abstract

Background

This research compared simple and advanced statistical results from SPSS (IBM Corp., Armonk, NY, USA) with those from ChatGPT-4 and ChatGPT o3-mini (OpenAI, San Francisco, CA, USA), covering both statistical output and its interpretation, using behavioral healthcare data. It evaluated their methodological approaches, quantitative performance, interpretability, adaptability, ethical considerations, and future trends.

Methods

Fourteen statistical analyses were conducted on two real datasets that underlie peer-reviewed scientific articles published in 2024. Descriptive statistics, Pearson r, multiple correlation with Pearson r, Spearman's rho, simple linear regression, one-sample t-test, paired t-test, independent two-sample t-test, multiple linear regression, one-way analysis of variance (ANOVA), repeated measures ANOVA, two-way (factorial) ANOVA, and multivariate ANOVA were computed. The two datasets were collected over fixed timeframes, March 19, 2023, through June 11, 2023, and June 7, 2023, through July 7, 2023, which supports the integrity and temporal representativeness of the data gathering. The analyses were conducted by entering text commands into ChatGPT-4 and ChatGPT o3-mini along with the relevant SPSS variables, which were copied and pasted from the SPSS datasets.
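As an illustration of the kind of cross-check this design implies, the sketch below recomputes two of the simpler statistics (Pearson r and the paired t-test) from raw values and compares a chatbot-reported number against the recomputed one. It is not the authors' procedure. The data, the function names, and the 0.005 tolerance are all hypothetical, and only the Python standard library is used.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product-moment correlation from raw paired values."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(x, mu0):
    """Return (t, df) for a one-sample t-test against mu0."""
    n = len(x)
    standard_error = statistics.stdev(x) / math.sqrt(n)
    return (statistics.fmean(x) - mu0) / standard_error, n - 1


def paired_t(x, y):
    """Paired t-test is a one-sample t-test on the pairwise differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return one_sample_t(diffs, 0.0)


def agrees(reported, reference, tol=0.005):
    """True if a reported value is within tol of the recomputed reference.

    The tolerance is an assumption chosen for illustration only.
    """
    return abs(reported - reference) <= tol


# Made-up scores for illustration; not data from the study.
baseline = [2.0, 4.0, 6.0, 8.0, 10.0]
followup = [3.0, 5.0, 8.0, 9.0, 13.0]

r = pearson_r(baseline, followup)
t, df = paired_t(baseline, followup)
print(f"r = {r:.4f}, paired t = {t:.3f}, df = {df}")

# Hypothetical value a chatbot might report for the same correlation.
print("reported r agrees:", agrees(0.99, r))
```

The same pattern extends to the other tests: recompute from the raw variables with a trusted tool, then compare the test statistic, degrees of freedom, and effect size separately, since a result can match on one and differ on another.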

Results

The study found high concordance between SPSS and ChatGPT-4 in fundamental statistical analyses, such as measures of central tendency, variability, and simple Pearson and Spearman correlation analyses, where the results were nearly identical. ChatGPT-4 also closely matched SPSS in the three t-tests and simple linear regression, with minimal effect size variations. Discrepancies emerged in complex analyses. ChatGPT o3-mini showed inflated correlation values and significant results where none were expected, indicating computational deviations. ChatGPT o3-mini produced inflated coefficients in the multiple correlation and R-squared values in two-way ANOVA and multiple regression, suggesting differing assumptions. ChatGPT-4 and ChatGPT o3-mini produced identical F-statistics with repeated measures ANOVA but reported incorrect degrees of freedom (df) values. While ChatGPT-4 performed well in one-way ANOVA, it miscalculated degrees of freedom in multivariate ANOVA (MANOVA), leading to significant discrepancies. ChatGPT o3-mini also generated erroneous F-statistics in factorial ANOVA, highlighting the need for further optimization in multivariate statistical modeling.
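Because several of the discrepancies above involve degrees of freedom, a quick sanity check is to derive the expected df from the design itself and compare it with what a tool reports. The sketch below uses the standard textbook formulas for balanced designs. The function names and the example design sizes are hypothetical, and the paper's actual sample sizes are not stated in this summary. The repeated measures formula assumes sphericity; corrections such as Greenhouse-Geisser scale both df values down, so a corrected SPSS row will legitimately differ. MANOVA is left out because its F-approximation df depend on which multivariate test statistic is used.

```python
def one_way_anova_df(n_groups, n_total):
    """Between-groups and within-groups df for a one-way ANOVA."""
    return {"between": n_groups - 1, "within": n_total - n_groups}


def repeated_measures_anova_df(n_subjects, n_conditions):
    """df for a one-way repeated measures ANOVA, sphericity assumed."""
    df_conditions = n_conditions - 1
    return {
        "conditions": df_conditions,
        "error": df_conditions * (n_subjects - 1),
    }


def two_way_anova_df(levels_a, levels_b, n_total):
    """df for a two-way between-subjects factorial ANOVA."""
    return {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "AxB": (levels_a - 1) * (levels_b - 1),
        "error": n_total - levels_a * levels_b,
    }


def df_matches(reported, expected):
    """True only if every expected df term is reported with the same value."""
    return all(reported.get(term) == value for term, value in expected.items())


# Hypothetical design: 10 participants measured under 3 conditions.
expected = repeated_measures_anova_df(n_subjects=10, n_conditions=3)
print(expected)

# Hypothetical tool output that treats the design as between-subjects.
reported = {"conditions": 2, "error": 27}
print("df consistent with design:", df_matches(reported, expected))
```

An F-statistic can be numerically identical while its df are wrong, as the repeated measures result above shows, so the df need checking on their own: the p-value depends on both.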

Conclusions

This study underscored the rapid advancements in artificial intelligence (AI)-driven statistical analyses while highlighting areas that require further refinement. ChatGPT-4 accurately executed fundamental statistical tests, closely matching SPSS. However, its reliability diminished in more advanced statistical procedures, requiring further validation. ChatGPT o3-mini, while optimized for Science, Technology, Engineering, and Mathematics (STEM) applications, produced inconsistencies in correlation and multivariate analyses, limiting its dependability for complex research applications. As AI evolves, ensuring that these models align with established statistical methodologies will be essential for their widespread adoption in scientific research.

## Full-text entities

- **Diseases:** autistic (MESH:D001321), AI (MESH:C538142)
- **Chemicals:** o3-mini (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12065437/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/PMC12065437/full.md

---
Source: https://tomesphere.com/paper/PMC12065437