# Research on the dynamic evolution mechanism of disruptive technology based on the BERTopic model and Hidden Markov Model: A case study of industrial Internet technology

**Authors:** Heng Yang, Sheng Chen, Xin Yang

PMC · DOI: 10.1371/journal.pone.0319924 · 2025-04-17

## TL;DR

This paper traces how Industrial Internet technology has evolved, using global patent data from 1965 to 2023: topics are extracted with BERTopic and their evolution is modeled with a Hidden Markov Model.

## Contribution

The paper combines BERTopic and a Hidden Markov Model in a novel way to analyze the dynamic evolution of Industrial Internet technologies from patent data.

## Key findings

- Industrial Internet technology topics are grouped into four layers (physical, logical, data, and interaction), with the data layer being the most developed.
- The physical layer is more developed than the logical and interaction layers.
- The HMM reveals how technological topics evolve over time.

## Abstract

The development of key technologies for the Industrial Internet is a major concern for countries worldwide. This paper seeks a comprehensive understanding of Industrial Internet technology by analyzing its current application status and trends. It examines how the key technologies and their development trends change over time, providing a reference for technological advancement in this field.

This paper analyzes global patent data in the field of the Industrial Internet from 1965 to 2023. It applies the BERTopic model with the all-MiniLM-L6-v2 embedding model to vectorize patent texts and extract topics related to Industrial Internet technology. Based on the theory of Internet governance, it groups the topics into four categories. It then builds a Hidden Markov Model (HMM) to investigate the evolutionary mechanism of the technological topics, using the regrouped topics as hidden states and the number of patent applications as observed states.
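The hidden-state and observed-state setup described above can be illustrated with a small decoding sketch. This is not the authors' implementation: it assumes the four layers named in the abstract act as hidden states, that yearly patent counts are binned into three symbols (`low`, `medium`, `high`), and that the model parameters are already known. The thresholds in `bin_count`, every probability in `START`, `TRANS`, and `EMIT`, and the `yearly_counts` series are hypothetical placeholders, not values from the paper. The sketch uses the standard Viterbi algorithm to recover the most likely sequence of layers behind a sequence of binned counts.

```python
import math

# Hidden states: the four layers named in the abstract.
STATES = ["physical", "logical", "data", "interaction"]

# All probabilities below are made-up placeholders, NOT values from the paper.
START = {"physical": 0.4, "logical": 0.2, "data": 0.3, "interaction": 0.1}
TRANS = {
    "physical":    {"physical": 0.6, "logical": 0.1, "data": 0.2, "interaction": 0.1},
    "logical":     {"physical": 0.1, "logical": 0.6, "data": 0.2, "interaction": 0.1},
    "data":        {"physical": 0.1, "logical": 0.1, "data": 0.6, "interaction": 0.2},
    "interaction": {"physical": 0.1, "logical": 0.1, "data": 0.2, "interaction": 0.6},
}
EMIT = {
    "physical":    {"low": 0.3, "medium": 0.5, "high": 0.2},
    "logical":     {"low": 0.6, "medium": 0.3, "high": 0.1},
    "data":        {"low": 0.1, "medium": 0.3, "high": 0.6},
    "interaction": {"low": 0.5, "medium": 0.4, "high": 0.1},
}


def bin_count(count, low_max=100, medium_max=1000):
    """Map a raw yearly patent count to a symbol (thresholds are hypothetical)."""
    if count <= low_max:
        return "low"
    if count <= medium_max:
        return "medium"
    return "high"


def viterbi(observations):
    """Return the most likely hidden-state path for a list of observed symbols."""
    if not observations:
        return []
    # score[s] holds the best log-probability of any path that ends in state s.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Pick the predecessor that maximizes the path score into state s.
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical yearly patent-application counts, oldest first.
yearly_counts = [40, 80, 350, 1200, 2500]
symbols = [bin_count(c) for c in yearly_counts]
path = viterbi(symbols)
print(list(zip(symbols, path)))
```

In practice the start, transition, and emission probabilities would be estimated from the patent data (for example with the Baum-Welch algorithm) rather than written by hand; the sketch only shows how hidden layers are inferred from observed counts once those parameters exist.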

Industrial Internet technology encompasses five research directions. The physical layer focuses on interconnection platforms for equipment, as well as devices for the storage and monitoring of liquids and gases. The logical layer involves remote control systems for industrial equipment, while the data layer focuses on data processing and information services. The interaction layer includes modular image processing and control methods. Among these, data layer technologies are the most developed and also contribute to the advancement of interaction layer technologies. Physical layer technologies are relatively well developed, while logical and interaction layer technologies are relatively less developed.

## Figures

38 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12005535/full.md

---
Source: https://tomesphere.com/paper/PMC12005535