Table of Content detection using Machine Learning
Rachana Parikh, Avani R. Vasant

TL;DR
This paper presents a machine learning-based method for detecting Table of Content pages in multipage documents, aiding in document navigation and information retrieval.
Contribution
It introduces a novel machine learning approach utilizing various features to accurately identify TOC pages in diverse document layouts.
Findings
Effective detection of TOC pages demonstrated
Improved navigation in multipage documents
Enhanced information retrieval efficiency
Abstract
Table of content (TOC) detection has drawn attention recently because it plays an important role in the digitization of multipage documents. Books are typically multipage documents, so detecting the Table of Content pages becomes necessary both for easy navigation and for faster retrieval of the desired information from the document. Table of Content pages follow different layouts and present the contents of the document, such as chapters, sections, and subsections, in different ways. This paper introduces a new method to detect Table of Content pages using a machine learning technique with different features. The main aim of detecting Table of Content pages is to structure the document according to its contents.
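The abstract does not list the features or the classifier used, so the following is only an illustrative sketch of what per-page features for TOC detection might look like. The regular expressions, the feature names, and the functions `toc_features` and `looks_like_toc` are all assumptions made for illustration, not the authors' method. The averaged-score rule is a placeholder for a trained classifier.

```python
import re

# Hypothetical line patterns; the paper does not specify its features.
TRAILING_PAGE_NUMBER = re.compile(r"\s\d{1,4}\s*$")
DOT_LEADER = re.compile(r"\.{3,}|(\. ){3,}")
SECTION_PREFIX = re.compile(r"^\s*(chapter\s+\w+|\d+(\.\d+)*\.?)\s", re.IGNORECASE)
TOC_HEADING = re.compile(r"^\s*(table of contents?|contents)\s*$", re.IGNORECASE)

FEATURE_NAMES = ("trailing_number", "dot_leader", "section_prefix", "has_heading")


def toc_features(page_text):
    """Return layout-independent text features for one page, each in [0, 1]."""
    lines = [ln for ln in page_text.splitlines() if ln.strip()]
    if not lines:
        return {name: 0.0 for name in FEATURE_NAMES}

    def fraction(pattern):
        # Share of non-empty lines on the page that match the pattern.
        return sum(1 for ln in lines if pattern.search(ln)) / len(lines)

    return {
        "trailing_number": fraction(TRAILING_PAGE_NUMBER),
        "dot_leader": fraction(DOT_LEADER),
        "section_prefix": fraction(SECTION_PREFIX),
        "has_heading": 1.0 if any(TOC_HEADING.match(ln) for ln in lines) else 0.0,
    }


def looks_like_toc(page_text, threshold=0.5):
    """Placeholder decision rule: mean feature value against a hand-set threshold.

    In a machine learning setting, a classifier trained on labelled pages
    would take the place of this rule.
    """
    features = toc_features(page_text)
    score = sum(features.values()) / len(features)
    return score >= threshold
```

A feature vector of this kind could be computed for every page of a document and passed to any standard classifier. The threshold of 0.5 is arbitrary and would need tuning on real data.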
Taxonomy
Topics: Handwritten Text Recognition Techniques · Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction
