Automated Semantic Analysis with LLM and RAG for Pharmaceutical Package Leaflets (Analise Semantica Automatizada com LLM e RAG para Bulas Farmaceuticas)
Daniel Meireles do Rego

TL;DR
This paper explores combining RAG architectures with Large Language Models to automate the semantic analysis of pharmaceutical leaflets, improving information retrieval and interpretation of unstructured PDF documents.
Contribution
It introduces a novel approach integrating vector search, semantic data extraction, and natural language generation for pharmaceutical document analysis using RAG and LLMs.
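To make the retrieve-then-generate flow concrete, here is a minimal sketch under stated assumptions, not the paper's implementation. The bag-of-words `embed` function is a toy stand-in for a real embedding model, the leaflet chunks are invented for illustration, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical. The generation step is left as a prompt string that would be passed to an LLM of your choice.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (the vector search step)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the retrieved excerpts and the question into an LLM prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the leaflet excerpts below.\n"
        f"{joined}\n\nQuestion: {query}"
    )


# Hypothetical leaflet chunks, invented for illustration only.
chunks = [
    "Dosage: adults take one 500 mg tablet every 8 hours.",
    "Contraindications: do not use in patients with severe liver disease.",
    "Storage: keep at room temperature, protected from moisture.",
]

query = "What is the recommended dosage for adults?"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real pipeline the chunks would come from PDF text extraction and the vectors from a trained embedding model stored in a vector index; only the ranking-by-similarity idea carries over from this sketch.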
Findings
Significant improvements in accuracy and completeness of information retrieval.
Faster response times in semantic queries.
Enhanced consistency in interpreting technical texts.
Abstract
The production of digital documents has been growing rapidly in academic, business, and health environments, presenting new challenges for the efficient extraction and analysis of unstructured information. This work investigates the use of RAG (Retrieval-Augmented Generation) architectures combined with Large Language Models (LLMs) to automate the analysis of documents in PDF format. The proposal integrates embedding-based vector search, semantic data extraction, and the generation of contextualized natural language responses. To validate the approach, we conducted experiments with drug package inserts extracted from official public sources. The semantic queries were evaluated on metrics such as accuracy, completeness, response speed, and consistency. The results indicate that the combination of RAG with LLMs offers significant gains in intelligent information retrieval…
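The abstract names completeness, response speed, and consistency as evaluation metrics but does not define them here. The sketch below shows one plausible way such metrics could be computed; every definition in it is an assumption rather than the paper's method. Completeness is taken as the fraction of expected facts found in an answer, consistency as the mean pairwise Jaccard overlap of repeated answers, and `fake_pipeline` is a hypothetical stand-in for the RAG system.

```python
import re
import time
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased word tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def completeness(answer: str, expected_facts: list[str]) -> float:
    """Assumed definition: share of expected facts that appear in the answer."""
    if not expected_facts:
        return 1.0
    lowered = answer.lower()
    return sum(f.lower() in lowered for f in expected_facts) / len(expected_facts)


def consistency(answers: list[str]) -> float:
    """Assumed definition: mean pairwise Jaccard similarity of repeated answers."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0

    def jaccard(x: str, y: str) -> float:
        tx, ty = tokens(x), tokens(y)
        union = tx | ty
        return len(tx & ty) / len(union) if union else 1.0

    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)


def timed(fn, *args):
    """Run fn and return (result, elapsed seconds) to measure response speed."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def fake_pipeline(question: str) -> str:
    """Hypothetical stand-in for the RAG pipeline; returns a canned answer."""
    return "Adults take one 500 mg tablet every 8 hours."


answer, seconds = timed(fake_pipeline, "What is the adult dosage?")
score = completeness(answer, ["500 mg", "every 8 hours", "with food"])
agreement = consistency([answer, fake_pipeline("What is the adult dosage?")])
print(f"completeness={score:.2f} consistency={agreement:.2f} seconds={seconds:.6f}")
```

Substring matching is a deliberately crude check; a real evaluation would likely need human judgment or a semantic comparison to decide whether a fact is present.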
