The incorporation of Artificial Intelligence (AI) into BIREME/PAHO/WHO products and services represents a strategic advance in optimizing the management and dissemination of health knowledge. AI-based techniques can improve indexing, information retrieval, data analysis, and content recommendation, making knowledge management systems more efficient.
Indexing documents is becoming one of the main challenges faced by information systems. Traditionally, this process requires experts to assign standardized descriptors to documents. The use of AI, particularly machine learning and natural language processing (NLP), is one way to automate the identification of controlled terms, reducing the human workload and improving the consistency of indexing. The recent implementation of DeCS Finder AI illustrates this advance, providing more efficient and accurate semi-automated indexing.
DeCS Finder AI is an NLP-based tool that enables the automatic classification of documents using models trained for different languages and domains. Algorithms such as Omikuji Bonsai enable agile and consistent indexing, optimizing the thematic treatment of large volumes of documents. The first version used a training base of around 3,000 records already indexed using the LILACS methodology in Portuguese, Spanish and English. Its incremental learning capacity allows the accuracy of the results to improve progressively as further training is carried out on a larger and more specialized corpus.
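To make the idea of semi-automated indexing more concrete, the sketch below shows descriptor suggestion as a multi-label ranking task: previously indexed records are used to build a term profile for each descriptor, and a new text receives a ranked list of candidate descriptors for an indexer to review. This is only a toy illustration under stated assumptions. It does not use Omikuji Bonsai or the DeCS Finder AI code; the function names (`train`, `suggest`), the sample records, and the descriptor labels are hypothetical, and simple cosine similarity over word counts stands in for the trained models.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(records, profiles=None):
    """Build (or extend) one word-count profile per descriptor.

    records: iterable of (text, [descriptors]) pairs that were already indexed.
    Passing an existing `profiles` object adds new records to it, a very
    rough stand-in for retraining on a larger corpus.
    """
    if profiles is None:
        profiles = defaultdict(Counter)
    for text, descriptors in records:
        tokens = tokenize(text)
        for descriptor in descriptors:
            profiles[descriptor].update(tokens)
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest(profiles, text, top_k=3):
    """Return up to top_k (descriptor, score) pairs with a non-zero score."""
    doc = Counter(tokenize(text))
    scored = [(d, cosine(doc, p)) for d, p in profiles.items()]
    scored = [(d, s) for d, s in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical, hand-made training records (not real LILACS data).
records = [
    ("dengue outbreak mosquito control in urban areas",
     ["Dengue", "Mosquito Control"]),
    ("vaccination coverage among children measles",
     ["Vaccination", "Measles"]),
    ("mosquito nets and malaria prevention",
     ["Malaria", "Mosquito Control"]),
]

profiles = train(records)
suggestions = suggest(profiles, "mosquito control measures during dengue season")
for descriptor, score in suggestions:
    print(f"{descriptor}: {score:.3f}")
```

In this sketch the ranked suggestions are only candidates: the indexer accepts or rejects them, which is what makes the process semi-automated rather than fully automatic.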
Information retrieval also benefits from AI, since intelligent algorithms can refine search engines and offer more relevant results to users. In addition, AI models can improve the disambiguation of terms and the semantic understanding of queries, overcoming the limitations of traditional search systems based on keyword matching. In this context, the project to develop Abstract Summaries, or Plain Language Summaries, aims to accompany each search result with a simple and objective summary of the retrieved article.
Abstract Summaries are short texts, generated from an article's abstract by a Large Language Model (LLM), that are displayed on search portals. They make it easier and quicker for researchers to select articles that meet their information needs, since the theme and main aspects of each document are available in just a few lines. The first version will soon be launched for an experimental set of 10 articles. The model used was LLAMA, fine-tuned on almost 1,000 summaries of abstracts in Portuguese, English and Spanish written manually by the team.
To find out more about how BIREME is incorporating advanced technologies into its projects, products and services, follow the new editions of the BIREME Bulletin and also check out the following texts: