Bruno Menezes

1 chapter authored

Chapters authored

Automatic BI-RADS Classification of Breast Magnetic Resonance Medical Records Using Transformer-Based Models for Brazilian Portuguese

By Ricardo de Oliveira, Bruno Menezes, Júnia Ortiz and Erick Nascimento

This chapter presents a model that classifies textual clinical reports of breast magnetic resonance imaging according to the Breast Imaging-Reporting and Data System (BI-RADS), based on lexical, syntactic and semantic analysis of the reports and using Deep Learning and Natural Language Processing (NLP). The model was developed through transfer learning from the pre-trained BERTimbau model, a BERT (Bidirectional Encoder Representations from Transformers) model trained on Brazilian Portuguese. The dataset is composed of medical reports in Brazilian Portuguese classified into seven categories: Inconclusive; Normal or Negative; Certainly Benign Findings; Probably Benign Findings; Suspicious Findings; High Risk of Cancer; Previously Known Malignant Lesion. The following models were implemented and compared: Random Forest, SVM, Naïve Bayes, and BERTimbau with and without fine-tuning. BERTimbau achieved better results than the other models, and it performed best after fine-tuning.

Part of the book: Machine Learning and Data Mining Annual Volume 2023

Related collaborators

Janine Zitianellis

Michael D. Wang

Júnia Ortiz

Erick Sperandio

Hany Helmy

Sherif El Diasty

Hazem Shatila

Ali Asghar Firoozi

Luwei Li

Ali Akbar Firoozi