TY - JOUR
T1 - Natural Language Generation Model for Mammography Reports Simulation
AU - Hoogi, Assaf
AU - Mishra, Arjun
AU - Gimenez, Francisco
AU - Dong, Jeffrey
AU - Rubin, Daniel
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2020/9
Y1 - 2020/9
N2 - Extending the size of labeled corpora of medical reports is a major step towards successful training of machine learning algorithms. Simulating new text reports is a key approach to report augmentation, which extends the cohort size. However, text generation in the medical domain is challenging because it must preserve both the content and the style typical of real reports without risking patients' privacy. In this paper, we present a conditioned LSTM-RNN architecture for simulating realistic mammography reports. We evaluated performance by analyzing the characteristics of the simulated reports and by classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed, in which a masked radiologist classified 75% of the simulated reports as real, showing that both the style and the content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov Random Fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
AB - Extending the size of labeled corpora of medical reports is a major step towards successful training of machine learning algorithms. Simulating new text reports is a key approach to report augmentation, which extends the cohort size. However, text generation in the medical domain is challenging because it must preserve both the content and the style typical of real reports without risking patients' privacy. In this paper, we present a conditioned LSTM-RNN architecture for simulating realistic mammography reports. We evaluated performance by analyzing the characteristics of the simulated reports and by classifying them into benign and malignant classes. An average classification AUC was calculated over two distinct test sets. A qualitative analysis was also performed, in which a masked radiologist classified 75% of the simulated reports as real, showing that both the style and the content of the simulated reports were similar to those of real reports. Finally, we compared our RNN-LSTM generative model with Markov Random Fields (MRFs). The RNN-LSTM provided significantly better and more stable performance than MRFs (p < 0.01, Wilcoxon).
KW - Natural language generation
KW - RNN-LSTM
KW - mammography reports
KW - simulation
UR - http://www.scopus.com/inward/record.url?scp=85090492093&partnerID=8YFLogxK
U2 - 10.1109/JBHI.2020.2980118
DO - 10.1109/JBHI.2020.2980118
M3 - Article
C2 - 32324577
AN - SCOPUS:85090492093
SN - 2168-2194
VL - 24
SP - 2711
EP - 2717
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 9
M1 - 9072639
ER -