A Big Wave of Medical NLP May Have Arrived in 2019

I am planning to make the application of natural language processing to radiology reports the theme of my medical PhD.

This post is a by-product of my literature survey.

 

Searching for International Conference Papers on Medical NLP

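Even amid the current machine learning boom, natural language processing is a field whose application to medicine tends to lag behind. So where do things stand now?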

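I went through the papers published since 2015 at ACL, NAACL, EMNLP, CoNLL, IJCNLP, COLING, EACL, and LREC, and listed every one that looked related to medicine, based purely on my own subjective judgment.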

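Concretely, I searched the ACL Anthology exhaustively for papers whose titles contain any of the search terms below, and then excluded those that were clearly unrelated to medicine.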

Search terms: 'medic', 'biomedic', 'clinic', 'health', 'life', 'care', 'pharma', 'hospital', 'drug', 'surg', 'ICU', 'emergen', 'patient', 'disease', 'symptom', 'illness', 'radiolog', 'x-ray', 'CT', 'MRI', 'radiograph', 'tomograph', 'magnetic'
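The title filter described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used for this post: the function names `matches_term` and `filter_titles` are hypothetical, and the matching rules (case-insensitive substring match, except that all-uppercase terms such as 'CT' are matched case-sensitively) are assumptions made here. The manual exclusion step is represented by a hand-maintained set of titles.

```python
# Search terms copied from the post.
SEARCH_TERMS = [
    'medic', 'biomedic', 'clinic', 'health', 'life', 'care', 'pharma',
    'hospital', 'drug', 'surg', 'ICU', 'emergen', 'patient', 'disease',
    'symptom', 'illness', 'radiolog', 'x-ray', 'CT', 'MRI', 'radiograph',
    'tomograph', 'magnetic',
]


def matches_term(title, term):
    """Return True if `term` occurs in `title` as a substring.

    Assumption: all-uppercase terms (ICU, CT, MRI) are matched
    case-sensitively so that 'CT' does not hit words like 'detection';
    every other term is matched case-insensitively.
    """
    if term.isupper():
        return term in title
    return term.lower() in title.lower()


def filter_titles(titles, terms=SEARCH_TERMS, excluded=frozenset()):
    """Keep titles that contain any search term and were not excluded by hand."""
    return [
        title
        for title in titles
        if title not in excluded
        and any(matches_term(title, term) for term in terms)
    ]


if __name__ == "__main__":
    # Three titles from the list in this post plus one made-up title.
    titles = [
        "Sieve-Based Entity Linking for the Biomedical Domain",
        "Lifelong Learning CRF for Supervised Aspect Extraction",
        "Joint CTC/attention decoding for end-to-end speech recognition",
        "A Hypothetical Parsing Paper",  # hypothetical, matches no term
    ]
    # Hand-picked exclusions stand in for the manual review step.
    manually_excluded = {
        "Lifelong Learning CRF for Supervised Aspect Extraction",
    }
    for title in filter_titles(titles, excluded=manually_excluded):
        print(title)
```

Substring matching is deliberately loose, so short terms such as 'life', 'care', and 'CT' also catch unrelated titles. That is why the manual exclusion step matters, and why the resulting list remains subjective.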

 

A Sudden Surge in 2019

First, here is a summary of the paper counts.

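As of this writing, no 2019 proceedings other than those of ACL and NAACL have been published, yet the combined ACL and NAACL total alone already far exceeds the counts of previous years.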

It is clear that medical NLP has rapidly gained popularity this year.

In particular, the workshops on medical NLP and on applications to psychiatry held at NAACL appear to have had a considerable impact.

Needless to say, please keep in mind that some of these conferences are not held every year.

 

 

[Figure: number of medical NLP papers by conference and year]

(The momentum in 2019 is remarkable!)

 

 

 

Below, I simply list the papers.

2015

ACL-IJCNLP 2015

Who caught a cold ? - Identifying the subject of a symptom

Disease Event Detection based on Deep Modality Analysis

Sieve-Based Entity Linking for the Biomedical Domain

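Notably, the first two papers above both come from the Komachi Lab at Tokyo Metropolitan University. Getting two of these three medical NLP papers accepted at ACL while the field was still in its infancy is something I can only admire.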

NAACL 2015

Matching Citation Text and Cited Spans in Biomedical Literature: a Search-Oriented Approach

Extracting Information about Medication Use from Veterinary Discussions

EMNLP 2015

Key Concept Identification for Medical Information Retrieval

Adapting Phrase-based Machine Translation to Normalise Medical Terms in Social Media Messages

#SupportTheCause: Identifying Motivations to Participate in Online Health Campaigns

CoNLL 2015

(No medical NLP articles)

 

2016

ACL 2016

Normalising Medical Concepts in Social Media Texts by Learning Semantic Representation

Recurrent neural network models for disease name recognition using domain invariant features

Vector-space topic models for detecting Alzheimer’s disease

Identifying Potential Adverse Drug Events in Tweets Using Bootstrapped Lexicons

DeepLife: An Entity-aware Search, Analytics and Exploration Platform for Health and Life Sciences

EMNLP 2016

Structured prediction models for RNN based sequence labeling in clinical text

CoNLL 2016

(No medical NLP articles)

COLING 2016

Appraising UMLS Coverage for Summarizing Medical Evidence

Adverse Drug Reaction Classification With Deep Neural Networks

A Hybrid Approach to Generation of Missing Abstracts in Biomedical Literature

LREC 2016

Sieve-based Coreference Resolution in the Biomedical Domain

Staggered NLP-assisted refinement for Clinical Annotations of Chronic Disease Events

A Tagged Corpus for Automatic Labeling of Disabilities in Medical Scientific Papers

Giving Lexical Resources a Second Life: Démonette, a Multi-sourced Morpho-semantic Network for French

BosphorusSign: A Turkish Sign Language Recognition Corpus in Health and Finance Domains

Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages

Automatic Biomedical Term Polysemy Detection

A Corpus of Clinical Practice Guidelines Annotated with the Importance of Recommendations

Identification of Drug-Related Medical Conditions in Social Media

Transfer-Based Learning-to-Rank Assessment of Medical Term Technicality

The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources

A Large Rated Lexicon with French Medical Words

The Scielo Corpus: a Parallel Corpus of Scientific Publications for Biomedicine

Towards Using Social Media to Identify Individuals at Risk for Preventable Chronic Illness

Managing Linguistic and Terminological Variation in a Medical Dialogue System

A Corpus of Word-Aligned Asked and Anticipated Questions in a Virtual Patient Dialogue System

Annotating Named Entities in Consumer Health Questions

Age and Gender Prediction on Health Forum Data

Monitoring Disease Outbreak Events on the Web Using Text-mining Approach and Domain Expert Knowledge

On Developing Resources for Patient-level Information Retrieval

Annotating and Detecting Medical Events in Clinical Notes

Text Segmentation of Digitized Clinical Texts

A Turkish Database for Psycholinguistic Studies Based on Frequency, Age of Acquisition, and Imageability

The PsyMine Corpus - A Corpus annotated with Psychiatric Disorders and their Etiological Factors

QUEMDISSE? Reported speech in Portuguese

Semantic Relation Extraction with Semantic Patterns Experiment on Radiology Reports

Medical Concept Embeddings via Labeled Background Corpora

2017

ACL 2017

Joint CTC/attention decoding for end-to-end speech recognition

Lifelong Learning CRF for Supervised Aspect Extraction

Computational Characterization of Mental States: A Natural Language Processing Approach

Negotiation of Antibiotic Treatment in Medical Consultations: A Corpus Based Study

Automating Biomedical Evidence Synthesis: RobotReviewer

Life-iNet: A Structured Network-Based Knowledge Exploration and Analytics System for Life Sciences

Olelo: A Question Answering Application for Biomedicine

Semedico: A Comprehensive Semantic Search Engine for the Life Sciences

NLP for Precision Medicine

EMNLP 2017

Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging

CoNLL 2017

Neural Domain Adaptation for Biomedical Question Answering

Learning local and global contexts using a convolutional recurrent network model for relation classification in biomedical text

Idea density for predicting Alzheimer’s disease from transcribed speech

IJCNLP 2017

Extraction of Gene-Environment Interaction from the Biomedical Literature

Learning to Diagnose: Assimilating Clinical Narratives using Deep Reinforcement Learning

Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory

Identifying Empathetic Messages in Online Health Communities

PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts

Language-Independent Prediction of Psycholinguistic Properties of Words

Correlation Analysis of Chronic Obstructive Pulmonary Disease (COPD) and its Biomarkers Using the Word Embeddings

WiseReporter: A Korean Report Generation System

CYUT at IJCNLP-2017 Task 3: System Report for Review Opinion Diversification

EACL 2017

Recognizing Mentions of Adverse Drug Reaction in Social Media Using Knowledge-Infused Recurrent Models

Multitask Learning for Mental Health Conditions with Limited Social Media Data

Named Entity Recognition in the Medical Domain with Constrained CRF Models

Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking

Structured Learning for Temporal Relation Extraction from Clinical Records

Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection

A Computational Analysis of the Language of Drug Addiction

Neural Networks for Joint Sentence Classification in Medical Paper Abstracts

Temporal information extraction from clinical text

 

2018

ACL 2018

A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature

Modeling Naive Psychology of Characters in Simple Commonsense Stories

On the Automatic Generation of Medical Imaging Reports

Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information

Pushing the Limits of Radiology with Joint Modeling of Visual and Textual Information

Biomedical Document Retrieval for Clinical Decision Support System

Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility

Identifying Risk Factors For Heart Disease in Electronic Medical Records: A Deep Learning Approach

Keyphrases Extraction from User-Generated Contents in Healthcare Domain Using Long Short-Term Memory Networks

Ontology alignment in the biomedical domain using entity definitions and context

Sub-word information in pre-trained biomedical word representations: evaluation and hyper-parameter optimization

PICO Element Detection in Medical Text via Long Short-Term Memory Neural Networks

Coding Structures and Actions with the COSTA Scheme in Medical Conversations

A Neural Autoencoder Approach for Document Ranking and Query Refinement in Pharmacogenomic Information Retrieval

Biomedical Event Extraction Using Convolutional Neural Networks and Dependency Parsing

Phrase2VecGLM: Neural generalized language model–based semantic tagging for complex query reformulation in medical IR

Domain Adaptation for Disease Phrase Matching with Adversarial Networks

Predicting Discharge Disposition Using Patient Complaint Notes in Electronic Medical Records

A Framework for Developing and Evaluating Word Embeddings of Drug-named Entity

CRF-LSTM Text Mining Method Unveiling the Pharmacological Mechanism of Off-target Side Effect of Anti-Multiple Myeloma Drugs

Prediction Models for Risk of Type-2 Diabetes Using Health Claims

On Learning Better Embeddings from Chinese Clinical Records: Study on Combining In-Domain and Out-Domain Data

Investigating Domain-Specific Information for Neural Coreference Resolution on Biomedical Texts

Toward Cross-Domain Engagement Analysis in Medical Notes

Report of NEWS 2018 Named Entity Transliteration Shared Task

A Corpus of Corporate Annual and Social Responsibility Reports: 280 Million Tokens of Balanced Organizational Writing

CYUT-III Team Chinese Grammatical Error Diagnosis System Report in NLPTEA-2018 CGED Shared Task

NAACL 2018

Label-Aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Explainable Prediction of Medical Codes from Clinical Text

CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension

Similarity Measures for the Detection of Clinical Conditions with Verbal Fluency Tasks

Multi-Task Learning Framework for Mining Crowd Intelligence towards Clinical Treatment

Syntactic Patterns Improve Information Extraction for Medical Search

From dictations to clinical reports using machine translation

Towards Generating Personalized Hospitalization Summaries

An automated medical scribe for documenting clinical encounters

Generating Continuous Representations of Medical Texts

RiskFinder: A Sentence-level Risk Detector for Financial Reports

Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System

A Report on the Complex Word Identification Shared Task 2018

Modeling Second-Language Learning from a Psychological Perspective

Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic

What type of happiness are you looking for? - A closer look at detecting mental health from language

CLPsych 2018 Shared Task: Predicting Current and Future Psychological Health from Childhood Essays

Hierarchical neural model with attention mechanisms for the classification of social media text related to mental health

Current and Future Psychological Health Prediction using Language and Socio-Demographics of Children for the CLPysch 2018 Shared Task

Predicting Psychological Health from Childhood Essays with Convolutional Neural Networks for the CLPsych 2018 Shared Task (Team UKNLP)

A Psychologically Informed Approach to CLPsych Shared Task 2018

Predicting Psychological Health from Childhood Essays. The UGent-IDLab CLPsych 2018 Shared Task System.

Can adult mental health be predicted by childhood future-self narratives? Insights from the CLPsych 2018 Shared Task

RSDD-Time: Temporal Annotation of Self-Reported Mental Health Diagnoses

A Report on the 2018 VUA Metaphor Detection Shared Task

EMNLP 2018

Fine-Grained Emotion Detection in Health-Related Online Posts

Lessons from Natural Language Inference in the Clinical Domain

Annotation of a Large Clinical Entity Corpus

emrQA: A Large Corpus for Question Answering on Electronic Medical Records

Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets

Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding

Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts

Predicting Factuality of Reporting and Bias of News Media Sources

CARER: Contextualized Affect Representations for Emotion Recognition

Learning Disentangled Representations of Texts with Application to Biomedical Abstracts

Proceedings of the 6th BioASQ Workshop A challenge on large-scale biomedical semantic indexing and question answering

Semantic role labeling tools for biomedical question answering: a study of selected tools on the BioASQ datasets

Extraction Meets Abstraction: Ideal Answer Generation for Biomedical Questions

UNCC QA: Biomedical Question Answering system

Does it care what you asked? Understanding Importance of Verbs in Deep Learning QA System

Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

Revisiting neural relation classification in clinical notes with external information

Supervised Machine Learning for Extractive Query Based Summarisation of Biomedical Data

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Deep learning for language understanding of mental health concepts derived from Cognitive Behavioural Therapy

Investigating the Challenges of Temporal Relation Extraction from Clinical Text

De-identifying Free Text of Japanese Dummy Electronic Health Records

Automatically Detecting the Position and Type of Psychiatric Evaluation Report Sections

Iterative development of family history annotation guidelines using a synthetic corpus of clinical text

CAS: French Corpus with Clinical Cases

Analysis of Risk Factor Domains in Psychosis Patient Health Records

Patient Risk Assessment and Warning Symptom Detection Using Deep Attention-Based Neural Networks

Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction

In-domain Context-aware Token Embeddings Improve Biomedical Named Entity Recognition

Listwise temporal ordering of events in clinical notes

Time Expressions in Mental Health Records for Symptom Onset Extraction

Evaluation of a Sequence Tagging Tool for Biomedical Texts

Learning to Summarize Radiology Findings

Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task

Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018

Changes in Psycholinguistic Attributes of Social Media Users Before, During, and After Self-Reported Influenza Symptoms

Thumbs Up and Down: Sentiment Analysis of Medical Online Forums

Identification of Emergency Blood Donation Request on Twitter

Dealing with Medication Non-Adherence Expressions in Twitter

Detecting Tweets Mentioning Drug Name and Adverse Drug Reaction with Hierarchical Tweet Representation and Multi-Head Self-Attention

Classification of Medication-Related Tweets Using Stacked Bidirectional LSTMs with Context-Aware Attention

Neural DrugNet

Drug-Use Identification from Tweets with Word and Character N-Grams

Automatic Identification of Drugs and Adverse Drug Reaction Related Tweets

Deep Learning for Social Media Health Text Classification

Using PPM for Health Related Text Detection

Leveraging Web Based Evidence Gathering for Drug Information Identification from Tweets

Classification of Tweets about Reported Events using Neural Networks

A Call for Clarity in Reporting BLEU Scores

Findings of the WMT 2018 Biomedical Translation Shared Task: Evaluation on Medline test sets

Ensemble Sequence Level Training for Multimodal MT: OSU-Baidu WMT18 Multimodal Machine Translation System Report

Translation of Biomedical Documents with Focus on Spanish-English

Ensemble of Translators with Automatic Selection of the Best Translation – the submission of FOKUS to the WMT 18 biomedical translation task –

Hunter NMT System for WMT18 Biomedical Translation Task: Transfer Learning in Neural Machine Translation

UFRGS Participation on the WMT Biomedical Translation Shared Task

Neural Machine Translation with the Transformer and Multi-Source Romance Languages for the Biomedical WMT 2018 task

CoNLL 2018

(No medical NLP articles)

COLING 2018

SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions

Improving Feature Extraction for Pathology Reports with Precise Negation Scope Detection

Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text

T-Know: a Knowledge Graph-based Question Answering and Information Retrieval System for Traditional Chinese Medicine

Document Representation Learning for Patient History Visualization

Automatic Curation and Visualization of Crime Related Information from Incrementally Crawled Multi-source News Reports

Language-Based Automatic Assessment of Cognitive and Communicative Functions Related to Parkinson’s Disease

A Rich Annotation Scheme for Mental Events

An Evaluation of Information Extraction Tools for Identifying Health Claims in News Headlines

A Dialogue Annotation Scheme for Weight Management Chat using the Trans-Theoretical Model of Health Behavior Change

The Interplay of Form and Meaning in Complex Medical Terms: Evidence from a Clinical Corpus

A Treebank for the Healthcare Domain

LREC 2018

A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation

Parallel Corpora for the Biomedical Domain

Medical Entity Corpus with PICO elements and Sentiment Analysis

Annotating Spin in Biomedical Scientific Publications : the case of Random Controlled Trials (RCTs)

Visualization of the occurrence trend of infectious diseases using Twitter

Expert Evaluation of a Spoken Dialogue System in a Clinical Operating Room

A Corpus of Drug Usage Guidelines Annotated with Type of Advice

BioRo: The Biomedical Corpus for the Romanian Language

Sharing Copies of Synthetic Clinical Corpora without Physical Distribution — A Case Study to Get Around IPRs and Privacy Constraints Featuring the German JSYNCC Corpus

Biomedical term normalization of EHRs with UMLS

Mining Biomedical Publications With The LAPPS Grid

J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage

BioRead: A New Dataset for Biomedical Reading Comprehension

Medical Sentiment Analysis using Social Media: Towards building a Patient Assisted System

Constructing a Chinese Medical Conversation Corpus Annotated with Conversational Structures and Actions

A Semi-autonomous System for Creating a Human-Machine Interaction Corpus in Virtual Reality: Application to the ACORFORMed System for Training Doctors to Break Bad News

From ‘Solved Problems’ to New Challenges: A Report on LDC Activities

Annotating Reflections for Health Behavior Change Therapy

Construction of the Corpus of Everyday Japanese Conversation: An Interim Report

Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer

Profiling Medical Journal Articles Using a Gene Ontology Semantic Tagger

2019

NAACL 2019

Neural language models as psycholinguistic subjects: Representations of syntactic state

Sentence Embedding Alignment for Lifelong Relation Extraction

Courteously Yours: Inducing courteous behavior in Customer Care responses using Reinforced Pointer Generator Network

Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media

Biomedical Event Extraction based on Knowledge-driven Tree-LSTM

Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction

Multilingual prediction of Alzheimer’s disease through domain adaptation and concept-based language modelling

Inferring Which Medical Treatments Work from Reports of Clinical Trials

Augmenting word2vec with latent Dirichlet allocation within a clinical application

Fast Prototyping a Dialogue Comprehension System for Nurse-Patient Conversations on Symptom Monitoring

Train One Get One Free: Partially Supervised Neural Network for Bug Report Duplicate Detection and Clustering

Applications of Natural Language Processing in Clinical Research and Practice

A Report on the Third VarDial Evaluation Campaign

Permanent Magnetic Articulograph (PMA) vs Electromagnetic Articulograph (EMA) in Articulation-to-Speech Synthesis for Silent Speech Interface

A Survey on Biomedical Image Captioning

Proceedings of the 2nd Clinical Natural Language Processing Workshop

Effective Feature Representation for Clinical Text Concept Extraction

An Analysis of Attention over Clinical Notes for Predictive Tasks

Extracting Adverse Drug Event Information with Minimal Engineering

Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

A Novel System for Extractive Clinical Note Summarization using EHR Data

Study of lexical aspect in the French medical language. Development of a lexical resource

A BERT-based Universal Model for Both Within- and Cross-sentence Clinical Temporal Relation Extraction

Publicly Available Clinical BERT Embeddings

A General-Purpose Annotation Model for Knowledge Discovery: Case Study in Spanish Clinical Text

Predicting ICU transfers using text messages between nurses and doctors

Medical Entity Linking using Triplet Network

Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues

Extracting Factual Min/Max Age Information from Clinical Trial Studies

Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records

Medical Word Embeddings for Spanish: Development and Evaluation

Automatically Generating Psychiatric Case Notes From Digital Transcripts of Doctor-Patient Conversations

Clinical Data Classification using Conditional Random Fields and Neural Parsing for Morphologically Rich Languages

Probing Biomedical Embeddings from Language Models

Modeling performance differences on cognitive tests using LSTMs and skip-thought vectors trained on reported media consumption.

Distantly Supervised Biomedical Knowledge Acquisition via Knowledge Graph Based Attention

Understanding the Polarity of Events in the Biomedical Literature: Deep Learning vs. Linguistically-informed Methods

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence

From News to Medical: Cross-domain Discourse Segmentation

Toward a Computational Multidimensional Lexical Similarity Measure for Modeling Word Association Tasks in Psycholinguistics

Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology

The importance of sharing patient-generated clinical speech and language data

Computational Linguistics for Enhancing Scientific Reproducibility and Reducing Healthcare Inequities

Mental Health Surveillance over Social Media with Digital Cohorts

Overcoming the bottleneck in traditional assessments of verbal memory: Modeling human ratings and classifying clinical group membership

 

 

Trends

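Looking through the list, some themes within medical NLP seem to be easier to work on than others. I noticed the following trends:

  • Psychiatry and dementia are frequently chosen as themes
  • Detection of adverse drug reactions is also a common theme
  • In contrast, there are very few papers related to radiology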