BoF at rstudio::conf - Natural Language Processing



Natural Language Processing / NLP

Keywords: Natural Language Processing, Natural Language Processing in R, NLP, Text analysis
Facilitated by @julia
When and where: TBD. Likely during a session-break in the BoF Lounge.


If you'd like to get notifications about this group, be sure that you are "Watching" this topic-thread.

If you would like to focus on a specific topic within this category, or ensure you are connecting with the right folks, reply below, discuss, and share widely!

What is a Birds of a Feather Session? Learn more at the BoF Directory


I'm excited about hosting this BoF session at rstudio::conf! :tada: Who thinks they might come, and what kinds of topics would be fun to chat about in a casual, face-to-face setting?


Hey :slight_smile:

I'd be interested in hearing whether anyone's using deep learning for NLP (you guys at SO certainly must be ;-)).

Especially in light of the latest advances in DL for NLP driven by transfer learning (BERT, ULMFiT, ELMo...).


Yep, we are. I just finished running a concept extraction model (BiLSTM-CRF network) this morning to see how this baseline model would handle part of our data. Not too bad for the first try via transfer learning.


Cool! Extracting embeddings from ELMo, then building your own classifier on top, as (probably) done here:
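For anyone new to that "frozen embeddings + your own classifier on top" recipe, here's a minimal sketch. Everything in it is illustrative: the hand-made 3-dimensional vectors stand in for embeddings you'd actually pull from a pretrained model like ELMo, and the hand-rolled logistic-regression head stands in for whatever classifier you'd really fit — it's just meant to show the shape of the approach, not anyone's actual pipeline.

```python
import math

# Toy stand-in for precomputed sentence embeddings. In a real pipeline these
# would come from a frozen pretrained model (e.g. ELMo); here they are small
# hand-made vectors so the sketch runs with no DL dependencies at all.
train = [
    ([0.9, 0.1, 0.2], 1),  # label 1, e.g. "concept present"
    ([0.8, 0.2, 0.1], 1),
    ([0.1, 0.9, 0.7], 0),  # label 0, e.g. "concept absent"
    ([0.2, 0.8, 0.9], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit a simple logistic-regression "head" on top of the frozen embeddings
# with plain gradient descent; the embeddings themselves are never updated.
w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.5
for _ in range(200):
    for x, y in train:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        err = p - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err

def predict(x):
    """Classify a new embedding vector with the trained head."""
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)
```

In practice you'd swap the toy vectors for real contextual embeddings and the hand-rolled head for something like glmnet or a small keras model, but the division of labor is the same: the pretrained model supplies features, and only the lightweight classifier is trained on your data.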


Precisely. I'm using the approach just presented at NeurIPS 2018, described here:


Our application involves free-form medical notes, which are notoriously more difficult (and yield lower accuracy) than the usual NLP datasets. To make things even MORE challenging, our domain is veterinary medicine. So we have text that may look like “the boy just ain’t right”, with no ICD-10 codes for outcomes. :confused:


We've been working with some ULMFiT models internally, which has been really interesting and fun. I'm looking forward to chatting in person at rstudio::conf with some of you about the kinds of work you are doing!


Interesting, Julia. I'm trying to get a feel for the relative merits of ULMFiT compared to ELMo. It's interesting that ELMo seems to benefit from training on a domain-specific dataset... even though it is supposed to be more general than word embeddings. Have you made the same or similar observations with ULMFiT?