State-of-the-art NLP models from R

Nowadays, Microsoft, Google, Facebook, and OpenAI are sharing many state-of-the-art models in the field of Natural Language Processing. However, few materials exist on how to use these models from R. In this post, we will show how R users can access and benefit from these models as well.

Turgut Abdullayev, QSS Analytics - July 29, 2020


The Transformers repository from “Hugging Face” contains many ready-to-use, state-of-the-art models, which are straightforward to download and fine-tune with TensorFlow & Keras.

For this purpose, users usually need to get:

  • The model itself (e.g. BERT, ALBERT, RoBERTa, GPT-2, etc.)
  • The tokenizer object
  • The weights of the model

In this post, we will work on a classic binary classification task and train three models on our dataset.

However, readers should know that one can work with transformers on a variety of downstream tasks, such as:

  1. feature extraction
  2. sentiment analysis
  3. text classification
  4. question answering
  5. summarization
  6. translation and many more.


