Cannot upload tweets with emojis to MySQL database using RMariaDB

I am trying to gather some tweets using the {rtweet} package and store them in a MySQL database. But I am getting errors whenever I try to upload a tweet that contains emojis.

Here is my code:

# loading packages
library(rtweet)
library(DBI)
library(RMariaDB)
library(dplyr)
library(dbplyr)
library(lubridate)

# create twitter api token
twitterToken <- rtweet_bot(
  api_key = "*****",
  api_secret = "*****",
  access_token = "*****",
  access_secret = "*****"
)

# search tweets
tweets <- search_tweets(q = "beautiful", n = 500, type = "recent", include_rts = FALSE, token = twitterToken)
tweets$Topic <- "beautiful"
tweets$created_at  <- parse_date_time(tweets$created_at, orders = c("%a %b %d %T %z %Y", "%Y-%m-%d %H:%M:%S"))
tweets$screen_name <- users_data(tweets)$screen_name
tweets$status_url <- paste0("https://twitter.com/", tweets$screen_name, "/status/", tweets$id_str)
tweets <- tweets %>% select(Topic, status_url, screen_name, created_at, text, favorite_count, retweet_count)

# upload to database
con <- dbConnect(MariaDB(), dbname = "dbname", username = "username", password = "password", host = "db_host", port = db_port, ssl.ca = "ssl.pem", load_data_local_infile = TRUE)
dbWriteTable(con, "testTweetsDB", tweets, overwrite = TRUE)

This throws the following error:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘dbWriteTable’ for signature ‘"MariaDBConnection", "character", "tweets"’

This is a bit odd, because it used to work before I upgraded my R version and updated all the packages, but it is not the main issue. I can work around it by converting the tweets to a plain data frame first:

tweets <- as.data.frame(tweets)
dbWriteTable(con, "testTweetsDB", tweets, overwrite = TRUE)

This time I get the following error:

Error: Error executing query: Invalid utf8 character string: '@Alyri_tv So beautifull girl '

The string it complains about is the first tweet that contains emojis. It works perfectly fine if I only select tweets that don't have any emojis in them. It works even if the tweets contain Chinese, Korean and other language characters. It is the emojis that are causing the problem.
The default collation for the database is utf8mb4_unicode_ci.

Could you provide an example of such a string? Something like dput(tweets$text[which_emoji]) might help to see whether there is anything unusual in it. I don't think the problem is in how rtweet processes and provides the data. Have you tried inserting other strings with emojis that do not come from rtweet? That would rule rtweet out and help diagnose the problem with a more minimal reproducible example.
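
A test along those lines could look roughly like the sketch below. It assumes `con` is the open connection from the question; `has_4byte_chars`, `try_emoji_write` and the table name `emojiTest` are made-up names for illustration only. The idea behind the first helper is that most emojis lie above U+FFFF and need four bytes in UTF-8, while Chinese and Korean characters generally need three. MySQL's legacy `utf8` character set stores at most three bytes per character, so if any step of the upload is using `utf8` instead of `utf8mb4` (an assumption, not something confirmed here), only the emoji tweets would fail.

```r
library(DBI)

# TRUE for each string that contains a code point above U+FFFF,
# i.e. a character that takes 4 bytes in UTF-8 (most emojis).
has_4byte_chars <- function(x) {
  vapply(
    enc2utf8(x),
    function(s) any(utf8ToInt(s) > 0xFFFF, na.rm = TRUE),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Hypothetical helper: write a hand-made emoji string, with no rtweet involved.
# `con` is assumed to be an open RMariaDB connection; "emojiTest" is a made-up table name.
try_emoji_write <- function(con, table = "emojiTest") {
  test_df <- data.frame(
    text = c("plain text", "smile \U0001F600"),
    stringsAsFactors = FALSE
  )
  # Show which character sets the server and this session report
  print(dbGetQuery(con, "SHOW VARIABLES LIKE 'character_set%'"))
  dbWriteTable(con, table, test_df, overwrite = TRUE)
  dbReadTable(con, table)
}

# Plain ASCII, two Chinese characters, and one emoji
has_4byte_chars(c("plain text", "\u4e2d\u6587", "smile \U0001F600"))
# [1] FALSE FALSE  TRUE
```

If try_emoji_write(con) fails with the same "Invalid utf8 character string" error, rtweet is ruled out and the character sets it prints are the next thing to look at. Running has_4byte_chars(tweets$text) should also show whether the failing rows are exactly the ones with four-byte characters.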
