jsonl to dataframe

We have gathered historical tweets about public organisations and want to run a sentiment analysis on them. The file is a ".jsonl" file, meaning that every line is a separate JSON object (see screenshot). We need to convert it to a regular data frame before we can start working on the data. Can somebody please explain which code we need? I can't find clear answers online on how to deal with JSON Lines files.

Kind regards

Hi, welcome!

We don't really have enough info to help you out. Could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

You can find the jsonl-file in the following dropbox: https://www.dropbox.com/s/49k8luoup9a1miu/twitter_premium.jsonl?dl=0

I am still unable to read it as a simple data frame in R and can't find clear answers online.

I can get you this far, but no further. The issue is that each JSON record does not necessarily have the same dimensions as the others, so building one single table to hold them all is fraught. My solution works as far as giving you a list of tables, one for each of the 400 JSON records.


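library(purrr)   # map(), map_dfr()
library(tibble)  # as_tibble()

fileName <- "twitter_premium.jsonl"
# Read the file line by line; each line holds one JSON object
tpj <- readLines(fileName)
# Wrap each line in [ ] so fromJSON() parses it as a one-element array
tpj2 <- paste0("[", tpj, "]")

# One tibble per JSON line
list_of_json <- map(tpj2, ~ as_tibble(jsonlite::fromJSON(.)))

# Combining everything into one table fails because the structures are inconsistent
try_one_table <- map_dfr(tpj2, ~ as_tibble(jsonlite::fromJSON(.)))
# Error: Argument 13 can't be a list containing data frames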
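
One thing that might be worth trying is jsonlite::stream_in(), which is meant for line-delimited JSON and returns a single data frame, filling fields that are missing from a record with NA. The sketch below is only an illustration: it has not been run against twitter_premium.jsonl, and the two sample records and their field names are made up to mimic records with different shapes. Nested objects come back as data frame columns, which jsonlite::flatten() spreads into ordinary columns; fields that hold arrays of objects would probably remain list columns and need separate handling.

```r
library(jsonlite)

# Hypothetical stand-in for the real file: two records with different fields,
# one of which contains a nested object
tmp <- tempfile(fileext = ".jsonl")
writeLines(c(
  '{"id": 1, "text": "hello", "user": {"name": "a", "followers": 10}}',
  '{"id": 2, "text": "world", "lang": "en"}'
), tmp)

# stream_in() parses one JSON object per line into a single data frame
tweets <- stream_in(file(tmp), verbose = FALSE)

# Spread nested data frame columns (here: user) into user.name, user.followers
flat <- jsonlite::flatten(tweets)

str(flat)
```

For the real data, replacing tmp with "twitter_premium.jsonl" should be the only change needed, assuming every line in that file is valid JSON.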
