I'm dealing with data that is, well, heavily Unicode. Both the column names and the data use characters that are not always Roman, are written in both directions, and some letters even carry more than one diacritical mark. It's a fake class of 35 students, each with names in a couple dozen of the world's languages. My sort-of "research" question is: if you have data like this, what are the best ways to represent it (CSV, JSON, sas7bdat, Excel, &c.)?
So I'm looking at the JSON file, which `jsonlite` reads in with an error. I'm also trying `rjson`, but have run into a problem. According to everything I've read, the correct way to process the file is to read it into an object, then call `as.data.frame()` on that object, to wit:
```r
raw_JSON <- rjson::fromJSON(file = "World Class.json")
rjson_fromJSON <- as.data.frame(raw_JSON)
```
However, this results in a data frame with a single row and 1,400 columns. The JSON file looks fine and reads into other readers fine. After reading it into R, `raw_JSON` appears to have everything correct, but I've only verified that by looking at it in RStudio's Environment pane -- not programmatically.
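For what it's worth, a couple of quick base-R checks can confirm the parse programmatically instead of eyeballing the Environment pane (a minimal sketch, assuming `raw_JSON` has already been read in as above):

```r
str(raw_JSON, max.level = 1)  # is the top level a list of records or a list of columns?
length(raw_JSON)              # 35 would suggest one element per student
lengths(raw_JSON)             # how many fields each top-level element carries
```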
I'm at a loss as to what the problem could be. I asked over at StackOverflow, but the answer I got was "just use jsonlite". I'm not so much looking for a way to read the file in as trying to figure out what's going on here. For example, is `as.data.frame()` not parsing the lists correctly, maybe because of the wild Unicode? I don't think that's it, but I'm grasping at straws here.
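One observation worth noting: this symptom can be reproduced without any Unicode at all. `rjson::fromJSON()` returns a nested list, and `as.data.frame()` applied to a list of lists splices every leaf value into its own column of a one-row data frame, which would turn 35 records of 40 fields each into 1 x 1400. A minimal sketch with hypothetical, plain-ASCII data:

```r
# Two toy "records", shaped like a JSON array of objects.
x <- list(list(name = "Ana", score = 10),
          list(name = "Bo",  score = 11))

as.data.frame(x)
#>   name score name.1 score.1
#> 1  Ana    10     Bo      11
# One row, every leaf its own column -- the same shape as 1 x 1400.

# Stacking the records row-wise instead keeps one row per record.
# This assumes every record has the same fields; check.names = FALSE
# preserves non-ASCII column names as-is.
rows <- lapply(x, as.data.frame, stringsAsFactors = FALSE, check.names = FALSE)
do.call(rbind, rows)
#>   name score
#> 1  Ana    10
#> 2   Bo    11
```

If that assumption matches the file's structure, the 1 x 1400 result would be the flattening behavior of `as.data.frame.list()`, not anything to do with the characters in the data.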
The original data can be found here.