Abnormal characters changed in readr

readr

#1

I have data that includes the name "Björk", but readr changes it to Bj<f6>rk. Is this intentional, and is there a way to read it exactly as it's written in my CSV file? I don't get this problem with read.csv.

Thanks


#2

This happens because readr assumes by default that your file is encoded as UTF-8. If it isn't, you have to tell readr the actual encoding through the locale argument. See the docs for details:
http://readr.tidyverse.org/articles/locales.html#character
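
To see where the <f6> comes from, here is a rough base-R sketch of the bytes involved. It assumes the file was saved in a single-byte Western encoding; latin1 is used as a stand-in for windows-1252, since both store ö as the byte f6. The variable names are just for illustration.

```r
# "Björk" in UTF-8: ö takes two bytes (c3 b6).
utf8_name <- enc2utf8("Bj\u00f6rk")
utf8_bytes <- charToRaw(utf8_name)
print(utf8_bytes)    # 42 6a c3 b6 72 6b

# The same name converted to latin1 (stand-in for windows-1252):
# ö becomes the single byte f6.
latin1_name <- iconv(utf8_name, from = "UTF-8", to = "latin1")
latin1_bytes <- charToRaw(latin1_name)
print(latin1_bytes)  # 42 6a f6 72 6b

# Those bytes are not valid UTF-8, so a reader that assumes UTF-8
# cannot decode the f6 and shows it escaped instead.
print(validUTF8(rawToChar(latin1_bytes)))  # FALSE
```

So the file itself is probably fine; the reader just needs to be told which encoding to decode it with.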


#3

Thanks, Mara! I didn't know what the terminology of all this was, so I wasn't quite sure what to look up.


#4

:+1: I'm no stranger to that problem! (TFW you can think of the dataviz you want, but don't know the name for it!)


#5

For future reference, I used readr::guess_encoding(), which guessed that my file's encoding was windows-1252, and I'm pretty sure that's right.

So the code would be:

readr::read_csv("filename.csv", locale = readr::locale(encoding = "windows-1252"))
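
If anyone wants to try the whole workflow without the original file, here is a small sketch. It writes a throwaway CSV with windows-1252 bytes to a temp file (the file contents and variable names are made up for the example), inspects the guesses, and then reads the file with the encoding set explicitly. It assumes readr is installed, along with stringi, which guess_encoding() relies on as far as I know. The guesses are only hints and can be unreliable on very small files, so the encoding is passed by hand rather than taken from the guess.

```r
library(readr)

# Build a tiny CSV whose bytes are windows-1252: ö is the single byte 0xf6.
row <- c(charToRaw("Bj"), as.raw(0xf6), charToRaw("rk\n"))
path <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("name\n"), rep(row, 20)), path)

# Inspect readr's guesses: a data frame with `encoding` and `confidence`.
guesses <- guess_encoding(path)
print(guesses)

# Read with the encoding stated explicitly; col_types = "c" means
# one character column and avoids the column-spec message.
artists <- read_csv(
  path,
  col_types = "c",
  locale = locale(encoding = "windows-1252")
)
print(artists$name[1])
```

With the locale set, the name should come back as Björk rather than Bj<f6>rk.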