Strange locale problems in R after update to Mojave

I'm having strange problems with R after updating to macOS Mojave.

When I start up an RStudio project I get:

During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C" 
2: Setting LC_COLLATE failed, using "C" 
3: Setting LC_TIME failed, using "C" 
4: Setting LC_MESSAGES failed, using "C" 
5: Setting LC_MONETARY failed, using "C" 

Then, calling source() on R files that contain non-English characters, such as accented vowels, raises an error unless I remove the encoding = 'UTF-8' option.

Finally, the same characters get escaped when I use the View function, so àèòù becomes <c3><a0><c3><a8><c3><b2><c3><b9>

How can I solve this? It's pretty annoying, especially when saving and sourcing R files from RStudio, since it adds encoding = 'UTF-8' by default.

Does this happen both in Vanilla R and in RStudio, or only in RStudio?
(see FAQ below for disambiguation)

Self reply to my last post:

defaults write org.R-project.R force.LANG de_DE.UTF-8

does the job. Replace de_DE.UTF-8 with the LANG value reported by running locale in Terminal.
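
For anyone who wants to script this step, here is a minimal sketch that pulls the LANG value out of the locale output and prints the matching defaults write command instead of running it. The function name lang_from_locale_output is made up for illustration, and the sketch assumes locale prints a line of the form LANG="xx_YY.UTF-8" (with or without the quotes).

```shell
#!/bin/sh
# Hypothetical helper: read `locale` output on stdin and print the LANG value.
lang_from_locale_output() {
    awk -F= '$1 == "LANG" { gsub(/"/, "", $2); print $2 }'
}

# Ask the system for its current settings; stays empty if `locale` is missing.
current_lang=$(locale 2>/dev/null | lang_from_locale_output) || current_lang=""

if [ -n "$current_lang" ]; then
    # Printed rather than executed, so nothing changes until you run it yourself.
    echo "defaults write org.R-project.R force.LANG $current_lang"
else
    echo "LANG is empty or unset; pick a locale from 'locale -a' instead" >&2
fi
```

It is worth checking that the printed value also shows up in locale -a before running the command, since forcing a locale that is not installed would likely not help.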

Facing the same problem as above.

Tried in Vanilla R: no problems, works fine.

R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

RStudio: Version 1.1.447 – © 2009-2018

Cannot load my .csv files
error: 1: In read.table(file = file, header = header, sep = sep, quote = quote, :
invalid input found on input connection '../data/fallexport.csv'

Hmm…I haven't upgraded to Mojave yet, so I'm not sure what all would have changed.

Is there any chance using a preview build (https://www.rstudio.com/products/rstudio/download/preview/) resolves this?

Also, I noticed you cross-posted this to StackOverflow:

In the event that you cross-post, we ask that you include the link to the other thread or issue so that there's no unnecessary duplication of effort, in the event that it gets resolved over there.

Thanks

I'm sorry for the cross-posting; at the beginning I didn't realize it was an RStudio-specific problem, but then I noticed that plain R doesn't suffer from it, so I created an issue on GitHub because it looked like a bug to me.

On GitHub they noticed, from an RStudio diagnostic report that I provided, that the locale settings contained an impossible locale:

$LANG
[1] "en_IT.UTF-8"

$LC_CTYPE
[1] "en_IT.UTF-8"

en_IT doesn't make sense because it would mean an Italian version of English. In fact, that locale is not among the locales installed on my machine (as listed by locale -a in Terminal).

The problem is confirmed by running Sys.getenv()
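
The check described above can be sketched as a small shell snippet. The helper name locale_in_list is hypothetical, and the exact-match comparison assumes the macOS naming style (en_US.UTF-8); some other systems list the same locale as en_US.utf8, in which case the names would need normalizing first.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the locale name in $1 is an exact line on stdin.
# -F treats the name as a fixed string, -x requires a whole-line match.
locale_in_list() {
    grep -Fqx -- "$1"
}

# Compare a few candidate names against what the system reports as installed.
for name in en_IT.UTF-8 en_US.UTF-8 it_IT.UTF-8; do
    if locale -a 2>/dev/null | locale_in_list "$name"; then
        echo "$name: installed"
    else
        echo "$name: not installed"
    fi
done
```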

LANG                                           en_IT.UTF-8
LC_CTYPE                                       en_IT.UTF-8

I don't really know where these settings come from.

The workaround to make everything work is using Sys.setlocale(locale = 'en_US.UTF-8'). But I think it's a problem of RStudio interacting with Mojave.

Thanks to @phahn who gave me a useful hint.

If you could post a link to the issue here as well that would be great. It will allow anyone in the future to see whether the issue has been closed, which will help them figure out if they're facing the same problem!

Thanks

Here's the issue: https://github.com/rstudio/rstudio/issues/3587
but they closed it because I had already written here. I don't really agree with closing it, because the issue is real and is not solved; I just found a (not very elegant) workaround (which I posted both here and on GitHub).

UPDATE:

Using WriteXLS::WriteXLS() raises an error even after resetting the locale with Sys.setlocale(locale = 'en_US.UTF-8')

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LC_ALL = (unset),
	LC_CTYPE = "en_IT.UTF-8",
	LANG = "en_IT.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").

Had to output the file using Vanilla R

This is getting annoying...
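
A possible explanation, which nobody in this thread has confirmed: Sys.setlocale() changes the locale of the running R process but leaves the LANG and LC_CTYPE environment variables alone, and a child process such as perl reads those variables when it starts. The sketch below only demonstrates that mechanism from a shell; run_with_locale is a made-up helper name, and en_US.UTF-8 is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: run a command with LANG and LC_CTYPE overridden
# for that command only; the calling shell's environment is untouched.
run_with_locale() {
    loc=$1
    shift
    env LANG="$loc" LC_CTYPE="$loc" "$@"
}

# The child shell reports the values it inherited from the override.
run_with_locale en_US.UTF-8 sh -c 'echo "child sees LANG=$LANG LC_CTYPE=$LC_CTYPE"'
```

From inside R, the analogous step would be calling Sys.setenv(LANG = "en_US.UTF-8", LC_CTYPE = "en_US.UTF-8") before WriteXLS(), though whether that is enough in this case is an assumption.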

Hey @bakaburg1, that sounds like a tough problem :weary: This might be a basic check, but have you checked the Default text encoding option in RStudio's Tools > Global Options > Code > Saving? I would think this would only affect the encoding of files you save in RStudio, but I suppose it can't hurt to check!

EDIT: this support thread was the closest I could find to your problem :frowning: Not sure whether it'll help you if your problem is isolated to RStudio, but :confused:

I also encountered a similar problem. After upgrading to Mojave, and even after clean-installing it,

  • the start message ("R is a free software...") appears in English, not in Japanese.
  • the following warning message appears
During startup - Warning messages:
1: Setting LC_CTYPE failed, using "C"
2: Setting LC_COLLATE failed, using "C"
3: Setting LC_TIME failed, using "C"
4: Setting LC_MESSAGES failed, using "C"
5: Setting LC_MONETARY failed, using "C"
  • I couldn't use Japanese characters in RStudio.

With R-GUI and R running on a macOS terminal, everything is fine.

Adding

Sys.setlocale(category="LC_ALL", locale="ja_JP.UTF-8")

to my .Rprofile didn't solve the problem.

defaults write org.R-project.R force.LANG ja_JP.UTF-8

(at least seemingly) fixed it.

I tried defaults write org.R-project.R force.LANG en_US.UTF-8, and the error at the startup of R projects seems to be gone.

But I still see the same locale setting error when I render rmarkdown files. In one such file I called Sys.setlocale() and got C.

@hideaki: What do you see when you execute:

defaults read NSGlobalDomain AppleLocale

from a terminal? I suspect you're seeing something erroneous like en_JP? It seems like macOS Mojave is setting a strange locale here, and RStudio is inheriting that locale.

I'm getting it_IT, so it looks fine

What do you see when you execute:

defaults read NSGlobalDomain AppleLocale

ja_JP

Dang; thanks. That's not what I expected (I'm surprised that I see en_IT) but I'll see if I can learn more.

I noticed something else too.

I inserted Sys.setlocale(locale = 'en_US.UTF-8') into the first code chunk of the rmarkdown file. If I create a data.frame with special characters, e.g. df <- data.frame(a = 'aaèòà±é'), in that first chunk, where I reset the locale, and then kable() it, I only get aa (that is, the special characters are dropped). If I create it in a subsequent chunk it prints OK.

So the locale reset seems to apply only to subsequent chunks, not to the one in which it is done.

Sorry, I didn't see the post. That setting is on [Ask]. The defaults write org.R-project.R force.LANG en_US.UTF-8 trick seems to have solved the error when sourcing (or at least I believe so, since I'm not seeing it anymore, and I've touched so many things that it's hard to say what fixed it). But I'm still facing other locale problems here and there, as described in my other posts.

Thank you for re-opening the GitHub issue and validating the problem.
I don't think en_IT is odd if you've changed your MacBook's region to Italy and kept English as the primary language.