.Rpres presentation was not converted to the .md file using UTF-8

I have written my first .Rpres presentation file in Chinese. It worked fine for the first few hours, then all of a sudden the preview started showing junk characters. I found that the .md file generated from the .Rpres was labeled as UTF-8, but its contents were not actually UTF-8. So UTF-8 encoding was not used while converting the .Rpres to .md. My presentation file is in the project, and the project setting is already UTF-8. The system locale and language are all set to Chinese UTF-8 correctly.

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936  LC_CTYPE=Chinese (Simplified)_China.936    LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                               LC_TIME=Chinese (Simplified)_China.936    
system code page: 65001

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RMySQL_0.10.23 DBI_1.1.2     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8        lattice_0.20-45   here_1.0.1        png_0.1-7         digest_0.6.29     rprojroot_2.0.2   rappdirs_0.3.3    grid_4.1.2       
 [9] jsonlite_1.7.2    magrittr_2.0.1    evaluate_0.14     rlang_0.4.12      DT_0.20           Matrix_1.3-4      reticulate_1.23   rmarkdown_2.11   
[17] tools_4.1.2       htmlwidgets_1.5.4 rsconnect_0.8.25  xfun_0.29         fastmap_1.1.0     compiler_4.1.2    htmltools_0.5.2   knitr_1.37       
> system('locale')
LANG=zh_CN.UTF-8
LC_CTYPE="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_ALL=
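
One way to confirm the diagnosis described above is to test whether the generated .md file's raw bytes are valid UTF-8, and to re-encode the file if they are not. The following is only a minimal sketch: `is_utf8_file()` and `reencode_to_utf8()` are hypothetical helper names, `slides.md` is a placeholder path, and the source encoding `"GBK"` is an assumption based on the `.936` code page shown in the `sessionInfo()` output.

```r
# Hypothetical helper: TRUE if the file's raw bytes form valid UTF-8.
is_utf8_file <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  # rawToChar() keeps the bytes as-is; validUTF8() checks the byte sequence
  validUTF8(rawToChar(bytes))
}

# Hypothetical helper: re-encode a file from a legacy encoding to UTF-8.
# The default from = "GBK" is an assumption (code page 936), not a known fact
# about how the file was actually written.
reencode_to_utf8 <- function(path, from = "GBK", out = path) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  text <- iconv(rawToChar(bytes), from = from, to = "UTF-8")
  if (is.na(text)) stop("file contents are not valid ", from)
  writeBin(charToRaw(text), out)
  invisible(out)
}

# Example usage (placeholder file name):
# if (!is_utf8_file("slides.md")) reencode_to_utf8("slides.md", from = "GBK")
```

This only repairs the output after the fact; it does not change how the .Rpres file is converted in the first place.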

Hi Golferwu

This sounds like exactly the same problem I have seen, just on a Danish Windows 10 installation.
I was able to solve it by changing the region setting in the Windows Control Panel for "programs that don't use Unicode".

In my case I changed it from Danish to Latin (World) and unchecked "Beta: Use Unicode UTF-8..."

That is how it looks in a Danish installation. I don't know the precise wording in an English installation (or Chinese installation :slight_smile: )

It most likely requires a restart and admin rights.

My guess is that it will also solve the problem in your case. I hope so.

sessionInfo() no longer shows code page 65001. On my PC it looks like this:

locale:
[1] LC_COLLATE=Danish_Denmark.1252  LC_CTYPE=Danish_Denmark.1252   
[3] LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C                   
[5] LC_TIME=Danish_Denmark.1252

Brgds. Henrik

Thanks for the suggestion. I did try that, setting the Windows code page to 437 (English), but after a reboot, sessionInfo() in RStudio still shows system code page 65001! That really puzzles me... Anyway, the problem still exists, and for now I have to find another way to present my material.
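
To re-check the code page after each settings change without reading through the full `sessionInfo()` output, something like the sketch below may help. `encoding_report()` is a hypothetical helper name; it relies on base R's `l10n_info()`, whose `codepage` and `system.codepage` elements are only reported on Windows (and `system.codepage` only on recent R versions), so those fields are `NULL` elsewhere.

```r
# Hypothetical helper: collect the encoding-related settings of this session.
encoding_report <- function() {
  info <- l10n_info()
  list(
    ctype           = Sys.getlocale("LC_CTYPE"),  # e.g. "Danish_Denmark.1252"
    utf8_locale     = isTRUE(info[["UTF-8"]]),    # TRUE if R runs in a UTF-8 locale
    codepage        = info[["codepage"]],         # Windows only, otherwise NULL
    system_codepage = info[["system.codepage"]]   # Windows only, otherwise NULL
  )
}

str(encoding_report())
```

If `system_codepage` still reports 65001 after a reboot, the "Beta: Use Unicode UTF-8..." option discussed in this thread is presumably still active.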

Hi

I don't think that will work.

The standard language in the Windows settings can be whatever you prefer.

But the system locale under "Language for non-Unicode programs" has to be changed to "Latin (World)" (not English or any other code page).

And if it is checked, uncheck "Beta: Use Unicode UTF-8 for worldwide ..."

At least on the two PCs I have tested (both Danish versions of Windows 10), it has worked :slight_smile:

Note: It didn't help here (RStudio can't deal with file names with unicode characters - #16 by mirirai).

... but I would still suggest you first try changing to Latin (World) for "non-Unicode programs".

Brgds. Henrik
