Yes, RStudio ships with its own copy of pandoc, but R does not. Pandoc is needed by rmarkdown. However, as with other packages that have system requirements, you can install pandoc yourself and everything will work fine (as @rensa explained).

RStudio is an IDE that ships with a lot of features, some of them interactive, that help users do analyses and develop programs. In the end, everything is run by R. Moreover, apart from the IDE options, R options and R-specific environment variables are configured in R, not in RStudio (in the .Rprofile and .Renviron files).

About the command line: to complement @rensa's example, you'll find a very interesting series of posts here

Hope it helps.


Thanks for your reply. Command line utilities are precisely the direction we are headed.

I do understand that scripts run in RStudio are actually running in R, in the end. As I understand it, RStudio is essentially running the R package I already have installed, but with what amounts to an altered configuration (e.g., additional or different folders in the PATH). Ideally, what I'd like is something analogous to dumping the implicit RStudio configuration into a file that I can then call when I run R, so that R runs under the same configuration.

Yes, RStudio runs scripts with R, using the packages configured in the R session; that is not an RStudio option. It is hard to know what RStudio adds that would be so different that you can't run a script outside of RStudio. I have never encountered this so far. Most (if not all) of the configuration of an R session comes from R configuration files, not from RStudio.

Are you trying to anticipate problems, or have you already encountered an issue?

I've encountered the issue already, with the absence of pandoc in the default R installation, while it is in the RStudio installation. I submitted a script that included outputting a plotly interactive graph with htmlwidgets::saveWidget(...). That failed for lack of access to pandoc. I know how to solve that specific problem, but I'm hoping to find a process for avoiding similar issues in the future.
I guess you could say that I'm trying to get my dev (RStudio) and prod (command line Rscript) environments to match.
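
One way to make a batch script fail early, with a clear message, is to check for pandoc before saving the widget. This is only a minimal sketch: it assumes the rmarkdown and htmlwidgets packages are installed and that rmarkdown's pandoc lookup finds the same pandoc that htmlwidgets would use. `save_widget_checked` is a hypothetical helper name, not part of either package.

```r
# Hypothetical helper: stop with a clear message when pandoc is missing,
# instead of failing somewhere inside saveWidget().
save_widget_checked <- function(widget, file) {
  if (!requireNamespace("rmarkdown", quietly = TRUE) ||
      !rmarkdown::pandoc_available()) {
    stop("pandoc not found: install it or set RSTUDIO_PANDOC", call. = FALSE)
  }
  htmlwidgets::saveWidget(widget, file, selfcontained = TRUE)
}
```

Running the same script with Rscript on the prod machine would then report the missing dependency directly.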

I have never had to do this kind of configuration comparison...
When I want to reproduce the dev environment (RStudio) in the prod environment, I use packrat to isolate package dependencies and the R version. Then I install all these packages in the new environment along with their system requirements. As pandoc is a requirement of the rmarkdown package, it gets installed.
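
The packrat workflow mentioned above might look roughly like the sketch below. It assumes packrat is installed on both machines; `restore_project` is a hypothetical wrapper name. Note that packrat only handles R packages, so system requirements such as pandoc still have to be installed separately.

```r
# Dev machine (run once inside the project directory):
#   packrat::init()      # create a private package library for the project
#   packrat::snapshot()  # record package versions in packrat/packrat.lock
#
# Prod machine (after copying the project, including packrat/packrat.lock):
#   restore_project()    # reinstall the recorded package versions

# Hypothetical wrapper for the prod step.
restore_project <- function(project = ".") {
  if (!requireNamespace("packrat", quietly = TRUE)) {
    stop("install packrat first: install.packages('packrat')", call. = FALSE)
  }
  packrat::restore(project = project)
}
```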

If you put together a list of things to be aware of, I'd be interested in seeing it.

Thanks again for your replies. I initially thought that when my script successfully ran library(plotly) (after installing it, if necessary), it meant that whatever plotly might need (like pandoc) was available. I saw that plotly imports htmlwidgets, which is what uses pandoc.

Your mention of packrat and dependencies has me wondering: Did my packagename::functionname() syntax do some sort of end-run around the full htmlwidgets installation? Could I avoid this "runs in RStudio but not R" problem by avoiding that packagename::functionname() syntax?

You mention installing packages with their System Requirement. If I install a package (e.g., htmlwidgets), are the system requirements automatically installed, or do I need to do that separately? Does something tell me what a package's system requirements are? Maybe that's a path I can travel toward a solution - I could define the set of packages my group uses, and make sure all the server and laptops have all of their system requirements.

Still, it seems a shortcoming of RStudio to include, by default, resources that are not in the default R installation yet might be used in R scripts, and not document those differences. If a dev tool is so user-friendly that it lets things run that won't run in prod, it should list the differences someplace obvious.

When you look up a package that's on CRAN, it should have a field called SystemRequirements that describes external dependencies. For example, the CRAN listing for rmarkdown includes:

SystemRequirements: pandoc (>= 1.12.3) - http://pandoc.

But this field isn't checked or acted on in any way, so normally it's up to you to ensure those requirements are met.
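
For packages that are already installed, the same field can be read from within R. A minimal sketch using base R's `utils::packageDescription()`; `system_requirements` is a hypothetical helper name, and the package names in the usage line are only examples.

```r
# Hypothetical helper: return the SystemRequirements field of installed
# packages (NA when a package declares none or is not installed).
system_requirements <- function(pkgs) {
  vapply(pkgs, function(pkg) {
    field <- suppressWarnings(
      utils::packageDescription(pkg, fields = "SystemRequirements")
    )
    as.character(field)[1]
  }, character(1))
}

system_requirements(c("rmarkdown", "htmlwidgets", "stats"))
```

The result is a named character vector, one entry per package, which could be collected for the set of packages a group uses.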

I know some packages, like tinytex, provide additional installation functions to install the external stuff that can't be handled by the usual package installation function. Others just look for the external dependency in places where they would expect to find it on a system, like $PATH on *nix systems, and they let you know if it isn't there.
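
The same kind of PATH lookup can be done by hand with base R's `Sys.which()`, which returns an empty string for a program it cannot find. A small sketch; `check_programs` is a hypothetical helper name and the program names are only examples.

```r
# Hypothetical helper: report which external programs are on the PATH
# of the current R session.
check_programs <- function(programs) {
  found <- unname(Sys.which(programs))
  data.frame(
    program = programs,
    path = found,
    available = nzchar(found),
    stringsAsFactors = FALSE
  )
}

check_programs(c("pandoc", "git"))
```

Running this once in RStudio and once with Rscript shows whether the two sessions see the same external tools.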

I'm not sure of a way to automate this, unfortunately. The reason RStudio can include additional external dependencies is that it has its own installer; I have to assume the R package installation process is intentionally limited for security reasons. I found one (albeit WIP) attempt to automate it, but I imagine it's a massive task to handle any conceivable external dependency. Much easier when you know which ones you're including, as RStudio does :confused:

I think the best you can do here is check the CRAN listings (or GitHub READMEs) of the packages you need in production and ensure that you can manually install any external dependencies beforehand (if the package doesn't provide a way to do it for you).

@RobertMyles

View() is not just for RStudio and it totally predates the IDE. The only thing that function needs is a window system (present by default on Windows, macOS, and on linux as long as you have a desktop environment (e.g. X11)).


You just blew my mind.

(it is ugly af on X11 tho :laughing:)

EDIT: holy moly, and there's data.entry too!


It really is lol. I am using DT to have a quick peek at my data (I am not an RStudio user) because it looks too ugly. But eh! it works :slight_smile: And you can even scroll up/down, left/right with Pgup/Pgdown and Home/End :smile:


And of course, you need to load rmarkdown (which I believe is not necessary if you just click Knit in RStudio)


Ah ah ah :laughing: Who needs Excel/Libreoffice anymore, between the old school data.entry and the new school tribble?


I regularly develop in RStudio and then run the scripts from the R command line. Perhaps you are running into an issue where the R packages are being installed in different locations? That would be related to your .libPaths() setting: run it in RStudio and in command-line R and check that both return the same locations.

If you are running on the same computer, all instances of R should point at the same default R package library, which would include any packages installed from RStudio. (Pandoc itself is not an R package; it is a separate program bundled with RStudio.) What sometimes happens is that R packages get installed into a user-specific library, so when a global R session runs the same script, the packages are not available to it. Perhaps your .Rprofile or .Renviron is set to install packages into a local user's folder, and when base R is run it does not pick up the same environment settings. See ?Startup for how R reads its startup configuration.
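
The .libPaths() check suggested above can be scripted so the two sessions are compared exactly rather than by eye. A minimal sketch in base R; `save_lib_paths` and `compare_lib_paths` are hypothetical helper names and the file names in the comments are placeholders.

```r
# Run in each session with a different file name, e.g.
#   save_lib_paths("libpaths_rstudio.rds")   # inside RStudio
#   save_lib_paths("libpaths_r.rds")         # inside command-line R
save_lib_paths <- function(file) {
  saveRDS(.libPaths(), file)
}

# Then, in either session, list the library folders that only one side has.
compare_lib_paths <- function(file_a, file_b) {
  a <- normalizePath(readRDS(file_a), mustWork = FALSE)
  b <- normalizePath(readRDS(file_b), mustWork = FALSE)
  list(only_in_a = setdiff(a, b), only_in_b = setdiff(b, a))
}
```

Two empty vectors mean both sessions search the same package libraries, in which case the difference lies elsewhere (for example in environment variables).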

But all in all, AFAIK RStudio installs nothing that shouldn't run in command-line R; it just has extra UI tools, such as Shiny gadgets and buttons, which are easier to use in RStudio but underneath are shortcuts for executing R code.


Good to know! I read somewhere before that it was RStudio only, happy to learn that it's not the case :slight_smile:


Supposedly RStudio replaces View() with their own View() (see https://stackoverflow.com/questions/48234850/how-to-use-r-studio-view-function-programatically-in-a-package), which I also didn't know...


So I guess we were both right, depending on the "point of View" :wink:


A great answer was posted on another thread by @nutterb that seems relevant here:


This is specific to the Pandoc issue you have on Windows. Some might think that this is a bit of a hack. I assume that you have RStudio + R installed on all PCs, including the ones you run your batch scripts on.

RStudio comes bundled with Pandoc and adds a few environment variables that are not in a pure R session. You can use Sys.getenv() to see which variables are added in an RStudio session and compare this to a pure R session.

If you save your specific RSTUDIO_PANDOC path as MY_RSTUDIO_PANDOC in a .Renviron file, then you can write your own function to render the Rmd file. In your render function you call Sys.setenv("RSTUDIO_PANDOC" = Sys.getenv("MY_RSTUDIO_PANDOC")), and then the Pandoc bundled with RStudio will be used when rmarkdown::render is called, so you won't need to install pandoc separately. The env var should of course be set by everyone who runs the function, including on the server/PC running the batch jobs.
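
A render function along those lines could look like the sketch below. It assumes rmarkdown is installed and that MY_RSTUDIO_PANDOC has been added to .Renviron; `render_with_rstudio_pandoc` is a hypothetical name.

```r
# In .Renviron (one line; use the value that Sys.getenv("RSTUDIO_PANDOC")
# prints inside RStudio on that machine):
#   MY_RSTUDIO_PANDOC=<path to RStudio's pandoc folder>

# Hypothetical wrapper around rmarkdown::render().
render_with_rstudio_pandoc <- function(input, ...) {
  pandoc_dir <- Sys.getenv("MY_RSTUDIO_PANDOC")
  if (!nzchar(pandoc_dir)) {
    stop("MY_RSTUDIO_PANDOC is not set; add it to .Renviron", call. = FALSE)
  }
  # Point rmarkdown at the pandoc bundled with RStudio for this session.
  Sys.setenv(RSTUDIO_PANDOC = pandoc_dir)
  rmarkdown::render(input, ...)
}
```

The explicit check means a machine without the variable fails with a readable message rather than a pandoc error.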


Thanks. So, to make R aware of RStudio's resources, do I just need to set some variables in the .Renviron file that R uses? If that's true, and someone can point me to the environment variable names, maybe I'll have a systematic solution.

Expanding from your answer, I opened RStudio and ran

sink(file = "~/RStudioEnv.txt")
Sys.getenv()
sink()

then opened R and ran

sink(file = "~/REnv.txt")
Sys.getenv()
sink()

When I compared "~/RStudioEnv.txt" to "~/REnv.txt", I found that RStudio had 13 environment variables that were not in R's environment. Four seemed to point to resources that my R session might need:

  1. RMARKDOWN_MATHJAX_PATH
  2. RSTUDIO_MSYS_SSH
  3. RSTUDIO_PANDOC
  4. RSTUDIO_WINUTILS

(the other nine were: GFORTRAN_STDERR_UNIT, GFORTRAN_STDOUT_UNIT, RS_LOCAL_PEER, RS_RPOSTBACK_PATH, RS_SHARED_SECRET, RSTUDIO, RSTUDIO_SESSION_PORT, RSTUDIO_USER_IDENTITY, SESSIONNAME).
Are there four analogous environment variables that my R session will look for, which I could create using the folder paths the RStudio variables use? Or maybe I can put something in .Renviron that will add those four folder paths to my PATH. If there is a "right" or recommended approach to this kind of thing in R, please let me know.
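
The file comparison described above can also be done inside R, which avoids diffing the sink() output by hand. A minimal sketch in base R; `save_env` and `compare_env` are hypothetical helper names and the file names in the comments are placeholders.

```r
# Run in each session with a different file name, e.g.
#   save_env("env_rstudio.rds")   # inside RStudio
#   save_env("env_r.rds")         # inside plain R
save_env <- function(file) {
  saveRDS(unclass(Sys.getenv()), file)  # named character vector
}

# Then compare the two snapshots by variable name and by value.
compare_env <- function(file_rstudio, file_r) {
  rstudio <- readRDS(file_rstudio)
  plain <- readRDS(file_r)
  shared <- intersect(names(rstudio), names(plain))
  list(
    only_in_rstudio = setdiff(names(rstudio), names(plain)),
    only_in_r = setdiff(names(plain), names(rstudio)),
    different_value = shared[rstudio[shared] != plain[shared]]
  )
}
```

The `different_value` entry is worth checking too, since a variable such as PATH may exist in both sessions but with different contents.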

I would experiment a bit. Be careful not to override all of the RStudio env vars every time you load RStudio; set only the vars needed to render a file without problems in batch.

I have created my own pandoc environment variable and specifically set the RSTUDIO_PANDOC path in the function to render my Rmd file. I use the .Renviron file, but the .Rprofile is also an option.