Package install issue with FSA and Jupyter

Split from Why not r via conda?


I have installed FSA and all its dependencies via:

conda skeleton cran FSA
conda build r-fsa

When trying to run within jupyter lab I get:

Error in library("FSA"): there is no package called ‘FSA’

I also tried

install.packages("FSA", "/users/setegonz/anaconda3/lib/R/library")

Same result

Looking for the package folder this is the path that I found:
/Users/setegonz/anaconda3/pkgs/r-fsa-0.8.22-r351h6115d3f_0

Why is Jupyter not able to use this package?
Any suggestions?

Also, I'm trying to run shapiro.test within conda's Jupyter Lab using the IR kernel:

shapiro.test(data$Baseline.RT) # Not normal
shapiro.test(data$TBPM.RT) # Not normal

and I'm getting this output

Error in is.numeric(x): object of type 'closure' is not subsettable
Traceback:

1. shapiro.test(data$Baseline.RT)
2. stopifnot(is.numeric(x))
3. withCallingHandlers(tryCatch(if (missE) ...elt(i) else eval(cl.i, 
 .     envir = envir), error = function(e) {
 .     e$call <- cl.i
 .     stop(e)
 . }), warning = function(w) {
 .     w$call <- cl.i
 .     w
 . })
4. tryCatch(if (missE) ...elt(i) else eval(cl.i, envir = envir), 
 .     error = function(e) {
 .         e$call <- cl.i
 .         stop(e)
 .     })
5. tryCatchList(expr, classes, parentenv, handlers)
6. tryCatchOne(expr, names, parentenv, handlers[[1L]])
7. value[[3L]](cond)

Expected Behavior

When using Anaconda's RStudio I get:

Shapiro-Wilk normality test
data:  data$Baseline.RT
W = 0.86448, p-value = 0.0001234

Shapiro-Wilk normality test
data:  data$TBPM.RT
W = 0.91325, p-value = 0.003238

So I expect the same output within conda's Jupyter Lab, but nothing works.

I think your problem is more related to the Anaconda distribution than to RStudio tools or R in general. Maybe you would have better luck asking this question on an Anaconda-related forum.

Hi @sete.gonz. Unfortunately, the RStudio community does not have a very good attitude towards the Conda community and will generally misunderstand or balk at offering help.

BTW, I would also encourage you to stay away from Anaconda, which is a commercial product, and stick to the open source Conda (miniconda).

I believe your problem may be that the Jupyter kernel is not running inside your conda environment (even if it is base). If you are not using conda environments, then you are more likely to run into problems with programs like R and Python installed outside of conda due to issues with libraries used during compilation.
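One way to check this from inside a notebook is to ask the running kernel where its R installation and library directories live. The snippet below is only a minimal sketch: it assumes that `CONDA_PREFIX` is set in the kernel's environment (conda sets it on activation, but a kernel launched some other way may not inherit it), and the variable names are just illustrative.

```r
# Run this in a Jupyter cell that uses the R kernel.
r_home    <- R.home()                    # R installation the kernel is actually using
lib_paths <- .libPaths()                 # directories that library() searches
conda_env <- Sys.getenv("CONDA_PREFIX")  # "" if no conda environment is visible

cat("R home:       ", r_home, "\n")
cat("Library paths:", paste(lib_paths, collapse = ", "), "\n")
cat("CONDA_PREFIX: ", if (nzchar(conda_env)) conda_env else "<not set>", "\n")

# system.file() returns "" when the package is not found in any library path.
fsa_visible <- nzchar(system.file(package = "FSA"))
cat("FSA visible:  ", fsa_visible, "\n")
```

If the R home or the library paths printed here are not inside the conda environment you expected, the kernel is probably using a different R than the one FSA was installed into.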

Try this:

  1. Create a conda environment. In that environment:
  2. Install Jupyter
  3. Install R
  4. Install as many of the other R libraries you want to use as possible using conda, including FSA
  5. Install packages for which there are no conda recipes with install.packages(<pkg>, lib="~/R35")
  6. Start Jupyter within that environment.

I'm pretty sure you will not have any trouble.

If you do, and even if you don't, please check out my .Rprofile script that helps switch between conda environments from within R.
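On the shapiro.test error specifically: that message usually means no object called `data` exists in the notebook's R session, so R falls back to the built-in function `data()` and tries to subset it with `$`. That would also fit with it working in RStudio, where the data frame had presumably already been loaded. A minimal sketch of this, where the data frame is made up of random placeholder values and is not the original data:

```r
# With no object named `data` in the session, `data` resolves to the
# built-in function utils::data, and `$` cannot subset a function.
msg <- tryCatch(
  shapiro.test(data$Baseline.RT),
  error = function(e) conditionMessage(e)
)
print(msg)

# Once a data frame named `data` exists, the same call works.
# These values are random placeholders, not the original measurements.
set.seed(1)
data <- data.frame(Baseline.RT = rexp(40), TBPM.RT = rexp(40))
result <- shapiro.test(data$Baseline.RT)
print(result)
```

So it is worth confirming that the cell which reads the data has actually been run in the notebook, for example by checking `class(data)` before calling the test.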


The negativity around *conda in the R community is because it introduces an additional layer of abstraction into managing packages in R. I've been using R for >10 years, and conda with Python for 6 or so. Within the last year I tried installing R and RStudio with conda and eventually just wiped my machine and started over because of the royal mess I had managed to create. I'm not saying that conda won't work. But I am saying that introducing conda does the following:

  1. Increased complexity: forget about following install instructions for anything you find online if those instructions were not written with conda in mind. It just won't work that way using conda.
  2. Increased debugging time: Is this failing because of how conda installed something?
  3. Confusion: can we install a package from github using devtools? I don't honestly know.

Those are pretty high costs to adoption. What are we getting in return? It's not obvious to me.


The "high costs of adoption" ARE because the R community shuns conda and not the other way around. Conda has become the staple of the data science world and is also extremely prominent in the bioinformatics world (thus the bioconda channel of bioconductor packages). I also had challenges having conda and R work well together, until I figured out how to make it work and wrote my startCondaEnv script. Now I have no problems whatsoever.

The main problem that arises using conda and R together is that R and RStudio are very efficient at asking you to update packages and offer a great tool with install.packages. This works great UNLESS you want to integrate with conda. If you say yes when R offers to install or update packages, then you wind up with packages located outside of the conda environment and, worse, compiled against different C and Fortran libraries. Then when you update your conda environment things get even worse.

I solved this problem by ALWAYS using conda for things that conda will install, and also ALWAYS using the lib= option in install.packages for ONLY things conda won't install. It takes discipline, but works fantastically.
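As a rough illustration of that discipline (a sketch only, not the startCondaEnv script itself): the directory name `~/R35` is borrowed from the earlier steps, and `somePkg` is a placeholder package name, so the install line is left commented out.

```r
# Separate library directory for packages that have no conda recipe.
extra_lib <- path.expand("~/R35")
dir.create(extra_lib, showWarnings = FALSE, recursive = TRUE)

# Add it to the library search path alongside the existing libraries.
.libPaths(c(.libPaths(), extra_lib))
print(.libPaths())

# Install ONLY non-conda packages here (needs network access).
# "somePkg" is a placeholder, so the call is commented out.
# install.packages("somePkg", lib = extra_lib)
```

Because the extra directory is appended after the existing paths, packages installed by conda should be found first when the same package exists in both places.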

If I have time I'll expand my script into a full R library that will take care of all of this seamlessly behind the scenes.

On the other hand, if the makers of RStudio, which is basically the maintainer of the R ecosystem, decided to cooperate with the conda world, they would do this, and do a much better job than I can ever hope to.

In the meantime, I've created a solution (after a lot of work) that allows me to use conda as my package manager for all of my R packages and has no conflicts whatsoever.


This sounds like you have some really helpful stuff in store, here. I can only speak for myself here, but I think it's probably representative of this community (as in this site, not R as a whole) that many of us don't have a deep background in conda and R package management, so, for example, what you describe below wasn't something I was aware of.

It's really good to know that bioconductor has a bioconda channel, as that may be a useful resource to point people toward (ourselves included). I think with smaller discourse communities, it's not a matter of balk[ing] at offering help so much as trying to make sure someone is asking a question in a place where they can get the best help possible!

Anyhow, again, this is just my take, but, in the same way that I recommend people ask specific Bioconductor package questions on their community site, or look at Cross Validated for general statistics questions, my aim (at least) is effective triage/trying to help people find those who can help them best.


Hi @mara. I'm not intending to troll your comment. But I think the reason

"that many of us don't have a deep background in conda and R package management"

IS that, despite the fact that there is an entire datascience universe out there using conda, those who guide R are not interested in participating. Otherwise, conda tools would be part of R and RStudio by now.

I compare it to Apple vs. the rest of the world. From the outset, Apple wanted to control both hardware and software. I feel like the makers of R want to make sure that R-users have to stick with their product 100% (or as much as possible). Canonical (which produces Ubuntu), and Anaconda (formerly Continuum Analytics) are both commercial companies that design their open-source products to integrate freely with the rest of the world. Although RStudio does release their products to the community (in a limited form), it is also more insular. It's a choice of business model that probably works well for RStudio, but, IMO, not as well for the datascience community.

Most importantly, I think the use of (conda) environments is an essential component of reproducible research as it provides both a shareable recipe and provenance for reproducing previous analysis. I see no reason why RStudio and the R community in general would want to stay in their walled community other than control and profit.


You should probably be open to the possibility that other communities have solved some of the same problems using different methods and tools. I don't work for RStudio and I don't even maintain any CRAN packages. But I have zero interest in adopting conda for R package management because it does not solve any problems that I have, while it does introduce considerable friction and a learning curve. So I don't mean to dissuade you from advocating for conda or sharing your learnings. But an assertion that anyone who doesn't see it your way is interested only in "control and profit" is, at best, short-sighted.

