I have been using R Markdown to build reports from Python code for a while, but I recently stumbled across some behaviour that causes problems when other users build the report on their machines.
Specifically, Python packages from outside the virtual environment can be imported when the document is rendered.
I put together a quick script to reproduce this behaviour:
#!/bin/bash
# new virtual environment
# no packages should be installed
virtualenv venv
cat > markdown_test.Rmd << EOF
# Issue
\`\`\`{r}
library(reticulate)
use_virtualenv('./venv')
\`\`\`
In a new environment this should cause an error with missing pandas
\`\`\`{python}
import sys
sys.executable
import pandas
pandas.DataFrame([1,2,3])
print(pandas.__path__)
\`\`\`
does not cause an error
## For info
\`\`\`{r}
sessionInfo()
py_config()
getwd()
\`\`\`
EOF
cat > py_test.py << EOF
import sys
logf = open('output.log','w')
logf.write(str(sys.executable) + '\n')
try:
    import pandas
except Exception as e:
    logf.write(str(e) + '\n')
finally:
    logf.close()
EOF
. venv/bin/activate
Rscript -e "rmarkdown::render('markdown_test.Rmd')"
python py_test.py
# the import fails (the error is written to output.log), but output.log contains the same executable path
deactivate
For me, the log file written by the Python script contains an error saying there is no module named pandas, but in the rendered R Markdown output pandas imports fine. The executable is shown as the one in the virtualenv, yet the path of the imported pandas package is in the Anaconda site-packages directory.
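One way to narrow this down might be to compare what the interpreter sees in each case. The following is only a diagnostic sketch, assuming Python 3; `describe_python_env` is a hypothetical helper name, not part of reticulate or pandas. It reports the executable, the prefixes, the `PYTHONPATH` environment variable, the full `sys.path`, and where a module would be loaded from, without actually importing it.

```python
import importlib.util
import os
import sys


def describe_python_env(module_name="pandas"):
    """Collect the settings that decide where an import is resolved from.

    Hypothetical helper for illustration only.
    """
    # find_spec locates a top-level module without importing it;
    # it returns None when the module cannot be found on sys.path.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": getattr(sys, "base_prefix", sys.prefix),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "sys_path": list(sys.path),
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_python_env().items():
        print("{}: {}".format(key, value))
```

Running the same code both in the `{python}` chunk and through `python` in the activated virtualenv, then comparing the two outputs, should show whether the Anaconda directory comes from an extra `sys.path` entry or from an environment variable such as `PYTHONPATH`.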
Can anyone shed any light on what might be going on here, or on how I can force reticulate to ignore my Anaconda packages and use only those in the virtualenv?