@filipwastberg You are right that it is possible to distribute Python functions in a Python package similar to how it works with R packages.
Let's say you have a Python module named mypandas.py that contains some custom data analysis functions. You want to make it easier to use them in different projects and also to share them with collaborators. To do this, create a new directory (the name doesn't matter, but I name it mypandas for consistency), move the file there, and create a file called setup.py.
mypandas/
├── mypandas.py
└── setup.py
Then add the following to setup.py:
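# Use setuptools rather than distutils: distutils does not understand
# install_requires and has been removed from Python 3.12.
from setuptools import setup

setup(
    name='mypandas',
    version='0.1.0',
    py_modules=['mypandas'],
    install_requires=['pandas'],
)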
This specifies the following:
- The package name is "mypandas"
- The version is 0.1.0
- The package consists of the Python module mypandas.py
- The package depends on pandas
Then you can run pip install . from inside the mypandas directory to install the package (it will install pandas if it isn't already installed). Now you can run import mypandas from anywhere on your machine, without having to worry about the current working directory or setting PYTHONPATH.
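If you want to double-check that the install worked, a small sketch like the one below can help. It only uses the standard library; the helper name describe_install is my own invention, and it assumes the module and the installed distribution share the same name (true for mypandas here).

```python
import importlib.metadata
import importlib.util


def describe_install(name):
    """Report where module `name` would be imported from and its installed version.

    Assumes the importable module and the installed distribution share one
    name, as they do in the mypandas example.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # not importable from the current environment
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # importable, but no installed distribution metadata
    return {"origin": spec.origin, "version": version}


if __name__ == "__main__":
    # Prints None if mypandas is not installed in this environment.
    print(describe_install("mypandas"))
```

After a successful pip install, the origin should point into your environment's site-packages directory rather than your project folder, and the version should match the one in setup.py.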
To share the package with your colleagues, you can run python setup.py sdist, which writes the file mypandas-0.1.0.tar.gz to the dist/ subdirectory, and then send them that file. They can download it and run pip install mypandas-0.1.0.tar.gz directly (there is no need to extract it first), which will install mypandas and pandas.
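Before sending the archive, it can be worth checking what actually ended up in it. Here is a minimal sketch using the standard-library tarfile module; list_sdist is a hypothetical helper name, and the path in the usage comment assumes the default dist/ output location.

```python
import tarfile


def list_sdist(path):
    """Return the sorted member names of a .tar.gz source distribution."""
    with tarfile.open(path, "r:gz") as archive:
        return sorted(archive.getnames())


# Example usage (assumes you have already built the sdist):
#   for name in list_sdist("dist/mypandas-0.1.0.tar.gz"):
#       print(name)
```

For the example above, I would expect the listing to include mypandas.py and setup.py under a mypandas-0.1.0/ prefix, alongside some generated metadata files.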
This is the simplest case with only one module. Similar to R packages, the organization of Python packages can get complex. To get started with a more complex Python package setup, you can use the Python package cookiecutter (this works similarly to usethis::create_project()):
pip install cookiecutter
cookiecutter https://github.com/audreyr/cookiecutter-pypackage.git
Here are some resources I found useful: