Install arrow from Posit Package Manager binary on Ubuntu

After failing to build arrow from source, I did manage to install it from the binary with this:

install.packages("arrow", repos = "https://packagemanager.rstudio.com/all/__linux__/jammy/latest")

But I can't load it.

> library("arrow")
Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/arrow/libs/arrow.so':
  /usr/local/lib/R/site-library/arrow/libs/arrow.so: cannot open shared object file: No such file or directory
>

I checked the directory and arrow.so definitely exists and I've set the read/write permission on the file to all but still nothing. What's wrong? Thanks.

My longer post is waiting for approval (?), but the short and sweet solution is

apt-get -y install libcurl4-openssl-dev libssl-dev

EDIT: I put my long answer here: arrow R package binary and system requirements · GitHub

You need some system packages as well that arrow.so is linking to. There are different ways to see what exactly.

  1. You can manually inspect arrow.so using ldd arrow.so to see what is missing.

  2. Some packages, e.g. pak, can look up system requirements. The devel version (from the pak documentation page "All about installing pak") is better at this:

    > pak::pkg_sysreqs("arrow")
    ── Install scripts ────────────────────────────────────── Ubuntu 22.04 ──
    apt-get -y update
    apt-get -y install libcurl4-openssl-dev libssl-dev
    
    ── Packages and their system dependencies ───────────────────────────────
    arrow – libcurl4-openssl-dev, libssl-dev
    
  3. Or, if you use pak to install the package, then it will automatically install the system requirements for you. (This is if you are the root user, or have password-less sudo. Otherwise it'll still print the system packages you need.)

    > pak::pkg_install("arrow")
    ✔ Loading metadata database ... done
    
    → Will install 9 packages.
    → Will download 9 packages with unknown size.
    + R6           2.5.1    [dl]
    + arrow        12.0.1.1 [dl] + ✖ libcurl4-openssl-dev, ✖ libssl-dev
    + assertthat   0.2.1    [dl]
    + bit          4.0.5    [dl]
    + bit64        4.0.5    [dl]
    + magrittr     2.0.3    [dl]
    + purrr        1.0.1    [dl]
    + tidyselect   1.2.0    [dl]
    + withr        2.5.0    [dl]
    → Will install 2 system packages:
    + libcurl4-openssl-dev  - arrow
    + libssl-dev            - arrow
    ℹ Getting 9 pkgs with unknown sizes
    ✔ Got assertthat 0.2.1 (x86_64-pc-linux-gnu-ubuntu-22.04) (52.46 kB)
    [...]
    ✔ Downloaded 9 packages (21.98 MB)in 4.5s
    ℹ Installing system requirements
    ℹ Executing `sh -c apt-get -y update`
    ℹ Executing `sh -c apt-get -y install libcurl4-openssl-dev libssl-dev`
    ✔ Installed R6 2.5.1  (1.1s)
    ✔ Installed arrow 12.0.1.1  (1.1s)
    [...]
    ✔ 1 pkg + 13 deps: kept 5, added 9, dld 9 (21.98 MB) [17.5s]
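
The manual ldd inspection from option 1 can be narrowed down to just the unresolved libraries. This is only a sketch: the helper name `missing_libs` is made up for illustration, and it assumes ldd marks unresolved libraries with `=> not found`, as glibc's ldd does.

```shell
#!/usr/bin/env bash
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it also works on output saved to a file.
# (missing_libs is a hypothetical helper name, not part of any tool.)
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Possible usage with the path from this thread:
#   ldd /usr/local/lib/R/site-library/arrow/libs/arrow.so | missing_libs
```

If this prints nothing, every library resolved and the load failure has another cause.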
    

Thank you. So much interesting stuff but no luck. I DO have the libs installed.

root@ip-172-31-35-136:/home/ubuntu# apt-get -y install libcurl4-openssl-dev libssl-dev
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libcurl4-openssl-dev is already the newest version (7.81.0-1ubuntu1.13).
libssl-dev is already the newest version (3.0.2-0ubuntu1.10).
0 upgraded, 0 newly installed, 0 to remove and 25 not upgraded.
root@ip-172-31-35-136:/home/ubuntu#

And apparently pak hangs when building arrow from source, just like install.packages() does (which is why I downloaded the binary). There's this:

> pak::sys.reqs("arrow")
Error: 'sys.reqs' is not an exported object from 'namespace:pak'

> pak::pkg_install("arrow")
✔ Loading metadata database ... done

→ Will update 1 package.
→ The package (4.10 MB) is cached.
+ arrow 12.0.1.1 → 12.0.1.1 [bld][cmp] + ✔ libcurl4-openssl-dev, ✔ libssl-dev
? Do you want to continue (Y/n) y
ℹ No downloads are needed, 1 pkg (4.10 MB) is cached
ℹ Building arrow 12.0.1.1
⸨████████████ ⸩ | 📦  14/15 ⠼ 1 | ✅  14/15     | building arrow

and there it hangs.

So if you have the system packages installed, and the binary still fails to load, then can you run the following from a shell and tell us the output?

ldd /usr/local/lib/R/site-library/arrow/libs/arrow.so

ubuntu@ip-172-31-35-136:~$ ldd /usr/local/lib/R/site-library/arrow/libs/arrow.so
        not a dynamic executable

That does not look right to me. What is that file then? E.g. can you run ls -l and file on it?

Or, if the binary arrow package is installed somewhere else, then can you run ldd on that?

ubuntu@ip-172-31-35-136:/usr/local/lib/R/site-library/arrow/libs$ ls -l
total 46336
-rwxrwxrwx 1 root root 47444208 Jul 28 19:49 arrow.so
ubuntu@ip-172-31-35-136:/usr/local/lib/R/site-library/arrow/libs$ file arrow.so
arrow.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=f5cba9d367a8fd36a1b22b7a395247e723d2ec77, stripped

The binary package is not installed. The build always hangs.

> options(ARROW_R_DEV=TRUE)
> install.packages("arrow")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/arrow_12.0.1.1.tar.gz'
Content type 'application/x-gzip' length 4097373 bytes (3.9 MB)
==================================================
downloaded 3.9 MB

* installing *source* package ‘arrow’ ...
** package ‘arrow’ successfully unpacked and MD5 sums checked
** using staged installation
*** Building on linux aarch64
*** Found local C++ source: 'tools/cpp'
*** Building libarrow from source
    For build options and troubleshooting, see the install guide:
    https://arrow.apache.org/docs/r/articles/install.html
**** cmake: /usr/bin/cmake
**** arrow

and there it sits.

I am sorry, I don't understand. So if you run this below, that hangs?

I also don't get how file says that the file is an ELF 64-bit LSB shared object and then ldd says that it is not a dynamic executable?

and there it sits.

Is it possible that it is just taking very long? If you look at all the processes, can you see compiler processes compiling arrow? On a relatively modern machine it often takes 20-30 minutes to compile arrow for me.

Yes, but I think I solved my problem. I was running on an EC2 ARM architecture instance, because it's cheaper. I switched to x86 and used r4u to get the Ubuntu binaries rather than compile, and that worked. Thanks for your patient replies.
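
For anyone hitting the same symptom: the outputs above are consistent with an architecture mismatch, since file reports an x86-64 shared object while the source build log says "Building on linux aarch64". ldd typically prints "not a dynamic executable" for a library built for a different CPU. A minimal sketch of a check follows, assuming the file description contains either `x86-64` or `aarch64`. The helper names `elf_arch` and `arch_matches` are made up for illustration.

```shell
#!/usr/bin/env bash
# Map the architecture string printed by `file` to the name `uname -m` uses.
# Only the two architectures that appear in this thread are handled.
# (elf_arch and arch_matches are hypothetical helper names.)
elf_arch() {
  case "$1" in
    *x86-64*)  echo x86_64 ;;
    *aarch64*) echo aarch64 ;;
    *)         echo unknown ;;
  esac
}

# Succeeds if the ELF description ($1) matches the machine name ($2).
arch_matches() {
  [ "$(elf_arch "$1")" = "$2" ]
}

# Possible usage with the path from this thread:
#   so=/usr/local/lib/R/site-library/arrow/libs/arrow.so
#   arch_matches "$(file -b "$so")" "$(uname -m)" \
#     || echo "arrow.so was built for a different CPU architecture"
```

On the ARM instance described here, `uname -m` would be expected to print aarch64, so the check would fail for the x86-64 binary.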

Right, you were probably using a small instance, and compiling arrow takes very long, especially without a lot of memory. That's why the compilation was hanging.
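
To tell a slow build from a stuck one, it may help to look for compiler processes from a second shell while the install runs. The sketch below filters `ps -eo pid,etime,comm` output. The helper name `compiler_procs` is made up, and the list of process names is an assumption about a typical GCC/CMake toolchain.

```shell
#!/usr/bin/env bash
# Keep only compiler and build-tool rows from `ps -eo pid,etime,comm` output.
# Prints: PID, elapsed time, command name.
# (compiler_procs is a hypothetical helper; the name list is an assumption.)
compiler_procs() {
  awk '$3 ~ /^(cc1plus|cc1|c[+][+]|g[+][+]|gcc|cmake|make|ninja)$/ { print $1, $2, $3 }'
}

# Possible usage while the install appears to hang:
#   ps -eo pid,etime,comm | compiler_procs
#   free -h    # how much memory is still available
```

If compiler processes show up and their elapsed time keeps growing, the build is probably still working rather than hung.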

Btw, if you are using x86_64 already, you can also use PPPM (https://packagemanager.posit.co/) and pak to install binary R packages, together with their system requirements, automatically.
