Save multiple HDF files from an ftp list in R, giving a different name to each according to the name of the ftp link

I have a text list of HDF files I need to download from an ftp server.

This is the (example) structure of the list:

ftp://username:password@ftppath/File#1_hh_mm_ss.HDF
ftp://username:password@ftppath/File#2_hh_mm_ss.HDF
ftp://username:password@ftppath/File#3_hh_mm_ss.HDF
...

I tried to download a single file with this basic script:

url = "ftp://username:password@ftppath/File#3_hh_mm_ss.HDF"
download.file(url, destfile = "Test1.HDF")

What I would like to do is download multiple files at once (i.e., the ones in the list above) and save them automatically, giving each file the name it has in the ftp link (i.e., File#1_hh_mm_ss.HDF, File#2_hh_mm_ss.HDF, File#3_hh_mm_ss.HDF).

Is anyone able to help?

Thanks!

Assuming your basic download script worked, you could try the following. The save_file() function builds a file name by stripping the base path from each .HDF URL, then saves the download under that name. The walk() function passes each element of hdf_files to save_file(). The "Complete!" message is not necessary; it was added to show when the walk completes.

library(tidyverse)

hdf_files = c('ftp://username:password@ftppath/File#1_hh_mm_ss.HDF',
              'ftp://username:password@ftppath/File#2_hh_mm_ss.HDF',
              'ftp://username:password@ftppath/File#3_hh_mm_ss.HDF')

base_path = 'ftp://username:password@ftppath/'

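save_file = function(i) {
  # fixed() matches base_path literally instead of treating it as a regex
  filename = str_replace(i, fixed(base_path), '')
  # mode = 'wb' writes the file as binary, which matters for HDF files on Windows
  download.file(url = i, destfile = filename, mode = 'wb')

  # message letting us know the final file is complete
  if (i == hdf_files[length(hdf_files)]) {print('Complete!')}
}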

walk(hdf_files, save_file)
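
With a long list, a single failed link stops the whole walk, because download.file() raises an error. A minimal sketch of a wrapper that logs the failure and carries on is below; safe_download is a hypothetical name, and mode = "wb" is an assumption that the files should be written as binary.

```r
# Hypothetical wrapper: returns TRUE on success, FALSE if the download failed
safe_download = function(url, destfile) {
  tryCatch({
    # download.file() returns 0 (invisibly) on success
    status = download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
    status == 0
  }, error = function(e) {
    # Report the failure instead of aborting the remaining downloads
    message("Failed: ", url, " (", conditionMessage(e), ")")
    FALSE
  })
}

# Usage (not run), reusing hdf_files from the code above:
# ok = vapply(hdf_files, function(u) safe_download(u, basename(u)), logical(1))
# hdf_files[!ok]   # links that may need another attempt
```

Returning a logical per link makes it easy to retry only the files that did not come through.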

Thanks Scotty, I really appreciate your help! This works great. However, I noticed that the list of files I need to download also includes different FTP URLs, e.g.:

              'ftp://username1:password1@ftppath/File#1_hh_mm_ss.HDF',
              'ftp://username1:password1@ftppath/File#2_hh_mm_ss.HDF',
              'ftp://username2:password2@ftppath/File#3_hh_mm_ss.HDF',
              'ftp://username2:password2@ftppath/File#4_hh_mm_ss.HDF',
              'ftp://username3:password3@ftppath/File#5_hh_mm_ss.HDF',
              'ftp://username3:password3@ftppath/File#6_hh_mm_ss.HDF',
              ...

This makes everything more complicated.

Would it be possible, instead, to download all of the files for each ftp URL?

For example (simplified):


ftp://username1:password1@ftppath/File#1, File#2, File#3, File#4, ... .HDF    #(DOWNLOAD ALL .HDF FILES in the ftp folder)
ftp://username2:password2@ftppath/File#1, File#2, File#3, File#4, ... .HDF    #(DOWNLOAD ALL .HDF FILES in the ftp folder)
ftp://username3:password3@ftppath/File#1, File#2, File#3, File#4, ... .HDF    #(DOWNLOAD ALL .HDF FILES in the ftp folder)
...

Thanks a lot for your help!

Does it work if you assign the filename this way within save_file()? Then you could completely ignore base_path.

filename = str_split(i, '/')[[1]][4]
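
A possible alternative, as a sketch: base R's basename() returns whatever follows the last "/", so it does not depend on the file name being the fourth piece of the split, and it keeps working if the path has subfolders. The names local_name, save_file2 and dest_dir are hypothetical, and mode = "wb" is an assumption that the HDF files should be written as binary.

```r
# Hypothetical helper: the local file name is the last segment of the URL
local_name = function(url) basename(url)

save_file2 = function(url, dest_dir = ".") {
  destfile = file.path(dest_dir, local_name(url))
  # mode = "wb" writes the file as binary
  download.file(url = url, destfile = destfile, mode = "wb")
  invisible(destfile)
}

# Usage (not run): purrr::walk(hdf_files, save_file2)
```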

This is another option:

library(tidyverse)

hdf_files = c('ftp://username1:password1@ftppath/File#1_hh_mm_ss.HDF',
              'ftp://username1:password1@ftppath/File#2_hh_mm_ss.HDF',
              'ftp://username2:password2@ftppath/File#3_hh_mm_ss.HDF',
              'ftp://username2:password2@ftppath/File#4_hh_mm_ss.HDF',
              'ftp://username3:password3@ftppath/File#5_hh_mm_ss.HDF',
              'ftp://username3:password3@ftppath/File#6_hh_mm_ss.HDF')


walk(hdf_files, ~ download.file(.x, destfile = str_extract(.x, "File#\\d+.+$")))

Hi Andresrcs, thanks for your help!

Would it be possible to download all of the files within a given folder (accessible through the ftp link) without specifying the file names?
If so, could these be saved with the original names?

For instance:

ftp://username1:password1@ftppath/    #(DOWNLOAD ALL .HDF FILES within the ftp folder)
ftp://username2:password2@ftppath/    #(DOWNLOAD ALL .HDF FILES within the ftp folder)
ftp://username3:password3@ftppath/    #(DOWNLOAD ALL .HDF FILES within the ftp folder)

Thanks a lot for your help!
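
One possible approach, as an untested sketch: ask each server for a directory listing, keep the names ending in .HDF, and download each one under its original name. This assumes the RCurl package is installed and that the server returns a plain, name-only listing when getURL() is called with dirlistonly = TRUE. The helpers hdf_urls() and download_folder() are hypothetical names, and each folder URL is assumed to end in "/".

```r
# Pure helper: turn a raw name-only listing into full URLs of the .HDF files
hdf_urls = function(listing, folder_url) {
  files = strsplit(listing, "\r?\n")[[1]]
  files = files[grepl("\\.hdf$", files, ignore.case = TRUE)]
  # paste0() would return folder_url itself for an empty vector, so stop early
  if (length(files) == 0) return(character(0))
  paste0(folder_url, files)
}

download_folder = function(folder_url, dest_dir = ".") {
  # Assumption: the server answers with one file name per line
  listing = RCurl::getURL(folder_url, dirlistonly = TRUE)
  for (u in hdf_urls(listing, folder_url)) {
    # basename(u) keeps the original file name; "wb" writes it as binary
    download.file(u, destfile = file.path(dest_dir, basename(u)), mode = "wb")
  }
}

# Usage (not run), with the placeholder folders from the question:
# folders = c('ftp://username1:password1@ftppath/',
#             'ftp://username2:password2@ftppath/',
#             'ftp://username3:password3@ftppath/')
# for (f in folders) download_folder(f)
```

Keeping the listing parser separate from the network call makes it possible to check the filtering on a pasted listing before starting any downloads.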