How to download a Google Drive's contents based on Drive ID or URL?

tidyverse

#1

Is there a way I can download a drive's full contents based on the Drive ID from the URL? If so, how would I go about doing this?


#2

The googledrive package lets you download files from your Google Drive account programmatically. After you load the package, run drive_find() to list your files; the first call prompts you to give R access and, if you wish, to cache your login credentials for R to use in future sessions. If you're trying this out interactively, you might want to set the n_max argument of drive_find() to a low number, because by default it lists every file in your Google Drive.

drive_download() allows you to download files. See the linked Vignette for details.


#3

Note that you'd have to somehow go through and download the files in bulk, in the same way that people read a batch of csv files in from a folder. With Google native formats, you have to export them to something else; see:

We can download files from Google Drive. Native Google file types (such as Google Documents, Google Sheets, Google Slides, etc.) need to be exported to some conventional file type. There are reasonable defaults or you can specify this explicitly via type or implicitly via the file extension in path.
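As a rough sketch of the two export routes described in that passage, assuming you are already authorized and have a Google Sheet you can reach by name. The function names and the sheet_name argument are hypothetical, made up for illustration; only drive_get() and drive_download() come from googledrive.

```r
library(googledrive)

# `sheet_name` is a hypothetical name of a Google Sheet in your Drive.
export_sheet_as_csv <- function(sheet_name) {
  sheet <- drive_get(sheet_name)
  # Explicit route: name the export format with `type`.
  drive_download(sheet, type = "csv")
}

export_sheet_as_xlsx <- function(sheet_name) {
  sheet <- drive_get(sheet_name)
  # Implicit route: the ".xlsx" extension in `path` selects the export format.
  drive_download(sheet, path = paste0(sheet_name, ".xlsx"))
}
```

Calling either function triggers the usual browser authorization the first time, so nothing is downloaded until you call one of them yourself.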

So, in terms of your broader question, can you download the full contents with a URL, it depends…but there's not a built-in mechanism to do so.


#4

All of the files are in .csv format. How would I go about using Google Drive to download the full contents, or at a minimum, use the ID to format file by file?


#5

I'm actually not sure what ID this is. Because googledrive (the tidyverse package) uses OAuth, I don't think you use this type of ID directly. So I think the answer to your initial question (at least with this package) might be no (@joels, I don't know if there's something else you were thinking of).


#6

@mara, the method I tested for downloading files is as follows:

library(googledrive)
library(tidyverse)

x = drive_find(n_max=10)

map(x$name, drive_download)

This downloaded the first 10 files in my google drive to the local directory.

So, to download all of the files in your google drive, you could do:

map(drive_find()$name, drive_download)

If you have lots of file types and just want to download the csv files:

map(drive_find(type="csv")$name, drive_download)

#7

Thanks Joel. Is there a way to do this on someone else's shared google drive if I have the ID as part of the URL?


#8

I'm not sure what you mean by the ID. As long as you have the google account login credentials you can use the googledrive package to do the download. When you run drive_find() the first time, a browser window will pop up asking you to authorize R to access the account. After that you should be good to go with the method described above.


#9

Yeah, I was speaking specifically to the ID thing. I don't know how that would work with the auth setup.


#10

If I understand correctly, @realhiphop is talking about a file ID? I think we are confused by this wording:

"download a drive's full contents"
"use the ID to format file by file"

Are you asking how to download the "full contents" of a single file on Google Drive, or all files in someone's Google Drive, or all files in a specific folder on Google Drive? Those are three different questions.

To download a file based on its ID or URL:

googledrive::drive_download(googledrive::as_id("YOUR_ID_OR_URL_GOES_HERE"))

What exactly is the ID you are talking about?


#11

Sorry for the lack of clarity everyone!

I have the following URL to the contents of a shared Drive folder. I'd like to be able to download either all of the contents or specific files.

The URL is: https://drive.google.com/drive/folders/0B7tJg2i5HAo2c0VzVFVhLUdQcnM

The ID I was referring to is "0B7tJg2i5HAo2c0VzVFVhLUdQcnM"
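A small sketch of how the URL and the ID relate, assuming googledrive's as_id() is given the URL as a string. As far as I know this step only parses the string, so it should not need a Drive request or sign-in; the variable names are just for illustration.

```r
library(googledrive)

folder_url <- "https://drive.google.com/drive/folders/0B7tJg2i5HAo2c0VzVFVhLUdQcnM"

# as_id() marks the input as an ID and pulls the ID portion out of a URL.
folder_id <- as_id(folder_url)

# Compare the extracted ID against the one quoted above.
as.character(unclass(folder_id)) == "0B7tJg2i5HAo2c0VzVFVhLUdQcnM"
```

So either the full URL or the bare ID can be handed to as_id() before passing it on to other googledrive functions.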


#12

If you use "Add to My Drive", you can map drive_download(), as @joels described.


You'd just map over the files in the folder, which you can get using drive_ls(), which is basically a wrapper around drive_find() that lets you easily specify a folder.


#13

OK that is the "file ID" for a folder on Drive. Here's how to download the csv files within it, based just on the URL.

library(googledrive)
library(purrr)

## store the URL you have
folder_url <- "https://drive.google.com/drive/folders/0B7tJg2i5HAo2c0VzVFVhLUdQcnM"

## identify this folder on Drive
## let googledrive know this is a file ID or URL, as opposed to file name
folder <- drive_get(as_id(folder_url))

## identify the csv files in that folder
csv_files <- drive_ls(folder, type = "csv")

## download them
walk(csv_files$id, ~ drive_download(as_id(.x)))
#> File downloaded:
#>   * steamer_pitchers_2018_opp_batter_splits_preseason_final.csv
#> Saved locally as:
#>   * steamer_pitchers_2018_opp_batter_splits_preseason_final.csv
#> File downloaded:
#>   * steamer_pitchers_2018_ros_split_preseason_final.csv
#> Saved locally as:
#>   * steamer_pitchers_2018_ros_split_preseason_final.csv
#> File downloaded:
#>   * steamer_pitchers_2018_ros_preseason_final.csv
#> Saved locally as:
#>   * steamer_pitchers_2018_ros_preseason_final.csv
#> File downloaded:
#>   * steamer_hitters_2018_ros_split_preseason_final.csv
#> Saved locally as:
#>   * steamer_hitters_2018_ros_split_preseason_final.csv
#> File downloaded:
#>   * steamer_hitters_2018_ros_preseason_final.csv
#> Saved locally as:
#>   * steamer_hitters_2018_ros_preseason_final.csv
#> File downloaded:
#>   * steamer_hitters_2018_ros_multi_split_preseason_final.csv
#> Saved locally as:
#>   * steamer_hitters_2018_ros_multi_split_preseason_final.csv

Created on 2018-10-24 by the reprex package (v0.2.1)


#14

Jenny-

Thank you so much! Worked perfectly on the CSV files. I tried to do a drive download of all files in the folder.

full_drive <- drive_ls(folder)
walk(full_drive$id, ~ drive_download(as_id(.x)))

I'm getting the following error:

Error: The file doesn't seem to have downloaded.

It downloads a bunch of files before hitting this error. Any idea why?


#15

Hard to say. Perhaps you don't have necessary permission for every single file or perhaps there are subfolders. Folders cannot be downloaded. You could wrap the drive_download() call in purrr::safely() and use map() instead of walk() to proceed past failures and store the result. Then you could inspect the failures more closely.
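If subfolders turn out to be the cause, one possible sketch is to drop them before downloading. This assumes folder is a dribble for a single Drive folder (for example from drive_get(as_id(folder_url))) and uses googledrive's is_folder() check; the function name download_non_folders is hypothetical, and this does not address permission problems.

```r
library(googledrive)
library(purrr)

# `folder` is assumed to be a dribble for one Drive folder.
download_non_folders <- function(folder) {
  contents <- drive_ls(folder)
  # is_folder() returns one logical per row; keep only the rows that are not folders.
  files <- contents[!is_folder(contents), ]
  # Download each remaining file by ID into the working directory.
  walk(files$id, ~ drive_download(as_id(.x)))
  invisible(files)
}
```

Files inside the subfolders are not downloaded by this sketch; you would need to list each subfolder separately.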


#16

Thanks Jenny. I tried doing exactly what you said with the corresponding code:

map(other_files$id, ~ purrr::safely(drive_download(as_id(.x))))
File downloaded:
  * Steamer_Projections_2016_with_playingtime
Saved locally as:
  * Steamer_Projections_2016_with_playingtime.xlsx
Error: Can't convert a list to function

I wasn't sure if I formatted your code properly so then I tried:

purrr::safely(map(full_drive$id, ~ drive_download(as_id(.x))))
Saved locally as:
  * Steamer_Batters_2010.xlsx
Error: The file doesn't seem to have downloaded.

This got most of the files, but it stopped before getting all of the files.


#17

purrr::safely() is an adverb. You apply it to a function.

So, more like map(other_files$id, ~ purrr::safely(drive_download)(as_id(.x))) caveat: untested code
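To make the adverb idea concrete, here is a self-contained sketch that runs without Drive access. fake_download() is a hypothetical stand-in for drive_download() that fails on some inputs, and the ids are made up; the point is only the pattern of wrapping the function once and then inspecting the failures.

```r
library(purrr)

# Hypothetical stand-in for drive_download(): fails for ids starting with "bad".
fake_download <- function(id) {
  if (startsWith(id, "bad")) stop("The file doesn't seem to have downloaded.")
  paste0(id, ".csv")
}

ids <- c("file-1", "bad-2", "file-3")

# safely() takes a function and returns a wrapped function that captures errors.
safe_download <- safely(fake_download)
results <- map(ids, safe_download)

# Each element is a list with `result` and `error`; one of the two is NULL.
failed <- map_lgl(results, ~ !is.null(.x$error))

# The ids that failed, and the message attached to each failure.
ids[failed]
map_chr(results[failed], ~ conditionMessage(.x$error))
```

With googledrive, the same shape should apply if fake_download is swapped for something like function(id) drive_download(as_id(id)), though that part is untested here.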