Missing observations

Hi, I am new here.
I am trying to import a zipped CSV file from GitHub. It has 130 observations of 146 variables. However, after reading the file with read.csv, all I can see are 62 observations of 146 variables. Any idea what I did wrong?
Thanks!

No accompanying errors or warnings?

No ... and I can read all observations fine in a browser.

Is it in a public repo on GitHub, so that we can take a look?

Yes: https://github.com/geersk/ClassX/blob/master/testfoot.zip.
For a class I am taking on R programming.
Thanks!

I had to change the URL to point at the actual file contents (/raw/ instead of /blob/), because I think your link goes to the GitHub web page that displays the file rather than to the file itself.

Using this code I was able to get all the contents:

download.file("https://github.com/geersk/ClassX/raw/master/testfoot.zip",
                        "testfoot.zip")

unzip("testfoot.zip")
tf <- unzip("testfoot.zip", list = TRUE)
tf
#          Name Length                Date
# 1 CFB2019.csv  74643 2020-01-15 16:59:00

mycsv <- read.csv(tf$Name)
dim(mycsv)
#[1] 130 146
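
The URL change can be sketched as a small helper. This is only an illustration: `github_raw_url` is a hypothetical name, and the sketch assumes the link follows the `/blob/` pattern of the link posted earlier in the thread.

```r
# Hypothetical helper: turn a GitHub "blob" page link into a link to the file itself.
github_raw_url <- function(url) {
  # sub() replaces only the first match; fixed = TRUE treats "/blob/" as a literal string.
  sub("/blob/", "/raw/", url, fixed = TRUE)
}

github_raw_url("https://github.com/geersk/ClassX/blob/master/testfoot.zip")
#> [1] "https://github.com/geersk/ClassX/raw/master/testfoot.zip"
```

The result is the same address passed to download.file above.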

Thank you so much. I was unable to get the same result. I am on Ubuntu, if that matters.

> download.file("https://github.com/geersk/ClassX/raw/master/testfoot.zip",
+               "testfoot.zip")
trying URL 'https://github.com/geersk/ClassX/raw/master/testfoot.zip'
Content type 'application/zip' length 32814 bytes (32 KB)
==================================================
downloaded 32 KB

> unzip("testfoot.zip")
Warning message:
In unzip("testfoot.zip") : error -1 in extracting from zip file
> tf <- unzip("testfoot.zip", list = TRUE)
> tf
         Name     Length                Date
1 CFB2019.csv 4294967295 2020-01-15 16:59:00
> mycsv <- read.csv(tf$Name)
> dim(mycsv)
[1]  55 146
> View(mycsv)
Warning message:
closing unused connection 3 (/tmp/Rtmp9dX0z1/file206055c9be8b:CFB2019.csv)

For some reason R's unzip() fails on Linux with this file, but extracting it with the system unzip command works normally, so you can use this workaround:

download.file("https://github.com/geersk/ClassX/raw/master/testfoot.zip", "testfoot.zip")
system2("unzip", args = "testfoot.zip")
mydata <- read.csv("CFB2019.csv")
dim(mydata)
#> [1] 130 146
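
In the failing session above, unzip() only raised a warning, and read.csv then quietly read the partly extracted file. A minimal sketch of a guard against that, assuming you would rather stop than continue: `extract_strict` and its `method` parameter are hypothetical names, and `method` is simply passed on to the `unzip` argument of unzip().

```r
# Hypothetical helper: extract a zip archive and stop if unzip() reports any problem.
extract_strict <- function(zipfile, exdir = ".", method = "internal") {
  withCallingHandlers(
    # method = "internal" uses R's built-in extraction code; a program name or
    # path (for example "unzip") asks R to call an external unzip program instead.
    unzip(zipfile, exdir = exdir, unzip = method),
    warning = function(w) {
      # Promote the warning to an error so a partial extraction cannot go unnoticed.
      stop("unzip reported a problem: ", conditionMessage(w), call. = FALSE)
    }
  )
}

# Usage (not run here):
# extract_strict("testfoot.zip")
# mydata <- read.csv("CFB2019.csv")
```

Whether the external method extracts this particular file correctly on a given machine is something to check with dim() afterwards.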

Thank you ... However, I am sorry, I could not make this code work either. I am new to Linux and maybe that is the problem. So: would completing my RStudio course be easier with a MacBook?

Can you explain why you say it is not working? Do you get any error message? I'm also using Ubuntu and this works for me, so it should work for you too.

I just made it work by resaving the data as a text file and reading it with read.csv from the working directory. For my class, however, I must upload the data somewhere public and load it automatically in the code. Any advice on the simplest way to accomplish the latter? Thanks!

Host the raw CSV in a public Git repo?
Or on some other file share, Dropbox maybe?
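
A minimal sketch of the first option, assuming the unzipped CSV is committed to a public repo: read.csv accepts a URL directly, so nothing needs to be downloaded or extracted first. `load_public_csv` is a hypothetical helper, and the address in the usage comment is a placeholder, not a real location.

```r
# Hypothetical helper: read a CSV from a URL (or local path) and check the row count.
load_public_csv <- function(url, expected_rows = NULL) {
  data <- read.csv(url)
  # Optional sanity check, so a truncated file is noticed immediately.
  if (!is.null(expected_rows) && nrow(data) != expected_rows) {
    stop(sprintf("expected %d rows, got %d", expected_rows, nrow(data)), call. = FALSE)
  }
  data
}

# Usage (not run here; replace the placeholder with the raw link to your own file):
# cfb <- load_public_csv("https://raw.githubusercontent.com/<user>/<repo>/master/CFB2019.csv",
#                        expected_rows = 130)
```

The row check is optional, but it would have flagged the 55-row and 62-row results seen earlier in the thread.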
