Import data set from CSV file

Hello all,
First message here. I'm a French student, and I'm working with RStudio in an analytics class. We started 2 weeks ago, but since then I haven't been able to make R work properly. My problem is so simple that it's almost stupid: when I try to import a CSV file that my teacher gave to the class, I get an error message.


https://imgur.com/a/JqVVbj4

Any help?

One guess would be that it's an encoding problem with your file path.

Could you try reading the file from a location whose path doesn't contain the special character ç (c-cedilla)?


Another possible problem is the decimal point representation (typically "," in France) which could interfere with the processing of a comma delimited file.

Can you open the file in a simple text editor to see how the data is stored?
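If a text editor isn't handy, a rough equivalent is to print the first few raw lines from R without parsing them. This is only a sketch: `peek_lines` is a hypothetical helper name, and the demo writes its own small file with made-up values so it can be run anywhere.

```r
# Hypothetical helper: return the first n raw lines of a text file, unparsed.
peek_lines <- function(path, n = 5) {
  readLines(path, n = n, warn = FALSE)
}

# Demo file with made-up values in a French-style layout
# (semicolon separator, decimal comma).
demo <- tempfile(fileext = ".csv")
writeLines(c("city;population", "Paris;2,1", "Lyon;0,5"), demo)

first_lines <- peek_lines(demo)
print(first_lines)
```

Looking at the raw lines shows directly whether the fields are split by commas, semicolons or tabs, and whether the numbers use a decimal comma.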


The system won't let me save my file anywhere above "François": (C:) and Users are not available as save locations.
As for the comma problem, I don't know, because the csv file is in fact simply an Excel sheet, and I don't have a problem opening it anywhere else.

don't know because in fact the csv file is simply an excel sheet

What is the file extension? .csv or .xlsx / .xls?

Also, can you send a screenshot of the folder where the file is located?

The file is actually saved on my desktop. I tried to download it many times in different places on my PC, to see if it was really a path problem, I always get the same error message.

Does this work?

setwd("C:/Users/François/Desktop")
readr::read_csv("europeancities.csv")

Alternatively, if you shift + right click on the desktop > Open command window, then enter:

Rscript -e "readr::read_csv('europeancities.csv')"

I think that @pomchip might be onto something:

In countries (such as France), where the decimal point is represented by a comma, the separation character is usually a tab. So it is possible that your file is not a real csv file (comma separated values), but a tsv file (tab separated values). Which would explain the error message you are getting: your file might be called "whatever.csv", but it might not be a csv file. How did you create it?

If you open that file, you will be able to see that (and if you aren't sure, you can post a section of it here for us to double-check).

If this were the case, one way to import a tab separated file (.tsv) with base R is:

read.table("file_name.tsv", sep = "\t", header = TRUE)

Or if you want to use the package readr from the tidyverse, then you can run:

readr::read_tsv("file_name.tsv")
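If you aren't sure which separator the file uses, one option is to guess it from the header line and pass it to `read.table()`. The sketch below assumes the header contains no quoted fields with separators inside them; `guess_sep` is a hypothetical helper, and the demo file and its values are made up.

```r
# Hypothetical helper: pick whichever candidate separator occurs most often
# in the first line of the file.
guess_sep <- function(path, candidates = c(",", ";", "\t")) {
  first <- readLines(path, n = 1, warn = FALSE)
  counts <- vapply(
    candidates,
    function(s) lengths(regmatches(first, gregexpr(s, first, fixed = TRUE))),
    integer(1)
  )
  candidates[which.max(counts)]
}

# Demo file with made-up values: semicolon separator, decimal comma.
demo <- tempfile(fileext = ".csv")
writeLines(c("city;population", "Paris;2,1", "Lyon;0,5"), demo)

sep <- guess_sep(demo)
# Assumption: a decimal comma is only plausible when the comma is not the separator.
dec <- if (sep == ",") "." else ","
cities <- read.table(demo, sep = sep, dec = dec, header = TRUE)
str(cities)
```

For the common semicolon plus decimal comma layout, base R also ships `read.csv2()`, which uses those settings by default.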


I don't think the problem comes from the file: all the other students in my class have the same file as me and it works just fine for them. We all downloaded the file in the same way (our teacher posted it on our school's intranet).

I tried entering the command manually, @hughparsonage, and I get what you can see in the picture (the file is not called europeancities anymore as we moved on to another exercise, but the problem remains the same).

Is this possibly due to the 'ç' in François messing with the file path?

Can you try putting the csv file on a USB stick or external hard drive and then loading it from there?

@rywhale You gave me hope, but I get this:

I moved the file to a USB key, and I tried to read it from the USB directory.

I also tried on a friend's PC: no csv file problem there, but a package problem instead. When I try to open the file, a package called "R6" is supposed to be installed, and it is not. When I try to install it, I get this:

Maybe just try the base-R function for reading csv instead of the readr version?

read.csv()

Seems like you have a bunch of overlapping problems... Might be worth trying a fresh install of R and RStudio.

That's about the extent of the advice I can provide, hopefully someone more knowledgeable can help you sort this out if my suggestions don't work.

Bonne chance (good luck)!


There are several things going on:

The problem you got on this picture:

is that your working directory and the directory holding your .csv file are different. Your working directory is your desktop (as you can see from your setwd() line and in the error message), while your file is in a different path (as you can see from the folder navigator window). That explains the error message you were getting. Setting the working directory to the location of your file, or giving the full file path to the import function, would solve this.
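A quick way to check for this kind of mismatch is to ask R where it is looking and whether the file is actually there. This is a minimal sketch: `check_file` is a hypothetical helper, and the demo uses a temporary folder and a made-up file name rather than your real desktop.

```r
# Hypothetical helper: report where R is looking and whether the file is there.
check_file <- function(file_name, dir = getwd()) {
  full_path <- file.path(dir, file_name)
  list(
    working_dir = getwd(),
    full_path   = full_path,
    exists      = file.exists(full_path)
  )
}

# Demo in a temporary folder so nothing on the real desktop is touched.
demo_dir <- tempfile("demo_dir")
dir.create(demo_dir)
write.csv(data.frame(x = 1:3), file.path(demo_dir, "example.csv"),
          row.names = FALSE)

found  <- check_file("example.csv", dir = demo_dir)
absent <- check_file("not_there.csv", dir = demo_dir)
print(found$exists)
print(absent$exists)

# Passing the full path avoids depending on the working directory at all.
dat <- read.csv(found$full_path)
```

If `exists` comes back `FALSE` for the folder you expect, the file name or the folder is not what R thinks it is, independently of any encoding issue.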

But this does not explain the other error messages that you got elsewhere.

Oh, sorry: I missed that one. So ignore my previous message.

Ya. Now that I am looking at this more carefully (sorry for my rushed message above), I think that this is the most likely explanation. There is an error message about some locale issue and another one about an unexpected character... I think that R is not happy with your Windows home path. Playing with the locale might fix this.

Are you using UTF-8? You might not be. And maybe you are the only one in your group having issues because you are the only one with a special character (the c-cedilla, ç) in your name.
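To see what the session is actually using, something along these lines may help. It is a sketch, not a fix: `has_non_ascii` is a hypothetical helper, and the example path is built by hand to resemble the one in this thread.

```r
# Show the character-set part of the current locale and whether R considers
# the session to be running in UTF-8.
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info()[["UTF-8"]])

# Hypothetical helper: TRUE if a single path string contains any non-ASCII
# character (code point above 127).
has_non_ascii <- function(path) {
  any(utf8ToInt(enc2utf8(path)) > 127L)
}

# Example path assembled with intToUtf8() so this script itself stays ASCII-only.
home_like <- paste0("C:/Users/Fran", intToUtf8(231L), "ois/Desktop")
print(has_non_ascii(home_like))
print(has_non_ascii("C:/Users/Public"))
```

Running `has_non_ascii(getwd())` or `has_non_ascii(path.expand("~"))` on the affected machine would show whether the home path is a candidate for the problem.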

The problem you had when trying the USB path could be something else entirely.

OK this looks to be a limitation with readr. Here's a reprex:

library(hutils)
library(readr)
library(data.table)

hutils::provide.dir(problem_path <- file.path(tempdir(), "Fran", intToUtf8(231L), "ois/"))
file.csv <- file.path(problem_path, "mtcars.csv")
readr::write_csv(mtcars, file.csv)

head(readr::read_csv(file.csv))
#> Error in guess_header_(datasource, tokenizer, locale): Cannot read file C:/Users/hughp/AppData/Local/Temp/RtmpiIrWc0/Fran/ç/ois/mtcars.csv: The system cannot find the path specified.
head(data.table::fread(file.csv))
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> 6: 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
head(read.csv(file.csv))
#>    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> 6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Created on 2018-10-06 by the reprex package (v0.2.1)
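Given that read.csv() still works in the reprex above, one possible stopgap is a small wrapper that tries readr first and falls back to base R if that fails. This is a sketch under assumptions: `read_csv_fallback` is a hypothetical helper, it treats any readr error as a reason to fall back, and the demo below uses a plain temporary file rather than a path containing ç.

```r
# Hypothetical helper: try readr::read_csv() if readr is installed, and fall
# back to base read.csv() when readr is missing or raises an error.
read_csv_fallback <- function(path) {
  if (requireNamespace("readr", quietly = TRUE)) {
    result <- tryCatch(
      suppressMessages(readr::read_csv(path)),
      error = function(e) NULL
    )
    if (!is.null(result)) {
      return(as.data.frame(result))
    }
  }
  read.csv(path)
}

# Demo with a plain temporary file.
demo <- tempfile(fileext = ".csv")
write.csv(mtcars, demo, row.names = FALSE)

cars <- read_csv_fallback(demo)
print(dim(cars))
```

The two readers do not guess column types in exactly the same way, so it is worth checking the result with str() after a fallback.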


Might be worth an issue?

I believe it's already been filed. Possibly fixed: https://github.com/tidyverse/readr/pull/838


That said, special characters are not a great idea, either in paths or in file names...

But Windows doesn't exactly help with that.
