Slavek
October 2, 2019, 4:07pm
1
Hi,
I can see an issue with CSV import in R.
I have some data files provided to me in CSV format. Unfortunately, I cannot import them into R, although when I open them in Excel, save them in Excel format, and import those files, everything is fine. I know I can convert all the files before importing them, but I am just curious what the cause could be.
this doesn't work:
data.source.csv <- read.csv("P:/User/yyy.csv", header = TRUE, sep = ",")
but this is fine:
library(readxl)
data.source.xls <- read_excel("P:/User/yyy.xlsx")
and this is what I get as a result:
data.frame(stringsAsFactors=FALSE,
þÿ.URN. = c("10BE022654416", "10BE022662462", "10BE022001922"),
QUESTION = c("Recommendation", "Recommendation", "Choice Dealer"),
VERBATIM = c("aaa meer!!!!!", "bbb gesteld", "ccceid"),
CONCEPTS = c("-[5|68|180]", "+[7|0|61]\t-[5|61|106]", "+[7|0|107]")
)
any thoughts? Maybe that is related to a weird character in the name of the first variable (þÿ.URN)?
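One way to test that hunch is to look at the code points behind the two stray characters. The sketch below is only an illustration and was not posted in the thread; the variable names are made up. The characters þ and ÿ are U+00FE and U+00FF, and the byte pair FE FF is the byte-order mark that big-endian UTF-16 files often start with, so a Latin-1 reading of such a file would show þÿ in front of the first column name.

```r
# The two characters from the mangled column name, written with escapes
# so the snippet does not depend on the encoding of this script.
stray <- "\u00fe\u00ff"

# Code points of each character; in Latin-1 these equal the byte values.
code_points <- utf8ToInt(enc2utf8(stray))
print(format(as.hexmode(code_points)))
```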
Any chance you can share a link to a sample .csv file that reproduces the issue?
jdb
October 2, 2019, 7:29pm
3
I had this happen to me and the solution was providing the encoding as UTF-16.
Does the following help at all?
data.source.csv <- read.csv("P:/User/yyy.csv", header = TRUE, sep = ",", encoding = "UTF-16")
Slavek
October 3, 2019, 8:46am
4
jdb:
, encoding = "UTF-16"
Thank you but encoding did not help. I still have this error:
Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 4 appears to contain embedded nulls
5: In read.table(file = file, header = header, sep = sep, quote = quote, :
line 5 appears to contain embedded nulls
I don't know how I could attach the file, as only PDFs and pictures are allowed.
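When a file cannot be shared, its first few bytes can still be inspected locally. "Embedded nulls" warnings are what one would expect when UTF-16 text is parsed as single-byte text, because every ASCII character then carries a zero byte. The sketch below is an assumption-laden illustration, not something from the thread: `guess_bom` is a hypothetical helper, and the demo file is a stand-in built on the fly, not the real yyy.csv. It only recognises three common byte-order marks and returns NA for everything else.

```r
# Hypothetical helper (not from the thread): guess an encoding from the
# byte-order mark (BOM) at the start of a file. Returns NA if none matches.
guess_bom <- function(path) {
  head_bytes <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(head_bytes) >= length(sig) &&
      identical(head_bytes[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xEF, 0xBB, 0xBF))) return("UTF-8")
  if (starts_with(c(0xFE, 0xFF))) return("UTF-16BE")
  if (starts_with(c(0xFF, 0xFE))) return("UTF-16LE")
  NA_character_
}

# Stand-in file for the demo: a tiny CSV encoded as big-endian UTF-16
# with a BOM in front (an assumption about what the real file looks like).
demo_path <- tempfile(fileext = ".csv")
demo_body <- iconv("URN,QUESTION\n10BE022654416,Recommendation\n",
                   from = "UTF-8", to = "UTF-16BE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xFE, 0xFF)), demo_body), demo_path)

bom_guess <- guess_bom(demo_path)
print(bom_guess)
```

Running `guess_bom("P:/User/yyy.csv")` on the real file would show whether the same mark is present there.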
Slavek
October 3, 2019, 8:49am
5
You haven't made the file public, we don't have access to it.
Slavek
October 4, 2019, 8:51am
7
Slavek
October 7, 2019, 9:16am
8
I have a feeling this CSV problem is an R issue that can be resolved only by converting the CSV files to Excel before importing them into R, unless there is a package that helps to import weird or corrupted CSV files.
Your CSV file has a rare encoding that I can't identify, but using "utf16" allows you to read the data, although you lose the special characters like "þÿ".
url <- "https://drive.google.com/uc?authuser=0&id=1U5Psb8CxEBYx74RfquGWLlaN1aJWmJmo&export=download"
read.csv(url, header = TRUE, fileEncoding = "utf16")
#>             URN       QUESTION      VERBATIM               CONCEPTS
#> 1 10BE022654416 Recommendation aaa meer!!!!!            -[5|68|180]
#> 2 10BE022662462 Recommendation   bbb gesteld +[7|0|61]\t-[5|61|106]
#> 3 10BE022001922  Choice Dealer        ccceid             +[7|0|107]
Created on 2019-10-07 by the reprex package (v0.3.0.9000)
If you know what the exact encoding is, then you can specify it and have the special characters appear.
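For reference, here is a minimal sketch of why `fileEncoding` may succeed where `encoding` did not. In `read.csv`, `encoding` only marks the strings that were read as Latin-1 or UTF-8 and does not re-encode the input, while `fileEncoding` re-encodes the file as it is read. The demo file below is a made-up stand-in (big-endian UTF-16 with a byte-order mark, which is an assumption about the real file), and the variable names are hypothetical.

```r
# Build a small stand-in CSV: BOM (FE FF) followed by UTF-16BE text.
demo_path <- tempfile(fileext = ".csv")
csv_text <- "URN,QUESTION\n10BE022654416,Recommendation\n"
utf16_body <- iconv(csv_text, from = "UTF-8", to = "UTF-16BE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xFE, 0xFF)), utf16_body), demo_path)

# fileEncoding converts the bytes to the session encoding while reading;
# colClasses keeps the long ID from being turned into a number.
decoded <- read.csv(demo_path, header = TRUE, fileEncoding = "UTF-16",
                    colClasses = "character")
str(decoded)
```

If the þÿ in the original import is indeed a byte-order mark, it is not part of the data, so nothing is lost when it disappears from the column name.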
Slavek
October 8, 2019, 9:41am
10
Oh wow!
That was really weird.
Thank you!
system
Closed
October 15, 2019, 9:41am
11
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.