How to read this parsing issue with read_csv

Hi everyone! Here's the code that's creating a parsing issue, and I'm not sure how to interpret it.

library(tidyverse)

setwd("/Users/myusername/Dropbox/R Scripts/")

po_xtract <- read_csv("purchase-orders_20211020_141215.csv")

This is the result:

I'm not sure how to interpret the parsing issue. I viewed row 126 and compared it to the original file.

po_xtract %>% 
  slice(126) %>% 
  View()

All values in all columns are present.

I'm not sure what the issue is.

Thanks in advance!

The function is reading the first line as a comma delimited header with 56 columns and is not seeing enough commas in the rows indicated.
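One way to check this yourself is to count the fields on each line of the raw file and compare against the header. This is only a minimal sketch in base R: the tiny CSV below is made up for illustration, and with the real file you would pass its path instead of `tmp`.

```r
# Write a small hypothetical CSV whose third line is missing a field.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,item,qty", "1,bolt,10", "2,nut", "3,washer,5"), tmp)

# count.fields() returns the number of fields found on each line of the file.
# comment.char = "" stops "#" from being treated as a comment marker.
n_fields <- count.fields(tmp, sep = ",", quote = "\"", comment.char = "")

# Lines whose field count differs from the header's are the suspects.
# These are line numbers in the file, header included.
bad_lines <- which(n_fields != n_fields[1])
bad_lines
```

Any line number that comes back is worth opening in a text editor rather than a spreadsheet, since a spreadsheet will hide a missing delimiter.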

Thanks technocrat! Is that something I should be worried about? 😅

The problems() listing tells you the rows of the file, as read in, on which there were issues, not the rows of the resulting data frame. That is, row 1 of the file is the header, which doesn't get a row in your data frame, so you should use slice(125).
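To make the row offset concrete, here is a rough sketch with a made-up file. It assumes readr 2.0 or later (for `show_col_types`) and assumes, as described above, that the reported row numbers count the header line. `orders`, `probs`, and `df_rows` are just illustrative names.

```r
library(readr)

# Small hypothetical CSV: file line 3 ("2,nut") is missing its last field.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,item,qty", "1,bolt,10", "2,nut", "3,washer,5"), tmp)

# Read the file; the parsing warning is suppressed here because we inspect
# the details with problems() right after.
orders <- suppressWarnings(read_csv(tmp, show_col_types = FALSE))

# One row per parsing issue, with columns such as row, col, expected, actual.
probs <- problems(orders)
probs

# Assumption: `row` counts the header line, so subtract 1 to get the
# matching row of the data frame.
df_rows <- probs$row - 1
orders[df_rows, ]
```

If the rows you get back look fine, compare them with the raw text of the same lines in the file, because read_csv fills missing trailing fields with NA.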

Yes, it means that lines in the source data are inconsistently formatted.

I had a problem with a file and discovered many of the entries had commas within the columns. I used Excel's find and replace to change the commas to semi-colons (I think there were over 87,000 commas). That solved my problem.
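For what it's worth, a comma inside a value only changes the field count when the value isn't wrapped in quotes, or when the reader ignores quotes. A small base R sketch with invented data shows the difference. `count_with` is a hypothetical helper, not part of any package.

```r
# Two hypothetical lines: the second has a comma inside a quoted value.
lines <- c("id,item,qty", "1,\"bolt, hex\",10")

# Hypothetical helper: count fields per line using a given quote character.
count_with <- function(lines, quote) {
  con <- textConnection(lines)
  on.exit(close(con))
  count.fields(con, sep = ",", quote = quote, comment.char = "")
}

# Quotes respected: the embedded comma stays inside one field.
with_quotes <- count_with(lines, quote = "\"")

# Quotes ignored: the embedded comma splits the value into two fields.
no_quotes <- count_with(lines, quote = "")

with_quotes
no_quotes
```

So if the file quotes its text columns properly, embedded commas should not need replacing. If it doesn't, the extra commas show up as extra fields.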

Thank you for that information! I did check it again but didn't see anything out of the ordinary compared to the other rows. I wish I could recreate the problem using another dataset so I could show you guys.

Thanks for the help technocrat! I wish I could recreate the problem using another dataset so I could show you. I didn't see anything out of the ordinary compared to the other rows. All the information was loaded completely anyway.

Try bringing in the first half only. If that shows a problem, try the first quarter, then the first eighth. If necessary, go on to the second half. The missing comma problem is hard to see, especially if you have quotation marks in the file. BTW: this is called a "quartering search."
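The halving idea can also be automated. Below is a minimal sketch in base R, assuming the problem is a line whose field count differs from the header's. `chunk_ok` and `find_first_bad` are hypothetical helper names and the data is invented. With a real file you would start from `lines <- readLines("your-file.csv")`.

```r
# Hypothetical helper: TRUE if the first n lines (header included) all have
# the same number of fields as the header.
chunk_ok <- function(lines, n) {
  con <- textConnection(lines[seq_len(n)])
  on.exit(close(con))
  counts <- count.fields(con, sep = ",", quote = "\"", comment.char = "")
  all(counts == counts[1], na.rm = TRUE)
}

# Repeatedly halve the range to find the first line that breaks the chunk.
find_first_bad <- function(lines) {
  lo <- 1               # lines[1:lo] is known to be fine (the header alone)
  hi <- length(lines)   # lines[1:hi] is known to contain a problem
  if (chunk_ok(lines, hi)) return(NA_integer_)  # nothing wrong anywhere
  while (hi - lo > 1) {
    mid <- (lo + hi) %/% 2
    if (chunk_ok(lines, mid)) lo <- mid else hi <- mid
  }
  hi  # file line number of the first inconsistent line
}

lines <- c("id,item,qty", "1,bolt,10", "2,nut,4", "3,washer", "4,screw,7")
first_bad <- find_first_bad(lines)
first_bad
lines[first_bad]
```

This only finds the first offending line. After fixing or noting it, you can run the search again on the remaining lines.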

Have you tried the base R "read.csv" function? You could also try opening the file in a LibreOffice spreadsheet and then saving it as a CSV file.
