How to delimit data using a comma, but not a comma with a space afterwards (SOLVED)

Is there a way to delimit based only on commas without spaces after them?

I have been looking but cannot find an answer to this, so I decided to make a post. I am trying to split the ".txt" files I have into columns, using commas as the delimiter. There are hundreds of these files, and each has a header row of column names. However, one of the columns contains sentences, and some of those sentences include commas followed by a space. Each of those commas splits the sentence across the next column(s) and pushes the values from the last columns into the row below. The next record then starts on the row after that.

I would manually remove the commas from the sentences, but there are so many files/lines that there has to be an easier way.

How does your file layout differ from this example?

# file1
# V1,V2
# foo, i am a "strange" loop

# space separated, escapes embedded quotes
readr::read_csv("~/Desktop/file1.csv")
#> Rows: 1 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): V1, V2
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 1 × 2
#>   V1    V2                 
#>   <chr> <chr>              
#> 1 foo   i am a strange loop

# file2
# V1,V2
# "foo", "i am a "strange" loop"
readr::read_csv("~/Desktop/file2.csv")
#> Rows: 1 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): V1, V2
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 1 × 2
#>   V1    V2                 
#>   <chr> <chr>              
#> 1 foo   i am a strange loop

The data looked like this: "1,00:30,2021-09-09 10:35:31.804,Fire Scout A,UAV,7,360,-1420,Find the two cars parked next to each other, isolated from the group,correct". The extra comma in this case is the one in "other, isolated". The difficulty was that only some entries have extra commas, and there are 30 participants with 3 logs each, each log containing 20-40 entries. I ended up using gsub() and some loops to replace the extra commas with just a space, and then I was able to use the remaining commas as the delimiters to separate the entries into columns.
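
For anyone landing here later, a minimal sketch of that approach. It is not the exact code used: the header names and the second data row are made up for illustration, and it assumes a comma followed by a space never marks a real column boundary.

```r
# Hypothetical input modelled on the entry quoted above. The header names
# and the second data row are invented for this example.
lines <- c(
  "id,elapsed,timestamp,agent,type,a,b,c,instruction,result",
  "1,00:30,2021-09-09 10:35:31.804,Fire Scout A,UAV,7,360,-1420,Find the two cars parked next to each other, isolated from the group,correct",
  "2,00:45,2021-09-09 10:35:46.120,Fire Scout A,UAV,7,360,-1420,Find the red truck,correct"
)

# Replace every comma-plus-space with a single space, so that only the
# real delimiters (commas with no space after them) remain.
cleaned <- gsub(", ", " ", lines, fixed = TRUE)

# Parse the cleaned lines as ordinary CSV.
logs <- read.csv(text = cleaned, stringsAsFactors = FALSE)

# Alternative that keeps the comma inside the sentence: split only on
# commas that are NOT followed by a space (negative lookahead, needs perl = TRUE).
fields <- strsplit(lines[2], ",(?! )", perl = TRUE)[[1]]
```

With real files, `lines` would come from `readLines()` on each path returned by `list.files()`, so the same steps can be applied to every log without editing anything by hand.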
