I frequently read large .csv files with many columns (more than 300).
I only need about 50 of the columns. Because the column names come from a badly formatted Excel file, I have to locate the columns by number. Counting them from the beginning in a text-file viewer is awkward.
Any ideas on how I can improve this process? Is there a viewer that automatically shows the column numbers?
# Load libraries
library('tidyverse')
# Create dummy data
d = tibble(v_1 = rnorm(10))
for (i in 2:50) {
  d = d %>% mutate(!!str_c('v_', i) := rnorm(10))
}
# See column names and numbers (a tibble keeps the numbers numeric, so they sort correctly)
tibble(number = seq_along(d), name = colnames(d)) %>% View()
Hope it helps.
PS: If this is a recurring task, I would look into writing a regex-based "column finder" function.
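A minimal sketch of what such a "column finder" could look like, using only base R. The function name find_columns, its arguments, and the small example data frame are made up for illustration; adapt the pattern to your real column names.

```r
# Hypothetical helper: return the position and name of every column
# whose name matches a regular expression.
find_columns <- function(data, pattern, ignore_case = TRUE) {
  nms <- colnames(data)
  hits <- grep(pattern, nms, ignore.case = ignore_case)  # integer positions
  data.frame(position = hits, name = nms[hits], stringsAsFactors = FALSE)
}

# Illustrative data, not from the original question
example <- data.frame(
  id = 1:3,
  weight_kg = c(70, 82, 65),
  height_cm = c(170, 181, 160)
)

# Find all columns whose name ends in "_cm"
res <- find_columns(example, "_cm$")
print(res)
```

The returned positions can then be used directly for selection, e.g. example[, res$position, drop = FALSE].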
That gives me a data frame with column names and numbers. That's what I need when I am looking for the position of a specific column within the .csv file.
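For files this wide, it may not even be necessary to load the whole thing first. The following is a rough sketch in base R, assuming a comma-separated file with a single header row: it reads only the header line to build the position/name table, then reads just the wanted columns by position. The temporary file, the variable names, and the chosen positions are placeholders for illustration.

```r
# Create a small stand-in CSV (placeholder for the real file)
path <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12),
          path, row.names = FALSE)

# Read only the first line, as data, so the names are kept exactly as written
header <- read.csv(path, header = FALSE, nrows = 1, colClasses = "character")
col_index <- data.frame(
  position = seq_along(header),
  name = unlist(header, use.names = FALSE)
)
print(col_index)

# Read only the columns at the chosen positions:
# "NULL" skips a column, NA lets read.csv guess its type
keep <- c(2, 4)
classes <- rep("NULL", nrow(col_index))
classes[keep] <- NA
subset_data <- read.csv(path, colClasses = classes)
print(subset_data)
```

Skipping columns through colClasses means the unwanted columns are never stored in the result, which should help with memory when only about 50 of 300+ columns are needed.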