I have been using read_csv for reading tables, and it has worked like a charm so far. The column type inference is very nice, since it saves me from having to do type conversions.
Except when it behaves a bit too naïvely. Here is an example where some of the date columns are misinferred as dbl:
Warning message:
“One or more parsing issues, call `problems()` on your data frame for details, e.g.:
  dat <- vroom(...)
  problems(dat)”
Rows: 15052 Columns: 286
── Column specification ────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ";"
dbl  (285): min_date, max_date, ...
date   (1): disease_start_date
I know what is triggering this: columns like max_date have NA rows. Only disease_start_date is complete enough to infer the pattern.
Is there anything I can do to make the type inference a bit smarter?