How to read text files (when there are thousands of them) and convert row names into variables? Not able to do it with readr

Could anyone help with how to read the .txt file below (there are thousands of such files) into tabular form and convert the row names into variable names, along with their data types?

Delivery_person_Age 22.000000
Delivery_person_Ratings 4.700000
Restaurant_latitude 18.530963
Restaurant_longitude 73.828972
Delivery_location_latitude 18.560963
Delivery_location_longitude 73.858972
Order_Date 01-03-2022
Time_Orderd 23:35
Time_Order_picked 23:40
Weather conditions Sunny
Road_traffic_density Low
Vehicle_condition 2
Type_of_order Drinks
Type_of_vehicle scooter
multiple_deliveries 0.000000
Festival No
City NaN
Time_taken (min) 22.000000
Name: 13, dtype: object

Is this file complete, and is this exactly how your actual *.txt files are saved?
Your last two lines will cause problems, since the row name Time_taken (min) is poorly chosen (it contains a space). The last line in particular, with its two ":", will break the reading.

However, if you always have 18 rows, you could just do this and skip the last line:

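library(tidyverse)
# Read only the first 18 lines so the trailing "Name: 13, dtype: object" line is skipped
data <- read_delim('given_file.txt', col_names = FALSE, n_max = 18) |>
  # Turn the row names in X1 into column names, with the values taken from X2
  pivot_wider(names_from = X1, values_from = X2) |>
  # Keep only the numeric part of Time_taken
  mutate(Time_taken = str_extract(string = Time_taken, "[0-9\\.,]+"))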
#> Warning: One or more parsing issues, see `problems()` for details
#> Rows: 18 Columns: 2
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: " "
#> chr (2): X1, X2
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
data
#> # A tibble: 1 × 18
#>   Delivery_per…¹ Deliv…² Resta…³ Resta…⁴ Deliv…⁵ Deliv…⁶ Order…⁷ Time_…⁸ Time_…⁹
#>   <chr>          <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
#> 1 22.000000      4.7000… 18.530… 73.828… 18.560… 73.858… 01-03-… 23:35   23:40  
#> # … with 9 more variables: Weather <chr>, Road_traffic_density <chr>,
#> #   Vehicle_condition <chr>, Type_of_order <chr>, Type_of_vehicle <chr>,
#> #   multiple_deliveries <chr>, Festival <chr>, City <chr>, Time_taken <chr>,
#> #   and abbreviated variable names ¹​Delivery_person_Age,
#> #   ²​Delivery_person_Ratings, ³​Restaurant_latitude, ⁴​Restaurant_longitude,
#> #   ⁵​Delivery_location_latitude, ⁶​Delivery_location_longitude, ⁷​Order_Date,
#> #   ⁸​Time_Orderd, ⁹​Time_Order_picked
#> # ℹ Use `colnames()` to see all variable names

Created on 2022-08-31 by the reprex package (v2.0.1)

You can now clean up the columns yourself (e.g. convert them to the correct types instead of chr).
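
Since the question also asks about thousands of files and about data types, here is a minimal sketch of one way to extend this. It is not part of the reprex above: the helper name read_record, the demo_dir folder and the source_file column are made up for illustration, and the demo files only hold a shortened copy of the sample from the question. It assumes that every file has one "name value" pair per line and that the value itself never contains a space, so each line can be split at its last run of whitespace (this also keeps names such as "Weather conditions" and "Time_taken (min)" intact).

```r
library(tidyverse)

# Hypothetical helper (not from the reprex above): parse one file into a one-row tibble.
read_record <- function(path) {
  lines <- str_trim(read_lines(path))
  # Drop blank lines and the pandas footer ("Name: 13, dtype: object").
  lines <- lines[lines != "" & !str_starts(lines, "Name:")]
  # Split at the last run of whitespace: left part is the name, right part the value.
  parts <- str_match(lines, "^(.*\\S)\\s+(\\S+)$")
  tibble(name = parts[, 2], value = parts[, 3]) |>
    pivot_wider(names_from = name, values_from = value)
}

# Demo setup only: write a shortened copy of the sample record into two temporary files.
sample_lines <- c(
  "Delivery_person_Age 22.000000",
  "Order_Date 01-03-2022",
  "Time_Orderd 23:35",
  "Weather conditions Sunny",
  "City NaN",
  "Time_taken (min) 22.000000",
  "Name: 13, dtype: object"
)
demo_dir <- file.path(tempdir(), "delivery_records")
dir.create(demo_dir, showWarnings = FALSE)
write_lines(sample_lines, file.path(demo_dir, "record_1.txt"))
write_lines(sample_lines, file.path(demo_dir, "record_2.txt"))

# With real data, point list.files() at the folder that holds your *.txt files.
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

all_records <- files |>
  set_names(basename(files)) |>            # keep the file name for each record
  map(read_record) |>                      # one one-row tibble per file
  bind_rows(.id = "source_file") |>        # stack them; file name goes into a column
  type_convert(na = c("", "NA", "NaN"))    # let readr guess types, treating NaN as missing

all_records
```

type_convert() only guesses, so columns such as Order_Date may still need an explicit conversion once you know which date format the files use.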

Kind regards
