How to parse data frame column with JSON strings (lists of dicts) into data frame column of matrices?


#1

Hi, I have a dataframe column that consists of something closely resembling a python list of dictionaries as strings that contain coordinate points. I would like to parse these such that the output is a new data frame column in which each cell is a two-column matrix (for lat, long, respectively; sample below).

Has anyone come across something like this before? I'm a complete regex novice, so am not sure how to best approach this. Thanks!

path_id <- c("foo","bar")
path_shape <- c("[{\"lat\": 25.2, \"lng\": 55.4}, {\"lat\": 25.3, \"lng\": 55.3}]",
          "[{\"lat\": 25.1, \"lng\": 55.4}, {\"lat\": 25.1, \"lng\": 55.3}, {\"lat\": 25.2, \"lng\": 55.3}, {\"lat\": 25.3, \"lng\": 55.4}]")

path_data <- data.frame(path_id, path_shape, stringsAsFactors=FALSE)

desired_outout_df_cell_sample <- matrix(data = c(25.2,25.3,55.4,55.3), ncol = 2)
desired_outout_df_cell_sample

#2

Isn't your path_shape JSON? At least, jsonlite parses it exactly in the format you need:

path_id <- c("foo", "bar")
path_shape <- c("[{\"lat\": 25.2, \"lng\": 55.4}, {\"lat\": 25.3, \"lng\": 55.3}]", 
  "[{\"lat\": 25.1, \"lng\": 55.4}, {\"lat\": 25.1, \"lng\": 55.3}, {\"lat\": 25.2, \"lng\": 55.3}, {\"lat\": 25.3, \"lng\": 55.4}]")

purrr::map_dfr(path_shape, jsonlite::fromJSON)
#>    lat  lng
#> 1 25.2 55.4
#> 2 25.3 55.3
#> 3 25.1 55.4
#> 4 25.1 55.3
#> 5 25.2 55.3
#> 6 25.3 55.4

#3

@mishabalyasin you're right; I've updated the question subject, sorry for the confusion!

re: your solution, is there a way to keep list items separate such that the output is a vector of length 2, in which the first element is a 2x2 matrix, and the second element is a 4x2 matrix?


#4

Sure:

path_id <- c("foo", "bar")
path_shape <- c("[{\"lat\": 25.2, \"lng\": 55.4}, {\"lat\": 25.3, \"lng\": 55.3}]", 
  "[{\"lat\": 25.1, \"lng\": 55.4}, {\"lat\": 25.1, \"lng\": 55.3}, {\"lat\": 25.2, \"lng\": 55.3}, {\"lat\": 25.3, \"lng\": 55.4}]")

purrr::map(path_shape, jsonlite::fromJSON)
#> [[1]]
#>    lat  lng
#> 1 25.2 55.4
#> 2 25.3 55.3
#> 
#> [[2]]
#>    lat  lng
#> 1 25.1 55.4
#> 2 25.1 55.3
#> 3 25.2 55.3
#> 4 25.3 55.4

#5

Thank you very much! This solves it!