Parse a json field from a csv which shows only the max value among a serie of values

Hi,
I need help on one issue:
I need to parse a json field called "payload" in the csv, and I would like to obtain twon columns for each ID: Counters and last_counters
For each of these two columns I want to see the maximun value.
For example: in the json counters I can have these values: 10, 11, 130, 150.
I want to see only the value 150

at this link you find the csv:

Can you help me plase?
Thanks!

Giada

Hi,

Could you explain in a bit more detail what the final data structure would look like (generate a data frame with some mock values). I'm confused about the columns per ID.

  • Do you want a separate data frame per id, or each ID a new column? The latter is not possible as the columns would be different length.
  • Some IDs don't have last_counters so how would you handle that?
  • What should happen if both are empty?

Again, please generate an example of the final table (or whatever you want to create) so we can see what you want. You can use the reprex guide to help you. A reprex consists of the minimal code and data needed to recreate the issue/question you're having. You can find instructions how to build and share one here:

Hope this helps,
PJ

Thanks for your response!
I want to see a dataframe with three columns:

  • id ---> as-is
  • counters --- > with only the max value of the series
  • last counters --- > with only the max value of the series

I can parse the json with the following code but I cannot how to sho the max value:

install.packages("tidyverse")
library(tidyverse)
setwd("D:/")
statistics_total <- read_delim("statistic_example.csv", delim = ";", quote = """)

statistics_total %>%
separate_rows(payload, sep = ",") %>%
separate(payload, c("payload", "value"), sep = ":") %>%
drop_na() %>%
mutate_if(is.character, ~str_remove_all(., "[\"\s]")) %>%
mutate(value = as.numeric(value)) %>%
pivot_wider(id_cols = id, names_from = payload, values_from = value)

new_dataframe <- statistics_total %>%
separate_rows(payload, sep = ",") %>%
separate(payload, c("payload", "value"), sep = ":") %>%
drop_na() %>%
mutate_if(is.character, ~str_remove_all(., "[\"\s]")) %>%
mutate(value = as.numeric(value)) %>%
pivot_wider(id_cols = appliance_id, names_from = payload, values_from = value)

Hi,

just to be clear: the original file is JSON text pasted in a csv file right? There is no original valid JSON file?

PJ

Yes, right. I made a query from the database and I downloaded it as a csv.

Hi,

How about this:

library(stringr)
library(dplyr)
library(purrr)

#This should be the same as your csv file when read
myData = data.frame(
  stringsAsFactors = FALSE,
                ID = c(1L,2L,3L,4L,5L,6L,7L,8L,
                       9L,10L,11L,12L,13L,14L,15L,16L,17L,18L,19L,
                       20L,21L,22L,23L,24L),
           payload = c("{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"10,65538,244,10,11,13,18\",\"last_counters\":\"10,3,244,10,11,13,18\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}","{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"13908,13105,13876,13414,13880,13876,13873,13364,13413,14137,13112,14137,14133,13107,13876,13366,13107,13105,14129,12897,13366,13365,13413,13619,13618,13625,13616,13620,13414,13410,13621,13113,0,13361,13362,14130,13921,13365,13107\"}",
                       "{\"counters\":\"13868,13105,13876,13414,13880,13876,13873,13364,13413,14137,13112,14137,14133,13107,13876,13366,13107,13105,14129,12897,13366,13365,13413,13619,13618,13625,13616,13620,13414,13410,13621,13113,0,13622,13366,13625,13365,14136,13366\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"12905,14391,25138,78130,13366,12851,77874,78130,13363,13874,13620,25908,13108,12853,14645,77876,13365,26165,90675,13620,14645,91186,79926,77874,12595,12598,13365,25142,13878,14389,14643,25910,0,14133,13877,12343,13110,12852,25136\",\"last_counters\":\"0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}","{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"\",\"last_counters\":\"\"}",
                       "{\"counters\":\"13925,13105,13876,13414,13880,13876,13873,13364,13413,14137,13112,14137,14133,13107,13876,13366,13107,13105,14129,12897,13366,13365,13413,13619,13618,13625,13616,13620,13414,13410,13621,13113,0,13412,13362,13873,13880,13366,14179\"}",
                       "{\"counters\":\"12902,14391,25138,12595,13366,12851,12339,12595,13363,13874,13620,25908,13108,12853,14645,12341,13365,26165,25140,13620,14645,25651,13110,13109,25908,12853,14390,14390,25142,25398,14134,24885,0,14131,12339,13623,25908,25654,13156\"}",
                       "{\"counters\":\"13826,13105,13876,13414,13880,13876,13873,13364,13413,14137,13112,14137,14133,13107,13876,13366,13107,13105,14129,12897,13366,13365,13413,13619,13618,13625,13616,13620,13414,13410,13621,13113,0,14179,13880,13154,14134,13366,13367\"}",
                       "{\"counters\":\"13919,13105,13876,13414,13880,13876,13873,13364,13413,14137,13112,14137,14133,13107,13876,13366,13107,13105,14129,12897,13366,13365,13413,13619,13618,13625,13616,13620,13414,13410,13621,13113,0,13365,13107,13112,14135,13625,13364\"}",
                       "{\"counters\":\"0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0\"}")
)

#Extract the values from the JSON string per line
extractData = map(myData$payload, function(x){
  data = str_match_all(x, ":\"([\\d,\\.]+)\"")[[1]][,2]
  
  #Sometimes there is no last_counters, in that case return NA for that
  if(length(data) == 2){
    list(
      max(str_split(data[1], ",") %>% unlist %>% as.integer()),
      max(str_split(data[2], ",") %>% unlist %>% as.integer())
    )
  } else {
    list(
      max(str_split(data[1], ",") %>% unlist %>% as.integer()),
      NA
    )
  }
})

#Build a data frame with the IDs and the extracted data
myData = data.frame(ID = myData$ID, 
                    maxCounter = sapply(extractData, "[[", 1),
                    maxLastCounter = sapply(extractData, "[[", 2))

There are a LOT of NAs in this data frame, so I'm not sure what you want to use it for but here it is anyway :slight_smile:

Hope this helps,
PJ

1 Like

Thanks! Buy i have Only One question: you insert for each id the value: id=c(1,2,3....) Buy i have a lot of id.. how i can insert the entire column without insert each value manually?
The same thing with counters and last counters....i have around 10.000 rows....

Thanks again

That is just sample data for the sake of making the example reproducible, you are not supposed to manually type those values, you have to use your actual data instead.

1 Like

Perfect, it works! here n example of the output obtained
Many thanks for your help!

image

If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.