List within Data.Frame still?

Okay, I have been trying my hardest to get this to split out but have no idea what i am doing. I do apologize as I don't have the greatest reproducible example.

This column still has several other items within it that need to be split out into columns with the original dataset.
image

#               SCRIPT for Tenable Connection and Retrieval
headers = c('x-apikey' = 'accesskey=redacted; secretkey=redacted', 'type' = 'vuln')
#connection string, change parameters as needed. This points to 'All NMHS Systems'.
tenable <- httr::POST(url='https://scnessus.redacted.org/rest/analysis?sourceType=cumulative&type=vuln&query[id]=6075',
                      httr::add_headers(.headers=headers), config = httr::config(ssl_verifypeer = FALSE))
#convert from raw text, to JSON, then Data.Frame

 tenable_text <- httr::content(tenable, as = "text")
 tenable_json <- jsonlite::fromJSON (tenable_text, flatten = TRUE)

 
 tenable_json <- lapply(tenable_json$response$results, function(x) { 
   x[sapply(x, is.null)] <- NA
   unlist(x)
 })

 tenable.df <- as.data.frame(tenable_json)

Hi,

Since we can't get to the source data (we don't have the API key and you should not share it), why don't you share a part of the JSON file that then needs to be converted into a data frame. Use the reprex guide to help you share the data more easily, or even just copy paste a sample JSON input here.

A reprex consists of the minimal code and data needed to recreate the issue/question you're having. You can find instructions how to build and share one here:

PJ

I'm going to give this a shot. Here's a sample of the dataset I am using:

pluginID        Name                 vprContext
100             test1    
101             test2
102             test3
103             test4

Under the vprContext column (tried putting it in and it threw off the formatting too much:

[{"id":"age_of_vuln","name":"Vulnerability Age","type":"string","value":"730 days +"},{"id":"cvssV3_impactScore","name":"CVSS v3 Impact Score"}]

and all rows under the vprContext row look like that but way longer.

Hi,

You did not provide the original JSON (tenable_json), so I worked from the data frame you already have after first changes (tenable.df) by recreating it:

library(tidyverse)
library(jsonlite)

#Example of a JSON string
string = '[{"id":"age_of_vuln","name":"Vulnerability Age",
"type":"string","value":"730 days +"},
{"id":"cvssV3_impactScore","name":"CVSS v3 Impact Score"}]'

#Generate data frame as input
myData = tibble(
  pluginID = 1:5, name = paste0("test", 1:5),
  vprContext = c("[]", "[]", string,"[]", string)
)
myData
#> # A tibble: 5 x 3
#>   pluginID name  vprContext                                                     
#>      <int> <chr> <chr>                                                          
#> 1        1 test1 "[]"                                                           
#> 2        2 test2 "[]"                                                           
#> 3        3 test3 "[{\"id\":\"age_of_vuln\",\"name\":\"Vulnerability Age\",\n\"t~
#> 4        4 test4 "[]"                                                           
#> 5        5 test5 "[{\"id\":\"age_of_vuln\",\"name\":\"Vulnerability Age\",\n\"t~

#COnvert the json part to a new data frame
vprDF = apply(myData, 1, function(x){
  y = fromJSON(x["vprContext"])
  y$pluginID = x["pluginID"]
  y
})

vprDF = bind_rows(vprDF, row.names = NULL) %>% 
  mutate(pluginID = as.integer(pluginID))
vprDF
#> # A tibble: 7 x 5
#>   pluginID id                 name                 type   value     
#>      <int> <chr>              <chr>                <chr>  <chr>     
#> 1        1 <NA>               <NA>                 <NA>   <NA>      
#> 2        2 <NA>               <NA>                 <NA>   <NA>      
#> 3        3 age_of_vuln        Vulnerability Age    string 730 days +
#> 4        3 cvssV3_impactScore CVSS v3 Impact Score <NA>   <NA>      
#> 5        4 <NA>               <NA>                 <NA>   <NA>      
#> 6        5 age_of_vuln        Vulnerability Age    string 730 days +
#> 7        5 cvssV3_impactScore CVSS v3 Impact Score <NA>   <NA>

#Join it with rest of the data
myData = myData %>% select(-vprContext) %>% 
  left_join(vprDF, by = "pluginID")
myData
#> # A tibble: 7 x 6
#>   pluginID name.x id                 name.y               type   value     
#>      <int> <chr>  <chr>              <chr>                <chr>  <chr>     
#> 1        1 test1  <NA>               <NA>                 <NA>   <NA>      
#> 2        2 test2  <NA>               <NA>                 <NA>   <NA>      
#> 3        3 test3  age_of_vuln        Vulnerability Age    string 730 days +
#> 4        3 test3  cvssV3_impactScore CVSS v3 Impact Score <NA>   <NA>      
#> 5        4 test4  <NA>               <NA>                 <NA>   <NA>      
#> 6        5 test5  age_of_vuln        Vulnerability Age    string 730 days +
#> 7        5 test5  cvssV3_impactScore CVSS v3 Impact Score <NA>   <NA>

Created on 2021-03-18 by the reprex package (v1.0.0)

Hope this helps,
PJ

It keeps throwing numerous errors. Isn't there any easier way to accomplish this with R?

What errors does Pieter's code throw for you ? it seems to run fine in my session...

Error in fromJSON(content, handler, default.size, depth, allowComments,  : 
  invalid JSON input

and

Error in list2(...) : object 'vprDF' not found

I'm guessing you arent reporting an error with the same data going in as has been shared here so far ?
because again I have to say, I copy and paste Pieters code top to bottom, and it produces results without errors.

No, I'm trying to use my actual data that I am trying to fix.

Here's a snippet which has been glued together so I figure I have done something wrong (im still new to this).


tenable_mit <- httr::content(ten_mit, as = "text")
tenable_mitJSON <- jsonlite::fromJSON (tenable_mit, flatten = TRUE)


tenable_mitJSON <- lapply(tenable_mitJSON$response$results, function(x) { 
  x[sapply(x, is.null)] <- NA
  unlist(x)
})

tenable.mitdf <- as.data.frame(tenable_mitJSON)

vprDF <- apply(tenable.mitdf, 1, function(x){
     y = fromJSON(x["vprContext"])
     y$pluginID = x["pluginID"]
     y
   })

vprDF = bind_rows(vprDF, row.names = NULL) %>% 
  mutate(pluginID = as.integer(pluginID))

I made a change to the code and now it throws this:

Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘fromJSON’ for signature ‘"logical", "missing"’

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.