# Extracting only numeric values from list of lists

Hello!

I am looking for the easiest way to strip all non-numeric values from my nested list, or to return only those lists (`df`) and components that are numeric. I don't mind the result being flattened, but I want it in a format I can work with as numerics. I am sure there is some clever way of doing this with `purrr`, but I'm not entirely sure how best to approach it.

``````
library(tidyverse)
library(purrr)

df <- list(a = list(a1 = list(1,2,3),
b1 = list("a","b","c")),
b = list(1,2,3,4,5),
c = list(a1 = list(1,2,3),
b1 = list("a","b","c"),
c1 = list("a","b", c3 = c(1,2,3))
)
)

df2 <- df %>% purrr::flatten()

lapply(df2, function(x){
is.numeric(x)
})

df3 <-
purrr::keep(df2,is.numeric)

df3
``````

Hi there,

Here is a way of doing this by using the `flatten()` function from purrr

``````
library(tidyverse)

#Data
df <- list(a = list(a1 = list(1,2,3),
b1 = list("a","b","c")),
b = list(1,2,3,4,5),
c = list(a1 = list(1,2,3),
b1 = list("a","b","c"),
c1 = list("a","b", c3 = c(1,2,3))
)
)

#Flatten until only one dimension (but keep type)
while(any(lengths(df) > 1)){
df = flatten(df)
}

#Only keep numeric values
numVals = df[sapply(df, is.numeric)] %>% unlist()

numVals
#>  [1] 1 2 3 1 2 3 4 5 1 2 3 1 2 3
``````

Created on 2022-03-09 by the reprex package (v2.0.1)
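
For reference, the same flat result can be sketched in base R without the `while` loop. This is only an illustration, assuming the numeric leaves are doubles or integers; `num_flat` is a hypothetical name. `rapply()` visits every non-list leaf, replaces the non-matching ones with its default `deflt = NULL`, and then unlists the result:

```r
# Same example data as in the question
df <- list(a = list(a1 = list(1, 2, 3),
                    b1 = list("a", "b", "c")),
           b = list(1, 2, 3, 4, 5),
           c = list(a1 = list(1, 2, 3),
                    b1 = list("a", "b", "c"),
                    c1 = list("a", "b", c3 = c(1, 2, 3))))

# Keep only numeric leaves; everything else becomes NULL and is
# dropped when the result is unlisted (how = "unlist")
num_flat <- rapply(df, function(x) x,
                   classes = c("numeric", "integer"),
                   how = "unlist")

# The names encode the path to each value; drop them for a plain vector
unname(num_flat)
```

Because nothing is coerced to character along the way, the result stays numeric.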

Hope this helps,
PJ

Hi there,

This is definitely a start, but I would like to preserve some of the nested list structure. So I would like either to remove all the non-numeric parts, or to return the structure with only those elements that still contain numbers: lists of numbers, numeric vectors, or data frames with numeric columns.

Hi,

Your original post suggested otherwise. Anyway, I spent too much time trying to get this, but I wanted to see it through and found a solution that preserves the structure:

``````
#Data
df <- list(a = list(a1 = list(1,2,3),
b1 = list("a","b","c")),
b = list(1,2,3,4,5),
c = list(a1 = list(1,2,3),
b1 = list("a","b","c"),
c1 = list("a","b", c3 = c(1,2,3))
)
)

#Recursive function that checks for numeric values
myFun = function(x){
if(class(x) == "list"){

#Go to next level if more dimensions
y = lapply(x, myFun)

#Ignore any NULL returns
y = y[sapply(y, function(z){
length(z) > 0
})]

return(y)

} else {
#Check for the logic
if(all(is.numeric(x))){
return(x)
}
}
}

myFun(df)
#> $a
#> $a$a1
#> $a$a1[[1]]
#> [1] 1
#>
#> $a$a1[[2]]
#> [1] 2
#>
#> $a$a1[[3]]
#> [1] 3
#>
#>
#>
#> $b
#> $b[[1]]
#> [1] 1
#>
#> $b[[2]]
#> [1] 2
#>
#> $b[[3]]
#> [1] 3
#>
#> $b[[4]]
#> [1] 4
#>
#> $b[[5]]
#> [1] 5
#>
#>
#> $c
#> $c$a1
#> $c$a1[[1]]
#> [1] 1
#>
#> $c$a1[[2]]
#> [1] 2
#>
#> $c$a1[[3]]
#> [1] 3
#>
#>
#> $c$c1
#> $c$c1$c3
#> [1] 1 2 3
``````

Created on 2022-03-09 by the reprex package (v2.0.1)

Is this what you are looking for?

PJ

I can propose a shorter alternative, but it only indicates the structure, doesn't preserve it.
@pieterjanvc's solution is best for preserving the full structure.

`df %>% unlist() %>% enframe() %>% filter(str_detect(value, "\\d+"))`
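
One caveat with this route: `unlist()` coerces the mixed list to character, so the matched values come back as strings. A minimal base R sketch of converting them back to numbers is below. The names `flat`, `is_num` and `nums` are hypothetical, and the stricter pattern is an assumption that only plain non-negative decimals need to match:

```r
# Same example data as in the question
df <- list(a = list(a1 = list(1, 2, 3),
                    b1 = list("a", "b", "c")),
           b = list(1, 2, 3, 4, 5),
           c = list(a1 = list(1, 2, 3),
                    b1 = list("a", "b", "c"),
                    c1 = list("a", "b", c3 = c(1, 2, 3))))

# unlist() flattens everything into one named character vector;
# the names record where each value came from
flat <- unlist(df)

# Keep entries that look like plain numbers (assumed pattern)
is_num <- grepl("^[0-9]+(\\.[0-9]+)?$", flat)

# Convert back to numeric while keeping the path-style names
nums <- setNames(as.numeric(flat[is_num]), names(flat)[is_num])
nums
```

Text that merely contains digits, such as `"a1"`, would pass the original `"\\d+"` filter but not this anchored pattern.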

Pieter beat me to a solution.
Mine was

``````
library(tidyverse)
library(purrr)

df <- list(
a = list(
a1 = list(1, 2, 3),
b1 = list("a", "b", "c")
),
b = list(1, 2, 3, 4, 5),
c = list(
a1 = list(1, 2, 3),
b1 = list("a", "b", "c"),
c1 = list("a", "b", c3 = c(1, 2, 3))
)
)

test_and_do <- function(x) {
if (is.list(x)) {
process_list(x)
} else if (is.numeric(x)) {
return(x)
} else {
return(NA)
}
}
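process_list <- function(x) {
sublist_results <- map(
x,
test_and_do
)
# Drop entries that are entirely NA: non-numeric leaves and emptied sublists
# (assumed reconstruction; the discard() call was cut off in the post)
discard(sublist_results, function(y) {
all(is.na(y))
})
}

(df2 <- test_and_do(df))
``````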

Thank you Pieter! You're right, I did mention that a flattened result wouldn't be a problem. That would work for most cases, but at some point it could lead to errors in this specific problem.

Thanks for the recursive function to solve this problem! This should help me get these lists similar enough to compare with `waldo`.
