# Importing multiple images, checking the variance and eliminating the ones with low variance

Hello everyone,

I am new to R and have been trying to eliminate blurred or snow-covered images from camera trap data. My plan is to import all the images and compute the pixel variance of each one, then discard the images with very low pixel variance and keep the rest.

My code is a bit of a mess right now, and I can't seem to figure out how to move forward with it.
Any kind of help would be appreciated.

install.packages("jpeg")
install.packages("imager")
install.packages("magick")
# EBImage is distributed through Bioconductor, not CRAN
install.packages("BiocManager")
BiocManager::install("EBImage")

library(magick)
library(purrr)
library("jpeg")
library(EBImage)
library(dplyr)

setwd(".....")

Folder <- "....."
images <- list.files(path = Folder, pattern = "\\.JPG$", full.names = TRUE)
images
image_variance <- function(x) {
x %>%
as.raster() %>%
as.vector() %>%
map(col2rgb) %>%

Hi @Alorecia,

So there are still some questions that need to be answered to get to a solution. Are your pictures colour or grayscale? If coloured, how should that be handled? Is the variance calculated channel-wise, or should the channels be averaged together to get a single pixel-wise intensity?

If the latter, which seems to make most sense, should each channel have equal weight, or should the channels be averaged according to how each channel is perceived by the human eye? I believe something like `0.3 * red + 0.6 * green + 0.1 * blue` is approximately correct, but I need to double check those numbers.
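For reference, the widely used Rec. 601 luma weights are 0.299 for red, 0.587 for green and 0.114 for blue, so the rounded numbers above are in the right range. Here is a minimal sketch of applying them, assuming the pixels are already available as a vector of hex colour strings. The `pixels` values and the `luma_weights` name are made up for illustration.

```r
# Rec. 601 luma weights (they sum to 1)
luma_weights <- c(red = 0.299, green = 0.587, blue = 0.114)

# Illustrative pixels only: pure red, green, blue and white
pixels <- c("#FF0000", "#00FF00", "#0000FF", "#FFFFFF")

# col2rgb() is vectorised: it returns a 3 x n matrix with rows red, green, blue
rgb_matrix <- col2rgb(pixels)

# Weighted sum down each column gives one intensity per pixel
luma <- as.vector(luma_weights %*% rgb_matrix)
luma
```

Green contributes most to the perceived brightness and blue the least, which is why the three weights differ.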

With these questions answered, we can think about how we would compute the variance for a single image, and then it should be straightforward enough to extend this to multiple images.

Here is a quick attempt using two images I grabbed from Unsplash...

``````
library(magick)
library(tidyverse)

imgs <- c("~/Desktop/winter.jpg", "~/Desktop/summer.jpg")

image_variance <- function(x) {
  x %>%
    as.raster() %>% # convert to a matrix of hex colour strings
    as.vector() %>% # flatten to vector
    map(col2rgb) %>%  # Hex to RGB
    map_dbl(function(rgb) (rgb[1]*0.3) + (rgb[2]*0.6) + (rgb[3]*0.1)) %>% # weighted average of colour channels
    var() # compute variance
}

# Variance for the two images (each file is read with image_read() first)
imgs %>% map(image_read) %>% map_dbl(image_variance)
#> [1] 2917.129 2976.778
``````
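To get from per-image variances to the actual filtering step, something along these lines might work. It is only a sketch: `filter_by_variance` is a hypothetical helper, the paths and variances are made up, and the threshold of 500 is an assumption that would need tuning on real camera trap images.

```r
# Hypothetical helper: keep the paths whose variance reaches a cutoff
filter_by_variance <- function(paths, variances, threshold) {
  stopifnot(length(paths) == length(variances))
  paths[variances >= threshold]
}

# Made-up paths and variances, for illustration only
paths <- c("snow.jpg", "deer.jpg", "fox.jpg")
variances <- c(40, 2900, 3100)

# The threshold is an assumption and needs tuning on real images
kept <- filter_by_variance(paths, variances, threshold = 500)
kept
```

With real data, `variances` would be the vector returned by the `map_dbl()` call above, and the surviving paths could then be passed to base R's `file.copy()` to collect the usable images in a separate folder.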

Thank you for the example.

My pictures are coloured and are in JPEG format. Honestly, I am not sure which way of calculating the variance would be best. I tried the one that you mentioned and it works well so far.
However, I would also like to try calculating the variance channel-wise and compare which method works best for me.

Could you maybe explain what `%>%` does in this particular function? I tried looking it up on other websites, but it seems to do something completely different here.

> Could you maybe explain what is the use of %>% in this particular function?

`%>%` is known as the pipe operator. Pipe operators have been used extensively in other languages (e.g. `|` in the shell). Pipes work by taking the left-hand side of the pipe and "piping" it into the function on the right-hand side, where it becomes the first argument.

Because of this "pass-it-on" property of pipes, you can chain together long sequences of pipes in order to pass some data along a pipeline for it to come out the other side transformed in some way you want. You can learn more about it here: https://magrittr.tidyverse.org/reference/pipe.html.

Pipes have gained so much favour in the `R` community thanks to the `magrittr` package, that they are being added into base `R` (look for `|>` as an alternative in the not-so-distant future).

In other words, if you have some variable `x` and functions `f()` and `g()`, you could write the same computation in two equivalent ways in `R`:

``````
# Pipe
x %>% f() %>% g()

# Nested
g(f(x))
``````

The first one reads left-to-right, while the second reads inside-out. Typically the first is easier to read, though both will produce identical results.
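As a concrete check with real functions, here is a small sketch using `sqrt()` and `sum()` in place of `f()` and `g()`. The vector `x` is an arbitrary example, and the snippet assumes the `magrittr` package is installed.

```r
library(magrittr)

x <- c(1, 4, 9)

# Pipe: take x, then take square roots, then sum them
piped <- x %>% sqrt() %>% sum()

# Nested: the same computation written inside-out
nested <- sum(sqrt(x))

identical(piped, nested)
#> [1] TRUE
```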

In my code, `x` is an image that is being piped along, undergoing various transformations:

``````
image_variance <- function(x) {
  x %>%
    as.raster() %>% # convert to a matrix of hex colour strings
    as.vector() %>% # flatten to vector
    map(col2rgb) %>%  # Hex to RGB
    map_dbl(function(rgb) (rgb[1]*0.3) + (rgb[2]*0.6) + (rgb[3]*0.1)) %>% # weighted average of colour channels
    var() # compute variance
}
``````
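If you want to compare against a channel-wise variance, a rough sketch could look like the following. `channel_variance` is a hypothetical name, and the four `pixels` are made up so the example runs without any image files. For a real image, the hex colours would come from the same `as.raster()` and `as.vector()` steps shown above.

```r
# Hypothetical helper: variance of each colour channel separately
channel_variance <- function(hex_colours) {
  # col2rgb() is vectorised: 3 x n matrix with rows red, green, blue
  rgb_matrix <- col2rgb(hex_colours)
  # Apply var() across each row, giving one variance per channel
  apply(rgb_matrix, 1, var)
}

# Made-up pixels, for illustration only
pixels <- c("#000000", "#FF0000", "#FF8000", "#FFFFFF")

per_channel <- channel_variance(pixels)
per_channel
```

This returns a named vector with one entry each for `red`, `green` and `blue`, which you could compare against the single weighted-intensity variance. Because `col2rgb()` is called once on the whole vector instead of once per pixel, this form may also run noticeably faster on large images.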

Thank you for the explanation. 