Inverse distance weighting

I have two variable (rain and temp). I am doing spatial interpolation using 'gstat' and 'ggplot2' package in R. In result, I'm getting only-- x1, x2, and var1.pred columns. where, x1 contains Longitude values, x2 latitude, and var1.pred interpolated values.
Please, I appreciate if someone can help me with how to get variables name and century columns also.
(--If I will get these two new columns then I can use facet_wrap in ggplot2 plot :wink: )

I'm using this script-

r.idw <- gstat::idw(mean ~ 1, data, newdata=grd, idp=2.0)

my data looks like this--
Variable Century Longitude Latitude Mean
Hum 900 13.931100 78.02170 1.003004829
Hum 1000 13.931100 78.02170 1.003004829
Hum 1100 13.931100 78.02170 -0.728323009
Hum 1200 13.931100 78.02170 -1.377570948
Hum 1300 13.931100 78.02170 -0.079075070
Rain 900 13.931100 78.02170 -1.377570948
Rain 1010 13.931100 78.02170 1.435836789
Rain 1030 13.931100 78.02170 0.786588849
Rain 1060 13.931100 78.02170 2.301500707

Thank you in advance :slight_smile:

Can you give us some sample code (minimal working example) and perhaps the data in a more usable format?

See FAQ: Tips for writing R-related questions.

A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need.

Hi, thank you so much.
dput () output is here--
structure(list(Variable = c("Temp", "Temp", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Precip",
"Precip", "Precip", "Precip", "Precip", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Precip", "Precip", "Precip", "Precip",
"Precip", "Precip", "Precip", "Precip", "Precip", "Precip", "Precip",
"Precip", "Precip", "Precip", "Precip", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp",
"Temp"), Century = c(800L, 900L, 1000L, 1100L, 1200L, 1300L,
800L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L, 1100L, 1200L,
1300L, 800L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L, 1100L,
1200L, 1300L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L,
1100L, 1200L, 1300L, 800L, 900L, 1000L, 1100L, 1200L, 1300L,
800L, 900L, 1000L, 1100L, 1200L, 1300L), Longitude = c(12.5,
12.5, 12.5, 12.5, 12.5, 12.5, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75,
1, 1, 1, 1, 1, 28.325, 28.325, 28.325, 28.325, 28.325, 28.325,
15.733333, 15.733333, 15.733333, 15.733333, 15.733333, 15.733333,
15.733333, 15.733333, 15.733333, 15.733333, -4.98, -4.98, -4.98,
-4.98, -4.98, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833,
6.8833, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833), Latitude = c(47.5,
47.5, 47.5, 47.5, 47.5, 47.5, 42.5, 42.5, 42.5, 42.5, 42.5, 42.5,
52.5, 52.5, 52.5, 52.5, 52.5, 62.325, 62.325, 62.325, 62.325,
62.325, 62.325, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8,
68.8, 68.8, 58.15, 58.15, 58.15, 58.15, 58.15, 50.1167, 50.1167,
50.1167, 50.1167, 50.1167, 50.1167, 50.1167, 50.1167, 50.1167,
50.1167, 50.1167, 50.1167), Mean = c(-0.674401224, 0.383627143,
1.472242402, 0.220960495, -0.597933997, -1.520406824, 1.282612958,
1.008766116, -0.762620237, -1.082777938, 0.268867128, -0.714848027,
-1.070371799, -1.006936142, 0.03260452, 0.306612444, 0.083288067,
0.882435346, -0.040712438, 1.274604058, -0.641877141, -1.040919279,
0.784619088, 0.874041638, -0.241763974, 1.065917808, 0.48992382,
-0.642693799, 0.874041638, -0.241763974, 1.065917808, 0.48992382,
-0.642693799, -0.301215399, -1.726692947, 0.515607908, 1.228251917,
0.379726471, -1.103446109, 1.514326234, 0.866236965, -0.722217127,
-0.099543123, -0.45535684, 0.469462913, -0.904574882, -0.870223937,
1.706096929, 0.125953465, -0.526714488)), row.names = c(NA, 50L
), class = "data.frame")

where does the grd come from ?

Hi, thank for your response. My grid comes from expand.grid function. Which includes--

       xmin        ymin        xmax        ymax 
-180.000000    2.053389  180.000000   81.250400 

The idea is to make your issue reproducible. We aren't at your side by your computer looking over your shoulder and free to type in our ideas of how to make it work. So your challenge is to reproduce enough of what you are doing on our computers, so that we can make some attempt.

You should be providing the forum with in principle the necessary parts to have the possibility of code that runs.

So please consider; what would be simpler
provide the code that makes grd from other parts you already provided (assuming thats what you do) ?
provide the dput of the grd ?
please pick the one which is the least text, and extend your post to include that;

Thank you very much :slight_smile:

Please, find the details--

Shapefile--https://naciscdn.org/naturalearth/110m/cultural/ne_110m_admin_0_countries.zip

dput () output is here--

structure(list(Variable = c("Temp", "Temp", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Precip",
"Precip", "Precip", "Precip", "Precip", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Precip", "Precip", "Precip", "Precip",
"Precip", "Precip", "Precip", "Precip", "Precip", "Precip", "Precip",
"Precip", "Precip", "Precip", "Precip", "Temp", "Temp", "Temp",
"Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp", "Temp",
"Temp"), Century = c(800L, 900L, 1000L, 1100L, 1200L, 1300L,
800L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L, 1100L, 1200L,
1300L, 800L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L, 1100L,
1200L, 1300L, 900L, 1000L, 1100L, 1200L, 1300L, 900L, 1000L,
1100L, 1200L, 1300L, 800L, 900L, 1000L, 1100L, 1200L, 1300L,
800L, 900L, 1000L, 1100L, 1200L, 1300L), Longitude = c(12.5,
12.5, 12.5, 12.5, 12.5, 12.5, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75,
1, 1, 1, 1, 1, 28.325, 28.325, 28.325, 28.325, 28.325, 28.325,
15.733333, 15.733333, 15.733333, 15.733333, 15.733333, 15.733333,
15.733333, 15.733333, 15.733333, 15.733333, -4.98, -4.98, -4.98,
-4.98, -4.98, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833,
6.8833, 6.8833, 6.8833, 6.8833, 6.8833, 6.8833), Latitude = c(47.5,
47.5, 47.5, 47.5, 47.5, 47.5, 42.5, 42.5, 42.5, 42.5, 42.5, 42.5,
52.5, 52.5, 52.5, 52.5, 52.5, 62.325, 62.325, 62.325, 62.325,
62.325, 62.325, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8, 68.8,
68.8, 68.8, 58.15, 58.15, 58.15, 58.15, 58.15, 50.1167, 50.1167,
50.1167, 50.1167, 50.1167, 50.1167, 50.1167, 50.1167, 50.1167,
50.1167, 50.1167, 50.1167), Mean = c(-0.674401224, 0.383627143,
1.472242402, 0.220960495, -0.597933997, -1.520406824, 1.282612958,
1.008766116, -0.762620237, -1.082777938, 0.268867128, -0.714848027,
-1.070371799, -1.006936142, 0.03260452, 0.306612444, 0.083288067,
0.882435346, -0.040712438, 1.274604058, -0.641877141, -1.040919279,
0.784619088, 0.874041638, -0.241763974, 1.065917808, 0.48992382,
-0.642693799, 0.874041638, -0.241763974, 1.065917808, 0.48992382,
-0.642693799, -0.301215399, -1.726692947, 0.515607908, 1.228251917,
0.379726471, -1.103446109, 1.514326234, 0.866236965, -0.722217127,
-0.099543123, -0.45535684, 0.469462913, -0.904574882, -0.870223937,
1.706096929, 0.125953465, -0.526714488)), row.names = c(NA, 50L
), class = "data.frame")
Summary

Script is here--

rm(list=ls(all=TRUE))
dev.off()

#Dependencies
library(sf)
library(sp)
library(rgdal)
library(tidyr)
library(gstat)
library(ggplot2)
library (gplots)
library(ggthemes)

#set working directory
setwd("D:/Spatial_values")


##### Data file ####
file <-read.csv("Europe.csv", header = T, sep = ",")


#### Filter selected colomus
eu_Df <- file %>% 
  dplyr::select(Variable, Century, Longitude, Latitude, Mean) %>% 
  filter(!is.na(Mean))

##### IDW----interpolation #########
#countries 
countries <- st_read("D:/shape_files/ne_110m_admin_0_countries", layer="ne_110m_admin_0_countries") 
# Filter/subset continent/country of choice
sub_Europe <- countries[countries$CONTINENT == "Europe",]
eu_prj <- st_transform(sub_Europe, CRS("+proj=longlat +datum=WGS84 +no_defs"))

# generalization
#eu_cont_smpl <- sf::st_simplify(eu_prj, dTolerance = 200)
eu_cont_smpl <- st_cast(eu_prj, preserveTopology = T, dTolerance = 4000)


# create simple feature with filtered DATA....
eu_sf <- st_as_sf(eu_Df, coords= c("Longitude", "Latitude"),
                  crs = CRS("+proj=longlat +datum=WGS84 +no_defs")) ##4326

# project lat-lon (WGS84)
eu_sf <- st_transform(eu_sf, CRS("+proj=longlat +datum=WGS84 +no_defs"))


# Get the extent of interpolation area
grd_extent <- st_bbox(eu_cont_smpl)

# create grid
x.range <- as.numeric(c(grd_extent[1], grd_extent[3])) # min/max longitude of the interpolation area
y.range <- as.numeric(c(grd_extent[2], grd_extent[4])) # min/max latitude of the interpolation area
grd <- expand.grid(x = seq(from = x.range[1], to = x.range[2], by = 500), y = seq(from = y.range[1], to = y.range[2], by = 500)) # expand points to grid



#Convert grid to spatial object
coordinates(grd) <- ~x + y
#gridded(grd) <- TRUE


## convert simple feature to spatial format
glimpse(eu_sf)
eu_sp <- as_Spatial(eu_sf)
est_cont_sp <- as_Spatial(eu_prj)

# grid:
idw_grid <- spsample(est_cont_sp, type = "regular", n = 4000) 

# idw
rslt_pt <- gstat::idw(Mean~1, locations=eu_sp, newdata = idw_grid, idp = 2, nmax = length(eu_sp))

# Clean up
rslt_df2 <- as.data.frame(rslt_pt)
names(rslt_df2)[1:4] <- c("Longitude", "Latitude", "pred.mean", "Century")

glimpse(rslt_df2)

### Remove NAs from the cleaned data
rslt_grd_df <- rslt_df2 %>% filter(!is.na(pred.mean))
glimpse(rslt_grd_df)


palett <- rich.colors(40, palette = "temperature", rgb = T)

ggplot() +
  theme_map()+
  theme(panel.grid = element_line(color = "grey70", size = 0.2), legend.position = "right")+
  geom_tile(data=rslt_grd_df, aes(x = Longitude, y = Latitude, fill = pred.mean))+
  geom_sf(data = eu_cont_smpl, fill = NA, size = 0.2, colour = "gray45")+
  scale_fill_gradientn(colours = palett)+
  #facet_wrap(~Century + Variable)+     # I want to get output that fit with this  :(
  xlim(-30, 70)+
  ylim(25, 86)+
  theme_classic()

I am not getting very far but I think you have a couple of mistakes in your code.

countries <- st_read("D:/shape_files/ne_110m_admin_0_countries", layer="ne_110m_admin_0_countries") 

Should read

countries <- st_read("ne_110m_admin_0_countries.shp", layer="ne_110m_admin_0_countries") 

and I a pretty sure

eu_prj <- st_transform(sub_Europe, CRS("+proj=longlat +datum=WGS84 +no_defs"))

should be

eu_prj <- st_transform(sub_Europe, crs("+proj=longlat +datum=WGS84 +no_defs"))

R is very case-sensitive.

Can you point me to an example of

crs("+proj=longlat +datum=WGS84 +no_defs")

?
I just do not understand the usage.

Hi, thank you very much for the response. When I try to replace CRS====> crs, I'm getting this error---

> eu_prj <- st_transform(sub_Europe, crs("+proj=longlat +datum=WGS84 +no_defs"))
Error in crs("+proj=longlat +datum=WGS84 +no_defs") : 
  could not find function "crs"
>

I agree, I have mistakes in script :frowning: I should assign crs like this ===>

eu_prj <- st_transform(sub_Europe, crs=4326)

Probably, I misunderstood example from here--
https://rspatial.org/raster/spatial/6-crs.html

p2 <- spTransform(rob, CRS("+proj=longlat +datum=WGS84"))

And here too https://www.nceas.ucsb.edu/sites/default/files/2020-04/OverviewCoordinateReferenceSystems.pdf

You are correct. Either I have misread the function description or I am missing something entirely.

Sorry to take so long getting back to you. Friday and Saturday were a bit hectic.

I still have no idea what I am doing but I think I see the crs vs CRS issue. CRS is a function from the "rgdal" package and crs is a component of the st_transform function from the *sf * package.

We seem to have missing missing data.frames/tibbles in the lines below unless I have messed up a cut & paste. I do not see any trace of eu_Df or of eu_sf.

eu_sf <- st_as_sf(eu_Df, coords= c("Longitude", "Latitude"),
                  crs = CRS("+proj=longlat +datum=WGS84 +no_defs")) ##4326

Code problem.

grd <- expand.grid(x = seq(from = x.range[1], to = x.range[2], by = 500), y = seq(from = y.range[1], to = y.range[2], by = 500)) # expand points to grid

will not work. Your grid coordinates are:

grd_extent
       xmin        ymin        xmax        ymax 
-180.000000    2.053389  180.000000   81.250400 

and you have a sequence interval of 500. Is that a typo?

Hi, yes, I also got information about CRS and crs. I'm sorry, but may be during copy-paste you missed eu_Df or of eu_sf. In my code it is there.

Yes, it it my mistake here. I added 3 more zero and made it 500 km (i.e., 500000) :slight_smile:

So, should

grd <- expand.grid(x = seq(from = x.range[1], to = x.range[2], by = 500), y = seq(from = y.range[1], to = y.range[2], by = 500)) # expand points to grid.

read by = 5 or by = 0.5? For convenience I have it an 5.

Well, this gives us a plot. It looks like the only real problem was the expand.grid issue.

setwd("D:/Spatial_values")

library(tidyverse)
library(sf)
library(rgdal)
library(akima)
library(gstat)
library (gplots)
library(ggthemes)


df_file <- read.csv("Europe.csv",  sep = ",")

eu_Df <- df_file %>% 
  dplyr::select(Variable, Century, Longitude, Latitude, Mean) %>% 
  filter(!is.na(Mean))

eu_sf <- st_as_sf(eu_Df, coords= c("Longitude", "Latitude"),
                  crs = CRS("+proj=longlat +datum=WGS84 +no_defs")) ##4326

countries <- st_read("ne_110m_admin_0_countries.shp", layer="ne_110m_admin_0_countries") 

sub_Europe <- countries[countries$CONTINENT == "Europe",]

eu_prj <- st_transform(sub_Europe, crs=4326)

# eu_prj <- st_transform(sub_Europe, CRS("+proj=longlat +datum=WGS84 +no_defs"))

eu_cont_smpl <- st_cast(eu_prj, preserveTopology = T, dTolerance = 4000)

eu_sf <- st_as_sf(eu_Df, coords= c("Longitude", "Latitude"),
                  crs = CRS("+proj=longlat +datum=WGS84 +no_defs")) ##4326

# project lat-lon (WGS84)
eu_sf <- st_transform(eu_sf, CRS("+proj=longlat +datum=WGS84 +no_defs"))    

grd_extent <- st_bbox(eu_cont_smpl)

# create grid
x.range <- as.numeric(c(grd_extent[1], grd_extent[3])) # min/max longitude of the interpolation area
y.range <- as.numeric(c(grd_extent[2], grd_extent[4])) # min/max latitude of the interpolation area
grd <- expand.grid(x = seq(from = x.range[1], to = x.range[2], by = 5), y = seq(from = y.range[1], to = y.range[2], by = 500)) # expand points to grid

coordinates(grd) <- ~x + y

eu_sp <- as_Spatial(eu_sf)
est_cont_sp <- as_Spatial(eu_prj)

idw_grid <- spsample(est_cont_sp, type = "regular", n = 4000) 

rslt_pt <- gstat::idw(Mean~1, locations=eu_sp, newdata = idw_grid, idp = 2, nmax = length(eu_sp))

rslt_df2 <- as.data.frame(rslt_pt)
names(rslt_df2)[1:4] <- c("Longitude", "Latitude", "pred.mean", "Century")

rslt_grd_df <- rslt_df2 %>% filter(!is.na(pred.mean))

palett <- rich.colors(40, palette = "temperature", rgb = T)

ggplot() +
  theme_map()+
  theme(panel.grid = element_line(color = "grey70", size = 0.2), legend.position = "right")+
  geom_tile(data=rslt_grd_df, aes(x = Longitude, y = Latitude, fill = pred.mean))+
  geom_sf(data = eu_cont_smpl, fill = NA, size = 0.2, colour = "gray45")+
  scale_fill_gradientn(colours = palett)+
  #facet_wrap(~Century + Variable)+     # I want to get output that fit with this  :(
  xlim(-30, 70)+
  ylim(25, 86)+
  theme_classic()

Hi, thank you so much, it's working :slight_smile:
Is it possible to get output using this--

I want to arrange facet using Century and Variable :slight_smile:

Beats me. I'll have a try. I am sure I have seen some examples but it depends on the structure of the data.

It's 1203 here again so it may be a while. Lunch calls.

BTW, those filter(!is.na(VARNAME)) commands offend my sense of parsimony.

I'd suggest just throwing in a quick check that allows you to see if you need to filter the data as you walk through the code. See simpleminded example below

xx <- c(1, 2, 3, NA, 5, 6)
table(is.na(xx))

Enjoy your meal :slight_smile: Its 18:11 here.

I used that to check NA and pass in data.

This also not giving Century and Variable. I think we need to check formula so that we can get both Century and Variable :wink:

OOPS I missed something. We do not have Century in the final datasets rslt_df2 & rslt_df2. I clearly ignored a warning about something. It may have been something about a comment ? I'll have to start stepping through the data.frames to see what happened.

1 Like

A yes. rslt_pt looks to be the culprit. Now to figure out what kriging is and what that idw() function thinks it is doing. So far I think I have learned how to pronounce Andrey Kolmogorov's name. I may be getting sidetracked.

BTW while stepping through the script I noticed this:

eu_sf <- st_as_sf(eu_Df, coords= c("Longitude", "Latitude"),
                  crs = CRS("+proj=longlat +datum=WGS84 +no_defs")) ##4326

and later this

eu_sf <- st_transform(eu_sf, CRS("+proj=longlat +datum=WGS84 +no_defs"))

Yes, again my mistake :wink: I did the same things twice. I think I should remove this :slight_smile:

I think modification in formula can help us to get the result we want, I'm also trying to find a way :slight_smile:

I got some information from ResearchGate