 # Ipolate function in R

I am looking for a funciton in R similar to ipolate function in STATA (https://www.stata.com/manuals13/dipolate.pdf)

seems like that function simply fits an `lm`() and then when there are missing values takes the filling values from the predicted output of the lm.
Do you have experience with R's `lm` and `predict` ?

x 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019
y 53.142662 53.565829 54.623546 56.682212 59.495817 43.86624 53.134923 22.206799
y1 53.142662 53.565829 54.623546 56.682212 59.495817 43.86624 53.134923 37.670861 22.206799 6.7427376

y has a missing values, after using ipolate in stata, we get y1.
I really appreciate if you can show how to do so in R

I'm not a stata user but the documentation for the function you linked to implied to me that the the values would go vertically rather than horizontally as you illustrate them here ?

``````
#example data
(spoiled <- structure(list(x = 1:10, y = c(
1.41, 3.71, NA, 8.31, 10.61, 12.91,
NA, 5, 19.81, 22.11
)), row.names = c(NA, -10L), class = "data.frame"))

lm_1 <- lm(y ~ x, data = spoiled)

fixed <- spoiled
fixed\$y_lm <- predict(lm_1, newdata = spoiled)
fixed\$y_fin <- ifelse(is.na(fixed\$y), fixed\$y_lm, fixed\$y)
fixed

subset(fixed,
select = c(x, y_fin)
)``````

I shard the data horizontally because i copied from Excel.

``````(spoiled <- structure(list(x = 2010:2019, y = c(53.142662, 53.565829, 54.623546, 56.682212, 59.495817, 43.86624, 53.134923, NA, 22.206799, NA)), row.names = c(NA, -10L), class = "data.frame"))
lm_1 <- lm(y ~ x, data = spoiled)

fixed <- spoiled
fixed\$y_lm <- predict(lm_1, newdata = spoiled)
fixed\$y_fin <- ifelse(is.na(fixed\$y), fixed\$y_lm, fixed\$y)
fixed

subset(fixed,
select = c(x, y_fin)
)
``````

i am not geting the same result like iploate funciton in stata.
ie
y_fin should be : c(53.142662, 53.565829, 54.623546, 56.682212, 59.495817, 43.86624, 53.134923, 37.670861, 22.206799, 6.7427376)

Here is my implementation:

``````
library(purrr)

closest_points <- function(x1,y1){
diffs <- abs(x1 - y1)
names(diffs)<-y1
}

new_point <- function(x,x0,y0,x1,y1){
((y1-y0)/(x1-x0))*(x-x0)+y0
}

ipolate <- function(x,y){
missings_to_fill <- which(is.na(y))
filled_points <- setdiff(seq_along(y),missings_to_fill)

step1 <- map(missings_to_fill,
~closest_points(.x,filled_points))
names(step1) <- missings_to_fill
step1

step2 <- imap(step1,
~{
xlocal <- as.integer(.y)
x0 <- .x
x1 <- .x
y0 <- y[x0]
y1 <- y[x1]
new_point(x = xlocal,
x0=x0,
x1=x1,
y0=y0,
y1=y1)
})
y[missings_to_fill] <- unlist(step2)
y
}
``````
``````(spoiled <- structure(list(x = 2010:2019, y = c(53.142662, 53.565829, 54.623546, 56.682212, 59.495817, 43.86624, 53.134923, NA, 22.206799, NA)), row.names = c(NA, -10L), class = "data.frame"))

fixed <- spoiled
fixed\$y <- ipolate(spoiled\$x,spoiled\$y)
fixed``````
1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.