Interpolation using approxfun

Hi,

I am trying to use the approxfun function to interpolate monthly time series data on air temperature (ts.tair), soil temperature (ts.tsoil), incoming solar radiation (ts.srad), and soil moisture (ts.moist1 and ts.moist2) from 2000 to 2019. The climate data cover 4,210,275 spatial points in total.

class(ts.tair)
[1] "matrix"

dim(ts.tair)
[1] 4210275 240

class(ERA.dates)
[1] "Date"

head(ERA.dates)
[1] "2000-01-15" "2000-02-15" "2000-03-15" "2000-04-15" "2000-05-15" "2000-06-15"

dim(ERA.dates)
[1] 240

range(ERA.dates)
[1] "2000-01-15" "2019-12-15"

When I run approxfun, I get the following errors:

    af.tair = approxfun(ERA.dates,ts.tair[1,])
    af.tsoil = approxfun(ERA.dates,ts.tsoil[1,])
    af.srad = approxfun(ERA.dates,ts.srad[1,])
    af.moist1 = approxfun(ERA.dates,ts.moist1[1,])
    af.moist2 = approxfun(ERA.dates,ts.moist2[1,])

Error in approxfun(ERA.dates, ts.tair[1, ]) :
need at least two non-NA values to interpolate

    af.tsoil = approxfun(ERA.dates,ts.tsoil[1,])

Error in approxfun(ERA.dates, ts.tsoil[1, ]) :
need at least two non-NA values to interpolate

    af.srad = approxfun(ERA.dates,ts.srad[1,])

Error in approxfun(ERA.dates, ts.srad[1, ]) :
need at least two non-NA values to interpolate

    af.moist1 = approxfun(ERA.dates,ts.moist1[1,])

Error in approxfun(ERA.dates, ts.moist1[1, ]) :
need at least two non-NA values to interpolate

    af.moist2 = approxfun(ERA.dates,ts.moist2[1,])

Error in approxfun(ERA.dates, ts.moist2[1, ]) :
need at least two non-NA values to interpolate

Your assistance will be highly appreciated.

Regards
Edward

You are selecting from your matrices with [1,], implying that you want a row (rather than a column) of the matrix to be the second argument to approxfun. Is that right?

Yes, I want a row of the matrix to be the second parameter into approxfun.

Could you please run

sum(is.na(ts.tair[1,]))
sum(!is.na(ts.tair[1,]))

and report back?

sum(is.na(ts.tair[1,]))

[1] 240

sum(!is.na(ts.tair[1,]))

[1] 0

So the error message was accurate. Every column in that row is NA, with no numeric values at all, so there is nothing to base an interpolation on.
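The same check can be run over every row at once rather than one row at a time. The following is a minimal sketch in base R, using a small made-up matrix `m` as a stand-in for ts.tair (the values and the name `n_valid` are assumptions for illustration only):

```r
# Toy stand-in for ts.tair: 4 locations (rows) x 5 time steps (columns)
m <- matrix(c(NA, NA, NA, NA, NA,
              1,  2,  NA, 4,  5,
              NA, NA, 3,  NA, NA,
              2,  3,  4,  5,  6),
            nrow = 4, byrow = TRUE)

# Number of non-NA values in each row
n_valid <- rowSums(!is.na(m))

# How many rows have 0, 1, 2, ... usable values
table(n_valid)

# Rows with fewer than two non-NA values, which approxfun will reject
which(n_valid < 2)
```

On the real data, replacing `m` with ts.tair should show how many of the 4,210,275 rows are affected.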

Thank you for your reply.

I did a subset of the data to a smaller area, see below

dim(ts.tsoilb)
[1] 13509 240

dim(ts.tairb)
[1] 13509 240

class(ts.tsoilb)
[1] "matrix"

head(ERA.datesb)
[1] "2000-01-15" "2000-02-15" "2000-03-15" "2000-04-15" "2000-05-15" "2000-06-15"

dim(ERA.datesb)
[1] 240

range(ERA.datesb)
[1] "2000-01-15" "2019-12-15"

I do not seem to get error messages when I run approxfun on the smaller area:

    af.tairb = approxfun(ERA.datesb,ts.tairb[1,])
    af.tsoilb = approxfun(ERA.datesb,ts.tsoilb[1,])
    af.sradb = approxfun(ERA.datesb,ts.sradb[1,])
    af.moist1b = approxfun(ERA.datesb,ts.moist1b[1,])
    af.moist2b = approxfun(ERA.datesb,ts.moist2b[1,])

af.tairb
function (v)
.approxfun(x, y, v, method, yleft, yright, f)
<bytecode: 0x5639ab982b28>
<environment: 0x5639b01f0580>

af.tsoilb
function (v)
.approxfun(x, y, v, method, yleft, yright, f)
<bytecode: 0x5639ab982b28>
<environment: 0x5639b01aa298>

This is strange.

That's not an error. It's informing you that it has produced a function, which is the expected output of approxfun().
The function approx() is the one that directly calculates values.
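To illustrate the difference, here is a minimal sketch with three made-up monthly values (the names `dates`, `temps`, `f` and `mid` are hypothetical, and the dates are converted to numbers explicitly to keep the example unambiguous):

```r
# Three made-up monthly observations
dates <- as.Date(c("2000-01-15", "2000-02-15", "2000-03-15"))
temps <- c(10, 14, 12)
x <- as.numeric(dates)  # days since 1970-01-01

# approxfun() returns a function; nothing is interpolated yet
f <- approxfun(x, temps)
is.function(f)

# Calling that function performs the linear interpolation
mid <- as.numeric(as.Date("2000-01-30"))
f(mid)

# approx() computes the interpolated values directly
approx(x, temps, xout = mid)$y
```

Both calls should return the same interpolated value for the chosen date.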

Yes, exactly. My point is that on the smaller area the function seems to work well, but on the larger area it gives error messages, which is strange.

Perhaps the larger area contains other rows that are entirely NA.
You may want to identify and filter out such rows explicitly.
Would you like further advice on how to do that?

Yes please. That would be helpful.

sum(is.na(ts.tairb[1,]))

[1] 0

sum(!is.na(ts.tairb[1,]))      

[1] 240

:grinning: really confusing...

What is the error you associate with that row of ts.tairb?

There is no error with the row of ts.tairb (a subset of ts.tair).
However, there is an error with the row of ts.tair, which reports that it needs at least two non-NA values to interpolate. When I count the missing values in ts.tairb[1,] the result is 0, but I get 240 missing values in ts.tair[1,]. This is confusing because ts.tairb is a subset of ts.tair.

Is it so surprising? The first row of a subset need not be the first row of the full matrix:

(set <- matrix(c(NA,1,3,NA,2,4),3,2)) 
(subset <- set[2:3,])

sum(is.na(subset[1,]))
sum(is.na(set[1,]))

OK, it now makes sense. How can I identify and filter out rows with NAs in the larger area?

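You can use the purrr package from the tidyverse, which iterates with map_* style functions.

library(purrr)

(set <- matrix(c(NA,1,3,NA,2,NA),3,2))

# TRUE for each row that is not entirely NA
goodrows <- map_lgl(seq_len(nrow(set)),
    ~sum(is.na(set[.,]))!=ncol(set))

# drop = FALSE keeps the result a matrix even if only one row survives
keep_from_set <- set[goodrows, , drop = FALSE]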

Each row of set is checked to see whether it is entirely NA or contains at least one value. You might need a stricter test, though, since the error message says approxfun needs at least two non-NA values.
The goodrows vector is then used to index set and form a usable subset.
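The stricter test mentioned above can be sketched in base R without purrr. This assumes the same toy matrix `set`; the names `usable`, `x` and `interpolators` are hypothetical, and `x` is a placeholder time axis standing in for the real dates:

```r
set <- matrix(c(NA, 1, 3, NA, 2, NA), 3, 2)

# Keep only rows with at least two non-NA values, as approxfun requires
usable <- rowSums(!is.na(set)) >= 2
keep_from_set <- set[usable, , drop = FALSE]

# Placeholder time axis; with the real data this would be the date vector
x <- seq_len(ncol(set))

# Build one interpolating function per usable row, skipping the rest
interpolators <- lapply(which(usable), function(i) approxfun(x, set[i, ]))
```

Using `which(usable)` keeps track of the original row numbers, so each interpolator can still be matched back to its spatial point. With millions of rows, storing one function per row may be memory-hungry, so computing values with approx() inside the loop might be preferable.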

Ok I will try that. Thank you for your assistance.
