Error when using the ddply function

Hi,
I am learning R. I was trying out the ddply function from the plyr package to find the minimum value per group, but I ran into the error below. I know there are other ways to get the desired result.

ec_max <- ddply(AEC, .(`Subject Identifier for the Study`), summarise, min_vis = min(`Visit Number`, na.rm = TRUE))
When I use the above statement I get "Inf" in the min_vis column because there are missing values present.
So I tried the statement below to ignore the missing values and replace them with 0:

ec_max <- ddply(AEC, .(`Subject Identifier for the Study`), summarise, min_vis = function(x) {ifelse(!is.na(x$`Visit Number`), min(x$`Visit Number`), 0 )})

but this statement gives the error below:

Error in vector(type, length) :
vector: cannot make a vector of mode 'closure'.
I would like to know what this error means and how to get rid of it.

My data looks like this:
|Subject Identifier for the Study|Visit Number|Visit Name|
|---|---|---|
|123|5|Week 5|
|123|6|Week 6|
|123|7|Week 7|
|123|8|Week 8|
|123|9|Week 9|
|124|NA|NA|
|124|5|Week 5|
|124|8|Week 8|
|124|9|Week 9|
|124|NA|NA|
|124|18|Week 18|
|125|4|Week 4|
|125|5|Week 5|
|125|9|Week 9|
|125|NA|NA|
|125|15|Week 15|
|125|16|Week 16|

What does this show when you run it?

table(AEC$`Visit Number`)

I ask because I don't see how it's possible to get Inf from the min function on any numeric vector that isn't entirely filled with Inf values... Scratching my head.
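
For what it's worth, one case does produce this, sketched here in base R with made-up vectors (not the AEC data): if every value in a group is NA, `na.rm = TRUE` drops them all, and `min()` of the resulting empty vector returns `Inf` with a warning.

```r
# Hypothetical vectors for illustration; not taken from AEC.
some_missing <- c(NA, 5, 8)
all_missing  <- c(NA_real_, NA_real_)

# NAs are dropped, so the smallest remaining value is returned.
min_some <- min(some_missing, na.rm = TRUE)

# Nothing is left after dropping NAs; min() of an empty vector is Inf.
# suppressWarnings() only hides the "no non-missing arguments" warning.
min_all <- suppressWarnings(min(all_missing, na.rm = TRUE))

print(min_some)
print(min_all)
print(is.infinite(min_all))
```

So a subject whose visit numbers are all missing would be expected to show `Inf`, while a subject with only some missing values would not.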

Yes, you are correct.
I am getting Inf not for all records, but only for those subjects where the visit number is entirely missing, with the code below:
ec_max <- ddply(AEC, .(`Subject Identifier for the Study`), summarise, min_vis = min(`Visit Number`, na.rm = TRUE))
That's why I tried
ec_max <- ddply(AEC, .(`Subject Identifier for the Study`), summarise, min_vis = function(x) {ifelse(!is.na(x$`Visit Number`), min(x$`Visit Number`), 0 )})
but this code throws
Error in vector(type, length) :
vector: cannot make a vector of mode 'closure'.

Thanks a lot..

If I were in your shoes, rather than write a lengthy anonymous function, I would preprocess AEC (or at least make a temporary AEC2) where I simply replace all NAs with zero, before invoking ddply().

The sjmisc package has an excellent helper function:

sjmisc::replace_na(AEC,value=0)
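
A base R sketch of the same preprocessing, in case sjmisc is not installed. The small data frame is a made-up stand-in for AEC with shortened, hypothetical column names (`subject`, `visit`), and `AEC2` is just the temporary copy mentioned above. One thing to keep in mind: after the replacement, any subject with at least one missing visit gets a minimum of 0, not only the subjects whose visits are all missing.

```r
# Made-up stand-in for AEC; the real data has more rows and different column names.
AEC <- data.frame(
  subject = c(123, 123, 124, 124, 125, 125),
  visit   = c(5, 6, NA, 8, NA, NA)
)

# Work on a copy so the original NAs are kept in AEC.
AEC2 <- AEC
AEC2$visit[is.na(AEC2$visit)] <- 0

# Per-subject minimum with base R; ddply() could be run on AEC2 in the same way.
min_vis <- tapply(AEC2$visit, AEC2$subject, min)
print(min_vis)
```

Here subject 124 ends up with 0 even though it has a real visit (8), which may or may not be what is wanted.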

OK, thanks for your reply, I will try that. But could you please explain what the error message actually means, and how I can debug such an error?

Here, take a look at this:

library(plyr)

# example data
dfx <- data.frame(
  group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
  sex = sample(c("M", "F"), size = 29, replace = TRUE),
  age = runif(n = 29, min = 18, max = 54)
)
# make one of the groups entirely NA on age
dfx$age <- ifelse(dfx$group == "C" & dfx$sex == "M", NA_real_, dfx$age)
dfx


# the problem: min() returns Inf (with a warning) for the group whose ages are all NA
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2),
      min = min(age,na.rm=TRUE))

#a solution with an explicit custom function

mymin <- function(x) {min(ifelse(is.na(x),0,x))}
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2),
      mymin = mymin(age))

# could you make it fully anonymous? but then how would age be passed to it?
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2),
      mymin = function(x) {min(ifelse(is.na(x),0,x))})
# how would it know we want to pass age? (it doesn't, so this call fails)
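
A sketch of what goes wrong in that last call, as far as I can tell: `summarize` evaluates each expression once per group, and `function(x) {...}` evaluates to the function itself because nothing ever calls it. R's internal type for such a function is "closure", and plyr then fails when it tries to assemble a result column out of functions. Calling the anonymous function on the column directly avoids this. The tiny `dfx` below is a made-up, deterministic replacement for the random one above.

```r
library(plyr)

# Small deterministic stand-in for dfx (made up for this example).
dfx <- data.frame(
  group = c("A", "A", "B", "B"),
  age   = c(30, 20, NA, 40)
)

# An uncalled function is just an object whose type is "closure".
f <- function(x) {min(ifelse(is.na(x), 0, x))}
print(typeof(f))

# Base R cannot allocate a vector of that type; the error is caught and its message printed.
print(tryCatch(vector("closure", 1), error = function(e) conditionMessage(e)))

# Wrapping the anonymous function in parentheses and calling it on age works.
res <- ddply(dfx, .(group), summarize,
             mymin = (function(x) {min(ifelse(is.na(x), 0, x))})(age))
print(res)
```

In practice the named `mymin()` version is easier to read; the inline call is only here to show that the function has to be called with `age` somewhere.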

Thanks a lot for your detailed explanation, I am overwhelmed. I would just like to clarify one thing: how does function(x) get resolved in an anonymous function? That is, what argument generally gets passed? Does x resolve to dfx in the code below, and if so, why can't we use x$age? You can give an example with any other apply function as well; it is a general doubt of mine.
Sorry if these are silly doubts.
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2),
      mymin = function(x) {min(ifelse(is.na(x),0,x))})
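
On this last question, a hedged sketch rather than a definitive answer: in the call above `x` is never bound to anything, because the function sits inside `summarize` as a value and is not called. If a function is instead passed to `ddply()` in place of `summarize`, plyr calls it once per group with that group's rows as a data frame, so `x$age` works there. The small `dfx` is made up, and the all-NA guard is just one possible way to get 0 only for groups with no non-missing values.

```r
library(plyr)

# Made-up stand-in for dfx; group C has no non-missing ages.
dfx <- data.frame(
  group = c("A", "A", "B", "B", "C"),
  age   = c(30, 20, NA, 40, NA)
)

# The function is the third argument (.fun); x is the data frame for one group.
res <- ddply(dfx, .(group), function(x) {
  ages <- x$age
  data.frame(
    n_rows = nrow(x),
    # 0 only when every age is missing; otherwise the NAs are ignored.
    min_age = if (all(is.na(ages))) 0 else min(ages, na.rm = TRUE)
  )
})
print(res)
```

The same idea applies to base R helpers such as `sapply(split(dfx$age, dfx$group), function(x) ...)`, where `x` is whatever piece the helper hands to the function on each call.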
