Error message while using the "humboldt" package

Hey! I am currently using a package called humboldt (GitHub - jasonleebrown/humboldt: Welcome to the offical webpage for the R-package ‘humboldt’) to reduce the set of climate factors that affect some animal species. I wrote the following:

env1 <- read.delim("Env1.csv", header = TRUE, sep = ",")
env2 <- read.delim("Env1.csv", header = TRUE, sep = ",")
sp1 <- na.exclude(read.delim("Occ_Data_Total.csv", header = TRUE, sep = ","))
sp2 <- na.exclude(read.delim("Occ_Data_Total.csv", header = TRUE, sep = ","))
reduc.vars <- humboldt.top.env(
  env1 = env1, env2 = env2, sp1 = sp1, sp2 = sp2,
  rarefy.dist = 50, rarefy.units = "km", env.reso = 0.416669,
  learning.rt1 = 0.01, learning.rt2 = 0.01, e.var = 4:17,
  pa.ratio = 4, steps1 = 50, steps2 = 50,
  method = "contrib", contrib.greater = 5
)

When I try to run the last command I end up with the following error message:
Error in gbm.fit(x = x, y = y, offset = offset, distribution = distribution, :
The data set is too small or the subsampling rate is too large: nTrain * bag.fraction <= n.minobsinnode

If any of you can help, I'd be grateful!

Hi @Nickmofoe,
Welcome to the RStudio Community Forum.

Although I have no knowledge of the humboldt package, I assume env1 and env2 should contain different data, yet you are reading the same file into both objects. The same applies to sp1 and sp2.

Hey! For sp1 and sp2 I made a typo in the post; they are different data sets. env1 and env2 are supposed to be the same (the package allows it).

It seems you will either need to get more data or reduce the subsampling rate.
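
To make the condition in the error message concrete, here is a minimal sketch in base R that checks the inequality `nTrain * bag.fraction <= n.minobsinnode` for a given number of rows. The function name `gbm_has_enough_data` and all the numbers are hypothetical, and I am assuming that nTrain is the training fraction times the number of rows; it is not taken from the gbm or humboldt source.

```r
# Hypothetical helper: TRUE if the inequality from the error message is NOT triggered.
# Assumption: nTrain = floor(train_fraction * n_rows).
gbm_has_enough_data <- function(n_rows, bag_fraction, n_minobsinnode,
                                train_fraction = 1) {
  n_train <- floor(train_fraction * n_rows)
  n_train * bag_fraction > n_minobsinnode
}

# Illustrative values only, not the defaults of any package.
small_ok <- gbm_has_enough_data(n_rows = 12, bag_fraction = 0.75, n_minobsinnode = 10)
large_ok <- gbm_has_enough_data(n_rows = 40, bag_fraction = 0.75, n_minobsinnode = 10)

print(small_ok)  # 12 * 0.75 = 9, which is not above 10
print(large_ok)  # 40 * 0.75 = 30, which is above 10
```

Going only by the inequality, the check passes with more training rows, a larger bag.fraction, or a smaller n.minobsinnode. Whether humboldt.top.env lets you set the last two is something I have not verified.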

Any clue how I can reduce the subsampling rate?
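
Before changing any settings, it may be worth checking how many occurrence rows actually survive `na.exclude()`, since that is the number the model has to work with. The sketch below uses a toy data frame with hypothetical column names in place of Occ_Data_Total.csv. My assumption, not something confirmed in this thread, is that rarefying at 50 km thins the points further, so the count reaching gbm could be smaller still.

```r
# Toy stand-in for an occurrence table; column names and values are hypothetical.
occ <- data.frame(
  sp = "species_a",
  x  = c(-60.1, -60.2, NA, -61.0, -61.5),
  y  = c(-3.0, NA, -3.2, -3.4, -3.9)
)

n_raw   <- nrow(occ)       # rows as read from file
occ_ok  <- na.exclude(occ) # drops every row that contains an NA
n_clean <- nrow(occ_ok)    # rows left for the analysis

cat("rows read:", n_raw, " rows after na.exclude:", n_clean, "\n")
```

If the count after cleaning is only a few dozen or fewer, the "data set is too small" half of the message is the more likely cause, and a smaller rarefy.dist or more occurrence records may help.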
