Vegdist jaccard analysis - Data must be numeric error

R newbie here! I'm trying to carry out a Jaccard similarity test using vegdist, looking at 143 species across 3 sites over 2 time periods. I'm importing my data using

data = read.csv(file.choose(), header =TRUE, sep = "\t", row.names = 1)

When I view it, it looks like this: [screenshot not included]

I've tried to run Jaccard using:

vegdist( similarity, method="jaccard", binary=TRUE, header=TRUE)
vegdist(similarity, method="jaccard", binary=TRUE)
vegdist(similarity, method="jaccard", binary=TRUE, row.names = 1)
vegdist(similarity, method="jaccard", binary=TRUE, sep = "\t", row.names = 1)

but am getting this error

Error in vegdist(similarity, method = "jaccard", binary = TRUE, header = TRUE) :
input data must be numeric

Is this because I haven't defined the first column as names? If so, how do I do that?

Thanks in advance.

Hello LisaB,

Welcome to the forum! It was a little difficult to work out exactly what was going on. It always helps to include a minimal reproducible example.

I've created a dummy data frame below called df. In the first example, vegdist runs without a problem with both binary = TRUE and binary = FALSE, and the two results are identical because the data are already 0/1. In the second example I added a first column of letters, and running vegdist on that data frame gives the error you are seeing. I got around it by dropping that column before the call. vegdist needs input that consists only of numbers, so any text column has to be removed or turned into row names first.

#install.packages("vegan")
library(vegan)

# First example ----------------------------------------------------------

set.seed(333)

df <- data.frame(a = sample(c(0,1),10,replace = TRUE),
                b = sample(c(0,1),10,replace = TRUE),
                c = sample(c(0,1),10,replace = TRUE),
                d = sample(c(0,1),10,replace = TRUE),
                e = sample(c(0,1),10,replace = TRUE))


output1 <- vegdist(x = df,
        method = "jaccard",
        binary = TRUE)


output2 <- vegdist(x = df,
                   method = "jaccard",
                   binary = FALSE)

# Element-wise comparison of the two distance objects
output1 == output2




# Second example ----------------------------------------------------------



set.seed(333)

df2 <- data.frame(t = letters[1:10],
                 a = sample(c(0,1),10,replace = TRUE),
                 b = sample(c(0,1),10,replace = TRUE),
                 c = sample(c(0,1),10,replace = TRUE),
                 d = sample(c(0,1),10,replace = TRUE),
                 e = sample(c(0,1),10,replace = TRUE))


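# Fails with "input data must be numeric" because column t holds letters
output3 <- vegdist(x = df2,
                   method = "jaccard",
                   binary = TRUE)


# Works once the non-numeric first column is dropped
output4 <- vegdist(x = df2[,-1],
                   method = "jaccard",
                   binary = TRUE)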



output4 == output1
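
If it isn't obvious which column is causing the error in your own data, a check along these lines might help. This is only a sketch in base R: demo, is_num, bad_cols and demo_num are made-up names, and the data frame is a hypothetical stand-in for your imported table, not your actual data.

```r
# Hypothetical stand-in for the imported data: one label column plus 0/1 columns
demo <- data.frame(site = c("s1", "s2", "s3"),
                   sp1  = c(1, 0, 1),
                   sp2  = c(0, 0, 1),
                   sp3  = c(1, 1, 0),
                   stringsAsFactors = FALSE)

# TRUE for numeric columns, FALSE for anything else (text, factors, ...)
is_num <- sapply(demo, is.numeric)
print(is_num)

# Names of the columns that would make the input non-numeric
bad_cols <- names(demo)[!is_num]
print(bad_cols)

# Keep the labels as row names, then keep only the numeric columns
rownames(demo) <- demo$site
demo_num <- demo[, is_num, drop = FALSE]

# An all-numeric data frame converts to a numeric matrix
is.numeric(as.matrix(demo_num))
```

Running str() on your own data frame shows the same information, one line per column, and demo_num is the kind of object you would then hand to vegdist.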

One more thing: header, sep and row.names are arguments for read.csv, not for vegdist, so passing them to vegdist does nothing useful. The arguments vegdist itself takes are:

vegdist(x, method="bray", binary=FALSE, diag=FALSE, upper=FALSE,
        na.rm = FALSE, ...) 
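
To show where header, sep and row.names do belong, here is a small sketch of the import step. It assumes a tab-separated layout with the site labels in the first column. The text in txt and the name sim are invented for illustration, and read.csv reads from a string here only so the example runs without a file.

```r
# Hypothetical tab-separated content standing in for the real file
txt <- "site\tsp1\tsp2\tsp3\nA_t1\t1\t0\t1\nA_t2\t0\t0\t1\nB_t1\t1\t1\t0\n"

# header, sep and row.names are handled here, at import time;
# row.names = 1 turns the first column into row names instead of data
sim <- read.csv(text = txt, header = TRUE, sep = "\t", row.names = 1)

# Inspect the result: the labels are row names, the columns are all numeric
str(sim)
rownames(sim)
all(sapply(sim, is.numeric))
```

If the last line gives TRUE for your real data, the data frame should be in a shape vegdist accepts. If it gives FALSE, some column was read as text, for example because of a stray character in the file.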

Let me know if you're sorted :)
