Correlation between a limited set of numeric x values and a continuous set of y values?

Hi all,

I'm working with a data set in which I have a limited number of "x" values but a continuous set of "y" values corresponding to each "x" value. Specifically, I am working with seven different field sites, each of which has a different width. I sampled multiple plots at each site, giving me a list of 12 sample diversities taken from each plot. I'd like to see if there's a correlation between the diversity at each site and the width of the site.

I ran a Pearson's correlation test: cor.test(buf, rnat, method = "pearson", use="complete.obs") and got a significant result back, but then realized that the data describing my site widths (my "x" values) were not normally distributed, which is an assumption of Pearson's test.

My question is: is there a way to estimate the correlation between my limited set of x data and my larger set of y data without assuming a normal distribution? I know Spearman's rank correlation can be used with non-normal data, but it requires me to rank the data, which doesn't seem to be a viable method given there are so many more "y" values than "x".

Thank you

Pearson's r does not require that the variables themselves be normally distributed. Rather, the significance test assumes that the residuals, i.e. the vertical distances between the observed y values and the straight line relating x and y, are normally distributed with mean zero. A good idea is to plot your data. The plot will be a bit ugly because you have only seven x values, so the y values will stack up in vertical strips, one per site width. Still, you'll be able to tell if something is obviously wrong.
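
In case it helps, here is a rough sketch of that check in R. The data are simulated, not yours: the seven widths, the 12 plots per site, and the slope and noise level are all made-up assumptions, and I only reuse the names buf and rnat from your call. It plots the data, looks at the residuals around the fitted line, and then runs the Pearson test. In R the Pearson test is the same t-test as the slope test in a simple linear regression, which is why the residuals are the thing to look at.

```r
# Hypothetical data: 7 sites with made-up widths, 12 plots per site (assumed).
set.seed(42)
widths <- c(5, 10, 15, 20, 30, 40, 60)            # invented site widths
buf    <- rep(widths, each = 12)                   # x: one width per plot
rnat   <- 2 + 0.05 * buf + rnorm(length(buf))      # y: simulated diversity

# Scatterplot: the points stack in vertical strips, one strip per site width.
plot(buf, rnat, xlab = "Site width", ylab = "Diversity")
fit <- lm(rnat ~ buf)                              # straight line relating x and y
abline(fit)

# Residuals = vertical distances from the fitted line.
# They should look roughly symmetric around zero and close to the Q-Q line.
res <- resid(fit)
qqnorm(res)
qqline(res)

# Pearson correlation test on the same data.
result <- cor.test(buf, rnat, method = "pearson")
print(result$estimate)

# The t statistic matches the slope test from lm().
slope_t <- summary(fit)$coefficients["buf", "t value"]
print(c(cor_test = unname(result$statistic), lm_slope = slope_t))
```

If the Q-Q plot shows strong curvature, or the strips in the scatterplot are clearly lopsided or fan out with width, that would be the "obviously wrong" case.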

I see. So, as long as they are generally symmetrical around the x/y line, they should be good enough to run a Pearson's?

Yeah. I don't think you have many options, and this one is probably fine.
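
If you want a cross-check, Spearman's rank correlation does work with repeated x values: each variable is ranked on its own, and tied values simply share their average rank. A minimal sketch, again on simulated data (the widths, the 12 plots per site, and the effect size are assumptions for illustration only):

```r
# Hypothetical data with the same shape as described: 7 widths x 12 plots.
set.seed(1)
buf  <- rep(c(5, 10, 15, 20, 30, 40, 60), each = 12)   # invented site widths
rnat <- 2 + 0.05 * buf + rnorm(length(buf))            # simulated diversity

# rank() assigns tied observations their average rank by default,
# so all plots from the narrowest site get the same rank.
print(head(rank(buf), 3))

# exact = FALSE requests the approximate p-value; the exact one
# cannot be computed when there are ties.
sp <- cor.test(buf, rnat, method = "spearman", exact = FALSE)
print(sp$estimate)

# Spearman's rho is Pearson's r computed on the ranks.
rho_by_hand <- cor(rank(buf), rank(rnat))
print(rho_by_hand)
```

With this many ties in x the Spearman result is less informative than usual, so I would treat it as a sanity check next to the Pearson result rather than a replacement.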

Thanks Chuck, I went ahead with the correlations and they are working out just fine.
