A problem with a correlation

Hello,
Maybe this post is more about mathematics than about R.
I have created a sample (with R),
using a function plusrunif, which is the one from my other post of this morning.

Now I am doing the reverse job: reading the data and trying to estimate the relationship,
but it doesn't fit…

n <- 200
plusrunif <- function(unvecteur, unautre) {
  # requires unvecteur < unautre (element-wise)
  # runif() is vectorised over min and max, so no loop or global n is needed
  runif(length(unvecteur), min = unvecteur, max = unautre)
}


a3 <- 10
b3 <- -2
x3 <- runif(n,0,1)
e3 <- exp(a3*x3+b3)
y3bas <- e3/(1+e3)*0.8
y3haut<- e3/(1+e3)*1
y3 <- plusrunif(y3bas,y3haut)

z3 <- log(y3/(1-y3))  # logit of y3
t3 <- x3
reg <- lm(z3~t3)      # linear regression on the logit scale
ab <- reg$coefficients
a <- ab[2]            # slope
b <- ab[1]            # intercept
#a=10.5
#b=-2.36
plot(x3,y3)
# xmin and xmax were never defined; use the range of x3 instead
curve(exp(a*x+b)/(exp(a*x+b)+1), from = min(x3), to = max(x3), n = 200, col = "green", add = TRUE)

The calculations are not very complicated indeed… I have redone them, and my students obtain the same bad fit…

I can't find where the problem is…

tout is missing. Could you provide this in the form of a reprex? See the FAQ.

Sorry, there is no "tout", since I create the data myself.
I have edited my first post:
I had forgotten to remove "tout$".

I think there's just a little naivete about how much distortion is introduced by y3bas being scaled by 0.8.
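
To make the size of that distortion concrete, here is a short derivation. It assumes each y3 value can be written as u times y3haut, with u uniform on [0.8, 1], which is what drawing uniformly between y3bas and y3haut amounts to. The symbols u and s are introduced here only for illustration; they do not appear in the original code.

```latex
\begin{aligned}
s &= \frac{e^{a x + b}}{1 + e^{a x + b}}, \qquad y = u\,s, \qquad u \sim \mathcal{U}(0.8,\,1),\\
\frac{y}{1-y} &= \frac{u\,e^{a x + b}}{1 + (1-u)\,e^{a x + b}},\\
\log\frac{y}{1-y} &= a x + b + \log u - \log\!\bigl(1 + (1-u)\,e^{a x + b}\bigr).
\end{aligned}
```

Only for u = 1 does the last line reduce to a x + b. With a = 10, b = -2 and x near 1, e^{a x + b} is about e^8 ≈ 2981, so the final term is far from negligible and z3 levels off near log(u/(1-u)) instead of following a straight line in x.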

Well…

OK, there is some distortion…

But why are the a and b calculated by R worse than 10.5 and -2.36?

How is it that R gives a and b that do not produce the best Pearson correlation?

You see, if you manually set a and b to the values commented out with # in my code, the Pearson correlation is better and the curve fits visually much better than with the a and b calculated by R.

There is something I don't understand…
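
One way to look at this question is to compute both criteria side by side. The sketch below assumes the same setup as the original post (a3 = 10, b3 = -2, lower factor 0.8) with an arbitrary seed; sse_logit and pearson_y are hypothetical helper names, not anything from the thread. lm() minimises the squared error of z3 on the logit scale, so on that criterion its coefficients cannot lose to the manual pair; it makes no promise about the Pearson correlation on the original y3 scale.

```r
set.seed(42)  # arbitrary seed, assumed only for reproducibility
n  <- 200
x3 <- runif(n, 0, 1)
s  <- plogis(10 * x3 - 2)     # same as e3 / (1 + e3) with a3 = 10, b3 = -2
y3 <- runif(n, 0.8 * s, s)    # same sampling as plusrunif(y3bas, y3haut)
z3 <- qlogis(y3)              # same as log(y3 / (1 - y3))

ab_lm     <- coef(lm(z3 ~ x3))  # (intercept b, slope a) found by R
ab_manual <- c(-2.36, 10.5)     # (b, a) tried by hand in the original post

# hypothetical helpers: one criterion per scale
sse_logit <- function(ab) sum((z3 - (ab[1] + ab[2] * x3))^2)
pearson_y <- function(ab) cor(y3, plogis(ab[1] + ab[2] * x3))

# squared error on the logit scale: this is what lm() minimises
print(c(lm = sse_logit(ab_lm), manual = sse_logit(ab_manual)))
# Pearson correlation on the original scale: lm() does not optimise this
print(c(lm = pearson_y(ab_lm), manual = pearson_y(ab_manual)))
```

If the goal is a good fit on the y3 scale, the logit-scale regression is simply answering a different question.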

Your a and b give you y3haut, not y3...

y3 won't be particularly close to y3haut on these scales unless you adjust the 0.8 on y3bas to be closer to 1, like 0.99995 or something.
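
A quick way to check this is to rerun the regression with both lower factors. This is a minimal sketch, assuming the same a3 = 10 and b3 = -2 as in the original post and an arbitrary seed; fit_logit is a hypothetical helper written for this illustration.

```r
set.seed(1)  # arbitrary seed, assumed only for reproducibility
n  <- 200
a3 <- 10
b3 <- -2
x3 <- runif(n, 0, 1)
s  <- plogis(a3 * x3 + b3)   # same as e3 / (1 + e3), i.e. y3haut

# hypothetical helper: sample y3 between lower * s and s,
# then regress logit(y3) on x3 and return (intercept, slope)
fit_logit <- function(lower) {
  y <- runif(n, lower * s, s)
  unname(coef(lm(qlogis(y) ~ x3)))
}

ab_wide   <- fit_logit(0.8)      # y3bas = 0.8 * y3haut, as in the post
ab_narrow <- fit_logit(0.99995)  # y3bas almost equal to y3haut

print(ab_wide)    # slope well away from a3 = 10
print(ab_narrow)  # close to (b3, a3) = (-2, 10)
```

With the lower factor near 1, y3 is nearly y3haut and the logit is nearly linear in x3, so the regression recovers a3 and b3; with 0.8 it does not.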
