Correlation between two variables

Hi. I'm doing a project on face care where I need to find the relationship between two categorical variables: gender (F or M) and type of product (normal skin, purifying, anti-age, hydrating, sensitive skin, other).

I used this code:

gender_impact <- face_prod ~ gender
summary(lm(gender_impact))

but it gives me the error "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'"

How can I solve this?

UPDATE:
The code I wrote is:

# Convert both columns to factors
face_prod <- factor(beauty$`type of face care prod`)
gender <- factor(beauty$Gender)

# Binomial GLM of product type on gender
gender_impact <- face_prod ~ gender
model1 <- glm(gender_impact, data = beauty, family = binomial)
summary(model1)

And the result is:

Call:
glm(formula = gender_impact, family = binomial, data = beauty)

Deviance Residuals:
Min 1Q Median 3Q Max
-1.7642 0.6884 0.6884 0.6884 0.6884

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3193 0.2167 6.089 1.13e-09 ***
genderM 15.2468 1385.3778 0.011 0.991

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 133.29 on 130 degrees of freedom
Residual deviance: 131.89 on 129 degrees of freedom
AIC: 135.89

Number of Fisher Scoring iterations: 15

So what can I conclude? Are they related? What are the values I have to look at to understand it?

Welcome to the community. Your response variable is categorical with more than two classes (normal skin, purifying, anti-age, hydrating, sensitive skin, other), so you need multinomial regression rather than a linear model (lm).
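
To make the multinomial suggestion concrete, here is a minimal sketch using multinom() from the nnet package, which is one common way to fit such a model in R (the reply above does not name a package, so this choice is an assumption). The data frame beauty_sim and its values are simulated stand-ins, not the real beauty data. The sketch compares a model with gender against an intercept-only model using a likelihood-ratio test.

```r
library(nnet)  # recommended package, normally installed together with R

set.seed(1)
# Hypothetical stand-in for the real `beauty` data; all values are simulated
products <- c("normal skin", "purifying", "anti-age",
              "hydrating", "sensitive skin", "other")
beauty_sim <- data.frame(
  gender    = factor(sample(c("F", "M"), 200, replace = TRUE)),
  face_prod = factor(sample(products, 200, replace = TRUE))
)

# Multinomial model with gender as predictor, plus an intercept-only baseline
fit_gender <- multinom(face_prod ~ gender, data = beauty_sim, trace = FALSE)
fit_null   <- multinom(face_prod ~ 1, data = beauty_sim, trace = FALSE)

# Likelihood-ratio test: does adding gender improve the fit?
ll_gender <- logLik(fit_gender)
ll_null   <- logLik(fit_null)
lr_stat   <- as.numeric(2 * (ll_gender - ll_null))
df_diff   <- attr(ll_gender, "df") - attr(ll_null, "df")
p_value   <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

cat("LR statistic:", lr_stat, "on", df_diff, "df, p-value:", p_value, "\n")
```

A small p-value would suggest that product type is associated with gender; a large one means the data give no evidence of a relationship. The same formula interface should accept a numeric predictor such as age in place of gender.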

Try logistic regression:
model1 <- glm(gender_impact, data = dataframename, family = binomial)
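
One caveat worth checking before relying on this: when the response is a factor, family = binomial treats the first level as "failure" and all other levels together as "success". With six product types, that means the model is really "first product type vs. everything else". The sketch below illustrates this on simulated data; beauty_sim and the column not_first are hypothetical names, not part of the original data.

```r
set.seed(1)
# Hypothetical stand-in for the real `beauty` data; all values are simulated
products <- c("anti-age", "hydrating", "normal skin",
              "other", "purifying", "sensitive skin")
beauty_sim <- data.frame(
  gender    = factor(sample(c("F", "M"), 200, replace = TRUE)),
  face_prod = factor(sample(products, 200, replace = TRUE))
)

# Binomial GLM fitted directly on the six-level factor
fit_factor <- glm(face_prod ~ gender, data = beauty_sim, family = binomial)

# Explicit recoding: 0 = first factor level, 1 = any other level
first_level <- levels(beauty_sim$face_prod)[1]
beauty_sim$not_first <- as.integer(beauty_sim$face_prod != first_level)
fit_binary <- glm(not_first ~ gender, data = beauty_sim, family = binomial)

# Both fits give the same coefficients, so the six classes were collapsed to two
same_fit <- isTRUE(all.equal(unname(coef(fit_factor)), unname(coef(fit_binary))))
print(same_fit)
```

In the output shown earlier in this thread, the genderM estimate of about 15 with a standard error above 1000 is the pattern one usually sees when one group falls entirely into a single outcome class, so that coefficient is probably not interpretable as it stands.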

Hi. I'm trying to see if there is a correlation between two categorical variables:

  • gender : F, M
  • type of product bought: hydrating, anti-age, purifying, normal skin, sensitive skin

The code I wrote is

gender_impact <- face_prod ~ gender
case1 <- glm(gender_impact, data = beauty, family = binomial)
summary(case1)
# Exponentiate the coefficients to get the odds ratios
round(exp(coef(case1)), 3)

and the result is

Call:
glm(formula = gender_impact, family = binomial, data = beauty)

Deviance Residuals:
Min 1Q Median 3Q Max
-1.7851 0.6741 0.6741 0.6741 0.6741

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3661 0.2241 6.097 1.08e-09 ***
genderM 15.2000 1385.3778 0.011 0.991

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 125.54 on 125 degrees of freedom
Residual deviance: 124.20 on 124 degrees of freedom
AIC: 128.2

Number of Fisher Scoring iterations: 15

(Intercept) genderM
3.92 3992694.08

What do I have to look at in order to say if there is a correlation or not?
And what if I have a number (age) instead of the gender? What code should I write and what should I look at?
Thank you
