I tried selecting cases in a column but could not generate the summary. I wonder if you can help me.

load("CONDO.Rdata")
CONDO.A <- CONDO[CONDO$SOLD=="A",]
CONDO$SOLD=="A"
as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[as.numeric(CONDO$SOLD)==1,]
x1 <- CONDO.A$FLOOR
x2 <- CONDO.A$DIST
x3 <- CONDO.A$VIEW
x4 <- CONDO.A$END
x5 <- CONDO.A$FURNISH
y <- CONDO.A$SOLD
fit1 <- lm(y~x1+x2)

Running this code: fit1 <- lm(y~x1+x2) gives me:
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

Please read Welcome to the RStudio Community!.

You list some statements that have no influence on the fit.
E.g. as.numeric(CONDO$SOLD) only shows the result of converting that variable to numeric. It does not actually change the variable. For that you need to assign the result: CONDO$SOLD<-as.numeric(CONDO$SOLD).
Probably you meant to execute:

CONDO$SOLD<-as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[CONDO$SOLD==1,]
fit1 <- lm(SOLD~FLOOR+DIST,data=CONDO.A)

?
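A minimal sketch of the difference between printing a conversion and storing it. The vector `sold` is a made-up stand-in for CONDO$SOLD (the real data is not shown in this thread), and the levels "A" and "F" are assumptions.

```r
# Toy stand-in for the SOLD column; values are invented for illustration
sold <- factor(c("A", "F", "A", "F", "A"))

as.numeric(sold)               # prints the integer codes; 'sold' itself is unchanged
is.factor(sold)                # still TRUE

sold_num <- as.numeric(sold)   # only an assignment stores the converted values
is.numeric(sold_num)           # TRUE
```

Note that as.numeric() on a factor returns the internal level codes (levels are sorted alphabetically by default), not the original labels.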

It worked, with a warning. I added the input you provided without understanding it.
input


or

summary(fit1)

Call:
lm(formula = SOLD ~ FLOOR + DIST, data = CONDO.A)

Residuals:
       Min         1Q     Median         3Q        Max
-9.557e-15 -9.590e-17  5.900e-17  2.657e-16  6.231e-16

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 1.000e+00  3.088e-16 3.238e+15   <2e-16 ***
FLOOR       1.073e-16  4.851e-17 2.212e+00   0.0292 *
DIST        2.710e-17  2.368e-17 1.144e+00   0.2551

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.77e-16 on 103 degrees of freedom
Multiple R-squared: 0.4999, Adjusted R-squared: 0.4901
F-statistic: 51.47 on 2 and 103 DF, p-value: 3.185e-16

Warning message:
In summary.lm(fit1) : essentially perfect fit: summary may be unreliable

Based on all of this, my script is now:
load("CONDO.Rdata")
CONDO.A <- CONDO[CONDO$SOLD=="A",]
CONDO$SOLD=="A"
as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[as.numeric(CONDO$SOLD)==1,]
x1 <- CONDO.A$FLOOR
x2 <- CONDO.A$DIST
x3 <- CONDO.A$VIEW
x4 <- CONDO.A$END
x5 <- CONDO.A$FURNISH
y <- CONDO.A$SOLD
CONDO$SOLD<-as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[CONDO$SOLD==1,]
fit1 <- lm(SOLD~FLOOR+DIST,data=CONDO.A)
summary(fit1)

What should be eliminated? I am trying to understand. The whole idea was to select A (auction) and remove fixed price (F) from the SOLD column ...
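For reference, selecting only the auction rows could look like the sketch below. The data frame `toy` is invented for illustration, and the level label "F" for fixed price is an assumption; only "A" appears in the code posted above.

```r
# Toy data; the real CONDO data frame is not shown in this thread
toy <- data.frame(
  SOLD  = factor(c("A", "F", "A", "A", "F")),
  FLOOR = c(1, 3, 5, 2, 8)
)

toy_a <- toy[toy$SOLD == "A", ]   # keep the auction rows only
toy_a <- droplevels(toy_a)        # drop the now-unused "F" level

table(toy_a$SOLD)                 # counts per remaining level
nrow(toy_a)                       # number of auction rows kept
```

Comparing against the label ("A") avoids having to remember which integer code the factor assigned to each level.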

I have added some comments to your code.

Why are you setting SOLD to have only one value?
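A small sketch of why that matters, using simulated data (the column names mirror the thread, the values are made up). When every response value is identical there is nothing for the predictors to explain, so the fit is just the intercept and the slopes are rounding noise at most, which is what triggers the "essentially perfect fit" warning.

```r
set.seed(1)

# Simulated stand-in: the response is 1 in every row, as after subsetting on SOLD == 1
toy <- data.frame(
  SOLD  = rep(1, 20),
  FLOOR = sample(1:8, 20, replace = TRUE),
  DIST  = sample(1:15, 20, replace = TRUE)
)

var(toy$SOLD)                  # 0: the response has no variation

fit_const <- lm(SOLD ~ FLOOR + DIST, data = toy)
coef(fit_const)                # intercept close to 1, slopes close to 0
```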


Thank you. I am addressing the auction price, which is coded as number 1, and taking off the fixed price. This is why we subsetted (or selected cases). I need to use x1 through x5 when I test the second-order model and the interactions with dummies. I basically have three models to test.
This is the complete model
E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

x1: floor height; x2: distance from elevator; x3: 1 if ocean view, 0 if not; x4: 1 if an end unit, 0 if not; x5: 1 if furnished, 0 if not.
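A hedged sketch of how the complete model might be fitted once the response is a numeric variable. Everything here is simulated: the column name PRICE and all coefficients used to generate it are assumptions for illustration, since the thread never shows which column holds the auction price.

```r
set.seed(42)
n <- 120

# Simulated auction-only rows; predictor names follow the thread
auction <- data.frame(
  FLOOR   = sample(1:8, n, replace = TRUE),    # x1: floor height
  DIST    = sample(1:15, n, replace = TRUE),   # x2: distance from elevator
  VIEW    = rbinom(n, 1, 0.5),                 # x3: 1 if ocean view
  END     = rbinom(n, 1, 0.3),                 # x4: 1 if end unit
  FURNISH = rbinom(n, 1, 0.4)                  # x5: 1 if furnished
)

# Hypothetical numeric response; PRICE is an assumed name, not taken from the thread
auction$PRICE <- 200 - 5 * auction$FLOOR + 1 * auction$DIST +
  30 * auction$VIEW - 20 * auction$END + 10 * auction$FURNISH +
  rnorm(n, sd = 15)

# E(y) = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4 + b5*x5
fit_full <- lm(PRICE ~ FLOOR + DIST + VIEW + END + FURNISH, data = auction)
summary(fit_full)
```

Using column names with the data argument makes the separate x1 through x5 vectors unnecessary, and interaction terms can later be added in the formula (for example FLOOR:VIEW).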

We need a reproducible example (reprex)

As @FJCC points out, your code does not make sense. There is clearly something we are missing.
