I tried to select cases based on the values in a column, but I could not generate the model summary. Could you help me?

load("CONDO.Rdata")
CONDO.A <- CONDO[CONDO$SOLD=="A",]
CONDO$SOLD=="A"
as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[as.numeric(CONDO$SOLD)==1,]
x1 <- CONDO.A$FLOOR
x2 <- CONDO.A$DIST
x3 <- CONDO.A$VIEW
x4 <- CONDO.A$END
x5 <- CONDO.A$FURNISH
y <- CONDO.A$SOLD
fit1 <- lm(y~x1+x2)

The last line, fit1 <- lm(y~x1+x2), gives:
Warning messages:
1: In model.response(mf, "numeric") :
using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

Please read "Welcome to the RStudio Community!"

You list some statements that have no influence on the fit.
For example, as.numeric(CONDO$SOLD) only prints the result of converting that variable to numeric. It does not actually change the variable. For that you need to assign the result: CONDO$SOLD<-as.numeric(CONDO$SOLD).
Probably you meant to execute:

CONDO$SOLD<-as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[CONDO$SOLD==1,]
fit1 <- lm(SOLD~FLOOR+DIST,data=CONDO.A)

?
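To illustrate the point about assignment, here is a minimal sketch. The factor `sold` is a made-up stand-in for CONDO$SOLD (I do not have your data), and `sold_num` is just an illustrative name.

```r
# Toy factor standing in for CONDO$SOLD (hypothetical values, not the real data)
sold <- factor(c("A", "F", "A", "F"))

# Calling as.numeric() without assignment only prints the integer codes
as.numeric(sold)

# The original object is untouched: it is still a factor
is.factor(sold)

# Assigning the result is what actually keeps the converted values
sold_num <- as.numeric(sold)
is.numeric(sold_num)
```

Note that as.numeric() on a factor returns the internal level codes (here 1 for "A" and 2 for "F"), which is why comparing against 1 selects the "A" rows.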

It worked, but with a warning. I added the code you provided without understanding it.

summary(fit1)

Call:
lm(formula = SOLD ~ FLOOR + DIST, data = CONDO.A)

Residuals:
       Min         1Q     Median         3Q        Max
-9.557e-15 -9.590e-17  5.900e-17  2.657e-16  6.231e-16

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 1.000e+00  3.088e-16 3.238e+15   <2e-16 ***
FLOOR       1.073e-16  4.851e-17 2.212e+00   0.0292 *
DIST        2.710e-17  2.368e-17 1.144e+00   0.2551

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.77e-16 on 103 degrees of freedom
Multiple R-squared: 0.4999, Adjusted R-squared: 0.4901
F-statistic: 51.47 on 2 and 103 DF, p-value: 3.185e-16

Warning message:
In summary.lm(fit1) : essentially perfect fit: summary may be unreliable

Based on all of this, my code is now:
load("CONDO.Rdata")
CONDO.A <- CONDO[CONDO$SOLD=="A",]
CONDO$SOLD=="A"
as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[as.numeric(CONDO$SOLD)==1,]
x1 <- CONDO.A$FLOOR
x2 <- CONDO.A$DIST
x3 <- CONDO.A$VIEW
x4 <- CONDO.A$END
x5 <- CONDO.A$FURNISH
y <- CONDO.A$SOLD
CONDO$SOLD<-as.numeric(CONDO$SOLD)
CONDO.A <- CONDO[CONDO$SOLD==1,]
fit1 <- lm(SOLD~FLOOR+DIST,data=CONDO.A)
summary(fit1)

What should be removed? I am trying to understand. The whole idea was to select A (auction) and drop fixed price (f) from the SOLD column ...

I have added some comments to your code.

Why are you setting SOLD to have only one value?
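To show why that matters, here is a small sketch with made-up data. The column names SOLD, FLOOR and DIST mirror yours, but the values are invented, and PRICE is a hypothetical response column that I am assuming only for illustration.

```r
# Made-up data; PRICE is a hypothetical response, not taken from CONDO.Rdata
set.seed(1)
toy <- data.frame(
  SOLD  = factor(rep(c("A", "F"), each = 10)),
  FLOOR = rep(1:5, times = 4),
  DIST  = rep(c(1, 3, 5, 7), times = 5),
  PRICE = round(rnorm(20, mean = 200, sd = 20))
)

# Keep only the auction rows
toy.A <- toy[toy$SOLD == "A", ]

# Within the subset, SOLD takes a single value, so there is nothing to model
length(unique(toy.A$SOLD))

# A regression needs a response that varies; here the hypothetical PRICE
fit <- lm(PRICE ~ FLOOR + DIST, data = toy.A)
summary(fit)
```

With a constant response the intercept simply reproduces that constant and the residuals are rounding noise, which would explain the "essentially perfect fit" warning. Whatever column measures the outcome you care about should go on the left of the formula instead of SOLD.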


Thank you. I am looking at the auction sales, which are coded as 1, and dropping the fixed-price sales. This is why we subsetted (or selected cases). I need to use x1 through x5 when I test the second-order model and the interactions with the dummies. I basically have three models to test.
This is the complete model:
E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5

x1: floor height; x2: distance from elevator; x3: 1 if ocean view, 0 if not; x4: 1 if an end unit, 0 if not; x5: 1 if furnished, 0 if not.

We need a reproducible example (reprex)

As @FJCC points out, your code does not make sense. There is clearly something we are missing.
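One way to share data in a reproducible form is base R's dput(), which prints R code that recreates an object. The sketch below uses a made-up data frame, `CONDO_small`, as a stand-in, since we cannot load CONDO.Rdata. With the real data, something like dput(head(CONDO, 20)) would probably be enough.

```r
# Hypothetical stand-in for the real CONDO data frame (invented values)
CONDO_small <- data.frame(
  SOLD  = factor(c("A", "F", "A", "F", "A", "A")),
  FLOOR = c(1, 2, 3, 4, 5, 6),
  DIST  = c(3, 1, 5, 7, 2, 9)
)

# dput() prints R code that rebuilds the object; paste that output into the post
dput(CONDO_small)

# Round-trip check: evaluating the dput() text gives back the same data
dput_text <- paste(capture.output(dput(CONDO_small)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
all.equal(rebuilt, CONDO_small)
```

Posting the dput() output together with the exact code you ran lets others reproduce the warnings on their own machines.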