I keep getting the following error when I try to run a linear regression model.
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
Here is my code:
housing.Mod <- lm(Housing_Data$Region ~ YearEnd + MedianSalePrice + HomesSold + NewListings + Inventory +
                  DaysonMarket, data = Housing_Data)
Here is an example of the data set I am using:
Region | YearEnd | MedianSalePrice | HomesSold | NewListings | Inventory |
---|---|---|---|---|---|
National | 2012 | 182109.25 | 4890071 | 6152897 | 24404994 |
National | 2013 | 204697.3333 | 5371014 | 6724389 | 22427155 |
National | 2014 | 217788.3333 | 5346107 | 6917961 | 23229408 |
National | 2015 | 232530.5833 | 5891468 | 7391337 | 22772096 |
National | 2016 | 248143.8333 | 6219612 | 7417211 | 20863938 |
National | 2017 | 265047.75 | 6401982 | 7507710 | 18837459 |
Is it because I am using a character column as the response variable? If so, how do I write the model so that I can see the response for each region (there are six regions in total)? Any help is greatly appreciated!
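A character response is the likely cause: `lm()` tries to coerce `y` to numeric, which produces the "NAs introduced by coercion" warning and then the `NA/NaN/Inf in 'y'` error. A minimal sketch of two possible setups follows. It assumes that `MedianSalePrice` is the quantity of interest (the post does not say which numeric column should be the response), and it uses a small toy data frame in which the second region name and its values are invented purely for illustration.

```r
# Toy data, for illustration only. "OtherRegion" and its numbers are
# hypothetical; the National rows are rounded from the sample in the post.
Housing_Data <- data.frame(
  Region          = rep(c("National", "OtherRegion"), each = 4),
  YearEnd         = rep(2012:2015, times = 2),
  MedianSalePrice = c(182109, 204697, 217788, 232531,
                      140000, 150500, 158200, 169900),
  HomesSold       = c(4890071, 5371014, 5346107, 5891468,
                      900000, 960000, 955000, 1010000)
)

# Treat Region as a categorical variable rather than as the response.
Housing_Data$Region <- factor(Housing_Data$Region)

# Option 1 (assumed setup): numeric response, Region as a predictor.
# Each non-reference region gets its own intercept shift.
housing.Mod <- lm(MedianSalePrice ~ YearEnd + HomesSold + Region,
                  data = Housing_Data)
summary(housing.Mod)

# Option 2: fit a separate model for each region.
models_by_region <- lapply(
  split(Housing_Data, Housing_Data$Region),
  function(d) lm(MedianSalePrice ~ YearEnd + HomesSold, data = d)
)
lapply(models_by_region, coef)
```

Option 1 pools all regions into one fit and reports region effects relative to a reference level; Option 2 lets every coefficient differ by region, at the cost of far fewer observations per model. With the real data, the remaining predictors (`NewListings`, `Inventory`, `DaysonMarket`) could be added to either formula in the same way.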