I keep getting the following error when I try to run a linear regression model.
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion
Here is my code:
housing.Mod <- lm(Housing_Data$Region ~ YearEnd + MedianSalePrice + HomesSold + NewListings + Inventory +
                  DaysonMarket, data = Housing_Data)
Here is an example of the data set I am using:
| Region | YearEnd | MedianSalePrice | HomesSold | NewListings | Inventory |
| --- | --- | --- | --- | --- | --- |
| National | 2012 | 182109.25 | 4890071 | 6152897 | 24404994 |
| National | 2013 | 204697.3333 | 5371014 | 6724389 | 22427155 |
| National | 2014 | 217788.3333 | 5346107 | 6917961 | 23229408 |
| National | 2015 | 232530.5833 | 5891468 | 7391337 | 22772096 |
| National | 2016 | 248143.8333 | 6219612 | 7417211 | 20863938 |
| National | 2017 | 265047.75 | 6401982 | 7507710 | 18837459 |
Is it because I am using a character column for the response variable? If so, how do I write the model so that I can see the results broken down by region (there are six regions in total)? Any help is greatly appreciated!
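For reference, here is a minimal sketch that seems to reproduce the problem using only the sample rows above, followed by one possible way to get a fit per region. It assumes that `Region` is stored as character and, purely for illustration, that `MedianSalePrice` is the quantity to model with `YearEnd` as the only predictor. Neither choice is established by the question, and `err` and `models_by_region` are hypothetical names.

```r
# Sample rows copied from the table above; Region is stored as character.
Housing_Data <- data.frame(
  Region          = rep("National", 6),
  YearEnd         = 2012:2017,
  MedianSalePrice = c(182109.25, 204697.3333, 217788.3333,
                      232530.5833, 248143.8333, 265047.75),
  HomesSold       = c(4890071, 5371014, 5346107, 5891468, 6219612, 6401982),
  NewListings     = c(6152897, 6724389, 6917961, 7391337, 7417211, 7507710),
  Inventory       = c(24404994, 22427155, 23229408, 22772096, 20863938, 18837459),
  stringsAsFactors = FALSE
)

# lm() tries to coerce the response to numeric. A character response such as
# "National" becomes NA, so the fit stops with an error. The error message is
# captured here instead of halting the script.
err <- tryCatch(
  suppressWarnings(lm(Region ~ YearEnd, data = Housing_Data)),
  error = function(e) conditionMessage(e)
)
print(err)

# Assumption: MedianSalePrice is the response of interest (not stated in the
# question). Split the data frame on Region and fit one model per region.
models_by_region <- lapply(
  split(Housing_Data, Housing_Data$Region),
  function(d) lm(MedianSalePrice ~ YearEnd, data = d)
)

# Coefficients for each region's model.
print(lapply(models_by_region, coef))
```

With the full data set this would give one fitted model per region, six in total. An alternative, once more than one region is present, would be a single model with `Region` as a categorical predictor, for example `MedianSalePrice ~ Region + YearEnd`.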