'newdata' had x rows but variables found have y rows - Error Code

Hi!

I'm trying to build a decision tree regression model using a dataset that consists of 260 rows and 56 columns (1 index column, 1 target variable, and 54 predictors).
I have separated the data into training and testing sets for building the model using these lines.

dt_WTI <- sort(sample(nrow(DataWTI_Lag1), nrow(DataWTI_Lag1)*.8))
dt_train_WTI<-DataWTI_Lag1[dt_WTI,]
dt_test_WTI<-DataWTI_Lag1[-dt_WTI,]

Training data has 208 rows and testing has 52.
I want to fit a regression of WTI price on one predictor at a time, not all 54 at once, so that I can compare the RMSE values and decide on the optimal lag for each predictor.

The first model I'm trying to build is price as a function of USDX. So I built it like this.

dtWTI_Lag1 <- rpart(dt_train_WTI$WTIPrice ~ dt_train_WTI$USDX)
summary(dtWTI_Lag1)

I then tried to use the model to predict on the testing data with the lines below.

predictor <- as.data.frame(dt_test_WTI[,c(3)])
colnames(predictor) <- "USDX"
prediction <- predict(dtWTI_Lag1,predictor)

But a warning message showed up.

Warning message:
'newdata' had 52 rows but variables found have 208 rows 

Here's the result of the prediction

 1        2        3        4        5        6        7        8        9       10       11       12 
48.02825 48.02825 37.75662 48.02825 48.02825 48.02825 48.02825 37.75662 37.75662 48.02825 48.02825 37.75662 
      13       14       15       16       17       18       19       20       21       22       23       24 
37.75662 37.75662 37.75662 54.38706 54.38706 54.38706 54.38706 54.38706 58.26438 58.26438 58.26438 67.98679 
      25       26       27       28       29       30       31       32       33       34       35       36 
51.58749 51.58749 51.58749 63.08409 51.58749 51.58749 51.58749 67.98679 67.98679 67.98679 51.58749 51.58749 
      37       38       39       40       41       42       43       44       45       46       47       48 
67.98679 51.58749 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 
      49       50       51       52       53       54       55       56       57       58       59       60 
63.08409 63.08409 63.08409 63.08409 51.58749 51.58749 67.98679 67.98679 67.98679 67.98679 67.98679 67.98679 
      61       62       63       64       65       66       67       68       69       70       71       72 
67.98679 67.98679 67.98679 67.98679 67.98679 67.98679 58.26438 67.98679 67.98679 67.98679 67.98679 67.98679 
      73       74       75       76       77       78       79       80       81       82       83       84 
67.98679 45.54071 54.38706 54.38706 54.38706 58.26438 58.26438 58.26438 58.26438 54.38706 45.54071 58.26438 
      85       86       87       88       89       90       91       92       93       94       95       96 
54.38706 45.54071 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 
      97       98       99      100      101      102      103      104      105      106      107      108 
54.38706 54.38706 54.38706 58.26438 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 
     109      110      111      112      113      114      115      116      117      118      119      120 
54.38706 54.38706 37.75662 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 54.38706 
     121      122      123      124      125      126      127      128      129      130      131      132 
45.54071 54.38706 54.38706 54.38706 54.38706 54.38706 37.75662 37.75662 54.38706 45.54071 45.54071 48.02825 
     133      134      135      136      137      138      139      140      141      142      143      144 
37.75662 37.75662 37.75662 37.75662 37.75662 37.75662 37.75662 37.75662 37.75662 54.38706 45.54071 54.38706 
     145      146      147      148      149      150      151      152      153      154      155      156 
54.38706 54.38706 45.54071 58.26438 67.98679 51.58749 51.58749 51.58749 51.58749 51.58749 51.58749 51.58749 
     157      158      159      160      161      162      163      164      165      166      167      168 
51.58749 67.98679 51.58749 51.58749 51.58749 51.58749 51.58749 63.08409 63.08409 63.08409 63.08409 63.08409 
     169      170      171      172      173      174      175      176      177      178      179      180 
63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 
     181      182      183      184      185      186      187      188      189      190      191      192 
63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 63.08409 51.58749 63.08409 63.08409 51.58749 51.58749 
     193      194      195      196      197      198      199      200      201      202      203      204 
51.58749 63.08409 63.08409 51.58749 67.98679 67.98679 67.98679 67.98679 67.98679 67.98679 67.98679 58.26438 
     205      206      207      208 
58.26438 58.26438 58.26438 58.26438 

Can somebody help me with this problem? I'm fairly new to R and RStudio and would greatly appreciate help from an expert. Thank you!

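You shouldn't use the $ syntax inside the rpart formula. Because the model was fitted as dt_train_WTI$WTIPrice ~ dt_train_WTI$USDX, predict() looks for a variable literally called dt_train_WTI$USDX, does not find it in newdata, and falls back to the 208-row training column. That is why you get the warning and 208 predictions instead of 52.
Use the data argument to tell rpart which dataset to fit on, and name the variables in the formula without $.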
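
A minimal sketch of that fix, in case it helps. The data frame below is a made-up stand-in for DataWTI_Lag1 (random numbers, and a hypothetical second predictor called Gold), so only the shape of the calls matters, not the values. The loop at the end is one possible way to get an RMSE per predictor, assuming every non-target column is a predictor; with the real data you would also need to drop the index column.

```r
library(rpart)

set.seed(42)  # make the random split reproducible

# Hypothetical stand-in for DataWTI_Lag1: 260 rows, a target and two predictors
DataWTI_Lag1 <- data.frame(
  WTIPrice = rnorm(260, mean = 55, sd = 10),
  USDX     = rnorm(260, mean = 95, sd = 5),
  Gold     = rnorm(260, mean = 1500, sd = 100)  # invented column, for illustration only
)

# Same 80/20 split as in the question
dt_WTI       <- sort(sample(nrow(DataWTI_Lag1), nrow(DataWTI_Lag1) * .8))
dt_train_WTI <- DataWTI_Lag1[dt_WTI, ]
dt_test_WTI  <- DataWTI_Lag1[-dt_WTI, ]

# Name the columns in the formula and pass the data frame through `data`
dtWTI_Lag1 <- rpart(WTIPrice ~ USDX, data = dt_train_WTI)

# predict() can now find USDX in newdata, so it returns one value per test row
prediction <- predict(dtWTI_Lag1, newdata = dt_test_WTI)
length(prediction)

# RMSE on the test set
rmse <- sqrt(mean((dt_test_WTI$WTIPrice - prediction)^2))

# One model per predictor: build each formula from the column name
predictors <- setdiff(names(dt_train_WTI), "WTIPrice")
rmse_by_predictor <- sapply(predictors, function(p) {
  fit  <- rpart(reformulate(p, response = "WTIPrice"), data = dt_train_WTI)
  pred <- predict(fit, newdata = dt_test_WTI)
  sqrt(mean((dt_test_WTI$WTIPrice - pred)^2))
})
rmse_by_predictor
```

Passing the whole test data frame as newdata is enough; there is no need to build a separate one-column data frame and rename it, because predict() only picks the columns named in the formula.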
