Hi @integrity! Welcome to RStudio Community!
Can you please provide a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
Typically this will include:
- a toy data set (or in your case two toy datasets)
- the code you are using
- the output and/or error messages you are getting
- your desired output
By providing these things, especially with the help of the reprex package, you have a much better shot at getting the help that you are looking for!
You can try dput(data[1:5,]) and paste the result. Alternatively, you can install the datapasta package using devtools::install_github("MilesMcBain/datapasta") and run:
datapasta::dpasta(data[1:5,])
This will give you more cleanly formatted output that people here can copy and paste directly into their console.
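As a rough illustration of the dput() route, here is a minimal sketch. The data frame and its AMOUNT column are made up for the example (only CUSTOMER_NUMBER comes from this thread), so substitute your own table:

```r
# Hypothetical toy data standing in for your real table
data <- data.frame(
  CUSTOMER_NUMBER = factor(c("1001", "1005", "1010")),
  AMOUNT = c(25.5, 10, 42)
)

# dput() prints R code that rebuilds the object, including column types;
# this printed text is what you would paste into your post
dput(data[1:2, ])

# Round trip: capture the dput() text, evaluate it, and compare with the original
dput_text <- paste(capture.output(dput(data[1:2, ])), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
all.equal(rebuilt, data[1:2, ])
```

Because the dput() text keeps the factor/character/numeric distinction, anyone pasting it sees the same column types you do, which matters for problems like this one.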
The reason the CUSTOMER_NUMBER column changes values in the data1 table is that it is originally a factor. Factors are categorical data stored as integer codes, each associated with a label. Calling as.numeric() directly on a factor returns those internal codes rather than the numbers shown in the labels. So for data1 you first need to convert the column to character and then to numeric. Try changing this line:
data1$CUSTOMER_NUMBER = as.numeric(data1$CUSTOMER_NUMBER)
to
data1$CUSTOMER_NUMBER = as.numeric(as.character(data1$CUSTOMER_NUMBER))
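To see the difference on a small scale, here is a sketch with a made-up vector x (hypothetical values, not your data):

```r
# Hypothetical customer numbers stored as a factor
x <- factor(c("1001", "1005", "1010", "1005"))

# Direct conversion returns the internal level codes, not the labels
codes <- as.numeric(x)                  # 1 2 3 2

# Going through character first recovers the numbers shown in the labels
values <- as.numeric(as.character(x))   # 1001 1005 1010 1005

# Equivalent alternative: convert the levels once, then index by the factor
values2 <- as.numeric(levels(x))[x]     # 1001 1005 1010 1005
```

The last form does the same job as as.numeric(as.character(x)) and only converts each distinct level once, which can help on very long columns.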