Issue with full_join(): new rows that do not belong to any data frame

Hi @integrity! Welcome to RStudio Community!

Can you please provide a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

Typically this will include:

  • a toy data set (or in your case two toy datasets)
  • the code you are using
  • the output and/or error messages you are getting
  • your desired output

By providing these things, especially with the help of the reprex package, you have a much better shot at getting the help that you are looking for!

You can try using dput(data[1:5,]) and pasting the result. Alternatively, you can install the datapasta package using devtools::install_github("MilesMcBain/datapasta") and run:

datapasta::dpasta(data[1:5,])

This will give you more cleanly formatted output that people here can copy and paste directly into their console.
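As a rough sketch of the dput() route, using only base R: the data frame below is a made-up stand-in for your real data (the column names and values are assumptions, not taken from your post). The point is that the text dput() prints can be pasted into another R session to rebuild the same object.

```r
# Hypothetical toy data; replace with your own data frame
data <- data.frame(
  CUSTOMER_NUMBER = c(101, 102, 103, 104, 105, 106),
  REGION = c("north", "south", "east", "west", "north", "south"),
  stringsAsFactors = FALSE
)

# Print a text representation of the first five rows; paste this into your post
dput(data[1:5, ])

# Anyone can rebuild the object from that text.
# Here the round trip is simulated by capturing and re-evaluating the output.
pasted_text <- paste(capture.output(dput(data[1:5, ])), collapse = "\n")
rebuilt <- eval(parse(text = pasted_text))
```

Posting the dput() text instead of a screenshot or a printed table also preserves column types, which matters when a column is secretly a factor.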


So the reason the CUSTOMER_NUMBER column is changing values for the data1 table is that it is originally a factor. Factors are categorical data stored as integer codes, each associated with a label. Calling as.numeric() directly on a factor returns those integer codes rather than the labels, so you first need to convert the column to character and then to numeric. Try changing this line:

data1$CUSTOMER_NUMBER = as.numeric(data1$CUSTOMER_NUMBER)

to

data1$CUSTOMER_NUMBER = as.numeric(as.character(data1$CUSTOMER_NUMBER))
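Here is a small illustration of the difference, assuming a made-up data1 with three customer numbers (the values are invented for the example, not taken from your data):

```r
# Hypothetical stand-in for data1, with CUSTOMER_NUMBER stored as a factor
data1 <- data.frame(
  CUSTOMER_NUMBER = factor(c("300", "1050", "20"))
)

# Levels are sorted as text: "1050", "20", "300"
levels(data1$CUSTOMER_NUMBER)

# as.numeric() on the factor gives the position of each value among the levels,
# not the customer number itself
codes <- as.numeric(data1$CUSTOMER_NUMBER)

# Going through character first recovers the actual numbers
values <- as.numeric(as.character(data1$CUSTOMER_NUMBER))

print(codes)
print(values)
```

If the join key in one table holds the codes while the other table holds the real numbers, the keys no longer match, which would explain rows that appear to belong to neither data frame.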