Comparing 2 csv files

I have 2 CSV files with around 22 columns in each. I want to compare them based on either all the columns or a specified subset of columns, and show which rows are different and which rows are new.

Is that possible simply in R?

cheers

Hi @ply,

You may want to try the {waldo} package and its compare() function. You can compare the two data frames after reading the CSVs into R.

library(waldo)

x <- read.csv(...)
y <- read.csv(...)

compare(x, y)

If you only want to compare a subset of columns, you can subset those columns and then use compare() on the subsetted data frames.
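As a minimal sketch of that subsetting step: the data frames and the column names in `cols` below are made-up placeholders standing in for your two CSVs, so swap in your own.

```r
library(waldo)

# Hypothetical stand-ins for the two data frames read from the CSVs
x <- data.frame(id = 1:3, name = c("a", "b", "c"), score = c(10, 20, 30))
y <- data.frame(id = 1:3, name = c("a", "b", "z"), score = c(10, 25, 30))

# Columns to compare (placeholder names)
cols <- c("id", "name")

# Compare only the selected columns; 'score' is ignored here
res <- compare(x[cols], y[cols])
res
```

compare() returns a character vector of differences, so `length(res) == 0` is a quick check for "no differences in these columns".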

Thanks. I did try waldo, but I did not find the output very clear.

If you would like to share the output I could help clarify the meaning.

This is also a good place to start understanding waldo better: https://waldo.r-lib.org/

Thanks

I widened my console and it made the output more readable, but is there a way of putting the results in a table/df?
It says at the end ".. and 27 more differences...." but I don't know how to see them.

cheers

You could do something like this to wrap it into a tibble:

tibble::enframe(compare(x, y), NULL, 'diff')

But I'm not sure it improves the readability much.
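On the "more differences" message: compare() has a `max_diffs` argument that caps how many differences are printed (as far as I know it defaults to 10 outside of CI). Here is a rough sketch that lifts the cap and then wraps the result as above. The toy data frames are invented for illustration only.

```r
library(waldo)

# Hypothetical stand-ins for the two data frames read from the CSVs
x <- data.frame(id = 1:30, value = 1:30)
y <- data.frame(id = 1:30, value = c(1:15, 101:115))

# max_diffs = Inf asks compare() to report every difference
full <- compare(x, y, max_diffs = Inf)
full

# Same result as a one-column tibble, one row per element of the output
diffs <- tibble::enframe(full, name = NULL, value = "diff")
diffs
```

The tibble still holds waldo's formatted text rather than one row per changed record, so it is mainly useful for scrolling through or saving the full output.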

This seems like the previous issue we discussed; you didn't comment on my final solution ...
