Some of you may remember the tidylog package. Right now, I'm working on improving the output for join operations such as
inner_join, etc., and would welcome feedback on what the package should report.
This is a first draft, loosely modeled on what Stata reports for merges:
> tidylog::left_join(flights[1:10000, ], airlines[1:10, ], by = "carrier")
#> left_join: added one column (name)
#>            rows only in x    2,783
#>            rows only in y  (     0)
#>            matched rows      7,217
#>                            ========
#>            rows total       10,000
(Any time a number is printed in parentheses, it means that those rows are not included in the result.)
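For anyone who wants to reason about these categories by hand, here is a minimal sketch of how the three counts could be derived with dplyr's filtering joins. It is not how tidylog computes them. The toy tables and the helper `join_counts()` are made up for illustration, and the "matched" count here is the number of matching rows in x, which equals the number of matched result rows only when the keys in y are unique.

```r
library(dplyr)

# Toy data (hypothetical, not the nycflights13 tables used above)
x <- tibble(carrier = c("AA", "AA", "UA", "ZZ"), flight = 1:4)
y <- tibble(carrier = c("AA", "UA", "DL"),
            name = c("American", "United", "Delta"))

# Hypothetical helper: count rows by match status for a join on `by`
join_counts <- function(x, y, by) {
  c(
    only_in_x = nrow(anti_join(x, y, by = by)),  # x rows without a match in y
    only_in_y = nrow(anti_join(y, x, by = by)),  # y rows without a match in x
    matched_x = nrow(semi_join(x, y, by = by))   # x rows with at least one match
  )
}

counts <- join_counts(x, y, by = "carrier")
print(counts)
```

For a left join, the rows only in y are the ones that get dropped, which is why that number would be shown in parentheses.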
Because joins are complicated and cover a lot of different use cases, I would welcome additional input on this. It's also possible to test the current implementation (which surely still has some bugs). See the GitHub issue here for more information: https://github.com/elbersb/tidylog/issues/25
Another interesting thing to report would be the number of rows that were duplicated, but I'm not sure yet how to approach this.
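One possible way to think about duplication, offered only as a rough sketch and not as a proposal for the actual implementation: count how many rows in y match each row in x, then report how many x rows match more than once and how many extra rows result. The toy tables and the variable names (`matches_per_x_row`, `x_rows_duplicated`, `extra_rows`) are made up for illustration, and this only covers duplication of x rows, not of y rows.

```r
library(dplyr)

# Toy data (hypothetical): id 1 appears three times in y
x <- tibble(id = c(1, 2, 3), value = c("a", "b", "c"))
y <- tibble(id = c(1, 1, 1, 2), other = c("p", "q", "r", "s"))

# Number of rows in y that match each row of x (0 if there is no match)
matches_per_x_row <- x %>%
  left_join(count(y, id, name = "n_matches"), by = "id") %>%
  mutate(n_matches = coalesce(n_matches, 0L))

# x rows that appear more than once in the join result
x_rows_duplicated <- sum(matches_per_x_row$n_matches > 1)

# Rows in the result beyond the first copy of each x row
extra_rows <- sum(pmax(matches_per_x_row$n_matches - 1L, 0L))

print(c(x_rows_duplicated = x_rows_duplicated, extra_rows = extra_rows))
```

With these definitions, the number of rows in a left join equals `nrow(x) + extra_rows`, which might be a useful consistency check for whatever ends up being reported.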