Changing Base R code to dplyr/sparklyr

Could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.
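To give a rough idea of what a self-contained example looks like, here is a minimal sketch of a script you could turn into a reprex. The data frame `df`, its columns (`id`, `date`, `amount`), and the summary step are all made up for illustration; swap in a small slice of your own data and the code that is giving you trouble.

```r
library(dplyr)

# Hypothetical stand-in data, defined inline so anyone can run the script as-is
df <- tibble::tribble(
  ~id, ~date,        ~amount,
  1,   "2020-01-01", 10,
  1,   "2020-01-20", 15,
  2,   "2020-02-01", 20
)

# The step you need help with goes here (this summary is only a placeholder)
result <- df %>%
  mutate(date = as.Date(date)) %>%
  group_by(id) %>%
  summarise(total = sum(amount))

result
```

Copy the whole script to your clipboard and run reprex::reprex(), which renders the code together with its output in a form you can paste into a post.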

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked below.


Without sample data, it is hard to be sure that what I suggest is what you want. As a guess, I would try saving an intermediate data frame after the third mutate() call in your revised code. From there you can use the fuzzyjoin package to join that data frame to itself, matching rows whose dates fall within 30 days of each other. That may turn out not to be the best option in your case, but it is where my head went when I first read your last post.
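Here is a minimal sketch of that self-join idea, under some loud assumptions: the intermediate data frame is a local (in-memory) data frame, it has a grouping column I am calling `id` and a Date column I am calling `date`, and "within 30 days" means a later row no more than 30 days after an earlier one. None of those names or rules come from your code, so adjust them to fit.

```r
library(dplyr)
library(fuzzyjoin)

# Hypothetical stand-in for the intermediate data frame saved after the third mutate()
visits <- tibble::tibble(
  id   = c(1, 1, 1, 2, 2),
  date = as.Date(c("2020-01-01", "2020-01-20", "2020-03-15",
                   "2020-02-01", "2020-02-10"))
)

# Join the data frame to itself: same id, and the right-hand date
# falls after the left-hand date by at most 30 days
pairs <- fuzzy_inner_join(
  visits, visits,
  by = c("id" = "id", "date" = "date"),
  match_fun = list(
    `==`,
    function(x, y) y > x & as.numeric(y - x) <= 30
  )
)

pairs
```

Because both inputs share column names, the result carries `.x` and `.y` suffixes (for example `date.x` and `date.y`). Requiring `y > x` keeps each row from matching itself and avoids listing every pair twice; drop that condition if you want matches in both directions.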