Looking for some clever ideas..
I have a dataset which has car registrations, the time they were caught on camera, and which camera they were caught on.
Note: These are not real number plates...!
NumberPlate  ANPR      DateTime
XX65 XXX     CAMERA 3  2021-01-04 05:16:43
YY68 XXX     CAMERA 3  2021-01-04 05:18:22
XX65 XXX     CAMERA 2  2021-01-04 05:19:24
ZZ65 XXX     CAMERA 3  2021-01-04 05:19:30
AA65 XXX     CAMERA 3  2021-01-04 05:19:44
YC68 XXX     CAMERA 1  2021-01-04 05:19:49
DD67 XXX     CAMERA 3  2021-01-04 05:22:02
As you can see the Number Plate XX65 XXX was first caught on Camera 3, then Camera 2.
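So using dplyr, I can do something like the below to get the previous camera each plate was seen on.
df <- df %>%
  group_by(NumberPlate) %>%  # lag within each plate rather than across all rows
  mutate(PreviousCamera = lag(ANPR, 1)) %>%
  ungroup()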
However, the plate "YY68 XXX" appears only once, and the chances are that "YC68 XXX" is the same vehicle, with the camera not reading all characters properly. Is there a clever way to find the previous "similar" match?
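To make the idea of a "similar" match concrete, here is a rough sketch of one possible direction, not a definitive answer. It treats two plates as similar when their edit (Levenshtein) distance, computed with base R's `utils::adist()`, is at most 1. The helper `previous_similar()`, the `max_dist` threshold, and the `PrevIdx` and `PreviousPlate` columns are my own illustrative names, and a threshold of 1 is an assumption that would need tuning against real misreads.

```r
library(dplyr)

# Sample data copied from the table above
df <- data.frame(
  NumberPlate = c("XX65 XXX", "YY68 XXX", "XX65 XXX", "ZZ65 XXX",
                  "AA65 XXX", "YC68 XXX", "DD67 XXX"),
  ANPR = c("CAMERA 3", "CAMERA 3", "CAMERA 2", "CAMERA 3",
           "CAMERA 3", "CAMERA 1", "CAMERA 3"),
  DateTime = as.POSIXct(c("2021-01-04 05:16:43", "2021-01-04 05:18:22",
                          "2021-01-04 05:19:24", "2021-01-04 05:19:30",
                          "2021-01-04 05:19:44", "2021-01-04 05:19:49",
                          "2021-01-04 05:22:02"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row, return the index of the most recent
# earlier row whose plate is within max_dist edits, or NA if there is none.
# Assumes the plates are already in time order.
previous_similar <- function(plates, max_dist = 1) {
  vapply(seq_along(plates), function(i) {
    if (i == 1) return(NA_integer_)
    earlier <- seq_len(i - 1)
    # Edit distance from the current plate to every earlier plate
    d <- as.vector(utils::adist(plates[i], plates[earlier]))
    hits <- earlier[d <= max_dist]
    if (length(hits) == 0) NA_integer_ else max(hits)
  }, integer(1))
}

df <- df %>%
  arrange(DateTime) %>%
  mutate(
    PrevIdx        = previous_similar(NumberPlate, max_dist = 1),
    PreviousPlate  = NumberPlate[PrevIdx],  # plate that was matched
    PreviousCamera = ANPR[PrevIdx]          # camera of that earlier sighting
  )
```

On the sample data this links "YC68 XXX" back to "YY68 XXX" on CAMERA 3, and still links the second "XX65 XXX" sighting to its first. The loop compares every row with all earlier rows, so it is quadratic in the number of rows; for a large dataset it would probably need restricting to a time window, or a dedicated package such as stringdist.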