Hello everyone,

I have been stuck on this problem for quite a few hours now, and I cannot see where the issue is. I want to left-join two datasets on a key column, but after the merge there is not a single match between them. You can download the datasets from my secure cloud here: Tresorit Web Downloader

I use the following command:

merged <- merge(data1, data2, by = "ID", all.x = TRUE)

I get all NA values for the second variable. But there must be many matches... For example this ID: "JBzSursPQRBwYkbqHupwSo2H9RePnkHoHGetpExq2hQAxzg" is available in both datasets.

Can anybody help me?
Thanks!

Uploading wrongly typed files as a means of file sharing is frowned upon by the forum admins.
Please read this simple guide on how to provide representative data.


Short Version

You can share your data in a forum friendly way by passing the data to share to the dput() function.
If your data is too large you can use standard methods to reduce it before sending to dput().
When you share the dput() text that represents your data, please put triple backticks on the line before your code begins and on the line after it ends, so that it is formatted appropriately.

```
( example_df <- structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5, 5.4, 4.6, 
5, 4.4, 4.9), Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6, 3.9, 3.4, 
3.4, 2.9, 3.1), Petal.Length = c(1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 
1.4, 1.5, 1.4, 1.5), Petal.Width = c(0.2, 0.2, 0.2, 0.2, 0.2, 
0.4, 0.3, 0.2, 0.2, 0.1), Species = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("setosa", "versicolor", "virginica"
), class = "factor")), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame")))
```
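As a rough sketch of the "reduce it before sending to dput()" step: the snippet below assumes the built-in iris data stands in for your own data frame, and the name small_df is only illustrative.

```r
# Assumption: the built-in iris data stands in for your own data frame.
small_df <- head(iris, 10)        # keep only the first 10 rows
small_df <- droplevels(small_df)  # drop factor levels that no longer occur

# Print a text representation that can be pasted between triple backticks.
dput(small_df)
```

If the problem only shows up in particular rows, subsetting to those rows instead of using head() keeps the example representative.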

I added a download link to the data

I didn't ask you to do that?

You didn't ask me to do anything?

I asked you to read the guide, so that you would have the information on the forum's recommended approach to reproducible examples.

I just fixed the issue (nothing that a minimal example could have helped with). My data generation script put a space before each value, so the IDs could not be matched.
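For anyone who lands here with the same symptom, this is a minimal sketch of how a leading space in the key could be detected and removed before merging. The two small data frames are made up for illustration and are not the original data; only the merge() call is taken from the question.

```r
# Hypothetical data reproducing the problem: data2$ID has a leading space.
data1 <- data.frame(ID = c("a1", "b2", "c3"), x = 1:3,
                    stringsAsFactors = FALSE)
data2 <- data.frame(ID = c(" a1", " b2"), y = c(10, 20),
                    stringsAsFactors = FALSE)

# Diagnose: count exact key matches, then check for surrounding whitespace.
n_matches <- sum(data1$ID %in% data2$ID)            # 0 here
has_space <- any(data2$ID != trimws(data2$ID))      # TRUE here

# Fix: strip leading and trailing whitespace from the key in both datasets.
data1$ID <- trimws(data1$ID)
data2$ID <- trimws(data2$ID)

# Same left join as in the question, now with matching keys.
merged <- merge(data1, data2, by = "ID", all.x = TRUE)
```

After trimming, only the row whose ID is genuinely missing from data2 should have NA in the joined column. Printing a few keys with dput() or checking nchar() is another quick way to reveal invisible whitespace.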

The thread can be deleted.
