Recommended_sku by person

Hello,

I want to make recommendations per person. Copy and paste this code:

X3 <- data.frame(ID = c("Person1","Person1","Person1",
"Person3","Person4","Person5"),
Recommend_sku = c(1,2,2,1,4,5))

X3

This code eliminates repeated Recommend_sku values across the whole column:

X3[!duplicated(X3$Recommend_sku),]

I need to eliminate a repeated sku only within each person, not across the whole column.
Thanks!

Isn't it simply

 X3[!duplicated(X3),]

which eliminates the third row, the only row that is duplicated in its entirety.
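As a minimal sketch of the difference, using the small example from the question (the names `dup_sku_only` and `dup_whole_row` are only illustrative): `duplicated()` on the single column compares skus across all persons, while `duplicated()` on the whole data frame compares ID and sku together.

```r
# Small example from the question
X3 <- data.frame(ID = c("Person1", "Person1", "Person1",
                        "Person3", "Person4", "Person5"),
                 Recommend_sku = c(1, 2, 2, 1, 4, 5))

# Looks at the sku column alone: flags row 3 and also row 4 (Person3's sku 1)
dup_sku_only <- duplicated(X3$Recommend_sku)

# Looks at ID and sku together: flags only row 3
dup_whole_row <- duplicated(X3)

which(dup_sku_only)
#> [1] 3 4
which(dup_whole_row)
#> [1] 3
```

This only holds as long as the data frame contains nothing but these two columns.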

Your code eliminates all duplicates.
I need to eliminate a Recommend_sku only for that person and leave it available for other persons.

Or, in simpler words: "eliminate a Recommend_sku when it is repeated within an ID"

(Eliminate the repeated Recommend_sku for an ID, but leave that Recommend_sku available for other IDs.)

There are repeated skus of 1 and 2, but only Person1 has the same sku twice (sku 2),
i.e. row 2 is the same as row 3, and Person1 is involved in both.
My code eliminated this record.
If this is not what you want, then please be explicit:
what is the exact result that you want to see from your example?

I need code that eliminates rows 3 and 11 in this example, because each of them holds a recommended_sku that was already recommended to that person (Person1 and Person7, respectively).

PLEASE RUN THIS:

X3 <- data.frame(ID = c("Person1","Person1","Person1",
"Person3","Person4","Person5",
"Person5", "Person6", "Person6",
"Person7","Person7","Person7",
"Person8","Person9", "Person9",
"Person10","Person11","Person11"),
Recommend_sku = c(1,2,2,1,4,5,
2,3,4,5,5,6,
4,3,2,1,2,3))
X3

"ELIMINATE ROWS 3 and 11" (Blue)
Or, in simpler words: "eliminate a Recommend_sku when it is repeated within an ID"

Blue:
Person1 already has recommended_sku 2 (so I need to eliminate row 3)
Person7 already has recommended_sku 5 (so I need to eliminate row 11)

If you have a solution, please help me.
This data frame is just an example; I need to do this on my real data, which has 2,500,000 rows.
Thank you!!

I used nirgrahamuk's code and it seems to do what you want. Do you reject it because of the row numbering? If so, there is a simple fix (and probably an even simpler one).

Being a tidyverse user, I also include a dplyr alternative.

X3 <- data.frame(ID = c("Person1","Person1","Person1",
"Person3","Person4","Person5",
"Person5", "Person6", "Person6",
"Person7","Person7","Person7",
"Person8","Person9", "Person9",
"Person10","Person11","Person11"),
Recommend_sku = c(1,2,2,1,4,5,
2,3,4,5,5,6,
4,3,2,1,2,3))
X3
#>          ID Recommend_sku
#> 1   Person1             1
#> 2   Person1             2
#> 3   Person1             2
#> 4   Person3             1
#> 5   Person4             4
#> 6   Person5             5
#> 7   Person5             2
#> 8   Person6             3
#> 9   Person6             4
#> 10  Person7             5
#> 11  Person7             5
#> 12  Person7             6
#> 13  Person8             4
#> 14  Person9             3
#> 15  Person9             2
#> 16 Person10             1
#> 17 Person11             2
#> 18 Person11             3

X4 <- X3[!duplicated(X3), ]  # drops rows 3 and 11
X4                           # gaps in row numbers at 3 and 11
#>          ID Recommend_sku
#> 1   Person1             1
#> 2   Person1             2
#> 4   Person3             1
#> 5   Person4             4
#> 6   Person5             5
#> 7   Person5             2
#> 8   Person6             3
#> 9   Person6             4
#> 10  Person7             5
#> 12  Person7             6
#> 13  Person8             4
#> 14  Person9             3
#> 15  Person9             2
#> 16 Person10             1
#> 17 Person11             2
#> 18 Person11             3
row.names(X4)<-1:nrow(X4)    # to renumber the rows
X4
#>          ID Recommend_sku
#> 1   Person1             1
#> 2   Person1             2
#> 3   Person3             1
#> 4   Person4             4
#> 5   Person5             5
#> 6   Person5             2
#> 7   Person6             3
#> 8   Person6             4
#> 9   Person7             5
#> 10  Person7             6
#> 11  Person8             4
#> 12  Person9             3
#> 13  Person9             2
#> 14 Person10             1
#> 15 Person11             2
#> 16 Person11             3

X5 <- dplyr::filter(X3, !duplicated(X3))
X5
#>          ID Recommend_sku
#> 1   Person1             1
#> 2   Person1             2
#> 3   Person3             1
#> 4   Person4             4
#> 5   Person5             5
#> 6   Person5             2
#> 7   Person6             3
#> 8   Person6             4
#> 9   Person7             5
#> 10  Person7             6
#> 11  Person8             4
#> 12  Person9             3
#> 13  Person9             2
#> 14 Person10             1
#> 15 Person11             2
#> 16 Person11             3

Created on 2023-07-11 with reprex v2.0.2

To deduplicate a data frame based on particular columns, you can do the following:

X3 <- X3[!duplicated(X3[c("ID", "Recommend_sku")]), ]  # assign the result, otherwise X3 stays unchanged
rownames(X3) <- NULL  # if you want to clean up the row names afterwards

Alternatively, using dplyr:

(X3 <- dplyr::distinct(X3,
         ID,
         Recommend_sku,
         .keep_all=TRUE))
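Why naming the key columns matters can be sketched as follows. The `score` column here is purely hypothetical (it is not in the original data) and stands in for any extra column your real 2,500,000-row data might carry; the names `whole_row` and `by_key` are likewise just for illustration.

```r
X3 <- data.frame(ID = c("Person1", "Person1", "Person1",
                        "Person3", "Person4", "Person5",
                        "Person5", "Person6", "Person6",
                        "Person7", "Person7", "Person7",
                        "Person8", "Person9", "Person9",
                        "Person10", "Person11", "Person11"),
                 Recommend_sku = c(1, 2, 2, 1, 4, 5,
                                   2, 3, 4, 5, 5, 6,
                                   4, 3, 2, 1, 2, 3))

# Hypothetical extra column (assumption): a different value in every row
X3$score <- seq_len(nrow(X3))

# Whole-row comparison: no row is fully identical any more, so nothing is dropped
whole_row <- X3[!duplicated(X3), ]

# Comparison on ID + Recommend_sku only: still drops the repeats per person
by_key <- X3[!duplicated(X3[c("ID", "Recommend_sku")]), ]

nrow(whole_row)
#> [1] 18
nrow(by_key)
#> [1] 16
setdiff(rownames(X3), rownames(by_key))  # original row numbers that were dropped
#> [1] "3"  "11"
```

If your real data has only the two columns, both approaches should give the same result; restricting the comparison to the key columns is just the safer habit.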