I have a data frame `df` that contains sample names, the number of samples, and the cluster number. For example, there are 3 rows for Sample_A: 2 of those samples are in cluster 12, and the remaining one is in cluster 15:
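| Sample   | Number_Samples | Cluster |
|----------|----------------|---------|
| Sample_A | 3              | 12      |
| Sample_A | 3              | 12      |
| Sample_A | 3              | 15      |
| Sample_B | 1              | 10      |
| Sample_C | 2              | 12      |
| Sample_C | 2              | 14      |
| Sample_D | 4              | 7       |
| Sample_D | 4              | 20      |
| Sample_D | 4              | 20      |
| Sample_D | 4              | 20      |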
How can I add a column called Percent_Observed that gives the percentage of each sample type that each cluster represents? For example, there is only 1 row for Sample_B, so cluster 10 represents 100% of Sample_B.
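To make the calculation concrete, here is a minimal sketch of one possible approach. It assumes `df` is a pandas DataFrame, which the question does not actually state, and it introduces a hypothetical helper name, `cluster_count`. The idea is to count the rows in each (Sample, Cluster) pair and divide that count by Number_Samples.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "Sample": ["Sample_A"] * 3 + ["Sample_B"] + ["Sample_C"] * 2 + ["Sample_D"] * 4,
    "Number_Samples": [3, 3, 3, 1, 2, 2, 4, 4, 4, 4],
    "Cluster": [12, 12, 15, 10, 12, 14, 7, 20, 20, 20],
})

# Number of rows sharing the same (Sample, Cluster) pair, broadcast back
# onto every row so the result lines up with the original data frame.
# "cluster_count" is a hypothetical name, not something from the question.
cluster_count = df.groupby(["Sample", "Cluster"])["Number_Samples"].transform("size")

# Share of the sample's rows that fall in this row's cluster, as a percentage.
df["Percent_Observed"] = cluster_count / df["Number_Samples"] * 100

print(df)
```

This keeps every original row and only adds the new column, so repeated (Sample, Cluster) rows carry the same percentage. If one row per pair is wanted instead, calling `df.drop_duplicates()` afterwards should collapse them.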
I'm finding this a little tricky because the same cluster number can appear under more than one sample. My goal is to have: