How to sort a subset of a dataframe

Hello experts,

Please find attached my sample dataset @ https://drive.google.com/file/d/1MzAdUEYQ_mf1qivOQtzIoDcdj8KNvJw5/view?usp=sharing.

What I really want to achieve with this is to create a dataframe that first sorts the data based on period on ascending order and within each period that data is sorted again by the descending order of count of case (grouped by period,region,manager,employee)

So far, I have done up to this but it does not give me what I want.

I am planning to plot this values, so that's why I need to sort it on ascending order first and within each of that I want the data to be sorted in descending order.

library(readxl)
SampleRADN <- read_excel("C:/Users/xxx/Documents/Exercises/PBIR/SampleRADN.xlsx")
df1<-unique(data.frame(SampleRADN))
library(dplyr)
df2<-df1%>%count(period,region,manager,employee)
df3<-df2[order(df2$period, -df2$n)

Any help is greatly appreciated.

Let's see if I understood you correctly.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(openxlsx)
df <- read.xlsx("c:/users/fjcc/documents/R/Play/SampleRADN.xlsx")
df <- unique(df)
df2 <- df %>% count(period,region,manager,employee)
df3 <- df2 %>% arrange(period, desc(n))
df3
#> # A tibble: 154 x 5
#>    period region  manager employee     n
#>     <dbl> <chr>   <chr>   <chr>    <int>
#>  1      1 Eastern RP      DL          12
#>  2      1 Eastern RP      DG           8
#>  3      1 Eastern CF      RP           4
#>  4      1 Eastern RP      AC           3
#>  5      1 Western MT      VM           3
#>  6      1 Central RK      DF           2
#>  7      1 Central RP      DG           2
#>  8      1 Eastern JT      WP           2
#>  9      1 Eastern RP      GH           2
#> 10      1 Central CB      BN           1
#> # ... with 144 more rows

Created on 2019-08-03 by the reprex package (v0.2.1)

@FJCC thank you very much for your reply. Perhaps I was not very clear in my initial post to accurately explain the sorting.

I wanted the dataframe to be sorted in two ways
A. The data should be sorted on region (ascending order) and the total period-region count values should be on descending order to match the following

period region Count of case
1 Eastern 33
Central 15
Western 6
1 Total 54
2 Eastern 50
Central 15
Western 3
2 Total 68
3 Eastern 66
Central 20
Western 5
3 Total 91
4 Eastern 27
Central 15
Western 4
4 Total 46
5 Eastern 30
Central 18
Western 6
5 Total 54
6 Eastern 40
Central 13
Western 3
6 Total 56
Grand Total 369

B. Now once this is done, I want the subset inside each period-region to be sorted in descending order.

I have solved this in the following way

library(openxlsx)
library(dplyr)
library(tidyverse)
df <- read.xlsx("C:/Users/smpao1/Documents/Exercises/PBIR/SampleRADN.xlsx")
df1<-data.frame(df)
df2<-df1%>%count(period,region)
df3<-df2[order(df2$period, -df2$n),]
df4<-df1%>%count(period,region,manager,employee)
df5<-df4[order(df4$period, -df4$n),]
df6<-left_join(df3,df5,by=c("period"="period","region"="region"))
df7<-df6%>% select(period, region, manager, employee,n.y)

Now I can plot the data as per the order I had in my mind. However, if there is a better way of doing it and you can show me, I would be very eager to learn that.

Thanking you in advance.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.