Filter dataframe top 3 head and tail

x <- data.frame("occ_id" = c(1010, 1010, 1010, 1010, 1010, 1010, 1010, 1234, 1234, 1234, 1234, 4321, 4321, 4321, 4321, 4321),
                "Ind_id" = c(52418, 52417, 28339, 27138, 31224, 33103, 1112, 27138, 31224, 1112, 52418, 33103, 31224, 1112, 52417, 26301),
                "Change_occ_2000_2021" = c(1, -5, 8, 9, -11, 15, 16, -50, 10, 30, -5, 20, 10, 50, 30, -50))

I'm trying to see how occupations have changed across industries over time. I have occupation IDs (for example engineer, teacher, lawyer) and several IDs that correspond to sectors (for example construction, education, mineral extraction, fishing). For each occupation, I would like to extract the largest and smallest change. A sample of the data is shown above. In this example, for occupation 1010 I would like to see that the largest positive change was in sector 1112, with an increase of 16 workers, and the largest negative change was in sector 31224, with a decrease of 11 workers. Could you guys help me?

Here's one way you could do it with dplyr::filter and dplyr::group_by:

library(dplyr)

x <- data.frame("occ_id" = c(1010, 1010, 1010, 1010, 1010, 1010, 1010, 1234, 1234, 1234, 1234, 4321, 4321, 4321, 4321, 4321),
                "Ind_id" = c(52418, 52417, 28339, 27138, 31224, 33103, 1112, 27138, 31224, 1112, 52418, 33103, 31224, 1112, 52417, 26301),
                "Change_occ_2000_2021" = c(1, -5, 8, 9, -11, 15, 16, -50, 10, 30, -5, 20, 10, 50, 30, -50))

x %>% 
    group_by(occ_id) %>% 
    filter(
        (Change_occ_2000_2021 == max(Change_occ_2000_2021)) |
            (Change_occ_2000_2021 == min(Change_occ_2000_2021))
    )
#> # A tibble: 6 x 3
#> # Groups:   occ_id [3]
#>   occ_id Ind_id Change_occ_2000_2021
#>    <dbl>  <dbl>                <dbl>
#> 1   1010  31224                  -11
#> 2   1010   1112                   16
#> 3   1234  27138                  -50
#> 4   1234   1112                   30
#> 5   4321   1112                   50
#> 6   4321  26301                  -50
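
Since the topic title mentions the top 3 head and tail, here is a rough sketch of how the same idea might be extended to the three largest and three smallest changes per occupation. It assumes dplyr 1.0.0 or later (for slice_max and slice_min); the names n_keep, top_rows, bottom_rows, extremes, and the rank_end column are made up for illustration and are not part of the original question.

```r
library(dplyr)

x <- data.frame(
  occ_id = c(1010, 1010, 1010, 1010, 1010, 1010, 1010, 1234, 1234, 1234, 1234, 4321, 4321, 4321, 4321, 4321),
  Ind_id = c(52418, 52417, 28339, 27138, 31224, 33103, 1112, 27138, 31224, 1112, 52418, 33103, 31224, 1112, 52417, 26301),
  Change_occ_2000_2021 = c(1, -5, 8, 9, -11, 15, 16, -50, 10, 30, -5, 20, 10, 50, 30, -50)
)

# Hypothetical parameter: how many rows to keep at each end, per occupation
n_keep <- 3

# The n_keep largest changes within each occupation (ties broken arbitrarily)
top_rows <- x %>%
  group_by(occ_id) %>%
  slice_max(Change_occ_2000_2021, n = n_keep, with_ties = FALSE) %>%
  mutate(rank_end = "top")

# The n_keep smallest (most negative) changes within each occupation
bottom_rows <- x %>%
  group_by(occ_id) %>%
  slice_min(Change_occ_2000_2021, n = n_keep, with_ties = FALSE) %>%
  mutate(rank_end = "bottom")

# Stack both ends and sort for readability
extremes <- bind_rows(top_rows, bottom_rows) %>%
  ungroup() %>%
  arrange(occ_id, desc(Change_occ_2000_2021))

print(extremes)
```

One thing to watch: if an occupation has fewer than 2 * n_keep rows (occupation 1234 has only four here), the same row can show up under both "top" and "bottom". Setting n_keep <- 1 should give the same rows as the filter() approach above, as long as there are no ties.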
