Analyze one variable based on two others

Hi everyone!

My name is Manel, and I have just started a master's in bioinformatics.

In an exercise, I have 3 variables (Country, Value and Year). I have to create a new data frame with the maximum value for each country and the corresponding year. For each country I have values for different years, so I do not know how to extract its maximum value and associate it with the year in which it occurred. I have tried a thousand ways, but I do not understand how to relate the maximum value of each country to its respective year.

Greetings and thank you all very much!

Ahoy and welcome. First off, please note our homework policy (FAQ: Homework Policy). It has tips on how best to work with this forum to get help with homework, and explains what might result in your post being hidden (for example, please never post homework questions verbatim).

We'd also strongly encourage you to get comfortable posing these kinds of questions with a reproducible example (more on that in FAQ: Tips for writing R-related questions).

Using the tidyverse, the following reprex is one approach.


library(dplyr)

# set up the data:
# each country/year combo has several values
df <- data.frame(
  year = rep(c(1990, 1990, 1991, 1991), 2),
  country = rep(c("a", "b"), 4)
) %>% 
  mutate(
    value = rnorm(n())
  ) %>% 
  arrange(country, year)
df
#>   year country       value
#> 1 1990       a -0.38071845
#> 2 1990       a -1.47383021
#> 3 1991       a  0.18011637
#> 4 1991       a  1.53552271
#> 5 1990       b  0.09491687
#> 6 1990       b -0.75376016
#> 7 1991       b -0.49575320
#> 8 1991       b  1.13004382

# new data frame
# for each country-year, get the highest value
df_max <- df %>% 
  group_by(country, year) %>% 
  summarise(
    max_val = max(value)
  )
df_max
#> # A tibble: 4 x 3
#> # Groups:   country [2]
#>   country  year max_val
#>   <fct>   <dbl>   <dbl>
#> 1 a        1990 -0.381 
#> 2 a        1991  1.54  
#> 3 b        1990  0.0949
#> 4 b        1991  1.13

Created on 2020-03-25 by the reprex package (v0.3.0)
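
The reprex above returns one maximum per country-year pair. If the exercise instead wants a single row per country, with the year in which that country's maximum occurred, one possible sketch is to group by country only and keep the row whose value equals the group maximum. The toy data below is made up for illustration (one value per country per year), and `df_max` is just a name chosen here.

```r
library(dplyr)

# Hypothetical toy data: one value per country per year
df <- data.frame(
  country = c("a", "a", "a", "b", "b", "b"),
  year    = c(1990, 1991, 1992, 1990, 1991, 1992),
  value   = c(2.5, 7.1, 3.3, 9.0, 4.2, 6.8)
)

# Within each country, keep only the row(s) holding the maximum value.
# Filtering whole rows (rather than summarising) keeps the year column,
# so each maximum stays attached to the year it came from.
df_max <- df %>%
  group_by(country) %>%
  filter(value == max(value)) %>%
  ungroup()

df_max
```

If a country reaches its maximum in more than one year, `filter()` keeps all of those rows. With dplyr 1.0.0 or later, `slice_max(value, n = 1, with_ties = FALSE)` in place of the `filter()` step should keep exactly one row per country.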

For a great general introduction to data science with R, I'd really encourage you to check out the R4DS book (R for Data Science). Its section on data transformation covers the dplyr verbs I used to set this up.
