 # Growth rate calculation in R

Hello,

So, I've got a dataset that has one value for each year, as shown below. I want to calculate the growth rate for each year using the dplyr package, and then calculate the average growth rate at the end. Any idea how to go about it?

I have read that we need to use a for loop for this, but I am not sure how to go about it. Any help will be appreciated!

You can use the `lag()` / `lead()` functions in dplyr, which take the entry in the previous or next row of the dataset!
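As a quick illustration of what the two functions return, here is a minimal sketch on a made-up vector (the values are arbitrary and not taken from the question):

```r
library(dplyr)

x <- c(10, 12, 15)   # made-up values, for illustration only

prev_x <- lag(x)     # previous element: NA 10 12
next_x <- lead(x)    # next element:     12 15 NA

x - prev_x           # change from the previous element: NA 2 3
```

The first element of `lag(x)` is `NA` because there is no earlier value, which is why the first row of any growth calculation comes out as `NA`.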

Let's assume you stored your data in a data frame called `growth`; then you can do the following:

```
growth_rate = growth %>%
  # first sort by year
  arrange(year) %>%
  mutate(Diff_year = year - lag(year),  # Difference in time (just in case there are gaps)
         Diff_growth = route - lag(route), # Difference in route between years
         Rate_percent = (Diff_growth / Diff_year)/route * 100) # growth rate in percent
```

This gives (I used some random data, as you didn't supply the data in a form we could use):

```
    year   route Diff_year Diff_growth Rate_percent
   <dbl>   <dbl>     <dbl>       <dbl>        <dbl>
 1  1957 135801         NA         NA         NA
 2  1958 148023.         1      12222.        8.26
 3  1959 146543.         1      -1480.       -1.01
 4  1960 131889.         1     -14654.      -11.1
 5  1961 146396.         1      14508.        9.91
 6  1962 152252.         1       5856.        3.85
 7  1963 149207.         1      -3045.       -2.04
 8  1964 144731.         1      -4476.       -3.09
 9  1965 153415.         1       8684.        5.66
10  1966 153415.         1          0         0
11  1967 162620.         1       9205.        5.66
```

Then the average growth rate can be calculated as:
`Average_growth = mean(growth_rate$Rate_percent, na.rm = TRUE)`

I hope the mathematics is correct and is what you wanted, but the general direction should be clear!
Matthias

This is illuminating. Thank you, Matthias!

Um, I have to mention one thing. Isn't the formula for Growth rate supposed to be...

Growth rate = (Present value - Past value)/ Past Value * 100

So, as per your example, I guess that the formula should be...

`Rate_percent = (Diff_growth / Diff_year) / lag(route) * 100`

Yes, you are right. I mixed up `lag()` and `lead()`; I should have checked the help page first!
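Putting the corrected formula into the pipeline, a minimal sketch could look like this. The data frame here is an assumption for illustration: it holds the first four rows of the printed output above, with `route` as rounded in that printout.

```r
library(dplyr)

# First four rows of the example output, rounded as printed (illustration only)
growth <- tibble(
  year  = c(1957, 1958, 1959, 1960),
  route = c(135801, 148023, 146543, 131889)
)

growth_rate <- growth %>%
  arrange(year) %>%
  mutate(
    Diff_year    = year - lag(year),
    Diff_growth  = route - lag(route),
    # divide by the past value, lag(route), not by the present value
    Rate_percent = (Diff_growth / Diff_year) / lag(route) * 100
  )

# Roughly 9, -1 and -10 percent for 1958 to 1960; the first row is NA
growth_rate$Rate_percent

Average_growth <- mean(growth_rate$Rate_percent, na.rm = TRUE)
```

Dividing by `lag(route)` matches the definition (Present value - Past value) / Past value * 100 quoted above.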

While the `lag()` / `lead()` approach will give you a good result, you can also consider a slightly more mathy approach.

Assuming your growth is exponential, you can fit the model y = a * (1 + r) ^ x, where r is the growth rate per period, by nonlinear least squares with `stats::nls()`.
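A minimal sketch of such a fit, under stated assumptions: the series below is made up (about 5% growth per step with a little hand-picked noise), and taking starting values from a log-linear `lm()` fit is just one convenient choice, not the only way to do it.

```r
# Made-up series: roughly 5% growth per step plus small noise (illustration only)
x <- 0:10
noise <- c(0.01, -0.02, 0.015, 0.00, -0.01, 0.02, -0.015, 0.005, 0.01, -0.005, 0.00)
y <- 100 * 1.05^x * (1 + noise)

# Starting values from a log-linear fit: log(y) = log(a) + x * log(1 + r)
init  <- coef(lm(log(y) ~ x))
start <- list(a = exp(init[[1]]), r = exp(init[[2]]) - 1)

# Nonlinear least squares fit of y = a * (1 + r)^x
fit <- nls(y ~ a * (1 + r)^x, start = start)

r_hat <- coef(fit)[["r"]]   # estimated growth rate per step, close to 0.05 for this data
r_hat * 100                 # the same rate in percent
```

Note that `nls()` can fail to converge on data that fit the model exactly, which is why the made-up series includes some noise.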

Which approach is more appropriate depends on your application. When calculating the average, bear in mind that you are averaging rates, so the geometric mean might be more appropriate than the arithmetic mean.
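To see why the choice of mean matters, here is a small sketch with two made-up yearly rates (the numbers are assumptions chosen for illustration). A 10% rise followed by a 10% fall leaves the value slightly below where it started, which the arithmetic mean hides.

```r
rate_percent <- c(10, -10)             # made-up yearly growth rates in percent
factors <- 1 + rate_percent / 100      # growth factors: 1.10 and 0.90

# Arithmetic mean of the rates: 0
arith_mean <- mean(rate_percent)

# Geometric mean of the growth factors, converted back to percent: about -0.5
geo_mean <- (prod(factors)^(1 / length(factors)) - 1) * 100
```

With a column that starts with `NA`, as `Rate_percent` does, the `NA` would need to be dropped (for example with `na.omit()`) before taking the product.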

I have another question, if you don't mind. Any idea how I can avoid the complex numbers in the output below?

This is the code that I have written to get the necessary output. It seems to work fine for all age groups in the level_2 column other than this one. (There were more age groups, but since RStudio prohibits the sharing of csv files, I have to show you this screenshot.)

```
Celebi <- singapore_data %>%
  na.omit() %>%
  group_by(year) %>%
  filter(level_1 == "Total Residents", level_2 == "75 - 79 Years") %>%
  group_by(level_2) %>%
  select(year, level_2, value) %>%
  mutate(Diff_year = year - lag(year),
         Diff_growth = value - lag(value),
         Rate_percent = ((Diff_growth / Diff_year) / lag(value)) * 100)
Celebi

Average_growth_Celebi = mean(Celebi$Rate_percent, na.rm = TRUE) # 4005

Average_growth_Celebi
```
