HOW TO CALCULATE AND COMPARE YEARLY OUTCOMES BY TREATMENT GROUP?


Dear all, what is the relevant formula to calculate the yearly outcomes by treatment group? I have tried many approaches, but I have been unable to organize the data as required.

@Hash What aggregation method would you like to use? Do you want to calculate the yearly average for each group, or something else?

I have been trying to calculate the yearly mean of the outcome Y for each treatment group.

Here is the answer to your question after providing me with the data and a bit more explanation in private messages:

# Load dplyr
library(dplyr)

# Read the full data from the CSV file
cost <- read.csv("diff_in_diff/cost_data.csv.csv")

# Compute the mean outcome by year and by treatment
cost %>%
  group_by(year, treatment) %>%
  summarize(avg_Y = mean(Y, na.rm = TRUE))

# A tibble: 14 x 3
# Groups:   year [7]
    year treatment avg_Y
   <int>     <int> <dbl>
 1  2005         0 5774.
 2  2005         1 5294.
 3  2006         0 5853.
 4  2006         1 5400.
 5  2007         0 5887.
 6  2007         1 5436.
 7  2008         0 6141.
 8  2008         1 5367.
 9  2009         0 6151.
10  2009         1 5389.
11  2010         0 6069.
12  2010         1 5321.
13  2011         0 5869.
14  2011         1 5195.
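
To compare the two groups year by year, one option is to put the control and treated means side by side and take their difference. Below is a minimal sketch on made-up toy data; the `toy` data frame, its values, and the column names `avg_control`, `avg_treated` and `diff` are hypothetical and only stand in for the real cost data:

```r
library(dplyr)

# Toy data standing in for the real cost data (values are made up)
toy <- data.frame(
  year      = c(2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006),
  treatment = c(0, 0, 1, 1, 0, 0, 1, 1),
  Y         = c(10, 12, 8, 6, 11, 13, 9, 9)
)

# One row per year: mean of Y for each group and the treated-minus-control gap
comparison <- toy %>%
  group_by(year) %>%
  summarize(
    avg_control = mean(Y[treatment == 0], na.rm = TRUE),
    avg_treated = mean(Y[treatment == 1], na.rm = TRUE),
    diff        = avg_treated - avg_control
  )

print(comparison)
```

A negative `diff` means the treated group has the lower mean outcome in that year. This is only a descriptive comparison, not a formal test.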

This is because you did not load the dplyr package with library(dplyr). If you don't have it installed, you will need to install it first with install.packages("dplyr").

Hey, I have a similar problem, thanks. I was wondering how to draw a line graph using the aggregated data? The usual plot(x,y) doesn’t seem to work.

Hi @gagoko0087, how about you ask a new question where you provide more details on your issue? Feel free to tag me with @gueyenono. It would also help if you could share a sample of your data.
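
For reference, one possible way to draw a line graph from a table of yearly group means is sketched below. It assumes the ggplot2 package is installed and uses made-up toy data; the `toy` data frame, the object names `yearly_means` and `p`, and the legend labels are hypothetical, and the actual issue may need a different approach once more details are shared:

```r
library(dplyr)
library(ggplot2)

# Toy data standing in for the real data (values are made up)
toy <- data.frame(
  year      = c(2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006),
  treatment = c(0, 0, 1, 1, 0, 0, 1, 1),
  Y         = c(10, 12, 8, 6, 11, 13, 9, 9)
)

# Aggregate first: one row per year and treatment group
yearly_means <- toy %>%
  group_by(year, treatment) %>%
  summarize(avg_Y = mean(Y, na.rm = TRUE), .groups = "drop")

# One line per group; factor() makes ggplot2 treat treatment as categorical
p <- ggplot(yearly_means,
            aes(x = year, y = avg_Y, color = factor(treatment))) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Mean of Y", color = "Treatment")

print(p)
```

Mapping `color` to `factor(treatment)` is what splits the data into one line per group; with a numeric treatment column, ggplot2 would otherwise draw a single line through all points.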