group by and summarize according to a list

Hi friends,

I have a dataframe
Column1 Column2
A 12
A 4
B 5
B 7
C 3
D 4

And I want to use "group by".

Then I want a new dataframe like the one below:

Column1 Column2
A 16
B 12
C 3
D 4
E N/A or 0

But there is no "E" row in the first dataframe; only A, B, C and D appear in it.
How do I add an "E" row while using "group by"?

Is it possible in R?

How can we code it in R language?

Thanks for your help.

What does group_by() have to do with that? You don't do any grouped calculations whatsoever and the output seems rather random/arbitrary.

If you just want to add another row, you can use rbind() (or bind_rows() or rbindlist()). Otherwise, you should add a reprex or an actual example of relevant data (not just 4 rows with one entry "per group") as well as the logic behind your added row.

Kind regards

library("dplyr")

# create df

df <- data.frame (Column1 = c("A", "A", "B", "B", "C", "D"),
Column2 = c(12, 4, 5, 7, 3, 4)
)

list_data <- list("A", "B", "C", "D", "E")

# Group by sum using base R aggregate()

agg_df <- aggregate(df$Column1, by=list_data, FUN=sum)
agg_df

Here we see this error:

--Error in aggregate.data.frame(as.data.frame(x), ...) :
arguments must have same length--

It says you have 4 elements but you want 5 elements, and the lengths don't match...

Think of it like this, my friend:

We need to report A, B, C, D and E for the df. I know E is not in the df... maybe E will be in the next one, df2, but we need E in the report list.
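
One way to express this requirement in code is to keep the list of required groups as its own small table and join the sums onto it. This is only a minimal base R sketch: the name `report_groups` is an assumption for illustration and does not come from the thread, and groups without data come out as NA.

```r
# Sample data from the question
df <- data.frame(
  Column1 = c("A", "A", "B", "B", "C", "D"),
  Column2 = c(12, 4, 5, 7, 3, 4)
)

# Groups that must appear in the report, whether or not they occur in df
# (hypothetical name, chosen for this example)
report_groups <- data.frame(Column1 = c("A", "B", "C", "D", "E"))

# Sum Column2 per group; only groups present in df show up here
sums <- aggregate(Column2 ~ Column1, data = df, FUN = sum)

# Left join: keep every report group, with NA where df had no rows
report <- merge(report_groups, sums, by = "Column1", all.x = TRUE)
report
```

Because the report list lives outside the data, the same `report_groups` table could be reused for a later dataframe that does contain "E".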

Then you could convert your column 1 into an actual factor and specify the levels so that they include "E". With tapply() you can then compute the sum per level. Does this look like something you can use?

df <- data.frame(
  col1 = c("A","A","B","B","C","D"),
  col2 = c(12,4,5,7,3,4)
)
df$col1 <- factor(df$col1, levels = LETTERS[1:5], labels = LETTERS[1:5])
str(df$col1)
#>  Factor w/ 5 levels "A","B","C","D",..: 1 1 2 2 3 4
tapply(df$col2, df$col1, sum)
#>  A  B  C  D  E 
#> 16 12  3  4 NA

Created on 2022-11-07 by the reprex package (v2.0.1)
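
If a two-column data frame with 0 instead of NA is wanted, the named result of tapply() can be converted afterwards. The following is a hedged sketch of that extra step, not part of the reprex above; the object names `sums` and `result` are assumptions made for illustration.

```r
df <- data.frame(
  col1 = c("A", "A", "B", "B", "C", "D"),
  col2 = c(12, 4, 5, 7, 3, 4)
)

# Declare all report levels up front, including the unused "E"
df$col1 <- factor(df$col1, levels = LETTERS[1:5])

# Sum per level; levels without rows come back as NA
sums <- tapply(df$col2, df$col1, sum)

# Replace the NA of empty levels with 0
sums[is.na(sums)] <- 0

# Turn the named result into a regular data frame
result <- data.frame(col1 = names(sums), sum = as.vector(sums))
result
```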

Kind regards

It is usually better to paste the code instead of a screenshot so that we can paste it into R.

Is this useful?

library(tidyverse)

df <- data.frame(Column1 = c("A", "A", "B", "B", "C", "D"),
                 Column2 = c(12, 4, 5, 7, 3, 4))

df2 <- df %>% 
  group_by(Column1) %>% 
  summarise(sum = sum(Column2)) 

# Append an "E" row only if no group is exactly "E"
if (!"E" %in% df2$Column1) {
  df2[nrow(df2) + 1, ] <- list("E", 0)
}


Output


# A tibble: 5 × 2
  Column1   sum
  <chr>   <dbl>
1 A          16
2 B          12
3 C           3
4 D           4
5 E           0

Thanks, my friend. This is a super solution.

If the answer was satisfactory, you can click on "solved".

For this scenario I would prefer not to add a value to the table after the df has been created, but thanks for your solution.
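
For readers with the same preference, a possible alternative that avoids appending rows afterwards is to combine the factor idea from earlier in the thread with `group_by(.drop = FALSE)`, which keeps empty factor levels as groups. This is a sketch under the assumption that a reasonably recent dplyr (0.8.0 or later) is installed; `report_groups` is a hypothetical name, not something from the thread.

```r
library(dplyr)

df <- data.frame(
  Column1 = c("A", "A", "B", "B", "C", "D"),
  Column2 = c(12, 4, 5, 7, 3, 4)
)

# Groups the report must contain (hypothetical name for this example)
report_groups <- c("A", "B", "C", "D", "E")

df2 <- df %>%
  # Make the required groups explicit as factor levels
  mutate(Column1 = factor(Column1, levels = report_groups)) %>%
  # .drop = FALSE keeps levels that have no rows, such as "E"
  group_by(Column1, .drop = FALSE) %>%
  # The sum over an empty group is 0
  summarise(sum = sum(Column2))
df2
```

Here the "E" row comes out of the grouping itself, so nothing has to be patched onto the summary table later.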
