Creating a function that adds a conditional sum column to a data frame

Hi,
I am trying to write a function that adds, as a new column, the total of a value for each level of a factor within a data frame (I would use SUMIF for this in Excel).
This is my data table:

u_value<-seq(23,34,length=10)
u_name<-c("a","z","b","j","k","u","a","z","b","a")
tablo45<-data.frame(u_name,u_value)

I want the output to be like this

[image: desired output table]

When I run these commands one by one, I obtain the desired output:

ara<-aggregate(u_value ~ u_name, data = tablo45, sum)
df1<-merge(cbind(tablo45), cbind(ara),by.x = "u_name",by.y="u_name",no.dups = FALSE)

But when I try to wrap this in a function, it doesn't work:

ekle_etopla<-function(df,a,b) {
ara<-aggregate(b ~ a, data = df, sum)
df1<-merge(cbind(df), cbind(ara),by.x = "a",by.y="a",no.dups = FALSE)
df1}

This is the error message:
"Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column"

Can someone help me with this? I tried so many variations but I am stuck.
Thanks
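
A note on the error above: inside the function, `b ~ a` and `by.x = "a"` are taken literally, so R looks for columns named `a` and `b` in `df` rather than the columns that were passed in. Below is a minimal base R sketch of one way around this. It assumes the column names are passed as strings, and the `_sum` suffix for the new column is also an assumption, not part of the original function.

```r
# Sample data from the question
u_value <- seq(23, 34, length = 10)
u_name <- c("a", "z", "b", "j", "k", "u", "a", "z", "b", "a")
tablo45 <- data.frame(u_name, u_value)

# Sketch: a and b are column names given as strings (assumed interface)
ekle_etopla <- function(df, a, b) {
  # Build the formula from the strings, e.g. u_value ~ u_name
  ara <- aggregate(reformulate(a, response = b), data = df, FUN = sum)
  # Rename the aggregated column so it does not clash with b in the merge
  names(ara)[names(ara) == b] <- paste0(b, "_sum")
  # Join the per-group totals back onto the original rows
  merge(df, ara, by = a)
}

df1 <- ekle_etopla(tablo45, "u_name", "u_value")
df1
```

Calling it with quoted names, as in `ekle_etopla(tablo45, "u_name", "u_value")`, should return the original rows plus a `u_value_sum` column, sorted by `u_name`.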

I would use functions from the dplyr package.

u_value<-seq(23,34,length=10)
u_name<-c("a","z","b","j","k","u","a","z","b","a")
tablo45<-data.frame(u_name,u_value)
library(dplyr, warn.conflicts = FALSE)
SUMS <- tablo45 %>% group_by(u_name) %>% summarize(u_valueSum = sum(u_value))
#> `summarise()` ungrouping output (override with `.groups` argument)

df1 <- inner_join(tablo45, SUMS, by = "u_name") %>% arrange(u_name)
df1
#>    u_name  u_value u_valueSum
#> 1       a 23.00000   87.33333
#> 2       a 30.33333   87.33333
#> 3       a 34.00000   87.33333
#> 4       b 25.44444   58.22222
#> 5       b 32.77778   58.22222
#> 6       j 26.66667   26.66667
#> 7       k 27.88889   27.88889
#> 8       u 29.11111   29.11111
#> 9       z 24.22222   55.77778
#> 10      z 31.55556   55.77778

Created on 2020-09-22 by the reprex package (v0.3.0)

Thanks for the solution. summarise and inner_join work fine, but I still cannot put this into a function. I want to create my own function (ekle_etopla) here.

ekle_etopla<-function(tablo_adi,isim,deger) {
  SUMS <- tablo_adi %>% group_by(isim) %>% summarize(u_valueSum = sum(deger))
  df1 <- inner_join(tablo_adi, SUMS, by = isim)%>% arrange(isim)
  df1}

output<- ekle_etopla(tablo45,u_name,u_value)

The error message is:
"Must group by variables found in `.data`.
• Column `isim` is not found."

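Inside your function, group_by(isim) looks for a column literally called isim, which is why it fails. If you want to do this using unquoted variable names (i.e. u_value instead of "u_value"), you will need to use {{ }} and := from rlang; see ?`{{`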

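add_fun1 <- function(d, name, value, new_col) {
  d %>%
    group_by({{ name }}) %>%
    # := lets the new column take the name passed as new_col
    mutate({{ new_col }} := sum({{ value }})) %>%
    ungroup() %>%
    # name must be embraced here too so arrange() sorts by the passed column
    arrange({{ name }})
}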

add_fun1(tablo45, u_name, u_value, u_val_sum)
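
If the column names arrive as strings instead (for example from a vector of names), the `.data` pronoun and `all_of()` can be used in place of `{{ }}`. This is only a sketch: `add_fun_chr` is a hypothetical name, and it assumes dplyr 1.0.0 or later for `across()`.

```r
library(dplyr, warn.conflicts = FALSE)

# Sample data from the question
u_value <- seq(23, 34, length = 10)
u_name <- c("a", "z", "b", "j", "k", "u", "a", "z", "b", "a")
tablo45 <- data.frame(u_name, u_value)

# Hypothetical variant of add_fun1 that takes column names as strings
add_fun_chr <- function(d, name, value, new_col) {
  d %>%
    # all_of() selects the grouping column named by the string in `name`
    group_by(across(all_of(name))) %>%
    # "{new_col}" := names the new column after the string in `new_col`
    mutate("{new_col}" := sum(.data[[value]])) %>%
    ungroup() %>%
    arrange(.data[[name]])
}

out <- add_fun_chr(tablo45, "u_name", "u_value", "u_val_sum")
out
```

The unquoted version is nicer to type interactively; the string version tends to be easier when the column names are stored in variables.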

That was exactly what I was trying to do. I think I have to study more on these expressions.
Thank you so much!
