Calculate Response Efficiency

I have the dataset of a distribution company. The actual result that I am looking for is to get the response efficiency when I divide Response over the Billed Population. The challenge here is that the data set there's no billed population as a variable, so I'll have to convert the individual billed energy to represent the billed population and then divide by the Response. How do I go about it?

Response <- df %>%
 mutate(Char = nchar(Payment.Efficiency....))%>%
 filter(!Char == "NotBilled")%>%
 group_by(Zone) %>%
 summarise(BilledPop = count(Char >0),
        Response = count(Resp >0),
        Efficiency = Response/Char)
#> Error in df %>% mutate(Char = nchar(Payment.Efficiency....)) %>% filter(!Char == : could not find function "%>%"
 Response
#> Error in eval(expr, envir, enclos): object 'Response' not found

Created on 2020-05-19 by the reprex package (v0.3.0)

Can you turn this into a proper reproducible example by including sample data on a copy/paste friendly format?

This is a sample dataframe I created. I don't really know how to randomize the names without replacement, that would have been much better. But, imagine with this data, the billed energy are in KwH and the Response are numbers. Instead of dividing the KwH by the Response, I would like to count the Kwh per zone and then divide by the count of response per zone.

activity <-data.frame(Num = c("Jane","Bill","Fred","Tina","Joe"),Zone=c("AKS","BYS","CRS",
                              "PHC-1","PHC-2"),
                       BilledEnergy=c(2300,4003,1121,3022,1713),Response = c(1,1,1,1,1))
 bind_rows(replicate(3, activity, simplify = F))
#> Error in bind_rows(replicate(3, activity, simplify = F)): could not find function "bind_rows"

Created on 2020-05-20 by the reprex package (v0.3.0)

library(tidyverse)

activity <-data.frame(Num = c("Jane","Bill","Fred","Tina","Joe"),Zone=c("AKS","BYS","CRS",
                                                                        "PHC-1","PHC-2"),
                      BilledEnergy=c(2300,4003,1121,3022,1713),Response = c(1,1,1,1,1))
(activity2 <- bind_rows(replicate(3, activity, simplify = F)))

(activity2 %>% group_by(Zone) %>% summarise(
   sum_billed_energy = sum(BilledEnergy),
  sum_of_response = sum(Response),
  div = sum_billed_energy/sum_of_response
))

?

Thanks. But it doesn't solve the problem... The units of measurements are different. My end result is to get the percentage of response over the billed population. Since there is not variable named billed population, I want to count (Convert) the number of billed energy (Just like Excel Pivot Table) to get the response efficiency.

sorry, but you are failing to explain yourself in non-ambigious terms.
I understand the concept of billed energy measured in KWh,
but what is response, what does it measure, in what unit?

Given that we have succesfully summed the billed energy per zone ( let me know if we havent) , lets focus on what response is and how it should be treated.

Sorry for my ambiguity. Response here means the number of individual customers that responded (paid their bills) in the billing month.

Just like working with Excel pivot, I just want to get the count of billed energy - that can suffice for the number of customers billed for the month or as the case may be.

I think I made that assumption.
For each Zone, we have summed billed energy,
For each Zone we have summed the responses (no of customers billed for the month for the zone).
Then we can divide the sum of billed energy by the sum of responders in zone to get the avarage bill of a typical responder in the zone.
I dont know what 'count' of billed energy means...
if it means sum of the energy , we summed that.
if you want to know how many bills were responded to, we summed that.
if you want to know something other than these, please try again to explain what that might be.

I apologise for not having understood your requirements, and appreciate your further efforts to explain.
Perhaps you can pick a zone, and write a 3 row table, and how you would like to see it processed. That would I hope be a clear signal to what you intend ?

I think the missing point is that we are summing instead of counting the number of rows. If we can get the count of billed energy per row likewise the count of response per row that solves it.
So in our dataset, it should be 5/5 = 100%
where count of billed energy = 5; and count of Response = 5 ... This is what we want to achieve.

Each billed energy represents each customer arranged in rows. By summing the billed energy and Response we are doing (5/12,159) where Response = 5 and billed energy = 12,159 which is not what we want.

counting rows is trivial.., its the n() function that can be passed to summarise.
its just not clear to me what you want to count.
rows per zone ? in the data you gave me there are a total of 15 rows, they can be grouped into 5 Zones each with 3 observations.

12159, is the sum of the activity frame (not the activity 2 frame that binds the activity frame 3 times )

should I ignore activity2 , and work with only the 5 row activity?

(activity %>% summarise(
  sum_billed_energy = sum(BilledEnergy),
  sum_of_response = sum(Response),
  number_of_observations = n(),
  div = sum_billed_energy/sum_of_response
))

 sum_billed_energy sum_of_response number_of_observations   div
              <dbl>           <dbl>                  <int> <dbl>
1             12159               5                      5 2432.