Help with group-by function

Im having trouble creating a table that count all the activities recorded at a specific time (30 min intervals).. please help

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)
tud <- monday
#> Error in eval(expr, envir, enclos): object 'monday' not found

# create a table of time and activity count for every 30min intervals
table <- monday %>%
  group_by(Time) %>%
  mutate(count = summarize(count = n())) 
#> Error in eval(lhs, parent, parent): object 'monday' not found

Created on 2019-11-29 by the reprex package (v0.3.0)

We do not know what your data look like but I have invented some data and binned each time stamp by the half hour.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
DF <- data.frame(DateTime = seq.POSIXt(from = as.POSIXct("2019-01-01 00:00:00"),
                                  to = as.POSIXct("2019-01-02 23:50:00"), by = "10 min"))
DF <- DF %>% mutate(Date = as.Date(DateTime), HalfHour =  as.numeric(DateTime) %/% 1800)
Tbl <- DF %>% group_by(HalfHour) %>% summarize(count = n())                 
head(Tbl)       
#> # A tibble: 6 x 2
#>   HalfHour count
#>      <dbl> <int>
#> 1   859070     3
#> 2   859071     3
#> 3   859072     3
#> 4   859073     3
#> 5   859074     3
#> 6   859075     3

Created on 2019-11-28 by the reprex package (v0.2.1)

Thanks FJCC

Sorry im a bit new to R. My data is for 66 households where they record what they do in every 30min. My data looks like this

I've use the coding you sent and it does gives the following. Thanks for that

how can i use the "count()" but not to count other activities like "sleeping"?

Here is code that filters out the two categories Sleeping and Napping before grouping and counting. Notice the ! before Activity in the the filter(). That inverts the logical result so the filter looks for Activity that is not in the given list.

Tbl <- DF %>% filter(!Activity %in% c("Sleeping", "Napping") %>%  
group_by(Time) %>% 
summarize(count = n())

Awesome this is so helpful.

Was wondering how do you format the "Time" column to be in AM/PM or 24hours format?

When i group them by time it group the 02:30 (am) and 02:30 (pm) into one, hence, i get only a 12hrs table.

Your timestamps are probably coming as characters. The lubridate package has convenient functions for converting characters to dates or timestamps. What format do your timestamps have in the original data?

Yes i thought it was time but it is in characters when i import it

I've changed it to date

It doesn't differentiate the time between am and pm, what would be the best function for 24hr?

Please post a Reproducible Example with a small example of your data and the actual code you are using to transform the characters into dates.

If I use the ymd_hms() function from the lubridate package, it handles AM and PM correctly.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date
x <- c("2019-12-01 03:34:00 AM", "2019-12-01 04:14:31 PM")
ymd_hms(x)
#> [1] "2019-12-01 03:34:00 UTC" "2019-12-01 16:14:31 UTC"

Created on 2019-12-01 by the reprex package (v0.3.0.9000)

2 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.