How do I change the class, rather than the mode, of a column in my data frame?

Hello there, I am fairly new to R programming and am trying to complete a project for my course. I am trying to use the summary() function to get the min, max, etc., but it isn't working. Instead of giving me the usual Min, 1st Qu., Median, etc., it only tells me Length, Class, and Mode. I think this is because the column isn't numeric, but I already converted it in my data frame using as.numeric(). It shows that the mode is numeric, but the class is difftime. How can I change the class, or why is the summary function not working?

# is.factor() returned FALSE. I then converted character to numeric, and when I ran is.numeric() afterward, it returned TRUE.
is.factor(Annual_trip_data$ride_length)

Annual_trip_data$ride_length <- as.numeric(as.character(Annual_trip_data$ride_length))

is.numeric(Annual_trip_data$ride_length)

Hi, can you provide a reproducible example of your dataset, rather than a screenshot? It will make it easier to answer your question.

My guess is that none of your variables are numeric.

Run
str(dat1)
where dat1 is the name of your data frame.
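
To illustrate the class versus mode distinction raised in the question, here is a minimal base R sketch. The variable names `t_start`, `t_end`, `x`, and `y` and the timestamps are made up for illustration. A difftime is stored as a number, so its mode is numeric, but its class is still difftime, and the class is what generic functions such as summary() dispatch on.

```r
# Illustrative timestamps only; tz = "UTC" avoids daylight-saving surprises
t_start <- as.POSIXct("2021-09-12 10:15:22", tz = "UTC")
t_end   <- as.POSIXct("2021-09-12 11:16:21", tz = "UTC")

x <- difftime(t_end, t_start, units = "secs")

class(x)       # "difftime"
mode(x)        # "numeric"
is.numeric(x)  # FALSE, because difftime has its own is.numeric method

# as.vector() drops the class and units attributes, leaving a plain number
y <- as.vector(x)
class(y)       # "numeric"
is.numeric(y)  # TRUE
```

Running str() on a data frame shows the same thing column by column, which is why it is a quick first check.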

Hey there William, I can try. I'm new to using RStudio; if I am able to share files, then I can reproduce my code. I got ride length by using difftime() between two variables that I first had to convert to date-times using ymd_hms(). str() shows those two variables as POSIXct. When I use str(), ride_length is shown as 'difftime' num.

str() shows ride_length as 'difftime' num. I had to convert two character columns to date-times using ymd_hms(), and str() shows those two columns as POSIXct.

This would make it numeric. A made-up reproducible example:

library(tidyverse)
library(lubridate)


df <- tibble(time_a = seq(Sys.time(), Sys.time() + days(3), by = '6 hours')) %>% 
  mutate(time_b = time_a + as.vector(sample(1:1000, nrow(.))))

df %>% 
  mutate(diff = difftime(time_b, time_a),
         diff = as.vector(diff))

Can you explain what you want with this example?

Hey there William, I actually did create a reproduction of the same issue I ran into. I'm trying to get the summary of ride_length, but I only get class and mode rather than mean, min, max, etc. Here's the code:

library(tidyverse)
library(dplyr)
library(lubridate)

start_time <- as.character(c("2021-09-12 10:15:22", "2021-10-05 16:50:11", "2021-11-12 12:02:44"))
end_time <- as.character(c("2021-09-12 11:16:21", "2021-10-05 16:57:00", "2021-11-13 04:22:11"))
ride_id_number <- as.numeric(c("2341", "1342", "1234"))

data_table <- data.frame(start_time, end_time, ride_id_number)

str(data_table)

data_table$start_time <- ymd_hms(data_table$start_time)

data_table$end_time <- ymd_hms(data_table$end_time)

str(data_table)

data_table$ride_length <- difftime(data_table$end_time,data_table$start_time, units = "secs")

str(data_table)

summary(data_table)

Try this. Pretty much the same as in the example earlier.

library(tidyverse)
# library(dplyr) # dplyr is loaded when you load the tidyverse
library(lubridate)

start_time <- as.character(c("2021-09-12 10:15:22", "2021-10-05 16:50:11", "2021-11-12 12:02:44"))
end_time <- as.character(c("2021-09-12 11:16:21", "2021-10-05 16:57:00", "2021-11-13 04:22:11"))
ride_id_number <- as.numeric(c("2341", "1342", "1234"))

# your code, but put in tidyverse style
data_table <- tibble(start_time, end_time, ride_id_number) %>% 
  mutate(across(ends_with("time"), ~ymd_hms(.x)),
         ride_length = difftime(end_time, start_time, units = "secs"), 
         ride_length = as.vector(ride_length)) # this line added to convert to vector. it becomes numeric

summary(data_table)
# start_time                     end_time                   ride_id_number  ride_length   
# Min.   :2021-09-12 10:15:22   Min.   :2021-09-12 11:16:21   Min.   :1234   Min.   :  409  
# 1st Qu.:2021-09-24 01:32:46   1st Qu.:2021-09-24 02:06:40   1st Qu.:1288   1st Qu.: 2034  
# Median :2021-10-05 16:50:11   Median :2021-10-05 16:57:00   Median :1342   Median : 3659  
# Mean   :2021-10-10 13:02:45   Mean   :2021-10-10 18:51:50   Mean   :1639   Mean   :20945  
# 3rd Qu.:2021-10-24 14:26:27   3rd Qu.:2021-10-24 22:39:35   3rd Qu.:1842   3rd Qu.:31213  
# Max.   :2021-11-12 12:02:44   Max.   :2021-11-13 04:22:11   Max.   :2341   Max.   :58767 

I appreciate that, but I made up those tables. When I try to run that same code, it tells me that the data vectors have to be the same size for tibble to work. I am going to try and look at it tomorrow. I tried running as.vector(as.character(trip_data$ride_length)) and then ran is.numeric() afterward and got TRUE, but I don't know. Thank you for your help!

is.numeric() only returns TRUE/FALSE; it doesn't convert anything.
Maybe you want as.numeric()?
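
As a rough sketch of that difference, assuming a data frame shaped like the made-up example earlier in this thread (same column names, built here with base R only), the result of as.numeric() has to be assigned back to the column for the change to stick. For a difftime, as.numeric() also accepts a units argument.

```r
# Timestamps copied from the made-up example in this thread; tz = "UTC" is an assumption
data_table <- data.frame(
  start_time = as.POSIXct(c("2021-09-12 10:15:22", "2021-10-05 16:50:11",
                            "2021-11-12 12:02:44"), tz = "UTC"),
  end_time   = as.POSIXct(c("2021-09-12 11:16:21", "2021-10-05 16:57:00",
                            "2021-11-13 04:22:11"), tz = "UTC")
)

data_table$ride_length <- difftime(data_table$end_time, data_table$start_time,
                                   units = "secs")

# is.numeric() only tests the column; it changes nothing
is.numeric(data_table$ride_length)  # FALSE for a difftime

# Convert and assign back; units = "secs" makes the unit explicit
data_table$ride_length <- as.numeric(data_table$ride_length, units = "secs")

is.numeric(data_table$ride_length)  # TRUE
summary(data_table$ride_length)     # prints Min., 1st Qu., Median, Mean, 3rd Qu., Max.
```

Passing units = "mins" instead would give the ride lengths in minutes.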

for ride_length you can do

data_table$ride_length <- lubridate::as.duration(difftime(data_table$end_time,data_table$start_time, units = "secs"))

I didn't see that you used any packages/libraries apart from lubridate, so assuming you want to transition to using the tidyverse, I've reproduced the code from your post in a more tidyverse way:

library(lubridate)
library(tidyverse)

start_time <- as.character(c("2021-09-12 10:15:22", "2021-10-05 16:50:11", "2021-11-12 12:02:44"))
end_time <- as.character(c("2021-09-12 11:16:21", "2021-10-05 16:57:00", "2021-11-13 04:22:11"))
ride_id_number <- as.numeric(c("2341", "1342", "1234"))

data_table <- tibble(start_time, end_time, ride_id_number) |>
  mutate(
    start_time = ymd_hms(start_time),
    end_time = ymd_hms(end_time),
    ride_length = lubridate::as.duration(
      difftime(end_time, start_time, units = "secs")
    )
  )


summary(data_table)

Thank you so much nirgrahamuk. This worked for me without having to redo the whole analysis.

Thank you. I already used as.numeric() beforehand. But I found the solution thankfully.
