Hello everyone. This is my first time posting here, and I am new to R. I am working on my capstone project for the Google Data Analytics course, and I am trying to convert date/time data from character to date format. Some of my dates are of class "POSIXct" "POSIXt" and some are character. I have been using the following code:
df$date <- as.Date(df$started_at)
Here is an example of the result: my dates change from 10/08/2021 17:15 (the character format) to 0010-08-20 (after running the code).
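The result above can be reproduced in isolation. The following is a minimal sketch, assuming the text is day/month/year (the sample value 10/08/2021 does not settle this on its own). When no format is given, as.Date() tries "%Y-%m-%d" and then "%Y/%m/%d", so the leading 10 is read as the year and the first two digits of 2021 as the day.

```r
# Made-up value in the same layout as the example above.
x <- "10/08/2021 17:15"

# No format supplied: "%Y/%m/%d" matches, giving year 10, month 08, day 20.
# Everything after the matched part ("21 17:15") is ignored.
wrong <- as.Date(x)
wrong_year <- as.POSIXlt(wrong)$year + 1900  # year component as a number

# Assumption: the layout is day/month/year. If the data is month/day/year,
# swap %d and %m.
right <- as.Date(x, format = "%d/%m/%Y")

print(wrong)
print(right)
```

Passing the format explicitly is what stops as.Date() from guessing the layout.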
I tried other functions such as strptime() and as.POSIXct(), but they cause problems of their own: either the date column contains NAs instead of dates, or dates that don't exist in my data appear, with years such as 2030 or 2013. My data covers 2021 to 2022, so I don't understand where 2030 is coming from. Here is my code from the beginning. Let me know what I am missing.
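One possible explanation for the odd years and the NAs, which cannot be confirmed without the exact calls that were tried: a format string that does not match the layout of the text. The sketch below uses made-up values and assumes a day/month/year layout; the format strings are illustrative guesses, not the ones actually used.

```r
# Made-up strings in the layout from the post (assumed day/month/year).
x <- c("30/08/2021 17:15", "13/08/2021 09:30")

# %y is a two-digit year, so "30" becomes 2030 and "13" becomes 2013;
# the day is then read from the "20" in "2021".
two_digit <- as.Date(x, format = "%y/%m/%d")

# A month-first pattern fails because there is no month 30 or 13, giving NA.
month_first <- as.Date(x, format = "%m/%d/%Y")

# Spelling out the full layout keeps both the date and the time of day.
# tz = "UTC" is an assumption, chosen only to make the result reproducible.
parsed <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

print(two_digit)
print(month_first)
print(parsed)
```

The full script follows.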
```{r}
library(tidyverse)
library(lubridate)
library(ggplot2)
```
# importing data
```{r}
aug_2021<-read_csv("202108-divvy-tripdata.csv")
sept_2021<-read_csv("202109-divvy-tripdata.csv")
oct_2021<-read_csv("202110-divvy-tripdata.csv")
nov_2021<-read_csv("202111-divvy-tripdata.csv")
dec_2021<-read_csv("202112-divvy-tripdata.csv")
jan_2022<-read_csv("202201-divvy-tripdata.csv")
feb_2022<-read_csv("202202-divvy-tripdata.csv")
mar_2022<-read_csv("202203-divvy-tripdata.csv")
apr_2022<-read_csv("202204-divvy-tripdata.csv")
may_2022<-read_csv("202205-divvy-tripdata.csv")
jun_2022<-read_csv("202206-divvy-tripdata.csv")
jul_2022<-read_csv("202207-divvy-tripdata.csv")
aug_2022<-read_csv("202208-divvy-tripdata.csv")
```
# wrangle and combine data into a single file
```{r}
colnames(aug_2021)
colnames(sept_2021)
colnames(oct_2021)
colnames(nov_2021)
colnames(dec_2021)
colnames(jan_2022)
colnames(feb_2022)
colnames(mar_2022)
colnames(apr_2022)
colnames(may_2022)
colnames(jun_2022)
colnames(jul_2022)
colnames(aug_2022)
```
# inspecting the data frames and looking for incongruencies
```{r}
str(aug_2021)
str(sept_2021)
str(oct_2021)
str(nov_2021)
str(dec_2021)
str(jan_2022)
str(feb_2022)
str(mar_2022)
str(apr_2022)
str(may_2022)
str(jun_2022)
str(jul_2022)
str(aug_2022)
```
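Since the date columns have different classes, it may help to compare the class of started_at across the monthly data frames directly rather than reading through each str() output. The sketch below uses two toy data frames with made-up values; month_a and month_b are hypothetical stand-ins for the monthly files.

```r
# Toy stand-ins for two monthly files (values are made up).
# In one, started_at is already a date-time; in the other it is still text.
month_a <- data.frame(started_at = as.POSIXct("2021-08-10 17:15:00", tz = "UTC"))
month_b <- data.frame(started_at = "10/08/2021 17:15", stringsAsFactors = FALSE)

months <- list(month_a = month_a, month_b = month_b)

# First class of started_at in each data frame.
started_class <- sapply(months, function(d) class(d$started_at)[1])
print(started_class)
```

Any data frame reported as "character" here is one whose dates still need an explicit format before combining.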
# combining the data into one big data frame
```{r}
all_trips<-rbind(aug_2021,sept_2021,oct_2021,nov_2021,dec_2021,jan_2022,feb_2022,mar_2022,apr_2022,may_2022,jun_2022,jul_2022,aug_2022)
```
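Because some date columns are date-times and others are character, one option is to bring them to a single class before combining. This is a sketch under assumptions: to_datetime is a hypothetical helper, the text layout is taken to be day/month/year hour:minute, and the UTC time zone is picked only for reproducibility.

```r
# Hypothetical helper: returns POSIXct input unchanged and parses text
# with an explicit format (assumed layout: day/month/year hour:minute).
to_datetime <- function(x, fmt = "%d/%m/%Y %H:%M", tz = "UTC") {
  if (inherits(x, "POSIXct")) return(x)
  as.POSIXct(x, format = fmt, tz = tz)
}

# Toy stand-ins for two monthly files (values are made up).
month_a <- data.frame(started_at = as.POSIXct("2021-08-10 17:15:00", tz = "UTC"))
month_b <- data.frame(started_at = "10/08/2021 17:15", stringsAsFactors = FALSE)

# Convert each data frame first, then stack them.
months <- lapply(list(month_a, month_b), function(d) {
  d$started_at <- to_datetime(d$started_at)
  d
})
all_toy <- do.call(rbind, months)
str(all_toy)
```

Converting per file, before the rbind() step, means each file can be parsed with the format that matches its own layout.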
# removing unnecessary columns
```{r}
all_trips <- all_trips %>%
  select(-c(start_lat, start_lng, end_lat, end_lng,
            start_station_id, start_station_name,
            end_station_name, end_station_id))
```
# step 3 clean up and add data to prepare for analysis
# inspecting the new table that has been created
```{r}
colnames(all_trips)
nrow(all_trips)
dim(all_trips)
head(all_trips)
tail(all_trips)
str(all_trips)
summary(all_trips)
```
# check how many observations fall under each user type
```{r}
table(all_trips$member_casual)
```
# Add columns that list the date, month, day, and year of each ride
# The default format is yyyy-mm-dd
# This will allow us to aggregate ride data for each month, day, or year. Before completing these operations, we could only aggregate at the ride level.
```{r}
all_trips$date <- as.Date(all_trips$started_at)
```
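For reference, this is roughly what the date, month, day, and year columns would look like once started_at is a proper date-time. A minimal sketch on a toy data frame with made-up values; trips is a hypothetical stand-in for all_trips, and the UTC time zone is an assumption.

```r
# Toy stand-in for all_trips, with started_at already stored as POSIXct.
trips <- data.frame(
  started_at = as.POSIXct(c("2021-08-10 17:15:00", "2022-01-03 08:05:00"), tz = "UTC")
)

# Calendar date of each ride, then its parts as text columns.
trips$date  <- as.Date(trips$started_at, tz = "UTC")
trips$month <- format(trips$date, "%m")
trips$day   <- format(trips$date, "%d")
trips$year  <- format(trips$date, "%Y")

print(trips)
```

With character input, the same as.Date() call would need an explicit format argument to give dates in the expected range.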