My first Data Analysis project - need help

Hello All,

I am working on my first data analysis project and am not familiar with everything yet, so I hope you can help with my question. I am working with the BellaBeat dataset from Kaggle and wanted to do some data cleaning.
I ran install.packages and library for all the packages needed. Currently I am trying to run a basic pipe and I receive this message: Error in clean_names(.) : could not find function "clean_names". If I remove this function from the pipe, I get the same error message for a different line. I researched online and in the R Community, and it looks like the libraries may not be loading properly. I tried re-installing everything, but I still receive the same message. I also tried running a basic function without the pipe, but it didn't recognize the column names, which I copied and pasted from the table.

My code:

install.packages("tidyverse")
library(tidyverse)
head(daily_activity)
str(daily_activity)
install.packages("here")
library(here)
install.packages("skimr")
library(skimr)
install.packages("janitor")
library(janitor)
install.packages("dplyr")
library(dplyr)
glimpse(daily_activity)
str(daily_activity)
install.packages("lubridate")
library(lubridate)
install.packages("tidyr")
library(tidyr)
head(daily_activity)


Daily <- daily_activity %>%
  clean_names() %>%
  mutate(ActivityDate=mdy(ActivityDate)) %>%
  rename(ActivityDate=Date, TotalSteps=Steps) %>%
  select(-c(5:10))

project link: RStudio Cloud
Any advice would be appreciated.

Thank you,
Valeryia

In your project, I have managed to get a pipeline working using this code:

library(tidyverse)
library(janitor)
library(lubridate)

daily_activity <- read_csv("/cloud/BellaBeat/dailyActivity_merged.csv")

daily <- daily_activity %>%
  clean_names() %>%
  mutate(activity_date=mdy(activity_date)) %>%
  rename(date = activity_date, 
         steps = total_steps) %>%
  select(-c(5:10))

head(daily)

To unpack what is happening here:

  1. janitor::clean_names makes the column names of your data frame "snake_case" (easier to use!)
  2. dplyr::mutate is converting the activity_date column to a date using the function lubridate::mdy
  3. dplyr::rename is renaming the column headers, using the syntax new_name = old_name
  4. dplyr::select is dropping the fifth through the tenth columns
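
The renaming steps above can be checked on a small stand-in table. This is only a sketch: the data frame toy and its values are made up for illustration and are not the real BellaBeat data. It shows that after clean_names the columns are called activity_date and total_steps, so later steps in the pipe have to use those names rather than ActivityDate or TotalSteps.

```r
library(dplyr)      # mutate(), rename(), and the %>% pipe
library(janitor)    # clean_names()
library(lubridate)  # mdy()

# Toy stand-in for daily_activity; values are invented for illustration only.
toy <- data.frame(
  Id = c(1, 2),
  ActivityDate = c("4/12/2016", "4/13/2016"),
  TotalSteps = c(1000, 2000)
)

# Step 1: convert the column names to snake_case.
cleaned <- toy %>% clean_names()
names(cleaned)  # "id" "activity_date" "total_steps"

# Steps 2 and 3: parse the date, then rename using new_name = old_name.
result <- cleaned %>%
  mutate(activity_date = mdy(activity_date)) %>%
  rename(date = activity_date, steps = total_steps)

names(result)
class(result$date)
```

Running names() after each step is a quick way to see which column names the next function in the pipe will actually receive.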

When writing a script, it is not typical to include install.packages as you will not want to re-do that with every re-run of a script. You install once, you load (using library) every time.
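
One possible way to keep installation out of the script is to check which packages are missing and install only those, once, from the console. The helper below, missing_packages, is a hypothetical name used for this sketch and is not part of any package; the list of packages is taken from the working pipeline above.

```r
# Hypothetical helper (not from any package): return the names in `pkgs`
# that cannot be found in the current library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("tidyverse", "janitor", "lubridate")
to_install <- missing_packages(needed)
to_install  # character(0) when everything is already installed

# Run this once in the console rather than keeping it in the script:
# install.packages(to_install)
```

The script itself then only needs the library calls at the top, which are run every session.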

Thank you very much Jack! Appreciate your help.
