Importing a CSV file into RStudio

I am starting a case study project using Fitbit data. The data is in a zip file that I downloaded to my PC and then tried to import into my RStudio project directory. However, when I try to create a data frame, I get this error:
fitbit_df <-read_csv("Fitabase_Data_4.12.16-5.12.16")
Error: 'Fitabase_Data_4.12.16-5.12.16' does not exist in current working directory ('/cloud/project').

The data comes from Kaggle (FitBit Fitness Tracker Data | Kaggle).
As a result, I cannot create a data frame to clean my data. I can view the data with head() when I open the individual files inside the zip file, but when I try to create a data frame from one of them, I again get an error saying that the file does not exist in the directory.

View(dailyActivity_merged)
head("dailyActivity_merged")
[1] "dailyActivity_merged"
head(dailyActivity_merged)

# A tibble: 6 × 15
          Id Activ…¹ Total…² Total…³ Track…⁴ Logge…⁵ VeryA…⁶ Moder…⁷ Light…⁸ Seden…⁹
       <dbl> <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 1503960366 4/12/2…   13162    8.5     8.5        0    1.88   0.550    6.06       0
2 1503960366 4/13/2…   10735    6.97    6.97       0    1.57   0.690    4.71       0
3 1503960366 4/14/2…   10460    6.74    6.74       0    2.44   0.400    3.91       0
4 1503960366 4/15/2…    9762    6.28    6.28       0    2.14   1.26     2.83       0
5 1503960366 4/16/2…   12669    8.16    8.16       0    2.71   0.410    5.04       0
6 1503960366 4/17/2…    9705    6.48    6.48       0    3.19   0.780    2.51       0
# … with 5 more variables: VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
#   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>, and
#   abbreviated variable names ¹ActivityDate, ²TotalSteps, ³TotalDistance,
#   ⁴TrackerDistance, ⁵LoggedActivitiesDistance, ⁶VeryActiveDistance,
#   ⁷ModeratelyActiveDistance, ⁸LightActiveDistance, ⁹SedentaryActiveDistance
# ℹ Use `colnames()` to see all variable names

read_csv("dailyActivity_merged.csv")
Error: 'dailyActivity_merged.csv' does not exist in current working directory ('/cloud/project').
read_csv("dailyAct

I know I had a similar issue before, but that was due to the link issued for the assignment. Any advice?

I don't see a file named Fitabase_Data_4.12.16-5.12.16 in your image. I see a folder named Fitabase Data 4.12.16-5.12.16. Is the file you want to read in that folder? What is the name of the file?
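
One way to answer these questions from the console is to check what R can actually see. The sketch below is an illustration, not the poster's code: it builds a throwaway folder with the same name under tempdir() and fills it with made-up rows so that it runs anywhere. It then shows that the bare file name is not found at the top level, while the path that includes the folder is. In a real project, `project_dir` would be the working directory reported by getwd().

```r
# Stand-in for the project directory; in a real session this would be getwd().
project_dir <- file.path(tempdir(), "demo_project")
data_dir <- file.path(project_dir, "Fitabase Data 4.12.16-5.12.16")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# Write a tiny made-up CSV so there is something to find.
demo <- data.frame(Id = c(1, 2), TotalSteps = c(100, 200))
write.csv(demo, file.path(data_dir, "dailyActivity_merged.csv"), row.names = FALSE)

# What sits at the top level of the project? Only the folder, not the CSV.
top_level <- list.files(project_dir)

# The bare file name does not exist at the top level ...
bare_exists <- file.exists(file.path(project_dir, "dailyActivity_merged.csv"))

# ... but the path that includes the folder does.
csv_path <- file.path(data_dir, "dailyActivity_merged.csv")
full_exists <- file.exists(csv_path)

# Reading with the full path works.
daily_activity <- read.csv(csv_path)
```

The same idea applies to read_csv(): pass the folder name as part of the path, or move the file to the top level of the project.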

You could extract each .csv file and upload them one by one:

Next, try to do this:

dailyActivity <- read.csv("dailyActivity_merged.csv")
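
If extracting files one at a time is tedious, base R's unzip() can extract the whole archive in one step once the zip file itself has been uploaded. This is a minimal sketch under assumptions: the archive name `archive.zip` is hypothetical (use whatever your download is called), and the helper `extract_csvs()` is made up for illustration.

```r
# Hypothetical helper: extract a zip archive and return the paths of its CSV files.
# Returns an empty character vector if the archive is not where we expect it.
extract_csvs <- function(zip_path, exdir = "data") {
  if (!file.exists(zip_path)) {
    message("No archive found at: ", zip_path)
    return(character(0))
  }
  unzip(zip_path, exdir = exdir)  # utils::unzip(), ships with R
  list.files(exdir, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
}

# "archive.zip" is an assumed file name, not one taken from this thread.
csv_paths <- extract_csvs("archive.zip", exdir = file.path(tempdir(), "fitbit_data"))
```

Each returned path already includes its folder, so it can be passed straight to read.csv() or read_csv().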

Hello @Jogr ,

I am assuming that each file in the zip folder has to be read in and given a distinct name; hopefully I have not misunderstood the problem. I tried this on Posit Cloud, and you can access my project there if needed (Posit Cloud). The code is below. Hope this helps.

> # list all the files in the zip folder --------------
> 
> list.files("Fitabase Data 4.12.16-5.12.16/") -> names_data_files
> 
> names_data_files
 [1] "dailyActivity_merged.csv"           "dailyCalories_merged.csv"          
 [3] "dailyIntensities_merged.csv"        "dailySteps_merged.csv"             
 [5] "heartrate_seconds_merged.csv"       "hourlyCalories_merged.csv"         
 [7] "hourlyIntensities_merged.csv"       "hourlySteps_merged.csv"            
 [9] "minuteCaloriesNarrow_merged.csv"    "minuteCaloriesWide_merged.csv"     
[11] "minuteIntensitiesNarrow_merged.csv" "minuteIntensitiesWide_merged.csv"  
[13] "minuteMETsNarrow_merged.csv"        "minuteSleep_merged.csv"            
[15] "minuteStepsNarrow_merged.csv"       "minuteStepsWide_merged.csv"        
[17] "sleepDay_merged.csv"                "weightLogInfo_merged.csv"          
> 
>  ## It is clear from the file names that there are multiple files that may have
>  ## different sets of columns. These are best read individually. All are csv files.
> 
> 
> # Read all files --------------------------------------
> 
> 
> purrr::map(names_data_files,
+            ~readr::read_csv(here::here("Fitabase Data 4.12.16-5.12.16/",.x))) |>
+   setNames(stringr::str_remove(names_data_files,"\\.csv$")) -> all_data_files
Rows: 940 Columns: 15                                                                                                                        
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): ActivityDate
dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDistance, VeryActiveDistance...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 940 Columns: 3                                                                                                                         
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityDay
dbl (2): Id, Calories

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 940 Columns: 10                                                                                                                        
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityDay
dbl (9): Id, SedentaryMinutes, LightlyActiveMinutes, FairlyActiveMinutes, VeryActiveMinutes, Sedentary...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 940 Columns: 3                                                                                                                         
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityDay
dbl (2): Id, StepTotal

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 2483658 Columns: 3                                                                                                                     
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Time
dbl (2): Id, Value

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 22099 Columns: 3                                                                                                                       
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityHour
dbl (2): Id, Calories

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 22099 Columns: 4                                                                                                                       
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityHour
dbl (3): Id, TotalIntensity, AverageIntensity

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 22099 Columns: 3                                                                                                                       
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityHour
dbl (2): Id, StepTotal

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1325580 Columns: 3                                                                                                                     
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityMinute
dbl (2): Id, Calories

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 21645 Columns: 62                                                                                                                      
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): ActivityHour
dbl (61): Id, Calories00, Calories01, Calories02, Calories03, Calories04, Calories05, Calories06, Calo...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1325580 Columns: 3                                                                                                                     
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityMinute
dbl (2): Id, Intensity

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 21645 Columns: 62                                                                                                                      
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): ActivityHour
dbl (61): Id, Intensity00, Intensity01, Intensity02, Intensity03, Intensity04, Intensity05, Intensity0...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1325580 Columns: 3                                                                                                                     
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityMinute
dbl (2): Id, METs

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 188521 Columns: 4                                                                                                                      
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): date
dbl (3): Id, value, logId

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 1325580 Columns: 3                                                                                                                     
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): ActivityMinute
dbl (2): Id, Steps

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 21645 Columns: 62                                                                                                                      
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): ActivityHour
dbl (61): Id, Steps00, Steps01, Steps02, Steps03, Steps04, Steps05, Steps06, Steps07, Steps08, Steps09...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 413 Columns: 5                                                                                                                         
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): SleepDay
dbl (4): Id, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 67 Columns: 8                                                                                                                          
── Column specification ──────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Date
dbl (6): Id, WeightKg, WeightPounds, Fat, BMI, LogId
lgl (1): IsManualReport

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> 
> # names of all files -------------
> 
> # All files are read in as data frames, and each df is a distinct element of a list (all_data_files)
> 
> names(all_data_files)
 [1] "dailyActivity_merged"           "dailyCalories_merged"           "dailyIntensities_merged"       
 [4] "dailySteps_merged"              "heartrate_seconds_merged"       "hourlyCalories_merged"         
 [7] "hourlyIntensities_merged"       "hourlySteps_merged"             "minuteCaloriesNarrow_merged"   
[10] "minuteCaloriesWide_merged"      "minuteIntensitiesNarrow_merged" "minuteIntensitiesWide_merged"  
[13] "minuteMETsNarrow_merged"        "minuteSleep_merged"             "minuteStepsNarrow_merged"      
[16] "minuteStepsWide_merged"         "sleepDay_merged"                "weightLogInfo_merged"          
> 
> ## one can access an individual df as follows
> 
> all_data_files$dailyIntensities_merged
# A tibble: 940 × 10
           Id ActivityDay SedentaryMinutes LightlyActive…¹ Fairl…² VeryA…³ Seden…⁴ Light…⁵ Moder…⁶ VeryA…⁷
        <dbl> <chr>                  <dbl>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 1503960366 4/12/2016                728             328      13      25       0    6.06   0.550    1.88
 2 1503960366 4/13/2016                776             217      19      21       0    4.71   0.690    1.57
 3 1503960366 4/14/2016               1218             181      11      30       0    3.91   0.400    2.44
 4 1503960366 4/15/2016                726             209      34      29       0    2.83   1.26     2.14
 5 1503960366 4/16/2016                773             221      10      36       0    5.04   0.410    2.71
 6 1503960366 4/17/2016                539             164      20      38       0    2.51   0.780    3.19
 7 1503960366 4/18/2016               1149             233      16      42       0    4.71   0.640    3.25
 8 1503960366 4/19/2016                775             264      31      50       0    5.03   1.32     3.53
 9 1503960366 4/20/2016                818             205      12      28       0    4.24   0.480    1.96
10 1503960366 4/21/2016                838             211       8      19       0    4.65   0.350    1.34
# … with 930 more rows, and abbreviated variable names ¹LightlyActiveMinutes, ²FairlyActiveMinutes,
#   ³VeryActiveMinutes, ⁴SedentaryActiveDistance, ⁵LightActiveDistance, ⁶ModeratelyActiveDistance,
#   ⁷VeryActiveDistance
# ℹ Use `print(n = ...)` to see more rows
> 
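
The same pattern can also be written with base R only, which may help if purrr or here are not installed. This is a sketch, not a replacement for the code above: it creates two small made-up CSV files in a temporary folder so that it is runnable as-is, and `data_dir` would be the extracted data folder in a real project.

```r
# Stand-in folder with two tiny made-up CSV files.
data_dir <- file.path(tempdir(), "fitbit_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(Id = 1:2, Calories = c(10, 20)),
          file.path(data_dir, "dailyCalories_merged.csv"), row.names = FALSE)
write.csv(data.frame(Id = 1:3, StepTotal = c(100, 200, 300)),
          file.path(data_dir, "dailySteps_merged.csv"), row.names = FALSE)

# full.names = TRUE returns paths that include the folder,
# so the files are found regardless of the working directory.
csv_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list and name each element after its file.
all_data <- lapply(csv_paths, read.csv)
names(all_data) <- tools::file_path_sans_ext(basename(csv_paths))

# Individual data frames are then available by name.
steps <- all_data$dailySteps_merged
```

Keeping the data frames in a named list avoids creating many separate objects, at the cost of one extra `$` when accessing each table.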

Best,
Ayush


@Deepti_Prasad ,

The message you see is generated after reading in a CSV file. I receive a similar message as well; see the quoted reference below. You can turn these messages off by setting show_col_types = FALSE in the read_csv() call. The message is not a mistake or an error.
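
For illustration, here are two ways to quiet the column specification message. This is a sketch that assumes readr 2.0 or later is installed; the CSV it reads is a tiny made-up file written to a temporary location, not the Fitbit data.

```r
library(readr)  # assumes readr (>= 2.0) is installed

# Tiny made-up CSV so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Id,ActivityDate,TotalSteps",
             "1,4/12/2016,100",
             "2,4/13/2016,200"), csv_path)

# Option 1: keep the guessed column types but silence the message.
quiet_df <- read_csv(csv_path, show_col_types = FALSE)

# Option 2: declare the column types up front, which also silences the message.
typed_df <- read_csv(
  csv_path,
  col_types = cols(
    Id = col_double(),
    ActivityDate = col_character(),
    TotalSteps = col_double()
  )
)
```

Declaring the types is more typing, but it makes the script fail loudly if a file's columns ever change.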

I am assuming daily_activity is an object in your environment. I would still need to see more of your code to understand why this message is shown with the head() call. Is there a read_csv() command (or a similar one) in the same code chunk as the head() call?

Regards,
Ayush
