I have four dataframes: df1, df2, df3, and df4. Each has different time values corresponding to behavioral events. I would first like to load all the dataframes into R. Then I would like to use the mutate function to add a column to each dataframe that identifies which dataframe it is (e.g. 1, 2, 3, or 4). Finally, after adding that identifying column to each, I would like to combine all the dataframes into one big dataframe so that I can perform other operations on it.
The purpose of creating the new column in each dataframe before combining them is so that I can refer back to the individual dataframes when I need to.
The dplyr::bind_rows() function might be what you're looking for. If you name each of your dataset inputs, you can add an identifier using the .id argument.
For example:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df1 <- df2 <- df3 <- df4 <- band_instruments
bind_rows(
  `1` = df1,
  `2` = df2,
  `3` = df3,
  `4` = df4,
  .id = "identifier"
)
#> # A tibble: 12 x 3
#>    identifier name  plays
#>    <chr>      <chr> <chr>
#>  1 1          John  guitar
#>  2 1          Paul  bass
#>  3 1          Keith guitar
#>  4 2          John  guitar
#>  5 2          Paul  bass
#>  6 2          Keith guitar
#>  7 3          John  guitar
#>  8 3          Paul  bass
#>  9 3          Keith guitar
#> 10 4          John  guitar
#> 11 4          Paul  bass
#> 12 4          Keith guitar
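If you would rather add the column yourself with mutate(), as described in the question, the following is a rough sketch of the equivalent. It reuses band_instruments as stand-in data; the column name identifier and the object names combined and only_df2 are just placeholders for illustration.

```r
library(dplyr)

# Stand-in data: four copies of a small example tibble that ships with dplyr.
df1 <- df2 <- df3 <- df4 <- band_instruments

# Tag each dataframe with its own ID, then stack them row-wise.
combined <- bind_rows(
  mutate(df1, identifier = 1L),
  mutate(df2, identifier = 2L),
  mutate(df3, identifier = 3L),
  mutate(df4, identifier = 4L)
)

# Refer back to one of the original dataframes by filtering on its ID.
only_df2 <- combined %>%
  filter(identifier == 2L) %>%
  select(-identifier)
```

One difference from the .id approach: here the identifier is an integer column placed last, whereas .id gives a character column placed first.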
Hey, so I've included the dput() output for the first 100 rows of each of three dataframes (they are pretty big). I hope this helps. I am looking to create a new column in each one that identifies the dataframe by its ID (1, 2, or 3) and then combine the three into one.
Are you just trying to read in all the files, then add another column based on the source? Something like this using purrr, then adding the source indicator as shown.
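A minimal sketch of that idea, assuming the data live in CSV files. The file names, the time and event columns, and the source column name are all made up for illustration; the sketch writes three tiny files to a temporary folder so it runs on its own, and you would point list.files() at your real folder instead.

```r
library(dplyr)
library(purrr)

# Write three small example files to a temporary folder
# (hypothetical stand-ins for the real data files).
tmp <- tempfile("events_")
dir.create(tmp)
for (i in 1:3) {
  write.csv(
    data.frame(time = c(0.5, 1.2) * i, event = c("press", "release")),
    file.path(tmp, paste0("df", i, ".csv")),
    row.names = FALSE
  )
}

# Collect the file paths and name each one by the ID it should carry.
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
names(files) <- seq_along(files)

# Read every file into a named list, then stack the list;
# .id turns the list names into a "source" column.
combined <- files %>%
  map(read.csv) %>%
  bind_rows(.id = "source")
```

The source column comes back as character ("1", "2", "3"); wrap it in as.integer() inside a mutate() call if you need numeric IDs.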