Ideally, you would provide a sample of data that's already been read into R, rather than a csv or other data file. If you need help reading in your data, let us know. For now, if you have a set of files in your working directory, you can read them into a list of data frames with something like the following:
If your data files are called, say, file1.csv
through file4.csv
you could do:
library(tidyverse)
# Create a vector of file names
files.to.read = paste0("file", 1:4, ".csv")
# Read the files into a list
data.list = map(files.to.read, read_csv)
For the example below, we'll create some fake data to work with using the built-in mtcars
data frame. I'm hoping, based on your question, that this example will be close to your actual problem, but let me know if I'm off the mark.
First, create a new data frame called d
by adding x
and y
columns to the built-in mtcars
data frame. These are our "coordinates". Then, generate a list of four data frames, each containing 20 randomly selected rows from d
. Each of the four data frames includes the x
and y
columns, plus two other columns.
library(tidyverse)
# Generate fake data
d = mtcars %>%
mutate(x = 1:nrow(mtcars),
y = nrow(mtcars):1)
set.seed(45)
data.list = map(c(1,3,5,7), ~ d[sample(1:nrow(d), 20), c(12,13,.x,.x+1)])
Here's what the first three rows of each data frame in data.list
look like.
map(data.list, ~head(.x, 3))
[[1]]
x y mpg cyl
Hornet Sportabout 5 28 18.7 8
Merc 280C 11 22 17.8 6
Merc 450SLC 14 19 15.2 8
[[2]]
x y disp hp
Ferrari Dino 30 3 145.0 175
Lincoln Continental 16 17 460.0 215
Merc 230 9 24 140.8 95
[[3]]
x y drat wt
Cadillac Fleetwood 15 18 2.93 5.25
Merc 230 9 24 3.92 3.15
Fiat 128 18 15 4.08 2.20
[[4]]
x y qsec vs
Lincoln Continental 16 17 17.82 0
Datsun 710 3 30 18.61 1
Dodge Challenger 22 11 16.87 0
Now we'll iteratively run inner_join
on the four data frames, which will return a data frame with only those rows that are present in every data frame (as determined by sharing common values of x
and y
)
reduce(data.list, ~inner_join(.x, .y, by=c("x","y")))
x y mpg cyl disp hp drat wt qsec vs
1 16 17 10.4 8 460.0 215 3.00 5.424 17.82 0
2 6 27 18.1 6 225.0 105 2.76 3.460 20.22 1
3 22 11 15.5 8 318.0 150 2.76 3.520 16.87 0
4 8 25 24.4 4 146.7 62 3.69 3.190 20.00 1
Or we can run a full_join
, which will keep all rows of all data frames but rows other than the four shown above will have missing data in at least two columns:
reduce(data.list, ~full_join(.x, .y, by=c("x","y")))
Click to see output
x y mpg cyl disp hp drat wt qsec vs
1 5 28 18.7 8 360.0 175 NA NA NA NA
2 11 22 17.8 6 167.6 123 3.92 3.440 NA NA
3 14 19 15.2 8 275.8 180 NA NA 18.00 0
4 7 26 14.3 8 NA NA NA NA 15.84 0
5 26 7 27.3 4 NA NA NA NA 18.90 1
6 19 14 30.4 4 NA NA 4.93 1.615 18.52 1
7 31 2 15.0 8 NA NA 3.54 3.570 14.60 0
8 16 17 10.4 8 460.0 215 3.00 5.424 17.82 0
9 29 4 15.8 8 351.0 264 4.22 3.170 NA NA
10 6 27 18.1 6 225.0 105 2.76 3.460 20.22 1
11 4 29 21.4 6 258.0 110 3.08 3.215 NA NA
12 17 16 14.7 8 440.0 230 NA NA 17.42 0
13 24 9 13.3 8 NA NA 3.73 3.840 NA NA
14 25 8 19.2 8 400.0 175 NA NA 17.05 0
15 22 11 15.5 8 318.0 150 2.76 3.520 16.87 0
16 18 15 32.4 4 NA NA 4.08 2.200 NA NA
17 3 30 22.8 4 NA NA 3.85 2.320 18.61 1
18 30 3 19.7 6 145.0 175 NA NA NA NA
19 8 25 24.4 4 146.7 62 3.69 3.190 20.00 1
20 13 20 17.3 8 NA NA NA NA 17.60 0
21 9 24 NA NA 140.8 95 3.92 3.150 NA NA
22 23 10 NA NA 304.0 150 3.15 3.435 17.30 0
23 15 18 NA NA 472.0 205 2.93 5.250 17.98 0
24 2 31 NA NA 160.0 110 NA NA 17.02 0
25 21 12 NA NA 120.1 97 NA NA NA NA
26 28 5 NA NA 95.1 113 3.77 1.513 NA NA
27 27 6 NA NA 120.3 91 4.43 2.140 16.70 0
28 20 13 NA NA 71.1 65 NA NA NA NA
29 32 1 NA NA NA NA 4.11 2.780 NA NA
30 1 32 NA NA NA NA 3.90 2.620 16.46 0
31 10 23 NA NA NA NA 3.92 3.440 18.30 1
32 12 21 NA NA NA NA NA NA 17.40 0
Or if we want to keep all the data frames separate, we can identify the rows that are common to all the data frames and just keep those rows in each data frame separately:
common.rows = map(data.list, ~.x[,c("x","y")]) %>%
reduce(~intersect(.x,.y))
data.list %>%
map(~inner_join(.x, common.rows))
[[1]]
x y mpg cyl
1 16 17 10.4 8
2 6 27 18.1 6
3 22 11 15.5 8
4 8 25 24.4 4
[[2]]
x y disp hp
1 16 17 460.0 215
2 22 11 318.0 150
3 8 25 146.7 62
4 6 27 225.0 105
[[3]]
x y drat wt
1 22 11 2.76 3.520
2 6 27 2.76 3.460
3 8 25 3.69 3.190
4 16 17 3.00 5.424
[[4]]
x y qsec vs
1 16 17 17.82 0
2 22 11 16.87 0
3 8 25 20.00 1
4 6 27 20.22 1