Hi everyone, I am new to RStudio. I am working with a dataframe that consists of 5 columns: SampleID; chr; pos; ref; mut. These are variant calls from a large cohort of samples (>900 unique SampleIDs).
I want to filter this dataframe and create a new dataframe that includes only the rows corresponding to a specific list of SampleIDs (~100 unique SampleIDs).
Is there a way I can import my list of desired SampleIDs, filter the original dataframe and create a new dataframe that consists only of the data from my list of SampleIDs?
### assuming the csv looks like this
# SampleID
# 342
# 377
# 899
# replace "myfile.csv" with the path to your csv, within quote marks
sample_ids <- read.csv(file = "myfile.csv")
library(tidyverse)
my_second_df <- filter(my_first_df,
SampleID %in% sample_ids$SampleID)
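To try the filter step without the real files, here is a minimal sketch on made-up data. The values in both data frames are invented for illustration; only the column names (SampleID, chr, pos, ref, mut) come from the question, and `sample_ids` stands in for what `read.csv()` would return.

```r
library(dplyr)

# Toy stand-in for the full variant table (all values are made up)
my_first_df <- data.frame(
  SampleID = c(342, 342, 377, 501, 899, 910),
  chr      = c("1", "2", "2", "7", "X", "3"),
  pos      = c(1050, 20400, 20400, 55000, 1200, 98000),
  ref      = c("A", "C", "C", "G", "T", "A"),
  mut      = c("G", "T", "T", "A", "C", "C")
)

# Toy stand-in for the data frame that read.csv() would return
sample_ids <- data.frame(SampleID = c(342, 377, 899))

# Keep only the rows whose SampleID appears in the list
my_second_df <- filter(my_first_df, SampleID %in% sample_ids$SampleID)

nrow(my_second_df)             # 4
unique(my_second_df$SampleID)  # 342 377 899
```

Samples 501 and 910 are dropped because they are not in the list, and both rows for sample 342 are kept.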
Error: Problem with filter() input ..1.
x Input ..1 must be of size 5157308 or 1, not size 0.
i Input ..1 is df1$sampleID %in% sample_ids$sampleID.
Run rlang::last_error() to see where the error occurred.
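The "size 0" part of this error suggests that `df1$sampleID` is NULL, which is what R returns when no column has exactly that name. Column names are case-sensitive, and the original post spells the column `SampleID`, so a likely cause (an assumption, since the real data is not shown) is the lowercase `s` in `sampleID`. A small sketch with made-up stand-ins for `df1` and `sample_ids`:

```r
# Toy stand-ins; the real df1 and sample_ids are not shown in the thread
df1 <- data.frame(SampleID = c(342, 377, 501), chr = c("1", "2", "7"))
sample_ids <- data.frame(SampleID = c(342, 377))

# Check the exact spelling of the column names in both data frames
names(df1)
names(sample_ids)

# A misspelled column name returns NULL, so the %in% test has length 0
is.null(df1$sampleID)                          # TRUE
length(df1$sampleID %in% sample_ids$sampleID)  # 0

# Spelled exactly as in names(), the test gives one value per row
keep <- df1$SampleID %in% sample_ids$SampleID
length(keep) == nrow(df1)                      # TRUE
df2 <- df1[keep, ]
```

If `names()` shows a different spelling in either data frame, using that exact spelling in the `filter()` call should give a condition with one value per row.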