Hi. I am trying to drop rows with negative ride_length values and rows where start_station_name is "HQ QR". After I run my code, the global environment shows a sensible reduction in observations, but when I inspect the data frame it contains a large number of NAs and the counts don't add up. all_trips is my data frame.
#This is the code I ran to drop rows
all_trips_v2 <- all_trips[!(all_trips$start_station_name == "HQ QR" | all_trips$ride_length<=0),]
#This is the code I ran to check for NA values. The result shows more NAs than there are rows in my data set: the data set has about 6 million rows, but the NA count returned is about 80 million.
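For reference, here is a minimal sketch of how an NA count can end up larger than the row count. It assumes the check was something like sum(is.na(all_trips)), which is a guess since the actual check is not shown. The data frame toy and its values are hypothetical stand-ins, not the real data.

```r
# Hypothetical toy data standing in for all_trips
toy <- data.frame(
  start_station_name = c("A St", NA, NA, "B St"),
  end_station_name   = c("C St", NA, "D St", NA),
  ride_length        = c(120, NA, 45, 60)
)

# is.na() on a data frame returns one TRUE/FALSE per cell,
# so sum() counts NA cells across all columns, not NA rows
total_na_cells <- sum(is.na(toy))

# NA count broken down by column
na_per_column <- colSums(is.na(toy))

# Number of rows that contain at least one NA
rows_with_na <- sum(!complete.cases(toy))

print(total_na_cells)  # 5, which is more than the 4 rows in toy
print(na_per_column)
print(rows_with_na)    # 3
```

If the real check counted cells in this way, a total far above the number of rows would not by itself mean the count is wrong; the per-column and per-row figures are usually easier to interpret.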
My attempts at solving this:
When I drop the rows in two separate steps, the "HQ QR" filter gives me NAs, while the negative-values filter works. My guess is that "HQ QR" does not exist in the column? I also checked for NAs in my original data set before dropping any rows, and the check reported a very high NA count, about 3/4 of my rows. That does not seem right. Is it possible that one of my calculations caused this?
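One possible explanation, sketched on hypothetical toy data: if start_station_name itself contains NAs, the comparison with "HQ QR" returns NA for those rows, and indexing a data frame with an NA in a logical vector produces a row made entirely of NAs. Whether this is what happens in all_trips is an assumption. The names toy, with_na_rows, strict and lenient are made up for illustration.

```r
# Hypothetical toy data standing in for all_trips
toy <- data.frame(
  start_station_name = c("A St", "HQ QR", NA, "B St", NA),
  ride_length        = c(120, 300, 45, -10, -5),
  stringsAsFactors   = FALSE
)

# Same condition as in the question; row 3 evaluates to NA
# because NA == "HQ QR" is NA and NA | FALSE is NA
drop <- toy$start_station_name == "HQ QR" | toy$ride_length <= 0

# Logical indexing with an NA yields an all-NA row
with_na_rows <- toy[!drop, ]

# Option 1: which() keeps only positions that are definitely TRUE,
# so rows where the condition is NA are dropped as well
strict <- toy[which(!drop), ]

# Option 2: treat an NA condition as "do not drop",
# which keeps rows with a missing station name and a valid ride_length
lenient <- toy[!(drop %in% TRUE), ]

print(with_na_rows)
print(strict)
print(lenient)
```

Which option is appropriate depends on whether rows with a missing station name should stay in the data set. dplyr::filter() behaves like option 1, since it drops rows where the condition evaluates to NA.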