Where should I ask a question about code review/code optimization?

tidyverse

#1

I have a use case with a large data set where I need to remove real outliers (by which I mean impossibly high or low measurements). I'm able to do that today, but I'm not sure I'm doing it the fastest way (plotting, for sure, is slow as heck) or the most idiomatic tidyverse way. So it's not a code bug question but, let's say, a code optimization/code review one. Where do you think the best place to ask would be: here, Stack Overflow, or Code Review? Thanks


#2

Do you have pre-set boundaries on what constitutes an impossibly high/low value in your data? If so, I would use dplyr's filter() function:

library(dplyr)
# Keep only rows whose measurement lies within the plausible range (inclusive)
data_sans_anoms <- data %>%
  filter(Variable >= lower_boundary, Variable <= upper_boundary)

#3

I think that's a perfectly reasonable question for #general. You mention you have a large dataset, but even with just a small reprex, we could use benchmarking to test the speed of different approaches.
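To give a rough idea of what such a benchmark could look like, here is a minimal sketch. It assumes the dplyr and microbenchmark packages are installed; the data frame `toy`, its single column `Variable`, the injected bad values, and the two boundaries are all made up for illustration and stand in for the real data and limits.

```r
library(dplyr)
library(microbenchmark)

set.seed(42)

# Hypothetical stand-in for the real data: 100,000 simulated measurements
toy <- data.frame(Variable = rnorm(1e5, mean = 50, sd = 10))

# Inject a handful of impossibly high values to play the role of "real" outliers
toy$Variable[sample(nrow(toy), 100)] <- 1e6

# Assumed plausibility limits; replace with the real ones
lower_boundary <- 0
upper_boundary <- 100

# Time three equivalent ways of dropping out-of-range rows, 20 runs each
res <- microbenchmark(
  dplyr_filter  = filter(toy, Variable >= lower_boundary, Variable <= upper_boundary),
  dplyr_between = filter(toy, between(Variable, lower_boundary, upper_boundary)),
  base_subset   = toy[toy$Variable >= lower_boundary &
                        toy$Variable <= upper_boundary, , drop = FALSE],
  times = 20
)

print(res)
```

The timings will depend on the machine and on the shape of the real data, so this is only a starting point for comparing approaches, not a verdict on which one is fastest.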

I've seen many of these types of questions answered on Stack Overflow and elsewhere also.


#4

Thanks! I think I can prepare a large data set which preserves most of the challenges of my proprietary data set. Get ready for a reprex which will burn your CPUs :stuck_out_tongue_winking_eye: :fire: I’ll prepare it and should post it on #general by about this time tomorrow.