Hi all,
Fairly new to R but making steady progress. I started about 2 or 3 years ago and am feeling more comfortable all the time. However, I feel there are areas where I could be more efficient and make my code easier to read, share, and utilise. I'm looking for general tips and anything people have found that makes life a bit easier. A couple of specific areas where advice would be awesome:
- Rerunning an analysis across slightly different datasets. Two examples here. First, if I rerun the analysis (machine learning, multilevel modelling) on the same dataset with different variables, my current workflow is to open a new markdown file, go to the data cleaning stage, and add or remove the variables of interest. Second, running the same machine learning analysis across different subsets of the data. This follows a similar pattern: the code is copied and pasted, then changed to assess subset 1 or subset 2.
- Removing large numbers of variables. I work with some datasets that have a large number of variables and am wondering if anyone has tips or strategies for paring these down. I do use helpers such as starts_with, ends_with, and contains, but at some stage I end up with a dplyr::select() that has a long list of variables written out. Not sure if there is a better way to do this!
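To make both points concrete, here is a rough sketch of the direction I imagine might replace the copy and paste, though I'm not sure it is the right approach. Everything in it is a placeholder: `mtcars` stands in for my data, `lm()` stands in for the real model, and `run_analysis`, `var_sets`, and `drop_vars` are hypothetical names I made up for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical variable sets, stored as character vectors so they can be
# edited in one place instead of inside a long select() call.
var_sets <- list(
  small = c("wt", "hp"),
  large = c("wt", "hp", "disp", "qsec")
)

# Hypothetical helper: cleaning and modelling live in one function.
# lm() is only a stand-in for the actual machine learning / multilevel model.
run_analysis <- function(data, outcome, predictors) {
  cleaned <- data %>%
    select(all_of(c(outcome, predictors))) %>%  # errors if a name is missing
    na.omit()
  lm(reformulate(predictors, response = outcome), data = cleaned)
}

# Same analysis, different variables: one fit per variable set.
fits_by_vars <- map(var_sets, function(vars) run_analysis(mtcars, "mpg", vars))

# Same analysis, different subsets: split the data, then one fit per subset.
fits_by_cyl <- mtcars %>%
  split(mtcars$cyl) %>%
  map(function(d) run_analysis(d, "mpg", var_sets$small))

# Dropping many variables: keep the names in a vector and negate it.
# any_of() ignores names that are not present in the data.
drop_vars <- c("vs", "am", "gear", "carb")
trimmed <- mtcars %>% select(-any_of(drop_vars))
```

The idea is that a new variable set or subset becomes a one-line change to a list rather than a new markdown file, and the long column lists sit in named vectors that `all_of()` or `any_of()` can reuse.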
Thank you in advance for any tips or advice. If anything needs to be clearer, please let me know.