I have a dataset with multiple classes (< 20), and I want to classify each class against a single reference class. The final goal is to extract the important variables that distinguish each class from the reference. If it helps to frame the question, an example would be classifying different cancer types against a single healthy tissue and determining which features are important for the classification of each tumour.
My first, naive approach is to subset the dataset and compare each non-reference class to the reference using any appropriate method, starting with a generalised linear model and/or a random forest, then assess model performance and extract the variables of importance (VIPs) for each comparison. Basically a loop.
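For concreteness, here is a minimal sketch of that loop. It assumes Python with scikit-learn and uses synthetic data from `make_classification` as a stand-in for my real dataset. The helper name `one_vs_reference`, the choice of class `0` as the reference, and impurity-based random forest importances as the "VIPs" are all illustrative assumptions, not a settled design.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def one_vs_reference(X, y, reference, n_top=5, random_state=0):
    """Hypothetical helper: one binary model per non-reference class vs the reference."""
    results = {}
    for cls in np.unique(y):
        if cls == reference:
            continue
        # Keep only samples from the current class and the reference class.
        mask = (y == cls) | (y == reference)
        X_sub = X[mask]
        y_sub = (y[mask] == cls).astype(int)  # 1 = current class, 0 = reference

        model = RandomForestClassifier(n_estimators=100, random_state=random_state)
        # Cross-validated accuracy as a simple performance estimate.
        cv_accuracy = cross_val_score(model, X_sub, y_sub, cv=5).mean()

        # Refit on the full subset to read off impurity-based importances.
        model.fit(X_sub, y_sub)
        top = np.argsort(model.feature_importances_)[::-1][:n_top]
        results[int(cls)] = {
            "cv_accuracy": float(cv_accuracy),
            "top_features": top.tolist(),
        }
    return results


# Synthetic stand-in data: 4 classes, class 0 treated as the reference (assumption).
X, y = make_classification(
    n_samples=400,
    n_features=20,
    n_informative=6,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)
results = one_vs_reference(X, y, reference=0)
for cls, res in results.items():
    print(cls, round(res["cv_accuracy"], 3), res["top_features"])
```

Each entry in `results` holds the performance and the top-ranked feature indices for one class-vs-reference comparison.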
However, this feels inelegant, so I am wondering which other approaches should be considered for this problem.