One always has to guard against outcome motivation: modeling to achieve some desired outcome. P-hacking is the poster child for this. However, feature engineering differs in principle. It's more like choosing among statistical tests that address the same question in different ways (such as with or without a normality assumption). Also, using a subset of the features reduces the likelihood of over-fitting.
The ultimate question is a judgment call: what information am I losing by dropping this feature?
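One rough way to inform that judgment is to compare held-out error with and without the feature. The sketch below is only an illustration under stated assumptions: the data are synthetic (feature 0 matters a lot, feature 1 a little, feature 2 not at all), the model is plain least squares, and the helper names `fit_ols`, `cv_mse`, and `drop_feature` are hypothetical, not from any library.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]

def cv_mse(X, y, k=5):
    """Mean squared error on held-out folds (k-fold cross-validation)."""
    n = len(y)
    errs = []
    for fold in range(k):
        test = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in test]
        y_tr = [y[i] for i in range(n) if i not in test]
        X_te = [X[i] for i in sorted(test)]
        y_te = [y[i] for i in sorted(test)]
        beta = fit_ols(X_tr, y_tr)
        errs.extend((p - t) ** 2 for p, t in zip(predict(beta, X_te), y_te))
    return sum(errs) / len(errs)

def drop_feature(X, j):
    """Copy of X with column j removed."""
    return [[v for i, v in enumerate(r) if i != j] for r in X]

# Synthetic data (an assumption made purely for illustration).
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * r[0] + 0.5 * r[1] + rng.gauss(0, 1) for r in X]

full = cv_mse(X, y)
# Increase in held-out error when each feature is dropped in turn.
loss_by_feature = {j: cv_mse(drop_feature(X, j), y) - full for j in range(3)}
for j, loss in loss_by_feature.items():
    print(f"drop feature {j}: held-out MSE changes by {loss:+.3f}")
```

A large increase suggests the feature carries information the others do not; a change near zero suggests little is lost. This is only one input to the judgment, and it should be fixed in advance rather than tuned toward a desired result, or it becomes outcome-motivated itself.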