Applied Predictive Modeling is a great book and it's what got me into machine learning.
Elements of Statistical Learning is more theoretical. There's a lot of wisdom in it, but in my opinion you don't really need to know the finer details of how regularized regression converges or the differences between the various tree-based regression approaches. I didn't take as much from it.
This is a boring recommendation, but the single book I have learned the most from is Applied Linear Statistical Models by Kutner et al. Honestly, data science is an incomplete subject on its own. I don't think the data science literature does a good job of teaching the statistical foundations: the assumptions you make when you treat each row of data as an independent sample, establishing causality, least squares and maximum likelihood, autocorrelated data, multicollinearity, ... I've met a lot of data scientists without this stats foundation who get a data set and just start hacking.