Hi guys! I recently decided to refresh some of what I learned about statistics at university and have been looking for good books. Unfortunately, the ones I learned from weren't especially practice-oriented. I'm looking for books that approach stats from a more practical, data science point of view and blend statistical concepts nicely with machine learning. Would you have anything good to recommend? Thanks!

# What are your favorite books on pure statistics?

**baptiste**#2

Not a book, but I really enjoy watching the Opinionated Lessons On Statistics series of videos. Someday I'd like to make a companion package to reproduce the examples with R code.

Two really great free ones:

http://www-bcf.usc.edu/~gareth/ISL/

https://web.stanford.edu/~hastie/ElemStatLearn/

"An Introduction to Statistical Learning" is exactly that, an intro, while "Elements" goes a lot deeper on the same concepts.

Personally, I think that Applied Predictive Modeling does a really fantastic job of balancing theory and practice, and gives you a ton of re-workable examples in R using the caret package.

**konradino**#4

I completely agree: Applied Predictive Modeling is definitely my number 1. I read "An Introduction to Statistical Learning" and use "Elements" more as a go-to reference when I need to check something, because that one is really a biggie. All 3 are really awesome, but they're more for ML purposes, whereas I'm looking for something slightly more oriented toward pure stats, covering things such as estimator theory, hypothesis testing, confidence intervals, power and sample size, etc. Anything else you can recommend?

**dlsweet**#5

It's a little harder to read and pretty theoretical but Statistical Inference by Casella & Berger has been fairly common among graduate programs for years.

Also, it's ridiculously expensive on Amazon; I got mine for ~$30 on eBay.

**Myfanwy**#6

I'm a big fan of Richard McElreath's *Statistical Rethinking*. It provides a great intro to Bayesian statistical applications, with lots of practice problems. No dedicated section on machine learning, though.

**terence**#7

I second Richard McElreath's *Statistical Rethinking*. I also like Andrew Gelman and Jennifer Hill's *Data Analysis using Regression and Multilevel/Hierarchical Models*. I believe Gelman is working on a second edition that uses Stan/RStan. No machine learning though.

Depending on whether you're looking at graduate or undergraduate level texts, there's also Angrist and Pischke's *Mostly Harmless Econometrics* and *Mastering 'Metrics*.

**pavopax**#8

*Wasserman: All of Statistics*

If you are slogging through a stats class (with Casella and Berger as your textbook, as above), then grab this and it will give you the basics in a clear manner.

+1 for *Mostly Harmless Econometrics*. IMHO one of the most underappreciated gems in this genre. Econometrics hasn't reached the same buzzword status as machine learning or data science, but it brings a really valuable perspective: asking "what would be my *ideal* data and experimental *design* for this?" as a tool for thinking about how to approach a problem with whatever actual data and information you have.

**alexpghayes**#10

Highly recommend Statistics by Freedman, Pisani and Purves as a first statistics text. Clearest and easiest to read math book I've ever found. Mathematical details are at about a high school level, and it really does wonders for intuition. Also it's great to share with family members.

**alexpghayes**#12

Yeah I had a real hard time with both Casella and Berger and Gelman until I had a semester of probability under my belt.

**Max**#14

Statistics for Experimenters (aka BHH) is one of my favorites. The first edition is a little more concise but is probably more difficult to find.

**scottbrenstuhl**#15

I just started working my way through Applied Predictive Modeling a couple weeks ago so excited to see it recommended so many times!

My recommendation might be a bit off from what you're asking for but I loved *Naked Statistics* by Charles Wheelan because reading it was the first time I felt like I really would be able to learn statistics and helped give me the motivation/courage to crack more intimidating textbooks to get into the details.

**raybuhr**#16

In addition to the classics, Introduction to Statistical Learning in R and Elements of Statistical Learning, I also recommend the newer entry from Efron and Hastie, Computer Age Statistical Inference. I haven't finished CASI -- only read a few random chapters -- but I really like how it is laid out, with a focus on not just the math, but also the history. It's a great way to introduce some of the statistics in data science and help explain how the field has grown into what it is today.

If you are not 100% focused on using R and open to learning through Python, I also highly recommend the Allen Downey books Think Stats 2 and Think Bayes. They are well written and favor teaching through code instead of just math, which was really helpful for me.

Lastly, I thoroughly enjoyed Machine Learning for Hackers and its corresponding GitHub repo. It's a whirlwind tour of the most common/basic algorithms used in data science (outside of deep learning) and is focused more on making sure you understand the high-level concepts and how to use them than making sure you understand the math. In that regard, it's a great companion book to ISLR/ESL.

**konradino**#17

Thank you so much guys! I think everyone in this group will find the best content for themselves. Personally I will start with either "All of Statistics: A Concise Course in Statistical Inference" or "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" just to get all the foundations right. I think "Statistics" by Freedman, Pisani and Purves would be a good choice too, but it has twice as many pages, so I'd rather take a shortcut here.

**mara**#18

If you're ever stuck and need something explained in a new way, I really like Introduction to Probability Theory and Statistics by Javier R Movellan.

Among other things, it's free and online! I often use it as my go-to for explaining something that might require a quick stats refresher.

**rdpeng**#19

Just want to give a plug for one of my all time favorite statistics books, Richard Royall’s *Statistical Evidence*. I read this very early in my career and it had a profound effect on the way that I think about statistics and data analysis. While it’s easy to get sucked into arguments about likelihood vs. Bayesian vs. frequentist, I think the overall message of the book is nevertheless very interesting.

**alexilliamson**#20

> While it’s easy to get sucked into arguments about likelihood vs. Bayesian vs. frequentist

Are you saying that *Statistical Evidence* avoids the Bayesian vs frequentist debate, or that it indulges it but is a worthwhile read anyway?

Either way I'll probably check it out, so thanks for the suggestion. Just want to know what I'm getting myself into.