Baseball Data Question

Hello,

I have this data and I am trying to pull out the row for a specific player, with all of their data.

        player_name spin_rate sw_strike_pc
1  Justin Verlander      2618        0.158
2    Reynaldo Lopez      2102        0.077
3      Sean Newcomb      2179        0.102
4         J.A. Happ      2334        0.120
5     Kevin Gausman      2168        0.075
6      Max Scherzer      2487        0.154
7    Mike Clevinger      2298        0.090
8      James Paxton      2283        0.137
9       Gerrit Cole      2379        0.153
10    Jake Odorizzi      2255        0.119
11    Luis Severino      2365        0.106
12      Blake Snell      2364        0.103
13    Jose Quintana      2019        0.080
14   German Marquez      2207        0.090
15     Nick Pivetta      2267        0.120

My data.frame is called bdat.

bdat["Justin Verlander",]

When I type this, I get
player_name spin_rate sw_strike_pc
NA NA NA

Any thoughts?

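bdat["Justin Verlander", ] looks for a row whose row name is "Justin Verlander", but your row names are 1 through 15, so you get a row of NAs. To match on the values in the player_name column instead, try any of these: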

bdat[bdat$player_name == "Justin Verlander", ]
library(dplyr)
filter(bdat, player_name == "Justin Verlander")
subset(bdat, player_name == "Justin Verlander")
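To see the difference between the two kinds of indexing, here is a minimal base R sketch. It rebuilds only the first three rows of the data from the question, so the small data frame is an illustration and not the full bdat. The last step, setting the row names from player_name, is an optional extra that is only an assumption about what might be convenient; it is not something the original data already does.

```r
# A few rows copied from the question's data (illustrative subset only)
bdat <- data.frame(
  player_name  = c("Justin Verlander", "Reynaldo Lopez", "Sean Newcomb"),
  spin_rate    = c(2618, 2102, 2179),
  sw_strike_pc = c(0.158, 0.077, 0.102)
)

# A character index is matched against the row names ("1", "2", "3"),
# not against the values in any column, so nothing matches
by_rowname <- bdat["Justin Verlander", ]

# A logical index built from the column picks the rows where it is TRUE
by_value <- bdat[bdat$player_name == "Justin Verlander", ]

# Optional: use the player names as row names so the original call works
rownames(bdat) <- bdat$player_name
by_name <- bdat["Justin Verlander", ]
```

Here by_rowname is a single row of NAs, while by_value and by_name both hold Verlander's row. Row names must be unique, so the last approach only works if no player appears twice.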

Awesome. Thanks for your help.

[Image: 4-seam fastball]

I just created this graph. Is there any way to label exact points with the player's name?

Also, do you think using geom_smooth() to show the general trend of the points is a good idea, or should I fit a regression line using the lm() function?

You can use geom_text() or geom_label() to add text to points. You will probably want to subset your data set to just a few points and do something like

+ geom_text(mapping = aes(x = spin_rate, y = sw_strike_pc, label = player_name), data = ReducedData)

There are parameters like vjust, hjust, nudge_x and nudge_y to tweak the text position relative to the points.
If you make a reproducible example, my suggestions would be less likely to have errors because I could test them.
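For reference, a self-contained version of that idea might look like the sketch below. It assumes the plot is spin_rate against sw_strike_pc, uses six rows copied from the data in the question, and picks the points to label with a made-up rule (swinging-strike rate above 0.15). That cutoff and the vjust value are assumptions for illustration only.

```r
library(ggplot2)

# Six rows copied from the question's data (illustrative subset only)
bdat <- data.frame(
  player_name  = c("Justin Verlander", "Reynaldo Lopez", "Max Scherzer",
                   "James Paxton", "Gerrit Cole", "Jose Quintana"),
  spin_rate    = c(2618, 2102, 2487, 2283, 2379, 2019),
  sw_strike_pc = c(0.158, 0.077, 0.154, 0.137, 0.153, 0.080)
)

# Hypothetical rule for choosing which points get a label
ReducedData <- subset(bdat, sw_strike_pc > 0.15)

p <- ggplot(bdat, aes(x = spin_rate, y = sw_strike_pc)) +
  geom_point() +
  # x and y are inherited from ggplot(); only the label mapping is new.
  # vjust = -0.8 moves each label a little above its point.
  geom_text(mapping = aes(label = player_name),
            data = ReducedData,
            vjust = -0.8)

print(p)
```

All points are drawn, but only the rows in ReducedData get a name next to them. Swapping geom_text() for geom_label() draws the same text inside a small box.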


You can add a regression line using geom_smooth(). I think the syntax is

+ geom_smooth(method = "lm")

Either method is probably fine. If you know the relationship should be linear, use method = "lm"; otherwise the default smoother will follow whatever shape the points have.
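To compare the two options on the same plot, something like the following sketch should work. It uses the spin_rate and sw_strike_pc values from the question, fits the line with lm() so the coefficients can be inspected, and then overlays both smoothers. The colour and linetype choices are arbitrary assumptions to tell the two curves apart.

```r
library(ggplot2)

# spin_rate and sw_strike_pc values copied from the question's data
bdat <- data.frame(
  spin_rate    = c(2618, 2102, 2179, 2334, 2168, 2487, 2298, 2283,
                   2379, 2255, 2365, 2364, 2019, 2207, 2267),
  sw_strike_pc = c(0.158, 0.077, 0.102, 0.120, 0.075, 0.154, 0.090, 0.137,
                   0.153, 0.119, 0.106, 0.103, 0.080, 0.090, 0.120)
)

# Fit the regression directly to get the intercept and slope
fit <- lm(sw_strike_pc ~ spin_rate, data = bdat)
print(coef(fit))

p <- ggplot(bdat, aes(x = spin_rate, y = sw_strike_pc)) +
  geom_point() +
  # Straight line from the same least-squares fit as lm() above
  geom_smooth(method = "lm", se = FALSE) +
  # Default smoother (loess for a data set this small), dashed for contrast
  geom_smooth(se = FALSE, colour = "grey40", linetype = "dashed")

print(p)
```

geom_smooth(method = "lm") only draws the line; calling lm() yourself is what gives you the coefficients and summary(fit) if you want to report them.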

