Separating Words in a Column

Hi, I was wondering if anyone can help me extract certain words from the "des" column and put them into a new column. I want just the location of the hit. For example, from the first row I would want only "right center field"; from the second, "left field".

homeruns <- data.frame(
des = as.factor(c("Trevor Story homers (37) on a fly ball to right center field. ", "Nolan Arenado homers (38) on a fly ball to left field. ",
"Max Muncy homers (35) on a fly ball to left center field. Joc Pederson scores. ",
"Cody Bellinger homers (25) on a fly ball to right center field. Max Muncy scores. ",
"Anthony Rizzo homers (25) on a fly ball to right field. ",
"Travis Shaw homers (32) on a line drive to right field. ",
"Taylor Ward homers (6) on a fly ball to left center field. Jefry Marte scores. ",
"Austin Barnes homers (4) on a fly ball to left center field. ",
"Trevor Story homers (36) on a fly ball to center field. ",
"Nolan Arenado homers (37) on a line drive to left field. ",
"Max Kepler homers (20) on a fly ball to right field. Mitch Garver scores. ",
"Franklin Barreto homers (5) on a fly ball to center field. Matt Olson scores. ",
"Willson Contreras homers (10) on a line drive to left field. Kris Bryant scores. ")),
game_date = as.factor(c("10/1/2018", "10/1/2018", "10/1/2018", "10/1/2018",
"10/1/2018", "9/30/2018", "9/30/2018",
"9/30/2018", "9/30/2018", "9/30/2018",
"9/30/2018", "9/30/2018", "9/30/2018"))
)

Like this?

homeruns <- data.frame(
    des = as.factor(c("Trevor Story homers (37) on a fly ball to right center field. ", "Nolan Arenado homers (38) on a fly ball to left field. ",
                      "Max Muncy homers (35) on a fly ball to left center field. Joc Pederson scores. ",
                      "Cody Bellinger homers (25) on a fly ball to right center field. Max Muncy scores. ",
                      "Anthony Rizzo homers (25) on a fly ball to right field. ",
                      "Travis Shaw homers (32) on a line drive to right field. ",
                      "Taylor Ward homers (6) on a fly ball to left center field. Jefry Marte scores. ",
                      "Austin Barnes homers (4) on a fly ball to left center field. ",
                      "Trevor Story homers (36) on a fly ball to center field. ",
                      "Nolan Arenado homers (37) on a line drive to left field. ",
                      "Max Kepler homers (20) on a fly ball to right field. Mitch Garver scores. ",
                      "Franklin Barreto homers (5) on a fly ball to center field. Matt Olson scores. ",
                      "Willson Contreras homers (10) on a line drive to left field. Kris Bryant scores. ")),
    game_date = as.factor(c("10/1/2018", "10/1/2018", "10/1/2018", "10/1/2018",
                            "10/1/2018", "9/30/2018", "9/30/2018",
                            "9/30/2018", "9/30/2018", "9/30/2018",
                            "9/30/2018", "9/30/2018", "9/30/2018"))
)
library(tidyverse)
homeruns %>% 
    mutate(des = as.character(des), 
           location = str_extract(des, "(?<=\\sto\\s)[^\\.]+(?=\\.)")) %>% 
    select(location)
#>              location
#> 1  right center field
#> 2          left field
#> 3   left center field
#> 4  right center field
#> 5         right field
#> 6         right field
#> 7   left center field
#> 8   left center field
#> 9        center field
#> 10         left field
#> 11        right field
#> 12       center field
#> 13         left field
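
In case the pattern above looks cryptic, here is a minimal sketch that breaks it into its three pieces on two of the descriptions from the question. The base R `sub()` line is my own addition, not something from the reply above, and the names `location` and `location_base` are just illustrative.

```r
library(stringr)

des <- c(
  "Trevor Story homers (37) on a fly ball to right center field. ",
  "Max Kepler homers (20) on a fly ball to right field. Mitch Garver scores. "
)

# (?<=\\sto\\s)  lookbehind: the match must be preceded by " to "
# [^\\.]+        one or more characters that are not a period
# (?=\\.)        lookahead: the match must be followed by a period
location <- str_extract(des, "(?<=\\sto\\s)[^\\.]+(?=\\.)")
location
#> [1] "right center field" "right field"

# Base R alternative (an assumption, not part of the reply above):
# capture the text between the first " to " and the next period.
location_base <- sub(".*?\\sto\\s([^.]+)\\..*", "\\1", des, perl = TRUE)
location_base
#> [1] "right center field" "right field"
```

One difference worth knowing: when a description has no " to ... ." part, `str_extract()` returns `NA`, while `sub()` returns the original string unchanged.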

Yes, that looks awesome. Thank you!

If you do not mind, can you help me with this issue too? I have the average dimensions of an MLB stadium and I was hoping to see how many home runs a player would have hit at this hypothetical park. I am assuming I would use an if statement, but I am not that experienced in that area. Thanks!

homeruns_all <- data.frame(stringsAsFactors=FALSE,
Location = c("right center field", "left field", "left center field",
"right field", "right field"),
hit_distance_sc = c(409, 414, 421, 429, 389),
player_name = as.factor(c("Trevor Story", "Nolan Arenado", "Max Muncy",
"Anthony Rizzo", "Travis Shaw"))
)

if (homeruns_all$Location=="right center field" & homeruns_all$hit_distance_sc > 377) {
print("1")
} else if ( homeruns_all$Location=="right field" & homeruns_all$hit_distance_sc > 328) {
print("1")
} else if ( homeruns_all$Location=="center field" & homeruns_all$hit_distance_sc > 405) {
print("1")
} else if ( homeruns_all$Location=="left field" & homeruns_all$hit_distance_sc > 375) {
print("1")
} else if ( homeruns_all$Location=="left center field" & homeruns_all$hit_distance_sc > 332) {
print("1")
} else {
print("0")
}

Is this what you want?

Note: If you have more questions not directly related to the one in your topic title, please ask them in a new topic.

library(dplyr)

homeruns_all <- data.frame(stringsAsFactors=FALSE,
                           Location = c("right center field", "left field", "left center field",
                                        "right field", "right field"),
                           hit_distance_sc = c(409, 414, 421, 429, 389),
                           player_name = as.factor(c("Trevor Story", "Nolan Arenado", "Max Muncy",
                                                     "Anthony Rizzo", "Travis Shaw"))
)

dimensions <- data.frame(stringsAsFactors = FALSE,
                         Location = c("right center field", "right field", "center field",
                                      "left field", "left center field"),
                         location_length = c(377, 328, 405, 375, 332))
homeruns_all %>% 
    left_join(dimensions, by = "Location") %>% 
    mutate(homerun = if_else(hit_distance_sc > location_length, 1, 0))
#>             Location hit_distance_sc   player_name location_length homerun
#> 1 right center field             409  Trevor Story             377       1
#> 2         left field             414 Nolan Arenado             375       1
#> 3  left center field             421     Max Muncy             332       1
#> 4        right field             429 Anthony Rizzo             328       1
#> 5        right field             389   Travis Shaw             328       1
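
A note on why the `if` / `else if` chain does not do this: `if` expects a single TRUE or FALSE, but comparing a whole column gives one value per row (since R 4.2.0 that is an error; older versions only looked at the first row and warned). The join plus `if_else()` works row by row instead. If the goal is a total per player, a rough sketch would be to sum the flag by `player_name`. The name `homerun_totals` is made up for illustration, and I use a plain character `player_name` here for simplicity.

```r
library(dplyr)

homeruns_all <- data.frame(
  stringsAsFactors = FALSE,
  Location = c("right center field", "left field", "left center field",
               "right field", "right field"),
  hit_distance_sc = c(409, 414, 421, 429, 389),
  player_name = c("Trevor Story", "Nolan Arenado", "Max Muncy",
                  "Anthony Rizzo", "Travis Shaw")
)

dimensions <- data.frame(
  stringsAsFactors = FALSE,
  Location = c("right center field", "right field", "center field",
               "left field", "left center field"),
  location_length = c(377, 328, 405, 375, 332)
)

# Flag each hit that clears the fence at its location, then add up
# the flags for each player.
homerun_totals <- homeruns_all %>%
  left_join(dimensions, by = "Location") %>%
  mutate(homerun = if_else(hit_distance_sc > location_length, 1, 0)) %>%
  group_by(player_name) %>%
  # na.rm = TRUE skips hits whose Location has no row in `dimensions`
  summarise(homeruns = sum(homerun, na.rm = TRUE), .groups = "drop")

homerun_totals
```

With the five sample rows, each player appears once, so every total is 1. On the full data set the same code should give one row per player with their count at the hypothetical park.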

Yes, that is what I want. Thank you again!
