Issue with If Else and Text

Hi everyone, I have a question about using an if/else command with text in R. I'm trying to tell R to look at a column and, if the text in that column equals the text in another column, pull a numerical figure from a third column; if it doesn't, pull a different numerical value. For example, I'm working with football data and want R to detect whether "Minnesota Vikings" is present in the column I specify. The syntax I'm using, with a mutate command to create a new column, is below:

mutate(cover=(ifelse("team_home"=="Longform Name", spread, "spread_favorite")))

The problem is that R doesn't detect that the team names match; it treats every row as a non-match, even though tidyverse is loaded. I checked in Excel, and according to that the text does match (which rules out spacing issues or anything like that). Any suggestions or ideas? I can manipulate my original data in Excel to get the output I want without having to match anything, but I would prefer to figure out why R isn't detecting a match. Thanks!

Hi, and welcome!

Could you post a short reproducible example, called a reprex, to make the issue concrete? It's possible to reverse engineer data from the description, but everyone's lazy, especially today :grin:

So here's the code I'm using:

install.packages("dplyr")
install.packages("tidyverse")
library(dplyr)
library(tidyverse)

NFLData<-spreadspoke_scores
view(NFLData)

NFLData %>%
inner_join(Longform) %>%
filter(schedule_season==2017, schedule_week=="17") %>%
mutate(Spread=spread_favorite*-1) %>%
filter(Spread>9.5) %>%
mutate(Winner=score_home-score_away) %>%
mutate(cover=(ifelse("team_home"=="Longform Name", spread, spread_favorite))) %>%
view()

When I view it, the output is:


Does that help?

Second screenshot, since the system only lets me do one per post:

Thanks, that gets us part-way there, but doesn't provide NFLData, leaving me to google spreadspoke_scores and wonder if it comes from here. Screenshots are seldom helpful. Running your code through reprex shows the data structure and you can just cut and paste it up to about 50K characters. (Generally, it's best to comment out install.packages() lines.)

But let's see what we can do with the code.

First off, in debugging, it's helpful to assign code blocks to an object

str_df <- NFLData %>%
inner_join(Longform) %>%
filter(schedule_season==2017, schedule_week=="17") %>%
mutate(Spread=spread_favorite*-1) %>%
filter(Spread>9.5) %>%
mutate(Winner=score_home-score_away) 

This gives you an object you can inspect with str() to see which columns are actually available to the line of code that is giving you trouble.
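A minimal sketch of that kind of inspection, using a hand-built stand-in for str_df. The two rows and the column names are assumptions copied from the sample data posted later in this thread; the real object has many more columns.

```r
# Stand-in for str_df (illustrative only; values taken from the thread's sample rows).
str_df <- data.frame(
  team_home       = c("Minnesota Vikings", "New England Patriots"),
  spread_favorite = c(-13.5, -17),
  Spread          = c(13.5, 17),
  stringsAsFactors = FALSE
)

# Column names and types at a glance.
str(str_df)

# Check that each name the mutate() line refers to is really a column.
# Names are case-sensitive, so "spread" and "Spread" are different things.
needed  <- c("team_home", "spread", "Spread", "spread_favorite")
present <- needed %in% names(str_df)
setNames(present, needed)
```

Any name that comes back FALSE is not a column, and R will look for it elsewhere (for example, among loaded functions).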

For your mutate line

mutate(cover=(ifelse("team_home"=="Longform Name", spread, spread_favorite)))

to create cover when you

str_df  %>%  mutate(cover=(ifelse("team_home"=="Longform Name", spread, spread_favorite)))

there must be variables named spread and spread_favorite present to provide the results of a successful test and a failed test, respectively. Assuming you have those, that leaves the test part of ifelse

"team_home"=="Longform Name"

I can eyeball that this test will always fail, and return spread_favorite, never spread. How?

"team_home"=="Longform Name"
#> [1] FALSE

Created on 2019-11-28 by the reprex package (v0.3.0)

just as the test would always pass if it were

"team_home"=="team_home"
#> [1] TRUE


If, on the other hand, team_home and Longform Name are variables in str_df, the test can either pass or fail.

team_home <- "Buffalo Bills"
long_name <- "Buffalo Bills"
team_home == long_name
#> [1] TRUE
long_name <- "Chicago Bears"
team_home == long_name
#> [1] FALSE


(Do yourself a favor and rename Longform Name to longform_name).

Long didactic answer boils down to

lose the quotes
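To see the difference inside mutate(), here is a small sketch rather than the original pipeline. The data frame `games` is a hypothetical two-row stand-in, and it assumes the name column has already been renamed to longform_name and that the numeric column is Spread (capital S), as created earlier in the pipeline.

```r
library(dplyr)

# Hypothetical stand-in for the joined, filtered data.
games <- data.frame(
  team_home       = c("Minnesota Vikings", "New England Patriots"),
  longform_name   = c("Minnesota Vikings", "New England Patriots"),
  spread_favorite = c(-13.5, -17),
  Spread          = c(13.5, 17),
  stringsAsFactors = FALSE
)

# Quoted: two string literals are compared, giving a single FALSE.
# ifelse() then returns one value, which mutate() recycles to every row.
quoted <- games %>%
  mutate(cover = ifelse("team_home" == "longform_name", Spread, spread_favorite))

# Unquoted: the two columns are compared row by row.
unquoted <- games %>%
  mutate(cover = ifelse(team_home == longform_name, Spread, spread_favorite))

quoted$cover    # first value of spread_favorite, repeated
unquoted$cover  # Spread for every matching row
```

Note that the quoted version does not even return each row's own spread_favorite: the test has length one, so every row gets the first row's value.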

Sorry, first-time user; I thought screenshots would help more. You are correct about the dataset I'm using. I tried removing the quotes as suggested but now get this error:

Error in rep(yes, length.out = length(ans)) :
attempt to replicate an object of type 'closure'

The entire code I am using, including what I'm installing, is:

install.packages("dplyr")
install.packages("tidyverse")
library(dplyr)
library(tidyverse)

NFLData<-spreadspoke_scores
Longform<-Longform
view(NFLData)

NFLData %>%
inner_join(Longform) %>%
filter(schedule_season==2017, schedule_week=="17") %>%
mutate(Spread=spread_favorite*-1) %>%
filter(Spread>9.5) %>%
mutate(Winner=score_home-score_away) %>%
mutate(cover=(ifelse(team_home==longform_name, spread, ""))) %>%
view()

I also re-named the variable as suggested but can't seem to get it to pull what I need. Can confirm that both the columns in question are variables. Any thoughts?
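One plausible reading of this error, offered as a guess since the full session isn't shown: the pipeline creates a column called Spread (capital S), so there is no column called spread, and the name falls through to the function tidyr::spread() that tidyverse attaches. A function is a "closure", and ifelse() cannot replicate one. The sketch below reproduces this in base R with a hypothetical stand-in function, so it runs without tidyverse; it also shows that using "" as the fallback turns a numeric result into text.

```r
# Hypothetical stand-in for tidyr::spread(); any function behaves the same way here.
spread <- function(...) NULL

matches <- c(TRUE, FALSE)

# `spread` is a function, not a vector, so ifelse() fails when it tries to replicate it.
result <- tryCatch(
  ifelse(matches, spread, ""),
  error = function(e) conditionMessage(e)
)
result  # the captured error message

# With a real numeric vector the call works, but the fallback matters:
Spread <- c(13.5, 17)
as_text   <- ifelse(matches, Spread, "")        # "" coerces the result to character
as_number <- ifelse(matches, Spread, NA_real_)  # NA_real_ keeps it numeric
class(as_text)
class(as_number)
```

If that reading is right, writing Spread instead of spread in the mutate() line, and NA instead of "", should avoid both problems.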


To avoid unnecessary back and forths and get to a solution quicker, can you please share a small part of the data set in a copy-paste friendly format?

In case you don't know how to do it, there are many options, which include:

  1. If you have stored the data set in some R object, dput function is very handy.

  2. In case the data set is in a spreadsheet, check out the datapasta package. Take a look at this link.
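As a rough illustration of the first option, here is how dput() can be used on a small data frame. The object `games` is a made-up stand-in with column names borrowed from this thread.

```r
# Hypothetical small data frame standing in for the real data.
games <- data.frame(
  team_home       = c("Minnesota Vikings", "New England Patriots"),
  spread_favorite = c(-13.5, -17),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object; paste that text into your post.
# head() keeps the sample small.
dput(head(games, 2))

# Round trip: the printed text can be evaluated to recreate the data frame.
txt      <- capture.output(dput(games))
restored <- eval(parse(text = txt))
all.equal(games, restored)
```

Anyone reading the post can then assign the pasted text to a name and work with the same data.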

Does this help? It doesn't look very user friendly to me but I tried other methods and came up empty. Clearly I'm not good at this :frowning:

~schedule_date, ~schedule_season, ~schedule_week, ~team_home, ~team_away, ~stadium, ~team_favorite_id, ~spread_favorite, ~Spread, ~over_under_line, ~weather_detail, ~weather_temperature, ~weather_wind_mph, ~weather_humidity, ~score_home, ~score_away, ~stadium_neutral, ~schedule_playoff, ~game_id, ~Home.Away, ~Winner, ~longform_name, ~cover,
"2017-12-31", 2017, "17", "Minnesota Vikings", "Chicago Bears", "U.S. Bank Stadium", "MIN", -13.5, 13.5, 38.5, "DOME", 72, 0, NA, 23, 10, FALSE, FALSE, NA, TRUE, 13, "Minnesota Vikings", NA,
"2017-12-31", 2017, "17", "New England Patriots", "New York Jets", "Gillette Stadium", "NE", -17, 17, 43.5, NA, 13, 14, NA, 26, 6, FALSE, FALSE, NA, TRUE, 20, "New England Patriots", NA

Thanks. Could you run

str_df <- NFLData %>%
inner_join(Longform) %>%
filter(schedule_season==2017, schedule_week=="17") %>%
mutate(Spread=spread_favorite*-1) %>%
filter(Spread>9.5) %>%
mutate(Winner=score_home-score_away) 

and show the results of

str(str_df)

Sure. Results are:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 2 obs. of 22 variables:
 $ schedule_date      : POSIXct, format: "2017-12-31" "2017-12-31"
 $ schedule_season    : num 2017 2017
 $ schedule_week      : chr "17" "17"
 $ team_home          : chr "Minnesota Vikings" "New England Patriots"
 $ team_away          : chr "Chicago Bears" "New York Jets"
 $ stadium            : chr "U.S. Bank Stadium" "Gillette Stadium"
 $ team_favorite_id   : chr "MIN" "NE"
 $ spread_favorite    : num -13.5 -17
 $ Spread             : num 13.5 17
 $ over_under_line    : num 38.5 43.5
 $ weather_detail     : chr "DOME" NA
 $ weather_temperature: num 72 13
 $ weather_wind_mph   : num 0 14
 $ weather_humidity   : num NA NA
 $ score_home         : num 23 26
 $ score_away         : num 10 6
 $ stadium_neutral    : logi FALSE FALSE
 $ schedule_playoff   : logi FALSE FALSE
 $ game_id            : chr NA NA
 $ Home-Away          : logi TRUE TRUE
 $ Winner             : num 13 20
 $ longform_name      : chr "Minnesota Vikings" "New England Patriots"


Any luck? Just don't want this to get lost :slight_smile:

Thanks. In addition to @andresrcs' suggestion, you can make it easier on the eyes just by enclosing the cut-and-paste between lines of triple backticks (``` with no spaces, on the first and last lines).

your output here

What we have confirmed is that both team_home and longform_name are available to be passed to ifelse.

As before, we know that we can perform an equality test if that's the case

team_home <- "Buffalo Bills"
long_name <- "Buffalo Bills"
team_home == long_name
#> [1] TRUE
long_name <- "Chicago Bears"
team_home == long_name
#> [1] FALSE

Created on 2019-11-29 by the reprex package (v0.3.0)

What I'd suggest next is taking str_df and using select

toy_example <- str_df %>% select(team_home, longform_name, Spread, spread_favorite)

and use that to test ifelse

toy_example %>% mutate(new_var = ifelse(team_home == longform_name, Spread, spread_favorite))
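Here is a self-contained sketch of that test that can be run as is. The stand-in for str_df is built by hand from the two rows posted above, and it assumes the name column is longform_name, as in the str() output. Storing the comparison in its own column (here called same_team, a name chosen for illustration) makes it easy to see what ifelse() is being given.

```r
library(dplyr)

# Hand-built stand-in for str_df, using values from the rows posted in this thread.
str_df <- data.frame(
  team_home       = c("Minnesota Vikings", "New England Patriots"),
  team_away       = c("Chicago Bears", "New York Jets"),
  longform_name   = c("Minnesota Vikings", "New England Patriots"),
  spread_favorite = c(-13.5, -17),
  Spread          = c(13.5, 17),
  stringsAsFactors = FALSE
)

# Keep only the columns the test needs.
toy_example <- str_df %>%
  select(team_home, longform_name, Spread, spread_favorite)

# Compute the test first, then use it in ifelse().
checked <- toy_example %>%
  mutate(
    same_team = team_home == longform_name,
    new_var   = ifelse(same_team, Spread, spread_favorite)
  )

checked
```

Both rows match here, so new_var should carry the Spread values; a row where the names differ would get spread_favorite instead.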

Thanks, I think that finally debugged it

