New Variable in Dataframe with pre-conditions

Hi guys,
I am quite new to R and probably have a tricky question for you that might also challenge some more advanced coders.

I want to create a new variable that shows what percentage of each person's friends come from that person's own country.
For "Alex", the new column should show 51%, because 51% of his friends come from his own country, in this case "Austria".

For Bob, the column should show 10%, because 10% of his friends come from his own country (Denmark).

For Felicia, it should show 0%, because none of the countries her friends come from matches her own country.

| Person Name | Person Country | Most friends come from | Most friends come from percent | Second most friends come from | Second most friends come from percent | Third most friends come from | Third most friends come from percent |
|---|---|---|---|---|---|---|---|
| Alex | Austria | Austria | 51,00% | Sweden | 12,00% | Norway | 5,00% |
| Bob | Denmark | United States | 14,00% | Denmark | 10,00% | Spain | 5,00% |
| Charlie | France | France | 30,00% | Spain | 15,00% | United States | 12,00% |
| Diana | The Netherlands | Spain | 32,00% | Sweden | 28,00% | United States | 10,00% |
| Eva | Finland | Sweden | 60,00% | United States | 20,00% | Finland | 10,00% |
| Felicia | Germany | Austria | 24,00% | United States | 23,00% | Spain | 17,00% |

I would really appreciate it if you could help me with these lines of code because as a beginner I am quite overwhelmed with this task.

Thank you very much in advance,
Lukas

Hi Lukas,

Welcome! In the future, please share your data in a reproducible form (for example, the output of dput()), because it is not obvious how to get a pasted table like yours into R. I was able to import it with the datapasta package, but that will not always work. Learn about reproducible examples here:

Anyway, on to the problem. I first had to convert the percentage text to numbers. The key step after that is case_when(), which works like a chain of if-else statements: it checks the conditions in order and returns the value for the first one that is true.
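
In case case_when is new to you, here is a minimal sketch of how it behaves. The vector x and the labels are made up for illustration and have nothing to do with your data:

```r
library(dplyr)

# Hypothetical toy vector, only used to show how case_when works
x <- c(5, 15, 25)

# Conditions are checked top to bottom for each element;
# the first one that is TRUE decides the result.
size <- case_when(
  x > 20 ~ "high",
  x > 10 ~ "medium",
  TRUE   ~ "low"    # fallback when nothing above matched
)

size
# "low" "medium" "high"
```

Note that 25 is also greater than 10, but it gets "high" because that condition comes first. The solution for your data below relies on the same first-match rule.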

library(tidyverse)


friend_country <- data.frame(
   stringsAsFactors = FALSE,
   Person.Name = c("Alex","Bob","Charlie",
                   "Diana","Eva","Felicia"),
   Person.Country = c("Austria","Denmark",
                      "France","The Netherlands","Finland",
                      "Germany"),
   Most.friends.come.from = c("Austria",
                              "United States","France","Spain","Sweden",
                              "Austria"),
   Most.friends.come.from.percent = c("51,00%","14,00%",
                                      "30,00%","32,00%","60,00%","24,00%"),
   Second.most.friends.come.from = c("Sweden","Denmark",
                                     "Spain","Sweden","United States",
                                     "United States"),
   Second.most.friends.come.from.percent = c("12,00%","10,00%",
                                             "15,00%","28,00%","20,00%","23,00%"),
   Third.most.friends.come.from = c("Norway","Spain",
                                    "United States","United States",
                                    "Finland","Spain"),
   Third.most.friends.come.from.percent = c("5,00%","5,00%",
                                            "12,00%","10,00%","10,00%","17,00%")
)

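# Convert strings such as "51,00%" to numbers (51).
# parse_number() expects "." as the decimal mark by default and would read
# "51,00%" as 5100, so the decimal comma is replaced with a point first.
conv_percent <- function(x) {
   str_replace_all(x, ",", ".") %>% parse_number()
}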

friend_country %>%
   mutate(
      # Convert the three percent columns from text to numbers
      Most.friends.come.from.percent = conv_percent(Most.friends.come.from.percent),
      Second.most.friends.come.from.percent = conv_percent(Second.most.friends.come.from.percent),
      Third.most.friends.come.from.percent = conv_percent(Third.most.friends.come.from.percent),
      # Take the percentage from whichever slot matches the person's own country;
      # the first matching condition wins, and 0 is the fallback when none match
      Self.country.percent = case_when(
         Person.Country == Most.friends.come.from ~ Most.friends.come.from.percent,
         Person.Country == Second.most.friends.come.from ~ Second.most.friends.come.from.percent,
         Person.Country == Third.most.friends.come.from ~ Third.most.friends.come.from.percent,
         TRUE ~ 0
      )
   )
#>   Person.Name  Person.Country Most.friends.come.from
#> 1        Alex         Austria                Austria
#> 2         Bob         Denmark          United States
#> 3     Charlie          France                 France
#> 4       Diana The Netherlands                  Spain
#> 5         Eva         Finland                 Sweden
#> 6     Felicia         Germany                Austria
#>   Most.friends.come.from.percent Second.most.friends.come.from
#> 1                             51                        Sweden
#> 2                             14                       Denmark
#> 3                             30                         Spain
#> 4                             32                        Sweden
#> 5                             60                 United States
#> 6                             24                 United States
#>   Second.most.friends.come.from.percent Third.most.friends.come.from
#> 1                                    12                       Norway
#> 2                                    10                        Spain
#> 3                                    15                United States
#> 4                                    28                United States
#> 5                                    20                      Finland
#> 6                                    23                        Spain
#>   Third.most.friends.come.from.percent Self.country.percent
#> 1                                    5                   51
#> 2                                    5                   10
#> 3                                   12                   30
#> 4                                   10                    0
#> 5                                   10                   10
#> 6                                   17                    0

Created on 2021-12-20 by the reprex package (v2.0.1)
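
One possible variation, in case you want to skip the manual comma replacement: parse_number() takes a locale argument that can declare "," as the decimal mark, and across() can convert all columns ending in "percent" in one go. This is only a sketch under those assumptions, not a tested drop-in for your real data. To keep it runnable on its own, it uses a cut-down copy of the example (three of the six rows) stored under the made-up names fc and fc_new:

```r
library(dplyr)
library(readr)

# Cut-down copy of the example data (Alex, Bob, Felicia) so the sketch is self-contained
fc <- data.frame(
  stringsAsFactors = FALSE,
  Person.Name = c("Alex", "Bob", "Felicia"),
  Person.Country = c("Austria", "Denmark", "Germany"),
  Most.friends.come.from = c("Austria", "United States", "Austria"),
  Most.friends.come.from.percent = c("51,00%", "14,00%", "24,00%"),
  Second.most.friends.come.from = c("Sweden", "Denmark", "United States"),
  Second.most.friends.come.from.percent = c("12,00%", "10,00%", "23,00%"),
  Third.most.friends.come.from = c("Norway", "Spain", "Spain"),
  Third.most.friends.come.from.percent = c("5,00%", "5,00%", "17,00%")
)

fc_new <- fc %>%
  mutate(
    # Parse every column whose name ends in "percent", reading "," as the decimal mark
    across(
      ends_with("percent"),
      ~ parse_number(.x, locale = locale(decimal_mark = ","))
    ),
    # Same first-match logic as above, now on the numeric columns
    Self.country.percent = case_when(
      Person.Country == Most.friends.come.from ~ Most.friends.come.from.percent,
      Person.Country == Second.most.friends.come.from ~ Second.most.friends.come.from.percent,
      Person.Country == Third.most.friends.come.from ~ Third.most.friends.come.from.percent,
      TRUE ~ 0
    )
  )

fc_new$Self.country.percent
# expected: 51, 10 and 0 for Alex, Bob and Felicia
```

Assigning the result (here to fc_new) keeps the new column for later steps, whereas the reprex above only prints it.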
