How can I make a new variable based on a percentage of four other variables?

I would like to compute country-specific weights based on the following elements: 40% population, 40% total GDP, 10% number of past applications and 10% unemployment rate.

My data looks like this:
> head(d)
Country Population (peop~ Population (%): `` Total GDP (millio~ Total GDP (%) Number of past appli~ Number of past app~ `Unemployment rate (% ~
1 Austria 8822267 0.0168 113. 0.0324 196875 0.0415 4.9
2 Belgium 11398589 0.0216 108. 0.0311 126520 0.0266 6
3 Bulgaria 7050034 0.0134 103. 0.0296 57120 0.0120 5.2
4 Croatia 4105493 0.00780 100. 0.0289 4660 0.000981 8.4
5 Cyprus 864236 0.00164 103. 0.0297 19315 0.00407 8.4
6 Czech Re~ 10610055 0.0202 115 0.0330 7270 0.00153 2.2

Please give more details.

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide to see how to create one:

Hi,

Thank you for responding.

Does this help?

d1 <- tibble::tribble(
    ~Country, ~`Population.(people):`, ~`Population.(%):`, ~`Total.GDP.(million.euro)`, ~`Total.GDP.(%)`, ~`Number.of.past.applications.(number):`, ~`Number.of.past.applications.(%):`, ~`Unemployment.rate.(%.of.active.population)`,
   "Austria",           "8 822 267",         "1,68 %",                      1127,       "3,24 %",                              "196 875",                          "4,15 %",                                          49,
   "Belgium",          "11 398 589",         "2,16 %",                      1083,       "3,11 %",                              "126 520",                          "2,66 %",                                           6,
  "Bulgaria",           "7 050 034",         "1,34 %",                      1029,       "2,96 %",                               "57 120",                          "1,20 %",                                          52,
   "Croatia",           "4 105 493",         "0,78 %",                      1005,       "2,89 %",                                "4 660",                          "0,10 %",                                          84,
    "Cyprus",             "864 236",         "0,16 %",                      1034,       "2,97 %",                               "19 315",                          "0,41 %",                                          84
  )
head(d1)

I don't really have an error or any packages to refer to, as I don't know how to proceed to create this variable.

I highly appreciate all suggestions on how to proceed and would be happy to provide any further information you would need in order to help me with this.

Thanks again!

Hi,

Thank you for responding.

I would like to create a "distribution key" based on country-specific weights in order to figure out how many asylum applications each European country should receive according to a proportional allocation rule proposed by the European Commission.

Like this:


(Source: https://ec.europa.eu/anti-trafficking/sites/antitrafficking/files/communication_on_the_european_agenda_on_migration_en.pdf)

Please let me know if you can think of a way to create this variable in R.

I look forward to hearing from you and will gladly provide you with more details (please let me know what sort of details you need) :blush:

Thanks again!

There are some problems with your data: you are using non-syntactic column names (which could cause you problems later), and you have numeric variables stored as text. Also, you haven't told us the specific formula for the variable you want to create, but here is an example that you could easily adapt to your actual application.

library(tidyverse)
# Sample data
d1 <- tibble::tribble(
    ~Country, ~'Population.(people):', ~'Population.(%):', ~'Total.GDP.(million.euro)', ~'Total.GDP.(%)', ~'Number.of.past.applications.(number):', ~'Number.of.past.applications.(%):', ~'Unemployment.rate.(%.of.active.population)',
    "Austria",           "8 822 267",         "1,68 %",                      1127,       "3,24 %",                              "196 875",                          "4,15 %",                                          49,
    "Belgium",          "11 398 589",         "2,16 %",                      1083,       "3,11 %",                              "126 520",                          "2,66 %",                                           6,
    "Bulgaria",           "7 050 034",         "1,34 %",                      1029,       "2,96 %",                               "57 120",                          "1,20 %",                                          52,
    "Croatia",           "4 105 493",         "0,78 %",                      1005,       "2,89 %",                                "4 660",                          "0,10 %",                                          84,
    "Cyprus",             "864 236",         "0,16 %",                      1034,       "2,97 %",                               "19 315",                          "0,41 %",                                          84
)

# Replace non syntactic names
names(d1) <- c("Country", "Population", "Population_pct", "Total_GDP", "Total_GDP_pct", "Past_applications", "Past_applications_pct", "Unemployment")

d1 %>% 
    mutate(Population = parse_number(str_remove_all(Population, " ")), # Drop the space thousands separators, then convert text to numeric
           Past_applications = parse_number(str_remove_all(Past_applications, " ")), # parse_number() alone would stop at the first space ("8 822 267" -> 8)
           # Weighted sum of the raw columns; they are on very different scales, so adapt this to your actual formula
           Key = Population * 0.4 + Total_GDP * 0.4 + Past_applications * 0.1 + Unemployment * 0.1) %>% 
    select(Country, Key)

Created on 2019-08-14 by the reprex package (v0.3.0.9000)
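
A possible variation, offered only as a sketch: since the four inputs are on very different scales (people, million euro, counts, a rate), one option is to turn each column into a share of its column total before applying the 40/40/10/10 weights, so that the keys sum to 1. The helper names `to_number()` and `share()` are made up for this illustration, and treating a higher unemployment rate as increasing the key is an assumption, not something settled in this thread. The thousands separators here are spaces, which `parse_number()` does not skip under its default locale, so the sketch strips them first. Also note that the sample rows seem to have lost their decimal marks compared with the `head(d)` printout (Austria shows 4.9 there but 49 here, while Belgium's 6 is unchanged), so check those columns before trusting any result.

```r
library(dplyr)
library(tibble)

# Sample rows as posted in the thread (text columns kept as text)
d1 <- tribble(
  ~Country,   ~Population,  ~Total_GDP, ~Past_applications, ~Unemployment,
  "Austria",  "8 822 267",  1127,       "196 875",          49,
  "Belgium",  "11 398 589", 1083,       "126 520",          6,
  "Bulgaria", "7 050 034",  1029,       "57 120",           52,
  "Croatia",  "4 105 493",  1005,       "4 660",            84,
  "Cyprus",   "864 236",    1034,       "19 315",           84
)

# Hypothetical helper: remove space separators, then convert to numeric
to_number <- function(x) as.numeric(gsub(" ", "", x, fixed = TRUE))

# Hypothetical helper: each value as a fraction of the column total
share <- function(x) x / sum(x)

keys <- d1 %>%
  mutate(
    Population = to_number(Population),
    Past_applications = to_number(Past_applications),
    # Assumed weights from the question: 40% / 40% / 10% / 10%
    Key = 0.4 * share(Population) +
          0.4 * share(Total_GDP) +
          0.1 * share(Past_applications) +
          0.1 * share(Unemployment)
  ) %>%
  select(Country, Key)

keys
```

Because every share column sums to 1 and the weights sum to 1, the resulting keys also sum to 1, so multiplying `Key` by a total number of applications would give each country's proportional allocation.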


This is perfect!

I got it all figured out now :grin:

Thanks a lot :relaxed:


