Help on - Loop - argument is of length zero

I'm not sure I understand exactly how you're using the aforementioned filter here.

((df1[i, "conditions"]) == "L")&
  (df1[i, "work_location"] == df2[j, "Base.Country"])&
  else if ((df1[i, "conditions"]) == "E""] then [(df1[i, "work_location"] == df2[j, "work_location])

Are you looking for something like the following? (...'s are the other filters)

... &
(((df1[i, "conditions"] == "L") & (df1[i, "work_location"] == df2[j, "Base.Country"])) | ((df1[i, "conditions"] == "E") & (df1[i, "work_location"] == "IND"))) &
...
If so, you can try the following code.
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

df1 <- data.frame(Job.cat = c(2L, 3L, 4L, 5L, 6L),
                  skill = as.factor(c("Art", "science", "maths", "maths", "sciencce")),
                  conditions = as.factor(c("L", "E", "L", "L", "L")),
                  work_location = as.factor(c("IND", "NZ", "CHI", "SWT", "IND")),
                  Date.created = as.factor(c("1/30/2016", "2/27/2017", "3/20/2018", "4/22/2017", "5/26/2018")))
df2 <- data.frame(Job.cat = c(2L, 3L, 4L, 5L, 3L, 2L, 3L, 3L, 5L, 5L, 5L),
                  Base.Country = as.factor(c("IND", "NZ", "NZ", "SWT", "IND", "IND", "IND", "IND", "SWT", "SWT", "SWT")),
                  Date.Available = as.factor(c("1/30/2016", "7/22/2017", "10/30/2018", "12/26/2017", "6/25/2016", "2/21/2016", "12/21/2015", "10/21/2015", "1/22/2017", "7/22/2017", "11/22/2016")),
                  skill = as.factor(c("Art", "science", "maths", "maths", "sciencce", "maths", "maths", "sciencce", "maths", "maths", "sciencce")))

df1$Date.created <- mdy(df1$Date.created)
df2$Date.Available <- mdy(df2$Date.Available)

# To view the lesser and greater 6 month range
df1$lessmonth <- as.Date(df1$Date.created) %m-% months(6)
df1$greatmonth <- as.Date(df1$Date.created) %m+% months(6)

for (i in 1:nrow(df1)){
  count <- 0
  for (j in 1:nrow(df2)){
    # as.character() guards the country comparison: `==` on two factors with
    # different level sets ("CHI" appears only in df1) throws an error.
    if((df1[i, "skill"] == df2[j, "skill"])&
       (df1[i, "Job.cat"] == df2[j, "Job.cat"] | df1[i, "Job.cat"] +1 == df2[j, "Job.cat"])&
       (((df1[i, "conditions"] == "L") & (as.character(df1[i, "work_location"]) == as.character(df2[j, "Base.Country"]))) | ((df1[i, "conditions"] == "E") & (df1[i, "work_location"] == "IND"))) &
       (df2[j, "Date.Available"] >= df1[i, "lessmonth"] & df2[j, "Date.Available"] <= df1[i, "greatmonth"]))
      count <- count + 1
  }
  print(count)
  df1[i, "out"] <- count
}
#> [1] 1
#> [1] 0
#> [1] 0
#> [1] 2
#> [1] 0

Hi @Yarnabrina,

Thanks for the reply and the code; it's similar to what I need. Please find the conditions below. They differ depending on whether the condition is L or E.

1. For example, the code takes the first row in df1, say Job.cat is 2, skill is Art, conditions is L and work location is NL, then goes to df2 and checks whether these conditions are satisfied. If yes, the new count is added to df1. In this situation only NL should be selected, since the condition is Local. But if in the next row the work location is India, then only India should be selected under Base.Country.

2. Now say the next row of df1 has Job.cat 2, skill science, conditions E and work location NL. The code goes to df2 and checks whether the conditions are satisfied. In this situation any country under Base.Country can be part of the count; there is no restriction on the base country.
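Read literally, the two cases above amount to a single location test: under L the country in df2 must equal the work location, and under E any country is accepted. A minimal sketch of that rule, assuming this reading is right; `location_ok` is a hypothetical helper name, not something from the code posted earlier:

```r
# Hypothetical helper: TRUE where a df2 country is acceptable for one df1 row.
# Assumption: "L" requires an exact country match, "E" accepts any country.
location_ok <- function(condition, work_location, base_country) {
  # as.character() keeps the comparison safe if the inputs are factors.
  condition <- as.character(condition)
  local_match <- condition == "L" &
    as.character(work_location) == as.character(base_country)
  any_country <- condition == "E"
  local_match | any_country
}

location_ok("L", "NL", c("NL", "IND"))  # TRUE FALSE
location_ok("E", "NL", c("NL", "IND"))  # TRUE TRUE
```

Note that this differs from the loop posted earlier in the thread, which only counts an E row when its work location is "IND".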

First of all, English isn't my native language, so I find it difficult to follow long descriptions. I'll guess that if((df1[i, "Job.cat"] == 2) & (df1[i, "skill"] == "Art") & (df1[i, "conditions"] == "L") & (df1[i, "work_location"] == "NL")) represents the conditions you're describing in point 1, but as far as I can see, "NL" never appears as a work_location, and I really don't understand what you're saying about df2 immediately afterwards. So, you'll have to complete the rest.

Second, I already provided you with a working solution to your original question, and a modification based on the further conditions. Surely you can generalise from that to add more conditions as you require?

Please don't expect us to code for you. If you're facing problems, people here will surely try to point out the mistakes. But in the end, you'll have to solve your problems yourself.

I hope you understand my point and will take it sportingly. Good luck!


Hi @Yarnabrina,

Thank you so much for all the support and guidance. It was very helpful, and I really appreciate your time.
Yes, I understand your point. I don't expect people to code for me; that's why I pasted my code earlier, so that others could help me find the mistakes. However, I wasn't sure I had communicated my problem clearly, so I posted a longer description than expected.

Hi all,

Thanks for helping out. I tested the R code with the for loop on a few records and it works fine. However, with the large data it runs for hours. I understand that I could use the dplyr filter instead; however, when I try it I get this error:

Error in rank(x, ties.method = "first", na.last = "keep") :
argument "x" is missing, with no default
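One common way to speed up a nested loop like the one earlier in this thread, without bringing in dplyr, is to keep the loop over df1 but replace the inner loop over df2 with vectorised comparisons. The following is only a sketch under assumptions, not the solution the original poster ended up with (the thread doesn't show it). `count_matches`, `toy_df1` and `toy_df2` are hypothetical names, and the function assumes df1 already carries the `lessmonth`/`greatmonth` Date columns built earlier:

```r
# Hypothetical helper: for each row of df1, count the matching rows of df2.
# Assumes lessmonth, greatmonth and Date.Available are already Date columns.
count_matches <- function(df1, df2) {
  # Convert once, so factors with different level sets compare safely.
  skill2   <- as.character(df2$skill)
  country2 <- as.character(df2$Base.Country)

  vapply(seq_len(nrow(df1)), function(i) {
    cond <- as.character(df1$conditions[i])
    loc  <- as.character(df1$work_location[i])
    # Same four filters as the loop above, evaluated for all df2 rows at once.
    hit <- skill2 == as.character(df1$skill[i]) &
      (df2$Job.cat == df1$Job.cat[i] | df2$Job.cat == df1$Job.cat[i] + 1L) &
      ((cond == "L" & country2 == loc) | (cond == "E" & loc == "IND")) &
      df2$Date.Available >= df1$lessmonth[i] &
      df2$Date.Available <= df1$greatmonth[i]
    sum(hit)
  }, integer(1))
}

# Toy data: a cut-down version of rows 1 and 4 of the earlier df1.
toy_df1 <- data.frame(
  Job.cat = c(2L, 5L), skill = c("Art", "maths"),
  conditions = c("L", "L"), work_location = c("IND", "SWT"),
  lessmonth  = as.Date(c("2015-07-30", "2016-10-22")),
  greatmonth = as.Date(c("2016-07-30", "2017-10-22"))
)
toy_df2 <- data.frame(
  Job.cat = c(2L, 5L, 5L, 5L),
  Base.Country = c("IND", "SWT", "SWT", "SWT"),
  Date.Available = as.Date(c("2016-01-30", "2017-12-26", "2017-01-22", "2017-07-22")),
  skill = c("Art", "maths", "maths", "maths")
)

toy_df1$out <- count_matches(toy_df1, toy_df2)
toy_df1$out  # 1 2
```

This still does nrow(df1) passes over df2, so it is a constant-factor improvement rather than a change in complexity; for very large inputs a join on skill followed by filtering may be worth trying instead.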

Hi Guys,

Thank you so much. I finally resolved it and got what I was looking for. It was a good learning exercise.

Regards,
sri
