I have found the problem. I was inspecting each version of the data frame by clicking on it in the RStudio environment pane. dfTransformed needs to be merged with dfEmployeeProfile to make dfTransformedWithEmployeeProfile. The issue, I believe, lies with this step:
# Step 3: Obtain an Employee Profile by fetching all data except for question related data.
dfEmployeeProfile = df %>%
  dplyr::select(-c("Question Body", "Questionnaire Answer", "Question", "Questionnaire")) %>%
  group_by("Employee ID") %>%
  filter(row_number() == n()) %>% # keep last row
  ungroup()
When I look at the table this produces, it contains only the last row of the original data frame.
Is there something wrong with this code?
Here is the reprex of dfEmployeeProfile:
dfEmployeeProfile <- tibble::tribble(
~Effective.Date, ~Business.Process.Reason, ~Employee.ID, ~Voluntary.or.Involuntary, ~Manager, ~Supervisory.Organization, ~Management.Level, ~Location, ~Country, ~`FTE.%`, ~Service.in.Years, ~Worker.Type, ~`Supervisory.Org.-.1.From.The.Top`, ~`Supervisory.Org.-.2.From.The.Top`, ~`Supervisory.Org.-.3.From.The.Top`, ~`Supervisory.Org.-.4.From.The.Top`, ~`Supervisory.Org.-.5.From.The.Top`, ~`Supervisory.Org.-.6.From.The.Top`, ~Respondent, ~`"Employee.ID"`,
1L, "30/06/2022", "Terminate Employee > Voluntary > Resignation - Personal", 830957L, "Voluntary", "Vincent Gautheron (861145)", "Procurement- Group Functions (Vincent Gautheron (861145))", "Level 6", "UK - Berkshire - Windsor Millstream", "United Kingdom", 100L, 5.12, "Employee", "Centrica PLC (Scott Wheway (213219))", "Centrica PLC (Chris O'Shea (854981))", "British Gas Services and Solutions (Jana Siber (905647))", "COO, British Gas (David House (137182))", "Group Procurement (Peter Sowrey (873613))", "Procurement- Group Functions (Vincent Gautheron (861145))", "Anne Rittmueller"
)
#> # A tibble: 1 × 20
#> Effective.Date Busin…¹ Emplo…² Volun…³ Manager Super…⁴ Manag…⁵ Locat…⁶ Country
#> <int> <chr> <chr> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 30/06/… Termin… 830957 Volunt… Vincen… Procur… Level 6 UK - B…
#> # … with 11 more variables: `FTE.%` <chr>, Service.in.Years <int>,
#> # Worker.Type <dbl>, `Supervisory.Org.-.1.From.The.Top` <chr>,
#> # `Supervisory.Org.-.2.From.The.Top` <chr>,
#> # `Supervisory.Org.-.3.From.The.Top` <chr>,
#> # `Supervisory.Org.-.4.From.The.Top` <chr>,
#> # `Supervisory.Org.-.5.From.The.Top` <chr>,
#> # `Supervisory.Org.-.6.From.The.Top` <chr>, Respondent <chr>, …
#> # ℹ Use `colnames()` to see all variable names
Created on 2022-08-04 by the reprex package (v2.0.1)
It should have 63 rows of data, not just the last row of the entire data frame.
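One possible cause, sketched on made-up data: when `group_by()` is given a quoted string, dplyr appears to treat it as a constant value rather than a column name, so every row lands in a single group and "keep last row" returns only one row. The toy data frame below is hypothetical; only the column name `Employee ID` mirrors the post. The stray `"Employee.ID"` column in the reprex output above would be consistent with this.

```r
library(dplyr)

# Toy data: hypothetical values, only the column name mirrors the post
df <- tibble::tibble(
  `Employee ID` = c(1L, 1L, 2L, 2L, 3L),
  Respondent    = c("a", "b", "c", "d", "e")
)

# Quoted string: dplyr groups by a new constant column, so there is one group
by_string <- df %>%
  group_by("Employee ID") %>%
  filter(row_number() == n()) %>% # keeps the last row of the whole table
  ungroup()

# Backticks refer to the real column, so there is one group per employee
by_column <- df %>%
  group_by(`Employee ID`) %>%
  filter(row_number() == n()) %>% # keeps the last row of each employee
  ungroup()

nrow(by_string) # 1
nrow(by_column) # 3
```

If this is what is happening in the real data, replacing the quoted name with backticks in the `group_by()` call should give one row per employee.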