Grouping data for XMR

I'm using the script below and having a problem with the grouping. The script is supposed to be grouped by the column TeamMetric, which is a combination of the team name (category) and the metric. My data has various metrics with values for each team each month, and I am trying to add columns to the data for control line and central line values using the package xmrr.


library(xmrr)
library(tidyverse)
df <- data.frame(XMRTest)
Output <- df %>% 
    group_nest(TeamMetric) %>% 
    pull(data) %>%
    setNames(nm = unique(df$TeamMetric)) %>% 
    map_dfr(~xmr(., measure = "Value"), .id = "TeamMetric")

The problem is the results appear to be out of sync:

The screenshot shows an example where TeamMetric="Team6-Metric1" but the TeamName for that row is Team3 so the value of TeamMetric should be "Team3-Metric1"

Each row is a month of data for one team and one metric, so I need to calculate the xmr column values for each Team-metric combination. Can anyone see what's wrong with the script that is causing this issue and advise how to fix it?

Thanks
Leslie

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue?

OK here it is (I think...).
I suspect the problem has something to do with a team not having rows for each month.

XMRTest2 <- data.frame(stringsAsFactors=FALSE,
     TeamName = c("Team6", "Team6", "Team6", "Team3", "Team3", "Team3",
                  "Team3", "Team3", "Team3", "Team3", "Team3", "Team3",
                  "Team3", "Team3", "Team3", "Team6", "Team6", "Team6", "Team3",
                  "Team3", "Team3", "Team3", "Team3", "Team3", "Team3", "Team3",
                  "Team3", "Team3", "Team3", "Team3"),
   FMonthYear = c("FY2018 M11", "FY2018 M10", "FY2018 M12", "FY2018 M01",
                  "FY2018 M05", "FY2018 M09", "FY2018 M11", "FY2018 M10",
                  "FY2018 M04", "FY2018 M03", "FY2018 M12", "FY2018 M02",
                  "FY2018 M08", "FY2018 M07", "FY2018 M06", "FY2018 M11", "FY2018 M10",
                  "FY2018 M12", "FY2018 M01", "FY2018 M05", "FY2018 M09",
                  "FY2018 M11", "FY2018 M10", "FY2018 M04", "FY2018 M03", "FY2018 M12",
                  "FY2018 M02", "FY2018 M08", "FY2018 M07", "FY2018 M06"),
   TeamMetric = c("Team6-Metric3", "Team6-Metric3", "Team6-Metric3",
                  "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                  "Team3-Metric3", "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                  "Team3-Metric3", "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                  "Team3-Metric3", "Team6-Metric6", "Team6-Metric6",
                  "Team6-Metric6", "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                  "Team3-Metric6", "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                  "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                  "Team3-Metric6", "Team3-Metric6"),
   MetricName = c("Metric3", "Metric3", "Metric3", "Metric3", "Metric3",
                  "Metric3", "Metric3", "Metric3", "Metric3", "Metric3",
                  "Metric3", "Metric3", "Metric3", "Metric3", "Metric3", "Metric6",
                  "Metric6", "Metric6", "Metric6", "Metric6", "Metric6", "Metric6",
                  "Metric6", "Metric6", "Metric6", "Metric6", "Metric6",
                  "Metric6", "Metric6", "Metric6"),
        Value = c(355, 3, 1286, 209, 149, 112, 138, 147, 181, 179, 35, 196,
                  160, 90, 68, 25.35714286, 1, 48.11111111, 1.445783133,
                  1.330882353, 1.265306122, 1.394495413, 1.302325581, 1.623188406,
                  1.386666667, 1.166666667, 1.496402878, 1.384057971, 1.432098765,
                  1.594594595)
)

library(xmrr)
library(tidyverse)
df <- data.frame(XMRTest2)
Output <- df %>% 
    group_nest(TeamMetric) %>% 
    pull(data) %>%
    setNames(nm = unique(df$TeamMetric)) %>% 
    map_dfr(~xmr(., measure = "Value"), .id = "TeamMetric")
1 Like

You just have to make sure the names are sorted, see the example:

XMRTest2 <- data.frame(stringsAsFactors=FALSE,
                       TeamName = c("Team6", "Team6", "Team6", "Team3", "Team3", "Team3",
                                    "Team3", "Team3", "Team3", "Team3", "Team3", "Team3",
                                    "Team3", "Team3", "Team3", "Team6", "Team6", "Team6", "Team3",
                                    "Team3", "Team3", "Team3", "Team3", "Team3", "Team3", "Team3",
                                    "Team3", "Team3", "Team3", "Team3"),
                       FMonthYear = c("FY2018 M11", "FY2018 M10", "FY2018 M12", "FY2018 M01",
                                      "FY2018 M05", "FY2018 M09", "FY2018 M11", "FY2018 M10",
                                      "FY2018 M04", "FY2018 M03", "FY2018 M12", "FY2018 M02",
                                      "FY2018 M08", "FY2018 M07", "FY2018 M06", "FY2018 M11", "FY2018 M10",
                                      "FY2018 M12", "FY2018 M01", "FY2018 M05", "FY2018 M09",
                                      "FY2018 M11", "FY2018 M10", "FY2018 M04", "FY2018 M03", "FY2018 M12",
                                      "FY2018 M02", "FY2018 M08", "FY2018 M07", "FY2018 M06"),
                       TeamMetric = c("Team6-Metric3", "Team6-Metric3", "Team6-Metric3",
                                      "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                                      "Team3-Metric3", "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                                      "Team3-Metric3", "Team3-Metric3", "Team3-Metric3", "Team3-Metric3",
                                      "Team3-Metric3", "Team6-Metric6", "Team6-Metric6",
                                      "Team6-Metric6", "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                                      "Team3-Metric6", "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                                      "Team3-Metric6", "Team3-Metric6", "Team3-Metric6",
                                      "Team3-Metric6", "Team3-Metric6"),
                       MetricName = c("Metric3", "Metric3", "Metric3", "Metric3", "Metric3",
                                      "Metric3", "Metric3", "Metric3", "Metric3", "Metric3",
                                      "Metric3", "Metric3", "Metric3", "Metric3", "Metric3", "Metric6",
                                      "Metric6", "Metric6", "Metric6", "Metric6", "Metric6", "Metric6",
                                      "Metric6", "Metric6", "Metric6", "Metric6", "Metric6",
                                      "Metric6", "Metric6", "Metric6"),
                       Value = c(355, 3, 1286, 209, 149, 112, 138, 147, 181, 179, 35, 196,
                                 160, 90, 68, 25.35714286, 1, 48.11111111, 1.445783133,
                                 1.330882353, 1.265306122, 1.394495413, 1.302325581, 1.623188406,
                                 1.386666667, 1.166666667, 1.496402878, 1.384057971, 1.432098765,
                                 1.594594595)
)

library(xmrr)
library(tidyverse)

Output <- XMRTest2 %>% 
  group_nest(TeamMetric) %>%
  arrange(TeamMetric) %>% 
  pull(data) %>%
  setNames(nm = sort(unique(XMRTest2$TeamMetric))) %>% 
  map_dfr(~xmr(., measure = "Value"), .id = "TeamMetric")

Output
#> # A tibble: 30 x 11
#>    TeamMetric TeamName FMonthYear MetricName Value Order `Central Line`
#>    <chr>      <chr>    <chr>      <chr>      <dbl> <dbl>          <dbl>
#>  1 Team3-Met… Team3    FY2018 M01 Metric3      209     1            151
#>  2 Team3-Met… Team3    FY2018 M05 Metric3      149     2            151
#>  3 Team3-Met… Team3    FY2018 M09 Metric3      112     3            151
#>  4 Team3-Met… Team3    FY2018 M11 Metric3      138     4            151
#>  5 Team3-Met… Team3    FY2018 M10 Metric3      147     5            151
#>  6 Team3-Met… Team3    FY2018 M04 Metric3      181     6            151
#>  7 Team3-Met… Team3    FY2018 M03 Metric3      179     7            151
#>  8 Team3-Met… Team3    FY2018 M12 Metric3       35     8            151
#>  9 Team3-Met… Team3    FY2018 M02 Metric3      196     9            151
#> 10 Team3-Met… Team3    FY2018 M08 Metric3      160    10            151
#> # … with 20 more rows, and 4 more variables: `Moving Range` <dbl>,
#> #   `Average Moving Range` <dbl>, `Lower Natural Process Limit` <dbl>,
#> #   `Upper Natural Process Limit` <dbl>

Created on 2019-09-10 by the reprex package (v0.3.0)

Thanks Andrés. This is really helpful, but one thing doesn't seem right... The data will be sorted by FMonthYear (FY2018 M01, FY2018 M02... etc) so shouldn't the Order column be consistent with this?

Or should I be sorting my dataset by TeamMetric and FMonthYear before applying the R script on it?

You can sort it at your convenience, as long as the names here (setNames(nm = sort(unique(XMRTest2$TeamMetric))) %>% ) match the order of the nested dataframe.

Like this for example:

library(xmrr)
library(tidyverse)

Output <- XMRTest2 %>%
  arrange(TeamMetric, FMonthYear) %>%
  group_nest(TeamMetric) %>%
  pull(data) %>%
  setNames(nm = sort(unique(XMRTest2$TeamMetric))) %>% 
  map_dfr(~xmr(., measure = "Value"), .id = "TeamMetric")

Output
#> # A tibble: 30 x 11
#>    TeamMetric TeamName FMonthYear MetricName Value Order `Central Line`
#>    <chr>      <chr>    <chr>      <chr>      <dbl> <dbl>          <dbl>
#>  1 Team3-Met… Team3    FY2018 M01 Metric3      209     1           183.
#>  2 Team3-Met… Team3    FY2018 M02 Metric3      196     2           183.
#>  3 Team3-Met… Team3    FY2018 M03 Metric3      179     3           183.
#>  4 Team3-Met… Team3    FY2018 M04 Metric3      181     4           183.
#>  5 Team3-Met… Team3    FY2018 M05 Metric3      149     5           183.
#>  6 Team3-Met… Team3    FY2018 M06 Metric3       68     6           108.
#>  7 Team3-Met… Team3    FY2018 M07 Metric3       90     7           108.
#>  8 Team3-Met… Team3    FY2018 M08 Metric3      160     8           108.
#>  9 Team3-Met… Team3    FY2018 M09 Metric3      112     9           108.
#> 10 Team3-Met… Team3    FY2018 M10 Metric3      147    10           108.
#> # … with 20 more rows, and 4 more variables: `Moving Range` <dbl>,
#> #   `Average Moving Range` <dbl>, `Lower Natural Process Limit` <dbl>,
#> #   `Upper Natural Process Limit` <dbl>
1 Like

That's really helpful Andrés. Thanks again for all the help.

Leslie

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.