The column `Condition` doesn't exist. Even though I can see the column is there!

I have the formula below:

ANOVAtest <- anova_test(data=Dataovershootwithoutmeans, formula=overshoot~Condition*Distancetodistalboundary+Error(ID/Condition*Distancetodistalboundary), dv=overshoot, wid=ID,within=c(Condition,Distancetodistalboundary),effect.size = "pes")

Error: Can't subset columns that don't exist.
x The column `Condition` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.

Even though I can clearly see in my data frame that there is a column called 'Condition'!

Could anyone give a helping hand?

Can you show the structure of your data frame or even better can you provide a proper REPRoducible EXample (reprex) illustrating your issue?

It's not possible to give an exact answer without a reprex (see the FAQ: What's a reproducible example (`reprex`) and how do I do one?).

To troubleshoot this, my recommendation is to start with the simplest model and build up to the more complex one. The help page for rstatix::anova_test has a good starting point:

library(dplyr)
library(rstatix)
#> 
#> Attaching package: 'rstatix'
#> The following object is masked from 'package:stats':
#> 
#>     filter
data("ToothGrowth")
df <- ToothGrowth
df %>% anova_test(len ~ dose)
#> Coefficient covariances computed by hccm()
#> ANOVA Table (type II tests)
#> 
#>   Effect DFn DFd       F        p p<.05   ges
#> 1   dose   1  58 105.065 1.23e-14     * 0.644

Created on 2020-02-29 by the reprex package (v0.3.0)

So, begin with

Dataovershootwithoutmeans %>% anova_test(overshoot ~ Condition) -> ANOVAtest

and start adding features from there.

# A tibble: 154 x 5
# Groups:   Distancetodistalboundary, Condition [15]
   Distancetodistalboundary Condition ID            Stopdistancefromsta… overshoot
 1                     77.0 0         P_2002141233…                 80.2      3.20
 2                     77.0 0.5       P_2002141233…                 98.7     21.7
 3                     77.0 2         P_2002141233…                 98.5     21.4
 4                    109.  0         P_2002141233…                115.       6.31
 5                    109.  0.5       P_2002141233…                132.      23.4
 6                    109.  2         P_2002141233…                115.       5.95
 7                    156.  0         P_2002141233…                175.      19.1
 8                    156.  0.5       P_2002141233…                174.      17.9
 9                    156.  2         P_2002141233…                217.      60.5
10                    227.  0         P_2002141233…                245.      18.3
# … with 144 more rows

Could you run dput(Dataovershootwithoutmeans %>% ungroup() %>% slice(1:20)) and post the output here, like this?
```[your output]```
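
For context, `dput()` prints R code that rebuilds an object, which is why its output is convenient to paste into a post. A minimal sketch in base R, using the built-in `ToothGrowth` data rather than the poster's data frame (the names `small`, `txt` and `rebuilt` are just illustrative):

```r
# Take a few rows of a built-in data set as a stand-in for real data
small <- head(ToothGrowth, 3)

# dput() writes R code that recreates the object; capture it as text
txt <- capture.output(dput(small))
cat(txt, sep = "\n")

# Anyone can paste that text into R to rebuild the same data frame
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

The pasted `structure(...)` text keeps column types and values, which a screenshot or a printed table does not.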

library(rstatix)

ANOVAtest <- anova_test(data=Dataovershootwithoutmeans, formula=Stopdistancefromstart~Condition*Distancetodistalboundary+Error(ID/Condition*Distancetodistalboundary), dv=Stopdistancefromstart, wid=ID,within=c(Condition,Distancetodistalboundary),effect.size = "pes")

structure(list(Distancetodistalboundary = c(77.008, 77.008, 77.008, 
108.61, 108.61, 108.61, 156.016, 156.016, 156.016, 227.123, 227.123, 
333.784, 333.784, 77.008, 77.008, 77.008, 108.61, 108.61, 108.61, 
156.016), Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", 
"0.5", "2", "0", "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", 
"2", "0.5"), ID = c("P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200217101213", "P_200217101213", 
"P_200217101213", "P_200217101213", "P_200217101213", "P_200217101213", 
"P_200217101213"), Stopdistancefromstart = c(80.21, 98.661, 98.4568, 
114.916, 132.014666666667, 114.558, 175.139, 173.9455, 216.5585, 
245.408, 263.881, 360.428, 491.432, 81.4215, 83.917, 91.191, 
116.674, 158.305, 116.279, 176.845), overshoot = c(3.202, 21.653, 
21.4488, 6.306, 23.4046666666667, 5.94800000000001, 19.123, 17.9295, 
60.5425, 18.285, 36.758, 26.644, 157.648, 4.41350000000001, 6.90900000000001, 
14.183, 8.06400000000001, 49.695, 7.669, 20.829)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame")) ```

Great -- thanks! Now could you try it again, but with triple backticks (```) before and after so it looks like this, and add the library commands you use, too? (Like where anova_test() comes from.)

structure(list(Distancetodistalboundary = c(77.008, 77.008, 77.008, 
108.61, 108.61, 108.61, 156.016, 156.016, 156.016, 227.123, 227.123, 
333.784, 333.784, 77.008, 77.008, 77.008, 108.61, 108.61, 108.61, 
156.016), Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", 
"0.5", "2", "0", "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", 
"2", "0.5"), ID = c("P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200217101213", "P_200217101213", 
"P_200217101213", "P_200217101213", "P_200217101213", "P_200217101213", 
"P_200217101213"), Stopdistancefromstart = c(80.21, 98.661, 98.4568, 
114.916, 132.014666666667, 114.558, 175.139, 173.9455, 216.5585, 
245.408, 263.881, 360.428, 491.432, 81.4215, 83.917, 91.191, 
116.674, 158.305, 116.279, 176.845), overshoot = c(3.202, 21.653, 
21.4488, 6.306, 23.4046666666667, 5.94800000000001, 19.123, 17.9295, 
60.5425, 18.285, 36.758, 26.644, 157.648, 4.41350000000001, 6.90900000000001, 
14.183, 8.06400000000001, 49.695, 7.669, 20.829)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))

library(rstatix)

ANOVAtest <- anova_test(data=Dataovershootwithoutmeans, formula=Stopdistancefromstart~Condition *Distancetodistalboundary+Error(ID/Condition* Distancetodistalboundary), dv=Stopdistancefromstart, wid=ID,within=c(Condition,Distancetodistalboundary),effect.size = "pes")

108.61, 108.61, 108.61, 156.016, 156.016, 156.016, 227.123, 227.123, 
333.784, 333.784, 77.008, 77.008, 77.008, 108.61, 108.61, 108.61, 
156.016), Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", 
"0.5", "2", "0", "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", 
"2", "0.5"), ID = c("P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
"P_200214123342", "P_200214123342", "P_200217101213", "P_200217101213", 
"P_200217101213", "P_200217101213", "P_200217101213", "P_200217101213", 
"P_200217101213"), Stopdistancefromstart = c(80.21, 98.661, 98.4568, 
114.916, 132.014666666667, 114.558, 175.139, 173.9455, 216.5585, 
245.408, 263.881, 360.428, 491.432, 81.4215, 83.917, 91.191, 
116.674, 158.305, 116.279, 176.845), Overshootextent = c(3.202, 
21.653, 21.4488, 6.306, 23.4046666666667, 5.94800000000001, 19.123, 
17.9295, 60.5425, 18.285, 36.758, 26.644, 157.648, 4.41350000000001, 
6.90900000000001, 14.183, 8.06400000000001, 49.695, 7.669, 20.829
)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"
))```

Thanks for adding the library! I ran your code and got no error of the kind you described, just the one you can see in the code block below. What happens if you copy and paste my anova_test() code, but replace the data frame with the full one?

library(rstatix)
#> 
#> Attaching package: 'rstatix'
#> The following object is masked from 'package:stats':
#> 
#>     filter

Dataovershootwithoutmeans <- 
  structure(list(Distancetodistalboundary = c(77.008, 77.008, 77.008, 
  108.61, 108.61, 108.61, 156.016, 156.016, 156.016, 227.123, 227.123, 
  333.784, 333.784, 77.008, 77.008, 77.008, 108.61, 108.61, 108.61, 
  156.016), Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", 
  "0.5", "2", "0", "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", 
  "2", "0.5"), ID = c("P_200214123342", "P_200214123342", "P_200214123342", 
  "P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
  "P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
  "P_200214123342", "P_200214123342", "P_200217101213", "P_200217101213", 
  "P_200217101213", "P_200217101213", "P_200217101213", "P_200217101213", 
  "P_200217101213"), Stopdistancefromstart = c(80.21, 98.661, 98.4568, 
  114.916, 132.014666666667, 114.558, 175.139, 173.9455, 216.5585, 
  245.408, 263.881, 360.428, 491.432, 81.4215, 83.917, 91.191, 
  116.674, 158.305, 116.279, 176.845), overshoot = c(3.202, 21.653, 
  21.4488, 6.306, 23.4046666666667, 5.94800000000001, 19.123, 17.9295, 
  60.5425, 18.285, 36.758, 26.644, 157.648, 4.41350000000001, 6.90900000000001, 
  14.183, 8.06400000000001, 49.695, 7.669, 20.829)), row.names = c(NA, 
  -20L), class = c("tbl_df", "tbl", "data.frame"))
dput(Dataovershootwithoutmeans)
#> structure(list(Distancetodistalboundary = c(77.008, 77.008, 77.008, 
#> 108.61, 108.61, 108.61, 156.016, 156.016, 156.016, 227.123, 227.123, 
#> 333.784, 333.784, 77.008, 77.008, 77.008, 108.61, 108.61, 108.61, 
#> 156.016), Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", 
#> "0.5", "2", "0", "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", 
#> "2", "0.5"), ID = c("P_200214123342", "P_200214123342", "P_200214123342", 
#> "P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
#> "P_200214123342", "P_200214123342", "P_200214123342", "P_200214123342", 
#> "P_200214123342", "P_200214123342", "P_200217101213", "P_200217101213", 
#> "P_200217101213", "P_200217101213", "P_200217101213", "P_200217101213", 
#> "P_200217101213"), Stopdistancefromstart = c(80.21, 98.661, 98.4568, 
#> 114.916, 132.014666666667, 114.558, 175.139, 173.9455, 216.5585, 
#> 245.408, 263.881, 360.428, 491.432, 81.4215, 83.917, 91.191, 
#> 116.674, 158.305, 116.279, 176.845), overshoot = c(3.202, 21.653, 
#> 21.4488, 6.306, 23.4046666666667, 5.94800000000001, 19.123, 17.9295, 
#> 60.5425, 18.285, 36.758, 26.644, 157.648, 4.41350000000001, 6.90900000000001, 
#> 14.183, 8.06400000000001, 49.695, 7.669, 20.829)), row.names = c(NA, 
#> -20L), class = c("tbl_df", "tbl", "data.frame"))
# ANOVAtest <- 
  anova_test(data = Dataovershootwithoutmeans, 
             formula = Stopdistancefromstart~
               Condition * Distancetodistalboundary+
               Error(ID / Condition * Distancetodistalboundary), 
             dv = Stopdistancefromstart, 
             wid =  ID,
             within = c (Condition,Distancetodistalboundary),
             effect.size = "pes")
#> Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...): 0 (non-NA) cases

Created on 2020-02-29 by the reprex package (v0.3.0)

(And for future reference, I created the code block by loading the reprex package, then copying all of the code you see at once (after removing commented lines), then running the reprex() command, and then pasting here before copying or pasting anything else.)
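
A possible explanation for the `0 (non-NA) cases` error, offered as an assumption rather than something confirmed in this thread: the 20-row sample does not contain every Condition x Distancetodistalboundary combination for either participant, and a repeated-measures fit generally needs complete cells for each subject. The sketch below uses base R and only the ID, Condition and distance columns copied from the sample above; the names `d`, `cells`, `n_expected`, `cells_per_id` and `complete_ids` are illustrative.

```r
# ID, Condition and distance columns copied from the 20-row sample
d <- data.frame(
  ID = c(rep("P_200214123342", 13), rep("P_200217101213", 7)),
  Condition = c("0", "0.5", "2", "0", "0.5", "2", "0", "0.5", "2", "0",
                "0.5", "0", "0.5", "0", "0.5", "2", "0", "0.5", "2", "0.5"),
  Distancetodistalboundary = c(77.008, 77.008, 77.008, 108.61, 108.61,
                               108.61, 156.016, 156.016, 156.016, 227.123,
                               227.123, 333.784, 333.784, 77.008, 77.008,
                               77.008, 108.61, 108.61, 108.61, 156.016)
)

# Count observations per participant in each Condition x distance cell
cells <- table(d$ID, interaction(d$Condition, d$Distancetodistalboundary))

# A fully crossed design has (number of conditions) x (number of distances) cells
n_expected <- length(unique(d$Condition)) *
  length(unique(d$Distancetodistalboundary))

# How many distinct cells each participant actually has
cells_per_id <- rowSums(cells > 0)
print(cells_per_id)

# Participants with every cell filled
complete_ids <- names(cells_per_id)[cells_per_id == n_expected]
print(complete_ids)
```

If no participant has all cells filled, running the same check on the full data frame would show whether the complete data set has the same gap.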

Hey @dromano when I retry your code it comes up with this:

Could you run glimpse(Dataovershootwithoutmeans)?

[Screenshot 2020-03-01 at 00.27.25]

Do you want Condition to be character, rather than numeric?

Yup 🙂 I feel that's more suitable (3 categories). @technocrat

Gotcha, thanks. Wasn't sure based on the screenshot.
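
One way to treat Condition as categorical is to convert the column in the data frame before calling anova_test(), rather than wrapping it in a conversion inside the call. This is a sketch under assumptions: the data frame `d` and its values are made up, the column names are borrowed from this thread, and treating the distance column as a factor as well is my assumption about the intended design. The `within` argument appears to expect bare column names, so the call is shown only as a comment.

```r
# Toy data: made-up values, with column names borrowed from the thread
d <- data.frame(
  ID = rep(c("s1", "s2", "s3"), each = 4),
  Condition = rep(c(0, 2), times = 6),
  Targetdistance = rep(rep(c(63.207, 94.81), each = 2), times = 3),
  Overshootextent = c(3.2, 21.4, 6.3, 5.9, 4.4, 14.2,
                      8.1, 7.7, 5.0, 12.3, 9.8, 6.1)
)

# Convert the grouping columns once, up front
d$ID <- factor(d$ID)
d$Condition <- factor(d$Condition)
d$Targetdistance <- factor(d$Targetdistance)

# Confirm the column types before modelling
str(d)

# The model call can then refer to the columns by name, for example:
# rstatix::anova_test(data = d, dv = Overshootextent, wid = ID,
#                     within = c(Condition, Targetdistance),
#                     effect.size = "pes")
```

Converting up front keeps the column name `Condition` intact, so anything that looks columns up by name can still find it.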

anova_test(data = Dataovershootwithoutmeanswithoutactualstopdistance, formula = Overshootextent ~ as.character(Condition)*Targetdistance + Error(ID/as.character(Condition)*Targetdistance), dv = Overshootextent, wid = ID, within = c (as.character(Condition),Targetdistance),effect.size = "pes")

If I wrap Condition in as.character(), it gives me the output below:

Error: Can't subset columns that don't exist.
x The column `c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, \n3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1

It worked on this dataset:

[image]

Could you post a small version of the data that worked?

   Targetdistance Condition             ID Undershootextent
1          63.207         0 P_200214103155         3.498000
2          63.207       0.5 P_200214103155         6.161250
3          63.207         2 P_200214103155        10.420333
4          94.810         0 P_200214103155        15.852250
5          94.810       0.5 P_200214103155        10.553000
6          94.810         2 P_200214103155        10.441000
7         142.215         0 P_200214103155        17.495000
8         142.215       0.5 P_200214103155        38.332667
9         142.215         2 P_200214103155        40.320750
10        213.322         0 P_200214103155        45.789800
11        213.322       0.5 P_200214103155        45.947000
12        213.322         2 P_200214103155         8.867333
13        319.983         0 P_200214103155        30.879400
14        319.983       0.5 P_200214103155        26.690333
15        319.983         2 P_200214103155        41.765333
16         63.207         0 P_200214123342         9.676333
17         94.810         0 P_200214123342        37.739500
18         94.810       0.5 P_200214123342         1.987000
19         94.810         2 P_200214123342        19.793000
20        142.215         0 P_200214123342        80.312000
21        142.215       0.5 P_200214123342        38.406333
22        142.215         2 P_200214123342         9.039000
23        213.322         0 P_200214123342        31.162333
24        213.322       0.5 P_200214123342        72.099000
25        213.322         2 P_200214123342        63.824200
26        319.983         0 P_200214123342        48.597250
27        319.983       0.5 P_200214123342        50.873000
28        319.983         2 P_200214123342        88.283000
29         63.207         0 P_200217101213         1.230000
30         63.207       0.5 P_200217101213         4.307000
31         63.207         2 P_200217101213        12.951500
32         94.810         0 P_200217101213        15.071000
33         94.810       0.5 P_200217101213        24.047333
34         94.810         2 P_200217101213        31.722000
35        142.215         0 P_200217101213        10.974000
36        142.215       0.5 P_200217101213        15.467000
37        142.215         2 P_200217101213        38.746667
38        213.322         0 P_200217101213        39.025000
39        213.322       0.5 P_200217101213        13.446000
40        213.322         2 P_200217101213        23.621000
41        319.983         0 P_200217101213        13.705000
42        319.983       0.5 P_200217101213        34.771667
43        319.983         2 P_200217101213        58.126000
44         63.207         0 P_200219091823        17.166000
45         63.207       0.5 P_200219091823         8.551000
46         63.207         2 P_200219091823        23.984000
47         94.810       0.5 P_200219091823        14.267000
48         94.810         2 P_200219091823        17.278500
49        142.215         0 P_200219091823        23.461000
50        142.215       0.5 P_200219091823        12.110000

Could you run dput(Dataundershootwithout...) (I can't copy the exact name), and post the result? Between a pair of ```'s?