Error in one way ANOVA

Hello,
I have been trying to perform a simple one-way ANOVA on data I imported from an Excel sheet. The data loaded properly, and I checked the table for issues (there did not seem to be any).
However, when running the one-way ANOVA on two of the columns, I got an error and a warning. Here is my full code with the error:

library(readxl)
Species_measurement_merged <- read_excel("Species_measurement_merged.xlsx")
View(Species_measurement_merged)
data <- Species_measurement_merged
one.way <- aov(WaterPot_Dawn ~ Art, data = data)

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
NA/NaN/Inf in 'y'
In addition: Warning message:
In storage.mode(v) <- "double" : NAs introduced by coercion

Weird thing is, I already used this code at the beginning of the month on a different computer, and I had no trouble running these lines. Any idea on where the error could be? Thank you.
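
A quick way to narrow down an error like this is to check how the response column was imported. The sketch below uses a small made-up data frame (`df`, `col_classes`, and `bad_values` are illustrative names, not part of the original script) to show the kind of check that usually explains an "NAs introduced by coercion" warning: the column is stored as text, and some entries cannot be turned into numbers.

```r
# Hypothetical stand-in for the imported table (not the real data).
df <- data.frame(
  Art = c("A", "A", "B", "B"),
  WaterPot_Dawn = c("NA", "0.41", "0.62", "0.75"),
  stringsAsFactors = FALSE
)

# A response column that should be numeric but shows up as "character"
# is a strong hint that some cells could not be read as numbers.
col_classes <- sapply(df, class)
col_classes

# List the entries that fail numeric conversion.
converted <- suppressWarnings(as.numeric(df$WaterPot_Dawn))
bad_values <- unique(df$WaterPot_Dawn[is.na(converted)])
bad_values
```

On the real data, `str(data)` gives the same overview of column types in one call.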

Can you provide us with part of your df using dput(head(data,20))?

Here it is:

dput(head(data,20)) #In case of error
structure(list(Art = c("Acer buergerianum", "Acer buergerianum",
"Acer buergerianum", "Acer buergerianum", "Acer buergerianum",
"Acer rufinerve", "Acer rufinerve", "Acer rufinerve", "Acer rufinerve",
"Acer rufinerve", "Carpinus japonica", "Carpinus japonica", "Carpinus japonica",
"Carpinus japonica", "Carpinus japonica", "Celtis australis",
"Celtis australis", "Celtis australis", "Celtis australis", "Celtis australis"
), Date = c("AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay",
"AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay",
"AMay", "AMay", "AMay", "AMay", "AMay"), Ind = c(1, 2, 3, 4,
5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5), WaterPot_Dawn = c("NA",
"NA", "0.41", "0.59599999999999997", "0.78", "NA", "NA", "0.62",
"0.75", "0.44800000000000001", "0.12", "0.48799999999999999",
"0.73", "NA", "NA", "0.35499999999999998", "0.28000000000000003",
"0.41", "0.27800000000000002", "0.505"), WaterPot_Noon = c("NA",
"NA", "0.14000000000000001", "0.28000000000000003", "0.31", "NA",
"NA", "0.255", "0.42", "0.182", "0.31", "0.62", "0.91", "NA",
"NA", "0.85", "0.93", "0.52", "NA", "NA"), ChloroCont = c("NA",
"NA", "21.2", "18.100000000000001", "18.399999999999999", "NA",
"NA", "26.1", "24.7", "26", "27.4", "24.3", "24.8", "26.7", "23.9",
"31.6", "6.2", "17.2", "29.5", "18.7"), Leaf_area = c("NA", "NA",
"52.6", "63.29", "22.97", "NA", "NA", "332", "318.04000000000002",
"338.9", "41.76", "56.04", "47.83", "65.03", "56.11", "5.92",
"2.99", "7", "5.95", "3.57"), Fresh_weight = c("NA", "NA", "1.1599999999999999",
"1.26", "0.79", "NA", "NA", "7.23", "5.84", "5.05", "1.06", "1.29",
"1.22", "1.46", "1.24", "0.6", "0.56999999999999995", "0.61",
"0.62", "0.6"), Dry_weight = c("NA", "NA", "0.26", "0.27", "0.1",
"NA", "NA", "2.25", "1.84", "1.6", "0.31", "0.39", "0.3", "0.44",
"0.37", "4.7E-2", "2.4E-2", "3.5000000000000003E-2", "4.9000000000000002E-2",
"3.9E-2"), DBH = c("NA", "NA", "18", "14.8", "9.8000000000000007",
"NA", "NA", "10", "10", "10.199999999999999", "11.4", "10", "11.6",
"11.2", "9.6", "12", "11.8", "11.8", "13.2", "13.7"), Height = c("NA",
"NA", "371", "397", "303", "NA", "NA", "352", "309", "337", "251",
"293", "313", "307", "270", "372", "379", "372", "362", "385"
), `1st_leaf` = c("NA", "NA", "189", "179.5", "185", "NA", "NA",
"182.5", "169", "178", "157", "173", "195", "168", "164", "196",
"210", "185", "189", "195"), Axis_1 = c("NA", "NA", "123", "146",
"80", "NA", "NA", "87", "61", "68", "95", "116", "118", "94",
"124", "50", "63", "67", "65", "70"), Axis_2 = c("NA", "NA",
"112", "106", "90", "NA", "NA", "92", "58", "63", "81", "104",
"133", "105", "109", "94", "68", "69", "59", "53"), Canopy_size = c("NA",
"NA", "182", "217.5", "118", "NA", "NA", "169.5", "140", "159",
"94", "120", "118", "139", "106", "176", "169", "187", "173",
"190"), Leaf_dry_cont = c("NA", "NA", "0.22413793103448279",
"0.2142857142857143", "0.12658227848101267", "NA", "NA", "0.31120331950207469",
"0.31506849315068497", "0.31683168316831684", "0.29245283018867924",
"0.30232558139534882", "0.24590163934426229", "0.30136986301369861",
"0.29838709677419356", "7.8333333333333338E-2", "4.2105263157894743E-2",
"5.7377049180327877E-2", "7.9032258064516137E-2", "6.5000000000000002E-2"
), Crown_area = c("NA", "NA", "10502268.842726992", "14099593.493017135",
"3558796.1579865171", "NA", "NA", "5682839.5174491908", "2074791.564234795",
"2853219.580731479", "3029877.6188281402", "6064027.8036651621",
"7757187.0699222777", "5746726.945652592", "6001262.9712366425",
"3464967.2573993024", "3032667.3531045276", "3621213.3208280397",
"2779073.8053165548", "2952678.2153539266"), Specific_leaf = c("NA",
"NA", "202.30769230769229", "234.40740740740739", "229.7", "NA",
"NA", "147.55555555555554", "172.84782608695653", "211.81249999999997",
"134.70967741935485", "143.69230769230768", "159.43333333333334",
"147.79545454545456", "151.64864864864865", "125.95744680851064",
"124.58333333333334", "199.99999999999997", "121.42857142857143",
"91.538461538461533")), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))

@Flm is right: this is hard to diagnose without a reproducible example. See the FAQ: How to do a minimal reproducible example (reprex) for beginners.

Here it is. Sorry, I am still quite bad at this.

(Same dput(head(data,20)) output as in the previous post.)

I am sorry, but I have tried to provide that twice but it keeps getting automatically locked as spam (maybe due to its size). I am not sure about how to show you my dput.

Let's try with this:

dput(head(data,20)) #In case of error
structure(list(Art = c("Acer buergerianum", "Acer buergerianum",
"Acer buergerianum", "Acer buergerianum", "Acer buergerianum",
"Acer rufinerve", "Acer rufinerve", "Acer rufinerve", "Acer rufinerve",
"Acer rufinerve", "Carpinus japonica", "Carpinus japonica", "Carpinus japonica",
"Carpinus japonica", "Carpinus japonica", "Celtis australis",
"Celtis australis", "Celtis australis", "Celtis australis", "Celtis australis"
), Date = c("AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay",
"AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay", "AMay",
"AMay", "AMay", "AMay", "AMay", "AMay"), Ind = c(1, 2, 3, 4,
5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5), WaterPot_Dawn = c("NA",
"NA", "0.41", "0.59599999999999997", "0.78", "NA", "NA", "0.62",
"0.75", "0.44800000000000001", "0.12", "0.48799999999999999",
"0.73", "NA", "NA", "0.35499999999999998", "0.28000000000000003",
"0.41", "0.27800000000000002", "0.505")

It is only part of my code. Let's see if it gets marked as spam again.

If you try to upload it to Dropbox, for example, does it still give you problems?

I have never used dropbox, so I have not tried and do not know how. Is there any other way I can post my result to dput here? Sorry for the inconvenience, I am rather new.

You could put the file on a shared drive and post the link. That way we can help you better.

The data provided is fine. It does contain NAs, which is one potential source of the error. Can you share the code you ran against it?

code

library(tidyverse)

Species_measurement_merged <- structure(list(Art=c("Acerbuergerianum","Acerbuergerianum",
"Acerbuergerianum","Acerbuergerianum","Acerbuergerianum",
"Acerrufinerve","Acerrufinerve","Acerrufinerve","Acerrufinerve",
"Acerrufinerve","Carpinusjaponica","Carpinusjaponica","Carpinusjaponica",
"Carpinusjaponica","Carpinusjaponica","Celtisaustralis",
"Celtisaustralis","Celtisaustralis","Celtisaustralis","Celtisaustralis"
),Date=c("AMay","AMay","AMay","AMay","AMay","AMay","AMay",
"AMay","AMay","AMay","AMay","AMay","AMay","AMay","AMay",
"AMay","AMay","AMay","AMay","AMay"),Ind=c(1,2,3,4,
5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5),WaterPot_Dawn=c("NA",
"NA","0.41","0.59599999999999997","0.78","NA","NA","0.62",
"0.75","0.44800000000000001","0.12","0.48799999999999999",
"0.73","NA","NA","0.35499999999999998","0.28000000000000003",
"0.41","0.27800000000000002","0.505"),WaterPot_Noon=c("NA",
"NA","0.14000000000000001","0.28000000000000003","0.31","NA",
"NA","0.255","0.42","0.182","0.31","0.62","0.91","NA",
"NA","0.85","0.93","0.52","NA","NA"),ChloroCont=c("NA",
"NA","21.2","18.100000000000001","18.399999999999999","NA",
"NA","26.1","24.7","26","27.4","24.3","24.8","26.7","23.9",
"31.6","6.2","17.2","29.5","18.7"),Leaf_area=c("NA","NA",
"52.6","63.29","22.97","NA","NA","332","318.04000000000002",
"338.9","41.76","56.04","47.83","65.03","56.11","5.92",
"2.99","7","5.95","3.57"),Fresh_weight=c("NA","NA","1.1599999999999999",
"1.26","0.79","NA","NA","7.23","5.84","5.05","1.06","1.29",
"1.22","1.46","1.24","0.6","0.56999999999999995","0.61",
"0.62","0.6"),Dry_weight=c("NA","NA","0.26","0.27","0.1",
"NA","NA","2.25","1.84","1.6","0.31","0.39","0.3","0.44",
"0.37","4.7E-2","2.4E-2","3.5000000000000003E-2","4.9000000000000002E-2",
"3.9E-2"),DBH=c("NA","NA","18","14.8","9.8000000000000007",
"NA","NA","10","10","10.199999999999999","11.4","10","11.6",
"11.2","9.6","12","11.8","11.8","13.2","13.7"),Height=c("NA",
"NA","371","397","303","NA","NA","352","309","337","251",
"293","313","307","270","372","379","372","362","385"
),'1st_leaf'=c("NA","NA","189","179.5","185","NA","NA",
"182.5","169","178","157","173","195","168","164","196",
"210","185","189","195"),Axis_1=c("NA","NA","123","146",
"80","NA","NA","87","61","68","95","116","118","94",
"124","50","63","67","65","70"),Axis_2=c("NA","NA",
"112","106","90","NA","NA","92","58","63","81","104",
"133","105","109","94","68","69","59","53"),Canopy_size=c("NA",
"NA","182","217.5","118","NA","NA","169.5","140","159",
"94","120","118","139","106","176","169","187","173",
"190"),Leaf_dry_cont=c("NA","NA","0.22413793103448279",
"0.2142857142857143","0.12658227848101267","NA","NA","0.31120331950207469",
"0.31506849315068497","0.31683168316831684","0.29245283018867924",
"0.30232558139534882","0.24590163934426229","0.30136986301369861",
"0.29838709677419356","7.8333333333333338E-2","4.2105263157894743E-2",
"5.7377049180327877E-2","7.9032258064516137E-2","6.5000000000000002E-2"
),Crown_area=c("NA","NA","10502268.842726992","14099593.493017135",
"3558796.1579865171","NA","NA","5682839.5174491908","2074791.564234795",
"2853219.580731479","3029877.6188281402","6064027.8036651621",
"7757187.0699222777","5746726.945652592","6001262.9712366425",
"3464967.2573993024","3032667.3531045276","3621213.3208280397",
"2779073.8053165548","2952678.2153539266"),Specific_leaf=c("NA",
"NA","202.30769230769229","234.40740740740739","229.7","NA",
"NA","147.55555555555554","172.84782608695653","211.81249999999997",
"134.70967741935485","143.69230769230768","159.43333333333334",
"147.79545454545456","151.64864864864865","125.95744680851064",
"124.58333333333334","199.99999999999997","121.42857142857143",
"91.538461538461533")),row.names=c(NA,-20L),class=c("tbl_df",
"tbl","data.frame")) %>% as_tibble()

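# Turn the text "NA" into real missing values, then let R re-detect column types.
data <- Species_measurement_merged %>%
  mutate(across(where(is.character), ~ na_if(.x, "NA"))) %>%
  type.convert(as.is = TRUE)  # as.is = TRUE keeps text columns as character

# WaterPot_Dawn is now numeric, so the model fits.
one.way <- aov(WaterPot_Dawn ~ Art, data = data)
one.way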


The problem lies in the fact that the NAs are seen as text and not as real NAs. We have to convert them.
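
For readers who prefer not to load the tidyverse, the same idea can be sketched in base R. This is only an illustration on a made-up data frame (`df` and `fit` are hypothetical names), assuming the missing values are stored as the literal string "NA" as in the dput output above.

```r
# Toy data mimicking the problem: numbers and "NA" stored as text.
df <- data.frame(
  Art = c("A", "A", "A", "B", "B", "B"),
  WaterPot_Dawn = c("NA", "0.41", "0.78", "0.62", "NA", "0.75"),
  stringsAsFactors = FALSE
)

class(df$WaterPot_Dawn)   # character: this is what breaks aov()

# Replace the literal string "NA" with a real missing value ...
df$WaterPot_Dawn[df$WaterPot_Dawn == "NA"] <- NA
# ... then convert the remaining text to numbers.
df$WaterPot_Dawn <- as.numeric(df$WaterPot_Dawn)

class(df$WaterPot_Dawn)   # numeric

# Rows with a missing response are dropped by default, so the fit runs.
fit <- aov(WaterPot_Dawn ~ Art, data = df)
summary(fit)
```

It may also be possible to avoid the problem at import time: read_excel() has an `na` argument, so something like `read_excel("Species_measurement_merged.xlsx", na = "NA")` should treat those cells as missing from the start.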

I see. So how can I convert the NAs from text to true NAs? Thank you.

Look at the code in my previous post.

Sorry, I missed it last time I checked. Thank you very much for your help.

If I can do anything else, I'm here! Otherwise, you can mark the answer as the solution.

Actually, I ran into a new problem when performing said ANOVA. I performed several one-way tests with no issues, but then I realized I had to run those tests separately for each date. So I extracted the first month's worth of measurements from my original data frame and tried to run the one-way ANOVA again. This time, the variable is not found:

one.way <- aov(WaterPot_Dawn ~ Art, data = data)

summary(one.way)
             Df Sum Sq Mean Sq F value Pr(>F)
Art          18  0.750 0.04168   0.936  0.536
Residuals   209  9.304 0.04451
222 observations deleted due to missingness

dataMay <- data%>% filter(Date=="AMay")
one.wayWPDMay <- aov(WaterPot_Dawn ~ Art, dataMay=data)
Error in eval(predvars, data, env) : object 'WaterPot_Dawn' not found

I am quite puzzled by this because the selected dataframe (dataMay) is loaded with no issues, showing all column names and all variables (including WaterPot_Dawn) with no issues. Any idea about where I could be making a mistake? Thank you.

Check this: you wrote dataMay=data. The argument name has to be data, so try data = dataMay.
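
To illustrate why the misnamed argument produces "object not found", here is a small sketch on made-up data (the values and the names `wrong` and `fit_may` are hypothetical). Because aov() has no argument called dataMay, no data frame is supplied to the model, and R looks for WaterPot_Dawn in the calling environment instead.

```r
# Made-up subset standing in for the May measurements.
dataMay <- data.frame(
  Art = rep(c("A", "B"), each = 3),
  WaterPot_Dawn = c(0.41, 0.60, 0.78, 0.62, 0.75, 0.45)
)

# Wrong: 'dataMay = ...' is not an argument of aov(), so the formula
# variables are searched for outside the data frame and are not found.
wrong <- tryCatch(
  aov(WaterPot_Dawn ~ Art, dataMay = dataMay),
  error = function(e) conditionMessage(e)
)
wrong

# Right: the argument is called 'data'; its value is the subset to use.
fit_may <- aov(WaterPot_Dawn ~ Art, data = dataMay)
summary(fit_may)
```

The name on the left of the equals sign is always the function's argument (data), and the value on the right is whichever data frame should be analysed.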

It does work now. Wow, I gotta be more careful. Thanks a lot.
