Summary of ANOVA with bar graph

Nielo · April 18, 2020, 3:09pm

Good morning,
Please, is there a package named 'biology' in R?
I ran an ANOVA and would like to use a bar graph to show the differences between factor levels, with the letters from the post-hoc Tukey's test attached. Something like this:

So I followed the instructions (the script given in the handbook) with my data, as you can see in the script below, but it doesn't work:
library(biology)
Mbargraph(dataframe1$NestGroupSize, dataframe1$VegetationType2, symbols = c("A", "AB", "B", "AB"), ylab = "Mean number of nests", xlab = "Vetgetation type")

But when I run the script, this is the output in red:

library(biology)
Error in library(biology) : there is no package called ‘biology’
Mbargraph(dataframe1$NestGroupSize, dataframe1$VegetationType2, symbols = c("A", "AB", "B", "AB"), ylab = "Mean number of nests", xlab = "Vetgetation type")
Error in Mbargraph(dataframe1$NestGroupSize, dataframe1$VegetationType2, :
could not find function "Mbargraph"
If there is another option, please let me know.

andresrcs · April 18, 2020, 3:47pm

It seems the biology package is not available on CRAN or any mainstream package source. Look in your documentation to see if you can find a reference to the source or the author.
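One way to confirm this from the R console is to check whether the package is installed before attaching it. This is only a minimal sketch in base R; the wording of the message is illustrative.

```r
# Check whether a package is installed before calling library().
# requireNamespace() returns FALSE instead of raising an error.
pkg <- "biology"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  library(pkg, character.only = TRUE)
} else {
  message("Package '", pkg, "' is not installed; check where the book says to get it.")
}
```

If `installed` is `FALSE`, any function that the package would provide (such as `Mbargraph`) will also be reported as not found.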

Nielo · April 18, 2020, 6:00pm

Good evening Andresrcs,
I got it from Biostatistical Design and Analysis Using R by Murray Logan (see attachment), page 268.

If you have another option to plot or illustrate the differences between factor levels from an ANOVA, please let me know.
Thanks

FJCC · April 18, 2020, 7:00pm

I took the data available for the book and made a graph with the following code. Most of the code is just wrangling the data to facilitate the layout in your example.

DF <- read.csv("c:/users/fjcc/Documents/R/Play/medley.csv")
FIT <- lm(DIVERSITY ~ ZINC, data = DF)
library(broom)
TidyFit <- tidy(FIT)
TidyFit
#> # A tibble: 4 x 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)   1.80       0.165    10.9   5.81e-12
#> 2 ZINCHIGH     -0.520      0.226    -2.29  2.89e- 2
#> 3 ZINCLOW       0.235      0.233     1.01  3.21e- 1
#> 4 ZINCMED      -0.0797     0.226    -0.352 7.27e- 1
TidyFit$term <- c("Back", "High", "Low", "Med")
TidyFit$term <- factor(TidyFit$term, levels = c("High", "Med", "Low", "Back"), ordered = TRUE)
Offset <- TidyFit[1, "estimate", drop = TRUE]
TidyFit[2:4, "estimate"] <- TidyFit[2:4, "estimate"] +  Offset
TidyFit$Symbols <- c("AB", "A", "B", "AB")
TidyFit
#> # A tibble: 4 x 6
#>   term  estimate std.error statistic  p.value Symbols
#>   <ord>    <dbl>     <dbl>     <dbl>    <dbl> <chr>  
#> 1 Back      1.80     0.165    10.9   5.81e-12 AB     
#> 2 High      1.28     0.226    -2.29  2.89e- 2 A      
#> 3 Low       2.03     0.233     1.01  3.21e- 1 B      
#> 4 Med       1.72     0.226    -0.352 7.27e- 1 AB
library(ggplot2)
ggplot(TidyFit, aes(term, estimate)) + geom_col(fill = "grey50") +
  geom_errorbar(aes(ymax = estimate + std.error, ymin = estimate - std.error), width = 0.2) +
  geom_text(aes(y = estimate + std.error, label = Symbols), vjust = -1) +
  ylim(0, 2.4) + labs(x = "Zinc Concentration", y = "Mean Diatom Diversity")

Created on 2020-04-18 by the reprex package (v0.3.0)
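The step that adds the intercept to the other estimates relies on how `lm()` codes a factor by default: the intercept is the mean of the reference level, and each other coefficient is a difference from it. The sketch below checks this on R's built-in `PlantGrowth` data, which is an assumption standing in for the book's data.

```r
# Built-in data: plant weights for three groups (ctrl, trt1, trt2).
fit <- lm(weight ~ group, data = PlantGrowth)
coefs <- coef(fit)

# The intercept is the mean of the reference level ("ctrl");
# every other coefficient is a difference from that mean.
means_from_fit <- c(coefs[1], coefs[1] + coefs[-1])
names(means_from_fit) <- levels(PlantGrowth$group)

# Group means computed directly from the data, for comparison.
means_direct <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)

# Compare the two sets of means, ignoring names.
all.equal(unname(means_from_fit), as.vector(means_direct))
```

Note that in the coefficient table the `std.error` of a non-intercept row is the standard error of the difference from the reference level, not the standard error of that group's mean, so error bars built from it should be read with that in mind.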

Nielo · April 18, 2020, 10:11pm

Thank you, FJCC, for the suggestion. However, when I apply it to my own data, it doesn't work. Instead I get the following result:

> FIT <- lm(NestGroupSize ~ VegetationType2, data = dataframe1)
> library(broom)
> TidyFit <- tidy(FIT)
> TidyFit
# A tibble: 4 x 5
  term                        estimate std.error statistic  p.value
  <chr>                          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                    3.81      0.267    14.3   1.35e-30
2 VegetationType2Open forest     1.14      0.447     2.55  1.16e- 2
3 VegetationType2Wooding sav~    0.332     0.461     0.720 4.73e- 1
4 VegetationType2Secondary f~   -0.815     2.42     -0.337 7.37e- 1
> TidyFit$term <- c("Mature forest", " Wooding savannah", " Open forest", "Secondary forest", "Grass savannah", "Swampy Zone"," Semi-Swampy zone")
Error in `$<-.data.frame`(`*tmp*`, term, value = c("Mature forest", " Wooding savannah",  : 
  replacement has 7 rows, data has 4
> TidyFit$term <- factor(TidyFit$term, levels = c("Mature forest", "Wooding savannah", " Open forest", "Secondary forest", "Grass savannah", "Swampy Zone", "Semi-Swampy zone"), ordered = TRUE)
> Offset <- TidyFit[1, "estimate", drop = TRUE]
> TidyFit[4:7, "estimate"] <- TidyFit[4:7, "estimate"] +  Offset
> TidyFit$Symbols <- c( "A", "B", "C","AB","BA","CB", "AC")
> TidyFit
# A tibble: 7 x 6
  term  estimate std.error statistic   p.value Symbols
* <ord>    <dbl>     <dbl>     <dbl>     <dbl> <chr>  
1 NA       3.81      0.267    14.3    1.35e-30 A      
2 NA       1.14      0.447     2.55   1.16e- 2 B      
3 NA       0.332     0.461     0.720  4.73e- 1 C      
4 NA       3.00      2.42     -0.337  7.37e- 1 AB     
5 NA      NA        NA        NA     NA        BA     
6 NA      NA        NA        NA     NA        CB     
7 NA      NA        NA        NA     NA        AC

I would like to restate that my goal is to compare the factor means from the ANOVA in a bar graph annotated with letters. Factor levels that share a letter in the graph attached previously are not significantly different from each other.
Thanks

assignUser · April 18, 2020, 11:19pm

The package is available from the author's homepage: http://users.monash.edu.au/~murray/BDAR/
But be warned! This package is 10+ years old and was written for a very out-of-date version of R.
So do not expect it to work flawlessly with a modern setup!

assignUser · April 18, 2020, 11:20pm

Please use reprex to properly format your output, otherwise it is difficult to follow:

FAQ: What's a reproducible example (`reprex`) and how do I create one?

Why reprex? Getting unstuck is hard. Your first step here is usually to create a reprex, or reproducible example. The goal of a reprex is to package your code, and information about your problem so that others can run it and feel your pain. Then, hopefully, folks can more easily provide a solution.

What's in a Reproducible Example? Parts of a reproducible example:

- background information - Describe what you are trying to do. What have you already done?
- complete set up - include any library() calls and data to reproduce your issue.
- data for a reprex: Here's a discussion on setting up data for a reprex
- make it run - include the minimal code required to reproduce your error on the data…
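For the "data" part, a common approach is to paste the output of `dput()` for a small slice of the data frame. The sketch below uses the built-in `PlantGrowth` data as a stand-in for your own data; with the reprex package installed, copying a script and calling `reprex::reprex()` then renders the code together with its output.

```r
# Share a small piece of data as code that others can paste and run.
small <- head(PlantGrowth, 4)

# dput() prints an R expression that rebuilds the object.
dput(small)

# Round trip: capture the dput() text and rebuild the data frame from it.
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, small)
```

Pasting the `dput()` text into a post lets readers recreate the same data frame without needing the original file.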

Nielo · April 19, 2020, 11:05am

Thanks, assignUser, for the biology package. I got it, but as you said it is not compatible with my R version (3.6).
As for the reprex, I don't understand how it works. However, here is my problem.
I surveyed a forest to count ape nests, and these nests fell in seven vegetation types (which I treated as the factor levels for this variable). ANOVA helped me determine whether there are significant differences between the vegetation types, and the Tukey test helped me compare each vegetation type with the others. I now want to show these differences in a graph like the one from Murray's book above. Unfortunately, the procedure Murray used did not work for me. In his book, Murray suggests loading the package with library(biology) and running the script Mbargraph(medley$DIVERSITY, medley$ZINC, symbols = c("A", "AB","B", "AB"), ylab = "Mean diatom diversity",xlab = "Zinc concentration")
Is there any option that can help me reach my goal? Looking forward to hearing from you.
Thanks
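One possible option that needs no extra package is sketched below: fit the ANOVA with `aov()`, run `TukeyHSD()`, and draw the means with base graphics. It uses the built-in `PlantGrowth` data as a stand-in. As with the `symbols` argument of `Mbargraph`, the letters are typed in by hand after reading the Tukey table, so the `symbols` vector here is an assumption tied to this example data.

```r
# Built-in example data; replace with your own response and factor.
dat <- PlantGrowth

# One-way ANOVA followed by Tukey's HSD post-hoc test.
fit <- aov(weight ~ group, data = dat)
tukey <- TukeyHSD(fit)
print(tukey)

# Group means and standard errors of the mean.
means <- tapply(dat$weight, dat$group, mean)
ses <- tapply(dat$weight, dat$group, function(x) sd(x) / sqrt(length(x)))

# Letters are supplied by hand after reading the Tukey table, one per
# factor level, in the order of levels(dat$group). For this data set
# only trt2 vs trt1 differs at the 5% level.
symbols <- c("AB", "A", "B")
stopifnot(length(symbols) == nlevels(dat$group))

# Bar graph with error bars and letters above each bar.
top <- max(means + ses)
mids <- barplot(means, ylim = c(0, top * 1.2), col = "grey70",
                xlab = "Group", ylab = "Mean dried weight")
arrows(mids, means - ses, mids, means + ses, angle = 90, code = 3, length = 0.05)
text(mids, means + ses, labels = symbols, pos = 3)
```

For the nest data, `weight` and `group` would become `NestGroupSize` and `VegetationType2`, and `symbols` needs exactly one entry per vegetation type that actually appears in the data.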

nirgrahamuk · April 19, 2020, 11:12am

I have no idea what you are trying to do with this. You have a 4-row data frame as a result of fitting your model, and you then try adding a 7-row column to it? What and why...
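The length mismatch can be reproduced on a small example. The sketch below uses the built-in `PlantGrowth` data (an assumption, not the nest data) and then shows one way to derive the labels from the model itself so that the lengths always agree.

```r
# Coefficient table from a model with a 3-level factor.
fit <- lm(weight ~ group, data = PlantGrowth)
coef_table <- data.frame(term = names(coef(fit)),
                         estimate = unname(coef(fit)),
                         stringsAsFactors = FALSE)
nrow(coef_table)  # one row per coefficient

# Assigning more labels than there are rows fails with an error.
too_many <- c("a", "b", "c", "d", "e", "f", "g")
result <- tryCatch({
  coef_table$term <- too_many
  "assigned"
}, error = function(e) conditionMessage(e))
result

# Deriving labels from the term names keeps the lengths in step.
labels <- sub("^group", "", coef_table$term)   # strip the factor name prefix
labels[1] <- levels(PlantGrowth$group)[1]      # intercept = reference level
coef_table$term <- labels
coef_table
```

The same check, `nrow()` of the table against the length of the label vector, would have flagged the problem before the later steps produced rows of `NA`.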

assignUser · April 19, 2020, 11:17am

You seem to lack some fundamentals of R (no offence, we all need to start somewhere). I would suggest you read this (free!) book:

Nielo · April 20, 2020, 4:38am

Thanks for the book. It looks interesting for beginners like us. I really appreciate it.

Nielo · April 20, 2020, 11:57am

Hello, dear nirgrahamuk,
I surveyed a forest to count ape nests, and these nests fell in seven vegetation types (which I treated as the factor levels for this variable). ANOVA helped me determine whether there are significant differences between the vegetation types, and the Tukey test helped me compare each vegetation type with the others. I am trying to show these differences in a graph, as Logan did in the example of zinc concentration and diatom diversity.

nirgrahamuk · April 20, 2020, 12:08pm

fair enough, but your results:

  term                        estimate std.error statistic  p.value
  <chr>                          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                    3.81      0.267    14.3   1.35e-30
2 VegetationType2Open forest     1.14      0.447     2.55  1.16e- 2
3 VegetationType2Wooding sav~    0.332     0.461     0.720 4.73e- 1
4 VegetationType2Secondary f~   -0.815     2.42     -0.337 7.37e- 1

Don't they show that you only have 3 values of VegetationType2 with estimates different from the base case (Intercept)? So at most you can plot 4 different bars, not 7.
(Well, I suppose you could plot 7, but there would be 4 identical bars differing only by the VegetationType2 label, and not by prediction.)
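A quick way to see how many bars the model can support is to count the factor levels present in the data and compare that with the number of coefficients. This is a sketch on the built-in `PlantGrowth` data; for the nest data the column would be `dataframe1$VegetationType2`.

```r
# Built-in data with a 3-level factor.
dat <- PlantGrowth

# How many levels are defined, and how many rows fall in each?
nlevels(dat$group)
table(dat$group)

# With the default coding, the model has one coefficient per level:
# the intercept (reference level) plus one difference per other level.
fit <- lm(weight ~ group, data = dat)
length(coef(fit)) == nlevels(dat$group)

# If one level has no rows, the model has one coefficient fewer.
sub_dat <- droplevels(dat[dat$group != "trt2", ])
fit_sub <- lm(weight ~ group, data = sub_dat)
length(coef(fit_sub))
```

If `table()` on the vegetation column shows only four types with data, then four bars (and four letters) is what the fitted model can describe.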

system · May 11, 2020, 12:09pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.