Thank you
this is my code.
data <- c(9,12,10,8,15,
20,21,23,17,30,
6,5,8,16,7)
levels <- factor(rep(c("1","2","3"), each=5))
AA<- data.frame(data,levels)
boxplot(data~levels,AA)
How so? The function did what it was intended to do with the arguments it was given. What are you looking for it to do?
In the second boxplot, the maximum is 30. It should be drawn at the end of the whisker line, just like in the first boxplot. It should not be a point because it is not an outlier. The same thing happened in the third boxplot too. Thank you.
Every problem in R can be considered with advantage as an extension of school algebra: f(x) = y. x is what is at hand, y is what is desired, and f is the object (in R everything is an object) to transform one to the other.
So, in this example, AA is x, y is a plot object and f is the boxplot function.
The function signature (from help(boxplot)) is
boxplot(formula, data = NULL, ..., subset, na.action = NULL,
xlab = mklab(y_var = horizontal),
ylab = mklab(y_var =!horizontal),
add = FALSE, ann = !add, horizontal = FALSE,
drop = FALSE, sep = ".", lex.order = FALSE)
The temptation to move on, rather than unraveling this, is strong. But, for the time being, most of the arguments have defaults (including, particularly, ..., which is always optional). What we are really looking at is no more than formula.
Formula is
a formula, such as y ~ grp, where y is a numeric vector of data values to be split into groups according to the grouping variable grp (usually a factor). Note that ~ g1 + g2 is equivalent to g1:g2.
In terms of AA, the y in the formula description is the first column of AA, and grp is the second.
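To make the formula concrete, here is a minimal sketch of the grouping it implies, using the same values as in the question. The name groups is only illustrative, and split() plus fivenum() is a way of looking at the groups by hand, not a claim about everything boxplot does internally.

```r
# Same values as in the question; dat. avoids masking the built-in data()
dat. <- c(9, 12, 10, 8, 15,
          20, 21, 23, 17, 30,
          6, 5, 8, 16, 7)
levels <- factor(rep(c("1", "2", "3"), each = 5))
AA <- data.frame(dat., levels)

# dat. ~ levels reads as: split dat. by the grouping factor levels
groups <- split(AA$dat., AA$levels)  # 'groups' is an illustrative name
groups

# One five-number summary per group (min, lower hinge, median,
# upper hinge, max), before any outlier handling
sapply(groups, fivenum)
```

Each column of the last result is one group, so the raw maximum of group 2 (30) is still visible at this stage.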
Next, look at the value that boxplot returns:
stats
a matrix, each column contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker for one group/plot. If all the inputs have the same class attribute, so will this component.
So, let's look at it
Where's 30? We know it's in there somewhere. We also know from summary(AA) that it's the maximum value of AA$dat. (the first column).
Go back and look at the stats return value. What's the whisker? It's the vertical line. It spans a range. If 30 is outside the range, it is not included in the whisker. If it is not in the whisker, where is it?
out
the values of any data points which lie beyond the extremes of the whiskers.
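Why 30 falls beyond the whisker can be checked by hand. A rough sketch, assuming the default range = 1.5 documented in help(boxplot): the whisker reaches only to the most extreme point within 1.5 times the hinge spread from the box. The names grp2, five, spread and upper_fence are illustrative only.

```r
# Second group from the question
grp2 <- c(20, 21, 23, 17, 30)

# min, lower hinge, median, upper hinge, max
five <- fivenum(grp2)

# Distance between the hinges (the box height)
spread <- five[4] - five[2]

# Points above this limit are drawn individually, not as part of the whisker
upper_fence <- five[4] + 1.5 * spread
upper_fence

# Values beyond the fence, computed by hand and by boxplot.stats()
grp2[grp2 > upper_fence]
boxplot.stats(grp2)$out
```

So 30 is flagged by the plotting rule, whether or not it is an outlier in any subject-matter sense.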
Keep in mind that boxplot is being asked to draw three boxplots, whose ranges and respective out return values are different. See the end of the reprex below:
# avoid naming objects data, as it is a built-in function name
dat. <- c(9,12,10,8,15,
20,21,23,17,30,
6,5,8,16,7)
levels <- factor(rep(c("1","2","3"), each=5))
AA<- data.frame(dat.,levels)
boxplot(dat. ~ levels, data = AA) -> the_object
str(the_object)
#> List of 6
#> $ stats: num [1:5, 1:3] 8 9 10 12 15 17 20 21 23 23 ...
#> $ n : num [1:3] 5 5 5
#> $ conf : num [1:2, 1:3] 7.88 12.12 18.88 23.12 5.59 ...
#> $ out : num [1:2] 30 16
#> $ group: num [1:2] 2 3
#> $ names: chr [1:3] "1" "2" "3"
the_object$stats
#>      [,1] [,2] [,3]
#> [1,]    8   17    5
#> [2,]    9   20    6
#> [3,]   10   21    7
#> [4,]   12   23    8
#> [5,]   15   23    8
the_object$out
#> [1] 30 16
AA
#>    dat. levels
#> 1     9      1
#> 2    12      1
#> 3    10      1
#> 4     8      1
#> 5    15      1
#> 6    20      2
#> 7    21      2
#> 8    23      2
#> 9    17      2
#> 10   30      2
#> 11    6      3
#> 12    5      3
#> 13    8      3
#> 14   16      3
#> 15    7      3
Created on 2020-10-16 by the reprex package (v0.3.0.9001)
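If the goal is for each whisker to run all the way to the group maximum, one option is the range argument described in help(boxplot): a value of zero makes the whiskers extend to the data extremes. This is a sketch of that option, not something from the reprex above; full_range is an illustrative name.

```r
dat. <- c(9, 12, 10, 8, 15,
          20, 21, 23, 17, 30,
          6, 5, 8, 16, 7)
levels <- factor(rep(c("1", "2", "3"), each = 5))
AA <- data.frame(dat., levels)

# range = 0: whiskers extend to the minimum and maximum of each group
full_range <- boxplot(dat. ~ levels, data = AA, range = 0)

# Upper whisker ends are now the group maxima
full_range$stats[5, ]

# No points are drawn separately
full_range$out
```

Whether that is a good idea depends on the audience: the default display flags points far from the box, and range = 0 hides that information.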