To plot a vector, v
, in a way that shows some classification of the values of v
into different categories. The classification derived from some algorithm must be identified to the plot method either explicitly as a separate variable (such as a vector of kmeans clusters) or implicitly provided by algorithm. For example, there's the binning performed by hist()
.
Here's an example of using kmeans clustering of a single continuous variable to create a plot of the mtcars$mpg
variable according to membership in a kmeans cluster. Other interval algorithms can also be specified.
# Load required packages
library(classInt)
library(ggplot2)
# Load the mtcars dataset
data(mtcars)
# Calculate kmeans intervals
num_intervals <- 5
kmeans_intervals <- classIntervals(mtcars$mpg, num_intervals, style = "kmeans")
# Assign each mpg value to its corresponding interval
mtcars$kmeans_group <- cut(mtcars$mpg, breaks = kmeans_intervals$brks, include.lowest = TRUE, labels = FALSE)
# Create the dotplot using ggplot2
ggplot(mtcars, aes(x = factor(kmeans_group), y = mpg)) +
geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 0.5) +
labs(title = "Dotplot of mtcars$mpg Divided into kmeans Intervals",
x = "kmeans Group",
y = "Miles per Gallon (mpg)") +
coord_flip() +
theme_minimal()
#> Bin width defaults to 1/30 of the range of the data. Pick better value with
#> `binwidth`.

Created on 2023-10-02 with reprex v2.0.2