Bayes Decision Boundary

ggplot2

#1

I want to plot the Bayes decision boundary for data that I generated, with 2 predictors and 3 classes, where every class has the same covariance matrix. Can anyone help me with that?

Here is the data I have:

library(MASS)  # provides mvrnorm()

set.seed(123)
x1 = mvrnorm(50, mu = c(0, 0), Sigma = matrix(c(1, 0, 0, 3), 2))
x2 = mvrnorm(50, mu = c(3, 3), Sigma = matrix(c(1, 0, 0, 3), 2))
x3 = mvrnorm(50, mu = c(1, 6), Sigma = matrix(c(1, 0, 0, 3), 2))


#2

You should include a reproducible example, called a reprex, because data don't have Bayes decision boundaries; Bayesian models have them.


#3

See the scripts for MASS chapter 12 here. The predplot function shows how to take a grid of posterior probabilities for each class and make a contour plot. It uses base graphics, but you can make a better version using ggplot2 pretty easily.
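For what it's worth, here is a minimal ggplot2 sketch of the grid idea. It is not the predplot code itself: it takes the means and covariance from the question as the true parameters, and it assumes equal class priors (the question does not state them). The names `grid`, `post`, `margins` and `p_post` are just illustrative.

```r
library(ggplot2)

# Parameters copied from the question; equal class priors are an assumption.
mus   <- list(c(0, 0), c(3, 3), c(1, 6))
Sigma <- matrix(c(1, 0, 0, 3), 2)
Sinv  <- solve(Sigma)

# Regular grid covering the region where the three classes live.
grid <- expand.grid(x = seq(-4, 7, length.out = 150),
                    y = seq(-6, 12, length.out = 150))
G <- as.matrix(grid)

# Log density of each class, up to a constant shared by all classes.
log_dens <- sapply(mus, function(mu) {
  D <- sweep(G, 2, mu)                 # centre the grid at mu
  -0.5 * rowSums((D %*% Sinv) * D)     # -0.5 * squared Mahalanobis distance
})

# Posterior probabilities via a numerically stable softmax over classes.
post <- exp(log_dens - apply(log_dens, 1, max))
post <- post / rowSums(post)

# For class k, post_k - max(other posteriors) is zero exactly on its boundary.
margins <- do.call(rbind, lapply(1:3, function(k) {
  data.frame(grid, class = paste("class", k),
             margin = post[, k] - apply(post[, -k, drop = FALSE], 1, max))
}))

# One zero-level contour per class; together they trace the decision boundary.
p_post <- ggplot(margins, aes(x, y)) +
  geom_contour(aes(z = margin, colour = class), breaks = 0) +
  labs(title = "Bayes boundaries from a grid of posteriors")
```

Printing `p_post` draws the plot. The same pattern should work with posteriors from a fitted model (for example the `posterior` component returned by `predict()` on a `MASS::lda` fit) in place of `post`.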


#4

Hi, Richard,

Initially I thought you were right, but on second thought, Maria_s actually did give us the model, i.e., the data-generating process, not just a data sample. By writing out that her data are conditionally normally distributed within each class, and that the covariance matrix is the same for every class, she's basically saying that the Bayes classifier (the classifier that minimizes the probability of misclassification) is LDA. See https://en.wikipedia.org/wiki/Linear_discriminant_analysis

x1 = mvrnorm(50, mu = c(0, 0), Sigma = matrix(c(1, 0, 0, 3), 2))
x2 = mvrnorm(50, mu = c(3, 3), Sigma = matrix(c(1, 0, 0, 3), 2))
x3 = mvrnorm(50, mu = c(1, 6), Sigma = matrix(c(1, 0, 0, 3), 2))

So now it's just a matter of plotting the linear boundaries - probably the code linked by @Max can be easily rewritten in ggplot2.
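A rough sketch of how that could look, assuming equal class priors (not stated in the question) and using the true means and covariance rather than estimates. With a shared covariance, class k has the linear score delta_k(x) = x' Sinv mu_k - 0.5 mu_k' Sinv mu_k, and the boundary between two classes is the line where their scores are equal. The helper names (`bnd`, `rays`, `p_bayes`) are made up for this example.

```r
library(MASS)     # mvrnorm()
library(ggplot2)

set.seed(123)
mus   <- list(c(0, 0), c(3, 3), c(1, 6))
Sigma <- matrix(c(1, 0, 0, 3), 2)
Sinv  <- solve(Sigma)

# Same draws as in the question, stacked into one data frame.
sim <- do.call(rbind, lapply(1:3, function(k) {
  data.frame(mvrnorm(50, mu = mus[[k]], Sigma = Sigma),
             class = factor(k, levels = 1:3))
}))
names(sim)[1:2] <- c("x", "y")

# Linear scores: delta_k(x) = sum(w[, k] * x) + b[k]  (equal priors assumed).
w <- sapply(mus, function(mu) Sinv %*% mu)
b <- sapply(mus, function(mu) -0.5 * sum(mu * (Sinv %*% mu)))

# Pairwise boundaries a1 * x + a2 * y = c0, from delta_j(x) = delta_k(x).
pairs <- combn(3, 2)                    # columns: (1,2), (1,3), (2,3)
bnd <- data.frame(
  pair = apply(pairs, 2, paste, collapse = " vs "),
  a1 = w[1, pairs[1, ]] - w[1, pairs[2, ]],
  a2 = w[2, pairs[1, ]] - w[2, pairs[2, ]],
  c0 = b[pairs[2, ]] - b[pairs[1, ]]
)

# The three lines meet in one point; solve for it using the first two.
p0 <- as.numeric(solve(as.matrix(bnd[1:2, c("a1", "a2")]), bnd$c0[1:2]))

# Keep only the half of each line on which the left-out class loses.
third <- c(3, 2, 1)
rays <- do.call(rbind, lapply(1:3, function(i) {
  d <- c(bnd$a2[i], -bnd$a1[i])         # direction along line i
  d <- d / sqrt(sum(d^2))
  s <- -sign(sum((w[, third[i]] - w[, pairs[1, i]]) * d))
  data.frame(pair = bnd$pair[i], x = p0[1], y = p0[2],
             xend = p0[1] + 20 * s * d[1], yend = p0[2] + 20 * s * d[2])
}))

p_bayes <- ggplot(sim, aes(x, y)) +
  geom_point(aes(colour = class)) +
  geom_segment(data = rays, aes(xend = xend, yend = yend)) +
  coord_cartesian(xlim = range(sim$x) + c(-1, 1),
                  ylim = range(sim$y) + c(-1, 1))
```

Printing `p_bayes` shows the simulated points with the three boundary rays. For these parameters the lines work out to y = 6 - 3x (classes 1 and 2), y = 3.25 - 0.5x (classes 1 and 3) and y = 2x + 0.5 (classes 2 and 3), meeting at (1.1, 2.7). Unequal priors would only shift the intercepts by a log prior ratio.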


#5

I agree that all of those implications could be drawn from knowing that MASS::mvrnorm was used to generate the matrices:

             [,1]        [,2]
 [1,] -0.527796456 -0.72467316
 [2,] -0.462147562  0.50428239
 [3,] -0.681117039  1.47324304
 [4,]  2.158980267  1.88261815
 [5,]  1.771674308  3.53047590
 [6,]  1.797841356 -4.80572219
 [7,] -0.896777100 -0.67995561
 [8,]  0.056029156 -2.57703471
 [9,] -0.036139940 -1.68664178
[10,]  1.036021910 -0.82579020

but there's more than one thing you can do with a matrix, of course, and the function is not tied to any particular model. That's why I thought the question should include a reproducible example, called a reprex, to show where the OP was stuck. There are times when it's reasonable to guess what the question is, but as posed, this one is more of a puzzle.


#6

Did you find the answers useful? If you're still stuck, I'll try to write running code during the weekend. No promises though - I'm on a long business trip.


#7

I figured something out! Thank you though.


#9

Hi Maria.

Could you please share the code? I am working on a similar problem and I am unable to plot the decision boundaries. Thank you!