Mahalanobis distance, comparable values - different samples

littlebylittle · July 28, 2022, 1:21pm

Hello,

I calculated the Mahalanobis distance in order to detect "careless responders", in other words people who were not attentive while filling in a questionnaire.

Here's what I've got so far:

mahad(x, plot = TRUE, flag = TRUE, confidence = 0.99, na.rm = TRUE)
x = data set
plot = Plot the resulting QQ graph
flag = Flag potential outliers using the confidence level given in the confidence parameter
confidence = The desired confidence level of the result
na.rm = Should missing data be removed?
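
For readers without the function at hand, here is a minimal base-R sketch of the kind of computation these parameters describe: one squared Mahalanobis distance per respondent, flagged against a chi-square quantile whose degrees of freedom equal the number of items. The data are simulated, the names `items`, `cutoff` and `result` are made up for illustration, and the flagging rule is an assumption about how such a flag is typically derived, not a copy of the function's source.

```r
set.seed(1)

# Simulated stand-in for a questionnaire: 200 respondents, 10 items (hypothetical data)
items <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)

confidence <- 0.99

# Squared Mahalanobis distance of each respondent (row) from the item means
d_sq <- mahalanobis(items, center = colMeans(items), cov = cov(items))

# Assumed flagging rule: chi-square quantile with df = number of items
cutoff <- qchisq(confidence, df = ncol(items))

result <- data.frame(d_sq = d_sq, flagged = d_sq > cutoff)
head(result)
```

Note that this yields one row per respondent, not one value per item.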

#Sample 1 -> =Ver1_items
mahad(Ver1_items, plot = TRUE, flag = TRUE, confidence = 0.99, na.rm = TRUE)

#       d_sq flagged
#1  53.28543   FALSE
#2  59.82937   FALSE
#3  70.93420   FALSE
#4  40.99005   FALSE
#5  61.38863   FALSE
#6  91.87906    TRUE
#7  50.07120   FALSE

# The numbers on the left identify the individual respondents.

The problem now is that I have different samples that filled in different numbers of items. For example, sample 1 got 40 items and sample 2 got 100 items. Person x belongs to sample 1, and 20 of her answers stand out ("TRUE") compared to the sample's mean, so half of her answers suggest that she was inattentive or careless. Person y belongs to sample 2 and also has 20 answers that stand out, but in her case only 20% of her answers suggest carelessness.

I hope this example makes my question clearer. Please excuse my English.
It may also be useful to add that I do not intend to categorize people as "careless" or "not careless", since I intend to use continuous variables. So the column "d_sq" is more important for my analysis than the column "flagged". But that might be unnecessary information...

To compare the Mahalanobis distances, I need the values from the different samples to be comparable, so I might need proportions or something like that. But I'm really struggling to find a function that allows me to do this. I guess Mahalanobis is not the most common function, so I hope there's somebody who can help or has some advice. I would be super grateful, as I'm really struggling with R.

Thanks very much.
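
One possible way to put d_sq on a common scale across samples with different item counts, sketched under assumptions: if the items were roughly multivariate normal, d_sq would approximately follow a chi-square distribution with degrees of freedom equal to the number of items, so it could either be divided by the number of items or converted to an upper-tail chi-square probability. Questionnaire items often violate that normality assumption, so this is a rough illustration rather than a validated procedure. The data below are simulated, and `sample1`, `sample2` and `scaled_distance()` are hypothetical names.

```r
set.seed(42)

# Hypothetical stand-ins for the two samples: 40 items vs. 100 items
sample1 <- matrix(rnorm(300 * 40), ncol = 40)
sample2 <- matrix(rnorm(300 * 100), ncol = 100)

scaled_distance <- function(x) {
  p <- ncol(x)  # number of items in this sample
  d_sq <- mahalanobis(x, center = colMeans(x), cov = cov(x))
  data.frame(
    d_sq          = d_sq,
    d_sq_per_item = d_sq / p,  # distance rescaled by the number of items
    # Upper-tail probability under the assumed chi-square reference distribution
    p_value       = pchisq(d_sq, df = p, lower.tail = FALSE)
  )
}

res1 <- scaled_distance(sample1)
res2 <- scaled_distance(sample2)

# Raw means depend on the item count; per-item means do not
c(mean(res1$d_sq), mean(res2$d_sq))
c(mean(res1$d_sq_per_item), mean(res2$d_sq_per_item))
```

With these simulated data the raw means differ by the ratio of the item counts, while the per-item means coincide. Either rescaled column stays continuous, so it can be used without the "flagged" categorization.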
