Dichotomising Ordinal Categorical Variables When Variables Have Different Levels

Hi All,

Just yesterday, user 'nwerth' offered a very helpful suggestion for using the 'lapply' function to dichotomise a data set with five variables, each taking six possible values.

In that thread, I described a data set called 'trial' with five variables ("L", "M", "N", "O", "P"), each with six levels (0, 1, 2, 3, 4, 5), which I wished to dichotomise: 0 = {0,1,2}, 1 = {3,4,5}.

trial[] <- lapply(trial, factor, levels = 0:5, labels = c(1, 1, 1, 2, 2, 2))

Yet suppose now that some of the variables in my data set did not have six levels, but instead had three levels: {1,2,3}.

It follows that if I wished to dichotomise L as 0 = {0,1,2}, 1 = {3,4,5}, yet also wished to dichotomise M, N, O, P as 0 = {1}, 1 = {2,3}, a single set of labels could not be applied to every variable: the value 2 would have to map to 0 for L but to 1 for M, N, O, P.

I wondered therefore if anyone knew how to overcome this problem and recode each variable independently, according to its own levels?

Any help would be appreciated.



Levels are an attribute of each variable object. As a consequence, dichotomising each variable requires separate treatment. Variables that share the same levels can be handled together with an appropriately composed function.
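As a minimal sketch of what such a function might look like: the data frame, the `cuts` vector and the `dichotomise` helper below are all made up for illustration, and the columns are assumed to be numeric (factors would first need converting with `as.numeric(as.character(x))`). Each variable gets its own cut point, so L and M are recoded independently.

```r
# Hypothetical example data: L takes values 0-5, M takes values 1-3.
trial <- data.frame(
  L = c(0, 1, 2, 3, 4, 5),
  M = c(1, 2, 3, 1, 2, 3)
)

# Hypothetical per-variable cut points:
# values at or below the cut become 0, values above it become 1.
cuts <- c(L = 2, M = 1)

# Illustrative helper, not from any package.
dichotomise <- function(x, cut) as.integer(x > cut)

# Apply each variable's own cut point to its column.
trial[names(cuts)] <- Map(dichotomise, trial[names(cuts)], cuts)

trial
```

Variables that share a rule (here, N, O and P would follow M) only need the same cut point repeated in `cuts`.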

That is the immediate point. The more important point, though, is that focusing on the subsidiary question of how obscures the principal question of what.

In what sense is a level of one variable comparable to the level of another? Suppose the variables are unisex garments and the levels represent sizes: XS, S, M, L, XL, XXL for one set and S, M, L for another. To put them on a common basis, some information contained in the six-level variables must be discarded by combining XS with S, and XL and XXL with L. Alternatively, some information must be imputed to the three-level variables by creating levels with no instances.
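To make the two options concrete, here is a small sketch in base R using invented garment data (the vectors `six` and `three` are hypothetical). The first option collapses the six-level factor onto the three-level scale; the second keeps all six levels on the three-level factor, leaving the extra ones empty.

```r
# Hypothetical garment sizes recorded on the six-level scale.
size_levels <- c("XS", "S", "M", "L", "XL", "XXL")
six <- factor(c("XS", "S", "M", "L", "XL", "XXL", "M"), levels = size_levels)

# Option 1: discard information by merging XS into S, and XL and XXL into L.
collapsed <- six
levels(collapsed) <- list(S = c("XS", "S"), M = "M", L = c("L", "XL", "XXL"))

# Hypothetical garment sizes recorded on the three-level scale.
three <- factor(c("S", "M", "L", "L"), levels = c("S", "M", "L"))

# Option 2: give the three-level variable all six levels;
# XS, XL and XXL exist as levels but have no observations.
expanded <- factor(three, levels = size_levels)

table(collapsed)
table(expanded)
```

Which option is appropriate depends on what the levels mean, which is the "what" question rather than the "how".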

A problem like this benefits from a concrete example. See the FAQ: How to do a minimal reproducible example (reprex) for beginners. This will allow the general solution to be approached inductively rather than deductively, and provide a principled basis on which to evaluate how well or poorly it performs against expectation.

Thanks Richard,

Appreciate your feedback. I managed to use the 'car' package to resolve it, yet thanks again for your input.
