Recoding Continuous Variables into a Categorical Variable with Value Ranges

Hi there,

I wondered if someone could provide some guidance on how to recode a continuous variable into a categorical variable with certain value ranges.

For example, suppose I had a continuous variable 'Score' taken from a data set called 'test'. The continuous variable Score ranges from -1 to 1.

  • For score < 0, I would like to classify it as "critical".
  • For 0 < score < 0.5, I would like to classify it as "poor".
  • And for score > 0.5, I would like to classify it as "good".

Would really appreciate anyone's help on the matter.

Best,

It's a bit of a hack, but...

Score <- runif(10, -1, 1)  ## sample data
Score
Val <- rep("", 10) ## storage for the classification
Val[Score<0] <- "critical"
Val[0<=Score & Score<0.5] <- "poor"
Val[Score>=0.5] <- "good"
Val

Thanks @bloosmore - I appreciate your contribution.

I see you've generated a random distribution for Score. However, would the same procedure work if I used test$Score in place of Score?

If so, how would I define the storage for classification?

Any help would be appreciated.

Sure!

test <- data.frame(Score = runif(10, -1, 1))  ## sample data

test$Val <- rep("", length(test$Score))

test$Val[test$Score<0] <- "critical"
test$Val[0<=test$Score & test$Score<0.5] <- "poor"
test$Val[test$Score>=0.5] <- "good"

test
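The three assignments above can also be collapsed into a single lookup. This is only a sketch, assuming base R's findInterval(), which by default treats intervals as closed on the left and so matches the `<` and `>=` comparisons used above. The fixed scores are made up for illustration so that the boundary values 0 and 0.5 are covered.

```r
## Made-up scores, including the boundary values 0 and 0.5
test <- data.frame(Score = c(-0.8, 0, 0.25, 0.5, 0.9))

## findInterval() counts how many break points are <= each score:
## 0 for Score < 0, 1 for 0 <= Score < 0.5, 2 for Score >= 0.5
idx <- findInterval(test$Score, c(0, 0.5))

## Add 1 to turn those counts into positions in the label vector
test$Val <- c("critical", "poor", "good")[idx + 1]

test
```

Because every score gets an index, no storage vector needs to be created beforehand.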

Thanks. Really appreciate your help on the matter.

One more dumb question, apologies. I don't understand the operational purpose of `rep("", 10)`.

Would you be able to clarify that a little?

Note I simplified the above slightly...

The rep() function creates a character vector of the same length as the original data, initially filled with empty strings (""). It simply creates storage for the subsequent classifications. An alternative approach would be

 test$Val <- vector("character", length(test$Score))
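To see that these are interchangeable, here is a quick check. The length n is a made-up value for illustration, and character() is a third base R spelling of the same thing.

```r
n <- 5  ## hypothetical length, for illustration only

a <- rep("", n)               ## repeat the empty string n times
b <- vector("character", n)   ## character vector of length n
d <- character(n)             ## shorthand for the line above

## All three produce n empty strings
identical(a, b)
identical(b, d)
```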

This is a pretty common operation, so there is actually a function that does it for you:

test <- data.frame(Score = runif(10, -1, 1))  ## sample data
test$Val <- cut(test$Score, 
                breaks = c(-1, 0, 0.5, 1), 
                labels = c("critical","poor","good"),
                right = FALSE, include.lowest = TRUE)  ## [-1,0), [0,0.5), [0.5,1]

This does the same thing as @bloosmore's approach, but cut() also has methods for different types of variables.
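How cut() treats a value that sits exactly on a break depends on its `right` and `include.lowest` arguments. A small sketch comparing the two settings on the boundary values themselves (the vector x is made up for illustration):

```r
x    <- c(-1, 0, 0.5, 1)   ## made-up scores sitting exactly on the breaks
brks <- c(-1, 0, 0.5, 1)
labs <- c("critical", "poor", "good")

## Default (right = TRUE): intervals are (-1,0], (0,0.5], (0.5,1],
## so -1 itself falls outside every interval and becomes NA
default_cut <- cut(x, breaks = brks, labels = labs)

## right = FALSE with include.lowest = TRUE: [-1,0), [0,0.5), [0.5,1],
## which matches Score < 0, 0 <= Score < 0.5, Score >= 0.5
left_closed <- cut(x, breaks = brks, labels = labs,
                   right = FALSE, include.lowest = TRUE)

data.frame(x, default_cut, left_closed)
```

With continuous random data an exact hit on a break is unlikely, but rounded or real-world scores can land there, so it is worth choosing the setting deliberately.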


Thank you. That makes sense :)

I guess that, afterwards, it is possible (or necessary) to convert the character variable Val into a factor/categorical variable using the standard functions?

cut() returns a factor, with the option of returning an ordered factor (ordered_result = TRUE); see ?cut
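For instance, a minimal sketch with made-up scores, using the `ordered_result` argument documented in ?cut:

```r
Score <- c(-0.4, 0.2, 0.8)  ## made-up scores for illustration

Val <- cut(Score,
           breaks = c(-1, 0, 0.5, 1),
           labels = c("critical", "poor", "good"),
           ordered_result = TRUE)

is.ordered(Val)   ## the result is an ordered factor
levels(Val)       ## levels follow the order of the labels

## With an ordered factor, comparisons follow the level order
Val >= "poor"
```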

Thanks, both :) I really appreciate both contributions.

@AC3112 it is helpful to mark this post as solved, so that others know where to devote effort. See

Understood :)

I've marked the solution overall to balance the contributions of both parties. Thanks again to you both for the help.

And yes, if you decide not to use cut() you can always use:

test$Val <- as.factor(test$Val)
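One thing to watch with as.factor() is that it sorts the levels alphabetically, which puts "good" before "poor". If the level order matters (for plots or model output, say), factor() with an explicit `levels` argument is a possible alternative. A sketch with made-up values:

```r
Val <- c("poor", "critical", "good", "poor")  ## made-up classifications

## as.factor() orders the levels alphabetically
levels(as.factor(Val))

## factor() lets you state the order yourself
f <- factor(Val, levels = c("critical", "poor", "good"))
levels(f)
```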

Hi guys,

Sorry to be a pain again.

When I recode this variable in my data set as instructed above, both the original variable Score and the new variable Val exist.

Is there any way to simply have Val replace Score?

Something like:

test <- data.frame(Score = runif(10, -1, 1))  ## sample data
test$Score <- cut(test$Score, 
                breaks = c(-1, 0, 0.5, 1), 
                labels = c("critical","poor","good"),
                right = FALSE, include.lowest = TRUE)  ## [-1,0), [0,0.5), [0.5,1]