Calculating % of a column with binary values

Thanks, your answers help a lot :grin:.

I probably should have made my advice on this more obvious: we have an FAQ that shows you how to get the formatting right when you're pasting code and output into posts on this site (it's as easy as clicking a button in the posting box). You can find that info here: FAQ: How to format your code
(I linked to it in my message above, but a little obliquely. Sorry! :sweat_smile:)

For now, I went ahead and fixed the formatting in your reply. Without code formatting, the forum software thinks all those dollar signs mean you're trying to write mathematical equations and things get wonky :upside_down_face:.

A good intuition! Beyond being a case of using a sledgehammer to swat a fly, a model is not the same thing as a description of your data.

I also agree that factor manipulation functions are not helpful for your question. A factor is just a fancy integer vector that has a "levels" attribute storing the labels that the integer codes correspond to, so as I think you've realized, there's no benefit to factor-izing Improvement at this point.
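
If you're curious what that looks like under the hood, here's a small sketch. The vector `improvement` is made up for illustration (it is not your actual data), and the point is only to show the codes and the attribute:

```r
# Hypothetical 0/1 vector standing in for the Improvement column
improvement <- c(1, 0, 0, 1, 1)

# Turn it into a factor
improvement_f <- factor(improvement)

# The labels live in the "levels" attribute
levels(improvement_f)

# Underneath, the factor is a vector of integer codes
unclass(improvement_f)
as.integer(improvement_f)
```

Notice that the integer codes are positions in the levels (1 and 2 here), not the original 0s and 1s, which is one more reason to leave the column as plain numbers for now.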

Great :grinning: So as a first step, can you express that algorithm in R code? In case you don't know where to start, try running these statements and seeing what happens:

```r
Kaplandata_$Improvement

?length
length(Kaplandata_$Improvement)

Kaplandata_$Improvement == 0

Kaplandata_$Improvement[Kaplandata_$Improvement == 0]
```

How I'd translate the above statements into English sentences:
  • Kaplandata_$Improvement: "Show me the vector 'Improvement' from the data frame 'Kaplandata_'"

  • ?length: "Show me help for the 'length' function"

  • length(Kaplandata_$Improvement): "How many elements are in Improvement?"

  • Kaplandata_$Improvement == 0: "Show me the results of going through each element of Improvement and asking if the element is equal to zero"

  • Kaplandata_$Improvement[Kaplandata_$Improvement == 0]: "Show me the elements of Improvement that are equal to zero", or more verbosely and precisely, "Show me the elements of Improvement where the answer to asking if they are equal to zero is TRUE".
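
If you'd like to try those statements on something small before running them on your real data, here's a rough sketch with a made-up data frame. The name `toy` and its values are assumptions for illustration only; the helper variable `is_zero` is just there to make the intermediate step visible:

```r
# Made-up data frame standing in for Kaplandata_ (values are illustrative only)
toy <- data.frame(Improvement = c(1, 0, 0, 1, 1))

# The whole vector
toy$Improvement

# How many elements it has
length(toy$Improvement)

# One TRUE/FALSE answer per element: "is this element equal to zero?"
is_zero <- toy$Improvement == 0
is_zero

# Keep only the elements where the answer was TRUE
toy$Improvement[is_zero]
```

With only five values you can check each result by eye, which makes it easier to trust the same statements on the full column.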

The direct translation of your pencil-and-paper method might not be the fanciest or most clever way to solve the problem, but it is often a very good starting place because it is a method you understand and can reason about. Plus it's easier to have fun trying other methods when you know you've got at least one thing that works :grinning:.

A hint for going further: given that you're dealing with a column of 1s and 0s, can you think of a simple arithmetic trick that would give you the number of 1s in the column?