Categorical data

thomas25 · April 4, 2019, 9:36am

Hello, I need to calculate the odds for one variable. However, I am not sure whether I calculated and interpreted them correctly. I have categorical data:
-following politics (0=not follow, 1=follow)
-richness(0=poor people, 1=rich people)

The distribution of cases is as follows:
-350 people (not follow; poor people)
-20 people (follow; poor people)
-280 people (not follow; rich people)
-80 people (follow; rich people)

I need to calculate the odds of following politics for the rich people as well as for the poor people, therefore I would calculate it as follows:
-odds of following politics for the rich people: 80/280=0.29
-odds of following politics for the poor people:20/350=0.06

And I am not sure how to interpret these results but I would say that:
People are 0.29 times less likely to follow politics when they are rich.
People are 0.06 times less likely to follow politics when they are poor.

Then, we can also calculate odds ratio as 0.29/0.06=4.83, which means that the odds of following politics are 4.83 times higher when the people are rich than when they are poor.

What do you think? Am I on the right route?

mara · April 4, 2019, 12:54pm

Assuming you're using R for this, could you please turn this into a self-contained reprex (short for reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

install.packages("reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

There's also a nice FAQ on how to do a minimal reprex for beginners, below:

FAQ: How to do a minimal reproducible example (reprex) for beginners


What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ.
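For the counts in the question, a self-contained example could look something like the sketch below. The object name `politics` and the dimension labels are illustrative choices, not something taken from the original post.

```r
# Counts from the question, entered as a 2 x 2 table.
# Object and dimension names are illustrative assumptions.
politics <- matrix(
  c(350, 20,
    280, 80),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    richness = c("poor", "rich"),
    follows  = c("not follow", "follow")
  )
)

politics            # the table itself
rowSums(politics)   # group sizes: poor and rich
```

Pasting a block like this lets everyone start from exactly the same numbers.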

Yarnabrina · April 4, 2019, 2:35pm

Your calculation is right.

But please don't round this much. If you had not rounded the two odds before dividing, you would have got an odds ratio of exactly 5.
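The effect of rounding can be checked directly. This is a minimal sketch using the counts from the question; the variable names are made up for illustration.

```r
# Odds of following politics, computed from the counts in the question.
odds_rich <- 80 / 280
odds_poor <- 20 / 350

# Odds ratio from the unrounded odds.
or_exact <- odds_rich / odds_poor

# Odds ratio after rounding each odds to two decimals first.
or_rounded <- round(odds_rich, 2) / round(odds_poor, 2)

or_exact
or_rounded
```

The first value is 5 and the second is roughly 4.83, so the whole gap comes from rounding the intermediate odds.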

I don't think I've ever come across your phrasing ("0.29 times less likely") in the interpretation of odds. I can't call it wrong, but I'm more familiar with either of the following forms:

When people are rich, following politics is 0.29 times more likely than not following politics

, or, equivalently,

When people are rich, not following politics is 3.5 times more likely than following politics

, as 3.5 = 280 / 80, and it is the odds of not following politics among the rich people.
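As a quick check that the two statements describe the same thing, the odds of not following are simply the reciprocal of the odds of following. A small sketch, with illustrative variable names:

```r
# Counts for the rich group, taken from the question.
follow_rich     <- 80
not_follow_rich <- 280

odds_follow     <- follow_rich / not_follow_rich   # following vs. not following
odds_not_follow <- not_follow_rich / follow_rich   # not following vs. following

odds_follow
odds_not_follow
1 / odds_follow   # same value as odds_not_follow
```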

I don't know which book you're following, but in our college, and in many colleges in India, An Introduction to Categorical Data Analysis by Alan Agresti is used. Let me quote from Section 2.3 of the 2nd edition of that book:

For a probability of success π, the odds of success are defined to be odds = π/(1 − π).

For instance, if π = 0.75, then the odds of success equal 0.75/0.25 = 3.
The odds are nonnegative, with value greater than 1.0 when a success is more
likely than a failure. When odds = 4.0, a success is four times as likely as a failure.
The probability of success is 0.8, the probability of failure is 0.2, and the odds equal
0.8/0.2 = 4.0. We then expect to observe four successes for every one failure. When
odds = 1/4, a failure is four times as likely as a success. We then expect to observe
one success for every four failures.
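The definition in the quote is easy to try out in R. The helper names `prob_to_odds` and `odds_to_prob` below are hypothetical, written only for this illustration; the second one is just the first solved for the probability.

```r
# odds = p / (1 - p), as defined in the quoted passage.
prob_to_odds <- function(p) p / (1 - p)

# Inverse relation: p = odds / (1 + odds).
odds_to_prob <- function(odds) odds / (1 + odds)

# Probabilities used in the quote: 0.75, 0.8 and 0.2.
prob_to_odds(c(0.75, 0.8, 0.2))

# Going back from odds of 3, 4 and 1/4 to probabilities.
odds_to_prob(c(3, 4, 1 / 4))
```

The probabilities 0.75, 0.8 and 0.2 correspond to odds of 3, 4 and 1/4, which matches the numbers in the passage.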

Hope this helps.

PS

From your previous two or three questions, it seems to me that these are all homework related. Though I completely understand that you're not posting any verbatim assignment, I think you should at least be familiar with the following post:

Also, since you're more interested in the interpretation of the results, which has nothing to do with R or RStudio, this forum is probably not an appropriate place to ask these questions (I'm not sure, and I certainly have no authority to say this).

thomas25 · April 5, 2019, 3:52pm

Hey @Yarnabrina, sorry for tagging you, but I have to ask you one thing to clarify. Although your explanation was useful, I am still pretty confused by these odds. The odds in my case were 0.29 and you interpreted them as:

When people are rich, following politics is 0.29 times more likely than not following politics.

Why more likely? Why not less? Odds are less than 1 so I guess that it should be less likely? Or is this applicable only for odds ratio?

Thanks

Yarnabrina · April 5, 2019, 4:34pm

Even if you had replied without tagging me, I would have been notified, unless I had muted this topic specifically. In that case, I probably wouldn't be notified even if you tagged me. So it was unnecessary.

You may go through this relevant post.

FAQ: Should I @name mention other users in my post?

This does not mean that rich people follow politics more than they don't. It means that following politics is less likely than not following politics. Since 0.29 < 1, "0.29 times more likely" already carries, in my opinion, the reverse meaning.
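One way to see this is to put the odds next to the corresponding probability: odds below 1 always go together with a probability below 0.5. A rough sketch with the rich group's counts (the variable names are illustrative):

```r
# Counts for the rich group, taken from the question.
follow     <- 80
not_follow <- 280

p_follow    <- follow / (follow + not_follow)  # share of rich people who follow
odds_follow <- follow / not_follow             # odds of following

p_follow
odds_follow

# Both comparisons agree: following is the less likely outcome.
p_follow < 0.5
odds_follow < 1
```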

English isn't my native language, so I can't say with certainty. But I don't think the following means what you intend it to mean:

People are 0.29 times less likely to follow politics when they are rich.

If you are satisfied that these two mean the same thing, then you can use it by all means.

Since your original problem regarding interpreting the results is solved (irrespective of my phrasing or yours), will you please consider marking this thread as solved?

If you don't know how, please take a look at this:
