Hi there. I am new to R and am trying to learn it by jumping in with a project. Here is a simplified version of the idea (this is not my actual project, just an easier example). I asked almost 400,000 people whether or not they drink tea. In addition to tea drinking, I recorded several demographic characteristics for each person, such as country, gender, and ethnicity. All of the roughly 400,000 responses have been binned and placed into a summary table of counts. I would like to figure out which of these demographics are associated with tea drinking. Some example data is below. How would I go about fitting a regression or goodness-of-fit model based on the summary table? Thanks in advance for any help.

Location | Tea use: Yes | Tea use: No |
---|---|---|
Australia | 40 | 388 |
Canada | 4219 | 13959 |
London | 68719 | 150043 |
United States | 20573 | 141608 |

Gender | Tea use: Yes | Tea use: No |
---|---|---|
Male | 49376 | 164396 |
Female | 44176 | 141602 |
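
For reference, here is a minimal sketch of one way counts like these could be analysed in base R. It is only an illustration under stated assumptions, not a definitive analysis: the data frame and column names (`location`, `gender`, `yes`, `no`) are made up for this example, and the counts are typed in by hand from the tables above. It shows a chi-squared test of independence on the location table, and a logistic regression fitted to grouped counts by giving `glm()` a two-column response of (yes, no).

```r
# Counts typed in from the summary tables (object and column names are
# illustrative assumptions, not part of the original data set).
location <- data.frame(
  location = c("Australia", "Canada", "London", "United States"),
  yes = c(40, 4219, 68719, 20573),
  no  = c(388, 13959, 150043, 141608)
)
gender <- data.frame(
  gender = c("Male", "Female"),
  yes = c(49376, 44176),
  no  = c(164396, 141602)
)

# Chi-squared test of independence between location and tea use.
loc_counts <- as.matrix(location[, c("yes", "no")])
rownames(loc_counts) <- location$location
loc_test <- chisq.test(loc_counts)
print(loc_test)

# Logistic regression on grouped data: the response is a two-column
# matrix of (number of "yes", number of "no"), one row per group.
loc_fit <- glm(cbind(yes, no) ~ location, family = binomial, data = location)
print(summary(loc_fit))

# Exponentiated coefficients are odds ratios relative to the baseline
# level, which is the first level in sorted order (Australia here).
print(exp(coef(loc_fit)))

# The same kind of model for gender (baseline level: Female).
gender_fit <- glm(cbind(yes, no) ~ gender, family = binomial, data = gender)
print(exp(coef(gender_fit)))
```

One caveat with this sketch: separate one-way tables only allow each characteristic to be examined on its own. A model that adjusts for several characteristics at once, for example `cbind(yes, no) ~ location + gender`, would need the counts cross-classified by every combination of those characteristics (one row per location and gender combination), which the tables above do not provide.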