Jess_R
July 3, 2020, 12:20am
1
I am trying to create a population pyramid using the following code:
HH_rep$AGEcut <- cut(HH_rep$age,seq(0,100,5))
ggplot(data=HH_rep,aes(x= AGEcut,fill=sex)) +
geom_bar(data=subset(HH_rep,sex=="female")) +
geom_bar(data=subset(HH_rep,sex=="male"),aes(y=..count..*(-1))) +
scale_y_continuous(breaks=seq(-1000,1000,200),labels=abs(seq(-1000,1000,200))) +
coord_flip()
I get a pyramid plot (see image).
However, each time the 5-10 yrs age group jumps into the middle of the graph instead of appearing in sequential order. Even after releveling, the same problem occurs. Any idea how I can overcome this issue?
FJCC
July 3, 2020, 12:28am
2
Can you share your code? If I use cut() to bin an Age variable, I get the bins in the correct order.
library(ggplot2)
DF <- data.frame(Age = runif(n =500, min = 0, max = 100))
DF$Agebin <- cut(x = DF$Age, breaks = seq(0, 100, 5))
ggplot(DF, aes(Agebin)) + geom_bar() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
Created on 2020-07-02 by the reprex package (v0.3.0)
HH_rep$AGEcut <- cut(HH_rep$age,seq(0,100,5))
ggplot(data=HH_rep,aes(x= AGEcut,fill=sex)) +
geom_bar(data=subset(HH_rep,sex=="female")) +
geom_bar(data=subset(HH_rep,sex=="male"),aes(y=..count..*(-1))) +
scale_y_continuous(breaks=seq(-1000,1000,200),labels=abs(seq(-1000,1000,200))) +
coord_flip()
FJCC
July 3, 2020, 1:34am
4
This code produces a correctly ordered y axis for me. Does it work for you?
library(ggplot2)
HH_rep <- data.frame(Age = runif(n =500, min = 0, max = 100),
sex = sample(c("female", "male"), 500, replace = TRUE))
HH_rep$AGEcut <- cut(x = HH_rep$Age, breaks = seq(0, 100, 5))
ggplot(data=HH_rep,aes(x= AGEcut,fill=sex)) +
geom_bar(data=subset(HH_rep,sex=="female")) +
geom_bar(data=subset(HH_rep,sex=="male"),aes(y=..count..*(-1))) +
scale_y_continuous(breaks=seq(-1000,1000,200),labels=abs(seq(-1000,1000,200))) +
coord_flip()
Hi FJCC,
Thanks, yes, it is ordered now, but the structure has changed. Now (I think) it shows the percentage of men and women in each age category, when I would like the count of men and women in each age group from the whole sample. The plot should look like a pyramid, with fewer men and women in the older age groups.
FJCC
July 3, 2020, 3:13am
6
The data in my example are just random numbers so you should not worry that the shape is not what you expect. We now have to figure out why the categories are not correctly ordered with your data.
What is the result of running
str(HH_rep)
on your original HH_rep data frame?
How is HH_rep made? Do you read it in from a file?
HH_rep is read in from an Excel file.
When I run str(HH_rep), this is what I get for the variables of interest:
age : num 44 43 18 8 13 41 32 11 5 1 ...
sex : chr "male" "female" "male" "female" ...
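A side note on the binning itself, offered as a sketch and not as a diagnosis: with its defaults, cut() builds intervals that are open on the left, so an age of exactly 0 falls outside "(0,5]" and becomes NA. Whether HH_rep contains any such ages is not known from this thread; the vector `ages` below is made up purely for illustration.

```r
# Made-up ages, including both boundary values 0 and 100
ages <- c(0, 5, 10, 100)

# Default behaviour: intervals are (a, b], so 0 is not in any bin
default_bins <- cut(ages, breaks = seq(0, 100, 5))
print(default_bins)

# Left-closed intervals [a, b); include.lowest = TRUE also closes the
# last interval on the right, so 100 is still binned
closed_left <- cut(ages, breaks = seq(0, 100, 5),
                   right = FALSE, include.lowest = TRUE)
print(closed_left)

# Count how many values each variant failed to bin
sum(is.na(default_bins))
sum(is.na(closed_left))
```

Running sum(is.na(HH_rep$AGEcut)) on the real data would show whether any rows were dropped into NA by the binning.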
FJCC
July 3, 2020, 2:26pm
8
I cannot see how your original graph is not ordering the levels correctly. I must be missing something. Are the results of the levels() functions the same if you run
DF <- data.frame(Age = runif(n =500, min = 0, max = 100),
sex = sample(c("female", "male"), 500, replace = TRUE))
DF$AGEcut <- cut(x = DF$Age, breaks = seq(0, 100, 5))
levels(DF$AGEcut)
and, using your original HH_rep,
HH_rep$AGEcut <- cut(HH_rep$age,seq(0,100,5))
levels(HH_rep$AGEcut)
Hi FJCC my apologies for the delay in responding.
If I run the code you have provided above using my dataset, the age groupings are in the correct order. So I suspect it is something to do with my code for the population pyramid. Do you think it is creating a mode or something in the graph? It seems to put 5-10 yrs in the middle, and that is the age group with the most people in it.
FJCC
July 8, 2020, 2:16am
10
Did you try the following? What was the result?
FJCC:
Are the results of the levels() functions the same if you run
DF <- data.frame(Age = runif(n =500, min = 0, max = 100),
sex = sample(c("female", "male"), 500, replace = TRUE))
DF$AGEcut <- cut(x = DF$Age, breaks = seq(0, 100, 5))
levels(DF$AGEcut)
and, using your original HH_rep,
HH_rep$AGEcut <- cut(HH_rep$age,seq(0,100,5))
levels(HH_rep$AGEcut)
Jess_R
July 8, 2020, 2:31am
12
This is what I get when I run that
FJCC
July 8, 2020, 2:40am
13
I was looking for something like the following. I simply ran the code and copied what appeared in the console.
> DF <- data.frame(Age = runif(n =500, min = 0, max = 100),
+ sex = sample(c("female", "male"), 500, replace = TRUE))
> DF$AGEcut <- cut(x = DF$Age, breaks = seq(0, 100, 5))
> levels(DF$AGEcut)
[1] "(0,5]" "(5,10]" "(10,15]" "(15,20]" "(20,25]" "(25,30]" "(30,35]" "(35,40]"
[9] "(40,45]" "(45,50]" "(50,55]" "(55,60]" "(60,65]" "(65,70]" "(70,75]" "(75,80]"
[17] "(80,85]" "(85,90]" "(90,95]" "(95,100]"
>
What does running this on your system produce?
HH_rep$AGEcut <- cut(HH_rep$age,seq(0,100,5))
levels(HH_rep$AGEcut)
Jess_R
July 8, 2020, 2:52am
14
> HH_rep <- data.frame(age = runif(n =1000, min = 0, max = 100),
+ sex = sample(c("female", "male"), 1000, replace = TRUE))
> HH_rep$AGEcut <- cut(x = HH_rep$age, breaks = seq(0, 100, 5))
> levels(HH_rep$AGEcut)
[1] "(0,5]" "(5,10]" "(10,15]" "(15,20]" "(20,25]" "(25,30]" "(30,35]" "(35,40]" "(40,45]"
[10] "(45,50]" "(50,55]" "(55,60]" "(60,65]" "(65,70]" "(70,75]" "(75,80]" "(80,85]" "(85,90]"
[19] "(90,95]" "(95,100]"
>
FJCC
July 8, 2020, 3:00am
15
So the factor is ordered correctly. What happens if you run
ggplot(data=HH_rep,aes(x= AGEcut,fill=sex)) +
geom_bar(data=subset(HH_rep,sex=="female"))
If that looks correct, incrementally add features until you determine what causes the problem. I don't know what we will do then, but the problem will be better defined.
Jess_R
July 8, 2020, 3:38am
16
The problem arises at this point: aes(y=..count..*(-1))
FJCC
July 8, 2020, 4:07am
17
Please try this simplified version of the plotting code. I will try to reproduce your problem tomorrow but I am out of time today.
FEMALES <- subset(HH_rep,sex=="female")
MALES <- subset(HH_rep,sex=="male")
ggplot(mapping = aes(x= AGEcut,fill=sex)) +
geom_bar(data = FEMALES) +
geom_bar(data = MALES, aes(y=..count..*(-1))) +
scale_y_continuous(breaks=seq(-1000,1000,200),labels=abs(seq(-1000,1000,200))) +
coord_flip()
Jess_R
July 8, 2020, 4:25am
18
Thanks, it has the same problem.
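The thread ends without a confirmed cause. Since the trouble was traced to the computed aesthetic aes(y=..count..*(-1)), one thing that could be tried is to leave the computed statistic out altogether: count the rows per age bin and sex first, negate the male counts, and draw the result with geom_col(). The sketch below is an untested alternative, not a fix confirmed by anyone in this thread. `HH_demo`, `counts` and `signed` are hypothetical names, and the data are random stand-ins for the real HH_rep.

```r
library(ggplot2)

set.seed(1)
# Random stand-in for the real HH_rep (hypothetical data)
HH_demo <- data.frame(
  age = runif(n = 500, min = 0, max = 100),
  sex = sample(c("female", "male"), 500, replace = TRUE)
)
HH_demo$AGEcut <- cut(HH_demo$age, breaks = seq(0, 100, 5))

# Count rows per age bin and sex; the result has columns AGEcut, sex, Freq,
# and AGEcut keeps the level order produced by cut()
counts <- as.data.frame(table(AGEcut = HH_demo$AGEcut, sex = HH_demo$sex))

# Negative counts for males so their bars extend to the left
counts$signed <- ifelse(counts$sex == "male", -counts$Freq, counts$Freq)

p <- ggplot(counts, aes(x = AGEcut, y = signed, fill = sex)) +
  geom_col() +
  scale_y_continuous(labels = abs) +  # show magnitudes on the axis
  coord_flip()
print(p)
```

Because the counts are ordinary columns here, they can be inspected with print(counts) before plotting, which makes it easier to see whether an odd-looking bar comes from the data or from the plot.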
system
Closed
July 29, 2020, 4:37am
19
This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.