Barplots for 4 or more categorical variables in R

Hi,

I'm trying to make a barplot for medications (about 8 of them) by area of speciality.
dm_long=
dput( Area_of_Speciality meds count

1 Internal Medicine New_Metfrornim Yes
2 Internal Medicine New_Sulfonylureas Yes
3 Family Medicine New_Thiazolidinediones Yes
4 Emergency Medicine Asprin No
5 Family Medicine New_GLP1 Yes
6 Internal Medicine Advil No
7 Internal Medicine Insulin Yes
8 Neonatal Medicine ibuprofen Yes
9 Dentistry Sulfonylureas No
10 Internal Medicine Thiazolidinediones No )

This is the dataset, and I want to make barplots only for the meds where count is "Yes".


ggplot(dm_long) +
  geom_col(aes(x = meds, y = count),position = 'dodge') +
  facet_wrap(~ Area_of_Speciality) 

This code gives me the same barplot for every area of speciality. Please help. I hope this way of presenting the data helps. Thank you.

geom_bar() makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use geom_col() instead.
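To make the difference concrete, here is a minimal sketch of the counting behaviour of geom_bar(). The small data frame is a hypothetical stand-in that only mirrors the column names in the question, not the poster's real data.

```r
library(ggplot2)

# Hypothetical toy data using the column names from the question
dm_long <- data.frame(
  Area_of_Speciality = c("Internal Medicine", "Internal Medicine", "Family Medicine",
                         "Emergency Medicine", "Family Medicine", "Internal Medicine"),
  meds  = c("Insulin", "Insulin", "New_GLP1", "Asprin", "New_GLP1", "Advil"),
  count = c("Yes", "Yes", "Yes", "No", "Yes", "No")
)

# geom_bar() counts the rows for each x value itself, so no y aesthetic is mapped;
# fill splits each bar by the Yes/No column
p <- ggplot(dm_long) +
  geom_bar(aes(x = meds, fill = count), position = "dodge") +
  facet_wrap(~ Area_of_Speciality)

print(p)
```

Because the heights come from row counts, each facet can only differ if the rows in that facet differ.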

If this suggestion doesn't solve your issue, then to help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide to see how to create one:


I'm sorry, what exactly is confusing about my post?

I didn't say it was confusing. I explained that it was not reproducible (in a convenient way), and gave you a guide for how to address that.

A few tips on how to ask a good question:

  1. Do not add data files or PDF files.
  2. Use dput to paste a representative example of input (not more than 20 rows).
  3. Enclose the code in back ticks (```)
  4. Add library used at the top of code, so the code works on someone else's R session.

As advised by @nigrahamuk, read the reprex FAQ.
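For tip 2, a minimal sketch of what that looks like in practice. The data frame here is a hypothetical stand-in; replace it with your own object.

```r
# Hypothetical data frame standing in for the real one
dm_long <- data.frame(
  Area_of_Speciality = c("Internal Medicine", "Family Medicine", "Emergency Medicine"),
  meds  = c("Insulin", "New_GLP1", "Asprin"),
  count = c("Yes", "Yes", "No")
)

# Print a copy-pasteable definition of at most the first 20 rows;
# paste the printed structure(...) call into your post
dput(head(dm_long, 20))
```

Anyone can then rebuild the same data frame by assigning the pasted structure(...) call to a name.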

Thank you. I understand now

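I'm not sure whether this is what you want!

I downloaded your data, tried to reproduce the problem, and modified the code according to my understanding; mostly only the final plotting call needed to change.
Plot generated before the modification: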

ggplot(plor_bar) +
  geom_bar(aes(x = Meds, fill = Response),position = 'dodge') + 
  facet_grid(~ Area_of_Speciality)

After the modification above:

I'm not sure whether this is the result you wanted!

If you use geom_bar(), you map only x and not y, because the purpose of geom_bar() is to construct y for you by counting rows; geom_col() instead expects a numeric y that you supply. If you want to drop the records where count is "No", I recommend removing them from the data frame you pass to ggplot via dplyr::filter().
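A minimal sketch of the filtering approach, assuming the column names from the question. The toy data frame is hypothetical, and geom_bar() is used because it does the counting and therefore needs only x.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data using the column names from the question
dm_long <- data.frame(
  Area_of_Speciality = c("Internal Medicine", "Internal Medicine", "Family Medicine",
                         "Emergency Medicine", "Family Medicine", "Internal Medicine"),
  meds  = c("Insulin", "Insulin", "New_GLP1", "Asprin", "New_GLP1", "Advil"),
  count = c("Yes", "Yes", "Yes", "No", "Yes", "No")
)

# Keep only the rows answered "Yes" before plotting
yes_only <- dm_long %>%
  filter(count == "Yes")

# Bar height is the number of "Yes" rows per medication within each facet
p <- ggplot(yes_only) +
  geom_bar(aes(x = meds)) +
  facet_wrap(~ Area_of_Speciality)

print(p)
```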

But I want to plot the proportion of "Yes", not just the count of "Yes", so if I filter I will lose that. How do I go about it, please? Sorry for asking so many questions. I'm new to R.

If you want proportions, I recommend you calculate them. dplyr's group_by() and summarise() are good tools for the task.
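One way this could look, as a sketch only: the toy data and the prop_yes column name are hypothetical, and the column names are taken from the question. Since the proportion is computed beforehand, geom_col() is the fitting geom here because it plots the supplied y values as they are.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data using the column names from the question
dm_long <- data.frame(
  Area_of_Speciality = c("Internal Medicine", "Internal Medicine",
                         "Family Medicine", "Family Medicine", "Emergency Medicine"),
  meds  = c("Insulin", "Insulin", "New_GLP1", "New_GLP1", "Asprin"),
  count = c("Yes", "No", "Yes", "Yes", "No")
)

# Share of "Yes" answers per speciality and medication;
# mean() of a logical vector is the proportion of TRUE values
prop_yes <- dm_long %>%
  group_by(Area_of_Speciality, meds) %>%
  summarise(prop_yes = mean(count == "Yes"), .groups = "drop")

# geom_col() uses the precomputed proportion as the bar height
p <- ggplot(prop_yes) +
  geom_col(aes(x = meds, y = prop_yes)) +
  facet_wrap(~ Area_of_Speciality)

print(p)
```

Keeping the "No" rows in the data is what makes the denominator correct, so nothing is filtered out before summarising.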


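To be honest, I still don't quite understand what you want. Based on your data from last time, this is what I did; I'm not sure whether it is what you had in mind:

# This is last time's data after gather(); is this what your dput output looks like?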

plor_bar <- structure(list(Area_of_Speciality = c("Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine", "Internal Medicine", "Emergency Medicine", 
"Internal Medicine", "Childrens Clinic", "Endocrinology", "Family Medicine", 
"Internal Medicine", "Family Medicine", "Childrens Clinic", "Emergency Medicine", 
"Emergency Medicine"), Meds = c("Asprin", "Asprin", "Asprin", 
"Asprin", "Asprin", "Asprin", "Asprin", "Asprin", "Asprin", "Asprin", 
"Asprin", "Vitamin D", "Vitamin D", "Vitamin D", "Vitamin D", 
"Vitamin D", "Vitamin D", "Vitamin D", "Vitamin D", "Vitamin D", 
"Vitamin D", "Vitamin D", "Tylenol", "Tylenol", "Tylenol", "Tylenol", 
"Tylenol", "Tylenol", "Tylenol", "Tylenol", "Tylenol", "Tylenol", 
"Tylenol", "Advil", "Advil", "Advil", "Advil", "Advil", "Advil", 
"Advil", "Advil", "Advil", "Advil", "Advil", "Cough Syrup", "Cough Syrup", 
"Cough Syrup", "Cough Syrup", "Cough Syrup", "Cough Syrup", "Cough Syrup", 
"Cough Syrup", "Cough Syrup", "Cough Syrup", "Cough Syrup", "Vitamin c", 
"Vitamin c", "Vitamin c", "Vitamin c", "Vitamin c", "Vitamin c", 
"Vitamin c", "Vitamin c", "Vitamin c", "Vitamin c", "Vitamin c", 
"Vitamin k", "Vitamin k", "Vitamin k", "Vitamin k", "Vitamin k", 
"Vitamin k", "Vitamin k", "Vitamin k", "Vitamin k", "Vitamin k", 
"Vitamin k"), Response = c("Yes", "Yes", "Yes", "Yes", "No", 
"Yes", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", 
"Yes", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes", "No", "No", 
"Yes", "Yes", "No", "Yes", "No", "No", "No", "Yes", "No", "No", 
"No", "Yes", "No", "No", "No", "Yes", "No", "Yes", "Yes", "Yes", 
"No", "Yes", "Yes", "No", "Yes", "No", "No", "No", "Yes", "Yes", 
"No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", 
"Yes", "Yes", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", 
"No", "Yes", "Yes")), row.names = c(NA, -77L), class = "data.frame")

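## Plotting: I use a percentage (stacked to 100%) bar chart here, which shows the share of each category. I set the colour of the "No" category to white so it can be ignored.

library(ggplot2)

ggplot(plor_bar) +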
  geom_bar(aes(x = Meds, fill = Response),position = 'fill') +
  facet_wrap(~ Area_of_Speciality) +
  scale_fill_manual(values=c(Yes = "blue", No = "white"))

Screenshot of the result: