Hello, I have to do a project with the following instructions:
Plot the distribution of the number of transcripts per gene.
By distribution, we expect the density or binned histogram of the univariate number of transcripts per gene.
So far I have written this, but it does not work very well:
library(dplyr)
library(tidyverse)
library(ggplot2)
library(forcats)

gencode %>%
  filter(!is.na(transcript_id)) %>%
  group_by(gene_id) %>%
  summarise(n = n_distinct(transcript_id)) %>%
  ggplot(aes(x = gene_id)) +
  stat_bin(aes(y = "count", label= "count"), geom="text", vjust=-.5) +
  geom_bar(stat="identity")
  geom_histogram(aes(x = n), col ="red", fill = "green", alpha = .2) +
  geom_density(col = 2) +
  labs(title = "Histogram of transcript per gene", x = "gene", y = "transcript")
I would be grateful for any help. Thanks a lot!
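A minimal sketch of one way to get the requested plot, offered as a suggestion rather than the definitive answer. The likely problems in the pipeline are that the x aesthetic is mapped to gene_id instead of the per-gene count n, that the quoted "count" in stat_bin() is a constant string rather than a computed statistic, that geom_bar(stat = "identity") has no y to draw, and that a + is missing before geom_histogram(). The sketch assumes gencode has the columns gene_id and transcript_id; the small data frame below is a hypothetical stand-in so the example runs on its own.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy stand-in for the real `gencode` table.
# Assumed columns: gene_id, transcript_id (NA where a row has no transcript).
gencode <- data.frame(
  gene_id       = c("g1", "g1", "g1", "g2", "g2", "g3", "g3", "g4"),
  transcript_id = c("t1", "t2", "t2", "t3", NA,   "t4", "t5", "t6"),
  stringsAsFactors = FALSE
)

# One row per gene; n is the number of distinct transcripts of that gene.
tx_per_gene <- gencode %>%
  filter(!is.na(transcript_id)) %>%
  group_by(gene_id) %>%
  summarise(n = n_distinct(transcript_id))

# Histogram of n: the x axis is the transcript count,
# the y axis is how many genes have that count.
p <- ggplot(tx_per_gene, aes(x = n)) +
  geom_histogram(binwidth = 1, colour = "red", fill = "green", alpha = 0.2) +
  labs(
    title = "Distribution of the number of transcripts per gene",
    x = "Number of transcripts per gene",
    y = "Number of genes"
  )

print(p)
```

With the real data, drop the toy data frame and keep the rest. Since the counts are integers, binwidth = 1 gives one bar per count value. If a density curve is preferred over bars, geom_density() can be used in place of geom_histogram(), in which case the y axis is a density rather than a number of genes.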