create a ggplot using two lists

user124578 · February 23, 2020, 5:06pm

I am trying to create a histogram of two lists using ggplot. At the moment I am using the base function hist(). Can someone please explain how to do this with ggplot? I would like the y axis to show the density.

data1=data.matrix(samples1)
data2=data.matrix(samples_cp)

hist(data1[,4], freq = F, col="blue", main ="1990 Prediction")
hist(data2[,4], freq = F, col="red", add=T)

FJCC · February 23, 2020, 5:42pm

The easiest way to plot the two distributions in ggplot is to combine the two data sets. How to combine them depends on how your data are structured. If you explain more about your data, someone might provide more specific advice.

library(ggplot2)
DF1 <- data.frame(A = rep("Dat1", 20), Value = rnorm(20))
DF2 <- data.frame(A = rep("Dat2", 20), Value = rnorm(20, 0.5, 1))
AllDat <- rbind(DF1, DF2)
ggplot(AllDat, aes(x = Value, fill = A)) + geom_histogram(binwidth = 0.2, position = "dodge")

Created on 2020-02-23 by the reprex package (v0.3.0)
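Since the original question asked for density on the y axis, here is a minimal sketch of the same combined-data approach with the bar heights rescaled to densities. It assumes ggplot2 3.3.0 or later for `after_stat()`; on older versions `..density..` plays the same role. The `set.seed()` call and the object name `p` are additions for illustration only.

```r
library(ggplot2)

set.seed(1)  # only so the toy data are reproducible
DF1 <- data.frame(A = "Dat1", Value = rnorm(20))
DF2 <- data.frame(A = "Dat2", Value = rnorm(20, 0.5, 1))
AllDat <- rbind(DF1, DF2)

# after_stat(density) rescales each group's bars so that its total area is 1
p <- ggplot(AllDat, aes(x = Value, fill = A)) +
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 0.2, position = "dodge")
p
```

Because the density is computed per fill group, the two distributions stay comparable even when the groups have different numbers of rows.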

dromano · February 23, 2020, 7:11pm

Other possibilities might be to mimic adding two histograms, as you did, or to use frequency polygons instead. I'll use @FJCC's definitions of DF1 and DF2 to illustrate.

The first results in one histogram obscuring the other:

DF1 <- data.frame(A = rep("Dat1", 20), Value = rnorm(20))
DF2 <- data.frame(A = rep("Dat2", 20), Value = rnorm(20, 0.5, 1))

library(ggplot2)
ggplot() +
  geom_histogram(aes(Value, y = ..density.., fill = 'DF1'), data = DF1) +
  geom_histogram(aes(Value, y = ..density.., fill = 'DF2'), data = DF2)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

By changing the transparency of the bars, this might be improved, or you could add position = "dodge" as @FJCC does. Frequency polygons allow for a different kind of visibility:

  
ggplot() +
  geom_freqpoly(aes(Value, y = ..density.., color = 'DF1'), data = DF1) +
  geom_freqpoly(aes(Value, y = ..density.., color = 'DF2'), data = DF2)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2020-02-23 by the reprex package (v0.3.0)
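The transparency suggestion above could be sketched roughly as follows. This is only an illustration: the sample size, the `binwidth` of 0.25, the `alpha` of 0.5, and the blue/red colors (chosen to echo the base `hist()` calls in the question) are all assumptions, and `after_stat()` needs ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(123)  # toy data, larger than above so the shapes are visible
DF1 <- data.frame(Value = rnorm(200))
DF2 <- data.frame(Value = rnorm(200, 0.5, 1))

# Two separate layers, each semi-transparent, sharing one binwidth
p <- ggplot() +
  geom_histogram(aes(Value, y = after_stat(density), fill = "DF1"),
                 data = DF1, binwidth = 0.25, alpha = 0.5) +
  geom_histogram(aes(Value, y = after_stat(density), fill = "DF2"),
                 data = DF2, binwidth = 0.25, alpha = 0.5) +
  scale_fill_manual(name = "Data", values = c(DF1 = "blue", DF2 = "red"))
p
```

Setting `binwidth` explicitly also silences the `stat_bin()` message about `bins = 30`.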

user124578 · February 23, 2020, 9:41pm

Thanks for your answer. The data is in a list format. Is it possible to convert straight to data frame?

FJCC · February 23, 2020, 10:06pm

Converting your data from a list to a data frame may be easy (a data frame is basically a list of equal-length columns), but we would need to know the structure of the list. Try running the commands

str(samples1)
str(samples_cp)

When you paste the result into your reply, please place three back ticks, ```, on the lines before and after the output so that it gets formatted as code.

user124578 · February 23, 2020, 10:08pm

str(samples1)

List of 4
 $ : 'mcmc' num [1:1000, 1:4] 0.437 0.461 0.471 0.562 0.511 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "a1" "mu" "sd" "ypred"
  ..- attr(*, "mcpar")= num [1:3] 2002 4000 2
 $ : 'mcmc' num [1:1000, 1:4] 0.496 0.614 0.399 0.467 0.585 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "a1" "mu" "sd" "ypred"
  ..- attr(*, "mcpar")= num [1:3] 2002 4000 2
 $ : 'mcmc' num [1:1000, 1:4] 0.665 0.509 0.353 0.563 0.568 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "a1" "mu" "sd" "ypred"
  ..- attr(*, "mcpar")= num [1:3] 2002 4000 2
 $ : 'mcmc' num [1:1000, 1:4] 0.521 0.52 0.493 0.649 0.532 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:4] "a1" "mu" "sd" "ypred"
  ..- attr(*, "mcpar")= num [1:3] 2002 4000 2
 - attr(*, "class")= chr "mcmc.list"

str(samples_cp)

List of 4
 $ : 'mcmc' num [1:1000, 1:5] 0.206 0.292 0.154 0.251 0.108 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:5] "a1" "b1" "mu" "sd" ...
  ..- attr(*, "mcpar")= num [1:3] 10020 30000 20
 $ : 'mcmc' num [1:1000, 1:5] 0.306 0.16 0.317 0.204 0.367 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:5] "a1" "b1" "mu" "sd" ...
  ..- attr(*, "mcpar")= num [1:3] 10020 30000 20
 $ : 'mcmc' num [1:1000, 1:5] 0.369 0.133 0.133 0.181 0.235 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:5] "a1" "b1" "mu" "sd" ...
  ..- attr(*, "mcpar")= num [1:3] 10020 30000 20
 $ : 'mcmc' num [1:1000, 1:5] 0.165 0.318 0.254 0.264 0.254 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:5] "a1" "b1" "mu" "sd" ...
  ..- attr(*, "mcpar")= num [1:3] 10020 30000 20
 - attr(*, "class")= chr "mcmc.list"

FJCC · February 23, 2020, 10:52pm

I think

DF_samples1 <- dplyr::bind_rows(samples1)

will give you the data frame you want from that data set and you can run a similar command with the other list. You need to have the dplyr package, of course. You can then add a column to each data frame with a command like

DF_samples1$Source <- "Samples1"

Finally, bind the two data frames with

AllData <- dplyr::bind_rows(DF_samples1, DF_samples_cp)

user124578 · February 24, 2020, 8:46am

When running: DF_samples1 <- dplyr::bind_rows(samples1)

I am getting an error message:

Error: Argument 1 must be a data frame or a named atomic vector, not a mcmc.list

DavoWW · February 24, 2020, 9:12am

This output is an mcmc.list from modelling using Bayesian techniques (well that's my guess since you haven't told us how it was produced). You will need special tools to process the output and visualize the results. Check out the {MCMCvis} and {coda} packages on CRAN which are designed to handle mcmc.list output.
HTH
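As a rough sketch of what {coda} offers here: it provides an `as.matrix()` method for mcmc.list objects that stacks the chains into one matrix, which can then be turned into a data frame. The toy object below is made up to resemble the `str()` output earlier in the thread; `make_chain`, `toy`, `combined`, and `toy_df` are hypothetical names, and the chains are random numbers rather than real model output.

```r
library(coda)

set.seed(1)
# Hypothetical helper: one fake chain of 10 draws for 4 parameters
make_chain <- function() {
  draws <- matrix(rnorm(40), ncol = 4,
                  dimnames = list(NULL, c("a1", "mu", "sd", "ypred")))
  mcmc(draws)
}

toy <- mcmc.list(make_chain(), make_chain())

# as.matrix() on an mcmc.list stacks the chains row-wise
combined <- as.matrix(toy)
toy_df <- as.data.frame(combined)
str(toy_df)
```

With {coda} loaded, the same idea should apply to samples1 and samples_cp, although that has not been checked against the real objects.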

user124578 · February 24, 2020, 9:54am

Yes, it is MCMC output from a Bayesian model.

FJCC · February 24, 2020, 2:38pm

Try using

DF_samples1 = as.data.frame(data.matrix(samples1))
DF_samples_cp = as.data.frame(data.matrix(samples_cp))

DF_samples1$Source <- "samples1"
DF_samples_cp$Source <- "samples_cp"

AllData <- dplyr::bind_rows(DF_samples1, DF_samples_cp)

I think you can then use ggplot as previously explained.
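One thing to watch when plotting: the `str()` output shows that column 4 of samples1 is ypred, while column 4 of samples_cp is sd, so selecting by name is safer than selecting by position. Here is a minimal sketch using simulated stand-ins for the two data frames. It assumes the truncated fifth column of samples_cp is also called ypred, which the `str()` output does not confirm; the values are random, and `keep` and `p` are illustrative names.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-ins for the converted data frames (values are made up)
DF_samples1 <- data.frame(a1 = rnorm(100), mu = rnorm(100),
                          sd = runif(100), ypred = rnorm(100))
DF_samples_cp <- data.frame(a1 = rnorm(100), b1 = rnorm(100), mu = rnorm(100),
                            sd = runif(100), ypred = rnorm(100, 0.5))  # assumed name

DF_samples1$Source <- "samples1"
DF_samples_cp$Source <- "samples_cp"

# Select by name, not position, because the column orders differ
keep <- c("ypred", "Source")
AllData <- rbind(DF_samples1[keep], DF_samples_cp[keep])

p <- ggplot(AllData, aes(x = ypred, fill = Source)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 alpha = 0.5, position = "identity") +
  labs(title = "1990 Prediction", y = "Density")
p
```

Here `position = "identity"` overlays the two histograms, much like `add = T` does in base `hist()`.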

user124578 · February 24, 2020, 9:44pm

Thanks for your help
