Histogram using ggplot

#1

I keep getting the same error. What am I doing wrong in trying to plot a histogram using ggplot grouped by "Class" in the code below?

data_breast <- bc_data[, c("Clump", "CellSize", "CellShape",
                           "Adhesion", "Epithelial", "Bare", "Bland",
                           "Nucleoli", "Mitoses", "Class")]

ggplot(data = melt(data_breast, id.var = "Class"), mapping = aes(x = value)) +
  geom_histogram(bins = 10, aes(fill = Class), alpha = 0.5) +
  facet_wrap(~variable, scales = "free_x")

Error in match.names(clabs, names(xi)) :
names do not match previous names

#2

Hi!

We don't have access to bc_data, so to help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide to see how to create one:

#3

I have been struggling with reprex, so I decided to copy and paste the data here.

> head(bc_data)
# A tibble: 6 x 10
  Clump CellSize CellShape Adhesion Epithelial Bare  Bland Nucleoli Mitoses Class
  <dbl>    <dbl>     <dbl>    <dbl>      <dbl> <fct> <dbl>    <dbl>   <dbl> <fct>
1     5        1         1        1          2 1         3        1       1 2    
2     5        4         4        5          7 10        3        2       1 2    
3     3        1         1        1          2 2         3        1       1 2    
4     6        8         8        1          3 4         3        7       1 2    
5     4        1         1        3          2 1         3        1       1 2    
6     8       10        10        8          7 10        9        7       1 4  

#4

It's hard to copy your data in this format. Please post it in a copy/paste-friendly format; the datapasta package makes this easy. Read the link I gave you before, or check this other one to learn how to do it.

#5

Are you using melt from reshape2? I was able to generate a plot with the following code, even though there are some issues with the 'Mitoses' category. I suspect the issue may be that the columns Bare and Class are factors in your data.

d <- 
"Clump CellSize CellShape Adhesion Epithelial Bare  Bland Nucleoli Mitoses Class
1     5        1         1        1          2 1         3        1       1 2    
2     5        4         4        5          7 10        3        2       1 2    
3     3        1         1        1          2 2         3        1       1 2    
4     6        8         8        1          3 4         3        7       1 2    
5     4        1         1        3          2 1         3        1       1 2    
6     8       10        10        8          7 10        9        7       1 4"

df <- read.table(textConnection(d), header = TRUE)
library(ggplot2)
ggplot(reshape2::melt(df, id.var = "Class"), mapping = aes(x = value)) + 
  geom_histogram(bins = 10, aes(fill=Class), alpha=0.5) +
  facet_wrap(~variable, scales ='free_x')
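
To check that suspicion before melting, it may help to list the class of every column. This is only a minimal sketch in base R: `toy` is a made-up two-row stand-in for bc_data, not the real data.

```r
# Hypothetical two-row stand-in for bc_data, with Bare and Class as factors
toy <- data.frame(
  Clump = c(5, 8),
  Bare  = factor(c("1", "10")),
  Class = factor(c(2, 4))
)

# One class label per column; a factor among the measure columns is a red flag
col_classes <- sapply(toy, class)
print(col_classes)

# Names of the columns that are stored as factors
factor_cols <- names(toy)[sapply(toy, is.factor)]
print(factor_cols)
```

Running the same two sapply() calls on the real bc_data should show which measure columns are not numeric.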

#6

Please find the data below, pasted using datapasta.

bc_data <-tibble::tribble(
            ~Clump_Thickness, ~Cell_Size_Uniformity, ~Cell_Shape_Uniformity, ~Marginal_Adhesion, ~Single_Epi_Cell_Size, ~Bare_Nuclei, ~Bland_Chromatin, ~Normal_Nucleoli, ~Mitoses,      ~Class,
                           5,                     1,                      1,                  1,                     2,            1,                3,                1,        1,    "benign",
                           5,                     4,                      4,                  5,                     7,           10,                3,                2,        1,    "benign",
                           3,                     1,                      1,                  1,                     2,            2,                3,                1,        1,    "benign",
                           6,                     8,                      8,                  1,                     3,            4,                3,                7,        1,    "benign",
                           4,                     1,                      1,                  3,                     2,            1,                3,                1,        1,    "benign",
                           8,                    10,                     10,                  8,                     7,           10,                9,                7,        1, "malignant",
                           1,                     1,                      1,                  1,                     2,           10,                3,                1,        1,    "benign",
                           2,                     1,                      2,                  1,                     2,            1,                3,                1,        1,    "benign",
                           2,                     1,                      1,                  1,                     2,            1,                1,                1,        5,    "benign",
                           4,                     2,                      1,                  1,                     2,            1,                2,                1,        1,    "benign",
                           1,                     1,                      1,                  1,                     1,            1,                3,                1,        1,    "benign",
                           2,                     1,                      1,                  1,                     2,            1,                2,                1,        1,    "benign",
                           5,                     3,                      3,                  3,                     2,            3,                4,                4,        1, "malignant",
                           1,                     1,                      1,                  1,                     2,            3,                3,                1,        1,    "benign",
                           8,                     7,                      5,                 10,                     7,            9,                5,                5,        4, "malignant",
                           7,                     4,                      6,                  4,                     6,            1,                4,                3,        1, "malignant",
                           4,                     1,                      1,                  1,                     2,            1,                2,                1,        1,    "benign",
                           4,                     1,                      1,                  1,                     2,            1,                3,                1,        1,    "benign",
                          10,                     7,                      7,                  6,                     4,           10,                4,                1,        2, "malignant",
                           6,                     1,                      1,                  1,                     2,            1,                3,                1,        1,    "benign",
                           7,                     3,                      2,                 10,                     5,           10,                5,                4,        4, "malignant",
                          10,                     5,                      5,                  3,                     6,            7,                7,               10,        1, "malignant",
                           3,                     1,                      1,                  1,                     2,            1,                2,                1,        1,    "benign",
                           8,                     4,                      5,                  1,                     2,           NA,                7,                3,        1, "malignant"
            )

head(bc_data)
#> # A tibble: 6 x 10
#>   Clump_Thickness Cell_Size_Unifo~ Cell_Shape_Unif~ Marginal_Adhesi~
#>             <dbl>            <dbl>            <dbl>            <dbl>
#> 1               5                1                1                1
#> 2               5                4                4                5
#> 3               3                1                1                1
#> 4               6                8                8                1
#> 5               4                1                1                3
#> 6               8               10               10                8
#> # ... with 6 more variables: Single_Epi_Cell_Size <dbl>,
#> #   Bare_Nuclei <dbl>, Bland_Chromatin <dbl>, Normal_Nucleoli <dbl>,
#> #   Mitoses <dbl>, Class <chr>

#7

Yes, "Bare" and "Class" are factor variables, and I am using melt from reshape2.

On my second attempt I got the error below instead. What am I missing?

Error: StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count"?
In addition: Warning message:
attributes are not identical across measure variables; they will be dropped

#8

If you run the code below, the str() output shows that in df2 the value column is an integer, whereas in df3 it is a character: once Bare is a factor, melt can no longer stack it with the numeric columns as numbers. The default stat of geom_histogram is "bin", which needs a continuous x variable and so does not work for character data. I suggest you convert Bare to an integer in the original data.

d <- 
  "Clump CellSize CellShape Adhesion Epithelial Bare  Bland Nucleoli Mitoses Class
1     5        1         1        1          2 1         3        1       1 2    
2     5        4         4        5          7 10        3        2       1 2    
3     3        1         1        1          2 2         3        1       1 2    
4     6        8         8        1          3 4         3        7       1 2    
5     4        1         1        3          2 1         3        1       1 2    
6     8       10        10        8          7 10        9        7       1 4"

df <- read.table(textConnection(d), header = TRUE)
df2 <- reshape2::melt(df, id.var = "Class")
str(df2)
df$Bare <- factor(df$Bare)
df$Class <- factor(df$Class)
df3 <- reshape2::melt(df, id.var = "Class")
str(df3)
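
One caveat when doing that conversion: calling as.integer() directly on a factor returns the internal level codes, not the values you see printed, so the usual idiom is to go through as.character() first. A minimal sketch, where `bare` is a made-up vector standing in for the real Bare column:

```r
# Hypothetical stand-in for the Bare column, stored as a factor
bare <- factor(c("1", "10", "2", "4", "1", "10"))

# as.integer() on a factor returns the level codes (positions in levels(bare))
codes <- as.integer(bare)

# Going through character first recovers the values that are actually printed
values <- as.integer(as.character(bare))

print(levels(bare))
print(codes)
print(values)
```

Applied to the data in this thread, that would presumably be bc_data$Bare <- as.integer(as.character(bc_data$Bare)) before calling melt. Any label that is not a number becomes NA, with a warning.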
