get a photo of it.
My question: is there something akin to OpenRefine's value.split(";") in RStudio, so that I can count and plot the company abbreviations, for example, without a value like WD; LNER; BR being counted as a single entry? I want it counted as one WD, one LNER and one BR separately.
Welcome to the community @kiankier! I believe separate_rows() is the function you're looking for.
library(tidyverse)
df = data.frame(
  row = 1:3,
  abbr = c('WD; LNER; BR',
           'WD; LNWR; BR',
           'WD; GCR; LNER')
)
df
#> row abbr
#> 1 1 WD; LNER; BR
#> 2 2 WD; LNWR; BR
#> 3 3 WD; GCR; LNER
out = df %>%
  separate_rows(abbr, sep = '; ')
out
#> # A tibble: 9 × 2
#> row abbr
#> <int> <chr>
#> 1 1 WD
#> 2 1 LNER
#> 3 1 BR
#> 4 2 WD
#> 5 2 LNWR
#> 6 2 BR
#> 7 3 WD
#> 8 3 GCR
#> 9 3 LNER
count(out, abbr)
#> # A tibble: 5 × 2
#> abbr n
#> <chr> <int>
#> 1 BR 2
#> 2 GCR 1
#> 3 LNER 2
#> 4 LNWR 1
#> 5 WD 3
Created on 2023-01-07 with reprex v2.0.2.9000
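If the real data is not as tidy as the example (say, some entries have no space after the semicolon, or a space before it), a fixed separator of '; ' would leave some values unsplit or padded. Here is a minimal sketch of one way to handle that, assuming the sample values below (they are made up for illustration). It relies on the fact that the sep argument of separate_rows() is treated as a regular expression. Note that tidyr 1.3.0 and later mark separate_rows() as superseded by separate_longer_delim(), although it still works.

```r
library(tidyr)
library(dplyr)

# Hypothetical messy input: inconsistent spacing around the semicolons
df <- data.frame(
  row = 1:3,
  abbr = c("WD; LNER; BR", "WD;LNWR; BR", "WD; GCR ;LNER")
)

# sep is a regular expression: a semicolon with optional whitespace on either side
out <- df %>%
  separate_rows(abbr, sep = "\\s*;\\s*")

# sort = TRUE puts the most frequent abbreviations first
counts <- count(out, abbr, sort = TRUE)
counts
```

The counts should match the tidy example above, since only the spacing differs.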
I will try it out, but a follow-up question: could I use the count or plot commands after this R script? For example, could I make a plot of the separated abbreviations?
Yes, that's possible. Below is a continuing example that puts it all together in one script and uses ggplot to create a bar graph.
df %>%
  separate_rows(abbr, sep = '; ') %>%
  count(abbr) %>%
  ggplot(aes(x = abbr, y = n)) +
  geom_bar(stat = 'identity')
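If "sorting" means ordering the bars by how often each abbreviation occurs, one possible approach is to reorder the x values by the count. This is only a sketch using the same example data as above; reorder() comes from base R's stats package, and geom_col() is the ggplot2 shorthand for geom_bar(stat = 'identity').

```r
library(tidyverse)

df <- data.frame(
  row = 1:3,
  abbr = c("WD; LNER; BR", "WD; LNWR; BR", "WD; GCR; LNER")
)

# Split into one abbreviation per row, then count each one
counts <- df %>%
  separate_rows(abbr, sep = "; ") %>%
  count(abbr)

# reorder(abbr, n) orders the bars from the smallest to the largest count
p <- ggplot(counts, aes(x = reorder(abbr, n), y = n)) +
  geom_col() +
  labs(x = "abbr")
p
```

Using reorder(abbr, -n) instead would put the most frequent abbreviation first.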
so many thanks, I will use it tomorrow and hopefully, I can make up for the time spent on finding a solution
so is this normal for a plot?
Flm, January 8, 2023, 11:29am
Are there NA values in the data? If so, you can edit the code like this:
mydataframe %>%
  drop_na() %>% # add this line
  ggplot(…….)
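One thing to be aware of: drop_na() with no arguments removes every row that has an NA in any column, which may discard more rows than intended. Naming the column restricts the check to that column. A small sketch with made-up data (the note column is hypothetical and only there to show the difference):

```r
library(tidyr)

# Hypothetical data: NA appears in both columns, on different rows
df <- data.frame(
  abbr = c("WD", NA, "BR"),
  note = c(NA, "x", "y")
)

# Drops rows with NA in ANY column
all_cols <- drop_na(df)

# Drops only rows where abbr is NA
abbr_only <- drop_na(df, abbr)

nrow(all_cols)
nrow(abbr_only)
```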
Could the "missing" bars belong to abbreviations that only appear once? The example below shows that when the largest value is 22,500, the bar for n = 1 is too short to be visible.
library(tidyverse)
df = data.frame(abbr = c('a', 'b', 'c', 'd'),
                n = c(1, 2, 5, 22500))
ggplot(df, aes(x = abbr, y = n)) +
  geom_bar(stat = 'identity') +
  coord_flip()
Created on 2023-01-08 with reprex v2.0.2.9000
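If very small counts need to stay visible next to very large ones, one option (an alternative to the bar chart, not something suggested earlier in the thread) is to draw points on a log-scaled axis. Bars are a poor fit for a log scale because they are anchored at zero, so this sketch swaps geom_bar() for geom_point().

```r
library(ggplot2)

df <- data.frame(abbr = c("a", "b", "c", "d"),
                 n = c(1, 2, 5, 22500))

# Points on a log10 axis keep n = 1 and n = 22500 readable in the same plot
p <- ggplot(df, aes(x = abbr, y = n)) +
  geom_point(size = 3) +
  scale_y_log10() +
  coord_flip()
p
```

The trade-off is that distances on the axis now represent ratios rather than absolute differences.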
Final question: which command do I add to make the axis labels for abbr (BR, GCR, LNER and so on in this plot) bigger or bold?
Add the theme() line below. You can specify formatting for axis.text.x and/or axis.text.y. Because coord_flip() puts the abbr labels on the vertical axis, this example formats axis.text.y.
ggplot(df, aes(x = abbr, y = n)) +
  geom_bar(stat = 'identity') +
  coord_flip() +
  theme(axis.text.y = element_text(face = 'bold', size = 20))