ggplot2: How to specify x labels for datetime data and draw two line segments


#1

I have a time-series dataset in which the time is stored in the following format: "2017/5/22 4:00". I have a few questions about creating graphs with ggplot2.

1- When I use ggplot + geom_line, the x labels always fall on certain fixed hours, such as 2017/5/22 22:00. Is there any way to control which date + time (not 22:00) is shown on the x axis?

2- If I want to plot my time-series data as two line segments split at a given time, how can I achieve it?

3- How do I create a text legend in a fixed position when the x axis is a POSIXct variable?

If ggplot2 cannot do these well, is there any other package that can?

Thanks,

time	sales
2017/5/22 4:00	10943
2017/5/21 4:00	11504
2017/5/31 4:00	11530
2017/5/25 4:00	11586
2017/5/30 4:00	11834
2017/5/28 4:00	11909
2017/5/29 4:00	11962
2017/5/23 4:00	12500
2017/5/26 4:00	13361
2017/5/27 0:00	13873
2017/5/27 4:00	14221
2017/5/23 0:00	14616
2017/5/24 0:00	15123
2017/5/25 0:00	15506
2017/5/31 0:00	15635
2017/5/30 0:00	15954
2017/5/28 0:00	15977
2017/5/26 0:00	16020
2017/5/24 4:00	16484
2017/5/21 0:00	17865
2017/5/22 0:00	18176
2017/5/29 0:00	19663
2017/5/29 20:00	82236
2017/5/27 20:00	83043
2017/5/28 20:00	85146
2017/5/29 8:00	87779
2017/5/21 8:00	88097
2017/5/29 16:00	90712
2017/5/30 20:00	93007
2017/5/21 20:00	93178
2017/5/25 20:00	95056
2017/5/26 20:00	95741
2017/5/28 16:00	96534
2017/5/30 8:00	98494
2017/5/27 16:00	100263
2017/5/24 20:00	100511
2017/5/30 16:00	100517
2017/5/28 8:00	100785
2017/5/23 20:00	101296
2017/5/26 16:00	103592
2017/5/31 20:00	103735
2017/5/25 16:00	105954
2017/5/22 20:00	106065
2017/5/30 12:00	106879
2017/5/21 16:00	108202
2017/5/29 12:00	109062
2017/5/25 8:00	110579
2017/5/28 12:00	114185
2017/5/27 8:00	116136
2017/5/21 12:00	117835
2017/5/24 8:00	117853
2017/5/26 8:00	122221
2017/5/22 8:00	122425
2017/5/23 16:00	124383
2017/5/24 16:00	127696
2017/5/26 12:00	133663
2017/5/27 12:00	134442
2017/5/23 12:00	136722
2017/5/31 8:00	140863
2017/5/23 8:00	143900
2017/5/24 12:00	145211
2017/5/31 16:00	147506
2017/5/22 16:00	150118
2017/5/31 12:00	153933
2017/5/25 12:00	154980
2017/5/22 12:00	155885

#2

All of those things can be done with ggplot2:

  1. Take a look at ?scale_x_datetime, specifically at the date_labels and date_breaks parameters. There are some examples that show how you can control the appearance.
  2. You can add another column that indicates the group, based on your time condition, with, e.g., dplyr::mutate. You can then use this column inside aes and/or pass it to facet_wrap. Read the documentation and play with the examples to see how it can be done.
  3. This one I've never done myself, but you can take a look here and see if it'll work in your case.
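
As a rough illustration of point 2, here is a minimal sketch. It uses six rows copied from the question's data. The cut-off time, the column name segment, and the object names are made up for the example, and the time zone is assumed to be UTC because the thread does not state one.

```r
library(ggplot2)
library(dplyr)

# Six rows copied from the question's data; the UTC time zone is an assumption
sales_data <- data.frame(
  time = c("2017/5/21 0:00", "2017/5/21 12:00", "2017/5/22 0:00",
           "2017/5/22 12:00", "2017/5/23 0:00", "2017/5/23 12:00"),
  sales = c(17865, 117835, 18176, 155885, 14616, 136722)
) %>%
  mutate(time = as.POSIXct(time, format = "%Y/%m/%d %H:%M", tz = "UTC")) %>%
  arrange(time)

# Hypothetical cut-off: rows before it form one segment, the rest another
cutoff <- as.POSIXct("2017-05-22 06:00", tz = "UTC")

sales_data <- sales_data %>%
  mutate(segment = if_else(time < cutoff, "before", "after"))

# Mapping the new column to group (and colour) draws two separate lines
p_segments <- ggplot(sales_data,
                     aes(time, sales, colour = segment, group = segment)) +
  geom_line() +
  scale_x_datetime(date_breaks = "12 hours",
                   date_labels = "%m.%d\n%H:%M",
                   timezone = "UTC")

print(p_segments)
```

Because the two groups are drawn independently, no line connects the last "before" point to the first "after" point. Replacing the colour/group mapping with facet_wrap(~ segment, scales = "free_x") would put each segment in its own panel instead.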

#3

Thanks mishabalyasin.

I tried scale_x_datetime before, and date_breaks works. However, it could not control the x labels exactly the way I wanted. For example, ggplot starts the labels at "5.21 22:00", and I didn't find a way to change that to "5.21 00:00" (starting at a different hour).

As for question 3, I will have a look. Because the x axis is a datetime (POSIXct) type, I tried several ways, but they didn't work.
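
One possible cause of labels landing on 22:00 rather than 00:00 is a mismatch between the time zone of the data and the time zone used for the axis. That is a guess, not something confirmed in this thread. A sketch that sidesteps the issue passes explicit breaks at midnight and sets the timezone argument of scale_x_datetime to match the data. The UTC time zone and the object names below are assumptions for the example.

```r
library(ggplot2)

# A few rows copied from the question's data; the UTC time zone is an assumption
sales_data <- data.frame(
  time = as.POSIXct(
    c("2017/5/21 0:00", "2017/5/21 12:00", "2017/5/22 0:00",
      "2017/5/22 12:00", "2017/5/23 0:00"),
    format = "%Y/%m/%d %H:%M", tz = "UTC"
  ),
  sales = c(17865, 117835, 18176, 155885, 14616)
)

# Explicit breaks at midnight, built in the same time zone as the data
midnight_breaks <- seq(
  from = as.POSIXct("2017-05-21 00:00", tz = "UTC"),
  to   = as.POSIXct("2017-05-23 00:00", tz = "UTC"),
  by   = "1 day"
)

# timezone = "UTC" makes the labels be formatted in the data's time zone
p_breaks <- ggplot(sales_data, aes(time, sales)) +
  geom_line() +
  scale_x_datetime(breaks = midnight_breaks,
                   date_labels = "%m.%d %H:%M",
                   timezone = "UTC")

print(p_breaks)
```

Changing the from value (for example to 04:00) shifts every break to that hour, which gives control over the starting hour that date_breaks alone does not.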


#4

Could you please include these as self-contained reprex(es) (short for minimal reproducible example)? It will help us help you if we can be sure we're all working with/looking at the same stuff.

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked to below.


#5

Thanks both. Actually, I just need to control the x labels (for example, show labels only at two specific date + time values), and I want to tell R/ggplot a position on the x axis (for example, if I want a legend, I need to let R know where it should go). Is there a simple way to achieve these goals? Or can any other package achieve it?


#6

Here's an example of using scale_x_datetime on your data with custom breaks. I picked irregular ones as an example, but you could use the commented-out text to make more typical periodic breaks.

library(tidyverse)

lernst_data <-
  tibble::tribble(
    ~ time, ~ sales,
    "2017/5/22 4:00", 10943L,
    "2017/5/21 4:00", 11504L,
    "2017/5/31 4:00", 11530L,
    "2017/5/25 4:00", 11586L,
    "2017/5/30 4:00", 11834L,
    "2017/5/28 4:00", 11909L,
    "2017/5/29 4:00", 11962L,
    "2017/5/23 4:00", 12500L,
    "2017/5/26 4:00", 13361L,
    "2017/5/27 0:00", 13873L,
    "2017/5/27 4:00", 14221L,
    "2017/5/23 0:00", 14616L,
    "2017/5/24 0:00", 15123L,
    "2017/5/25 0:00", 15506L,
    "2017/5/31 0:00", 15635L,
    "2017/5/30 0:00", 15954L,
    "2017/5/28 0:00", 15977L,
    "2017/5/26 0:00", 16020L,
    "2017/5/24 4:00", 16484L,
    "2017/5/21 0:00", 17865L,
    "2017/5/22 0:00", 18176L,
    "2017/5/29 0:00", 19663L,
    "2017/5/29 20:00", 82236L,
    "2017/5/27 20:00", 83043L,
    "2017/5/28 20:00", 85146L,
    "2017/5/29 8:00", 87779L,
    "2017/5/21 8:00", 88097L,
    "2017/5/29 16:00", 90712L,
    "2017/5/30 20:00", 93007L,
    "2017/5/21 20:00", 93178L,
    "2017/5/25 20:00", 95056L,
    "2017/5/26 20:00", 95741L,
    "2017/5/28 16:00", 96534L,
    "2017/5/30 8:00", 98494L,
    "2017/5/27 16:00", 100263L,
    "2017/5/24 20:00", 100511L,
    "2017/5/30 16:00", 100517L,
    "2017/5/28 8:00", 100785L,
    "2017/5/23 20:00", 101296L,
    "2017/5/26 16:00", 103592L,
    "2017/5/31 20:00", 103735L,
    "2017/5/25 16:00", 105954L,
    "2017/5/22 20:00", 106065L,
    "2017/5/30 12:00", 106879L,
    "2017/5/21 16:00", 108202L,
    "2017/5/29 12:00", 109062L,
    "2017/5/25 8:00", 110579L,
    "2017/5/28 12:00", 114185L,
    "2017/5/27 8:00", 116136L,
    "2017/5/21 12:00", 117835L,
    "2017/5/24 8:00", 117853L,
    "2017/5/26 8:00", 122221L,
    "2017/5/22 8:00", 122425L,
    "2017/5/23 16:00", 124383L,
    "2017/5/24 16:00", 127696L,
    "2017/5/26 12:00", 133663L,
    "2017/5/27 12:00", 134442L,
    "2017/5/23 12:00", 136722L,
    "2017/5/31 8:00", 140863L,
    "2017/5/23 8:00", 143900L,
    "2017/5/24 12:00", 145211L,
    "2017/5/31 16:00", 147506L,
    "2017/5/22 16:00", 150118L,
    "2017/5/31 12:00", 153933L,
    "2017/5/25 12:00", 154980L,
    "2017/5/22 12:00", 155885L
  ) %>%
  mutate(time = lubridate::ymd_hm(time))

# One way to set up even breaks
# desired_breaks <- seq.POSIXt(from = lubridate::ymd_hm("20170521 0000"), 
#                              to   = lubridate::ymd_hm("20170605 0000"), by = "1 week")

desired_breaks <- lubridate::ymd_hm(c("20170521 0000", "20170522 1200", "20170529 0000"))

ggplot(lernst_data, aes(time, sales)) + 
  geom_line(alpha = 1/2) +
  geom_point(size = 1/2) +
  scale_x_datetime(date_labels = "%m.%d\n%H:%M", breaks = desired_breaks, date_minor_breaks = "1 day")
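
For the third question (text at a fixed position on a POSIXct axis), a minimal sketch using annotate is shown below. The key point is that the x coordinate has to be a POSIXct value itself. The label text, its coordinates, the UTC time zone, and the object names are all made up for the example.

```r
library(ggplot2)

# A few rows copied from the question's data; the UTC time zone is an assumption
sales_data <- data.frame(
  time = as.POSIXct(
    c("2017/5/21 0:00", "2017/5/21 12:00", "2017/5/22 0:00",
      "2017/5/22 12:00", "2017/5/23 0:00"),
    format = "%Y/%m/%d %H:%M", tz = "UTC"
  ),
  sales = c(17865, 117835, 18176, 155885, 14616)
)

# Hypothetical label position: x is given as a POSIXct value, y in sales units
label_x <- as.POSIXct("2017-05-21 02:00", tz = "UTC")
label_y <- 150000

p_label <- ggplot(sales_data, aes(time, sales)) +
  geom_line() +
  # hjust = 0 left-aligns the text so that it starts at label_x
  annotate("text", x = label_x, y = label_y,
           label = "Example note", hjust = 0) +
  scale_x_datetime(date_labels = "%m.%d\n%H:%M", timezone = "UTC")

print(p_label)
```

The same idea should carry over to the full plot above by adding the annotate() layer to it, with label_x built by lubridate::ymd_hm so that it matches the parsed time column.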