Issues with scale_y_datetime function

So I am trying to create an eCDF plot using acoustic biomass data.

Thus far, my code looks like this:

ggplot(echo_120, aes(PRC_ABC)) + stat_ecdf(geom = "step", pad = FALSE) + ggtitle("120 kHz") + coord_flip() +scale_x_datetime(breaks = date_breaks("1 day"), breaks = minor_breaks ( "1 hour")) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.background = element_rect(fill = "white", colour = NA))

The issue is that I need to make the y-axis go by time. Breaks need to be by hour and separate days need to be indicated (it is only ~2 days' worth of data). I keep getting error messages that the "date_breaks" or "minor_breaks" functions don't exist despite ggplot suggesting them. Is there a fix? Or another way to get the dates on the y-axis?

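I think you want to use date_breaks and date_minor_breaks as arguments of scale_x_datetime rather than as functions, as in the following example. (date_breaks() as a function lives in the scales package, so it is not found unless scales is loaded, and there is no minor_breaks() function.)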

ggplot(DF, aes(A)) + stat_ecdf() +
  coord_flip() +scale_x_datetime(date_breaks = "1 day", 
                                 date_minor_breaks = "1 hour")

Tried that and got this error message:

Error: Invalid input: time_trans works with objects of class POSIXct only

But at least it's not telling me my commands don't exist, progress.

I suspect your data are not of the type expected by the function. Can you post part or all of your data?
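A quick way to check this yourself is to look at the class of each column. The sketch below uses a small made-up data frame (`toy` and its columns are hypothetical, not your data) to show what a text time stamp looks like before and after conversion with `as.POSIXct()`.

```r
# Hypothetical toy data: the time stamp is stored as text,
# which is common after reading a CSV file
toy <- data.frame(
  stamp = c("2019-01-15 13:57:50", "2019-01-15 14:30:00"),
  value = c(0.1, 0.2),
  stringsAsFactors = FALSE
)

# Inspect the class of every column; 'stamp' is "character", not POSIXct
print(sapply(toy, class))

# Convert the text to POSIXct so that scale_x_datetime() can work with it
# (tz = "UTC" is an assumption made for this example)
toy$stamp <- as.POSIXct(toy$stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# TRUE once the conversion has succeeded
print(inherits(toy$stamp, "POSIXct"))
```

If a column you expect to be a date-time shows up as character, integer or factor, that would explain the time_trans error.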

For part of the data, use

dput(head(echo_120))

For all of the data, use

dput(echo_120)

In either case, please paste the result of the command into this thread. Put a line containing only three back ticks,
```
before and after the pasted output.

" Region 3", " Region 3", " Region 3", " Region 3", " Region 3"
), Region_class = c(" Unclassified regions", " Unclassified regions", 
" Unclassified regions", " Unclassified regions", " Unclassified regions", 
" Unclassified regions"), Process_ID = c(9L, 9L, 9L, 9L, 9L, 
9L), Interval = 429878:429883, Layer = c(1L, 1L, 1L, 1L, 1L, 
1L), Sv_mean = c(-57.642678, -56.468429, -23.977426, -23.28391, 
-23.441881, -23.182833), Height_mean = c(10.195943, 10.188783, 
10.839862, 10.975883, 10.975883, 10.975883), Depth_mean = c(6.109091, 
6.105691, 6.434995, 6.498878, 6.498878, 6.498878), Layer_depth_min = c(1, 
1, 1, 1, 1, 1), Layer_depth_max = c(13, 13, 13, 13, 13, 13), 
    Ping_S = c(0L, 258L, 3855L, 7455L, 11055L, 14654L), Ping_E = c(257L, 
    3854L, 7454L, 11054L, 14653L, 18253L), Dist_M = c(0, 0, 0, 
    0, 0, 0), Date_M = c(20190115L, 20190115L, 20190115L, 20190115L, 
    20190115L, 20190115L), Time_M = c(" 13:57:50.1730", " 14:30:00.9570", 
    " 15:29:59.2970", " 16:29:59.6050", " 17:29:59.8730", " 18:29:59.1840"
    ), Lat_M = c(999, 999, 999, 999, 999, 999), Lon_M = c(999, 
    999, 999, 999, 999, 999), Noise_Sv_1m = c(-999, -999, -999, 
    -999, -999, -999), Minimum_Sv_threshold_applied = c(0L, 0L, 
    0L, 0L, 0L, 0L), Maximum_Sv_threshold_applied = c(0L, 0L, 
    0L, 0L, 0L, 0L), Standard_deviation = c(5.98219728e-06, 1.21484137e-05, 
    0.03644431484846, 0.03931212447703, 0.03570024069026, 0.0380443987566
    ), PRC_ABC = c(1.754525128e-05, 2.297626592e-05, 0.04337916305593, 
    0.0515286165613, 0.04968798231584, 0.05274195481807), Species_Id = c(0L, 
    0L, 0L, 0L, 0L, 0L), Species_name = c(" Unassigned", " Unassigned", 
    " Unassigned", " Unassigned", " Unassigned", " Unassigned"
    ), Species_percent = c(100, 100, 100, 100, 100, 100), Species_TS = c(-40, 
    -40, -40, -40, -40, -40), Species_weight = c(1, 1, 1, 1, 
    1, 1), Density_number = c(601785.28319, 788063.874026, 1487863667.78174, 
    1767382103.10234, 1704250114.68738, 1808998441.03529), Density_weight = c(601785.28319, 
    788063.874026, 1487863667.78174, 1767382103.10234, 1704250114.68738, 
    1808998441.03529), Thickness_mean = c(10.195943, 10.188783, 
    10.839862, 10.975883, 10.975883, 10.975883), Range_mean = c(6.109091, 
    6.105691, 6.434995, 6.498878, 6.498878, 6.498878), Exclude_below_line_range_mean = c(11.222018, 
    11.20854, 11.212945, -9999, -9999, -9999), Exclude_above_line_range_mean = c(1, 
    1, 1, 1, 1, 1)), row.names = c(NA, 6L), class = "data.frame")```

What I'm trying to do is take the PRC_ABC (area backscatter coefficient) data and create a CDF and PDF graph out of it (thus far I've only focused on the CDF part). I also want to plot it against time.

The graphs I've managed to produce thus far are the eCDF plotted against the random variable it generates. The good news is that it shows what I'd expect to see; the bad news is that I can't seem to smooth it out.

Perhaps I should try generating CDF and PDF dataframes from the ABC data I have? And then plot it against the time values already in my spreadsheet?

Your copy and paste seems to have had a problem as the displayed dput() result seems to start somewhere in the middle. From what I can see, there is one PRC_ABC value for each combination of Date_M and Time_M. If so, I do not understand how you can construct a CDF as a function of time. You can easily plot PRC_ABC vs. time but a CDF requires multiple values. Do you intend to bin the data in some way?

Your original ggplot code does not work because you are plotting only PRC_ABC yet trying to display the axis with scale_x_datetime. Since PRC_ABC is a plain number rather than a POSIXct date-time, scale_x_datetime cannot handle it, which is what the time_trans error is telling you.
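If a plain plot of PRC_ABC against time is useful, here is a minimal sketch. It retypes the first three rows of the posted data, builds a POSIXct column from Date_M and Time_M, and maps that column to x. The names `echo_small` and `DateTime` are made up for this example, and `tz = "UTC"` is an assumption, since the thread does not say which time zone the instrument used.

```r
library(ggplot2)

# First three rows of the posted data, retyped so the example is self-contained
echo_small <- data.frame(
  Date_M  = c(20190115L, 20190115L, 20190115L),
  Time_M  = c(" 13:57:50.1730", " 14:30:00.9570", " 15:29:59.2970"),
  PRC_ABC = c(1.754525128e-05, 2.297626592e-05, 0.04337916305593),
  stringsAsFactors = FALSE
)

# Combine date and time into one POSIXct column.
# trimws() drops the leading space; %OS reads seconds with a fractional part.
# tz = "UTC" is an assumption; use the time zone the data were recorded in.
echo_small$DateTime <- as.POSIXct(
  paste(echo_small$Date_M, trimws(echo_small$Time_M)),
  format = "%Y%m%d %H:%M:%OS",
  tz = "UTC"
)

# PRC_ABC against time; a POSIXct variable is mapped to x,
# so scale_x_datetime() has the type of data it expects
p <- ggplot(echo_small, aes(DateTime, PRC_ABC)) +
  geom_step() +
  scale_x_datetime(date_breaks = "1 day",
                   date_minor_breaks = "1 hour",
                   date_labels = "%d %b %H:%M")
print(p)
```

On this three-row excerpt a daily break may not fall inside the plotted range, so the axis can come out without labels; with the full two days of data the day boundaries should appear. Adding coord_flip() would put time on the vertical axis.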

Sorry about that, but you seem to have a grasp on what it is. Initially that was what I was trying to do, but as I have learned and as you have said, I cannot plot ABC and Time in a CDF against each other. The bin idea is probably what is best here.
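One way the binning idea could look, as a rough sketch: group the readings by calendar day and draw one empirical CDF per day. The data here are simulated (`sim`, its values and the choice of daily bins are all assumptions made for illustration); hourly or other bins would follow the same pattern.

```r
library(ggplot2)

# Simulated stand-in for the real data: one reading per hour for 48 hours.
# All values are made up for illustration.
set.seed(1)
sim <- data.frame(
  DateTime = seq(as.POSIXct("2019-01-15 14:00:00", tz = "UTC"),
                 by = "1 hour", length.out = 48),
  PRC_ABC  = rlnorm(48, meanlog = -3, sdlog = 1)
)

# Bin by calendar day; each bin now holds several PRC_ABC values
sim$Day <- format(sim$DateTime, "%Y-%m-%d")

# One empirical CDF per day: x is PRC_ABC, y is the cumulative proportion
p <- ggplot(sim, aes(PRC_ABC, colour = Day)) +
  stat_ecdf(geom = "step", pad = FALSE) +
  labs(y = "Cumulative proportion")
print(p)

# The same idea without ggplot2: ecdf() returns one step function per bin
cdf_by_day <- lapply(split(sim$PRC_ABC, sim$Day), ecdf)
```

Each element of `cdf_by_day` can be called like a function, for example `cdf_by_day[[1]](0.05)` gives the proportion of that day's readings at or below 0.05.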

I've only managed an empirical CDF, and don't really know how a PDF works.
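For reference, the CDF at a value x is the proportion of observations at or below x, while the PDF describes how densely the values are packed around x (it is the slope of the CDF, and the area under it is 1). A common way to estimate a PDF from data is a kernel density estimate. The sketch below uses made-up values in place of PRC_ABC, so the numbers themselves mean nothing.

```r
# Made-up positive values standing in for PRC_ABC
set.seed(1)
x <- rlnorm(200, meanlog = -3, sdlog = 1)

# Empirical CDF: proportion of observations <= a given value
Fn <- ecdf(x)
print(Fn(median(x)))  # half of the observations lie at or below the median

# Kernel density estimate: a smooth estimate of the PDF
d <- density(x)

# The area under a PDF is 1; check this with the trapezoid rule
# on the grid that density() returns
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
print(area)

plot(d, main = "Kernel density estimate (simulated data)")
```

In ggplot2, geom_density() draws the same kind of estimate. An empirical CDF is a step function by construction, so it only looks smoother with more data points.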
