Good day!
I want to make my own personal R package so that I can automate a task I do often. I am currently learning how to build a package, and I have no prior knowledge of the subject.
Here is the graph that I want the package to produce, if it is feasible, with a call like this:
froese(x, lm = 10.2, bw = 0.5)
I expect that this code may still produce errors, but I am writing it down anyway as a first step. This is the code:
froese <- function(x, lm, bw) {
  # x  = numeric vector of lengths, or a data frame of lengths
  # lm = length at first maturity
  # bw = the specified bin width (possible choices are 0.5 or 1.0)

  # Load the required packages
  # (inside a package, list these under Imports in DESCRIPTION instead)
  library(magrittr)
  library(dplyr)
  library(ggplot2)

  # Assumption: if a data frame is supplied, its first column holds the lengths
  if (is.data.frame(x)) x <- x[[1]]

  # Find the minimum and maximum value in the data
  min_lf <- floor(min(x))
  max_lf <- ceiling(max(x))

  # Compute the megaspawner threshold (70% of the maximum length)
  mega <- max_lf * 0.7

  # Sort the lengths from lowest to highest and store them in a
  # one-column data frame, used for filtering and plotting
  x <- sort(x, decreasing = FALSE)
  df <- data.frame(x = x)

  # Filter the lengths that are considered juveniles
  juveniles <- df %>%
    dplyr::filter(x < lm)

  # Filter the lengths that are considered adults
  adults <- df %>%
    dplyr::filter(x >= lm & x < mega)

  # Filter the lengths that are considered megaspawners
  megaspawners <- df %>%
    dplyr::filter(x >= mega)

  # Compute the percent contribution of each group to the whole data set
  prcnt_juveniles <- round((nrow(juveniles) / nrow(df)) * 100, digits = 2)
  prcnt_adults <- round((nrow(adults) / nrow(df)) * 100, digits = 2)
  prcnt_megaspawners <- round((nrow(megaspawners) / nrow(df)) * 100, digits = 2)

  # Bin the lengths (without drawing a base plot) to find the highest
  # frequency. This is needed for placing the annotations in the graph
  res_hist <- hist(x,
                   breaks = seq(from = min_lf, to = max_lf, by = bw),
                   plot = FALSE)

  # Extract which bin has the highest frequency
  max_val <- which.max(res_hist$counts)

  # Find the corresponding count
  max_val1 <- as.numeric(res_hist$counts[max_val])

  # Make the plot; boundary = min_lf aligns the bins with the hist() breaks
  p1 <-
    ggplot(df, aes(x = x)) +
    geom_histogram(binwidth = bw, boundary = min_lf,
                   colour = "#555555", fill = "#23272A") +
    scale_x_continuous(breaks = seq(from = min_lf, to = max_lf, by = bw),
                       limits = c(min_lf, max_lf)) +
    geom_vline(xintercept = lm, color = "#D9534F", linetype = "dashed") +
    geom_vline(xintercept = mega, color = "#D9534F", linetype = "dashed") +
    annotate("rect", xmin = -Inf, xmax = lm, ymin = 0, ymax = Inf, alpha = 0.2, fill = "yellow") +
    annotate("rect", xmin = lm, xmax = mega, ymin = 0, ymax = Inf, alpha = 0.2, fill = "red") +
    annotate("rect", xmin = mega, xmax = Inf, ymin = 0, ymax = Inf, alpha = 0.2, fill = "blue") +
    annotate("text", x = (max(juveniles$x) + min(juveniles$x)) / 2, y = max_val1 + 100,
             label = paste0(prcnt_juveniles, "%\nJuveniles"), size = 4) +
    annotate("text", x = (max(adults$x) + min(adults$x)) / 2, y = max_val1 + 100,
             label = paste0(prcnt_adults, "%\nAdults"), size = 4) +
    annotate("text", x = (max(megaspawners$x) + min(megaspawners$x)) / 2, y = max_val1 + 100,
             label = paste0(prcnt_megaspawners, "%\nMegaspawners"), size = 4)

  # Print the result to the screen (print() also returns the plot invisibly)
  print(p1)
}
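Since the goal is a package, the percentage step of the function above could perhaps live in its own small, documented helper. The following is only a sketch under stated assumptions: `froese_percent` is a hypothetical name, the `#'` comments assume roxygen2 is used to generate the documentation, and the megaspawner threshold follows the same rule as above (0.7 times the rounded-up maximum length). It uses base R only, so no `library()` call is needed.

```r
#' Percentage of juveniles, adults and megaspawners (hypothetical helper)
#'
#' @param x Numeric vector of lengths.
#' @param lm Length at first maturity.
#' @return Named numeric vector of percentages, rounded to 2 digits.
#' @export
froese_percent <- function(x, lm) {
  # Megaspawner threshold: 70% of the rounded-up maximum length
  mega <- ceiling(max(x)) * 0.7

  # Count the lengths falling in each class
  counts <- c(
    juveniles    = sum(x < lm),
    adults       = sum(x >= lm & x < mega),
    megaspawners = sum(x >= mega)
  )

  # Convert the counts to percentages of the whole sample
  round(counts / length(x) * 100, digits = 2)
}

# Small worked example: the threshold is ceiling(20) * 0.7 = 14
froese_percent(c(5, 8, 10, 12, 20), lm = 10)
# juveniles 40, adults 40, megaspawners 20
```

Keeping the arithmetic separate from the plotting code would make it easier to check the percentages on their own before drawing anything.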
I am trying to work out how to apply it to a data frame or a vector containing lengths, and how to make it work if I want to facet the graph.
Thank you for your kind consideration of this matter.
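One possible direction for the data frame and faceting case is sketched below. It is not a definitive implementation: `froese_facet`, `stage_percentages`, `mega_threshold`, and the example columns `length` and `site` are all hypothetical names, the example data are simulated, and the megaspawner threshold is assumed to be computed separately for each group. The percentages are shown as one label per panel instead of one label per zone, which keeps the sketch short.

```r
library(ggplot2)

# Hypothetical rule, as in the question: 0.7 * the rounded-up maximum length
mega_threshold <- function(v) ceiling(max(v)) * 0.7

# Hypothetical helper: percentage of each stage within each group
stage_percentages <- function(len, grp, lm) {
  mega <- ave(len, grp, FUN = mega_threshold)  # per-group threshold
  stage <- factor(
    ifelse(len < lm, "Juveniles",
           ifelse(len < mega, "Adults", "Megaspawners")),
    levels = c("Juveniles", "Adults", "Megaspawners")
  )
  pct <- prop.table(table(grp = grp, stage = stage), margin = 1) * 100
  as.data.frame(pct, responseName = "percent")
}

# Hypothetical plotting function: one histogram panel per group.
# `length_col` and `group_col` are column names given as strings.
froese_facet <- function(data, length_col, group_col, lm, bw) {
  plot_df <- data.frame(len = data[[length_col]], grp = data[[group_col]])
  plot_df$mega <- ave(plot_df$len, plot_df$grp, FUN = mega_threshold)
  mega_df <- unique(plot_df[c("grp", "mega")])

  # Build one multi-line percentage label per group
  pct <- stage_percentages(plot_df$len, plot_df$grp, lm)
  lab <- vapply(split(pct, pct$grp), function(d) {
    paste0(d$stage, ": ", round(d$percent, 2), "%", collapse = "\n")
  }, character(1))
  lab_df <- data.frame(grp = names(lab), label = unname(lab))

  ggplot(plot_df, aes(x = len)) +
    geom_histogram(binwidth = bw, boundary = 0,
                   colour = "#555555", fill = "#23272A") +
    geom_vline(xintercept = lm, colour = "#D9534F", linetype = "dashed") +
    geom_vline(data = mega_df, aes(xintercept = mega),
               colour = "#D9534F", linetype = "dashed") +
    geom_text(data = lab_df, aes(x = Inf, y = Inf, label = label),
              hjust = 1.1, vjust = 1.1, size = 3, inherit.aes = FALSE) +
    facet_wrap(~ grp)
}

# Simulated example data (not the real sample data)
set.seed(1)
sample_df <- data.frame(
  length = c(rnorm(300, mean = 11, sd = 2), rnorm(300, mean = 13, sd = 2)),
  site   = rep(c("Site A", "Site B"), each = 300)
)
p <- froese_facet(sample_df, "length", "site", lm = 10.2, bw = 0.5)
print(p)
```

A plain vector of lengths could be handled the same way by first wrapping it in a data frame with a single constant group, for example `data.frame(length = x, site = "All")`.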
Edit: This is the sample data. sample-length