How to fit an Ensemble Model Output Statistics (EMOS) model to a large hindcast dataset with verifying observations?

I want to calibrate my hindcast forecasts, which span many years (50 years), using the Ensemble Model Output Statistics (EMOS) technique. Here, I provide sample data for 5 years: 3 years (2006-2008) will be used as training data for fitting an EMOS model, and 2 years (2009-2010) will be used for testing it. How can I use all of the data assigned for training to develop the EMOS model, and then use the fitted model to evaluate its performance on the test data? Or is there an alternative way to use the EMOS technique to meet my objective?

library("ensembleBMA")  # For creating an ens object
#> Loading required package: chron
library("ensembleMOS")  # For EMOS modeling
#> Loading required package: evd
#> 
#> Attaching package: 'ensembleMOS'
#> The following objects are masked from 'package:ensembleBMA':
#> 
#>     brierScore, cdf, crps, quantileForecast, trainingData
library("dplyr")
#> 
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:ensembleBMA':
#> 
#>     combine
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Download a sample data file
if( !dir.exists("data_emos") ){
  system("git clone https://github.com/Peter-Enos/data_emos.git")
}
setwd("data_emos")

inFile <- readr::read_delim("data_file.txt")
#> Rows: 4604 Columns: 11
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr  (1): station
#> dbl (10): YEAR, MONTH, DAY, HR, WS, T2M, MSLP, RH2, dates, observations
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# Partition data to train & test
train_subset <- filter(inFile, YEAR %in% 2006:2008 )  # for modeling EMOS

test_subset  <- filter(inFile, YEAR %in% 2009:2010 )  # for testing 

# Create an ensemble data object for ensembleMOS modeling/training
ens_data_train <- ensembleData( forecasts = train_subset[c("WS", "T2M", "MSLP", "RH2")],
                                       dates = train_subset["dates"],
                                       observations = train_subset["observations"],
                                       #initializationTime = "00",
                                       #forecastHour = 3,
                                       station = train_subset["station"] )
#> Warning in ensembleData(forecasts = train_subset[c("WS", "T2M", "MSLP", : data
#> cannot be used for modeling unless forecast hour is specified
#> Warning in ensembleData(forecasts = train_subset[c("WS", "T2M", "MSLP", :
#> initialization time not specified

# Get training days
train_days <- levels(factor(format( lubridate::ymd_h( ens_data_train$dates), "%Y%m%d")))

# Extract training data
train_data <- ensembleMOS::trainingData( ensembleData = ens_data_train, 
                                         trainingDays = length(train_days)
                                         )
#> Error in seq.default(from = julianDATES[trainingDays] + lag * incr, to = max(julianDATES) + : 'from' must be of length 1
# Fit EMOS model to ensemble 
data_model_fit <- ensembleMOS( ensembleData = train_data, 
                               trainingDays = length(train_days),
                               model = "lognormal" 
                               ) 
#> Error in ensembleMOS(ensembleData = train_data, trainingDays = length(train_days), : object 'train_data' not found

Created on 2022-04-06 by the reprex package (v2.0.1)
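
For comparison, here is a minimal base-R sketch of the workflow asked about: fit one EMOS model on all training years at once, then score it on the held-out years. It does not use the `ensembleMOS` fitting functions. It assumes a Gaussian predictive distribution whose mean and variance are linear in the ensemble mean and ensemble variance, estimated by minimum CRPS with `optim()`. The data are synthetic, and the names `hindcast`, `ens_mean`, `ens_var`, `crps_normal`, `emos_predict`, and `mean_crps` are hypothetical, introduced only for illustration.

```r
# Synthetic stand-in for the hindcast (assumption): 5 years, 4 ensemble members
set.seed(1)
n_per_year <- 200
years <- rep(2006:2010, each = n_per_year)
truth <- rgamma(length(years), shape = 4, scale = 2)
members <- sapply(1:4, function(i) 1 + 0.8 * truth + rnorm(length(truth), sd = 1.5))

hindcast <- data.frame(
  YEAR         = years,
  ens_mean     = rowMeans(members),
  ens_var      = apply(members, 1, var),
  observations = truth + rnorm(length(truth), sd = 0.5)
)

# Static split by year, mirroring the partition in the question
train <- hindcast[hindcast$YEAR %in% 2006:2008, ]
test  <- hindcast[hindcast$YEAR %in% 2009:2010, ]

# Closed-form CRPS of a normal predictive distribution N(mu, sigma^2)
crps_normal <- function(mu, sigma, y) {
  z <- (y - mu) / sigma
  sigma * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

# Predictive mean a + b * ens_mean and variance c^2 + d^2 * ens_var;
# squaring c and d (plus a tiny constant) keeps the variance positive
emos_predict <- function(par, data) {
  list(
    mu    = par[1] + par[2] * data$ens_mean,
    sigma = sqrt(par[3]^2 + par[4]^2 * data$ens_var + 1e-8)
  )
}

# Objective: mean CRPS over all rows of the supplied data
mean_crps <- function(par, data) {
  p <- emos_predict(par, data)
  mean(crps_normal(p$mu, p$sigma, data$observations))
}

# Fit once, using every training case together
fit <- optim(par = c(0, 1, 1, 1), fn = mean_crps, data = train, method = "BFGS")

# Out-of-sample score of the fitted model on the test years
crps_emos <- mean_crps(fit$par, test)

# Baseline: Gaussian built directly from the raw ensemble mean and variance
crps_raw <- mean(crps_normal(test$ens_mean, sqrt(test$ens_var), test$observations))

print(c(emos = crps_emos, raw = crps_raw))
```

A lower mean CRPS on the test years indicates a better calibrated and sharper forecast, so comparing `crps_emos` with `crps_raw` gives a rough measure of what the calibration gained. For a strictly positive variable such as wind speed, a truncated normal or lognormal predictive distribution would likely be a better choice than the plain Gaussian assumed here.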
