Help with Feature Extraction for Holiday variable

I am working on a forecasting project and was wondering what would be a good way to encode holidays in a single variable. Since holidays in Germany are set differently in each of the sixteen states, the variable is not simply holiday or non-holiday, but can take seventeen different values. My question is: what kind of values are typically used here?

Or would it be better to provide sixteen features to the models for all the different states?

Data frame of holidays for 2019:

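library(tibble)     # tibble()
library(lubridate)  # as_date()

# One row per day of 2019; each state column is 1 on a holiday and 0 otherwise
tibble(Datum=as_date(1:365, origin = "2018-12-31"),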
                        BW=c(rep(1,5), rep(0,99), rep(1,13), rep(0,44), rep(1,11), rep(0,37), rep(1,44), rep(0,47), rep(1,4), rep(0,52), rep(1,9)),
                        BAY=c(rep(1,5), rep(0,57), rep(1,5), rep(0,37), rep(1,13), rep(0,44), rep(1,11), rep(0,37), rep(1,43), rep(0,48), rep(1,4), rep(0,19), rep(1,1), rep(0,32), rep(1,9)),
                        BER=c(rep(1,5), rep(0,29), rep(1,6), rep(0,64), rep(1,12), rep(0,34), rep(1,1), rep(0,10), rep(1,1), rep(0,8), rep(1,44), rep(0,62), rep(1,1), rep(0,2), rep(1,13), rep(0,64), rep(1,9)),
                        BRA=c(rep(1,5), rep(0,29), rep(1,6), rep(0,64), rep(1,12), rep(0,54), rep(1,45), rep(0,61), rep(1,15), rep(0,65), rep(1,9)),
                        BRE=c(rep(1,4), rep(0,26), rep(1,2), rep(0,63), rep(1,18), rep(0,37), rep(1,1), rep(0,10), rep(1,1), rep(0,22), rep(1,42), rep(0,50), rep(1,15), rep(0,63), rep(1,11)),
                        HH=c(rep(1,4), rep(0,27), rep(1,1), rep(0,30), rep(1,12), rep(0,58), rep(1,5), rep(0,13), rep(1,1), rep(0,26), rep(1,42), rep(0,57), rep(1,15), rep(0,13), rep(1,1), rep(0,48), rep(1,12)),
                        HES=c(rep(1,12), rep(0,92), rep(1,13), rep(0,64), rep(1,40), rep(0,51), rep(1,13), rep(0,71), rep(1,9)),
                        MV=c(rep(1,5), rep(0,29), rep(1,12), rep(0,58), rep(1,10), rep(0,36), rep(1,1), rep(0,6), rep(1,5), rep(0,19), rep(1,41), rep(0,54), rep(1,1), rep(0,2), rep(1,6), rep(0,19), rep(1,1), rep(0,51), rep(1,9)),
                        N=c(rep(1,4), rep(0,26), rep(1,2), rep(0,65), rep(1,16), rep(0,37), rep(1,12), rep(0,22), rep(1,42), rep(0,50), rep(1,15), rep(0,65), rep(1,9)),
                        NW=c(rep(1,4), rep(0,100), rep(1,13), rep(0,44), rep(1,1), rep(0,33), rep(1,44), rep(0,47), rep(1,13), rep(0,57), rep(1,9)),
                        RP=c(rep(1,4), rep(0,51), rep(1,5), rep(0,52), rep(1,8), rep(0,61), rep(1,40), rep(0,51), rep(1,12), rep(0,72), rep(1,9)),
                        SL=c(rep(1,4), rep(0,51), rep(1,9), rep(0,42), rep(1,10), rep(0,65), rep(1,40), rep(0,58), rep(1,12), rep(0,65), rep(1,9)),
                        SA=c(rep(1,4), rep(0,44), rep(1,13), rep(0,47), rep(1,8), rep(0,34), rep(1,1), rep(0,37), rep(1,40), rep(0,58), rep(1,12), rep(0,56), rep(1,11)),
                        SAA=c(rep(1,4), rep(0,37), rep(1,5), rep(0,61), rep(1,13), rep(0,30), rep(1,2), rep(0,32), rep(1,42), rep(0,50), rep(1,8), rep(0,20), rep(1,1), rep(0,51), rep(1,9)),
                        SH=c(rep(1,4), rep(0,89), rep(1,15), rep(0,42), rep(1,1), rep(0,30), rep(1,41), rep(0,54), rep(1,15), rep(0,65), rep(1,9)),
                        TH=c(rep(1,4), rep(0,37), rep(1,5), rep(0,58), rep(1,13), rep(0,33), rep(1,1), rep(0,37), rep(1,41), rep(0,50), rep(1,13), rep(0,62), rep(1,11)))

It seems like it would be possible to have a flag variable meaning 'it is a holiday in the region that the observation under consideration belongs to', or its negation.

If only I knew what exactly you mean by that :no_mouth:
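
One possible reading of the flag suggestion, as a rough sketch: reshape the wide calendar to long format (one row per date and state), then join it onto the observations by date and state, so each observation gets a single 0/1 flag for its own region. This assumes every observation belongs to exactly one state. The `observations` table, its `state` and `sales` columns, and the name `is_holiday` are made up for illustration, and `holidays_wide` is a four-day toy stand-in for the 2019 data frame above.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Toy stand-in for the wide calendar: one row per date, one 0/1 column per state
holidays_wide <- tibble(
  Datum = as.Date("2019-01-01") + 0:3,
  BW    = c(1, 1, 0, 0),
  BER   = c(1, 0, 0, 1)
)

# Hypothetical observations to forecast (column names are assumptions)
observations <- tibble(
  Datum = as.Date(c("2019-01-02", "2019-01-02", "2019-01-04")),
  state = c("BW", "BER", "BER"),
  sales = c(120, 95, 130)
)

# Long format: one row per (date, state) pair
holidays_long <- holidays_wide %>%
  pivot_longer(cols = -Datum, names_to = "state", values_to = "is_holiday")

# Look up the flag for each observation's own state and date
features <- observations %>%
  left_join(holidays_long, by = c("Datum", "state"))

print(features)
```

With this layout the model sees one binary feature instead of sixteen columns. If the observations are national rather than per state, the sixteen separate columns (or a summary such as the share of states on holiday) may be the better fit.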
