I am working on a forecasting project and am wondering how best to encode holidays in a single variable. Because holidays in Germany are set separately by each of the sixteen states, the variable is not simply binary (holiday vs. non-holiday) but can take seventeen different values. What kind of values are typically used for this?
Or would it be better to give the models sixteen separate features, one per state?
Data frame of the 2019 holidays, with one 0/1 column per state (1 = holiday, 0 = no holiday):
library(tibble)     # tibble()
library(lubridate)  # as_date()
tibble(Datum=as_date(1:365, origin = "2018-12-31"),
BW=c(rep(1,5), rep(0,99), rep(1,13), rep(0,44), rep(1,11), rep(0,37), rep(1,44), rep(0,47), rep(1,4), rep(0,52), rep(1,9)),
BAY=c(rep(1,5), rep(0,57), rep(1,5), rep(0,37), rep(1,13), rep(0,44), rep(1,11), rep(0,37), rep(1,43), rep(0,48), rep(1,4), rep(0,19), rep(1,1), rep(0,32), rep(1,9)),
BER=c(rep(1,5), rep(0,29), rep(1,6), rep(0,64), rep(1,12), rep(0,34), rep(1,1), rep(0,10), rep(1,1), rep(0,8), rep(1,44), rep(0,62), rep(1,1), rep(0,2), rep(1,13), rep(0,64), rep(1,9)),
BRA=c(rep(1,5), rep(0,29), rep(1,6), rep(0,64), rep(1,12), rep(0,54), rep(1,45), rep(0,61), rep(1,15), rep(0,65), rep(1,9)),
BRE=c(rep(1,4), rep(0,26), rep(1,2), rep(0,63), rep(1,18), rep(0,37), rep(1,1), rep(0,10), rep(1,1), rep(0,22), rep(1,42), rep(0,50), rep(1,15), rep(0,63), rep(1,11)),
HH=c(rep(1,4), rep(0,27), rep(1,1), rep(0,30), rep(1,12), rep(0,58), rep(1,5), rep(0,13), rep(1,1), rep(0,26), rep(1,42), rep(0,57), rep(1,15), rep(0,13), rep(1,1), rep(0,48), rep(1,12)),
HES=c(rep(1,12), rep(0,92), rep(1,13), rep(0,64), rep(1,40), rep(0,51), rep(1,13), rep(0,71), rep(1,9)),
MV=c(rep(1,5), rep(0,29), rep(1,12), rep(0,58), rep(1,10), rep(0,36), rep(1,1), rep(0,6), rep(1,5), rep(0,19), rep(1,41), rep(0,54), rep(1,1), rep(0,2), rep(1,6), rep(0,19), rep(1,1), rep(0,51), rep(1,9)),
N=c(rep(1,4), rep(0,26), rep(1,2), rep(0,65), rep(1,16), rep(0,37), rep(1,12), rep(0,22), rep(1,42), rep(0,50), rep(1,15), rep(0,65), rep(1,9)),
NW=c(rep(1,4), rep(0,100), rep(1,13), rep(0,44), rep(1,1), rep(0,33), rep(1,44), rep(0,47), rep(1,13), rep(0,57), rep(1,9)),
RP=c(rep(1,4), rep(0,51), rep(1,5), rep(0,52), rep(1,8), rep(0,61), rep(1,40), rep(0,51), rep(1,12), rep(0,72), rep(1,9)),
SL=c(rep(1,4), rep(0,51), rep(1,9), rep(0,42), rep(1,10), rep(0,65), rep(1,40), rep(0,58), rep(1,12), rep(0,65), rep(1,9)),
SA=c(rep(1,4), rep(0,44), rep(1,13), rep(0,47), rep(1,8), rep(0,34), rep(1,1), rep(0,37), rep(1,40), rep(0,58), rep(1,12), rep(0,56), rep(1,11)),
SAA=c(rep(1,4), rep(0,37), rep(1,5), rep(0,61), rep(1,13), rep(0,30), rep(1,2), rep(0,32), rep(1,42), rep(0,50), rep(1,8), rep(0,20), rep(1,1), rep(0,51), rep(1,9)),
SH=c(rep(1,4), rep(0,89), rep(1,15), rep(0,42), rep(1,1), rep(0,30), rep(1,41), rep(0,54), rep(1,15), rep(0,65), rep(1,9)),
TH=c(rep(1,4), rep(0,37), rep(1,5), rep(0,58), rep(1,13), rep(0,33), rep(1,1), rep(0,37), rep(1,41), rep(0,50), rep(1,13), rep(0,62), rep(1,11)))
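
To make the two options concrete, here is a minimal sketch on a toy excerpt. The three state columns, the six dates, and the 0/1 values are made up for illustration, and the column names `n_states_holiday` and `share_holiday` are my own placeholders. It only shows one possible way to collapse the per-state columns into a single number (the count of states on holiday, or that count as a share), next to simply keeping one column per state. I do not know whether either is the usual choice.

```r
library(tibble)

# Toy excerpt: six days, three state columns (values are made up for illustration)
toy <- tibble(
  Datum = as.Date("2019-01-01") + 0:5,
  BW    = c(1, 1, 0, 0, 0, 1),
  BAY   = c(1, 1, 1, 0, 0, 0),
  BER   = c(1, 0, 0, 0, 1, 1)
)

state_cols <- c("BW", "BAY", "BER")

# Option A: keep the wide layout as-is, i.e. one 0/1 feature per state.
wide_features <- toy[state_cols]

# Option B: collapse to a single feature.
# Number of states on holiday per day (0 .. number of states) ...
toy$n_states_holiday <- as.numeric(rowSums(toy[state_cols]))
# ... or the same information scaled to the range [0, 1].
toy$share_holiday <- toy$n_states_holiday / length(state_cols)

print(toy$n_states_holiday)  # 3 2 1 0 1 2
```

With the full data frame above, the single feature would range from 0 to 16, which is where the seventeen possible values come from. It loses the information about which states are on holiday, whereas the sixteen separate columns keep it.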