Multivariate imputing missing values in weather data

0

I need to get a weather dataset ready as input to keras. I have 1096 entries over 3 years of daily data of which first month is missing. I got one of the columns filled in for temperature from a nearby weather station. However, to check which imputation fits best, I deleted these 30 values and kept all columns as NA for first month. Then, I tried various imputing packages including 1. Mice - gave continuous values but too high average; 2. KNN (VIM) gave a constant value too high 3.MissForest - gave constant value too high; 4. imputeTS_interpolation - gave continuous value slightly low; 5. imputeTS_seasonal - gave constant value slight low. Therefore, I selected imputeTS_interpolation. And used this to now impute the remaining columns after filling the temperature column with actual values. However, I cannot seem to get the seasonality in imputeTS working. Any idea why? Please find below the data file and code used: Code:

# #MICE did not match with historical data too high avg of 10.2
# impute <- mice(gf, method = "pmm")
# print(impute)
# xyplot(impute, Temp ~ Reco | .imp, pch = 20, cex =1.4)
# mf <- complete(impute, 3)
# mf <- cbind(mf, Date = df$Date.Time)
# write.csv(mf, "Mice_imputed.csv", row.names=TRUE)
# View(mf)
# 
# View(gf_impo)
# ##Using Miss Forest ;) too high contstant 9.3
# gf_impo <- missForest(gf, maxiter = 100, ntree = 500)
# gf_impo$ximp
# gf_impo <- cbind(gf_impo, Date = df$Date.Time)
# write.csv(gf_impo$ximp, "Val_MissForest_imputed.csv", row.names=TRUE)
# class(gf_impo)


##KNN using VIM too high constant 13
imp_knn <- kNN(gf, k = 500)
aggr(imp_knn, delimiter = "imp")
View(imp_knn)
imp_knn <- cbind(imp_knn, Date = df$Date.Time)
write.csv(imp_knn, "Val_KNN_imputed.csv", row.names=TRUE)
View(imp_seas)

#imputeTS

#for seasonal imputation
imp_seas <- gf
imp_seas <- cbind(imp_seas, Date = df$Date.Time)
View(imp_seas)
View(imp_TS_intn)
imp_TS_intn <- na_interpolation(imp_seas, option = "spline") #avg of 2.83 close to real 4.1
# imp_TS_seas <- na_seasplit(imp_seas, algorithm = "interpolation", find_frequency = FALSE, maxgap = Inf)
#const 2.7
write.csv(imp_TS_intn, "ML_impTS_interpolate.csv", row.names=TRUE)

Data:

             A           B             C          D           E       Date
1           NA          NA            NA  5.4000000          NA 2018-01-01
2           NA          NA            NA  5.7500000          NA 2018-01-02
3           NA          NA            NA  6.8000000          NA 2018-01-03
4           NA          NA            NA  6.3500000          NA 2018-01-04
5           NA          NA            NA  3.3500000          NA 2018-01-05
6           NA          NA            NA  3.0500000          NA 2018-01-06
7           NA          NA            NA  2.2000000          NA 2018-01-07
8           NA          NA            NA  0.6500000          NA 2018-01-08
9           NA          NA            NA  2.8500000          NA 2018-01-09
10          NA          NA            NA  2.2000000          NA 2018-01-10
11          NA          NA            NA  2.3500000          NA 2018-01-11
12          NA          NA            NA  5.1000000          NA 2018-01-12
13          NA          NA            NA  6.5500000          NA 2018-01-13
14          NA          NA            NA  5.0000000          NA 2018-01-14
15          NA          NA            NA  5.7500000          NA 2018-01-15
16          NA          NA            NA  2.0000000          NA 2018-01-16
17          NA          NA            NA  5.0500000          NA 2018-01-17
18          NA          NA            NA  3.8500000          NA 2018-01-18
19          NA          NA            NA  2.4500000          NA 2018-01-19
20          NA          NA            NA  5.1500000          NA 2018-01-20
21          NA          NA            NA  6.7500000          NA 2018-01-21
22          NA          NA            NA  9.2500000          NA 2018-01-22
23          NA          NA            NA  9.5000000          NA 2018-01-23
24          NA          NA            NA  6.4500000          NA 2018-01-24
25          NA          NA            NA  5.4000000          NA 2018-01-25
26          NA          NA            NA  5.3500000          NA 2018-01-26
27          NA          NA            NA  6.5500000          NA 2018-01-27
28          NA          NA            NA 10.1000000          NA 2018-01-28
29          NA          NA            NA  6.6000000          NA 2018-01-29
30          NA          NA            NA  3.8500000          NA 2018-01-30
31          NA          NA            NA  2.9000000          NA 2018-01-31
32  0.05374951 0.041144312  0.0023696211  5.9902083 0.068784302 2018-02-01
33  0.07565470 0.012326176  0.0057481689 10.5280417 0.176209125 2018-02-02
34  0.04476314 0.113718139  0.0089845444 12.8125000 0.176408788 2018-02-03
35  0.01695546 0.060965133 -0.0034163682 16.9593750 0.000000000 2018-02-04
36  0.09910202 0.090170142 -0.0111946461 10.4867292 0.088337951 2018-02-05
37  0.08514839 0.026061013 -0.0029183210  7.1662500 0.085590326 2018-02-06
38  0.06724108 0.104761909 -0.0416036605  6.9130417 0.134828348 2018-02-07
39  0.07638534 0.097570813 -0.0192784571  3.3840000 0.029682717 2018-02-08
40  0.02568162 0.008244304 -0.0288903610 12.0282292 0.055817103 2018-02-09
41  0.02752688 0.088544666 -0.0172136911  6.8694792 0.098169954 2018-02-10
42  0.06643098 0.063321337 -0.0347752292  7.4539792 0.034110652 2018-02-11
43  0.09743445 0.057502178  0.0162851223 13.9365208 0.264168082 2018-02-12
44  0.09189575 0.034429904  0.0020940613 13.8687292 0.162341764 2018-02-13
45  0.07857244 0.009406862  0.0075904680 11.7800000 0.101283946 2018-02-14
46  0.01987263 0.024783795 -0.0088742973  4.4463750 0.063949011 2018-02-15
47  0.02332892 0.010138857  0.0091396448  5.6452292 0.034708981 2018-02-16
48  0.02022396 0.014207518  0.0036018714 14.2862500 0.043205299 2018-02-17
49  0.07043020 0.075317793  0.0036760070  5.5940208 0.171898590 2018-02-18
50  0.02120779 0.010461857 -0.0277470177 13.6131250 0.061486533 2018-02-19
51  0.06405819 0.034185344  0.0173606568  7.0551042 0.052148976 2018-02-20
52  0.09428869 0.026957653  0.0016863903  6.7955000 0.085888435 2018-02-21
53  0.04248937 0.048782786  0.0004039921 17.5706250 0.000000000 2018-02-22
54  0.02076763 0.038094949 -0.0003671638 14.8379167 0.000000000 2018-02-23
55  0.01343260 0.118003726 -0.0214988345  6.4564583 0.053353606 2018-02-24
56  0.05231647 0.054454132 -0.0098012290  7.8568333 0.183326943 2018-02-25
57  0.02476706 0.087501472  0.0031839472 15.7493750 0.210616272 2018-02-26
58  0.07358998 0.023558218  0.0031618607 10.8001250 0.241602571 2018-02-27
59  0.02042573 0.009268439  0.0088051496  7.2967500 0.251608940 2018-02-28
60  0.02107772 0.083567750 -0.0037223644  6.2674375 0.062221630 2018-03-01
61  0.05830801 0.029456683  0.0114978078 13.0810417 0.193765948 2018-03-02
62  0.02923587 0.070533843  0.0068299668 14.4095833 0.244310193 2018-03-03
63  0.02570283 0.058270093  0.0137174366  3.8527917 0.120846709 2018-03-04
64  0.01434395 0.014637405  0.0051951050  9.6877083 0.112579011 2018-03-05
65  0.06426214 0.078872579  0.0068664343  4.6763750 0.000000000 2018-03-06
66  0.04782772 0.011762501  0.0086182870 12.7027083 0.129606106 2018-03-07
67  0.01809136 0.105398844  0.0231671305 10.8052083 0.017683908 2018-03-08
68  0.04427582 0.020397435 -0.0009758693  6.5983333 0.041148864 2018-03-09
69  0.05123687 0.115984361 -0.0372104856  6.5021250 0.180013174 2018-03-10
70  0.01913266 0.005981014 -0.0159701842  8.9844375 0.095262921 2018-03-11
71  0.04407234 0.009142247 -0.0031640496  7.7638333 0.000000000 2018-03-12
72  0.09108709 0.038174205  0.0005654564  5.3772083 0.044105747 2018-03-13
73  0.05488394 0.115153937  0.0192819858  8.9182917 0.039993864 2018-03-14
74  0.03726892 0.067983475 -0.0311367032  2.4423333 0.066108171 2018-03-15
75  0.05563102 0.003831231 -0.0011148743 10.7100000 0.217461791 2018-03-16
76  0.04922930 0.055446609  0.0075246331  5.0829375 0.149530704 2018-03-17
77  0.02972858 0.061966039 -0.0392014211 12.3645625 0.060670492 2018-03-18
78  0.02812688 0.018183092  0.0134514770  9.0172292 0.158435250 2018-03-19
79  0.03066101 0.007622504 -0.0249482114  6.2709792 0.118487919 2018-03-20
80  0.06801767 0.083261012  0.0133423296 13.3683333 0.196053774 2018-03-21
81  0.04178157 0.093600914  0.0116253865 10.0024167 0.020835522 2018-03-22
82  0.04725052 0.018187748 -0.0115718535 10.3528333 0.097352796 2018-03-23
83  0.02042339 0.081504844 -0.0380958738 17.2006250 0.010500742 2018-03-24
84  0.06674396 0.098739090 -0.0108474961 17.5437500 0.119415595 2018-03-25
85  0.07049507 0.016286614 -0.0007817195 16.8800000 0.060452087 2018-03-26
86  0.01244906 0.018100693 -0.0266155999  8.8651458 0.018144668 2018-03-27
87  0.05271711 0.015368632 -0.0477885811  7.2415417 0.092797451 2018-03-28
88  0.01610886 0.014919094  0.0023487944  7.7914792 0.062818728 2018-03-29
89  0.08847253 0.059397043  0.0130362880 10.9732708 0.087451484 2018-03-30
90  0.02938725 0.044473745  0.0091253257  6.0241458 0.025488946 2018-03-31
91  0.08599249 0.043160908  0.0082536160  8.8211875 0.012975783 2018-04-01
92  0.05747667 0.017709243 -0.0090965038  6.3249375 0.065731818 2018-04-02
93  0.05772051 0.085210524 -0.0013533831 13.4166667 0.067790160 2018-04-03
94  0.01699834 0.020657341  0.0039885065  3.2999792 0.076302652 2018-04-04
95  0.03565076 0.110372607 -0.0313309140 12.7822083 0.184844707 2018-04-05
96  0.02050401 0.078943608 -0.0062322339  4.3233125 0.067820413 2018-04-06
97  0.06186790 0.013147512  0.0203249289  6.3953750 0.034104318 2018-04-07
98  0.06304988 0.012997642  0.0061171825  9.7322708 0.021220516 2018-04-08
99  0.03799006 0.012420760  0.0054724563  8.8472083 0.068664033 2018-04-09
100 0.01610225 0.061182804  0.0031002885  7.5622708 0.085766429 2018-04-10
101 0.05937683 0.008333173 -0.0053972689  7.8848542 0.058386726 2018-04-11
102 0.02190115 0.037843227  0.0089823372  8.3339792 0.055761391 2018-04-12
103 0.01179665 0.016899394 -0.0016533437  5.5101667 0.099133313 2018-04-13
104 0.02464707 0.021231270 -0.0212016846 15.5106250 0.126661378 2018-04-14
105 0.01906818 0.065273389  0.0081694393  7.6616667 0.032939519 2018-04-15
106 0.05418785 0.074619385 -0.0355680586 11.3618750 0.057768261 2018-04-16
107 0.06508988 0.014345229  0.0080423912 14.7137500 0.032709791 2018-04-17
108 0.06101126 0.060624597 -0.0399526978 17.2754167 0.230982139 2018-04-18
109 0.02226268 0.010230837  0.0001617419  2.9382083 0.000000000 2018-04-19
110 0.03884772 0.014218453  0.0039652960 10.7261875 0.179962834 2018-04-20
111 0.09054488 0.025711098 -0.0115944362  4.4734583 0.011442318 2018-04-21
112 0.03072171 0.076530730  0.0032123501  9.4128750 0.033174489 2018-04-22
113 0.04361276 0.101151670  0.0249408843 14.5804167 0.024238883 2018-04-23
114 0.03877568 0.049142846  0.0080689866  8.3168750 0.084570611 2018-04-24
115 0.05564027 0.076917047  0.0033447160 15.7308333 0.199762524 2018-04-25
116 0.04752544 0.019655228 -0.0063218138 15.7302083 0.020449908 2018-04-26
117 0.01718916 0.026132806 -0.0261027525 10.0887500 0.128898351 2018-04-27
118 0.04144832 0.034526516  0.0117868820  6.0784375 0.014449565 2018-04-28
119 0.03255833 0.113650910 -0.0123724759 11.8654167 0.085410171 2018-04-29
120 0.03656535 0.043333607  0.0230071368  7.0974167 0.035725321 2018-04-30
121 0.04570760 0.093595938 -0.0329915968  5.4016458 0.013467946 2018-05-01
122 0.07271528 0.061923504  0.0130002656  9.1602292 0.018299062 2018-05-02
123 0.02646133 0.007506529 -0.0276898846  0.2338125 0.246100834 2018-05-03
124 0.02379895 0.067273612  0.0112587565 19.1260417 0.120707266 2018-05-04
125 0.05925152 0.075768053  0.0050178925 16.2114583 0.162884739 2018-05-05
126 0.01858152 0.040845398  0.0164467420 12.9156250 0.028823967 2018-05-06
127 0.06994835 0.059457560 -0.0181926787  7.7316042 0.035106399 2018-05-07
128 0.05926409 0.038623605  0.0167222227 13.5464583 0.055665220 2018-05-08
129 0.03104010 0.006805893 -0.0141792029 14.5006250 0.012099383 2018-05-09
130 0.06631012 0.059314975 -0.0228020931 13.3711875 0.073114370 2018-05-10
131 0.03794480 0.015615642  0.0034917459 16.6675208 0.191141576 2018-05-11
132 0.03532917 0.050988581  0.0079455282 14.7375208 0.214172062 2018-05-12
133 0.08512617 0.063322454  0.0224309652 11.6861250 0.166425889 2018-05-13
134 0.04498265 0.012386160 -0.0051629339  7.2488333 0.280120908 2018-05-14
135 0.06383512 0.126840241 -0.0172296864 17.3852083 0.020363429 2018-05-15
136 0.06932861 0.026819550 -0.0109061610 20.9152083 0.099516538 2018-05-16
137 0.04020292 0.021831228 -0.0007211804  6.7122292 0.069831669 2018-05-17
138 0.02037474 0.020931810  0.0088341962 15.8758333 0.130548701 2018-05-18
139 0.01704143 0.105810563 -0.0243003529 10.7339583 0.038013440 2018-05-19
140 0.01266417 0.013985439  0.0091359503  6.5119375 0.196746897 2018-05-20
141 0.03623625 0.057182212 -0.0136101306 18.6637500 0.009431062 2018-05-21
142 0.03938695 0.054879146  0.0091277482 15.5393750 0.115389187 2018-05-22
143 0.05995812 0.061925644 -0.0029137774 11.8191667 0.015729774 2018-05-23
144 0.06548692 0.095240991  0.0055356839  4.3011875 0.081309326 2018-05-24
145 0.01582489 0.015264434 -0.0020079231  9.3315833 0.105132636 2018-05-25
146 0.06834050 0.028756388 -0.0512068435 13.6035417 0.212930829 2018-05-26
147 0.08354736 0.023524928  0.0041989465  4.5111250 0.227197329 2018-05-27
148 0.05738595 0.011159952 -0.0225834032 12.9385417 0.090503870 2018-05-28
149 0.07817132 0.103507587 -0.0222426051 13.4047292 0.034928812 2018-05-29
150 0.04773356 0.035856991 -0.0191600449  9.6657708 0.019893986 2018-05-30

Disclaimer: I am looking for a co-author for help in validating my work with keras / tensor flow

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.