Forecasting export Wh data with irregularly spaced time series


#1

I am new to R and am analyzing export Wh data for my project.
The original CSV file (org_data.csv) that I am analyzing can be found at this link:
[https://drive.google.com/open?id=12a3EfbSKKuPRAYUC-c58tbnBaiVlweVI]

The original CSV file contains 1,584,823 records from 157 meters,
recorded from 1-Oct-2015 00:00:00 to 31-Mar-2016 23:59:59.

The CSV file has three columns: localminute, dataid, and meter_value.

  • localminute is formatted as “yyyy-mm-dd hh:mm:ss” followed by the
    UTC time offset (for example, -05).
  • dataid is the ID number of each of the 157 meters.
  • meter_value is the export Wh reading.

A quick view of the original CSV file:

localminute,dataid,meter_value
2015-10-01 00:00:10-05,739,88858
2015-10-01 00:00:13-05,8890,197164
2015-10-01 00:00:20-05,6910,179118
2015-10-01 00:00:22-05,3635,151318
2015-10-01 00:00:22-05,1507,390354
2015-10-01 00:00:29-05,5810,97506
2015-10-01 00:01:18-05,6910,179118
2015-10-01 00:01:34-05,6910,179118
2015-10-01 00:01:38-05,5810,97506
2015-10-01 00:01:43-05,4352,218216
2015-10-01 00:01:46-05,484,99298
2015-10-01 00:01:56-05,1718,161076
2015-10-01 00:01:58-05,1714,147048
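
The localminute values above carry a two-digit UTC offset, which base R's %z format does not parse directly. As a minimal sketch (base R only, and assuming every offset in the file is a whole number of hours written as two digits), the offset can be padded to the ±hhmm form before conversion. The names readings, padded, and time_utc are illustrative only.

```r
# A few rows copied from the CSV preview above.
csv_text <- "localminute,dataid,meter_value
2015-10-01 00:00:10-05,739,88858
2015-10-01 00:00:13-05,8890,197164
2015-10-01 00:00:20-05,6910,179118
2015-10-01 00:01:18-05,6910,179118"

readings <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# %z expects the offset as +hhmm or -hhmm, so pad "-05" to "-0500".
# Assumption: all offsets are whole hours written with two digits.
padded <- paste0(readings$localminute, "00")
readings$time_utc <- as.POSIXct(padded,
                                format = "%Y-%m-%d %H:%M:%S%z",
                                tz = "UTC")

# Meter IDs are labels, not quantities, so store them as a factor.
readings$dataid <- factor(readings$dataid)

str(readings)
format(readings$time_utc[1], "%Y-%m-%d %H:%M:%S")  # "2015-10-01 05:00:10"
```

Storing every timestamp in UTC keeps all 157 meters on one clock, so the offset only matters again when results are shown in local time.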

I have plotted the existing data for all 157 meters using facets.

But I am stuck on writing an algorithm to predict the export Wh of all 157 meters (from 01-Apr-2016 00:00:00 to 31-Dec-2016 11:59:59).

How should I develop a forecasting model for all 157 meters, given this UTC time offset?

I have to submit this prediction algorithm by 20-Oct-2018, so I would be grateful for any help.


#2

It sounds like this might be a homework question. If it is, please see our homework policy.

If you've written some code and are having a particular problem, please add a minimal reprex (reproducible example) to your post. This will ensure we're all looking at the same data and code. A guide for creating a reprex can be found here.


#3

Hi @may! I'm afraid this is rather an enormous question. If you've already gotten started writing code based on any references you may have read (books, websites), it would help if you could provide the code you have and links to the references you've consulted.

If you haven't gotten started yet, I think the best that anybody's going to be able to do is to give you general pointers at references where you can start learning about forecasting. There's not one way to do this sort of thing — your statistical judgement and your knowledge of the larger project and goals (where do the data come from? what conditions do you expect to be different over the new time period? what is the forecast going to be used for?) will always play a big role in choosing and implementing your method. You and the people you directly work with are best positioned to provide these crucial elements.

All that said, if you are just getting started, this is a go-to reference: https://otexts.org/fpp2/
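
To make the general shape of such a workflow concrete, here is a rough sketch in base R, not a recommendation of a specific model. It uses synthetic data for a single hypothetical meter, puts the irregular readings on an hourly grid by carrying the last reading forward, and fits a plain AR(1) with stats::arima() to the hourly differences. The hourly grid, the carry-forward rule, and the model order are all assumptions to revisit against the real data.

```r
set.seed(1)

# Synthetic stand-in for ONE meter: irregular timestamps, cumulative counter.
n <- 500
t0 <- as.POSIXct("2015-10-01 00:00:00", tz = "UTC")
times <- t0 + cumsum(rexp(n, rate = 1 / 900))    # gaps average 15 minutes
meter_value <- 88858 + cumsum(2 * rpois(n, 2))   # never decreases

# Regular hourly grid that lies inside the observed time range.
first_hour <- as.POSIXct(trunc(min(times), "hours")) + 3600
last_hour  <- as.POSIXct(trunc(max(times), "hours"))
grid <- seq(first_hour, last_hour, by = "hour")

# Carry the last observed reading forward onto each grid point.
hourly_value <- approx(as.numeric(times), meter_value,
                       xout = as.numeric(grid),
                       method = "constant", f = 0)$y

# Model hourly usage (differences) rather than the ever-growing counter.
usage <- ts(diff(hourly_value), frequency = 24)
fit <- arima(usage, order = c(1, 0, 0))
fc <- predict(fit, n.ahead = 24)

# Turn forecast usage back into forecast meter readings.
forecast_value <- tail(hourly_value, 1) + cumsum(as.numeric(fc$pred))
head(forecast_value)
```

With real data, the same steps would run once per dataid, and the model choice (for example auto.arima() from the forecast package) is where the judgement described above comes in.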


#4

Hi All,

I tried using the ARIMA method to forecast the next 24 hours, but I could not get the forecasted meter_value.

I would appreciate your help in getting the forecasted meter_value.

To answer Jcblum's question:
The data are from Pacan's street: export Wh readings from gas meters in neighbouring homes.
Since the existing values (up to 01-Apr-2016 04:59:58) increase over the period, I expect ups and downs over the next 8 months (that is, forecasting through 31-Dec-2016 11:59:59).
The forecast will be used for learning purposes, to show that I have learned how to forecast using R.

library(dplyr)     # provides %>% and mutate()
library(forecast)  # provides auto.arima() and forecast()

# For this large CSV file (0.056 GB), data.table::fread() is much faster than
# read.csv(): it reads 1,584,823 rows and 3 columns in about 5 seconds.
sensor_data <- data.table::fread("org_data.csv")

# Convert localminute to a datetime (fread imports it as character) and
# dataid (integer) to a factor.
sensor_data <- sensor_data %>%
  mutate(localminute = lubridate::as_datetime(localminute),
         dataid = factor(dataid))
str(sensor_data)  # sensor_data is the original data set
sensor_dataTS <- ts(sensor_data$localminute)
meterValueTS <- ts(sensor_data$meter_value)
Alldata <- data.frame(sensor_dataTS, sensor_data$dataid, meterValueTS)
# Column 1 of Alldata is the timestamp series (sensor_dataTS), not meter_value.
arima_fit <- auto.arima(Alldata[, 1])
arima_forecast <- forecast(arima_fit, h = 10)
arima_forecast
# Output:
        Point Forecast      Lo 80      Hi 80      Lo 95      Hi 95
1584824     1459486811 1459484733 1459488890 1459483633 1459489990
1584825     1459486825 1459483762 1459489888 1459482141 1459491509
1584826     1459486838 1459483004 1459490673 1459480974 1459492702
1584827     1459486852 1459482351 1459491353 1459479968 1459493735
1584828     1459486865 1459481761 1459491970 1459479059 1459494672
1584829     1459486879 1459481213 1459492544 1459478214 1459495543
1584830     1459486892 1459480697 1459493088 1459477417 1459496367
1584831     1459486906 1459480203 1459493608 1459476655 1459497156
1584832     1459486919 1459479727 1459494111 1459475920 1459497918
1584833     1459486932 1459479265 1459494600 1459475206 1459498659
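
The point forecasts above look like timestamps rather than meter readings: Alldata[,1] is the sensor_dataTS column, and ts() stores datetimes as seconds since 1970-01-01 UTC. The sketch below (base R only) decodes the first forecast value to check this, then shows one possible way to pick out the readings of a single meter by name before modelling. The small data frame is a stand-in built from rows of the CSV preview with times already converted to UTC, and one_meter is an illustrative name.

```r
# First point forecast from the output above, read as epoch seconds.
first_forecast <- 1459486811
decoded <- as.POSIXct(first_forecast, origin = "1970-01-01", tz = "UTC")
format(decoded, "%Y-%m-%d %H:%M:%S")  # "2016-04-01 05:00:11"

# Small stand-in for sensor_data after the datetime conversion.
sensor_data <- data.frame(
  localminute = as.POSIXct(c("2015-10-01 05:00:20", "2015-10-01 05:00:29",
                             "2015-10-01 05:01:18", "2015-10-01 05:01:38"),
                           tz = "UTC"),
  dataid = factor(c(6910, 5810, 6910, 5810)),
  meter_value = c(179118, 97506, 179118, 97506)
)

# ts() on a datetime column leaves plain epoch seconds behind.
epoch_seconds <- as.numeric(ts(sensor_data$localminute))
is.numeric(epoch_seconds)

# Select the readings by column name, one meter at a time, in time order.
one_meter <- sensor_data[sensor_data$dataid == "6910", ]
one_meter <- one_meter[order(one_meter$localminute), ]
one_meter$meter_value
```

Selecting meter_value by name avoids depending on column position, and filtering on dataid keeps readings from different meters out of the same series.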