# How can I make a histogram with track of distribution?

Hello!
I've tried to make a histogram with figure by using data but failed to make it.Could you help me to find the solution?

`````` rain <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat",skip=1)
rainseries <- ts(rain,start=c(1813))
rainseriesforecasts <- HoltWinters(rainseries, beta=FALSE, gamma=FALSE)
library("forecast")
rainseriesforecasts2 <- forecast:::forecast.HoltWinters(rainseriesforecasts, h=8)
rainseriesforecasts2
``````

I made a function of plotForecastError as below and tried it to show a graph but got some errors.

`````` plotForecastErrors <- function(forecasterrors)
{ mysd <- sd(forecasterrors)
mymin <- min(forecasterrors) - mysd*5
mymax <- max(forecasterrors) + mysd*3
mynorm <- rnorm(10000, mean=0, sd=mysd)
mymin2 <- min(mynorm)
mymax2 <- max(mynorm)
if (mymin2 < mymin) { mymin <- mymin2 }
if (mymax2 > mymax) { mymax <- mymax2 }

mybins <- seq(mymin, mymax, mybinsize)
hist(forecasterrors, col="red", freq=FALSE, breaks=mybins)
myhist <- hist(mynorm, plot=FALSE, breaks=mybins)

points(myhist\$mids, myhist\$density, type="l", col="blue", lwd=2)  }
``````
``````plotForecastErrors(rainseriesforecasts2\$residuals)

Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm,  :
missing values and NaN's not allowed if 'na.rm' is FALSE
``````

Could you help me find any revisions for the function?

Thank you very much!

TG

The error message says "missing values and NaN's not allowed if 'na.rm' is FALSE" so do

``````sum(is.na(rainseriesforecasts2\$residuals))
``````

and you will see that you have one NA in the residuals. You need to remove that NA or to set na.rm = TRUE at the appropriate step. You can find which step in the function causes the problem by setting

``````forecasterrors <- rainseriesforecasts2\$residuals
``````

and executing the steps of your function one by one as if they were steps in a script.

I will also note that the line

``````mybins <- seq(mymin, mymax, mybinsize)
``````

is likely to cause other problems. The values of mymin and mymax may have been set by the limits of mynorm but mybinsize was set by the limits of forecasterrors. mybinsize may not fit an integer number of times between mymin and mymax. You might try using the length.out parameter of seq() instead.

Also, please format code by typing three back tics ,```, before and after the code. On a US keyboard, the back tic is just above the TAB key. The code you posted is very difficult to read.

Dear FJCC,

Thank you very much for your great help!
I reformatted the function in the above.
Could you check this again?

Best regards,

TG

As FJCC said, you code is still hard to read, please take a look into this FAQ about code formatting.

Thank you!

TG

Could you check this again?

Insang

The main problem with your code is that the 1st observation in `rainseriesforecasts2\$residuals` is `NA`. So, all of `mysd`, `mymin`, `mymax` are `NA`, as you haven't used `na.rm = TRUE`. This leads to the fact that all elements of `mynorm` are `NaN`, because you're supplying `NA` to `sd` argument of `rnorm`.

If you add `na.rm = TRUE` to the calculations of `mysd`, `mymin` and `mymax`, then you'll avoid this problem. However, I cannot test it, as I don't know what is `mybinsize`. You're using it inside the function, but haven't defined it anywhere in your latest code. In my solution below, I used what was here before edits, without knowing whether it's correct or not.

A working code
``````library(forecast)
#> Registered S3 method overwritten by 'xts':
#>   method     from
#>   as.zoo.xts zoo
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo
#> Registered S3 methods overwritten by 'forecast':
#>   method             from
#>   fitted.fracdiff    fracdiff
#>   residuals.fracdiff fracdiff
rainDataset <- scan(file = "http://robjhyndman.com/tsdldata/hurst/precip1.dat",
skip = 1)
rainTimeSeries <- ts(data = rainDataset,
start = 1813)
rainHoltWinters <- HoltWinters(x = rainTimeSeries,
beta = FALSE,
gamma = FALSE)
rainForecasts <- forecast(object = rainHoltWinters,
h = 8)
plotResiduals <- function(forecastErrors)
{
na.rm = TRUE)
forecastErrorsMin <- min(forecastErrors,
na.rm = TRUE) - forecastErrorsSd * 5
forecastErrorsMax <- max(forecastErrors,
na.rm = TRUE) + forecastErrorsSd * 3
forecastErrorsNorm <- rnorm(n = 10000,
mean = 0,
binMin <- min(forecastErrorsMin, forecastErrorsNorm)
binMax <- max(forecastErrorsMax, forecastErrorsNorm)
binBreaks <- IQR(x = forecastErrors,
na.rm = TRUE) / 4 # obtained from previous edits of OP
bins <- seq(from = binMin,
to = binMax,
by = binBreaks)
hist(x = forecastErrors,
col = "red",
freq = FALSE,
breaks = bins)
with(data = hist(x = forecastErrorsNorm,
plot = FALSE,
breaks = bins),
expr = lines(x = mids,
y = density,
col = "blue",
lwd = 2))
}
plotResiduals(forecastErrors = rainForecasts\$residuals)
`````` Created on 2019-09-16 by the reprex package (v0.3.0)

Hope this helps.

Note

It's not really relevant to this thread, but you shouldn't use `:::`. I know I suggested it to you earlier, but since Nathan already gave you a better solution (and you also chose it), you should use that only.

Hi Yarnabrina,

Thank you very much for your great solution!
I will use this for the plot.
About :::, I did not have time to check the correct method yet.
Using this just for the convenience, I will use the right code for the next time.
Best,

TG

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.