How to add back existing columns to model after building

The issue is not necessarily memory; it could also be due to a 32-bit installation of R, which supports only up to 3 GB of address space. If your OS supports it, install the 64-bit version of R.

I called activity toy data only because it seemed designed solely to illuminate the issue of manipulating data frames. Because it has four categorical variables and one numeric variable, it's not obvious how a logistic model would be constructed, since each of the categorical variables has only unique values. The Big Book of R has many examples and explanations, but finding the appropriate ones requires careful framing of the question.

The goal is to model a dependent variable Y as a function of X_1 … X_n, where Y is continuous and every X_i is categorical.

For example, with Supply_hrs in the role of Y and the remaining variables as X_1 … X_4, it becomes immediately clear that a logistic model is inappropriate:

> fit <- glm(Supply_hrs ~ ., data = activity, family = "binomial")
Error in eval(family$initialize) : y values must be 0 <= y <= 1
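For contrast, here is a minimal sketch of what `family = "binomial"` expects: a 0/1 (or two-level factor) response. The data below are invented purely for illustration.

```r
# toy data with a genuinely binary outcome (invented for illustration)
toy <- data.frame(
  churned = c(0, 0, 1, 1, 0, 1),          # binary response
  usage   = c(3, 30, 4, 20, 25, 2)        # numeric predictor
)

# this glm call succeeds because the response is 0/1
fit_ok <- glm(churned ~ usage, data = toy, family = "binomial")

# predicted probabilities, each between 0 and 1
predict(fit_ok, type = "response")
```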

By contrast, an OLS model will run, although the data characteristics limit its usefulness (all values are unique):

suppressPackageStartupMessages({
  library(dplyr)
})
activity <- data.frame(
  Customer_Name = c("Jane", "Bill", "Fred", "Tina", "Joe"),
  Account_No = c("332", "432", "556", "884", "119"),
  supply_line = c("York", "shark", "Aba", "kwara", "Bethel"),
  Cons = c("0-2300", "2300-4003", "4003-1121", "1121-3022", "3022-1713"),
  Supply_hrs = c(9, 5, 8, 10, 1)
)

fit <- lm(Supply_hrs ~ ., data = activity)
summary(fit)
#> 
#> Call:
#> lm(formula = Supply_hrs ~ ., data = activity)
#> 
#> Residuals:
#> ALL 5 residuals are 0: no residual degrees of freedom!
#> 
#> Coefficients: (12 not defined because of singularities)
#>                   Estimate Std. Error t value Pr(>|t|)
#> (Intercept)              5         NA      NA       NA
#> Customer_NameFred        3         NA      NA       NA
#> Customer_NameJane        4         NA      NA       NA
#> Customer_NameJoe        -4         NA      NA       NA
#> Customer_NameTina        5         NA      NA       NA
#> Account_No332           NA         NA      NA       NA
#> Account_No432           NA         NA      NA       NA
#> Account_No556           NA         NA      NA       NA
#> Account_No884           NA         NA      NA       NA
#> supply_lineBethel       NA         NA      NA       NA
#> supply_linekwara        NA         NA      NA       NA
#> supply_lineshark        NA         NA      NA       NA
#> supply_lineYork         NA         NA      NA       NA
#> Cons1121-3022           NA         NA      NA       NA
#> Cons2300-4003           NA         NA      NA       NA
#> Cons3022-1713           NA         NA      NA       NA
#> Cons4003-1121           NA         NA      NA       NA
#> 
#> Residual standard error: NaN on 0 degrees of freedom
#> Multiple R-squared:      1,  Adjusted R-squared:    NaN 
#> F-statistic:   NaN on 4 and 0 DF,  p-value: NA

Created on 2020-11-14 by the reprex package (v0.3.0.9001)

"Adding back" means rejoining customer name and account number to the model. With your example, how do I join the fitted model's output back to customer name and account number? This issue was brought about by the size of memory.
"Toy data that doesn't seem fit for a logistic regression" — OK. Can you refer me to a good book with numerous examples of building machine learning models in R and deploying them with Shiny?

Here is about the simplest example I can give of fitting a model on only a few variables (i.e., a reduced-column training set) while still getting the predictions, along with all the original data, onto a final dataset.

Conceptually, do you have a very different workflow from this?

library(tidyverse)

# just for example train on odd rows of iris which has 150 rows
(train0 <- iris[as.logical(1:150 %% 2), ] %>%
  as_tibble(rownames = "rownum"))
# omit rownum, Sepal.Length and Petal.Length from train

(train1 <- select(
  train0,
  -rownum, -Sepal.Length, -Petal.Length
))

# fit a model on train
my_model <- lm(Petal.Width ~ ., data = train1)
summary(my_model)

# can predict on the full data which has all the variables...

iris2 <- iris %>% as_tibble()
iris2$pred_petal_width <- predict(my_model, newdata = iris2)
iris2

Shiny deployment I can't help you with. As it is typically used, it's data science in the same way that PowerPoint is rhetoric: p-hacking for the masses. It's great as an EDA tool for users who know what they are about.

A model is a description of a population, not of an observation. When we say that a patient has a 0.02 probability of dying of COVID-19 exposure, that does not mean that an observed patient is 0.02 dead. The patient is either dead or not, 0 or 1. Only if the patient is a random observation from the population the model describes can we say anything useful about their status.

What you may need to be looking at is classification methods. Given an observation of meter readings and estimates of distribution line capacity, what is the likely status of a meter? For that, see the Irizarry text.
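To make the classification framing concrete, here is a sketch on invented meter data; the column names (readings, line_capacity, Status) and the data-generating rule are hypothetical, not your dataset.

```r
# invented meter data for illustration only
set.seed(42)
meters <- data.frame(
  readings      = rnorm(100, mean = 50, sd = 10),
  line_capacity = runif(100, 20, 80)
)
# simulate a binary status that depends on the predictors
meters$Status <- factor(
  ifelse(meters$readings + rnorm(100, sd = 5) > meters$line_capacity,
         "active", "inactive")
)

# logistic regression as a classifier
fit <- glm(Status ~ readings + line_capacity,
           data = meters, family = binomial)

# classify by thresholding the predicted probability at 0.5
pred_class <- ifelse(predict(fit, type = "response") > 0.5,
                     levels(meters$Status)[2], levels(meters$Status)[1])
table(pred_class, meters$Status)
```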

I was unable to fit a glm model on Status ~ [anything] with these data. I don't think I will be able to assist further.

Thank you, technocrat, you have been most helpful. The dataset I shared here is not a true representation of the original dataset, for privacy reasons, but I expected that anyone could simply look at it and see my pain points. I use the 64-bit version of R.

After much trial and error, this is what I am running:

# caret is needed for createDataPartition() and dplyr for glimpse()
library(caret)
library(dplyr)

# remove unwanted columns (the first seven)
model_input_df <- ml[, -(1:7)]
glimpse(model_input_df)

# preliminary casting to the appropriate data types
model_input_df$Status     <- as.factor(model_input_df$Status)
model_input_df$Feeder     <- as.character(model_input_df$Feeder)
model_input_df$group_cons <- as.factor(model_input_df$group_cons)
#...........................................................................
# BUILDING THE MACHINE LEARNING MODEL / partitioning the data
set.seed(2017)  # the seed must be set before the random partition
intrain  <- createDataPartition(model_input_df$Status, p = 0.75, list = FALSE)
training <- model_input_df[intrain, ]
testing  <- model_input_df[-intrain, ]
#...........................................................................
# confirm the splitting is correct
dim(training); dim(testing)

LogModel <- glm(Status ~ ., data = training, family = binomial, maxit = 100)
print(summary(LogModel))
#...........................................................................
colnames(model_input_df)
# note: this overwrites the fitted model with a plain numeric vector
LogModel <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
# binding them together using the rbind function of base R
final_df <- rbind(ml[, -(1:7)], "pred_values" = LogModel)
head(final_df)

I get these warning messages:

Warning messages:
1: In `[<-.factor`(`*tmp*`, ri, value = 6) :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, ri, value = 7) :
  invalid factor level, NA generated
3: In `[<-.factor`(`*tmp*`, ri, value = 8) :
  invalid factor level, NA generated
4: In `[<-.factor`(`*tmp*`, ri, value = 9) :
  invalid factor level, NA generated
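These warnings come from rbind(): it tries to append the numeric vector 1:9 as a new row of the data frame, and each factor column rejects values that are not existing levels, generating NAs. A sketch of the intended workflow, assuming `ml` and the fitted glm object `LogModel` from the code above (and without overwriting `LogModel` with a vector):

```r
# predict on the full data, then attach the predictions as a new
# COLUMN with cbind, not a new row with rbind
pred_probs <- predict(LogModel, newdata = ml, type = "response")

# binding to the full `ml` keeps customer name, account number, and
# every other original column alongside the predictions
final_df <- cbind(ml, pred_values = pred_probs)
head(final_df)
```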