How does the 'weights' argument work in logistic regression (glm)?

Hi, I'm pretty new to R so apologies in advance if this is a basic question.

I'm really puzzled by the weights argument in glm. For example, in the code below my dependent variable PCL_Sum2 is binary and highly imbalanced: there are far more observations equal to 0 than observations equal to 1. I would like both levels to be equally weighted. I would appreciate some pointers on how I could accomplish this.

Final_Frame.df <- read.csv("no_subset.csv")
Omitted_Nas.df<-na.omit(Final_Frame.df)

###This yields 278 observations with no missing data

prelim_model<-glm(PCL_Sum2~Mean_social_combined +
  Mean_traditional_time+
  Mean_Passive_Use_Updated+
  factor(Gender)+
  factor(Ethnicity)+
  factor(Age)+
  factor(Location)+
  factor(Income)+
  factor(Education)+
  factor(Working_Home)+
  Perceived_Fin_Risk+
  Anxiety_diagnosed+
  Depression_diagnosed+
  Lived_alone+
  Mean_Active_Use_Updated, data=Omitted_Nas.df, weights=???, family = binomial())

summary(prelim_model)

I've tried setting weights = 0.5, 0.5 but I always get the following error:

Error in model.frame.default(formula = PCL_Sum2 ~ Mean_social_combined + : variable lengths differ (found for '(weights)')

I think you need to provide a weight for each observation (i.e. a vector of weights of the same length as your data).

The weights for PCL_Sum2 == 0 observations should be 1/number_of_zeroes
The weights for PCL_Sum2 == 1 observations should be 1/number_of_ones

That way the weights within each class sum to 1, so the two classes carry equal total weight in the fit.
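
In case it helps, here is a minimal sketch of that idea on made-up data. The data frame `toy`, the predictor `x`, and the names `n_zero`, `n_one` and `w` are all hypothetical stand-ins, not taken from your dataset; only `PCL_Sum2` matches your column name. The key point is that `weights` needs one value per row, which is why passing just two numbers gives the "variable lengths differ" error.

```r
# Toy data standing in for Omitted_Nas.df (values are made up):
# 8 observations with PCL_Sum2 == 0 and 2 with PCL_Sum2 == 1.
set.seed(1)
toy <- data.frame(
  PCL_Sum2 = c(rep(0, 8), rep(1, 2)),
  x        = rnorm(10)
)

# Count the observations in each class.
n_zero <- sum(toy$PCL_Sum2 == 0)
n_one  <- sum(toy$PCL_Sum2 == 1)

# Build one weight per row: 1/n_one for the ones, 1/n_zero for the zeroes.
toy$w <- ifelse(toy$PCL_Sum2 == 1, 1 / n_one, 1 / n_zero)

# Check: the weights within each class should add up to 1.
tapply(toy$w, toy$PCL_Sum2, sum)

# Pass the per-row weights to glm via the column name.
# Note: with non-integer weights, family = binomial() emits the warning
# "non-integer #successes in a binomial glm!"; the model is still fitted.
fit <- glm(PCL_Sum2 ~ x, data = toy, weights = w, family = binomial())
```

With your own data the same pattern would apply to Omitted_Nas.df: add the weight column there and pass it as `weights`. If the non-integer warning is a nuisance, `family = quasibinomial()` is one commonly used alternative that gives the same coefficient estimates without it.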

Thank you! That works.
