Recoding Values from Same Row into one of two groups.

Thank you, much appreciated. I'll try this in a bit.

Just trying to highlight the differences in turnout based on restricted and non-restricted states


So Row 53's headings differ from the first 52 and don't line up, which I noticed is what seems to be causing the error.


barplot(Fixed.Presidential.2020.State.Voting.Data$TotalVotes ~ Fixed.Presidential.2020.State.Voting.Data$Area)

As you can see, the 306 area column has become its own entity on the bar graph, throwing the data off.

Since only the first 51 rows contain state-level data, you can make a new data frame.

StateData <- Fixed.Presidential.2020.State.Voting.Data[1:51, ]

You may have to convert all of the columns that contain numeric data using parse_number(), as I showed before. The presence of those headings in row 53 probably forced everything to be characters. You can then make your plots using the StateData data frame.
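As a rough sketch of those two steps together: the toy data frame below is only a stand-in for the imported sheet (its values and the stray last row are made up for illustration), and `num_cols` is a name I am introducing here. The idea is to subset first, then apply parse_number() from the readr package one column at a time.

```r
library(readr)  # provides parse_number()

# Toy stand-in for the imported sheet (all values are made up):
# every column was read as character because of a stray row at the bottom.
raw <- data.frame(
  Area       = c("StateA", "StateB", "StateC", "Some other heading"),
  TotalVotes = c("1,200,000", "350,000", "3,400,000", "Votes"),
  Restrict   = c("1", "2", "2", "Code"),
  stringsAsFactors = FALSE
)

# Keep only the state-level rows (1:51 in the real data, 1:3 in this toy example).
StateData <- raw[1:3, ]

# parse_number() takes a character vector, not a data frame,
# so apply it to each numeric-looking column separately.
num_cols <- c("TotalVotes", "Restrict")
StateData[num_cols] <- lapply(StateData[num_cols], parse_number)

# Check that the columns are now numeric.
str(StateData)
```

With the real data you would replace `raw` with Fixed.Presidential.2020.State.Voting.Data, use 1:51, and list whichever columns actually hold numbers.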

Gotcha, will try that now. Thank you


t.test(StateData$TotalVotes ~ StateData$Area, alternative="greater")

Do I need to recode again to ensure states' total votes are categorized as "1 - restrictive or 2 - nonrestrictive"?

  1. Running parse_number() on an entire data frame will not work. That is what you did with:
StateData <- parse_number(StateData)

You should run it on a particular column:

StateData$TotalVotes <- parse_number(StateData$TotalVotes)
  2. You are using the t.test() incorrectly. You are running StateData$TotalVotes against StateData$Area. The Area column probably has 51 levels, the 50 states plus D.C., unless it is different from what I have seen in your images. As the error message says, the grouping factor should have two levels. You ask about states categorized as restrictive and nonrestrictive but I do not see how your t.test could tell which states are which; it is just looking at the two columns TotalVotes and Area. It might make sense to run the t test on TotalVotes and the column that labels states as restrictive and nonrestrictive, but I have strong doubts about that.
    What is the question you are trying to answer? Is it whether more absolute votes were cast in states with restrictive or nonrestrictive policies? That will be very strongly affected by the populations of the states. Your result might well be effectively the same as asking whether the populations of restrictive and nonrestrictive states are different, which seems like an odd question. And I doubt that those populations are distributed anything like normally.
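To illustrate the two-level requirement, here is a minimal sketch on made-up data. The `Restrict` column with codes 1 and 2 is an assumption based on the "1 - restrictive or 2 - nonrestrictive" coding mentioned above, and the vote counts are random numbers, so the output says nothing about real states.

```r
set.seed(1)

# Made-up data with the same shape as StateData: 51 rows,
# random vote counts, and an assumed 1/2 policy code.
StateData <- data.frame(
  Area       = paste0("State", 1:51),
  TotalVotes = round(runif(51, min = 2e5, max = 1e7)),
  Restrict   = rep(c(1, 2), length.out = 51)
)

# Recode the 1/2 codes as a labelled factor with exactly two levels.
StateData$Restrict <- factor(StateData$Restrict,
                             levels = c(1, 2),
                             labels = c("restrictive", "nonrestrictive"))
nlevels(StateData$Restrict)

# Grouping by Area fails because it has 51 distinct values, not 2.
area_try <- try(t.test(TotalVotes ~ Area, data = StateData), silent = TRUE)
inherits(area_try, "try-error")

# Grouping by the two-level factor runs (Welch's t test is the default).
res <- t.test(TotalVotes ~ Restrict, data = StateData)
res
```

This only shows that the call is well formed. It does not address the concern above that absolute vote totals largely track state population.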

I ended up doing t.test(StateData$TotalVotes ~ StateData$Restrict), which gave me the figures I think I need (df, t value, p-value). I also installed the "lsr" package and ran a Cohen's d test to determine the magnitude of the difference between states with and without restrictive policies.
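For anyone following along, this is roughly what that effect-size step looks like. The data below are random placeholders, and the by-hand calculation uses the usual pooled-standard-deviation form of Cohen's d; I believe that matches the default method of lsr::cohensD(), but treat that as an assumption and check the package documentation.

```r
set.seed(2)

# Made-up data: random vote counts and a two-level policy factor.
StateData <- data.frame(
  TotalVotes = round(runif(51, min = 2e5, max = 1e7)),
  Restrict   = factor(rep(c("restrictive", "nonrestrictive"), length.out = 51))
)

# Split the outcome by group.
groups <- split(StateData$TotalVotes, StateData$Restrict)
x <- groups[[1]]
y <- groups[[2]]

# Pooled standard deviation across the two groups.
pooled_sd <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
                    (length(x) + length(y) - 2))

# Cohen's d: absolute difference in means, in pooled-SD units.
d_manual <- abs(mean(x) - mean(y)) / pooled_sd
d_manual

# If lsr is installed, compute the same quantity with its helper for comparison.
if (requireNamespace("lsr", quietly = TRUE)) {
  d_lsr <- lsr::cohensD(TotalVotes ~ Restrict, data = StateData)
  print(d_lsr)
}
```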
