Error: Aesthetics must be either length 1 or the same as the data (15537): fill

I think this is my reprex? Not sure if I did that correctly? I'm getting the error from the title after I try to run the houseplot command. I'm not sure why I'm getting the error for the "'womenincongress.csv' does not exist in current working directory " because it is in my wd and it doesn't display that error when I run that line. It only tells me I need to do the column specification but otherwise there's no error.

library(tidyverse)
all_states <- map_data("state")
congress<-read_csv("womenincongress.csv")
#> Error: 'womenincongress.csv' does not exist in current working directory ('C:/Users/clair/AppData/Local/Temp/RtmpqSTZ3O/reprex-36046fbb58b6-bared-pika').
names(congress)[2] <- "region"
#> Error in names(congress)[2] <- "region": object 'congress' not found
stateData <- left_join(all_states,congress,by="region")
#> Error in is.data.frame(y): object 'congress' not found
repProp <- congress$representatives/(congress$total)
#> Error in eval(expr, envir, enclos): object 'congress' not found
housePlot <- ggplot()+geom_polygon(data=stateData,aes(x=long, y=lat, group = group, fill=repProp),color="grey50")+coord_map()+labs(x="",y="",title="Women in the House")+theme_classic()+ theme(axis.ticks.y = element_blank(),axis.text.y = element_blank(), axis.ticks.x = element_blank(),axis.text.x = element_blank())
#> Error in fortify(data): object 'stateData' not found
housePlot
#> Error in eval(expr, envir, enclos): object 'housePlot' not found

Created on 2021-11-29 by the reprex package (v2.0.1)

I'm not sure how to include the congress data, but here is an image of it and a copy of the output:
dput(congress[1:10, ])
structure(list(...1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), state = c("alabama",
"arizona", "arkansas", "california", "colorado", "connecticut",
"delaware", "district of columbia", "florida", "georgia"), senators = c(0,
0, 1, 2, 0, 0, 0, 0, 0, 0), representatives = c(2, 3, 0, 19,
1, 2, 0, 1, 7, 0), total = c(7, 9, 4, 53, 7, 5, 1, 1, 27, 14)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))

Hi there.

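The error with womenincongress.csv occurs because reprex runs your code in a fresh R session whose working directory is a temporary folder (the Temp path shown in the error message), not your usual wd. The file is not there, so read_csv() cannot find it, even though the same line works in your normal session.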

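But let's get to the problem: ggplot() works best on a tidy dataset. The idea is that you have one dataset in a data.frame where each column is a variable and each row is an observation. Your data already fulfills this requirement. However, when you calculate new parameters such as repProp, you don't save them as a new column in your dataset but as a new vector in your environment. That vector has one value per row of congress, while stateData has 15537 rows (one per polygon point), so ggplot() cannot line up fill=repProp with the data. That is exactly what the error message says.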
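To see the length mismatch in isolation, here is a minimal sketch with toy data. The shapes data frame and its coordinates are made up for illustration and only stand in for all_states; the representatives and total values are the first two rows of the dput() output above.

```r
library(dplyr)

# Hypothetical stand-in for all_states: several polygon rows per region
shapes <- data.frame(
  region = c("alabama", "alabama", "alabama", "arizona", "arizona"),
  long   = c(-87.5, -87.6, -87.4, -114.6, -114.5),
  lat    = c(30.4, 30.3, 30.2, 35.0, 34.9)
)
congress <- data.frame(
  region          = c("alabama", "arizona"),
  representatives = c(2, 3),
  total           = c(7, 9)
)

joined <- left_join(shapes, congress, by = "region")

# A free-standing vector has one value per state ...
repProp <- congress$representatives / congress$total
length(repProp)  # 2
nrow(joined)     # 5, so the vector cannot be matched to the rows

# ... while a column created with mutate() has one value per row
joined <- mutate(joined, repProp = representatives / total)
length(joined$repProp) == nrow(joined)  # TRUE
```

Because the column lives inside the joined data frame, every polygon row carries the value of its own state.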

library(tidyverse)
all_states <- map_data("state")

# replace this with the actual data from your .csv
congress <- data.frame(row_id = c(1,2,3),
                       state = c("alabama", "arizona", "arkansas"),
                       senators = c(0,0,1),
                       representatives = c(2,3,0),
                       total = c(7,9,4)
)

names(congress)[2] <- "region"

stateData <- left_join(all_states,
                       congress,
                       by="region") %>% # this is a pipe:
                                        # it takes what's on its left and passes it to the
                                        # function on its right (here: mutate()) as the first argument
  # Add a new column based on the old columns of the joined dataset
  mutate(repProp = representatives/total)

# And now your plot works!
housePlot <- 
ggplot()+
  geom_polygon(data=stateData,
               aes(x=long, 
                   y=lat, 
                   group = group, 
                   fill=repProp),
               color="grey50")+
  coord_map()+
  labs(x="",y="",title="Women in the House")+
  theme_classic()+ 
  theme(axis.ticks.y = element_blank(),
        axis.text.y = element_blank(), 
        axis.ticks.x = element_blank(),
        axis.text.x = element_blank())
housePlot

However, this is not how you would normally set up a ggplot(). It's better to define the aesthetics shared by all layers in the first call to ggplot(), like this:

housePlot <- ggplot(data=stateData,
                   aes(x=long, 
                       y=lat, 
                       group = group, 
                       fill=repProp))+
 geom_polygon(color="grey50")+
 coord_map()+
 labs(x="",y="",title="Women in the House")+
 theme_classic()+ 
 theme(axis.ticks.y = element_blank(),
       axis.text.y = element_blank(), 
       axis.ticks.x = element_blank(),
       axis.text.x = element_blank())
housePlot

This makes it easy to add further layers to your plot that can then inherit the aes() defined at the start.
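As a small illustration of that inheritance, here is a sketch on toy data. The two squares and their repProp values are invented for the example, and geom_point() is just one possible extra layer, not something from the original plot.

```r
library(ggplot2)

# Hypothetical toy data: two unit squares standing in for two states
shapes <- data.frame(
  long    = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat     = c(0, 0, 1, 1,  0, 0, 1, 1),
  group   = rep(c(1, 2), each = 4),
  repProp = rep(c(0.2, 0.6), each = 4)
)

p <- ggplot(shapes, aes(x = long, y = lat, group = group, fill = repProp)) +
  geom_polygon(color = "grey50") +
  # no aes() here: x, y and fill are inherited from ggplot()
  geom_point(shape = 21, size = 3)
p
```

The points are filled with the same colour scale as the polygons without repeating any of the mappings.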

Hope this solves your problem!
Best,
Valentin

I think you are missing "congress$" in front of the repProp definition. repProp should be a column in the congress df. Create it before the left_join() so that it is carried over into stateData.

congress$repProp <- congress$representatives/(congress$total)
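A minimal sketch of that ordering, assuming toy data: shapes and its coordinates are made up and only stand in for all_states.

```r
library(dplyr)

# Hypothetical stand-in for all_states
shapes <- data.frame(
  region = c("alabama", "alabama", "arizona"),
  long   = c(-87.5, -87.6, -114.6),
  lat    = c(30.4, 30.3, 35.0)
)
congress <- data.frame(
  region          = c("alabama", "arizona"),
  representatives = c(2, 3),
  total           = c(7, 9)
)

# 1. Add the column to congress first ...
congress$repProp <- congress$representatives / congress$total

# 2. ... then join, so every polygon row picks up its state's value
stateData <- left_join(shapes, congress, by = "region")
"repProp" %in% names(stateData)  # TRUE
```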

Thank you so much for your help! My only question is for the congress part, you said to replace the data from my actual .csv. I'm trying to show all 50 states so is there a simpler way to show that rather than listing out every value from each column? Sorry, this feels like an obvious question but I'm new to R so I don't fully know how everything works yet.

Never hesitate to ask the "obvious" questions. I meant it the other way around: I only added that data.frame so that I had a working example, since I don't have your .csv. You should use your own data from the .csv (just keep your original import code if it is working).

The only thing that did not work was what @vkatti and I pointed out, so just use one of the suggested fixes from your original code and you are good to go. Best
