Aesthetics must be either length 1 or the same as the data (151): x, y

I am new to coding, so everything is annotated for my own understanding; sorry if it's not as formally accurate as it should be!
I am trying to run a linear regression of two time points, the one we estimated against the actual one, to see how accurate our estimation protocol is.

Here is my code:

>MolClockMASTERDATA <- read_csv("~/Desktop/Projects/Molecular Clock/DataAnalysis/MolClockALLDATA.csv")

>head(MolClockMASTERDATA)

# A tibble: 6 x 27 (a sample of my data set)
    Run Sample  APD_1  APD_5  APD_10 AAstart AAend GoodAAs Seq_cont HXB2nt_start HXB2nt_end Consensus Fragment
  <int> <chr>   <dbl>  <dbl>   <dbl>   <int> <int>   <int>    <int>        <int>      <int> <chr>        <int>
1     3 pt313… 0.0233 0.0180 0.0166       84   476     392        1         1044       2220 cgggcgag…        1
2     3 pt313… 0.0199 0.0182 0.0168       62   467     405        1          978       2193 tcagtatt…        1
3     3 pt313… 0.0210 0.0176 0.0157      159   238      79        1         1269       1506 tcagtatt…        1
4     3 pt258… 0.0157 0.0120 0.0109      515   937     422        1         2337       3603 gcgtcagt…        1
5     3 pt485… 0.0160 0.0120 0.0120      515   979     464        1         2337       3729 tcagtatt…        1
6     3 pt490… 0.0204 0.0101 0.00780     109   359     250        1         1119       1869 gcgagagc…        1
# ... with 14 more variables: `ActualTOI (Month)` <int>, `ActualTOI (year)` <dbl>, MAE1 <dbl>, Slope1 <int>,
#   Yint1 <dbl>, CalculatedTOI1 <dbl>, MAE5 <dbl>, Slope5 <int>, Yint5 <dbl>, CalculatedTOI5 <dbl>, MAE10 <dbl>,
#   Slope10 <int>, Yint10 <dbl>, CalculatedTOI10 <dbl>


ETI1=molclockMaster$CalculatedTOI1

TI=molclockMaster$ActualTOI..year.

#I can roughly plot my data without using ggplot
plot(TI, ETI1)

#simple scatterplot ETI vs TI for 1% APD 

ggplot2::ggplot(MolClockMASTERDATA, aes (x=TI, y=ETI1)) + geom_point()

Returns: Error: Aesthetics must be either length 1 or the same as the data (151): x, y

Help?

I am confused about why you derive ETI1 and TI from molclockMaster but then use MolClockMASTERDATA as the data source in ggplot(). What if you plot

ggplot2::ggplot(MolClockMASTERDATA, aes (x=TOI..year, y=CalculatedTOI)) + geom_point()

You do not have to store your data in separate vectors to use it in ggplot. Just use the unquoted column names from the data frame that you use as data in ggplot().

When I run the code above, it returns "object not found" for both x and y.
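
The "object not found" result is consistent with the head() output in the question, which lists the columns as `ActualTOI (year)` and CalculatedTOI1 rather than TOI..year and CalculatedTOI. A minimal sketch of the same plot with those names is below. The data frame `toy` and its values are made up for illustration; only the two column names are copied from the question, and the regression line is an optional extra.

```r
library(ggplot2)

# Toy stand-in for MolClockMASTERDATA: the values are invented, only the
# two column names are copied from the head() output in the question.
toy <- data.frame(
  `ActualTOI (year)` = c(0.5, 1.0, 2.5, 4.0),
  CalculatedTOI1     = c(0.7, 1.2, 2.1, 4.4),
  check.names = FALSE  # keep the space and parentheses in the name
)

# Columns are referenced directly inside aes(); the name containing a space
# and parentheses is not a syntactic R name, so it needs backticks.
p <- ggplot(toy, aes(x = `ActualTOI (year)`, y = CalculatedTOI1)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)  # optional fitted line

print(p)
```

With the real data, the same aes() call should work with MolClockMASTERDATA in place of `toy`, assuming the column names are exactly as printed by head().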

@fishsca, if there's any doubt as to whether TOI..year and CalculatedTOI are variables within MolClockMASTERDATA,
a quick console run of

names(MolClockMASTERDATA)

should answer that :)
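
For what it's worth, names() may also explain where the ActualTOI..year. spelling in the question comes from. The sketch below assumes (the thread does not say) that molclockMaster was read with base read.csv(), which by default passes column names through make.names(), while read_csv() keeps them as written in the file. The data frame `toy` is hypothetical.

```r
# Column name as printed by head() for the read_csv() tibble.
tibble_name <- "ActualTOI (year)"

# make.names() is what base read.csv() applies by default (check.names = TRUE):
# characters that are not valid in a name are replaced with dots.
base_name <- make.names(tibble_name)
print(base_name)  # "ActualTOI..year."

# A data frame that keeps the original spelling has no column called
# ActualTOI..year., so `$` returns NULL for it.
toy <- data.frame(x = 1:3)
names(toy) <- tibble_name
print(names(toy))
print(is.null(toy$ActualTOI..year.))  # TRUE: no such column
print(toy$`ActualTOI (year)`)         # backticks reach the real column
```

So the dotted name would only work on an object imported with read.csv(), and the backticked name only on the read_csv() tibble.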

Hi @fishsca: could you run glimpse(MolClockMASTERDATA) and glimpse(molclockMaster) and post the results? That might help clear some questions up.
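
While waiting for that output, a quick way to check the likely cause of the original error is to compare the length of the extracted vector with the number of rows in the data frame given to ggplot(). This is only a sketch: `plot_data` and `other_data` are hypothetical stand-ins, and 140 is an arbitrary size chosen to differ from the 151 rows reported in the error message.

```r
# Hypothetical stand-ins: 'plot_data' plays the role of the data frame passed
# to ggplot(), 'other_data' the object the vectors were pulled from.
plot_data  <- data.frame(id = 1:151)
other_data <- data.frame(CalculatedTOI1 = runif(140))  # 140 is an arbitrary size

ETI1 <- other_data$CalculatedTOI1

# ggplot2 accepts an aesthetic only if it has length 1 or one value per row.
lengths_ok <- length(ETI1) == 1 || length(ETI1) == nrow(plot_data)

cat("rows in plot_data:", nrow(plot_data), "\n")
cat("length of ETI1:   ", length(ETI1), "\n")
cat("compatible:       ", lengths_ok, "\n")
```

If the same comparison on the real objects shows different numbers, that would explain why plot(TI, ETI1) works while the ggplot() call fails.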
