Hi. I'm new to R and am looking for help. I want to create a scatterplot using two variables (danceability and popularity). However, when I run the code, the plot includes all of the variables in the CSV, even though I have defined the x and y variables. Below is the code:

#Section 1 - 1.1

df <- read.csv("spotify_dataset.csv")

print(df)

#Section 1 - 1.2

set.seed(1008444239)

sample(1:112,1)

#The number I received from the sample is 49

unique(df$track_genre)

my_regression_df <- df[df$track_genre == "hardcore",]

#Section 1 - 1.3

sample(1:112,2)

#The two numbers that I received from the sample are 22 and 47

unique(df$track_genre)

my_group_diff_df <- df[df$track_genre == "dancehall" | df$track_genre == "happy",]

#Section 2 - 2.1

my_regression_df <- read.csv("spotify_dataset.csv")

install.packages("ggplot2") # only needs to be run once per machine

library(ggplot2)

plot1 <- ggplot(data=df, aes(x=popularity, y=danceability)) + geom_point()

plot1

plot1_x <- ggplot(data=my_regression_df, aes(x=popularity, y=danceability)) + geom_point() +
  xlab("popularity index")

plot1_x

plot1_y <- ggplot(data=my_regression_df, aes(x=popularity, y=danceability)) + geom_point() +
  xlab("popularity index") + ylab("danceability index")

plot1_y

plot1_title <- ggplot(data=my_regression_df, aes(x=popularity, y=danceability)) + geom_point() +
  xlab("popularity index") + ylab("danceability index") + ggtitle("popularity vs. danceability")

plot1_title
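
In case it helps to reproduce the problem without the CSV, here is a small self-contained sketch of the same pattern. The data frame `toy_df` and its values are made up for illustration and are not taken from spotify_dataset.csv; the names `toy_regression_df`, `n_filtered`, and `n_after_overwrite` are likewise hypothetical. It shows that reassigning a name that held a filtered subset to the full data discards the filter, which may be what happens at the start of Section 2.1, where `my_regression_df` is read from the CSV again.

```r
# Toy stand-in for spotify_dataset.csv (values are made up for illustration)
toy_df <- data.frame(
  track_genre  = c("hardcore", "hardcore", "dancehall", "happy", "happy"),
  popularity   = c(40, 55, 62, 30, 71),
  danceability = c(0.45, 0.52, 0.80, 0.66, 0.71)
)

# Filter to a single genre, as in Section 1.2
toy_regression_df <- toy_df[toy_df$track_genre == "hardcore", ]
n_filtered <- nrow(toy_regression_df)

# Keep a copy of the filtered rows for plotting
toy_filtered <- toy_regression_df

# Reassigning the same name to the full data throws the filter away,
# similar to calling read.csv() again in Section 2.1
toy_regression_df <- toy_df
n_after_overwrite <- nrow(toy_regression_df)

print(n_filtered)
print(n_after_overwrite)

# Plot only the filtered rows (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  toy_plot <- ggplot(data = toy_filtered, aes(x = popularity, y = danceability)) +
    geom_point()
  print(toy_plot)
}
```

Comparing the two row counts is a quick way to check whether the data frame passed to `ggplot()` is still the filtered one.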