Keeping rows based on their name

varlist <- c("num_parts","parent_theme")
legoraw_cl <- subset(legoraw, select=varlist)

I only want summary statistics for some of the parent themes, so I want to keep only the rows for those themes.

I don't know which function to use. I tried this one, but it didn't work:

legoraw_cl <- filter("Harry Potter","Staw Wars","Ninjago","Technic","Superheroes")

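Your filter() call fails because it is given only strings: filter() needs the data frame as its first argument, followed by a logical condition on a column. Below is an example using tidyverse functions and the LEGO data set loaded from the URL in the code. The target_themes are pasted into one string separated by the "|" character, which str_detect() treats as a regular-expression "or", so filter() keeps every row whose theme_name contains any of the target names. The select() statement then returns the columns of interest.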

library(tidyverse)
lego_url <- 'https://raw.githubusercontent.com/SmilodonCub/DATA605/master/lego_sets.csv'
lego_df <- read.csv( lego_url )

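# Themes to keep; str_detect() matches these as substrings of theme_name
target_themes <- c("Harry Potter","Star Wars","NINJAGO","Technic","Super Heroes")
# Collapse into a single regex pattern: "Harry Potter|Star Wars|..."
target_themes <- paste(target_themes, collapse = '|')

df <- lego_df %>%
  filter(str_detect(theme_name, target_themes)) %>%
  select(theme_name, piece_count)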

count(df, theme_name, name = 'total_sets')
#>                  theme_name total_sets
#> 1   DC Comics™ Super Heroes        148
#> 2       Marvel Super Heroes        414
#> 3                  NINJAGO®        263
#> 4                Star Wars™       1377
#> 5                   Technic        505
#> 6 THE LEGO® NINJAGO® MOVIE™        796

Created on 2022-10-11 with reprex v2.0.2.9000
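
Since the question asks for summary statistics, here is a minimal sketch of how the filtered rows could be summarised per theme. It uses %in% for exact matching instead of a regex, which is an alternative to the str_detect() approach above, not a replacement for it. The toy_df data frame, the keep_themes vector, and the summary column names are hypothetical and only stand in for the real data. Note that in the real data set the theme names carry trademark symbols (for example "Star Wars™" in the output above), so exact matching would need those exact strings.

```r
library(dplyr)

# Hypothetical toy data standing in for the real LEGO table
toy_df <- data.frame(
  theme_name  = c("Technic", "Technic", "City", "Star Wars", "Star Wars", "Friends"),
  piece_count = c(100, 300, 50, 200, 400, 80)
)

# Themes to keep; with %in% the names must match the data exactly
keep_themes <- c("Technic", "Star Wars")

theme_summary <- toy_df %>%
  filter(theme_name %in% keep_themes) %>%   # keep only the wanted themes
  group_by(theme_name) %>%                  # one summary row per theme
  summarise(
    total_sets    = n(),
    mean_pieces   = mean(piece_count),
    median_pieces = median(piece_count),
    .groups = "drop"
  )

theme_summary
```

Use %in% when you know the exact names in the column, and str_detect() when the names may have extra text around them, as they do in this data set.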
