Find and count repeated words in another column

|Column 1|Column 2|Result|
|---|---|---|
|A well spent weekend captured photooftheday foodie food foodporn igers lunch iphonography iphonexsmax|earn|1|
|wallpapers iPhone iPhoneXSMax iPhoneXS iPhoneX iPhoneXR Earth Fantasy v wallpaper for iPhone XS MAXiPhone XR|good|0|
|This sunset was straight out of a commercial #hotonmoment #omentanamorphic #hotoniphone #phonexsmax h.o/JkMHR7Db|like|2|
|Were giving away another unlocked GB iPhoneXSMax this September for free Follow the link to earn your chances to win like|dislike|0|
|wallpapers iPhone iPhoneXSMax iPhoneXS iPhoneX iPhoneXR Digital Crystals wallpaper for iPhone XS MAXiPhone XR|well|1|
|AMAZON US WhitestoneDomeGlass The Best Review For iPhoneXsMax Dome Glass screen protector The glass feels grea|joy|0|
|wallpapers Imagination Planet wallpaper for iPhoneXSMAX iPhoneXR iPhoneXS iPhoneX ALL other iPhone like|sad|0|
|well AMAZON US WhitestoneDomeGlass The Best Review For iPhoneXsMax Dome Glass screen protector The glass feels grea|worry|0|

Can anyone help me get the Result column shown above? Each word in Column 2 may appear anywhere in Column 1. I need to find it and count how many times it appears in the whole of Column 1, across all rows.

You can use purrr::map_int() to do the counting. Take a look at this example (by the way, please share data in a copy/paste-friendly format, as in the example).

df <- data.frame(stringsAsFactors = FALSE,
    column_1 = c("A well spent weekend\n\ncaptured photooftheday foodie food foodporn igers lunch iphonography iphonexsmax",
                           "wallpapers iPhone\niPhoneXSMax iPhoneXS iPhoneX iPhoneXR\n\nEarth Fantasy v wallpaper for\n\n iPhone XS MAXiPhone XR",
                           "This sunset was straight out of a commercial\n\n#hotonmoment #omentanamorphic #hotoniphone #phonexsmax h.o/JkMHR7Db",
                           "Were giving away another unlocked GB iPhoneXSMax this September for free Follow the link to earn your chances to win like",
                           "wallpapers iPhone\niPhoneXSMax iPhoneXS iPhoneX iPhoneXR\n\nDigital Crystals wallpaper for\n\n iPhone XS MAXiPhone XR",
                           "AMAZON US WhitestoneDomeGlass\nThe Best Review For iPhoneXsMax Dome Glass screen protector\nThe glass feels grea",
                           "wallpapers\n\nImagination Planet wallpaper for\n\n iPhoneXSMAX\n iPhoneXR\n iPhoneXS\n iPhoneX\n ALL other iPhone like",
                           "well AMAZON US WhitestoneDomeGlass\nThe Best Review For iPhoneXsMax Dome Glass screen protector\nThe glass feels grea"),
    column_2 = c("earn", "good", "like", "dislike", "well", "joy", "sad", "worry"),
    result = c(1L, 0L, 2L, 0L, 1L, 0L, 0L, 0L)
)
library(dplyr)
library(purrr)
library(stringr)

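# For each word in column_2, count its matches summed over every row of column_1.
# The pattern requires whitespace before the word and allows optional whitespace after it.
df %>% 
    bind_cols(result_2 = map_int(df$column_2, ~sum(str_count(df$column_1, paste0("\\s", ., "\\s?") )))) %>% 
    select(column_2, result, result_2)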
#>   column_2 result result_2
#> 1     earn      1        1
#> 2     good      0        0
#> 3     like      2        2
#> 4  dislike      0        0
#> 5     well      1        1
#> 6      joy      0        0
#> 7      sad      0        0
#> 8    worry      0        0

Created on 2019-05-19 by the reprex package (v0.3.0)
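The pattern in the example above is worth a closer look, because it decides what counts as a "word". Below is a minimal base R sketch of that behaviour, using hypothetical toy strings rather than the data from this thread. The helper `count_matches()` is an illustrative name, and it assumes that `gregexpr(..., perl = TRUE)` treats these simple patterns the same way `stringr::str_count()` does.

```r
# Hypothetical toy strings, chosen to show the edge cases (not the thread's data)
texts <- c("well spent weekend",
           "a well spent weekend",
           "I dislike it",
           "most likely yes",
           "win like")

# Count all matches of a regex across a character vector, base R only
count_matches <- function(pattern, x) {
  hits <- gregexpr(pattern, x, perl = TRUE)
  # gregexpr() reports -1 when there is no match, so count positive positions only
  sum(vapply(hits, function(h) sum(h > 0), integer(1)))
}

# Pattern in the style of the answer: whitespace, the word, optional whitespace
count_matches("\\swell\\s?", texts)  # 1: "well" at the very start of a string is skipped
count_matches("\\slike\\s?", texts)  # 2: "dislike" is skipped, but " like" inside "likely" counts

# Word-boundary alternative, shown only for comparison
count_matches("\\bwell\\b", texts)   # 2: also counts "well" at the start of a string
count_matches("\\blike\\b", texts)   # 1: neither "dislike" nor "likely" counts
```

Which pattern is appropriate depends on the result you expect. With the whitespace pattern, a word at the very beginning of a cell is not counted, which matches the expected result of 1 for "well" in the question.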

Thanks a lot @andresrcs, I will test it and let you know if there are any issues.

Hi @andresrcs, I am not able to display the result. Can you create a new data frame with just two columns, column_2 and result_2? I am executing the code in a BI tool, so I cannot generate the output as it stands. It would be fine if the result were a new data frame containing column_2 and result_2.

You just have to modify the code a little bit. I think this would be a good exercise for learning how to do this kind of thing.

Take a look at this learning resource
https://r4ds.had.co.nz/
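For readers who want to check their attempt at that exercise, here is one possible sketch, not necessarily what the answerer had in mind. It builds a new two-column data frame in base R from hypothetical toy data standing in for `df`. The names `count_word()` and `result_df` are made up for illustration, and the pattern is the same whitespace-based one used in the answer.

```r
# Hypothetical toy data standing in for the thread's `df`
df <- data.frame(
  column_1 = c("Follow the link to earn your chances to win like",
               "ALL other iPhone like",
               "A well spent weekend"),
  column_2 = c("earn", "like", "well"),
  stringsAsFactors = FALSE
)

# Count how often `word` appears across all of `texts`,
# using the same pattern as the answer: whitespace, word, optional whitespace
count_word <- function(word, texts) {
  pattern <- paste0("\\s", word, "\\s?")
  hits <- gregexpr(pattern, texts, perl = TRUE)
  # gregexpr() reports -1 for "no match", so count positive positions only
  sum(vapply(hits, function(h) sum(h > 0), integer(1)))
}

# New data frame holding only the word and its count
result_df <- data.frame(
  column_2 = df$column_2,
  result_2 = vapply(df$column_2, count_word, integer(1),
                    texts = df$column_1, USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

result_df  # result_2 is 1, 2, 1 for earn, like, well
```

With the packages already loaded in the answer, ending the pipeline with `select(column_2, result_2)` and assigning the result to a name should give an equivalent data frame.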
