How to auto-generate anchor links (file path plus div id) from a text search function?

library(quanteda)
library(tidyverse)
library(htmltools)

I have a tokenized text document in which each sentence is wrapped in a 'div' tag with an 'id':

text <- c(
  '<div id="4">But how do you do?</div>',
  '<div id="5">I see I have frightened you—sit... ”</div>',
  '<div id="6">It was in July, 1805, and the speaker..</div>',
  '<div id="7">With these words she greeted Prince Vasíli Kurágin...</div>',
  '<div id="8">Anna Pávlovna had had a cough for some days...</div>',
  '<div id="9">She was, as she said, suffering from la grippe....</div>',
  '<div id="10">Petersburg, used only by the elite.</div>',
  '<div id="11">All her invitations without exception, written in French...</div>',
  '<div id="12">“If you have nothing better to do, Count (or Prince).. </div>',
  '<div id="13">“Heavens!</div>',
  '<div id="14">what a virulent attack!”</div>',
  # ... (remaining sentences omitted) ...
  '<div id="2107">It was plain that this “well?”</div>'
)

To finish this up, I need to auto-generate output in the following form:

<a href="C:\Users\John\Desktop\final_tokens.html#div number"> text- sentence </a>

For example, when I search for the word 'good', the result should be:

<a href="C:\Users\John\Desktop\final_tokens.html#49"> Our good and wonderful sovereign has to </a>
<a href="C:\Users\John\Desktop\final_tokens.html#73">He is one of the  the good ones.</a>
<a href="C:\Users\John\Desktop\final_tokens.html#138">She is rich and of good family..</a>

The div id number should go right after the # as shown above.

Previously I used:

make_sentences <- function(word) {
  grep(word, text, value = TRUE)
}

The grep call above worked fine on plain text, but I need to modify it (probably with a fair amount of regex) so that it also produces the anchor link with the file path and the div number. Is there a way to do this?
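
One possible approach, sketched in base R under a few assumptions: `text` is a character vector with exactly one `<div id="N">...</div>` per element, and the search word should be matched literally against the sentence rather than the tags. The names `make_links` and `base_path` are hypothetical and used only for illustration.

```r
# Hypothetical helper: build <a href="path#id">sentence</a> links for every
# sentence that contains `word`. Assumes one <div id="N">...</div> per element.
make_links <- function(word, text,
                       base_path = "C:\\Users\\John\\Desktop\\final_tokens.html") {
  pattern <- '^[[:space:]]*<div id="([0-9]+)">(.*)</div>[[:space:]]*$'

  # Keep only the elements that really look like a div line
  divs <- text[grepl(pattern, text)]

  # Capture group 1 is the id, capture group 2 is the sentence
  ids       <- sub(pattern, "\\1", divs)
  sentences <- trimws(sub(pattern, "\\2", divs))

  # Search the sentence only, so "div" or "id" in the markup never match.
  # fixed = TRUE treats `word` as a literal substring, not a regex.
  hits <- grepl(word, sentences, fixed = TRUE)

  sprintf('<a href="%s#%s">%s</a>', base_path, ids[hits], sentences[hits])
}

# Usage with two lines taken from the examples above
sample_text <- c(
  '<div id="13">“Heavens!</div>',
  '<div id="138">She is rich and of good family..</div>'
)
links <- make_links("good", sample_text)
print(links)
```

Because the match is a literal substring, searching for 'good' would also hit 'goodness'; if whole words are needed, the `grepl` call could be swapped for a regex with word boundaries. The resulting vector can be written to a file with `writeLines(links, ...)`.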
