Get the results of RStudio's `Find in Files` into a `data.frame`

rstudio

#1

As we know, Find in Files searches every file in a directory for a given string or pattern.
The shortcut is CMD + SHIFT + F. Is there a function or package that extracts the search results into a data.frame, so that we can analyze them further?


#2

To clarify the nomenclature: you mean Find in Files in RStudio, as opposed to a function in R, right? I'm pretty sure this is implemented in the Java code for the IDE and not in R, but I could be wrong.


#3

What is the goal of what you're trying to do?

I'm assuming you are using Find in Files to locate a string or pattern in the files from the project you're working in?
E.g.:

~/path/to/project/R/file
  39: blah blah pattern blah blah
  68: mep mep mep mep pattern mep mep
~/path/to/project/inst/other_file
  5: blah pattern blah

This is essentially a recursive grep on the project directory, which is something that can be done in R. There might be an easier or more efficient way of getting to the goal, though. To what end do you want to get the RStudio Find in Files results into a data.frame?
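As a rough illustration of the recursive grep idea in base R, here is a minimal sketch. The function name `find_in_files` and its arguments are made up for this example (they are not part of RStudio or any package), and it assumes the directory contains only readable text files.

```r
# Hypothetical helper: search every file under `dir` for `pattern` and
# return one row per matching line (file path, line number, line text).
find_in_files <- function(pattern, dir = ".", file_pattern = NULL, fixed = FALSE) {
  files <- list.files(dir, pattern = file_pattern,
                      recursive = TRUE, full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, fixed = fixed)
    if (length(idx) == 0L) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx],
               stringsAsFactors = FALSE)
  })
  hits <- Filter(Negate(is.null), hits)
  if (length(hits) == 0L) {
    # No matches anywhere: return an empty data.frame with the same columns.
    return(data.frame(file = character(), line = integer(),
                      text = character(), stringsAsFactors = FALSE))
  }
  do.call(rbind, hits)
}

# Usage (path is a placeholder):
# hits <- find_in_files("pattern", "~/path/to/project")
```

The result is an ordinary data.frame, so it can be filtered or summarised like any other, for example with `table(hits$file)` to count matches per file.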


#4

You can use my sifr package!

devtools::install_github("s-fleck/sifr")
x <- sifr::sif("blah")
as.data.frame(x)  # x is already a data.frame, but with an S3 subclass and print method

The package is still experimental and will stay so for a while. The "experimental" part is the search-and-replace feature, though, not the search feature.


#5

@hoelk Hi, I think sifr is what I want. That's cool, but is there an option to set the encoding? Some of my files have comments in Chinese, and when I run this function on them I get garbled output.

@grosscol Yes, that is what I want. I think sifr already solves my problem.


#6

Hmm, reinstall from GitHub:

  • I added an option to set the input encoding (it's just passed on to readLines(); see the help for that base function).
  • If you see weird character strings like \033[33mreturn\033[39m, that's not encoding-related. Those are for the colour print output produced with the crayon package.
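For reference, this is roughly how the `encoding` argument of base `readLines()` behaves. It is only a sketch of the base function, not of sifr itself (the name of sifr's own option is not shown here). Note that `encoding` merely marks the strings as being in that encoding; it does not convert them.

```r
# Write a small UTF-8 file containing a Chinese comment.
path <- tempfile(fileext = ".R")
con <- file(path, open = "wb")
writeLines(enc2utf8("x <- 1  # \u4e2d\u6587\u6ce8\u91ca"), con, useBytes = TRUE)
close(con)

# Read it back, declaring the input as UTF-8 so the strings are marked as such.
lines <- readLines(path, encoding = "UTF-8", warn = FALSE)
Encoding(lines)
```

For files in some other encoding, re-encoding on read via `file(path, encoding = ...)` is the usual base R route.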

as.data.frame() should now automatically remove this colour information, but it's pretty slow.
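If you ever need to strip those colour codes by hand, a base R regex is enough. This is a minimal sketch, not what the package does internally; `strip_ansi_colour` is a made-up name, and it only handles the simple `ESC[...m` colour/style sequences shown above.

```r
# Hypothetical helper: remove ANSI colour/style sequences such as
# "\033[33m" and "\033[39m" from a character vector.
strip_ansi_colour <- function(x) {
  gsub("\033\\[[0-9;]*m", "", x)
}

strip_ansi_colour("\033[33mreturn\033[39m(x)")
# [1] "return(x)"
```

crayon also ships `strip_style()` for the same purpose.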

The package should probably be rewritten to save the match indices for each row instead of the colour-formatted string, but I was going for quick and dirty when I wrote it. I won't be able to put any work into the package soon, sorry... :frowning:

But on the plus side, the package is really simple. You should be able to modify it yourself if you need some special functionality :slight_smile:
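To give an idea of what storing match indices could look like, here is a small base R sketch using `gregexpr()`. The variable and column names are invented for illustration and this is not code from the package.

```r
lines <- c("blah blah pattern blah", "no match here", "pattern twice pattern")

# gregexpr() returns, per input line, the start positions of all matches
# (or -1 if there is none) plus their lengths in the "match.length" attribute.
m <- gregexpr("pattern", lines, fixed = TRUE)

per_line <- lapply(seq_along(m), function(i) {
  start <- as.integer(m[[i]])
  if (start[1] == -1L) return(NULL)
  data.frame(row = i,
             start = start,
             end = start + attr(m[[i]], "match.length") - 1L)
})
matches <- do.call(rbind, Filter(Negate(is.null), per_line))
matches
```

With positions stored like this, colouring can be applied only at print time and the raw text stays clean.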