Planning a Shiny app for processing very large data files: Experience?

cyrus · January 6, 2019, 11:01am

Dear board,
We are planning to create a Shiny app for an R data processing pipeline that currently consists of a set of self-written functions for processing spectroscopic data. Since these functions are quite troublesome for new users, we were thinking of developing a Shiny app so that the people who own the data can process it themselves. However, the data files to be processed are very large, ranging from 10-30 MB per file to sometimes 1-2 GB per file, in batches of 5-10 files up to sometimes hundreds of files.
Does anyone have experience with such a situation?
Thanks a lot
Cyrus

jdlong · January 6, 2019, 11:47am

What, in particular, concerns you about the workflow?

The performance of the functions seems to be a known issue. I presume you are not looking for input on how to make the functions faster. That's a deeper technical discussion.

Are you concerned the app will time out? Would you like to show a progress bar in the app?
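For the progress bar question, here is a minimal sketch of one way it could look, assuming a `fileInput("files", multiple = TRUE)` and an `actionButton("run", ...)` in the UI. `process_all()` and the `file.size()` placeholder are hypothetical stand-ins for the real processing functions; `withProgress()` and `incProgress()` are Shiny's built-in progress helpers.

```r
# Hypothetical helper: applies process_one() to each file and reports progress
# through a callback, so it can be run and tested without Shiny.
process_all <- function(paths, process_one, on_step = function(i, n) NULL) {
  n <- length(paths)
  results <- vector("list", n)
  for (i in seq_along(paths)) {
    results[[i]] <- process_one(paths[[i]])
    on_step(i, n)  # tell the caller that file i of n is done
  }
  results
}

# In a Shiny server function, the callback can drive the built-in progress bar.
server <- function(input, output, session) {
  shiny::observeEvent(input$run, {
    shiny::withProgress(message = "Processing files", value = 0, {
      process_all(
        input$files$datapath,                    # temp paths of the uploaded files
        process_one = function(p) file.size(p),  # placeholder for the real processing
        on_step = function(i, n) shiny::incProgress(1 / n, detail = paste(i, "of", n))
      )
    })
  })
}
```

Keeping the loop in a plain function means the same code can be checked from the R console before any Shiny code is involved.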

cyrus · January 6, 2019, 1:46pm

There are three time-consuming steps in the workflow. The first is the data conversion from the proprietary vendor file format to open mzML or mzXML. This is done under MS Windows with a program called msconvert, without R involvement. Then, in R, come the feature detection from the spectroscopic data and the alignment of the features for feature list generation.
At the moment, the converted mzML files are read in by the R functions, and spectral features are detected and aligned. I would imagine that the "upload" into a Shiny app would take some additional time, is that correct? Ideally, one could also inspect and navigate through the data/spectra in a Shiny plot before and after processing, scaling, normalization etc., which is possible with the R functions but requires a lot of knowledge of how these functions work.
Sorry for my simplistic view and ignorance, but we are at the very first steps of conceptualization and don't have an R expert at hand. Before we embark on such a project, we need to understand the concept and structure.
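On the upload question: by default Shiny rejects uploads larger than 5 MB, so files of this size would need the limit raised through the `shiny.maxRequestSize` option. If the app runs on the same machine (or file server) that already holds the converted files, a possible alternative is to skip the upload and let users pick from a server-side directory. The sketch below shows both; `list_mzml()` is a hypothetical helper, not part of Shiny.

```r
# Raise Shiny's upload limit from the default 5 MB to 2 GB (value is in bytes).
options(shiny.maxRequestSize = 2 * 1024^3)

# Hypothetical alternative that avoids uploading at all: list the mzML/mzXML
# files that already sit in a directory the R process can read.
list_mzml <- function(dir) {
  list.files(dir, pattern = "\\.mzX?ML$", full.names = TRUE, ignore.case = TRUE)
}
```

The returned paths could then feed a `selectInput()` so that users choose files by name instead of transferring gigabytes through the browser.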

jdlong · January 9, 2019, 9:03pm

This may seem like a weird question, but since your analysis isn't done in R and it sounds like you don't have much of an R user base internally, why are you wanting to use Shiny at all?

cyrus · January 10, 2019, 9:16pm

The data processing (and possibly also the analysis, but I am more after the processing) is indeed done in R. We want Shiny because its apps seem very intuitive and we want to spare users the pain of dealing with the functions directly.

jdlong · January 11, 2019, 3:17am

Do you have an R workflow that does basically the things you want to do in Shiny?

When building Shiny apps, my (limited) experience is that I start by defining all my inputs at the top of an R script. Then I build the script to do what I want the analysis to do, including making graphs or whatever. After I have it all working in a normal R script and the performance is reasonable, I then (and only then) wrap my R logic in the Shiny bits.

I am of the opinion that many folks make Shiny development MUCH harder than it needs to be because they start with a shiny app template and start filling in the R bits instead of starting with a working R script and then wrapping the Shiny around that.
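A minimal sketch of that script-first approach, with made-up data and a hypothetical `detect_features()` standing in for the real pipeline functions. The Shiny part is guarded so the script still runs where the shiny package is not installed.

```r
# Step 1: plain R script. Inputs are defined at the top, as they would later
# come from Shiny widgets. All names and values here are placeholders.
input_values <- c(1.2, 5.8, 0.4, 9.1, 3.3)  # stands in for data read from a file
intensity_threshold <- 3                    # stands in for a slider input

# Step 2: the analysis lives in an ordinary function that knows nothing about Shiny.
detect_features <- function(values, threshold) {
  values[values >= threshold]
}

features <- detect_features(input_values, intensity_threshold)

# Step 3: only once the script works, wrap the same function in Shiny.
if (requireNamespace("shiny", quietly = TRUE)) {
  ui <- shiny::fluidPage(
    shiny::sliderInput("threshold", "Intensity threshold", min = 0, max = 10, value = 3),
    shiny::verbatimTextOutput("features")
  )
  server <- function(input, output, session) {
    output$features <- shiny::renderPrint({
      detect_features(input_values, input$threshold)
    })
  }
  app <- shiny::shinyApp(ui, server)  # start it with shiny::runApp(app)
}
```

The only change between step 1 and step 3 is where the threshold comes from; the processing function itself is untouched.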
