Updating + Retrieving database in online Shiny app

Hello there,

First post here, so hopefully it will be a great start!

So I have a personal project that I've been trying to accomplish using Shiny, which I've been learning for the past month or so.

My end goal is to replicate the Trudeaumetre for my country, but I'm in the stage where I have 1084 government proposals (after text-mining) and I need to collaborate with hopefully dozens of people (possibly simultaneous) in order to individually check if each has been done/in process/discarded.

I have a .csv file with 1084 rows and 6 columns. I'd like to make a temporary website/app where people could filter columns to select rows with empty fields, fill them in, submit the entry to the current database, update it and make that row unavailable for other users as the 1st stage is to get the whole table filled in and only then to validate answers.

I've done most of this locally apart from updating the database part and then deploy it.

My question is: Given that my Shiny app would be published online, when the user updates the table in question, how could I then retrieve this new table after all of this user input? And can multiple users work on the same shiny app simultaneously? These questions make me unsure on how to deploy this [i.e. which platform] or even if Shiny is the right way to do this.

Hi @creativedoctor, welcome to community.rstudio.com , I think this article in the Shiny site may be of help: http://shiny.rstudio.com/articles/persistent-data-storage.html

For your specific case, I would use a database that's available online, like Amazon Redshift, to save and retrieve the data, and maybe publish the app to shinyapps.io so you don't have to worry about setting up your own Shiny server.

@creativedoctor
I have a few questions before weighing in on whether Shiny is appropriate:

  • Do you plan to have people login before they can make changes?
  • Are you worried about one user ovewriting what someone else already entered?
  • Have you ever used any relational databases before, like MySQL, etc.?
  • What is your monthly budget for your app?

Please don't take @edgararuiz advice of using Amazon Redshift. I know he means well with his suggestion, but it's absolutely the wrong tool for this job, not to mention expensive. You'll want to use a database that's lightweight, can handle lots of concurrent connections and simultaneous writes, and is easy to setup and cheap. If you want to use an Amazon product, the RDS servers are a good way to go and there's a free tier got can try out.

https://aws.amazon.com/rds/free/

@raybuhr
Thank you for your help.

First in reply to your questions:

  • No login necessary
  • Table would be pre-filtered so it will only show rows where column X and Y will not be filled-in. Users ideally would only be able to fill-in the cell of that specific row "attributed" to them regarding those specific columns. If that wouldn't be possible, then yes, I'd be worried about one user ovewriting what someone else had already entered, since people of diverging political opinions may disagree on wether some measure was fulfilled or not. Validation stage would ideally occur, though posteriorly so I could personally check those labeled as different things by different users.
  • Used MySQL: No. Though I did online courses in both Python and R on how to get them and "transform" them into regular data frames
  • Budget: Ideally free. I'd have to think twice before paying for something. But if I'm going to put a price to it I'd say over 15$/€ would be too much already in my opinion. This is a "hobbie" and I'm in my infancy as far as my career goes, so as close to 0 as possible would be ideal.

I'm hoping for 50 users or so to fill in the data, though realistically this number will probably be closer to 5-10...

People have suggested Python-related tools for this job (as I know a bit [Datacamp's track on Data Science with Python]), though learning Django seems quite time-consuming - which I don't have much at the moment - and jam-py, from demos, seems to just show an interactive table and create rows, not exactly edit them and then make them completely disappear for end user so they're only left with empty rows.

@edgararuiz thanks for the link. I'll experiment with possibly keeping a file in my dropbox/google sheets and trying to update that with the shiny app. Though not too sure if the HTML contents would possess my account details [i.e. exposed to people inspecting the code] for the shiny app to work.

@creativedoctor - If free is the preference, then I agree with @raybuhr, don't listen to me :slight_smile:

Regarding masking your account details, this will ultimately depend on the R package you use to connect to the document service, but in general term, you should be able to use a package like config to save off your credentials in a yaml file, and then read the values as variables inside the code in R. This will prevent your code from ever being in your script: https://github.com/rstudio/config

Given your answers, I think Shiny would be perfectly acceptable, though probably overkill. I think you might be confused about what you actually want. For example, you say no login necessary, but you also want to filter rows to only what is appropriate for that user? How would your application know to filter to only those rows if you didn't make the user tell you who they are? You could use a simple drop down and honor system (i.e. people are truthful of who they say they are), but then you would have to question the results you obtained.

My recommendation for your use case would to be to use Google Sheets for these reasons:

  1. Free
  2. Built in user authentication
  3. Multiple users can edit at the same time
  4. You can easily lock rows users can edit
  5. Complete version control (you can see each edit made to the spreadsheet and by which user)

If you just want to use Shiny for the project to better learn R & Shiny, then go for it. It's a good use for Shiny app, but will require quite a bit of work on your end to getting something approximate to what you'd get from Google Sheets. If you go with R & Shiny, my recommendation is:

  1. Use SQLite as a database (it's free, easy to use from R, lots of getting started tutorials)
  2. Use the Shiny gallery example to have users login to be able to submit their updates to the database. Use the SQLite database to store their logins and passwords (make sure to keep that out of version control though)

Either way, good luck!

My application would filter to all users so it would only show rows with empty cells in X and Y column. Filtering is therefore user-independent.

Thanks for your comments. I did not know Google Sheets had complete version control nor the ability to lock rows [and I assume, specific cells too]. I'll further into that since although I'd like to learn even more R and Shiny, I prefere to do this quicker rather than take the long way. Appreciated.