Connect AWS MySql Database to a Shiny app, in a network with proxy.

Kaladin · September 5, 2019, 7:07pm

...Hi! I'm trying to connect an AWS RDS MySQL to my instance of Shiny. But I'm having trouble to connect to the instance from inside the organization i'm working. I'm wonder if someone can explain to me how to connect to an Aws RDS when you 've a proxy between.
Thanks.
Joaquin

hugo-pa · September 12, 2019, 12:54pm

Hi Kaladin,

[Edit] The following instructions assume a particular use case:

The "proxy" you are referring to between your machine and the AWS RDS instance is a EC2 gateway server;
A local, development instance of the database does not exist within your organization;
You are working in a development environment that requires access to an RDS database;
You only need to create an SSH tunnel for development/testing purposes.

In order to establish a connection to your RDS database from within your organization's network, you'll need to configure SSH port forwarding on your local machine. In essence, this will cause your SSH client to listen for connections on a configured port, and forward connections received on that port to an SSH server (what you have referred to as the proxy). The server then connects to a specified destination port on a different machine than the SSH server (in this case, the RDS server).

Thankfully, this can all be accomplished from R with the help of the following three packages:

RMariaDB, for establishing a database connection to your MySQL instance;
ssh, for configuring SSH port forwarding;
callr, for running SSH port forwarding in a background R process (more on this below).

You will also need the following information:

The database connection information, including:
- The database name;
- The username and password;
- The host address (e.g. likely of the form your-database.cluster-xxxxxxxxxxxx.aws-region.rds.amazonaws.com);
- The port that MySQL is listening on, if different from the default (i.e. 3306);
The SSH gateway (proxy) server connection information, including:
- The gateway server's address (e.g. likely of the form ec2-11-222-33-444.aws-region.compute.amazonaws.com);
- The username of the gateway server's user;
- The key file generated by Amazon when the gateway server was launched (e.g. your-gateway.pem).

With the above information, you can establish a connection to your RDS instance in R as follows:

library(RMariaDB)
library(shiny)

ui <- fluidPage()

server <- function(input, output, session) {
  
  # Database connection information
  db_name <- "my_db"
  db_user <- "my_db_user"
  db_password <- "my_db_password"
  db_host <- "your-database.cluster-xxxxxxxxxxxx.aws-region.rds.amazonaws.com"
  db_port <- 3306
  
  # SSH gateway information
  ssh_server <- "ec2-11-222-33-444.aws-region.compute.amazonaws.com"
  ssh_user <- "ec2-user"
  ssh_key <- "my-gateway.pem"
  
  # Create an SSH tunnel in a background R process.
  tunnel_process <- callr::r_bg(
    function(ssh_host, ssh_user, key_file, mysql_host, mysql_port) {
      
      # Create an ssh session
      session <- ssh::ssh_connect(
        glue::glue("{ ssh_user }@{ ssh_host }"),
        keyfile = key_file
      )
      
      # Open a port on your machine (will use the MySQL port) to tunnel all
      # traffic to a custom target host (the MySQL instance) via the SSH server
      ssh::ssh_tunnel(
        session,
        mysql_port,
        glue::glue("{ mysql_host }:{ mysql_port }")
      )
      
    },
    args = list(ssh_server, ssh_user, ssh_key, db_host, db_port),
    stdout = nullfile(),
    stderr = nullfile()
  )
  
  # Connect to your database as normal
  conn <- dbConnect(
    MariaDB(),
    dbname = db_name,
    username = db_user,
    password = db_password,
    host = db_host,
    port = db_port
  )
  
  # Use the connection (`conn`) in your app...
  
}

shinyApp(ui, server)

^{Created on 2019-09-12 by the reprex package (v0.3.0)}

Note that I have specified all of the connection information within the Shiny app's server function simply to demonstrate how to establish a connection for your use case. In practice, you will want to secure your credentials; see this article for more information. [Edit] Furthermore, I have included the SSH tunnel code within the Shiny server function simply for the sake of brevity; in practice, you would want to separate this functionality from your Shiny app, and only call it when absolutely necessary from within your development environment.

The reason that the tunnel must be created in a background process is that it is a blocking function; that is, the R process that calls the function will not be able to perform any other actions until the tunnel is closed. By creating the tunnel in a background R process, the R process running the Shiny app can continue to do work.

Note also that the tunnel created by ssh::ssh_tunnel() will be automatically closed when the client disconnects; thus, when being used for a database connection in a Shiny app, the tunnel will be closed when the database connection is closed (e.g. via dbDisconnect()). Therefore, a new tunnel will need to be created every time a new database connection is created (or, alternatively, you could use the pool package to create a database connection pool).

I hope this helps! Let me know if you have any further questions and/or issues.

cole · September 12, 2019, 1:29pm

Thanks for the interesting question! Can you describe a bit more what you mean by "my instance of Shiny" and what "proxy" is between the RDS instance and your instance of Shiny? Are you talking about Shiny Server?

@hugo-pa's advice will definitely work in the right situation, although it is a bit advanced from a general R user's perspective. I think a bit more information will help us ensure that the SSH tunnel approach is right for your particular case. Going the SSH tunnel approach is not always recommended, and in some cases, may be forbidden or frowned upon (depending on the organization).

Further, it is worth noting that practices around SSH, SSH tunnels, and SSH port-forwarding will vary by organization. As a result, contacting your IT department can often be a good step in figuring out their best practices regarding database connectivity.

hugo-pa · September 12, 2019, 1:48pm

@cole raises some very good points, and I have updated my original answer with the assumptions I made to provide context (sections marked with [Edit]).

Kaladin · October 3, 2019, 1:52pm

Hello. I've been very busy with work so I haven’t been able to work on this project this week. Here's the situation: I’m working on the thesis of my information systems engineering degree. I’m doing a Dashboard for the pension system of the province in which I live (for free, because it’s part of an educational project), but my partner and I can’t afford the license of shiny server (and the institution wants access control so we’ve decided to use the open source server ShinyProxy). We need to connect the server of the institution (where we’ve been programming an ETL to make the shinyapp more efficient) to our database on RDS Amazon – as you can imagine, the institution has a proxy managed by the government of the province. We’ve already approved the Host of the database, but the moment we want to connect, we need to specify the proxy’s credentials.

system · November 26, 2019, 5:52pm

This topic was automatically closed 54 days after the last reply. New replies are no longer allowed.