BI and Data Science: Matching Approaches to Applications

This is a companion discussion topic for the original entry at https://blog.rstudio.com/2021/03/18/bi-and-data-science-the-tradeoffs



In the previous posts in our series on Data Science and Business Intelligence, we first discussed how data science can either complement or augment self-service BI tools to deliver more combined value. We then explored the strengths and challenges of the two approaches, both of which aim to help an organization get more insight from its data and make better decisions.

In this post, we’ll provide insights from organizations who have used both types of tools and give some guidance about which you should use when. We’ll also set the stage for future blog posts where we will explore specific integration points for BI and Data Science tools.

Don’t Get Trapped into a False Choice

In our prior post, we explored the strengths and challenges of both BI tools and open source data science. We won’t repeat those arguments here. Instead, we’ll hear from users who seem to understand that both approaches have their place.

BI tools are often an easier place for an organization to start when approaching an analytic problem. They provide a lower barrier to entry for the typical business user, who may not be comfortable coding in R or Python. The built-in features make it easy to visualize, explore, and analyze data using a point-and-click approach and then to share that analysis with others.

For example, this user prefers Power BI for creating quick and easy visualizations, but switches to R and Shiny for their highly interactive user interfaces.

“Power BI is an easy to build visualization tool widely used in our organization to make data accessible to non-data people. This is a really great tool when we want to create a dashboard for trends and track some metrics. But it becomes very difficult when we want to enable high levels of user interactivity with the dashboard. That’s where R Shiny helped us to build intuitive and highly interactive user interfaces.”

– A marketer at a large telecommunications firm

Meanwhile, this biotech team lead views Spotfire and Tableau as fine products so long as you are satisfied with their built-in capabilities, but sees RStudio as more flexible over the longer term.

“RStudio is code based, so in the beginning tools like Spotfire and Tableau have [their] advantages since many things are already built in, but in terms of flexibility RStudio will win over the longer term.”

– A team lead in a biotech company

The individuals below describe how they apply this flexibility and power from two different industry perspectives. The first is from a financial industry leader.

“Most of the work the data scientists did used the R language. They did a great job satisfying management’s constant barrage of questions because iterative analysis is so easy with tools like R, and the powerful visualization tools made communication of results easy for sales people to grasp. As the CEO, I was gratified at how clear the presentations were and at how quickly presenters answered my difficult questions, in some cases on the fly during the presentations.

As an R user myself, I know its code-based workflow lends itself to rapid iteration while, at the same time, documenting the process used. It was easy to unroll the tape to see every step that led to any conclusion.”

– Art Steinmetz, former Chairman and CEO of Oppenheimer Funds

The second individual describes how he uses R in the beverages industry:

“The R ecosystem has vast power to quickly solve problems. With R, I can incorporate nearly any AI/ML model into a dashboard or Shiny app, without being reliant on proprietary data science tools. Executives can be confident I am using the best analytic approach for a given problem, and I can rapidly apply new approaches as they become available.”

– Paul Ditterline, Director of Data Science at Heaven Hill Brands

While these accounts are only anecdotal, they do show awareness of both approaches to data analysis and shed some light on why companies opt for each solution. They illustrate that BI tools may struggle as questions get more complex, requiring greater analytic depth to answer and more customization in how the analysis is done and presented. Users will encounter a relatively low ceiling on the complexity of questions they can answer.

On the other hand, code-friendly data science tools represent a relatively high barrier to entry. To get the most out of the tools, those who create the analyses need some understanding of coding in R or Python, and familiarity with applying and interpreting advanced analytic methods. However, the flexibility and analytic breadth of code-friendly data science combine to provide a very high ceiling for answering difficult, valuable questions for an organization.

This just leaves open the question, “How should I select my approach?”

Match Your Data Science Approach to Application Needs

We expect firms to continue struggling with this tradeoff between BI tools and open source data science for years to come. As we argued in our first post on the topic, this isn’t about choosing between the two approaches, but about how to exploit the strengths of each while mitigating their challenges.

In the table below, the Use When You… column augments the table we presented last week. While this guide won’t be correct for every case, it at least provides a starting point for those times a data science leader needs a quick answer for an urgent project.

Self-service BI Tools

Strengths:
  • Explore and visualize data without coding skills
  • Share analyses and interactive dashboards
  • Do self-service reporting and scheduling
  • Support data-driven organizations

Challenges:
  • Are difficult to adapt and inspect
  • Are limited by their black box nature
  • Struggle with enriched or wide data
  • Create uncertain conclusions
  • Include limited data science and machine learning capabilities
  • Require skills that aren't easily transferred

Use When You...
  • Must support analysis and sharing with people without coding skills
  • Want to produce descriptive analytics and general reporting
  • Know that your use is covered by your BI tool's feature set

Open Source Data Science

Strengths:
  • Provide a wide range of open source capabilities
  • Unlock the benefits of code
  • Allow fully customizable data products
  • Have broad interoperability
  • Create transferable skills and analyses
  • Tap a wider pool of potential talent

Challenges:
  • Necessitate coding in R or Python
  • May require package and environment management
  • Provide limited native deployment capabilities
  • Don't include enterprise security, scalability and cloud features

Use When You...
  • Need flexibility to tackle novel problems
  • Expect the analysis to be reused and need it to be reproducible without the code creator
  • Need to solve harder questions, which require data science and ML on complex data
  • Must support complex decision-making with deep interactivity

Table 1: Guidelines for when you should apply BI Tools or open source data science.
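
For readers who prefer code to tables, the Use When You... guidance can be read as a simple checklist. The Python sketch below is only an illustration: the signal names, the `suggest_approach` function, and the rule that mixed needs point toward combining both approaches are assumptions made for this example, not part of any BI tool or RStudio product.

```python
# Hypothetical checklist derived from the "Use When You..." guidance in Table 1.
# The signal names and the decision rule are assumptions made for this sketch.
BI_SIGNALS = {
    "non_coding_audience",        # must support people without coding skills
    "descriptive_reporting",      # descriptive analytics and general reporting
    "covered_by_bi_features",     # use case fits the BI tool's feature set
}
DS_SIGNALS = {
    "novel_problem",              # need flexibility to tackle novel problems
    "reuse_and_reproducibility",  # analysis will be reused and must be reproducible
    "ml_on_complex_data",         # harder questions needing data science and ML
    "deep_interactivity",         # complex decision-making with deep interactivity
}


def suggest_approach(needs):
    """Map a collection of project needs to a rough starting suggestion."""
    needs = set(needs)
    unknown = needs - BI_SIGNALS - DS_SIGNALS
    if unknown:
        raise ValueError(f"Unrecognized needs: {sorted(unknown)}")
    wants_bi = bool(needs & BI_SIGNALS)
    wants_ds = bool(needs & DS_SIGNALS)
    if wants_bi and wants_ds:
        # Mixed needs: use data science to augment and complement BI.
        return "combine both"
    if wants_ds:
        return "open source data science"
    if wants_bi:
        return "self-service BI"
    return "undecided"


if __name__ == "__main__":
    print(suggest_approach({"descriptive_reporting", "non_coding_audience"}))
    print(suggest_approach({"ml_on_complex_data", "non_coding_audience"}))
```

Like the table itself, a checklist of this kind is a quick guide rather than a rule, and a real project will usually weigh these needs with more nuance than a set lookup can capture.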

Summary

RStudio is dedicated to the proposition that code-friendly data science is uniquely powerful, and that everyone can learn to code. We support this through our education efforts, our Community site, and our open source projects such as the tidyverse, which make R easier to use. Our software is already used by millions of people to analyze data every day.

However, code-friendly data science does present a higher barrier to entry compared to BI tools, which are very valuable for the wider community of analysts and business users in an organization. Because of this, it is critical to leverage both, and use data science to augment and complement your BI tools.

In our next posts, we will explore specific points of integration between these tools. We’re happy to help you explore these topics, so if you’d like to learn more about how RStudio products can help augment and complement your BI approaches, you can set up a meeting with our Customer Success team.

