Pre-teaching R. What's your best argument for switching to R from Excel?

That plot would show a different picture just by using different ranges for the scales.

The better option (imo) would be to use a common index (e.g. at May 2015) or else use facets.
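To make the common-index idea concrete, here is a minimal sketch in plain Python (the arithmetic is the same in R or Excel). The `rebase` helper and the numbers are invented for illustration; they are not the data behind the plot being discussed.

```python
def rebase(values, base_index=0, base_value=100.0):
    """Scale a series so that values[base_index] becomes base_value."""
    anchor = values[base_index]
    if anchor == 0:
        raise ValueError("cannot rebase on a zero value")
    return [base_value * v / anchor for v in values]


# Hypothetical monthly figures, made up for this example only.
rigs = [200, 195, 190, 180]
jobs = [5000, 4000, 3000, 2400]

# Both series now start at 100, so they can share a single y-axis.
rigs_idx = rebase(rigs)
jobs_idx = rebase(jobs)

# Change since the base period, in percent of the base value.
rigs_change = rigs_idx[-1] - 100.0
jobs_change = jobs_idx[-1] - 100.0
```

With these made-up numbers the rig series ends at 90 and the jobs series at 48, so the relative drops can be compared directly without anyone having to pick two axis ranges.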

Yes, and the plot actually obscures the fact that employment dropped by more than 50% while the number of rigs only went down by about 10%, so the "divergence" happened earlier.

That said, I actually support ggplot2 getting independent secondary axis support, but primarily because the person creating the graph isn't always the person deciding the content of the graph. If someone has to fall back to Excel to make their boss happy, they'll be less likely to stick with R. Fortunately, the option for a linear-transformation secondary axis means that you can fake it if necessary (though not necessarily conveniently).
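The "fake it" trick boils down to a pair of linear maps: one squeezes the second series into the primary axis range for drawing, and its inverse turns primary-axis positions back into labels for the secondary axis. A rough sketch of that arithmetic in plain Python follows; `linear_map` and the ranges are hypothetical, and this is not ggplot2's own implementation.

```python
def linear_map(src_min, src_max, dst_min, dst_max):
    """Return (forward, inverse) functions mapping [src_min, src_max] onto [dst_min, dst_max]."""
    if src_max == src_min or dst_max == dst_min:
        raise ValueError("both ranges must have non-zero width")
    slope = (dst_max - dst_min) / (src_max - src_min)
    intercept = dst_min - slope * src_min

    def forward(x):
        # Secondary-series value -> position on the primary axis.
        return slope * x + intercept

    def inverse(y):
        # Position on the primary axis -> label for the secondary axis.
        return (y - intercept) / slope

    return forward, inverse


# Hypothetical ranges: primary axis spans 180..200, second series spans 2400..5000.
to_primary, to_secondary = linear_map(2400, 5000, 180, 200)

# Draw the second series at these positions on the primary axis...
plotted = [to_primary(v) for v in [5000, 4000, 3000, 2400]]
# ...and label the secondary axis with the inverse of the primary breaks.
secondary_labels = [to_secondary(b) for b in [180, 190, 200]]
```

The choice of source and destination ranges is exactly the arbitrary decision criticized earlier in the thread: change the ranges and the two lines cross somewhere else.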

If one absolutely must (and all arguments to the contrary have been exhausted), there are methods to get independent secondary axes using ggplot2, which would be preferable to not using ggplot2 (or even R) at all.

The code can easily be found via the usual search engines. It would feel like stepping onto a host's carpet with muddy shoes to post the links here directly.

I think adding more visual helpers for data transformation could really help. The CSV import wizard in RStudio is brill, and I use it to create my R code for text importing quite a bit even though I'm happy coding. With more helpers like that, I think the barrier to entry for Excel users would be much lower.

I think the key thing is to show compassion and respect for Excel users' skills and knowledge. It can be very easy to slip into the mindset that R is better than Excel and only big dummies continue to use GUIs for data analysis. Rather than go at it that way, it's often better to appeal to people's curiosity and desire for personal growth. So rather than telling them that their tool of choice sucks, you're asking them to stretch their minds a bit to learn something new. This helps turn the awkwardness of learning to program from something irritating into an interesting challenge.

I wrote up my thoughts about this last year, and since then I thought of a useful analogy for Excel users. Excel users are more like musicians than they are like programmers. When people get really good with Excel they do their analysis in concert with the program; in other words, part of the work is done by Excel and part of it is done by the analyst. As people get better they develop mental and physical skills that help them interact with the tool fluidly. There's a joy in using these skills which is similar to the joy that people feel when they play piano or video games. When you ask someone to move to R, you're asking them to give up skills which they worked hard to develop and enjoy using. If you approach this as an argument, you're probably not going to win. A better way, in my experience, is to present R as something which can augment and enrich their work.

I think it is difficult to convince people to switch to R from Excel for "one-off" jobs. When something needs to be done quickly and is unlikely to be repeated, Excel is an awesome tool. Excel is also good for "one-off" programming tasks, especially when the logic is inseparable from the data. A good example is the "financial models" produced by your accounting team. There are certain templates, but more often they just start from scratch every quarter. There's just too much variation in the data (and the data is too messy) to write a script and execute it over a bunch of Excel sheets. For a financial analyst, Excel is a coding environment, a data store and a presentation tool (did you know that there are people who paint in Excel?).

The wake-up call comes when you have to repeat things. I had a colleague who was convinced to start using R when he had a dataset that, when stored in Excel, occupied over 30 MB on his disk and contained coordinates that he needed to "spatially join" to a shapefile from the corporate GIS system (point-in-polygon). I helped him build a script and he never even attempted to open Excel again (ArcGIS was an option, but thank goodness it is horribly slow). We do certain things in Spotfire/PowerBI, but the limitations of an interactive tool show up once you join and aggregate more than 3-4 tables (especially combined with automatic text cleaning), and that kind of work is impossible to do (and especially to re-do) in Excel.
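For readers who haven't met a point-in-polygon join before, the core test is small enough to sketch. The version below is a plain-Python ray-casting check with made-up regions and points; it is not the script from the story above, and a real job would normally lean on a spatial library rather than hand-rolled geometry.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) is inside polygon, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def spatial_join(points, polygons):
    """Pair each (id, x, y) point with the first polygon name containing it, else None."""
    joined = []
    for point_id, x, y in points:
        match = None
        for name, vertices in polygons.items():
            if point_in_polygon(x, y, vertices):
                match = name
                break
        joined.append((point_id, match))
    return joined


# Hypothetical regions and points, invented for this example.
regions = {
    "north": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "south": [(0, -4), (4, -4), (4, 0), (0, 0)],
}
wells = [("a", 1, 1), ("b", 2, -3), ("c", 9, 9)]

result = spatial_join(wells, regions)
```

Points that fall exactly on a boundary are not handled consistently by this simple test, which is one more reason to prefer a proper GIS package once the data is real.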

This is why I switched too. When Excel failed, I switched to Python, then to R.