Wrangling Unruly Data: The Bane of Every Data Science Team

This is a companion discussion topic for the original entry at https://www.rstudio.com/blog/wrangling-unruly-data


There’s an old saying (at least old in data scientist years) that goes, “90% of data science is data wrangling.” This rings particularly true for data science leaders, who watch their data scientists spend days painstakingly picking apart ossified corporate datasets or arcane Excel spreadsheets. Does data science really have to be this hard? And why can’t they just delegate the job to someone else?

Data Is More Than Just Numbers

The reason that data wrangling is so difficult is that data is more than text and numbers.