Tuesday, April 18, 2017

Data Scientists vs BI Analysts: Why is this a thing?

I came upon this popular LinkedIn post that attempts to define and draw distinctions between these two roles. It is one of several I have read recently.

Upon reading it, my initial reaction was strong disagreement with these characterizations. My feeling is that, if we want to draw such distinctions at all, they should be between business analysts and data analysts. Data Scientist is just a newer title that combines some attributes of both business and data analysis. It nearly always includes a mastery of big data technologies and statistical methods, thus commanding higher compensation.

Then I realized that this is all beside the point. These role definitions are more about recruiting, HR job descriptions, org charts, and pay grades than what is actually required to succeed in an analytics program. What matters is having the necessary skill sets on the analytics team, regardless of what roles or organizations they come from.

As has always been the case in BI & Analytics, the critical skill sets can be considered using the classic Input → Process → Output model:

·   Data sourcing & extraction (ETL/ELT)
·   Data preparation
·   Data quality
·   Data governance
·   Data navigation & investigation
·   Data discovery
·   Business analysis
·   Modeling
·   Predictive analytics
·   Reporting
·   Dashboards & KPIs
·   Visualization
·   Operational applications
·   Presentation/storytelling

BI/Analytics technology no longer respects the walls between these skill sets. The market has moved away from niche tools to suites that address the entire analytics capability set. For example, what were once pure visualization tools now offer data sourcing, transformation, and modeling features. This has democratized the entire data supply chain, moving it much closer to the business and completely obscuring the role distinctions between data analysts, data scientists and, yes, decision makers. In fact, the overlap of these roles and the trend toward self-service BI tend to create organizational redundancy in larger organizations that can afford it.

The fact that the technology is available to many roles does not mean that individuals should be expected to have all the necessary skills to leverage it effectively. In fact, very few people do. Our trade has always placed a high value on those who can navigate data, develop actionable information, and present it effectively, because such people are still rare. This won’t last. The generation now entering the workforce has a much higher level of data skills than its predecessors and, as it rises to executive positions, will value the ability to develop its own stories and support its own decisions with data.

If the goal is to leverage data most effectively and maximize decision support success, don’t look to your organization to create a new role. Look to your team to fill any skills gaps, preferably by expanding the roles already in place. The goal is to minimize the organizational distance, handoffs, and filters between your sources of data and those who directly put it to use in business processes.