Alex J. Gutman

Becoming a Data Head



      Don't assume you know the answer to this question until you've asked it.

      What If We Don't Like the Results?

      The last question a Data Head should ask prepares the stakeholders for something they'd rather overlook—the possibility their assumptions were wrong. “What if we don't like the results?” imagines you are at the point of no return. You've spent hours on a project only to find out the results show something different from what everyone expected. Notice this is different from having data that can't answer the question. Here, the data can answer the question, perhaps quite confidently, but the answer is not what the stakeholders wanted.

      Asking this question will also expose differences in how individuals will accept the results of the project. For instance, consider our avatar George from the introduction. George is the type of person who would ignore results that don't align with his beliefs, while simultaneously promoting favorable results that do. The question will hopefully uncover his bias early, before the project starts.

      You don't want to start a project where you know there's only one accepted result.

      Projects can fail for a host of reasons: lack of funding, tight timelines, wrong expertise, unreasonable expectations, and the like. Add data and analysis methods into the mix, and the list of possible failures not only grows but becomes obscured behind the analysis. A project team might apply an analysis method they can't explain on data they don't understand to solve a problem that doesn't matter—and still think they've succeeded.

      Let's look at a scenario.

      Customer Perception

      You work for a Fortune 10 company, Company X, that recently received negative media attention for a socially insensitive marketing campaign. You've been assigned to a project to monitor “customer perception.”

      The project team consists of the following:

       The project manager (you)

       The project sponsor (the person paying for it)

       Two marketing professionals (who don't have data backgrounds)

       A young data scientist (fresh out of college and eager to apply the techniques they learned)

      The basic premise, you are told, is that sentiment analysis can automatically label a tweet or Facebook post as “positive” or “negative.” For instance, the phrase “Thank you for sponsoring the Olympics” is positive, whereas “Horrible customer service” is negative. Conceivably, the data scientist could count the daily totals of positives and negatives, plot the trends over time (and in real time!), and subsequently share the results via a dashboard for all to see. Most important: no one needs to read customer comments anymore. The machine will do it for everyone. So, it's decided. The project kicks off.
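      To make the premise concrete, here is a minimal sketch of what the team was proposing—not their actual system. It assumes a toy keyword lexicon and a few made-up posts (both are illustrative assumptions), labels each post positive or negative, and tallies the daily counts a dashboard would plot.

# A minimal sketch of lexicon-based sentiment scoring with daily counts.
# The word lists and sample posts are illustrative assumptions only.
from collections import Counter
from datetime import date

POSITIVE_WORDS = {"thank", "thanks", "great", "love", "awesome"}
NEGATIVE_WORDS = {"horrible", "terrible", "hate", "worst", "awful"}

def label_sentiment(text):
    """Label a post 'positive', 'negative', or 'neutral' by keyword counts."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

# Hypothetical posts with dates, standing in for Twitter/Facebook data.
posts = [
    (date(2021, 5, 1), "Thank you for sponsoring the Olympics."),
    (date(2021, 5, 1), "Horrible customer service"),
    (date(2021, 5, 2), "I love this brand, great products!"),
]

# Daily totals of positive/negative posts -- the numbers the dashboard plots.
daily_counts = Counter((day, label_sentiment(text)) for day, text in posts)
for (day, label), count in sorted(daily_counts.items()):
    print(day, label, count)

      Notice that nothing in this sketch asks whether anyone can act on the resulting trend line—which is exactly the question the team skipped.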

      The project sponsor loves it. A week later, the dashboard is displayed on a monitor in the break room for all to see.

      Success.

      Six months later, the break room is renovated, and the monitor is removed.

      No one notices.

Figure: the dashboard's sentiment analysis trends plotted over time.

      Discussion

      In this scenario, it seemed like everything went well. But the fundamental question—why is the project important?—doesn't appear to have been brought up. Instead, the project team moved forward attempting to answer another question: “Can we build a dashboard to monitor the sentiment of customer feedback on the company's Twitter and Facebook pages?” The answer, of course, was yes, they could. But in the end the project wasn't useful or even important to the organization.

      You would think the marketers would have had more to say, but they were never identified as people whose work would be affected by the project. In addition, this project exhibited two early warning signs in how the team attempted to solve the problem: methodology focus (sentiment analysis) and deliverable focus (dashboard).

      Moreover, the project team in the Customer Perception scenario could have taken their problem, “Can we build a dashboard to monitor the sentiment of customer feedback on the company's Twitter and Facebook pages?” and performed a solution trial run. They could have assumed a dashboard was available and updated daily with positive/negative sentiments of social media comments:

       Can we use the answer? The team would be thinking about the relevance of sentiment analysis on customer perception. How can the team use the information? What is the possible business benefit of knowing the sentiment of customers on social media?

       Whose work will change? Suppose the team convinces itself that knowing sentiment is important in order to be good stewards of the business. But is someone going to monitor this dashboard? If the trends suddenly go down, do we do anything? What if they trend up?

      At this point, the marketing team would have hopefully spoken up. Would they have known what to do differently in their daily work with that kind of information? Likely not. The project, in its current form, hit a wall.

      If only they had asked the five questions.

      Right now, the industry is focused on training as many data workers as possible to meet the demand. That means universities,