Alex J. Gutman

Becoming a Data Head



      Don't assume you know the answer to this question until you've asked it.

      What If We Don't Like the Results?

      The last question a Data Head should ask prepares the stakeholders for something they'd rather overlook—the possibility their assumptions were wrong. “What if we don't like the results?” imagines you are at the point of no return. You've spent hours on a project only to find out the results show something different from what everyone expected. Notice this is different from having data that can't answer the question. Here, the data can answer the question, perhaps quite confidently, but the answer is not what the stakeholders wanted.

      Asking this question will also expose differences in how individuals will accept the results of the project. For instance, consider our avatar George from the introduction. George is the type of person who would ignore results that don't align with his beliefs, while simultaneously promoting favorable results that do. The question will hopefully uncover his bias early, before the project starts.

      You don't want to start a project where you know there's only one accepted result.

      Projects can fail for a host of reasons: lack of funding, tight timelines, wrong expertise, unreasonable expectations, and the like. Add data and analysis methods into the mix, and the list of possible failures not only grows but becomes obscured behind the analysis. A project team might apply an analysis method they can't explain on data they don't understand to solve a problem that doesn't matter—and still think they've succeeded.

      Let's look at a scenario.

      Customer Perception

      You work for a Fortune 10 company, Company X, that recently received negative media attention for a socially insensitive marketing campaign. You've been assigned to a project to monitor “customer perception.”

      The project team consists of the following:

       The project manager (you)

       The project sponsor (the person paying for it)

       Two marketing professionals (who don't have data backgrounds)

       A young data scientist (fresh out of college and eager to apply the techniques they learned)

      The basic premise, you are told, is that sentiment analysis can automatically label a tweet or Facebook post as “positive” or “negative.” For instance, the phrase “Thank you for sponsoring the Olympics” is positive, whereas “Horrible customer service” is negative. Conceivably, the data scientist could count the daily totals of positives and negatives, plot the trends over time (and in real time!), and subsequently share the results via a dashboard for all to see. Most important: no one needs to read customer comments anymore. The machine will do it for everyone. So, it's decided. The project kicks off.
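      To make the premise concrete, here is a minimal sketch of what the team was proposing—not their actual system. It assumes a toy keyword lexicon and a few made-up posts (both are illustrative assumptions), labels each post positive or negative, and tallies the daily counts a dashboard would plot.

# A minimal sketch of lexicon-based sentiment scoring with daily counts.
# The word lists and sample posts are illustrative assumptions only.
from collections import Counter
from datetime import date

POSITIVE_WORDS = {"thank", "thanks", "great", "love", "awesome"}
NEGATIVE_WORDS = {"horrible", "terrible", "hate", "worst", "awful"}

def label_sentiment(text):
    """Label a post 'positive', 'negative', or 'neutral' by keyword counts."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

# Hypothetical posts with dates, standing in for Twitter/Facebook data.
posts = [
    (date(2021, 5, 1), "Thank you for sponsoring the Olympics."),
    (date(2021, 5, 1), "Horrible customer service"),
    (date(2021, 5, 2), "I love this brand, great products!"),
]

# Daily totals of positive/negative posts -- the numbers the dashboard plots.
daily_counts = Counter((day, label_sentiment(text)) for day, text in posts)
for (day, label), count in sorted(daily_counts.items()):
    print(day, label, count)

      Notice that nothing in this sketch asks whether anyone can act on the resulting trend line—which is exactly the question the team skipped.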

      The project sponsor loves it. A week later, the dashboard is displayed on a monitor in the break room for all to see.

      Success.

      Six months later, the break room is renovated, and the monitor is removed.

      No one notices.

Figure: the dashboard's sentiment analysis trends plotted over time.

      Discussion

      In this scenario, it seemed like everything went well. But the fundamental question—why is the project important?—doesn't appear to have been brought up. Instead, the project team moved forward attempting to answer another question: “Can we build a dashboard to monitor the sentiment of customer feedback on the company's Twitter and Facebook pages?” The answer, of course, was yes, they could. But in the end the project wasn't useful or even important to the organization.

      You would think the marketers would have had more to say, but they were never identified as people whose work would be affected by the project. In addition, this project exhibited two early warning signs in how the team attempted to solve the problem: methodology focus (sentiment analysis) and deliverable focus (dashboard).

      Moreover, the project team in the Customer Perception scenario could have taken their problem, “Can we build a dashboard to monitor the sentiment of customer feedback on the company's Twitter and Facebook pages?” and performed a solution trial run. They could have assumed a dashboard was available and updated daily with positive/negative sentiments of social media comments:

       Can we use the answer? The team would be thinking about the relevance of sentiment analysis on customer perception. How can the team use the information? What is the possible business benefit of knowing the sentiment of customers on social media?

       Whose work will change? Suppose the team convinces itself that knowing sentiment is important in order to be good stewards of the business. But is someone going to monitor this dashboard? If the trends suddenly go down, do we do anything? What if they trend up?

      At this point, the marketing team would have hopefully spoken up. Would they have known what to do differently in their daily work with that kind of information? Likely not. The project, in its current form, hit a wall.

      If only they had asked the five questions.

      Right now, the industry is focused on training as many data workers as possible to meet the demand. That means universities,