Alex J. Gutman

Becoming a Data Head



the “story” of the book is chronological, each chapter is effectively a standalone lesson and could be read out of order. Still, we recommend reading the book from beginning to end so that your mental model builds steadily from the basics to deep learning.

      The book is organized into four parts:

       Part I: Thinking Like a Data Head. In this part, you'll learn to think like a Data Head: to think critically and ask the right questions about the data projects your organization takes on, to understand what data is and the right lingo to use, and to view the world through a statistical lens.

       Part II: Speaking Like a Data Head. Data Heads are active participants in important data conversations. This part will teach you how to “argue” with data and what questions to ask to make sense of the statistics you encounter. You'll be exposed to the basic statistics and probability concepts required to understand and challenge the results you see.

       Part III: Understanding the Data Scientist's Toolbox. Data Heads understand the fundamental concepts of how statistical and machine learning models work. You'll gain an intuitive understanding of unsupervised learning, regression, classification, text analytics, and deep learning.

       Part IV: Ensuring Success. Data Heads understand the common mistakes and traps that arise when working with data. You'll learn about technical pitfalls that cause projects to fail, and about the people and personalities involved in data projects. Finally, we provide direction on how to succeed as a Data Head.

      We've established that the data field is growing faster than we can articulate the problems and opportunities it creates. We showed that our past (both society's and the authors’) is filled with data failures. And only by understanding that past can we understand the future. We started you down this path by introducing you to several important concepts in the restaurant classification example.

      To understand data at a deeper level, you'll need to cut through the noise, think critically about data problems, and communicate effectively with data workers. We know that, armed with this knowledge, you'll be well off.

      Are you ready? Your journey to become a Data Head begins on the next page.

      1 Venture Beat. “87% of data science projects failing”: venturebeat.com/2019/07/19/why-do-87-of-data-science-projects-never-make-it-into-production

      2 www.brookings.edu/wp-content/uploads/2016/06/11_origins_crisis_baily_litan.pdf

      3 Nate Silver wrote a series of articles describing this in great detail (fivethirtyeight.com/tag/the-real-story-of-2016). Pollsters wrongly assuming independence, just like in the mortgage crisis, was one mistake.

      4 Note to our fellow statisticians: We just mean regular confidence, not statistical confidence.

      5 K-nearest-neighbor can also be used to predict numbers instead of classes. These are called regression problems, and we'll cover them later in the book.

      6 This idea is discussed in an amazingly helpful book: Wilson, G. (2019). Teaching tech together. CRC Press.

      Many companies rush to try the “next big thing” in data without ever pausing to ask the right business questions. Or learn basic data terminology. Or learn how to look at the world through a statistical lens.

      Data Heads won't have that problem. Part I, “Thinking Like a Data Head,” prepares you for the road ahead and puts you in the right mindset to think about and understand data. Here's what we'll cover:

       Chapter 1: What Is the Problem?

       Chapter 2: What Is Data?

       Chapter 3: Prepare to Think Statistically

       “A problem well stated is a problem half solved.”

       —Charles Kettering, inventor & engineer

      The first step on your journey to become a Data Head is to help your organization work on data problems that matter.

      That may sound obvious, but we suspect many of you have looked on as companies talked about how great data is but then went on to overpromise impact, misinterpret results, or invest in data technologies that didn't add business value. It often seems as if data projects are undertaken because companies like the sound of what they are implementing without fully understanding why the project itself is important.

      In our experience, going back to first principles and asking the fundamental questions required to solve a problem is easier said than done. Every company has a unique culture, and team dynamics don't always lend themselves to openly asking questions, especially ones that might make others feel undermined. And many of those becoming Data Heads find that they don't have the space to even begin asking the important questions that will drive their projects forward. That is why a culture in which these questions can be asked is as important as the questions themselves.

      There's no one-size-fits-all formula for every company and every Data Head. If you are a leader, we ask that you create an open environment that gets the questions going. (This starts with inviting the technical experts into the room.) And ask questions yourself. Doing so exhibits humility, a key leadership trait, and encourages others to join in. If you are more junior, we encourage you to ask these questions anyway, even if you're concerned it might upset the status quo. Our advice is simply to do your best. In our experience, asking the right questions goes a lot further than not asking them.