Dave Fowler

The Informed Company




      

      Table of Contents

      1  Cover

      2  Title Page

      3  Copyright

      4  Dedication

      5  About This Book
            Why Write This Book
            Who This Book Is For
            Who This Book Is Not For
            Who Wrote the Book
            Who Edited the Book
            Influences
            How This Book Was Written
            How to Read This Book

      6  Foreword

      7  Introduction
            Merging Business Context with Data Information
            The Four Stages of Agile Data Organization

      8  STAGE 1: SOURCE aka Siloed Data
            Chapter One: Starting with Source Data
                  Common Options for Analyzing Source Data
            Chapter Two: The Need to Replicate Source Data
            Chapter Three: Source Data Best Practices
                  Keep a Complexity Wiki Page
                  Snippet Dictionary
                  Use a BI Product
                  Double Check Results
                  Keep Short Dashboards
                  Design Before Building

      9  STAGE 2: DATA LAKE aka Data Combined
            Chapter Four: Why Build a Data Lake?
                  What Is a Data Lake?
                  Reasons to Build a Data Lake Summarized
            Chapter Five: Choosing an Engine for the Data Lake
                  Modern Columnar Warehouse Engines
                  Modern Warehouse Engine Products
                  Database Engines
                  Recommendation
            Chapter Six: Extract and Load (EL) Data
                  ETL versus ELT
                  EL/ETL Vendors
                  Extract Options
                  Load Options
                  Multiple Schemas
                  Other Extract and Load Routes
            Chapter Seven: Data Lake Security
                  Access in Central Place
                  Permission Tiers
            Chapter Eight: Data Lake Maintenance
                  Why SQL?
                  Data Sources
                  Performance
                  Upgrade Snippets to Views

      10  STAGE 3: DATA WAREHOUSE aka the Single Source of Truth
            Chapter Nine: The Power of Layers and Views
                  Make Readable Views
                  Layer Views on Views
                  Start with a Single View
            Chapter Ten: Staging Schemas
                  Orient to the Schemas
                  Pick a Table and Clean It
                  Other Staging Modeling Considerations
                  Building on Top of Staging Schemas
            Chapter Eleven: Model Data with dbt
                  Version Control
                  Modularity and Reusability
                  Package Management
                  Organizing Files
                  Macros
                  Incremental Tables
                  Testing
            Chapter Twelve: Deploy Modeling Code
                  Branch Using Version Control Software
                  Commit Message
                  Test Locally
                  Code Review
                  Schedule Runs