Nathan E. Myers

Self-Service Data Analytics and Governance for Managers



replace many of the unstructured spreadsheet processes they perform each day, or at least replace many of the processing steps embedded within such processes (always remember the residual tail). In this section, we wish to highlight self-service data analytics as an indispensable, evolving discipline that is growing in prevalence across medium-to-large organizations. This specific subset of data analytics will be one of the key focal areas of this book.

      The flexible, customizable, low-code or no-code capabilities offered by self-service data analytics tools allow process owners to very quickly structure a workflow to extract source data for processing, whether that data is selected and consumed from systems, arrives as free text, arrives as a clean data array in a flat file or spreadsheet, or arrives as an image for OCR/ICR data extraction. Once data is extracted and ingested, it can be transformed: it can be joined to or enriched with data from additional datasets, filters can be applied, mathematical operations can be completed, and field order or file format can be changed. The data outputs are then loaded to another system for further processing, or perhaps to a visualization application or dashboard for display.

      These functions are commonly referred to as extract, transform, and load (ETL) capabilities. ETL represents some of the most common use cases for self-service data analytics tools. If you think about what operators in your respective organizations are doing in spreadsheets all day, they are very often starting with one or several system extracts, enriching them by joining a number of other flat files or spreadsheet files, and performing any number of operations on the dataset, before transforming the data to a specific output format, such as a report or the format required for load to another tool. As a last step, the enriched and transformed dataset can either be input directly to another system or tool, or be transmitted via any number of delivery methods.
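
      The extract, transform, and load steps described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not a depiction of how any particular self-service tool works internally. The two inline files, the field names (trade_id, account, amount, desk), the positive-amount filter, and the "UNMAPPED" placeholder are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical system extract and reference file (stand-ins for real exports).
TRADES_CSV = """trade_id,account,amount
T1,A100,250.00
T2,A200,-75.50
T3,A100,1200.00
T4,A999,40.00
"""

ACCOUNTS_CSV = """account,desk
A100,Rates
A200,Credit
"""


def extract(text):
    """Extract: read a flat-file export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(trades, accounts):
    """Transform: enrich by key, filter, apply arithmetic, reorder fields."""
    # Keyed lookup, the role a VLOOKUP on a key field plays in a spreadsheet.
    desk_by_account = {row["account"]: row["desk"] for row in accounts}
    output = []
    for trade in trades:
        amount = float(trade["amount"])
        if amount <= 0:  # filter (assumed rule): keep positive amounts only
            continue
        output.append({
            "desk": desk_by_account.get(trade["account"], "UNMAPPED"),
            "trade_id": trade["trade_id"],
            "amount_thousands": round(amount / 1000, 3),
        })
    return output


def load(rows):
    """Load: serialize to CSV text in the field order the target expects."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["desk", "trade_id", "amount_thousands"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


result = load(transform(extract(TRADES_CSV), extract(ACCOUNTS_CSV)))
print(result)
```

      In a real workflow, the load step would write to a file, a downstream system, or a dashboard data source rather than return a string. Flagging unmatched keys as "UNMAPPED" instead of silently dropping them is a design choice that keeps enrichment gaps visible to the process owner.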

      Of course, many of these steps could be eliminated by adding features and functionality to core systems. If there were interoperability between systems upstream, such that datasets were adequately rich within core systems, users might not be required to enrich them downstream, outside of systems. No longer would they need to open six spreadsheets and use key fields and VLOOKUPs to pull back all the data required for a given processing operation. If system reporting suites were adequately rich and flexible, operators could forgo the “transformation” steps they perform outside of systems to reorder fields or to reformat system outputs.

      The authors submit that in no way should tactical self-service tooling replace core tech backlog delivery. Managers should continue to push the change apparatus in their respective organizations for the delivery of processing functionality in strategic applications. We have already discussed that lengthy wait times often accompany a full core tech backlog; however, there is no need to suffer while you wait. In many cases, process owners themselves can quickly structure their own processes in self-service analytics tools, such as Alteryx, dramatically reducing the time spent processing in spreadsheet-based end-user computing tools (EUCs), and even reducing the number of EUCs in use. In Chapter 4, we will prescribe a control point to ensure that all tactical self-service data analytics builds can be cross-referenced back to an enhancement request in a strategic core technology system backlog. This ensures that tactical builds are a stopgap measure with a limited shelf life, until the strategic solution can be delivered behind them.
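
      As a rough illustration of the kind of cross-reference check such a control point implies, the sketch below flags tactical builds that cannot be tied back to an open backlog item. It is not the control prescribed in Chapter 4; the TacticalBuild record, its fields, the ticket IDs, and the unlinked_builds helper are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TacticalBuild:
    """Hypothetical inventory record for one self-service analytics build."""
    name: str
    owner: str
    backlog_ticket: Optional[str]  # assumed ID of the enhancement request


def unlinked_builds(builds, open_backlog_ids):
    """Return builds with no reference to an open strategic backlog item."""
    return [b for b in builds if b.backlog_ticket not in open_backlog_ids]


# Made-up inventory and backlog data for illustration only.
inventory = [
    TacticalBuild("daily_recon_workflow", "ops_team", "ENH-101"),
    TacticalBuild("fx_report_formatter", "finance_team", None),
    TacticalBuild("client_fee_enrichment", "ops_team", "ENH-999"),
]
open_backlog = {"ENH-101", "ENH-205"}

for build in unlinked_builds(inventory, open_backlog):
    print(f"No open enhancement request for: {build.name}")
```

      A build with a missing ticket and a build pointing at a ticket that is not in the open backlog are both reported, since in either case the tactical build has no strategic delivery queued behind it.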

      There is one final point to make about self-service data analytics tools. They are not meant to be a bandage for a broken, overly convoluted, or inefficient process. Process owners should map out their processes from start to finish, preferably with swim lanes to readily identify inter-functional touchpoints with other parties and stakeholders (see the section Process Map (Swim Lanes) in Chapter 7 for an example of this artifact). They should take the opportunity to highlight and eliminate any low-value-added steps, where possible. They should ask the Whys to understand the root causes and rationales for any accommodation steps in workflows. Only when process owners have distilled and rationalized their processes down to elegant simplicity should they embark on a tactical automation project with data analytics tools. The idea is not to take the pain out of a broken process with tactical automation, so that it is smoothed over and forgotten, and left to age with all of its pimples and warts, out of sight, out of mind. After all, pain points have a way of festering when left unaddressed.

      Dashboarding and Visualization

      It is this last aphorism that has really stuck. This is mentioned because in all of our businesses, there are key metrics that are actively managed (and frequently reported) to allow individuals, managers, or executives to closely monitor process performance. These measures are referred to as key performance indicators (KPIs) and are used to measure and report on the health and performance of an organization, a division, a function, or even a process within them. They tend to be some of the most widely reported numbers for internal audiences, and a portion of them may find their way to external stakeholders and regulators. A whole book could be written on how to make a thoughtful selection of KPIs for a given process, in order to convey health across a number of dimensions. However, for the purposes of this book, we will assume that these have been arrived at separately and are effectively conveying business performance to allow for active and rigorous management. What we do want to cover in this section are the ways that KPIs can be compiled and displayed efficiently through the use of dashboards and visualization tools.

      Most, if not all, of our readers will be familiar with common temporal data visualization formats – bar charts (to show value comparisons), line charts (to show time-series movements), scatterplots (to show large numbers of observations), and sparklines (for trending). Some may be familiar with hierarchical visualizations like tree diagrams, sunbursts, and ring charts. A more select few will be familiar with multidimensional data visualizations that can communicate more than one variable for each observation. Examples of multidimensional data visualizations include pie charts and stacked bar charts that show observation values relative to the