content, metadata, navigation, and search system performance. I’ll do my best to help you integrate SSA, which is an inherently data-driven way to analyze user behavior, into traditional, more qualitative user-centered design methodologies. SSA is a missing link and a goldmine of untapped riches for all kinds of designers. I hope my book will serve as a toolkit to help you mine the data and, like John, achieve a truly better user experience.
Chapter 2. Site Search Analytics in a Nutshell
What Is Site Search Analytics?
George Kingsley Zipf, Harvard Linguist and Hockey Star
Ways to Use SSA (and This Book)
In the last chapter, I showed how Vanguard used (and continues to use) site search analytics to measure, monitor, and optimize its search system’s performance, not to mention improve the overall user experience, save money, promote jobs, and avoid disaster. Now it’s your turn to give it a try. The bulk of this book will teach you the nuts and bolts of SSA. Starting with Chapter 3, I’ll show you how to analyze your data, gain actionable insights, and put them to good use so your organization can enjoy some of the same benefits as Vanguard. But before we go deep, we’ll go broad. In this chapter, I’ll briefly cover the basics of SSA: what it is, how it works, and why you would use it. Think of this chapter as an introduction to SSA in 20 pages or fewer.
What Is Site Search Analytics?
Site search analytics is, at its simplest, the analysis of the search queries entered by users of a specific search system (see Figure 2-1 and Figure 2-2). What did they search? What do their searches tell you about them and their needs? How did their searches go? Does their experience suggest fixes or improvements to your site? Or does it raise follow-up questions to pursue through other forms of user research?
Note that in this book, we’re exploring the searching performed on a Web site or intranet. We are not covering how people search the entire Web using Google or another search engine. There are certainly parallels, but as you’ll see in the table in Figure 2-3, they’re not the same. In that table, the Referral Queries of the Michigan State University site came from Web search engines like Google, while the Local Queries were executed on MSU’s own search engine.
http://www.flickr.com/photos/rosenfeldmedia/5690980708/
Figure 2-1. In SSA, you can analyze queries, like these frequent queries of the AIGA.org site, as reported by Google Analytics...
http://www.flickr.com/photos/rosenfeldmedia/5690405125/
Figure 2-2. ...to learn about what your users want from your sites and your organizations.
http://www.flickr.com/photos/rosenfeldmedia/5690980732/
Figure 2-3. Rich Wiggins of Michigan State University assembled, categorized, and even color-coded the most frequent queries from the open Web versus those generated locally to illustrate their differences.
Unlike people searching the Web, your site’s searchers typically have more specific needs. They also may be familiar with your organization, its products, and its content—after all, they had to find their way to your site in order to use its search system. So the knowledge you’ll glean from SSA will be a bit different than (and complementary to) what you’ll learn from SEO (Search Engine Optimization) and SEM (Search Engine Marketing). Consider this analogy: if people searching the Web are essentially the leads you want to attract, people searching your site are the customers you hope to retain.
Why You’ll Want to Use SSA
SSA is unique: there truly is nothing like studying what people want from your site. It should be in your research toolkit—not by itself, mind you—but there’s no reason for it not to be there, unless your site somehow doesn’t have a search system.
There are plenty of ways you can track and learn from users’ behaviors aside from SSA. For example, if you’re a web analytics person, you might rely on clickstream analysis; if you’re a user researcher, perhaps you perform eye-tracking studies. They’ll all tell you something about user intent.
But none of these methods will tell you what users want in their own words. SSA provides an unmatched trove of semantic richness—not just what users want, but the tone and flavor of the language they use to express those needs. And it’s without the biases introduced by testing and a lab environment. Plus, you have the data already. You certainly won’t find it anywhere else or acquire it any other way.
It Always Starts with Data
SSA starts with raw data that describes what happens when a user interacts with a search system. It’s ugly, and we’ll break it down shortly, but here’s what it typically looks like (this sample is from the Google Search Appliance):
XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=All Sites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1"
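To make that a little less ugly, here is a minimal Python sketch of how an entry like the one above might be pulled apart. It is an illustration, not the Google Search Appliance's own tooling: the function name parse_search_entry, the field names it returns, and the regular expression are all assumptions of mine. The sketch also assumes that the stray spaces in the printed sample (after "sort=", inside "UTF-8", and after "ip=") are line-wrap artifacts, so it removes them. The one parameter it really depends on is q, which carries the query the user typed.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit, parse_qs

# The sample entry from the text. Line-wrap spaces are removed (an assumption
# about the printed layout); the IP address is masked in the source.
LOG_LINE = (
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] '
    '"GET /search?access=p&entqr=0&output=xml_no_dtd'
    '&sort=date%3AD%3AL%3Ad1&ud=1&site=All Sites&ie=UTF-8&client=www'
    '&oe=UTF-8&proxystylesheet=www&q=lincense+plate'
    '&ip=XXX.XXX.X.104 HTTP/1.1"'
)

# Hypothetical pattern for this log layout: host, two unused fields,
# a bracketed timestamp, then the quoted request line.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<url>.+) (?P<protocol>HTTP/[\d.]+)"'
)


def parse_search_entry(line):
    """Return the searcher's host, time, and query, or None if not a search."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    # parse_qs decodes %XX escapes and turns "+" into a space.
    params = parse_qs(urlsplit(match.group("url")).query)
    if "q" not in params:
        return None
    return {
        "host": match.group("host"),
        "time": datetime.strptime(
            match.group("timestamp"), "%d/%b/%Y:%H:%M:%S %z"
        ),
        "query": params["q"][0],
        "collection": params.get("site", [""])[0],
    }


entry = parse_search_entry(LOG_LINE)
print(entry["query"])  # lincense plate
```

Notice that the query comes out exactly as the user typed it, misspelling and all. That is the point: the raw entry preserves the searcher's own words, which is what the rest of the analysis builds on.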