Volume 1, Issue 1
Integrating Browse with Search: Finding Needles in Haystacks
Ken Dufort, Vice President of Product Development, NewsBank
Expert searchers know that one of the best strategies for getting precise search results quickly and effectively is to use metadata when constructing searches.
Many have dedicated countless hours to learning the search fields, subject headings, search syntax and interface functionality of numerous databases in order to satisfy information requests efficiently. But in today's world, user expectations are higher than ever. Users not only expect precise results quickly, they also expect to get those results themselves without having to become expert searchers. Learning the advanced functionality of various interfaces or Library of Congress Subject Headings is not on their agenda.
Thus, the challenge for designers of information products is to expose those capabilities in a way that puts precise results within easy grasp of any user. The integrated browse/search design of the Readex Archive of Americana collections is an example of how to approach this challenge, and based on customer and user feedback, it appears to be a success. The following are the core principles behind the design:
Principle #1: Just because it's powerful and sophisticated doesn't mean it's advanced; presentation makes all the difference.
In most databases, field searching is relegated to the advanced search portion of the interface. Even when it isn't, users are generally expected to know what the fields represent, what values might be useful as search terms (e.g., Library of Congress Subject Headings), how to combine fields with other fields or full-text search terms, etc.
All these mechanisms exist for a single purpose: to help the user specify the scope of their search. Presentation mechanisms such as browse lists, check boxes and radio buttons are all ways to hide the complexity of field searching and, when applied well, can help users draw on supposedly advanced features of the database to get better results faster.
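As a rough illustration of how clickable choices can stand in for field searching, the sketch below turns a user's selections into a fielded query string. Everything in it is an assumption made for illustration: the field names (genre, region), the sample values, and the query syntax are hypothetical and do not describe the Readex interface or any particular search engine.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchForm:
    """What the user clicked and typed. All field names are hypothetical."""
    genres: List[str] = field(default_factory=list)  # check box choices
    region: Optional[str] = None                     # radio button choice
    text: str = ""                                   # words typed in the search box


def quote(value: str) -> str:
    """Wrap a field value in double quotes, escaping any embedded quotes."""
    return '"' + value.replace('"', '\\"') + '"'


def build_query(form: SearchForm) -> str:
    """Translate clicks into the fielded query the user never has to write."""
    clauses = []
    if form.genres:
        # Several checked boxes for one field are alternatives, so OR them.
        clauses.append("(" + " OR ".join("genre:" + quote(g) for g in form.genres) + ")")
    if form.region:
        clauses.append("region:" + quote(form.region))
    if form.text.strip():
        # Typed words must all appear, so AND them.
        clauses.append("(" + " AND ".join(form.text.split()) + ")")
    # Different fields narrow the scope together, so AND the clauses.
    return " AND ".join(clauses)


form = SearchForm(genres=["Sermons", "Almanacs"], region="New England",
                  text="smallpox inoculation")
print(build_query(form))
```

The user ticks two boxes, picks a region and types two words; the Boolean logic and field syntax stay behind the scenes.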
Principle #2: Getting users to fully utilize search forms is hard; getting them to click on relevant links is easy.
Studies of search logs show that most users enter two or three search terms when constructing a query, regardless of how much encouragement there is in the interface to do more. Even users of Advanced Search are likely to use only a very limited range of the available functionality. What users will do, however, is make choices from lists that seem to be relevant to their goal of retrieving information. The results can be quite dramatic. Years ago in one of Readex's products, a button presented along with the search results enabled the user to bring up a list of topics related to their search; it was utilized by only about 5% of all users. When the design was changed to simply include Related Topics side-by-side with the search results, 30% of all users utilized them.
Principle #3: The scope of the user's information need is (almost always) narrower than the entire database; make it easy for users to narrow the scope of their search.
Large, general-purpose databases pose a real challenge for users who enter only two or three full-text search terms. Invariably, the result is too many hits, and users have no clear idea of how to narrow the results effectively. Getting users to indicate which portion of the content they are interested in before they enter full-text search terms greatly increases their chance of success by "throwing out a lot of hay." These intuitive options for narrowing the scope of a search tend to be content-specific, but nearly all content collections have one or two dimensions that can be exploited for this purpose.
Principle #4: Browsing and searching work best in combination.
Providing browseable lists of subject terms and other metadata has been a fairly common feature of databases for quite some time, but often the browsing is not integrated with search. If you want all of the results for a particular subject, that's great. Otherwise, you are left with more than you want and no good way to narrow your results with additional search terms. By integrating browse with search, users selecting subject terms, genres, etc. are essentially "throwing away some of the hay" from the "haystack" of the full database. These users will have an easier time finding the "needles" when they do their full-text search.
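The idea of throwing away hay before looking for needles can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not a description of how the Archive of Americana is implemented: the Document structure, the browse and full_text_search functions, and the three sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Document:
    """A hypothetical record: a title, browseable metadata, and full text."""
    title: str
    metadata: Dict[str, str]
    text: str


def browse(docs: List[Document], **selected: str) -> List[Document]:
    """Keep only documents whose metadata matches every browse selection."""
    return [d for d in docs
            if all(d.metadata.get(k) == v for k, v in selected.items())]


def full_text_search(docs: List[Document], query: str) -> List[Document]:
    """Keep only documents whose text contains every query term."""
    terms = query.lower().split()
    return [d for d in docs if all(t in d.text.lower() for t in terms)]


# Invented sample records, for illustration only.
LIBRARY = [
    Document("Sermon A", {"genre": "Sermons"},
             "A discourse on the late fire in Boston"),
    Document("Almanac B", {"genre": "Almanacs"},
             "Tides for Boston harbor and the fire signs of the zodiac"),
    Document("Sermon C", {"genre": "Sermons"},
             "On charity and the duties of the rich"),
]

# Searching the whole haystack returns both documents that mention the terms.
everything = full_text_search(LIBRARY, "fire boston")

# Browsing to a genre first discards the irrelevant hay, then search runs
# only on what remains.
narrowed = full_text_search(browse(LIBRARY, genre="Sermons"), "fire boston")

print([d.title for d in everything])
print([d.title for d in narrowed])
```

The same two search terms return fewer and more relevant hits once the browse selection has narrowed the scope, and the user never has to write a fielded query.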
Putting it all together in the Archive of Americana
When you use the Archive of Americana collections, you will see that we've put these design principles to work. For each collection, we've carefully studied the available metadata, looking for natural ways to let users browse to the particular "haystack" that they think will contain the most relevant information for their needs. For Early American Imprints, Series I and II, that may be a browseable list of genres. For the U.S. Congressional Serial Set, 1817-1980, that may be a browseable list of Congresses. For Early American Newspapers, Series I, II and III, that may be a specific geographic region or even a specific publication. In all cases, users are able to utilize (knowingly or unknowingly) complex search strategies to improve their success at finding "the needle in the haystack."