Martin White reviews a collection of essays on a wide range of current topics and challenges in information retrieval.
Information Retrieval and Enterprise Search
For much of 2011 I worked on a project commissioned by the Institute for Prospective Technological Studies, Joint Research Centre, European Commission, on a techno-economic study of enterprise search in Europe. There is no dispute that the volume of information inside organisations is growing very rapidly, though much of this growth is the result of never discarding any digital information. The scale of the problem is well documented by the McKinsey Global Institute (MGI) in its report on ‘Big Data’. In this report, MGI suggests that the majority of companies in the USA with 1000 or more employees have over 200 terabytes of digital information, and that in some sectors (such as manufacturing) the figure is approaching 1 petabyte. There is no reason to suggest that the situation is much different in Europe. Research conducted by MarkLogic indicates that, although managers regard information as a core business asset, few are taking any action to ensure that their employees are able to find the information they need to make good decisions that will benefit their organisations and their own careers.
In Europe it is likely that only one company in ten has a good enterprise search application. Many more may have a search application for an intranet, or for a document management application, but these cover only a small proportion of the unstructured (i.e. text, image, video) assets of the company. Among the reasons for this are a low awareness among IT managers of the technology, availability and value of enterprise search, and a significant problem in finding staff with skills in enterprise search and information retrieval.
The paradox in Europe is that there is a very active information retrieval (IR) community, yet the connections between information retrieval and enterprise search are virtually non-existent. One result is that almost no research on enterprise search issues is carried out by the IR community. More importantly, higher education institutions (HEIs) with IR research groups very rarely teach the subject at undergraduate level; and even when they do, the focus is on the information science of IR and not the practical applications. Where is the next generation of enterprise search professionals going to learn its trade?
Theory and Practice
I was therefore delighted to receive this book for review, as it promised to bridge the gap (more accurately ‘gulf’) between theory and practice. This book is one of two from Facet Publishing this year, and I’ll start this review by congratulating Facet on recognising the need to publish in the IR/search sector. Normally my heart sinks when I begin to read a book that is a set of contributions by individual authors, but in this case the editors have welded the contributions together so well that it reads as though it was written by a single author.
David Bawden starts the book off with a fascinating synthesis of work on browsing. A feature of the book is that all the authors have significant experience in the topics on which they are writing, but are also able to stand back and see the strengths of the work carried out by others. Bawden is an excellent example, as he strolls effortlessly through the work of Bates, Marchionini and others to illustrate both the complexity and value of browsing. Aida Slavic then reviews work on the role of classification schemes in information retrieval, including the now ubiquitous use of facets. Slavic provides a masterly piece of analysis which I found very stimulating to read.
The next topic is fiction retrieval research, addressed by Anat Vernitski and Pauline Rafferty. I’ll admit that the requirements of fiction retrieval had totally passed me by but now I realise that the need to search fiction both for the purposes of reading for pleasure and for scholarly research presents some interesting challenges. Similar issues arise in music information retrieval research, which is the subject covered by Charlie Inskip in Chapter 4. As a musician and an information scientist, I found this of especial interest.
Next in line comes Isabella Peters with a comprehensive review of folksonomies, social tagging and information retrieval. This is a very topical subject, and the book is worth buying for this chapter alone. The penultimate chapter is by Richard Kopak, Luanne Freund and Heather L. O’Brien on digital information interaction as semantic navigation. Although the key papers on this very broad topic are listed, I felt this chapter wandered about a bit, even though all the authors work at the same institution, and I was left with a feeling of ‘so what?’ at the end. In contrast, the focus of the final chapter, on a webometric approach to assessing Web search engines by Mike Thelwall, is excellent and thought-provoking.
This book is one of the best examples I have come across of publisher, editors and authors working in total unison. The speed of publication is also noteworthy, as there are many references to work published in 2010. This is important in a field moving as rapidly as information retrieval. Although the book is targeted at the IR community, there is much that would be of interest to managers of enterprise search applications, because they need to be aware of just how difficult search is, and of the amount of excellent research that is being undertaken to ease the burden on the search user. It should be an invaluable starting point for undergraduate and graduate information science students looking for ideas for essay and research topics, and it also serves as an illustration of how to write good literature reviews. There must be 500 or more papers cited in total, and anyone in the IR community and many in enterprise search would benefit from the insights provided by the authors. Definitely a five-star rating.
- McKinsey Global Institute. Big Data: The Next Frontier for Innovation, Productivity and Competition. http://www.mckinsey.com/mgi/publications/big_data/
- MarkLogic Survey Reveals Unstructured Information is Growing Rapidly, Will Soon Surpass Relational Data. MarkLogic Press Release, 6 June 2011.
- Interactive Information Seeking, Behaviour and Retrieval. Edited by Ian Ruthven and Diane Kelly. Facet Publishing, 2011. ISBN 978-1-85604-707-4.
Martin White has been involved in the science and business of search since 1974, and has written two books on the subject of enterprise search. He was Chair of the Enterprise Search Europe Conference, which took place in October 2011, the first ever conference on the subject in Europe. He is a Visiting Professor at the iSchool, University of Sheffield.