FEATURE Semantic enrichment


From documents to data


The rapid growth of digital information has led to rising interest in new approaches to information retrieval and discovery, writes David Stuart


In a world where thousands, millions, or even billions of documents may be returned in response to a simple keyword search, new tools are required to help people find the most relevant resources and help them to dig deeper into the results. One approach that has gained a lot of interest is semantic enrichment, the enrichment of content with additional semantic metadata to enable greater understanding of content by machines. Speaking with Bob Kasenchak, head of product development at Access Innovations (www.accessinn.com), it’s clear that the work of semantic enrichment is really only just beginning, and there is a lot of potential in the adoption of increasingly complex knowledge organisation systems and enrichment at a finer level of granularity.


The challenge


The rapid growth in digital resources over the last couple of decades comes from both the digitisation of previously analogue content and a huge growth in the quantity of new digital content being made available. Digital versions of traditional publications have been joined by a wealth of new born-digital genres (e.g., web pages, blogs, and social media), whilst the potential of open data and open code to stimulate innovation has led to the opening of such resources by innovative organisations in many different sectors. The value of the growing abundance of resources can only be realised, however, if people can find the information they need, and don’t have to waste their time trawling through thousands of resources irrelevant to their particular needs. Although full text search is increasingly offered for textual resources, the inherent ambiguity of natural language leads to numerous false drops.


34 Research Information AUGUST/SEPTEMBER 2015




Ensuring people have access to the resources they need, as and when they need them, isn’t a problem restricted to any one particular type of organisation, but rather is a problem for all kinds of organisations and individuals in the retrieval of a wide range of resources. It’s a problem for governments that need to ensure civil servants can quickly find the data they need whichever government department has collected the data; it’s a problem for commercial organisations that want to streamline and standardise processes throughout the world; and it’s a problem for researchers who need to be able to find the resources they need, irrespective of whether those resources originate within their own fields of study or in a tangential field.


It is a problem that is particularly relevant to organisations providing content to the public, especially those who want to be paid for it. According to Kasenchak, this has led scholarly publishers to be at the leading edge of semantic enrichment: ‘Hundreds of thousands of documents were digitised for researchers, but as digital models became prevalent people also came to expect content to be free, and they get very frustrated when you pay for membership to a science society and the search is terrible. ‘In order to demonstrate the value to their membership scholarly publishers began finding ways to give the benefit of membership in actual


@researchinfo www.researchinformation.info



