topics. Other examples of application areas for semantic enrichment are analytics; API-driven content delivery; and content-enabled workflow applications. He said semantic enrichment helps ensure publishers’ products are cost-effective and have a competitive time-to-market. This is because automation as a result of enrichment and its outputs can make publishing workflows more efficient.
Picking a partner
For publishers considering enriching their content, partnering with a specialist technology company could be a good option. So how should publishers go about picking a partner? ‘First of all you need a solution that has the necessary precision and recall when processing the documents,’ suggested Hastings of Linguamatics. ‘The solution should be scalable to large amounts of text, flexible enough to deal with different data sources and formats and able to plug in any domain knowledge. The solution should also have an API that conforms to industry standards so that it can be easily integrated into the publisher’s workflows.’ He went on to say that partners should also have the necessary expertise and experience in processing and analysing text data, such as the use of natural language processing (NLP) and understanding the benefits and challenges of using thesauri and ontologies. ‘Publishers should ask “Does my partner have an established customer base that has already benefitted from their technology, and have they worked with publishers before?”.’
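For readers unfamiliar with the two measures Hastings mentions, the following is a minimal sketch of how precision and recall might be computed for a tagging run. It assumes a hand-checked set of expected tags is available for comparison. The function name `precision_recall` and the sample tags are purely illustrative and do not come from any vendor's product.

```python
def precision_recall(predicted: set, expected: set) -> tuple:
    """Compare tags produced by an enrichment tool against a hand-checked set."""
    true_positives = len(predicted & expected)
    # Precision: what share of the tags the tool produced were correct?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what share of the correct tags did the tool manage to find?
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical example: tags found by a tool vs. tags assigned by an editor.
predicted = {"aspirin", "ibuprofen", "headache", "London"}
expected = {"aspirin", "ibuprofen", "paracetamol"}

p, r = precision_recall(predicted, expected)
print(f"precision={p:.2f} recall={r:.2f}")
```

A tool that tags everything scores well on recall but poorly on precision, and a very cautious tool does the reverse, which is why both figures are worth asking a prospective partner about.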
Mayer agreed with the need to look for partners with experience in helping publishers, ideally, he added, publishers in the same field (for example biomedicine or engineering). He also suggested publishers should look for partners who can ‘put them in touch with counterparts using their platform and can share best practices and insights into this key workflow’ and ‘offer a robust/scalable yet deeply customisable product, as well as quality measurement tools, to be able to adapt the platform to their specific use cases and tune the way it behaves to meet acceptance criteria.’ In addition, he said, semantic enrichment partners should be able to provide support to publishers’ customers using the platform and be able to recommend a range of implementation partners, who can offer complementary consulting and/or integration services. Finally, he said,
partners should ‘have off-the-shelf integrations and repeat deployments with other industry-relevant solutions (such as content repositories or workflow tools).’ Silverchair’s Zarnegar advised that publishers should ask to see practical outcomes of semantic projects when evaluating partners. ‘While many semantic technologies and approaches have the potential to be effective for many uses, publishers should look for a partner that can take semantic enrichment projects from initial concept all the way through to specific business outcomes. The best partners will help the publishers achieve those business goals with their enriched content rather than just enriching the content and moving on,’ he explained.
Some semantic enrichment options
• Silverchair offers semantic enrichment project services and tools. These include strategic semantic planning services and a taxonomy/ontology manager (Totem). The company also provides taxonomy development services, an automated semantic enrichment engine that tags content down to the most granular level (Tagmaster) and a web delivery platform that uses semantically enriched content to drive advanced features and new product creation (SCM6). It also provides an analytics platform that combines user activity with semantic tagging to create detailed business intelligence about audiences and their information preferences (Silvermine).
• Linguamatics offers a natural language processing (NLP)-based text analytics platform called I2E, which identifies and captures semantically relevant information buried in unstructured text. This platform allows you to plug in domain knowledge – ontologies, terminologies and thesauri – when processing and tagging the text. Phil Hastings of Linguamatics said: ‘This means that rather than just searching for keywords the user can search for concepts and classes of entity and the relationships between them.’
I2E operates in two ways. The first option is to run NLP-based queries directly and extract information in a structured form for further analysis and review by an end user, or for export to a knowledge base. I2E can also pass tagged and enriched documents to a search engine to improve the end user experience and make the relevant concepts more discoverable.
• TEMIS provides its Luxid Content Enrichment Platform, which extracts structured information from unstructured content by recognising the key topics, entities and relations mentioned in text that can then enrich document metadata. The new version 7 of Luxid includes Webstudio, an ontology management web application that enables users to create, edit and maintain their ontology collaboratively, while governing the way ontological objects are recognised by the Luxid semantic enrichment pipeline. It uses the platform’s NLP layer to preview in real time the results of the semantic enrichment process when applied to users’ corpus of documents. It is also able to suggest relevant objects mentioned in the user’s corpus that are not yet included in the ontology, helping users to improve their existing ontology or build one from scratch.
22 Research Information AUGUST/SEPTEMBER 2014
Future potential
There are many potential opportunities for semantic enrichment to enhance content further in the future too. Hastings predicted a range of applications, including enterprise search and linked data, where data from disparate text sources is connected by applying identifiers. He also expects to see more semantic stores – generating ‘triples’ of information (entity-relationship-entity) from text and storing it in a semantically meaningful way so that the data can be connected together in networks – and geospatial search (searching for concepts within maps).
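To make the idea of ‘triples’ more concrete, here is a minimal sketch of an in-memory store of entity-relationship-entity statements. It is an illustration only, not how any of the products mentioned in this article work. The `TripleStore` class, its method names and the sample statements are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str   # first entity
    relation: str  # relationship between the two entities
    obj: str       # second entity


class TripleStore:
    """Toy store that indexes triples by both of their entities."""

    def __init__(self) -> None:
        self._by_subject = defaultdict(set)
        self._by_object = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        triple = Triple(subject, relation, obj)
        self._by_subject[subject].add(triple)
        self._by_object[obj].add(triple)

    def neighbours(self, entity: str) -> set:
        """Return every entity directly connected to `entity`."""
        outgoing = {t.obj for t in self._by_subject.get(entity, ())}
        incoming = {t.subject for t in self._by_object.get(entity, ())}
        return outgoing | incoming


# Illustrative statements, as might be extracted from different documents.
store = TripleStore()
store.add("aspirin", "inhibits", "COX-1")
store.add("aspirin", "treats", "headache")
store.add("ibuprofen", "inhibits", "COX-1")

print(sorted(store.neighbours("COX-1")))
```

Because both drugs share the entity "COX-1", statements drawn from separate sources end up connected in one network, which is the property Hastings describes.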
Another opportunity is in linking chemistry to biology: for example, linking entity names with identifiers and property information such as SMILES, structures, HELM notation and sequence data. Hastings also sees greater potential in healthcare applications, for example better annotation and search of electronic health records.
Healthcare information is a particular focus for Silverchair, and Zarnegar described how semantic