Analysis and news


The importance of interrogation

The need for fast, accurate and enriched data has never been more pressing – and the pandemic has reinforced how urgent it is, writes Manisha Bolina


With the explosion in the volume of open access data, turning information into trustworthy, meaningful and insightful knowledge is critical. Researchers simply do not have the time to read the full text of preprints, grants, patents, datasets, publications, policy documents and more. In a world where interdisciplinary research is intrinsic to the way it affects our lives, we need technology to help make the process more efficient and free from human bias. Enter AI. Let’s look at two examples of how it is used, and the power it brings.

Dimensions, the world’s largest linked research information dataset, takes the approach of linking data. It looks not only at publications, but at how they connect to other data points such as clinical trials, patents, datasets, grants and policy documents. Research does not begin and end with publications; to really understand it you need the whole picture, and to be able to place a piece of research information in the wider research landscape.

How does Dimensions help researchers, funders and institutions do this? It uses AI and machine learning algorithms to read the full texts of publications and other data sources, then links them together using metadata, creating around five billion connections. Dimensions doesn’t have a ‘front and back list’ or an editorial board dictating what should be included in the platform. Its inclusive approach puts the power back in the user’s hands.

AI is especially useful for tasks like author disambiguation: many have tried repeatedly to crack this nut, and it is a hard one. Dimensions uses machine learning (ML) methods to disambiguate authors to a higher level of accuracy. It collects author IDs from Scopus, PubMed, ORCID, Mendeley and the rest of the ‘PID crew’ of persistent identifiers, along with their associated metadata. It doesn’t stop there. Our AI goes deeper: it ‘reads’ all the different data created by the author and continually checks complementary information, asking questions such as ‘who are the usual co-authors?’, ‘which institution are they at, or have they been affiliated with?’ and ‘what fields of research are they writing in?’, then delving even deeper into the data to ask ‘what concepts are constantly being extracted by this author?’. This interrogation helps the AI understand and correctly ascertain that the John Smith at X University in the English department writing about Shakespeare is not the same person as the John Smith at the same university writing about nanotechnology. Dimensions builds what can be considered a ‘semantic fingerprint’ to correctly disambiguate millions of authors, untangling authors who may have multiple ORCID iDs and making a monumental task seem like child’s play.
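To picture how such a ‘semantic fingerprint’ might work: treat each author record as a set of signals (co-authors, affiliations, extracted concepts) and score the overlap between two records before deciding they belong to the same person. The sketch below is a hypothetical, much-simplified illustration of that idea, not Dimensions’ actual algorithm; the names, weights and threshold are all invented.

```python
from dataclasses import dataclass, field

@dataclass
class AuthorRecord:
    """One candidate author identity, assembled from linked metadata."""
    name: str
    coauthors: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    concepts: set = field(default_factory=set)

def jaccard(a: set, b: set) -> float:
    """Overlap between two signal sets: 0 = disjoint, 1 = identical."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def same_person(r1: AuthorRecord, r2: AuthorRecord, threshold: float = 0.4) -> bool:
    """Weighted similarity across signals; weights and threshold are
    illustrative values, not tuned ones."""
    score = (0.40 * jaccard(r1.concepts, r2.concepts)
             + 0.35 * jaccard(r1.coauthors, r2.coauthors)
             + 0.25 * jaccard(r1.affiliations, r2.affiliations))
    return score >= threshold

# The two John Smiths from the article: same name, same university,
# but disjoint concepts and co-author networks keep them apart.
shakespeare = AuthorRecord("John Smith", {"A. Jones"}, {"X University"},
                           {"Shakespeare", "early modern drama"})
nanotech = AuthorRecord("John Smith", {"B. Lee"}, {"X University"},
                        {"nanotechnology", "thin films"})
print(same_person(shakespeare, nanotech))  # False: only affiliation overlaps
```

Because the two John Smiths share a name and a university but little else, their similarity score stays below the threshold and the records are kept apart.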


While AI can really take the pain out of the process, how accurate is it? Well, think of it this way: does Google always get it right when you look for the ‘best Chinese takeaway near me’? No, but it gets pretty close. In the case of Dimensions, it will look not only at publication data, but also at patent, grant, dataset, clinical trial, altmetric attention and citation information related to that author, and connect them. AI and ML algorithms allow for the semantic enrichment that creates linked data; without them, this task would be impossible.
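As a toy picture of what ‘linked data’ means here: each research object (publication, grant, patent, clinical trial, dataset) becomes a node, and shared metadata becomes an edge, so a single paper can be placed in its wider landscape with one traversal. The identifiers and link types below are invented for illustration; this is not Dimensions’ data model.

```python
from collections import defaultdict

# Toy linked-research graph: nodes are (type, id) pairs; edges come from
# shared metadata. All identifiers and link types are hypothetical.
edges = [
    (("publication", "pub:123"), ("grant", "grant:G-9"), "funded_by"),
    (("publication", "pub:123"), ("dataset", "data:55"), "uses"),
    (("patent", "pat:777"), ("publication", "pub:123"), "cites"),
    (("clinical_trial", "trial:42"), ("publication", "pub:123"), "reports"),
]

graph = defaultdict(list)
for src, dst, rel in edges:
    graph[src].append((rel, dst))
    graph[dst].append((f"inverse_{rel}", src))  # traverse in both directions

# Everything one hop from the paper: its place in the wider landscape.
for rel, node in graph[("publication", "pub:123")]:
    print(rel, "->", node)
```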


Our friends at Ripeta use semantic analysis to help overcome another challenge: the trust and robustness of research. Transparency is vital when working with authors, and trust in authorship matters just as much. Imagine being able to run a ‘credit check’ that reports on the robustness of authors and scientific methods. The pandemic amplified the need to find, check, share and reuse data at a faster pace. Using natural language processing, Ripeta provides a quick, accurate way to assess the trustworthiness of research. Why is this important? People really do believe ‘fake news’, and too many articles have to be retracted; we have seen this in Dimensions during the pandemic. Ripeta uses the ‘magic’ of AI to get through the sheer volume of data. The process has three categories – reproducibility, professionalism and research – and looks for the following (a minimal sketch of such a check follows the list):


Reproducibility – can this paper be replicated for future research? Look for: a code availability statement, a data availability statement (DAS) and data locations. The paper should include a DAS and links to the data used.


Professionalism – are the actors behind the study reliable? Look for: an ethical approval statement, a funding statement and section heading information. The study should have more than one author, all verified through their institutions and previous works, and should include a funding statement with all pertaining information.


Research – is this actual research? Look for: a clearly stated study objective, plus detailed methods and results sections.
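To make the checklist concrete, a first-pass screening of this kind can be approximated with a few pattern checks over the manuscript text. Real systems such as Ripeta’s use trained NLP models rather than regular expressions, so the sketch below is only an illustration of the idea, and the phrase patterns are invented for the example.

```python
import re

# Hypothetical phrase patterns, one per signal in the checklist above.
# Real pipelines use trained NLP models; regexes just sketch the idea.
CHECKS = {
    "data_availability": r"data (are|is) available|data availability statement",
    "code_availability": r"code (is )?available|code availability",
    "ethical_approval":  r"ethic(s|al) (approval|committee)|institutional review board",
    "funding_statement": r"funded by|funding was provided|grant (no|number)",
    "study_objective":   r"the (aim|objective|goal) of this study",
}

def screen(manuscript: str) -> dict:
    """Flag which trust signals appear in the manuscript text."""
    return {name: bool(re.search(pattern, manuscript, re.IGNORECASE))
            for name, pattern in CHECKS.items()}

sample = """The aim of this study is to measure X.
This work was funded by the Hypothetical Science Fund (grant no. 123).
Data are available at https://example.org/dataset."""

for signal, found in screen(sample).items():
    print(f"{signal}: {'found' if found else 'missing'}")
```

On this sample the code availability and ethical approval checks come back missing, which is exactly the kind of gap such a report would surface to an author or editor before publication.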


For publishers, this technology creates efficiencies and adds value for new and existing authors, as well as helping to ensure preprint content is of high value (this content is often cited and shared through social networks). Institutions can improve the quality of their manuscripts, which is especially important for early career researchers; and researchers themselves will know what to report to grab publishers’ attention and increase their likelihood of being cited. Have I convinced you how important AI and ML are for semantic enrichment and scholarly communications? Without ML the work would be long, tedious and exhausting, filled with human bias and prone to error. Our industry must employ AI in its workflows if we want to sustain high-quality scientific dissemination and trust in academia. The way we turn information into knowledge affects our lives. Let’s embrace it, trust it and strive to continually improve it. Only we can do it well!


Manisha Bolina is product solutions sales manager, EMEA, at Dimensions



