Sponsor’s viewpoint
Combining institutional repositories and artificial intelligence
AI in academia is poised to drive sustained growth in research and innovation, writes Sally Ekanayaka
Two million and counting – that is the average number of pieces of scholarly content published per year.
One of the major challenges for institutions managing data repositories is the continuous obligation to control, manage and structure scholarly content metadata well enough to guarantee meaningful use of the collected information; in other words, to make it FAIR (Findable, Accessible, Interoperable and Reusable). Whether because of inconsistency in meeting these standards, slow adoption of data repositories by researchers, or simply poor design, institutional repositories remain an underexploited resource.
In his paper Institutional Repositories: Evaluating the reasons for non-use of Cornell University’s installation of DSpace, Dr Philip M Davis said: ‘Cornell’s DSpace is largely underpopulated and underused by its faculty. Its complex organisation is seen at comparable institutions, but may discourage contributions to DSpace by making it appear empty.’1
Professor Patricia Hartman, of Auburn University, shares this view of existing systems: ‘Institutional repositories (IRs) remain a woefully underutilised resource at many universities and libraries. Although many faculty members agree that IRs are a good idea in principle, achieving actual follow-through and adoption is much more difficult. Some view depositing articles as yet another time-consuming obligation.’2
Academia is increasingly recognising the role that artificial intelligence (AI) and machine learning (ML) technologies can play in making data repository platforms more efficient, while addressing existing challenges and limitations.
40 Challenges in the Scholarly Publishing Cycle 2020/2021
Simplification of complex processes
‘Manual curation is a cornerstone of public biological data resources. However, it is a time-consuming process that urgently needs supportive technical solutions in the face of rapid data growth,’3 observed Aravind Venkatesan, senior data scientist at the European Bioinformatics Institute. Mundane, repetitive and error-prone curation processes not only act as a barrier to the successful and permanent digital preservation of works, but also discourage the population of institutional repositories. Pioneering service providers, such as MyScienceWork, allow data repositories, such as Polaris OS, to capitalise on advanced AI algorithms to automate curation tasks, from the point of initial deposit (with automated deposit workflows, pre-filling services and PDF metadata retrieval) through to the subsequent presentation of data in a monitored and well-structured way, with minimal supervision and support.
Intelligent clustering/accessibility of scholarly content
Natural language processing (NLP), acknowledged as one of the most powerful forces in the disruptive world of technology, is set to usher in a decade of change. Data repositories integrated with NLP allow end-users to find and access the most relevant content with a few keystrokes, thanks to AI technology that extracts patterns and meaning from huge volumes of text and data, thus reinforcing digital discoverability.
A bird’s eye view into the future
Institutions are collecting and processing data in unprecedented volume. Text and data mining technologies, applied well, allow decision makers in innovation to understand the evolving dynamics of their respective fields and make strategic decisions in real time.
The automated process allows institutions to select and analyse large amounts of text and data resources to discover patterns, identify relationships, make use of semantic analysis and understand how information relates to ideas and needs, in a manner that provides the meaningful insights needed for study and research.
Similar to text and data mining technologies, machine learning algorithms train institutional repositories not only to make sense of data, but also to go further and predict new datasets. Simply put, AI recognises patterns and ML adapts its analysis to changing datasets, making sense of their characteristics for present and future adaptations.
Contact yann.mahe@mysciencework.com for more information on AI-powered repositories, case study applications and demos.
References
1. Davis P.M., Institutional Repositories: Evaluating the reasons for non-use of Cornell University’s installation of DSpace. D-Lib Magazine, Vol. 13 (2007) <http://www.dlib.org/dlib/march07/davis/03davis.html>
2. Hartman P., The power of personal outreach to populate an institutional repository. 2020 Southern Miss Institutional Repository Conference, April (2020) <https://lib.usm.edu/smirc/irabstracts.html>
3. Venkatesan A., Understanding life sciences data curation practices via user research. (2019) <https://f1000research.com/articles/8-1622/v1>
Sally Ekanayaka is communications and PR manager at MyScienceWork