Research Paper Undergraduate 2,645 words

Last reviewed: November 19, 2007

Institutional Repositories (IR) History, Purpose, Programs, and Future

Over the last several years, institutional repositories have increased in number substantially. In a 2005 study, Lynch and Lippincott found that nearly 40% of institutions had deployed some form of institutional repository (Lynch and Lippincott, 2005). Of the universities and colleges not housing an IR, nearly 90% reported that they planned to implement one within a year.

Based on those numbers, it is clear that IRs are useful tools in institutional organizations. This paper discusses the history, purpose, programs, and future of institutional repositories, and how IRs benefit institutional environments.

First, it is important to understand what is meant by "institutional repository." An IR is a set of services designed to manage and disseminate digital materials to members of a community. The purpose of such a repository is to preserve, organize, provide access to, and distribute that information to the community. While responsibility for the information is spread among different individuals, the goal is a collaboration among librarians, IT specialists, archive managers, faculty, and university staff. The repository depends on information technology, which includes managing technological change and migrating digital content from one form of media to another as new technology is developed (Lynch, 2003).

The history of IRs dates back to 1966, when the Educational Resources Information Center (ERIC) was launched by the U.S. Department of Education's Office of Educational Research and Improvement and the National Library of Education (Suber, 2007). While ERIC was not an IR per se, it was the first attempt to provide the public with a primary resource for all government educational information. Medline was also launched in 1966, by the National Library of Medicine on a fee basis, and Agricola was launched by the National Agricultural Library in 1970 (Suber, 2007).

By 1974, the Stanford Linear Accelerator Center (SLAC) and the Deutsches Elektronen-Synchrotron (DESY) began to catalog preprint literature in physics, an effort soon joined by the Stanford Physics Information Retrieval System (SPIRES) (Suber, 2007). The project arose out of a pressing need to catalog "preprints," the preliminary copies of works selected for publication in national and international physics journals. Authors of such articles often sent preliminary copies of their work to well-known professors and institutions. By 1969, the problem was an overabundance of such works with no reliable way to catalog them (O'Connell, 2002).

To cope, Stanford University began work on a computer database system that would allow the cataloging of a nearly unlimited number of bibliographic records. Thus, SPIRES was created. SPIRES allowed researchers to search the database by author, title, report number, and date, which made information retrieval simple. In addition, SPIRES allowed each paper to carry a reference list. This reference list showed the previous papers that the work had cited within its text. It allowed a physicist to search the database for a particular paper and also see any other paper that referenced the original. This made it possible to find the most relevant and most used information quickly (O'Connell, 2002).
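The reference-list lookup described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Record` fields, the report numbers, and the function names are hypothetical and do not reflect how SPIRES was actually built. The idea shown is that inverting each paper's reference list yields a "cited by" index, so a search for one paper can also return the papers that reference it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    """One bibliographic record (hypothetical fields, not the SPIRES schema)."""
    report_number: str
    author: str
    title: str
    year: int
    references: list = field(default_factory=list)  # report numbers this paper cites


def build_cited_by(records):
    """Invert every record's reference list into a 'cited by' index."""
    cited_by = defaultdict(list)
    for rec in records:
        for ref in rec.references:
            cited_by[ref].append(rec.report_number)
    return cited_by


def search(records, **criteria):
    """Return records whose fields exactly match every given criterion."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in criteria.items())]


# Made-up sample data for illustration only.
records = [
    Record("RPT-001", "Adams", "Quark Models", 1970),
    Record("RPT-002", "Baker", "Scattering Notes", 1971, ["RPT-001"]),
    Record("RPT-003", "Clark", "Further Scattering", 1972, ["RPT-001", "RPT-002"]),
]

cited_by = build_cited_by(records)
print(cited_by["RPT-001"])                                  # papers citing RPT-001
print([r.title for r in search(records, author="Baker")])   # search by author
```

With this sample data, looking up RPT-001 lists RPT-002 and RPT-003 as citing papers, which mirrors the way a heavily referenced paper would surface quickly.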

At the same time that Stanford was working on its database, the Deutsches Elektronen-Synchrotron (DESY) in Hamburg, Germany, was working on a published list to index all the published literature and preprints it received. DESY also listed author, title, and date, but rather than a listing of references, it provided a keyword system in which each document was assigned a 23 keyword listing (O'Connell, 2002). Again, the concept was to allow researchers to search thousands of documents easily and quickly to locate those that were most relevant.

When the two institutions discovered what the other was working on, they decided to combine forces. By 1969, the specifications for converting the data already compiled by DESY into SPIRES were finished. In 1974, the SLAC and DESY libraries were working within a single database (O'Connell, 2002).

However, this method was still not perfect. While the information was now cataloged and somewhat easier to locate than before, it was still held in vast libraries, in stack after stack of material. After several years of receiving lists of new preprints and journal articles, locating a specific article or narrowing information to a single topic was still extremely difficult. Additionally, while other systems emerged in the following years, such as USENET in 1979, Hytelnet in 1990, and Gopher in 1991, the process still was not perfect (Suber, 2007).

On August 16th, 1991, however, arXiv was launched by Paul Ginsparg, representing the first true institutional repository. The arXiv is an automated distribution system for research articles without peer review. Because it is a dissemination system only, it operates at a much lower cost than peer-reviewed systems (Ginsparg, 1996). The original intent of the program was not to replace journals, which were not yet online, but to grant global access to prepublication materials. However, since established journals were slow to come online, the arXiv provided the only online access to such information for nearly five years (Ginsparg, 1996).

Since its inception, the arXiv has expanded to include archives of preprints not only in physics, but also in astronomy, mathematics, computer science, nonlinear science, quantitative biology, and statistics. The database was originally hosted at the Los Alamos National Laboratory, but has since moved to Cornell University, with mirror servers located throughout the world. The arXiv's articles became known as "e-prints," a term commonly used today, and the surge of interest in online scientific publishing that it created was termed the "open access movement" (Ginsparg, 1996).

Following the implementation of arXiv, numerous other discipline-based repositories came into existence, such as PubMed for biomedical and life sciences, EconPapers for economic working papers, and CogPrints for cognitive psychology papers (Jones et al., 2006). However, as noted, these repositories were discipline-based rather than general, and did not meet the broader needs of the research community. As a result, there was increased demand for another level of repository.

Academics were somewhat reluctant to place their work in repositories that may not continue to exist, or those that were limited to specific disciplines. In order to maintain a vast repository, academics looked to institutions to provide a trusted repository for staff to deposit research materials.

In 1999, the "Open Archives Initiative" was created, and it was a major step toward what have become institutional repositories. The Open Archives Initiative, or OAI, was introduced to solve two basic problems with IRs at the time: usability and the sharing of data. Prior to the OAI, the various IRs that existed used different search interfaces, meaning that users were forced to learn numerous systems in order to locate resources effectively. Also, there was no machine-based way to share the metadata within the repositories. To solve these issues, a meeting was held in Santa Fe, New Mexico, in October of 1999. Attending the meeting were some of the leading technical experts in the field of IRs, including Paul Ginsparg of the arXiv, Herbert Van de Sompel of the Los Alamos National Laboratory, and other technical advisors (Sompel, 2000).

The experts created the concept of a universal system for self-archived literature, originally called the Universal Preprint Service, or UPS. This system would be a free layer of scholarly information that anyone could use for research purposes.

The prototype that evolved from the meeting used the Networked Computer Science Technical Reference Library design, or NCSTRL, and the Dienst protocol. The NCSTRL provided access to a number of separate repositories, while the Dienst protocol allowed a number of end-user services to be supported across several IRs. The end result was a product that allowed cross-archive digital library services using a single query language and set of search capabilities (Sompel, 2000).

The UPS (later named the OAI) identified two basic roles for the institutional repository: the Data Provider and the Service Provider. The data provider deposits and publishes resources in the repository. Service providers, on the other hand, harvest the metadata in order to provide one or more services across the entirety of the data (Sompel, 2000).
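The division of labor between the two roles can be sketched in a few lines of Python. This is an in-memory illustration only and does not implement the actual OAI harvesting protocol; the class names, method names, archive names, and sample records are all assumptions made for the example. Each data provider holds its own deposits and exposes metadata, and a service provider harvests that metadata from several providers to offer a single search across them.

```python
class DataProvider:
    """A repository that accepts deposits and exposes metadata (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self._records = []

    def deposit(self, identifier, title, author):
        """Store the metadata for one deposited resource."""
        self._records.append({"id": identifier, "title": title, "author": author})

    def list_records(self):
        """Expose copies of the metadata; the resources stay with the repository."""
        return [dict(record) for record in self._records]


class ServiceProvider:
    """Harvests metadata from many data providers and searches across all of it."""

    def __init__(self):
        self._index = []

    def harvest(self, provider):
        """Copy a provider's metadata into the shared index, tagged by source."""
        for record in provider.list_records():
            record["source"] = provider.name
            self._index.append(record)

    def search_title(self, term):
        """Case-insensitive substring search over harvested titles."""
        term = term.lower()
        return [r for r in self._index if term in r["title"].lower()]


# Made-up repositories and records for illustration only.
physics = DataProvider("physics-archive")
physics.deposit("phys:1", "Lattice Gauge Theory Notes", "Adams")
econ = DataProvider("econ-archive")
econ.deposit("econ:1", "Notes on Market Equilibrium", "Baker")

service = ServiceProvider()
for provider in (physics, econ):
    service.harvest(provider)

hits = service.search_title("notes")
print([(hit["source"], hit["id"]) for hit in hits])
```

One query to the service provider returns matches from both repositories, so the user does not need to learn each repository's own search interface.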

In 2001, EPrints was launched by the University of Southampton, becoming the first OAI-compliant repository software in the world. The program offers what IRs are intended to offer: free, immediate, and permanent access to full-text research articles for anyone in the world. The data includes research literature, scientific data, student theses, projects, artifacts in a variety of media formats, teaching materials, collections of scholarly materials, digital records of archived data, and performance information. In short, it is an all-encompassing software platform for all forms of data (Eprints, 2007).

In 2002, another major institutional repository software project, DSpace, was launched by the Massachusetts Institute of Technology. Another OAI-compliant platform, DSpace captures digital research material in nearly any format directly from the creators of the material, and distributes it over the Internet for fast, reliable, trustworthy searching. When a submitter submits information, it is transformed into a data file or bitstream, which is organized into a similar data set. The item is then grouped with similar items into metadata, and is indexed for searching. Communities are then created that correspond to specific parts of an organization, such as departments. This allows DSpace to function well in multi-disciplinary environments. The end user is then able to browse, search, and locate information from a vast number of sources, and can also download or view the material (See Appendix 1 for flowchart) (Dynamic Diagrams, 2007).

These two repository platforms have seen a large jump in usage over the last several years, according to a study by Lomangino (2006). The study showed that usage of EPrints rose from 125 repositories to over 200 between 2004 and 2005. According to the Registry of Open Access Repositories (2007), there are currently 227 known repositories using EPrints, and another 258 using DSpace (See Appendix 2).

Additionally, there are 617 repositories that currently comply with OAI standards all over the world (See Appendix 3) (Lomangino, 2006).

Also in 2002, the IR community received another boost in a paper published by the Scholarly Publishing and Academic Resources Coalition that supported the use of institutional repositories (SPARC, 2002). In addition to offering several supportive arguments for the use of repositories, SPARC gave two convincing rationales for the use of institutional repositories that may have helped speed their implementation and acceptance. First, SPARC pointed out that IRs:

centralize, preserve, and make accessible...intellectual capital [while] forming part of a global system of distributed, interoperable repositories that provide(s) the foundation for a new disaggregated model of scholarly publishing (SPARC, 2002, pg. 6).

In other words, SPARC noted the need for a system that would function worldwide and would provide a base for all repositories, so that end users could have a seamless experience across disciplines. They believed the IR would "un-bundle" the formal academic publishing model and open the market to a wider audience (SPARC, 2002).

Further, SPARC noted that IRs could serve as indicators of an institution's academic excellence. They pointed out that while faculty publication in journals reflected the host university's excellence, IRs could concentrate the intellectual property of such institutions, creating a combined database that reflected the value of their research. The increase in visibility could show a higher level of academic prowess, SPARC noted, and it could also translate into concrete remuneration, such as increases in funding from private and public sources (SPARC, 2002).

However, SPARC also pointed out that there were barriers to the implementation of IRs. First, any alteration of the scholarly publishing model, that of peer-reviewed journal publication only, could certainly cause those within the system, such as publishers, faculty, and librarians, to fear the new IR format. There was no question that large journal publishers could easily halt the IR process, since their subscriptions made their publication possible. To soothe this concern, however, they also noted that the library market as well as authors were dissatisfied with the system as it was, and were demanding a new format. IRs, according to SPARC, were the solution for all members of the community (SPARC, 2002).

Cite This Paper
PaperDue. (2007). Institutional Repositories (IR) History, Purpose,. PaperDue. https://www.paperdue.com/essay/institutional-repositories-ir-history-34181
