JSTOR (short for Journal Storage) is a digital library founded in 1995. Originally containing digitized back issues of academic journals, it now also includes books, primary sources, and current issues of journals. It provides full-text search of more than a thousand journals. More than 7,000 institutions in more than 150 countries have access to JSTOR. Most access is by subscription, but some older public domain content is freely available to anyone, and in 2012 JSTOR launched a program providing limited no-cost access to older articles for individual scholars and researchers who register.

JSTOR’s founder was William G. Bowen, president of Princeton University from 1972 to 1988. JSTOR was originally conceived as a solution to a problem faced by libraries, especially research and university libraries: the increasing number of academic journals in existence. Most libraries found it prohibitively expensive, in both cost and space, to maintain a comprehensive collection of journals. By digitizing many journal titles, JSTOR allowed libraries to outsource the storage of these journals with the confidence that they would remain available for the long term. Online access and full-text search improved access dramatically.

With the success of this limited project, Bowen and Kevin Guthrie, then president of JSTOR, were interested in expanding the number of participating journals. They met with representatives of the Royal Society of London, and an agreement was made to digitize the Philosophical Transactions of the Royal Society back to its beginning in 1665. The work of adding these volumes to JSTOR was completed by 2000. JSTOR was originally funded by the Andrew W. Mellon Foundation and, until 2009, was an independent, self-sustaining not-for-profit organization with offices in New York City and Ann Arbor, Michigan. In 2009, JSTOR merged with ITHAKA, becoming part of that organization. ITHAKA is a non-profit organization founded in 2003 ‘dedicated to helping the academic community take full advantage of rapidly advancing information and networking technologies.’

In addition to the main site, JSTOR’s labs group operates Data for Research, an open service that allows access to the contents of the archives for the purposes of corpus analysis. The site offers a search facility with a graphical indication of article coverage and loose integration into the main JSTOR site. Users can create focused sets of articles and then request a dataset containing word and n-gram frequencies and basic metadata. They are notified when the dataset is ready and can download it in either XML or CSV format. The service does not offer full text, though academics can request it from JSTOR subject to a non-disclosure agreement.
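As an illustration of what word and n-gram frequencies are, the following minimal Python sketch counts n-grams in a short sample text and writes them out as CSV. It is not JSTOR's implementation and does not reproduce the actual Data for Research file layout; the function names, the tokenization rule, the `ngram` and `count` column names, and the sample sentence are all assumptions made for the example.

```python
from collections import Counter
import csv
import io
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


def counts_to_csv(counts):
    """Serialize counts as CSV, most frequent first, ties in alphabetical order."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["ngram", "count"])  # hypothetical column names
    for gram, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([gram, count])
    return buf.getvalue()


# Hypothetical sample text standing in for an article.
sample = "The library stores journals. The library lends journals."
tokens = tokenize(sample)
bigrams = ngram_counts(tokens, 2)
print(counts_to_csv(bigrams))
```

With `n` set to 1 the same function yields plain word frequencies; larger values of `n` give longer phrases, at the cost of sparser counts.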

In late 2010 and early 2011, Internet activist Aaron Swartz used MIT’s data network to bulk-download a substantial portion of JSTOR’s collection of millions of academic journal articles. When the downloading was discovered, JSTOR stopped it and identified Swartz. Rather than pursue a civil lawsuit against him, JSTOR reached a settlement in June 2011 wherein he surrendered the downloaded data. The following month, federal authorities charged Swartz with several data-theft-related crimes, including wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer. Prosecutors in the case claimed that Swartz acted with the intention of making the papers available on P2P file-sharing sites. Swartz surrendered to authorities, pleaded not guilty to all counts, and was released on $100,000 bail. The case was still pending when Swartz committed suicide in January 2013.

Beginning in September 2011, JSTOR made public domain content freely available to the public. This ‘Early Journal Content’ program covers about 6% of JSTOR’s total content and includes over 500,000 documents from over 200 journals that were published before 1923 in the United States and before 1870 in other countries. JSTOR stated that it had been working on making this material free for some time, but that the Swartz controversy and Greg Maxwell’s protest torrent of some of the same content led JSTOR to ‘press ahead’ with the initiative.

