Indiana University's Geographically Distributed HPSS
In September 2000, Indiana University, working with IBM, put into production the world's first geographically distributed High Performance Storage System (HPSS). Its hardware and software components are distributed across two Indiana University campuses (Bloomington and Indianapolis, some 50 miles apart), so users on either campus can read or write data locally.
HPSS is a high-end storage software system that allows users to store massive amounts of data on a hierarchy of disks and tapes. It was developed by a consortium of government laboratories (including Los Alamos National Laboratory and Sandia National Laboratories) and IBM Global Services - Federal, located in Clear Lake, Houston. Indiana University joined the consortium in 1998 and has emerged as a national leader in using HPSS to provide a ubiquitous massive data storage service for campus researchers.
The system was implemented in stages. The core HPSS infrastructure was built first at Indiana University's Bloomington campus, where it went into production in August 1999. The Bloomington-based core HPSS was then made available across a wide area network (WAN) to users on the Indianapolis campus, fifty miles to the north. In summer 2000, remote HPSS disk and tape mover components were added to the core HPSS at Indiana University's IUPUI (Indiana University-Purdue University Indianapolis) campus. The IUPUI system went into production in September 2000. In late 2001, an additional tape library and movers were installed at IUPUI as part of the Indiana Genomics Initiative. [The Indiana Genomics Initiative (INGEN) of Indiana University is supported in part by Lilly Endowment Inc.]
Before the remote movers were installed, all IUPUI user data had to cross the WAN to or from Bloomington. With movers located on the IUPUI campus, only the IUPUI users' requests for data are sent to Bloomington, which generates only scant metadata traffic between the two campuses. Once the metadata records have been located on the core HPSS servers, the actual user data, resident on disks and tapes at IUPUI, flows directly to the user over the IUPUI local area network. This makes highly efficient use of the limited inter-campus bandwidth and provides an attractive model for placing data movers at distant sites that are connected by slow links.
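The effect on inter-campus traffic can be illustrated with a small Python model. This is only a sketch under stated assumptions, not HPSS code: the names Transfer and wan_bytes and the per-request metadata size are hypothetical and chosen purely for illustration. It counts the bytes that would cross the WAN for a single transfer, given where the user, the core servers, and the mover holding the data are located.

```python
from dataclasses import dataclass

# Hypothetical figure: bytes of metadata exchanged with the core servers
# per request. The real amount is not stated in this article.
METADATA_BYTES_PER_REQUEST = 4 * 1024


@dataclass
class Transfer:
    user_campus: str  # campus where the user sits, e.g. "IUPUI"
    file_bytes: int   # size of the file being read or written


def wan_bytes(transfer: Transfer, core_campus: str, mover_campus: str) -> int:
    """Estimate the bytes that cross the inter-campus WAN for one transfer."""
    total = 0
    # The request and its metadata always go to the core servers, so they
    # cross the WAN when the user is not on the core campus.
    if transfer.user_campus != core_campus:
        total += METADATA_BYTES_PER_REQUEST
    # User data flows between the mover and the user, so it crosses the WAN
    # only when the two are on different campuses.
    if transfer.user_campus != mover_campus:
        total += transfer.file_bytes
    return total


one_gb_read = Transfer(user_campus="IUPUI", file_bytes=10**9)
# Before the remote movers: the data lives on Bloomington disks and tapes.
before = wan_bytes(one_gb_read, core_campus="Bloomington", mover_campus="Bloomington")
# After the remote movers: the data lives on IUPUI disks and tapes.
after = wan_bytes(one_gb_read, core_campus="Bloomington", mover_campus="IUPUI")
print(before, after)
```

In this model, a 1 GB read by an IUPUI user sends the whole file across the WAN when the mover is in Bloomington, but only the assumed few kilobytes of metadata when the mover is at IUPUI.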