Minutes from the DAWG-I February 11, 2003 meeting

Anne Butman [AB] (convener), Judy Gardner [JG], Michael Giarlo [MG] (recorder), Nick Gonzaga [NG], Dave Hoover [DH]

Agenda:

  1. Discussion of IBM's StorageTank technology as a digital repository solution
  2. Presentation of enterprise mass storage solutions by ADIC sales team

Attachment

A quick and dirty schematic I drew of a simple SAN setup

Action Items

  1. AB will contact IBM and have a representative come discuss ST (functions, price, configuration options, differences from ADIC's StorNext software) with the DAWG-I subgroup so we can learn more about the product.
  2. All will think further about and discuss specific storage needs in response to ADIC's questions, such as how much of the RUL digital repository will need to be immediately accessible, and how we expect it to grow over a 5-year span.

Meeting convened at 2:30pm.

  1. IBM's StorageTank (ST) technology -- 2:30pm -> 3:30pm
    1. Will remote, automatic replication be possible using ST, as it would be with the EMC Centera solution? Based on the ST documentation we have found thus far, ST seems to function as part of a SAN environment. By its nature, a SAN does not span WAN links; it is more of a LAN-like technology. According to what we have read so far, ST does not seem to support remote, automatic replication.
    2. AB posed a question about the Snapshot function of the ST software. The documentation is a bit convoluted, and seems to suggest that multiple copies of files may be kept, which would be an administrative nightmare when filesystem maintenance needs to be performed.
    3. Unlike the EMC Centera technology, ST enables easy deletion of files stored within (which would be especially useful when we use the mass storage solution as backup, so we can intelligently delete old backups based on whatever backup policies we devise).
    4. The group discussed at a general, high level the pros and cons of using EMC versus those of using ST in a SAN environment:
      • EMC Centera is proprietary, and a 'black box' to us. SANs are standardized, set up with well-known protocols, hardware, and network technologies, and can be upgraded or added to using a wide variety of vendors.
      • EMC Centera is a horribly expensive solution. SANs aren't inexpensive either, but should cost much less than a Centera.
      • EMC Centera requires minimal time and effort to set up, since EMC handles setup and maintenance. SANs are quite complex and take time and expertise to set up and maintain.
      • EMC Centera handles remote, automatic replication natively, though you pay extra for this software. SANs will use off-site backups to accomplish the same thing, requiring tape-swapping and paying for an off-site tape storage service like the one Systems currently uses.
      • EMC Centera integration with existing applications and operating systems requires installing a client and/or potentially writing to the API. Files on a SAN, however, appear native to the servers attached to it. (More on this in the ADIC discussion below.)
      • There are more, but these are all the ones I have written down and can remember at this time.
    5. The group concluded that a digital repository solution running ST on a SAN -is- most definitely a viable option.
  2. ADIC presentation and discussion -- 3:30pm -> 5:15pm
    1. The ADIC solution, simplified, goes like this: we purchase everything through them, including a Fibre Channel (FC) switch, FC host bus adapters (HBAs), SAN administration software ("StorNext"), tape library, mass disk storage, setup, and SAN training/documentation. That is, they handle not only tapes or disks, but also setup of SANs. They help us get a SAN up and running, but we are -not- required to use them in the future for any SAN upgrades.
    2. With the ADIC/SAN solution, mass disk storage could be shared among all servers connected to the SAN, rather than carved up into LUNs and assigned individually. Additionally, this storage can be shared by Windows, UNIX, Linux, etc. without requiring any special setup. The same files and folders could be seen from Windows and Linux without the need to write to an API. (I.e., if sallie is attached to the SAN, DSpace and Fedora can store their files, and even the applications themselves, on the mass storage of a SAN without any special configuration!)
    3. The ADIC/SAN solution solves backup needs through the StorNext software, which makes the SAN appear as a single resource to the servers on the SAN. When a server stores a file to the SAN -- e.g. a user on a workstation stores a file on a mapped network drive, which is a logical connection to the SAN via the server which provides said mapped drive -- the file is placed on the mass disk storage so that it may be immediately retrieved by any clients making requests for it. An automatic backup to the tape library is then performed, and potentially another backup to tapes designated for "off-site" which are ejected nightly (for instance). This method is one of many ways we can use the StorNext software, since it supports robust policies, allowing us to handle storage and backup how we want, when we want. Under this method, though, nightly backups to tape are unnecessary, since all files are immediately backed up to tape!
    4. Additionally, we can use the SAN for standard backups since most existing backup software will see it with no problem.
    5. Assuming this configuration and the need to have 5 TB, ADIC recommends against buying 5 TB of disk storage, since 1) it's expensive, 2) it does not account for a backup solution (e.g. tape), and 3) much of that 5 TB may not need to be immediately accessible. For instance, we may need to store 2 TB of original TIFs, but these files are basically stored and forgotten about. Why store that 2 TB on expensive disk? The usual argument against tape is that it is too slow and requires manual swapping of tapes. With ADIC's tape libraries, however, files can be retrieved off tape in under a minute, and swapping of tapes is controlled by a robotic process much like in a jukebox. We would ultimately control (via StorNext policies) which files stay on disk and which live exclusively on tape, and could raise our disk or tape capacities whenever we needed to, so this isn't a bad recommendation on ADIC's part.
    6. In order to discuss price and further configuration options, we should meet with ADIC again in the near future after discussing the issue above: namely, how much of the 5-10 TB of storage do we want IMMEDIATELY accessible? How much of it will be accessed by end-users via the web, and how much will be accessed only by staff members (who presumably can deal with a 30-second wait every now and again)? Once we can answer these questions, we can get a price quote from ADIC and a better understanding of what their solution will be in our environment.
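
To make the sizing question in items 5 and 6 concrete, here is a rough Python sketch of how we might tally disk versus tape needs once we have classified our collections. It is only an illustration: it does not use StorNext or any ADIC API, the names `Collection` and `plan_tiers` are hypothetical, and the "web derivatives" figure is a made-up placeholder (only the 2 TB of original TIFs comes from the discussion). It assumes that anything served to end-users via the web stays on disk, everything else lives on tape only, and every file also gets a tape copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    """A group of files with one access pattern (hypothetical model)."""
    name: str
    size_tb: float
    web_accessible: bool  # True if end-users reach it via the web


def plan_tiers(collections):
    """Split collections into disk-resident and tape-only totals.

    Assumption: web-accessible files must be immediately retrievable, so
    they stay on disk; staff-only files can tolerate a tape retrieval
    delay. All files are also copied to tape as backup.
    """
    disk_tb = sum(c.size_tb for c in collections if c.web_accessible)
    tape_only_tb = sum(c.size_tb for c in collections if not c.web_accessible)
    return {
        "disk_tb": disk_tb,                      # capacity of disk to buy
        "tape_only_tb": tape_only_tb,            # data never kept on disk
        "tape_total_tb": disk_tb + tape_only_tb, # one tape copy of everything
    }


collections = [
    Collection("original TIFs", 2.0, web_accessible=False),
    # Placeholder figure, not from the meeting:
    Collection("web derivatives", 3.0, web_accessible=True),
]
plan = plan_tiers(collections)
print(plan)
```

Swapping in our real collection sizes and access classes would give us the numbers ADIC asked for before they can quote a price.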

Meeting adjourned at 5:15pm.

URL: http://www.libraries.rutgers.edu/rul/staff/groups/dig_infrastructure/minutes/dawg-i_03_02_11.shtml