Wednesday, February 1, 2012

Evolution of the Storage Brain: Notes (Part 1)

Chapter 1. And Then There Was Disk

Since they first came to market in the late 50s and 60s, disk drives have relied on the following core components:

  • Read/write heads that use electrical impulses to store and retrieve magnetically recorded bits of data
  • Magnetically coated disk platters that spin and house these bits 
  • Mechanical actuator arms that move the heads back and forth across the spinning disk platters, forming concentric ‘tracks’ of recorded data
The Past: Disk Drives Prior to 1985

The Early Days of Disk Drive Communications
  • Bus-and-Tag Systems
    • Connected via two copper-wire cables, Bus and Tag: Bus carried data, Tag carried communication protocols
  • SMD Disk Drives
    • Control Data Corporation first shipped its minicomputer with storage module device (SMD) disk drives in late 1973
    • Used much smaller “A” and “B” flat cables to transfer control instructions (over the A cable) and data (over the B cable)
    • The disk controller shrank down to a single board which was inserted into the system’s CPU card cage
The Middle Ages: Disk Drives in the 80s-90s

Disk drives in the late 80s and 90s went through a number of significant transformations that allowed them to be widely used in the emerging open systems world of servers and personal computers. These included:

  • 19” disk drives => less expensive 5.25” (and, eventually, 3.5”) drives.
  • Advances associated with redundant arrays of independent disks (RAID) technology. 
  • The development and widespread adoption of the Small Computer Systems Interface (SCSI). 

Disk drives produced today fall into four categories, depending on their cable connections
  • SCSI
  • Fibre Channel
  • Serial ATA (SATA)
  • Serial Attached SCSI (SAS)
SCSI used a single data cable to present its Common Command Set (CCS) interface with built-in intelligence.

Today: Disk Drives, Pork Bellies and Price Tags

Are Disk Drives in Our Future?

Today’s latest battle cry is that solid state disks (SSDs) will completely replace magnetic disk storage.

Research into higher capacities for magnetic disk drives:

  • Perpendicular Magnetic Recording (PMR)
    • Multi-layered storage: what was originally flat (planar) becomes 3D
  • Patterned Media Recording
  • Heat-Assisted Magnetic Recording (HAMR)
    • relies on first heating the media so that it can store smaller bits, and therefore more data per square inch
    • PMR appears to be winning the short-term race
  • Nanostorage
Chapter 2. “Oh, @#$%!” 

Data protection is addressed in two separate chapters
  • Protecting against disk drive failure
    • users can continue to access data previously stored on failed disks
  • Protecting against data loss or corruption
    • moves more deeply into storage software intelligence

The Past: Protecting SLEDs

When a drive crashed, data was recovered from tape and restored back to a new disk

The Middle Ages: RAID in the 80s

1987 paper 
  • titled “A Case for Redundant Arrays of Inexpensive Disks.” RAID was born.

    • provide greater efficiency and faster I/O performance
    • successfully survive a failure of any one disk drive
    • described five different RAID methods (RAID 1 through RAID 5)

    RAID Techniques
    • No Parity
      • RAID 0
        • increases I/O performance by striping (or logically distributing) data across several disk drives
        • offers no protection against failed disk drives
      • RAID 1
        • data is mirrored onto a second set of disks
        • exacts a high capacity penalty
    • Fixed Parity
      • RAID 3 / RAID 4
        • both use parity calculations (sometimes known as checksum) to perform error-checking
        • allow recovery of missing data from failed drives
    • Striped Parity
      • RAID 5
        • parity is striped (or logically distributed) across all disks in the RAID set in an attempt to boost RAID read/write performance
    • Multiple Parity
      • RAID 6
        • uses multiple iterations of fixed or striped parity on a group of drives, which allows for multiple drive failures without data loss
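
The parity idea behind RAID 3/4/5 can be sketched in a few lines of Python. This is only an illustration, assuming byte-wise XOR parity over equal-sized blocks; the function names (`xor_blocks`, `make_stripe`, `rebuild`, `parity_disk`) and the rotation rule are hypothetical and not taken from the book or from any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, failed_index):
    """Recover the block at failed_index by XOR-ing all surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


def parity_disk(stripe_number, disk_count):
    """Hypothetical RAID 5 style rotation: parity moves one disk left per stripe."""
    return (disk_count - 1 - stripe_number) % disk_count


# Three data "disks" plus one parity "disk"; lose disk 1 and rebuild it.
stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
assert rebuild(stripe, 1) == b"BBBB"
```

With a fixed parity disk (RAID 3/4) the parity block always lands on the same drive. With striped parity (RAID 5) a rule such as `parity_disk` spreads it across the set, so no single drive takes every parity write.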

    The Future: Smarter, Self-Healing Disk Drives

    S.M.A.R.T. technology

    Chapter 3. Virus? What Virus? 

    Approaches to Data Loss or Corruption

    • Data replication
      • Mirrors critical data to an alternate location
    • Data backup
      • Restores data that was accidentally deleted, or recovers an earlier version of the data

    The Past: The Tale of the Tape

    Today’s Backup Tapes

    Decades of “format wars” ensued amongst vendors fighting for market share. Sample formats from this era included:

    • Quarter-Inch Cartridge (QIC)
    • 4mm or 8mm Tape
    • Digital Linear Tape (DLT)
    • Advanced Intelligent Tape (AIT)
    • Linear Tape Open (LTO)

    LTO-4 has become the reigning tape format today.

    The Middle Ages: Early Tape Backup Automation

    Tape-Based Innovations:
    • improved tape backup speeds by allowing backups to be written to multiple tapes concurrently
    • writing to several tapes in parallel
    Synthetic Backup
    • required just a single full backup and used an intermediate database to track and map the location of the continuous incremental backups performed to tape thereafter
    • also pioneered by Tivoli Storage Manager (TSM)
    • the tape reclamation process solved a problem created by synthetic tape backups
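
The "intermediate database" used by synthetic backup can be pictured as a catalog that maps each file to the tape holding its newest copy. The sketch below is an assumption-laden illustration, not how TSM or any real product stores its catalog: the dictionary layout, the use of `None` to mark a deletion, and the names `build_catalog` and `tapes_in_use` are all hypothetical.

```python
def build_catalog(full_backup, incrementals):
    """Merge one full backup with later incrementals (oldest first).

    Each backup is a dict of path -> (tape_id, offset). In an incremental,
    a value of None marks a file deleted since the previous backup.
    """
    catalog = dict(full_backup)
    for incremental in incrementals:
        for path, location in incremental.items():
            if location is None:
                catalog.pop(path, None)   # file no longer exists
            else:
                catalog[path] = location  # newer copy supersedes the old one
    return catalog


def tapes_in_use(catalog):
    """Tapes that still hold at least one current file version."""
    return {tape_id for tape_id, _offset in catalog.values()}


full = {"a.txt": ("tape1", 0), "b.txt": ("tape1", 1)}
inc1 = {"b.txt": ("tape2", 0)}
inc2 = {"a.txt": None, "c.txt": ("tape3", 0)}

catalog = build_catalog(full, [inc1, inc2])
assert catalog == {"b.txt": ("tape2", 0), "c.txt": ("tape3", 0)}
```

In this toy example `tape1` ends up holding only superseded data. Stale data accumulating on older tapes is the kind of side effect a reclamation process has to clean up.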
    Disk-Based Innovations
    Disk Staging
    • data stored on optical media that could be moved by a staging manager to magnetic disk drives
    • could be used to send the data to tape without affecting production workloads
    Disk-to-disk (D2D) without tape

    The Modern Age: Emerging D2D Efficiencies

     A Snapshot is Not a Backup

    NetApp explains this space-saving functionality as follows
      • We are able to create a snapshot in constant time because we have a map of the blocks that are allocated on disk. A snapshot is really just a copy of the block map rather than the actual disk blocks
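
The "copy of the block map" idea can be sketched with a toy volume that never overwrites a block in place. This is a minimal illustration under stated assumptions, not NetApp's implementation: the `Volume` class and its methods are hypothetical, and copying a Python dict takes time proportional to the size of the map rather than constant time. The point is only that no data blocks are copied.

```python
class Volume:
    """Toy volume where every write goes to a fresh physical block."""

    def __init__(self):
        self.blocks = {}       # physical block id -> data
        self.block_map = {}    # logical block number -> physical block id
        self.snapshots = {}    # snapshot name -> frozen copy of a block map
        self._next_id = 0

    def write(self, logical, data):
        # Allocate a new physical block instead of overwriting the old one,
        # so blocks referenced by a snapshot stay intact.
        physical = self._next_id
        self._next_id += 1
        self.blocks[physical] = data
        self.block_map[logical] = physical

    def snapshot(self, name):
        # Only the map is copied; the data blocks themselves are shared.
        self.snapshots[name] = dict(self.block_map)

    def read(self, logical, snapshot=None):
        block_map = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.blocks[block_map[logical]]


vol = Volume()
vol.write(0, b"old")
vol.snapshot("s1")
vol.write(0, b"new")
assert vol.read(0) == b"new"
assert vol.read(0, snapshot="s1") == b"old"
```

Because the snapshot and the live volume share the same physical blocks, damage to those blocks affects both. This is why a local snapshot on its own is not a backup.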

      Use of local snapshots alone, however, still exposes the data to other corruption risks, such as

      • Widespread data corruption of the primary data set
      • Hardware failure impacting the data stored within
      How SnapVault Works:
      • The SnapVault “primary” system needing data protection.
      • The SnapVault “secondary” system where backup data is stored.
      1. SnapVault initially stores one “full” backup of the primary system’s data set on the secondary system
      2. It then builds on NetApp Snapshot efficiencies by quickly transmitting only the changed blocks found in the most recent snapshot of the primary system
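
The two steps above can be sketched as a comparison of two snapshot block maps. This is only an illustration of the changed-blocks idea, assuming each snapshot is a dict of logical block number to physical block id; `changed_blocks` and `incremental_transfer` are hypothetical names and do not reflect SnapVault's actual protocol or on-disk format.

```python
def changed_blocks(previous_snapshot, current_snapshot):
    """Compare two block maps (logical block number -> physical block id)."""
    changed = {
        logical: physical
        for logical, physical in current_snapshot.items()
        if previous_snapshot.get(logical) != physical
    }
    removed = [logical for logical in previous_snapshot if logical not in current_snapshot]
    return changed, removed


def incremental_transfer(primary_blocks, previous_snapshot, current_snapshot, secondary):
    """Send only blocks that differ between snapshots; return how many were sent."""
    changed, removed = changed_blocks(previous_snapshot, current_snapshot)
    for logical, physical in changed.items():
        secondary[logical] = primary_blocks[physical]
    for logical in removed:
        secondary.pop(logical, None)
    return len(changed)


# Physical blocks on the primary, and two successive snapshot block maps.
primary_blocks = {0: b"alpha", 1: b"beta", 2: b"beta-v2", 3: b"gamma"}
snap_1 = {0: 0, 1: 1}
snap_2 = {0: 0, 1: 2, 2: 3}

# Step 1: the initial "full" backup copies every block in the first snapshot.
secondary = {logical: primary_blocks[physical] for logical, physical in snap_1.items()}

# Step 2: later transfers send only what changed between snapshots.
sent = incremental_transfer(primary_blocks, snap_1, snap_2, secondary)
assert sent == 2
assert secondary == {0: b"alpha", 1: b"beta-v2", 2: b"gamma"}
```

Block 0 is never resent because both snapshots point at the same physical block, which is where the transfer savings come from.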

      The Future of Data Protection

      • The new gold standard: Annual off-site archival of data to tape
      • Tape backup will become a service in lieu of local tape libraries.
      • D2D will be managed by the storage array itself.
      • Say goodbye to VTLs