Storage Architectures and Options
Alan McSweeney
Objectives
• To provide high-level information on storage options and architectures for storing and managing digital camera data
• To provide indicative sample solutions
• To initiate discussions on storage configurations and options
November 26, 2009
Agenda
• Confirmation of Storage Requirements
• Data Flows and Processes
• Storage Management Architectures and Options
• Storage Management Operation, Management and Use
• Sample Solutions
Understanding of Requirements
• Storage solution to manage raw and processed map image data
• Store raw and processed data
− No requirement to store intermediate pre-processed data
• Keep 6 months' raw and processed data on primary storage
• Keep online copy of additional data
• Keep all raw and processed data indefinitely
• Size for at least 5 years
• Deliverables
− Draft data management/storage policy
− SLA options on data retrieval from non-primary storage
− Set of practical options
− Storage management policy document
Objectives of Storage Management
• Data availability to meet service level commitments even during failures, disasters, or other forms of primary data loss
• Data protection against loss and to prevent unauthorised access
• Data retention that is compliant with regulations and standards in an unalterable state, fully audited for long periods of time
• Cost-effective storage management infrastructure
Backup and Data Archival
• Backup
− Ensure efficient recoverability of data
− Does not make backup data directly available
− Optimised to bring large amounts of data back online quickly for system recovery
− Retention management at the volume level
− Not oriented to long-term management beyond life of current environment and media
• Archiving
− Copy from online environment to separately managed (secure) storage to reduce cost of storage and enforce retention
− Provides easy (ideally transparent) access for retrieval
− Optimised to write and retrieve data at file granularity
− File-level retention management
− Designed to manage data over long-term, through media migration and with access auditing and controls
− Designed to manage multiple copies of data on different media types
High Level Storage Management Architectures
• Multi-tier data storage architectures
− Primary/Secondary
− Primary/Secondary/Tertiary
− Primary/Secondary and Tertiary in parallel
− Secondary disk storage layer is purely for convenience to allow recall of data
• Advantages and disadvantages in terms of cost and service
Hierarchical Storage Management (HSM)
• HSM is a key requirement of effective (and cost-effective) storage management
• Data is migrated (moved / copied) from one storage layer to another, usually less expensive, form of storage
• A stub is created for and replaces each migrated file
− On the local system, a stub file looks and acts like a regular file
• When user action restores a file but the user does not change the file, that file is "re-stubbed" during the next migration process
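The migrate, recall and re-stub cycle described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any HSM product works internally: the `.stub` suffix, the function names and the use of plain directories for the two storage layers are all assumptions made for the example.

```python
import os
import shutil
import tempfile

STUB_SUFFIX = ".stub"  # hypothetical marker; real HSM products use filesystem-level stubs


def migrate(path: str, secondary_dir: str) -> str:
    """Move a file to secondary storage and leave a small stub in its place."""
    target = os.path.join(secondary_dir, os.path.basename(path))
    shutil.move(path, target)
    stub = path + STUB_SUFFIX
    with open(stub, "w", encoding="utf-8") as fh:
        fh.write(target)  # the stub records where the data now lives
    return stub


def recall(stub: str) -> str:
    """Copy migrated data back to primary storage; the secondary copy is kept."""
    with open(stub, encoding="utf-8") as fh:
        target = fh.read()
    path = stub[: -len(STUB_SUFFIX)]
    shutil.copy2(target, path)
    os.remove(stub)
    return path


def restub_if_unchanged(path: str, secondary_dir: str) -> bool:
    """Replace a recalled file with a stub again if it still matches the secondary copy."""
    target = os.path.join(secondary_dir, os.path.basename(path))
    with open(path, "rb") as a, open(target, "rb") as b:
        unchanged = a.read() == b.read()
    if unchanged:
        os.remove(path)
        with open(path + STUB_SUFFIX, "w", encoding="utf-8") as fh:
            fh.write(target)
    return unchanged
```

A changed file would instead need a fresh migration, which this sketch leaves out.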
Primary/Secondary
[Diagram: data migrates from Primary Storage to Secondary Storage after a defined interval]
• Primary Storage
− High speed fibrechannel disk
− Data is directly accessible
• Secondary Storage
− Offline/nearline storage
− Tape/optical media
− Retain data indefinitely
Primary/Secondary
[Diagram: data migrates from Primary Storage to Secondary Storage after a defined interval and is retrieved from Secondary to Primary]
Primary/Secondary/Tertiary
[Diagram: data migrates from Primary Storage to Secondary Storage, and from Secondary Storage to Tertiary Storage, each after a defined interval]
• Primary Storage
− High speed fibrechannel disk
− Data is directly accessible
• Secondary Storage
− High capacity ATA (SATA/FATA) disk
− Data is directly accessible
− Data resides
• Tertiary Storage
− Offline/nearline storage
− Tape/optical media
− Retain data indefinitely
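As a rough illustration of interval-based migration across three tiers, the sketch below picks a tier from the age of a data set. The diagram only says "defined interval"; the 6 and 12 month thresholds and the function names are assumptions chosen for the example.

```python
from datetime import date

# Assumed intervals: illustrative values, not taken from the diagram
PRIMARY_MONTHS = 6     # data younger than this stays on primary
SECONDARY_MONTHS = 12  # data younger than this (but past primary) sits on secondary


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed between two dates (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def tier_for(created: date, today: date) -> str:
    """Return the tier a data set should live on, based purely on its age."""
    age = months_between(created, today)
    if age < PRIMARY_MONTHS:
        return "primary"
    if age < SECONDARY_MONTHS:
        return "secondary"
    return "tertiary"
```

A real policy would usually also consider last-access time and file size, not age alone.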
Primary/Secondary/Tertiary
[Diagram: data migrates from Primary Storage to Secondary Storage to Tertiary Storage after defined intervals and is retrieved from Secondary/Tertiary to Primary]
Primary/Secondary and Tertiary in Parallel
[Diagram: data migrates from Primary Storage to Secondary Storage after a defined interval; a copy is taken to Tertiary Storage immediately]
Hardware Options
• Disk Storage
• Tape Storage - Manual or Automated
• Optical Storage - Manual or Automated
• Hybrid devices
− VTL (Virtual Tape Library)
− EMC Centera
− IBM DR550
− Storage gateways
Hardware Options - Disk
Disk - Advantages
• Speed - FC and SATA disk technologies allow the data to be housed on the appropriate disks
• SATA drive technology has matured and can lead to decreased acquisition costs
• FC and SATA can be used within the same storage system for primary and secondary data
• Storage Virtualisation
− Virtualise disk arrays within a storage system
− Virtualise storage systems within a fabric
− Thin provisioning allows over-commitment of disk, reducing acquisition costs
− Single Instance Storage (Deduplication) can be used but its effectiveness depends on the nature of the data
Hardware Options - Disk
Disk - Disadvantages
• Acquisition cost
• Disk systems do not interoperate well
• Management - multiple skill sets may be required even if all storage systems are from the same vendor
• Most hardware vendors focus on ensuring hardware resilience, data resilience is not their concern
• Operating costs - power, air conditioning, maintenance
Hardware Options — Removable Media
• Advantages
− Control of costs − Keep fixed number of media within automated library unit (could keep none)
• Disadvantages
− External media needs media management and control
• Media management is greater for smaller capacity optical disks
− Manual costs of media management
Hardware Options — Optical Storage
Optical Storage
• UDO (Ultra Density Optical)
− 60 GB media capacity
• UDO media have a 50+ year life
• UDO technology roadmap - 120 GB and 240 GB media capacities
• Main vendor - Plasmon
• Resold by other vendors: HP and IBM
• WORM media option

Model  Max Media Slots  Max Raw Capacity TB (UDO2)  Max/Min Drives  Robotics Access Time (secs)  Library Reliability (MSBF)  Redundant Power  Import/Export Slot  Bulk Load
Gx24                24                         1.4  2/1                                      7                   2,000,000  NA               Single              NA
Gx32                32                         1.9  2/1                                      7                   2,000,000  NA               Single              NA
Gx80                80                         4.8  4/2                                    7.3                   2,000,000  NA               Single              NA
Gx174              174                        10.4  6/2                                    8.3                   2,000,000  NA               Single              NA
G238               238                        14.3  12/2                                   6.2                   3,800,000  Optional         Single              10 disk
G438               438                        26.3  12/2                                   6.3                   3,800,000  Optional         Single              10 disk
G638               638                        38.3  12/2                                   6.4                   3,800,000  Optional         Single              10 disk
Optical Library and Drive Performance
• Poor performance relative to tape
• Direct access medium
• Use depends on data read (retrieval) and write volumes

Media Load Time: 5 sec
Media Unload Time: 3 sec
Average Seek Time: 35 msec
Buffer Memory: 32 MB
Max Sustained Transfer Rate - Read: 12 MB/s
Max Sustained Transfer Rate - Write: 6 MB/s (with verification)
MSBF - Mean Swap Between Failure: > 750,000 load/unload cycles
MTBF - Mean Time Between Failure: > 100,000 hours
Interface: Wide Ultra 2 LVD SCSI or USB 2.0
Single Drive/Path Tape and Optical Read and Write Performance
Times in hours

GB     Tape Read Time  Tape Write Time  Optical Read Time  Optical Write Time
100               0.2              0.2                2.3                 4.6
200               0.5              0.5                4.6                 9.3
300               0.7              0.7                6.9                13.9
400               0.9              0.9                9.3                18.5
500               1.2              1.2               11.6                23.1
600               1.4              1.4               13.9                27.8
700               1.6              1.6               16.2                32.4
800               1.9              1.9               18.5                37.0
900               2.1              2.1               20.8                41.7
1,000             2.3              2.3               23.1                46.3
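The read and write times above follow from dividing the data volume by the sustained transfer rate. The sketch below reproduces that arithmetic. The optical rates (12 MB/s read, 6 MB/s write) come from the drive specification slide; the 120 MB/s tape rate is an assumption that happens to match the tape column, and the function name is illustrative.

```python
def transfer_hours(gigabytes: float, rate_mb_per_s: float) -> float:
    """Hours needed to move `gigabytes` through a single drive at a sustained rate.

    Uses decimal units (1 GB = 1,000 MB) and ignores media load, seek and swap time.
    """
    seconds = gigabytes * 1000 / rate_mb_per_s
    return seconds / 3600


OPTICAL_READ_MB_S = 12   # from the UDO drive specification
OPTICAL_WRITE_MB_S = 6   # from the UDO drive specification (with verification)
TAPE_MB_S = 120          # assumed single-drive tape rate, uncompressed

for gb in (100, 500, 1000):
    print(
        gb,
        round(transfer_hours(gb, TAPE_MB_S), 1),
        round(transfer_hours(gb, OPTICAL_READ_MB_S), 1),
        round(transfer_hours(gb, OPTICAL_WRITE_MB_S), 1),
    )
```

Because media swaps are ignored, real optical times for multi-disk volumes would be somewhat longer than these figures.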
Hardware Options — Optical Storage
Optical - Advantages
• Reduced cost over disk
• Larger capacity media planned for the future
• Can have embedded encryption
• Long media shelf life before refresh is required
• Very reliable medium
• True WORM option
Hardware Options — Optical Storage
Optical - Disadvantages
• Low capacity
• Media must be managed offline unless multiple libraries are bought
• Low data access speed - not suited to large data volume restores
Hardware Options — Optical Storage
Optical Storage Issues
• Low medium capacity
− UDO - 60 GB currently, 120 GB and 240 GB planned
• Tape
− LTO-4 Ultrium 1840 - 800 GB uncompressed
− LTO-3 Ultrium 960 - 400 GB uncompressed
Tape and Optical Media Capacities
• Optical media capacity cumulative annual increase of c. 31%
• Tape media capacity cumulative annual increase of c. 64%
[Chart: optical and tape media capacity in GB by year, 1995 to 2013, showing past/current and future capacities]
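To put the two growth rates in perspective, a compound annual increase can be converted into an approximate doubling time. The sketch below is a simple worked calculation; the function name is illustrative and the rates are the approximate figures quoted above.

```python
import math


def years_to_double(annual_increase: float) -> float:
    """Years for capacity to double at a given compound annual increase (0.31 = 31%)."""
    return math.log(2) / math.log(1 + annual_increase)


print(round(years_to_double(0.31), 1))  # optical media
print(round(years_to_double(0.64), 1))  # tape media
```

On these figures tape capacity doubles in well under two years, while optical capacity takes more than two and a half.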
Hardware Options — Tape
Tape - Advantages
• Cost
• Very well defined road map for LTO
− LTO4 (Dec 2006) - 1.6 TB (2:1 compression) and data transfer rates of up to 240 MB/second (2:1 compression)
− LTO5 (Planned) - 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (assuming 2:1 compression)
− LTO6 (Planned) - 6.4 TB (2:1 compression) and data transfer rates of up to 540 MB/second (assuming 2:1 compression)
• High capacity media
• Designed for large data volume restore
• Multiple media can be streamed to aggregate capacity and speed
• Can have embedded encryption
Hardware Options — Tape
Tape - Disadvantages
• Media shelf life - medium
• Media long-term reliability
• Cumbersome single file restores
• Sequential access medium
Hardware Options — Tape Library
• Widely available from large number of vendors: Dell, HP, IBM, Quantum
− IBM System Storage TS3500 Tape Library
− One base frame, and up to 15 expansion frames
− Up to 12 drives per frame (up to 192 per library)
− Up to 5.5 PB with LTO 4 cartridges
− LTO Fibre Channel interface for server attachment
• Very high capacity automated data management
• Long-term data storage
VTL (Virtual Tape Library)
• Hybrid units that emulate tape libraries
• Use low cost disk (and possibly tape)
• Works with existing tape backup software
• Improved backup speeds
• No removable medium backup
• Sample products
− IBM
• IBM Virtualization Engine TS7510
• IBM Virtualization Engine TS7520
− HP
• StorageWorks Virtual Library System (VLS)
• VLS1000i
• VLS6000
IBM Virtualization Engine TS75x0
• TS7510
− 96 TB capacity at 2:1 compression
− Maximum number of virtual libraries - 128
− Maximum number of virtual drives - 1,024
− Maximum number of virtual cartridges - 8,192
− Maximum number of concurrent backups - 32
• TS7520
− 2.6 PB capacity at 2:1 compression
− Maximum number of virtual libraries - 512
− Maximum number of virtual drives - 4,096
− Maximum number of virtual cartridges - 64,000
− Maximum number of concurrent backups - 32
HP StorageWorks Virtual Library System (VLS)
• VLS1000i
− 3 TB capacity at 2:1 compression
− Maximum number of virtual libraries - 6
− Maximum number of virtual drives - 12
• VLS6000
− 105 TB capacity at 2:1 compression
− Maximum number of virtual libraries - 16
− Maximum number of virtual drives - 128
IBM DR550
• Uses multiple storage tiers (disk, tape, optical) within an archive
• Software - System Storage Archive Manager
• Two models
− DR1 - 36.88 TB raw
− DR2 - 168 TB raw
• Attached devices - support for PB capacities
− Tape systems
− Optical systems
• Awards
− Data Protection Summit - Information Lifecycle Management (ILM) - Best of Show, 2007
− AIIM (The Enterprise Content Management Association) - Best in Show, 2005, 2006
Software Options
HSM
• HSM is a principle; most products offer the same basic functionality
− Automatic migration and management of data from one medium to another
− Stubs or pointers are left in place of migrated files
− Speed of retrieval depends on the speed of the hardware to which the files have been migrated; this gives online, nearline and offline options
Software Options
Bridgehead Software
• Small company, employee owned
− Can they offer the level of service and support required when really needed
− Are they possible acquisition targets
• Ideal for mid to large customers
− Can it handle the levels of data over time
Caminosoft
• Major corporation - publicly listed and managed by SEC rules and regulations
• Primary focus is on managing file server type data
• Repackaged by vendors such as CA
Software Options
Symantec
• Major corporation
• Two products:
− NetBackup
− Enterprise Vault
• NetBackup
− HSM does not support Windows
• Enterprise Vault
− KVS staff still provide support, separate entity within Symantec
− Focus is largely on email and compliance
− Some integration with NetBackup
− Files to be migrated are collected into CAB files
− Entire CAB file recalled
− Poor support for tape as archival medium
• Recommended that you only use tape for data that is seldom or never accessed
Software Options
IBM - Tivoli
• Major corporation
• Vast R&D budgets
• Extensive knowledge within the company
• Agents and options from most major software and hardware vendors
Software Options
HP - File Archiver
• Major corporation
• Vast R&D budgets
• Extensive knowledge within the company
• "Simple Lightweight Solution" according to HP
Software Options
HSM Product
What is required from the chosen vendor / application?
• Stable and functionally bullet proof solution
• Easy to use
• Capable of handling files
• Capable of handling data volumes
• Must integrate with backup application (so that NetBackup does not initiate a restore when backing up or restoring stubs)
• Expert support knowledge
• Expert integration knowledge
− These products are dependent on hardware vendors' solutions
Data Deduplication
• Store only one copy of data
• The deduplication process should be granular
− The smaller the data block examined, the more likely it is that duplicate data will be found
• The deduplication process should be designed with minimal overhead when deduplicating (storing) and undeduplicating (retrieving) data
− Hardware better than software
• The deduplication process should provide resiliency to ensure that all data can be reliably stored and retrieved, even in the event of system failure
Data Deduplication
• Available for range of storage - hardware and software
− Symantec Enterprise Vault creates an MD5 fingerprint for every file that is archived
• If multiple files have the same hash code, only one copy of the file is physically stored
− IBM N Series has Advanced Single Instance Storage (ASIS)
• Hardware and block-based deduplication
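File-level single instancing by fingerprint can be illustrated with a small in-memory store. This is a generic sketch of the idea only; it does not represent how Enterprise Vault is implemented, and the class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Keeps one physical copy per distinct file content, keyed by MD5 fingerprint."""

    def __init__(self):
        self.blobs = {}    # fingerprint -> file content (stored once)
        self.catalog = {}  # file name -> fingerprint

    def archive(self, name: str, content: bytes) -> bool:
        """Archive a file; return True only if new content had to be stored."""
        fingerprint = hashlib.md5(content).hexdigest()
        is_new = fingerprint not in self.blobs
        if is_new:
            self.blobs[fingerprint] = content
        self.catalog[name] = fingerprint
        return is_new

    def retrieve(self, name: str) -> bytes:
        """Return the content for an archived file name."""
        return self.blobs[self.catalog[name]]
```

Archiving the same presentation under two names stores its content once, while both names remain retrievable.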
Deduplication in Action
[Diagram: four files split into 4K blocks, with identical blocks shared between files]
• Sales ed.ppt - 20 x 4K blocks
• Client.ppt - identical file - 20 blocks
• Sales ed v2.ppt - edited file - 24 blocks
• White paper.doc - different file - 10 blocks
• Without ASIS - 74 total blocks
• With ASIS - 38 total blocks
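The block counts in this example can be reproduced with a short block-level deduplication sketch. It is a generic illustration, not the ASIS implementation: the synthetic file contents, the helper names and the use of SHA-256 are assumptions, and the split of the edited file into 16 unchanged and 8 new blocks is inferred from the totals (38 - 20 - 10 = 8).

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as in the example


def split_blocks(data: bytes) -> list:
    """Split data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def block_counts(files: dict) -> tuple:
    """Return (blocks stored without deduplication, unique blocks stored with it)."""
    total = 0
    unique = set()
    for data in files.values():
        for block in split_blocks(data):
            total += 1
            unique.add(hashlib.sha256(block).digest())
    return total, len(unique)


def make_blocks(tag: str, count: int) -> bytes:
    """Build `count` synthetic 4K blocks, each with distinct content."""
    return b"".join(f"{tag}-{i}".encode().ljust(BLOCK_SIZE, b".") for i in range(count))


sales = make_blocks("sales", 20)
files = {
    "Sales ed.ppt": sales,
    "Client.ppt": sales,                                               # identical file
    "Sales ed v2.ppt": sales[:16 * BLOCK_SIZE] + make_blocks("edit", 8),  # assumed 16 shared + 8 new
    "White paper.doc": make_blocks("paper", 10),                       # different file
}
print(block_counts(files))  # (74, 38)
```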
Potential Deduplication Savings - Dependent on Data Types
[Bar chart: potential deduplication savings, 0% to 80%, by data type: Medical Imaging, Web & Microsoft Office Data, Engineering, Home Directories, Software Archive, Technical Pubs Archive, Database Backup]
Software and Solution Design Constraints and Issues
Bottom Line
• Produce a realistic design before implementation and validate the design
• Solutions must be fully tested to ensure they work as expected
• Decisions can then easily be made on the basis of the tests
• NetBackup integration must be thoroughly tested with any solution
• Primary to secondary to tertiary migration and retrievals must be tested and documented
• Misconfiguration or lack of understanding can lead to data loss or primary production system failure
• Need to look at the total cost of ownership (maintenance, power, manual effort) and put a cost on all elements and activities to ensure fair comparison
• Reduced complexity (fewer components, fewer vendors) means long-term ease of operation and use and has a genuine value
Sample Storage Capacity Planning
• Sizing issues and assumptions
− Annual growth rate
− Overhead for determination of actual disk storage requirements (RAID overhead, etc.)
− Archival storage medium utilisation overhead (allowance for unfilled tapes, optical platters, RAID for VTL, etc.)
− Storage lifecycle
− Number of storage layers - 2 or 3
• Sample storage capacity planning scenarios
− Annual growth rates - 0%, 10%, 20%, 30%
− Translated into monthly growth rates for calculations - 20% annual growth = 1.531% monthly
− Three tiers
− Migrate from Tier 1 to Tier 2 after 6 months
− Migrate from Tier 2 to Tier 3 after further 6 months
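The conversion from an annual to a monthly growth rate is a compound-interest calculation: the monthly rate is the twelfth root of one plus the annual rate, minus one. A minimal sketch (the function name is illustrative):

```python
def monthly_rate(annual_rate: float) -> float:
    """Monthly compound growth rate equivalent to a given annual rate (0.20 = 20%)."""
    return (1 + annual_rate) ** (1 / 12) - 1


for annual in (0.0, 0.10, 0.20, 0.30):
    print(f"{annual:.0%} annual = {monthly_rate(annual) * 100:.3f}% monthly")
```

For 20% annual growth this gives the 1.531% monthly figure quoted above; simply dividing 20% by 12 would overstate the growth once compounded.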
Disk Space Calculations
• Storage estimates expressed as raw capacities required to accommodate data
• Includes overhead for effective usability, RAID, snapshots, online spare, less than 100% utilisation, etc.
• Primary storage after 5 years with 10% annual growth = 25,580 GB
• Equates to at least 34,533 GB of raw disk capacity
Sample Storage Capacity Planning — 0% Annual Growth Rate
Annual Growth Rate: 0%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead: 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead: 25%
Number of Years to Cater For in Initial Storage Solution: 5
Raw Data Per Month GB: 700
Pre-processed Data Per Month GB: 2,000
Processed Data Per Month GB: 2,000
Primary Data Storage Retention Months: 6
Secondary Data Storage Retention Months: 6
Tertiary Data Copy Months: 12
Tertiary Data Storage Retention Months: 9999
Primary
Total Primary Data Per Month GB: 2,700
Total Primary Data Per Month Including Contingency and Growth GB: 3,645
Primary Storage Including Contingency GB: 21,870
Primary Storage Including Contingency and Growth GB: 21,870
Secondary
Total Secondary Data Per Month GB: 2,700
Total Secondary Data Per Month Including Contingency and Growth GB: 3,645
Secondary Storage Including Contingency GB: 21,870
Secondary Storage Including Contingency and Growth GB: 21,870
UDO Medium Capacity GB: 60
LTO4 Medium Capacity Compressed GB: 1,600
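The primary storage figures in these parameter sheets can be reproduced with a short calculation. The sketch below is a reading of the numbers, not the original spreadsheet: stored data per month is raw plus processed (pre-processed data is not kept), the 35% disk contingency is added, and the result is multiplied by the retention months. Compounding growth over four years is an assumption inferred from the published figures for the 10%, 20% and 30% scenarios.

```python
RAW_GB_PER_MONTH = 700
PROCESSED_GB_PER_MONTH = 2000
DISK_CONTINGENCY = 0.35
PRIMARY_RETENTION_MONTHS = 6
GROWTH_YEARS = 4  # assumption: inferred from the figures, not stated in the source


def monthly_stored_gb() -> float:
    """Raw plus processed data per month; pre-processed data is not stored."""
    return RAW_GB_PER_MONTH + PROCESSED_GB_PER_MONTH


def primary_storage_gb(annual_growth: float) -> float:
    """Primary storage including contingency and compound growth."""
    with_contingency = monthly_stored_gb() * (1 + DISK_CONTINGENCY)
    base = with_contingency * PRIMARY_RETENTION_MONTHS
    return base * (1 + annual_growth) ** GROWTH_YEARS


for growth in (0.0, 0.10, 0.20, 0.30):
    print(f"{growth:.0%}: {round(primary_storage_gb(growth)):,} GB")
```

Under these assumptions the four scenarios come out at 21,870, 32,020, 45,350 and 62,463 GB.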
Capacities - Annual Growth Rate — 0%
Month      Primary GB  Total Primary GB  Secondary GB  Total Secondary GB  Tertiary GB  Total Tertiary GB  UDO Medium Slots  LTO4 Media
Month 6         3,645            21,870             0                   0            0                  0                 0           0
Month 12        3,645            21,870         3,645              21,870            0                  0                 0           0
Month 18        3,645            21,870         3,645              21,870        3,375             20,250               338          13
Month 24        3,645            21,870         3,645              21,870        3,375             40,500               675          25
Month 30        3,645            21,870         3,645              21,870        3,375             60,750             1,013          38
Month 36        3,645            21,870         3,645              21,870        3,375             81,000             1,350          51
Month 42        3,645            21,870         3,645              21,870        3,375            101,250             1,688          63
Month 48        3,645            21,870         3,645              21,870        3,375            121,500             2,025          76
Month 54        3,645            21,870         3,645              21,870        3,375            141,750             2,363          89
Month 60        3,645            21,870         3,645              21,870        3,375            162,000             2,700         101
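The media counts in the last two columns follow from dividing cumulative tertiary data by the medium capacity (60 GB for UDO, 1,600 GB for compressed LTO4). The published figures match rounding to the nearest whole medium, which the sketch below reproduces; the function names are illustrative. A stricter sizing would round up, since a partly filled medium still occupies a slot.

```python
import math

UDO_GB = 60
LTO4_COMPRESSED_GB = 1600


def media_rounded(total_gb: float, medium_gb: float) -> int:
    """Media count rounded half-up to the nearest whole medium (matches the table)."""
    return math.floor(total_gb / medium_gb + 0.5)


def media_required(total_gb: float, medium_gb: float) -> int:
    """Media count rounded up, the conservative figure for purchasing."""
    return math.ceil(total_gb / medium_gb)


month_60_tertiary_gb = 162_000  # 0% growth scenario
print(media_rounded(month_60_tertiary_gb, UDO_GB))
print(media_rounded(month_60_tertiary_gb, LTO4_COMPRESSED_GB))
print(media_required(month_60_tertiary_gb, LTO4_COMPRESSED_GB))
```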
Storage Capacities - 0% Annual Growth Rate
[Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58; y-axis in GB]
Media Requirements - 0% Annual Growth Rate
[Chart: number of media required by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media]
Sample Storage Capacity Planning — 10% Annual Growth Rate
Annual Growth Rate: 10%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead: 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead: 25%
Number of Years to Cater For in Initial Storage Solution: 5
Raw Data Per Month GB: 700
Pre-processed Data Per Month GB: 2,000
Processed Data Per Month GB: 2,000
Primary Data Storage Retention Months: 6
Secondary Data Storage Retention Months: 6
Tertiary Data Copy Months: 12
Tertiary Data Storage Retention Months: 9999
Primary
Total Primary Data Per Month GB: 2,700
Total Primary Data Per Month Including Contingency and Growth GB: 3,645
Primary Storage Including Contingency GB: 21,870
Primary Storage Including Contingency and Growth GB: 32,020
Secondary
Total Secondary Data Per Month GB: 2,700
Total Secondary Data Per Month Including Contingency and Growth GB: 3,645
Secondary Storage Including Contingency GB: 21,870
Secondary Storage Including Contingency and Growth GB: 32,020
UDO Medium Capacity GB: 60
LTO4 Medium Capacity Compressed GB: 1,600
Capacities - Annual Growth Rate — 10%
Month      Primary GB  Total Primary GB  Secondary GB  Total Secondary GB  Tertiary GB  Total Tertiary GB  UDO Medium Slots  LTO4 Media
Month 6         3,823            22,459             0                   0            0                  0                 0           0
Month 12        4,010            23,586         3,823              22,459            0                  0                 0           0
Month 18        4,205            24,737         4,010              23,586        3,713             21,723               362          14
Month 24        4,410            25,945         4,205              24,737        3,894             44,447               741          28
Month 30        4,626            27,211         4,410              25,945        4,084             68,280             1,138          43
Month 36        4,851            28,539         4,626              27,211        4,283             93,276             1,555          58
Month 42        5,088            29,932         4,851              28,539        4,492            119,492             1,992          75
Month 48        5,337            31,393         5,088              29,932        4,711            146,988             2,450          92
Month 54        5,597            32,925         5,337              31,393        4,941            175,826             2,930         110
Month 60        5,870            34,533         5,597              32,925        5,183            206,071             3,435         129
Storage Capacities - 10% Annual Growth Rate
[Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58; y-axis in GB]
Media Requirements - 10% Annual Growth Rate
[Chart: number of media required by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media]
Sample Storage Capacity Planning — 20% Annual Growth Rate
Annual Growth Rate: 20%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead: 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead: 25%
Number of Years to Cater For in Initial Storage Solution: 5
Raw Data Per Month GB: 700
Pre-processed Data Per Month GB: 2,000
Processed Data Per Month GB: 2,000
Primary Data Storage Retention Months: 6
Secondary Data Storage Retention Months: 6
Tertiary Data Copy Months: 12
Tertiary Data Storage Retention Months: 9999
Primary
Total Primary Data Per Month GB: 2,700
Total Primary Data Per Month Including Contingency and Growth GB: 3,645
Primary Storage Including Contingency GB: 21,870
Primary Storage Including Contingency and Growth GB: 45,350
Secondary
Total Secondary Data Per Month GB: 2,700
Total Secondary Data Per Month Including Contingency and Growth GB: 3,645
Secondary Storage Including Contingency GB: 21,870
Secondary Storage Including Contingency and Growth GB: 45,350
UDO Medium Capacity GB: 60
LTO4 Medium Capacity Compressed GB: 1,600
Capacities - Annual Growth Rate — 20%
Month      Primary GB  Total Primary GB  Secondary GB  Total Secondary GB  Tertiary GB  Total Tertiary GB  UDO Medium Slots  LTO4 Media
Month 6         3,993            23,016             0                   0            0                  0                 0           0
Month 12        4,374            25,274         3,993              23,016            0                  0                 0           0
Month 18        4,791            27,687         4,374              25,274        4,050             23,163               386          14
Month 24        5,249            30,329         4,791              27,687        4,437             48,413               807          30
Month 30        5,750            33,224         5,249              30,329        4,860             76,072             1,268          48
Month 36        6,299            36,395         5,750              33,224        5,324            106,371             1,773          66
Month 42        6,900            39,869         6,299              36,395        5,832            139,562             2,326          87
Month 48        7,558            43,674         6,900              39,869        6,389            175,921             2,932         110
Month 54        8,280            47,843         7,558              43,674        6,998            215,750             3,596         135
Month 60        9,070            52,409         8,280              47,843        7,666            259,381             4,323         162
Storage Capacities - 20% Annual Growth Rate
[Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58; y-axis in GB]
Media Requirements - 20% Annual Growth Rate
[Chart: number of media required by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media]
Sample Storage Capacity Planning — 30% Annual Growth Rate
Annual Growth Rate: 30%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead: 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead: 25%
Number of Years to Cater For in Initial Storage Solution: 5
Raw Data Per Month GB: 700
Pre-processed Data Per Month GB: 2,000
Processed Data Per Month GB: 2,000
Primary Data Storage Retention Months: 6
Secondary Data Storage Retention Months: 6
Tertiary Data Copy Months: 12
Tertiary Data Storage Retention Months: 9999
Primary
Total Primary Data Per Month GB: 2,700
Total Primary Data Per Month Including Contingency and Growth GB: 3,645
Primary Storage Including Contingency GB: 21,870
Primary Storage Including Contingency and Growth GB: 62,463
Secondary
Total Secondary Data Per Month GB: 2,700
Total Secondary Data Per Month Including Contingency and Growth GB: 3,645
Secondary Storage Including Contingency GB: 21,870
Secondary Storage Including Contingency and Growth GB: 62,463
UDO Medium Capacity GB: 60
LTO4 Medium Capacity Compressed GB: 1,600
Capacities - Annual Growth Rate — 30%
Month      Primary GB  Total Primary GB  Secondary GB  Total Secondary GB  Tertiary GB  Total Tertiary GB  UDO Medium Slots  LTO4 Media
Month 6         4,156            23,545             0                   0            0                  0                 0           0
Month 12        4,739            26,937         4,156              23,545            0                  0                 0           0
Month 18        5,403            30,713         4,739              26,937        4,388             24,575               410          15
Month 24        6,160            35,019         5,403              30,713        5,003             52,398               873          33
Month 30        7,024            39,927         6,160              35,019        5,704             84,122             1,402          53
Month 36        8,008            45,524         7,024              39,927        6,503            120,292             2,005          75
Month 42        9,131            51,906         8,008              45,524        7,415            161,532             2,692         101
Month 48       10,410            59,182         9,131              51,906        8,454            208,554             3,476         130
Month 54       11,870            67,477        10,410              59,182        9,639            262,167             4,369         164
Month 60       13,534            76,936        11,870              67,477       10,991            323,294             5,388         202
Storage Capacities - 30% Annual Growth Rate
[Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58; y-axis in GB]
Media Requirements - 30% Annual Growth Rate
[Chart: number of media required by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media]
10 Year Data Storage Capacities — Different Growth Rates
[Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB at 10%, 20% and 30% annual growth, Month 6 to Month 120; y-axis in GB]
Single Drive/Path Tertiary Layer Data Write Times — Tape and Optical
[Chart: tape and optical write time in hours at 10%, 20% and 30% growth, Month 1 to Month 117; y-axis in hours]
Implementation Options
• Factors:
− 2 or 3 tiers
− Optical, tape or VTL as the last tier
− Use of existing storage (HP/Dell) or new storage
− DR or no DR
• Offsite manual copy or replication
− Software HSM - use existing NetBackup or other: HT FileStore, CaminoSoft, IBM Tivoli
Spectrum of Options
All disk DR option with replicated data
Mixed disk/tape/optical/VTL/manual/automated
Primary disk Secondary tape
Data Retrieval Operation
• Secondary disk
− Data is retrieved to primary immediately - available within seconds/minutes
• Secondary/tertiary VTL
− Data is retrieved to primary immediately - available within minutes
• Secondary/tertiary tape library
− Data is retrieved to primary immediately - available within minutes
• Secondary/tertiary optical library
− Data is retrieved to primary immediately - available within hours
• Manual media retrieval
− Retrieval time depends on media location and staff allocated to media handling
Sample Options
• Three tiers - optical or tape library as third tier
• All disk
• Reuse/expand existing hardware
• Low cost ATA disks for secondary storage
• Not all available options - presented for review and feedback
Physical Option 1 — Three Tiers — Optical or Tape
Physical Option 1 - Components
• Primary storage - SAN with fibre disk
• Second storage - SAN with ATA disk
• Tertiary storage - optical library
• Software
− HT Filestore
− Caminosoft
− NetBackup Storage Migrator
− Tivoli Storage Manager
Resilience
• Primary storage mirrored for resilience
Operation and Service Level Agreement
Physical Option 2 — All Disk Configuration
• All disk storage option
• Two mirrored sites with realtime replication
• Multiple replicated components for resilience
• Sample configuration
− Primary Storage
• Clustered SAN Controllers with 594 x 300 GB Fibre Channel Drives = 151 TB Raw Storage
− Secondary Storage
• Clustered SAN Controllers with 336 x 750 GB SATA Drives = 252 TB Raw Storage
− Total 403 TB of Raw Storage capacity (doubled for DR)
All Disk Configuration
Resilience — Multiple Points of Redundancy
Resilience
• SAN switches
• SAN controllers
• Two disks per shelf
• Entire site
All Disk Configuration
• Indicative hardware and software (replication, snapshot) cost
− €1.8 million
− €4,460 per TB (doubled for DR)
• 5 standard racks in each location
• Does not include
− HSM software
− Installation and commissioning
• Represents high water mark in terms of costs and functionality
All Disk Configuration
Advantages
• High performance
• Low manual intervention
• Highly resilient
Disadvantages
• High cost of acquisition and operation
• Growth in data volumes means additional expense
• No upper limit on cost
Physical Option 3 — Existing Hardware
• Raw, pre-processed and processed data resides on HP EVA
• Replicated continuously to second EVA
• Dell CX disk array used as secondary location
• Existing ADIC LTO drives used for tertiary and long term offsite storage
Existing Hardware
Advantages
• Cost
• Some skill sets already in organisation
Disadvantages
• Investment in old technology
• Software based HSM product skills required
Introduction of Tertiary Device
• Existing HP and Dell storage still employed
• UDO or LTO device used as final destination before removal to offsite archive
Introduction of Tertiary Device
Advantages
• Cost - use of existing hardware
• Some skill sets already in organisation
• Media life is increased with UDO
Disadvantages
• Cost - UDO or new tape library
• Management of archived media - especially UDO as they are low capacity
• Investment in old technology
• Software based HSM product skills required
• UDO retrieval speeds
Virtual Tape Library
• VTL device will act as a tape library
• VTL will be secondary location
• HSM product skills may not be required
• NetBackup could manage this process
• VTL data will ultimately be archived to tape via ADIC tape library
Virtual Tape Library
Advantages
• Some skill sets already in organisation
• No new third party migration tool absolutely necessary
• Extension of NetBackup system using NetBackup Storage Migrator
Disadvantages
• Cost - VTL with required capacity can be expensive
• Cannot take VTL backups offsite - tertiary solution still required
• Lack of vendor implementation experience
Physical Option 4 — Disk Based Secondary Information Store
• Single storage device with multiple PB of data scalability
• Data can be retained on information store for 15+ years and beyond
• 1 TB disks make this possible
• Data can be moved to storage attached tape
• Internal backup features of information store can aid NetBackup routine (SnapShots, Vaulting)
Disk Based Information Store
Advantages
• Speed of retrieval
• No new third party migration tool absolutely necessary
• Simplicity
• Integration with NetBackup - no effect on daily backup routines
• Information store can be split across multiple information stores to give multiple PB capacity if required
Disadvantages
• Cost - may be expensive initially but storage can be added over time as needed
Central Management — Storage Virtualisation
• Controller sits above storage systems
• Handle day to day management of storage across all platforms
Advantages
• Skill set consolidation
• Costs
Disadvantages
• Vendor based skills are still ultimately required
Key Questions
• Number of storage tiers and preferred configuration
• Use of tape/optical/VTL
• Software HSM option
• Disaster recovery/business continuity requirements and options
• Capacity planning constraints and assumptions
• New hardware or reuse of existing hardware
• Level of automation required for archival level
• Financial constraints and budget available
• Implementation schedule
More Information
Alan McSweeney
[email protected]