VMware Architecting vCloud WP

Published on March 2017

VMware vCloud Architecting a vCloud
Version 1.6
TECHNICAL WHITE PAPER

Table of Contents
List of Figures
List of Tables
1. What is a VMware vCloud?
  1.1 Document Purpose and Assumptions
  1.2 Cloud Computing and vCloud Introduction
  1.3 vCloud Components
  1.4 vCloud Infrastructure
    1.4.1 vCloud Management Cluster
    1.4.2 Compute Resources
    1.4.3 Storage Resources
    1.4.4 Networking Resources
    1.4.5 Component Placement
    1.4.6 vCloud Consumer Resources
    1.4.7 vCloud Logical Infrastructure
2. vCloud Director Constructs
3. vCloud Consumer Resources
  3.1 Cloud Consumer Resources
  3.2 Establish Provider Virtual Datacenters
    3.2.1 Public Cloud Considerations
    3.2.2 Private Cloud Considerations
    3.2.3 Provider Virtual Datacenter Special Use Cases
    3.2.4 Compute Resources Considerations
    3.2.5 Storage Resources Considerations
    3.2.6 Networking Resources Considerations
  3.3 Multi-Site/Multi-Geo Clouds
    3.3.1 Scenario #1 - Common User Interface
    3.3.2 Scenario #2 - Common Set of Services
    3.3.3 Suggested Deployment
    3.3.4 Other Multi-Site Considerations
    3.3.5 Merging Chargeback Reports
    3.3.6 Synchronizing Catalogs
4. Providing Cloud Resources
  4.1 Establish Organizations
    4.1.1 Administrative Organization
    4.1.2 Standard Organizations
  4.2 Establish Networking Options
    4.2.1 External Networks
    4.2.2 Network Pools
    4.2.3 Cisco Nexus 1000V Considerations
  4.3 Establish Networking Options - Public vCloud Example
    4.3.1 External Networks
    4.3.2 Network Pools
    4.3.3 Organization Networks
    4.3.4 Cisco Nexus 1000V Considerations
  4.4 Establish Networking Options - Private vCloud Example
    4.4.1 External Networks
    4.4.2 Network Pools
    4.4.3 Organization Networks
    4.4.4 Cisco Nexus 1000V Considerations
  4.5 Establish Organization Virtual Datacenters
    4.5.1 Public vCloud Considerations
    4.5.2 Private vCloud Considerations
  4.6 Create vApp Templates and Media Catalogs
    4.6.1 Auto-Joining Active Directory Domains
    4.6.2 Establish Policies
    4.6.3 Accessing your vCloud
    4.6.4 Deploy vApps
    4.6.5 Employ Chargeback or Showback
5. Extending vCloud Capabilities
  5.1 Core vCloud Components
  5.2 vCloud Request Manager
  5.3 vCloud API
  5.4 vCenter Orchestrator
    5.4.1 Cloud Administration Orchestration Examples
    5.4.2 Organization Administration Orchestration Examples
    5.4.3 Cloud Consumer Operation Orchestration Examples
  5.5 vCloud Connector
    5.5.1 vCloud Connector Placement
    5.5.2 vCloud Connector Example Usage Scenarios
    5.5.3 vCloud Connector Limitations
6. Managing the vCloud
  6.1 Monitoring
    6.1.1 Management Cluster
    6.1.2 Cloud Consumer Resources and Workloads
  6.2 Logging
    6.2.1 Logging Architectural Considerations
    6.2.2 Logging as a Service
  6.3 End-to-End Security Considerations with vCloud
    6.3.1 vCloud Environment Security
    6.3.2 User Access Security
    6.3.3 Securing Workloads at the Network-Level Workload Security
  6.4 Workload Availability Considerations
    6.4.1 Uptime SLAs at 99.99%
    6.4.2 Load Balancing of vCloud Director Cells
    6.4.3 I/O Considerations
    6.4.4 Disaster Recovery
    6.4.5 Backup and Restore of vApps
7. Sizing the vCloud
  7.1 Initial Sizing of Cloud Consumer Resources
  7.2 Capacity Management
8. Implementing Your vCloud
9. Appendix: vCloud Director Cell Monitoring
10. Appendix: vCloud Availability Considerations
11. Appendix: Security Considerations
  11.1 Network Access Security
  11.2 Compliance
  11.3 Use Cases: Why Logs Should be Available
    11.3.1 Example Compliance Use Cases for Logs
    11.3.2 VMware vCloud Log Sources for Compliance
  11.4 vCloud Director Diagnostic and Audit Logs
  11.5 Load Balancer Considerations
12. Appendix: Signed Certificates with vCloud Director
13. Appendix: Capacity Planning
  13.1 Cloud Administrator (Service Provider) Perspective
  13.2 Network Capacity Planning
14. Appendix: Capacity Management
  14.1 vCloud-Specific Capacity Forecasting (Demand Management)
  14.2 Capacity Monitoring and Establishing Triggers
  14.3 Capacity Management Manual Processes - Provider Virtual Datacenter
  14.4 End-Customer (Organization) Administrator Perspective
  14.5 Organization Virtual Datacenter-Specific Capacity Forecasting
  14.6 Capacity Management Manual Processes - Organization Virtual Datacenter

List of Figures
Figure 1. vCloud Overview
Figure 2. Core vCloud Logical Architecture
Figure 3. vCloud Infrastructure
Figure 4. vCloud Logical Architecture
Figure 5. vCloud Director Construct to vSphere Mapping
Figure 6. vCloud Consumer Resource Mapping
Figure 7. Two Sites with Local vCloud Director Instances Managing Local vCenters
Figure 8. Remote Console Flow
Figure 9. Two Sites with Isolated vCloud Director Instances
Figure 10. Example Diagram of Provider Networking for a Public vCloud
Figure 11. Configure External IPs
Figure 12. vCloud Director Logical Networking w/ Cisco Nexus 1000V
Figure 13. Example Diagram of Provider Networking for a Private vCloud
Figure 14. vCloud Connector Architecture
Figure 15. Architectural Example Drawing
Figure 16. Configure Firewall Services
Figure 17. Reference Architecture Kit
Figure 18. Log Collection in the Cloud Environment
Figure 19. Architecture of vCloud Components and Log Collection
Figure 20. Infrastructure Layers

List of Tables
Table 1. Reference Documentation
Table 2. vCloud Components
Table 3. vCloud Director Constructs
Table 4. Component Requirements for a Management Cluster
Table 5. Network Pool Options
Table 6. vCloud vApp Requirements Checklist
Table 7. Definition of Resource Pool and Virtual Machine Split
Table 8. Memory, CPU, Storage, and Networking
Table 9. Example Consolidation Ratios
Table 10. MBeans Used To Monitor vCloud Cells
Table 11. vCloud Availability Considerations
Table 12. Network Access Security Use Cases
Table 13. Audit Concerns Within The Cloud
Table 14. vCloud Component Logs
Table 15. Other Component Logs
Table 16. Load Balancer Considerations
Table 17. Certificate Steps
Table 18. vSphere Host Variables
Table 19. Determining Redundancy Overhead
Table 20. Network Capacity Planning Items
Table 21. Capacity Monitoring Metrics
Table 22. Organization Virtual Datacenter Units of Consumption
Table 23. Recommended Organization Virtual Datacenter Capacity Thresholds
Table 24. Sample Organization Virtual Datacenter Resource Allocation
Table 25. Organization Virtual Datacenter Trending Information
Table 26. Organization Virtual Datacenter Capacity Trending Variables
Table 27. Sample Organization Virtual Datacenter Trending Information

1. What is a VMware vCloud?
1.1 Document Purpose and Assumptions
Architecting a vCloud is intended to serve as a reference for cloud architects. The target audience is VMware Certified Professionals (VCP) familiar with VMware products, particularly VMware vSphere (vCenter Server, ESXi, vShield Manager), VMware vCenter Chargeback, and VMware vCloud Director. Before proceeding with the rest of this document, you should have read the Service Definition for the type of cloud you are building, private or public. This document is not intended to be a substitute for detailed product documentation, nor is it a step-by-step guide for installing a vCloud. You should have access to the following documentation referred to throughout this document for step-by-step instructions on installing and configuring various components.
TOPIC | REFERENCED DOCUMENT
Cloud Requirements | Requirements for a Cloud
vCloud Service Definitions | Service Definition for Public Cloud; Service Definition for Private Cloud
vCloud Implementations | Service Provider Public vCloud Implementation Example; Private vCloud Implementation Example
vCloud Director | vCloud Director Installation and Configuration Guide; vCloud Director Administrator’s Guide; vCloud Director Security Hardening Guide
vCloud API | vCloud API Specification; vCloud API Programming Guide
vSphere | vSphere Datacenter Administration Guide; vSphere Resource Management Guide
vShield | vShield Administration Guide
vCenter Chargeback | vCenter Chargeback User’s Guide; Using vCenter Chargeback with VMware Cloud Director Technical Note
vCenter Orchestrator (vCO) | vCenter Orchestrator Developer’s Guide; VMware vCenter Orchestrator Administration Guide; vCenter Server 4.1 Plug-In API Reference for vCenter Orchestrator
vCloud Request Manager | vCloud Request Manager Installation and Configuration Guide; vCloud Request Manager User’s Guide

Table 1. Reference Documentation

For further information, refer to the set of documentation for the appropriate product. For additional guidance and best practices, refer to the Knowledge Base on vmware.com.

1.2 Cloud Computing and vCloud Introduction
A vCloud is VMware’s cloud solution, built on VMware technologies and solutions to deliver cloud computing. Cloud computing is a new approach to computing that leverages the efficient pooling of on-demand, self-managed virtual infrastructure to provide resources consumable as a service. Cloud computing can be delivered as three layers of service delivery:
• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)
This iteration of a vCloud focuses strictly on the IaaS layer. The vCloud builds upon VMware vSphere by extending its robust virtual infrastructure capabilities to facilitate delivery of infrastructure services via cloud computing.

1.3 vCloud Components
The VMware vCloud is comprised of the following components:

[Diagram labels: vCloud Request Manager; vCenter Chargeback; vCloud Director; vShield Edge; vCloud Connector; vCenter Orchestrator; VMware vSphere; vCloud API]
Figure 1. vCloud Overview

VCLOUD COMPONENT | DESCRIPTION
VMware vCloud Director (vCD) | Cloud Coordinator and UI. Abstracts vSphere resources. Includes: vCloud Director Server(s) (also known as “cell”); vCloud Director Database; vCloud API, used to manage cloud objects
vCloud API | API used to programmatically interact with a vCloud
VMware vSphere | Underlying foundation of virtualized resources. The vSphere family of products includes: vCenter Server and vCenter Server Database; ESXi hosts, clustered by vCenter Server; Management Assistant
VMware vShield | Provides network security services. Includes: vShield Manager (VSM) virtual appliance; vShield Edge* virtual appliances, automatically deployed by vCloud Director
VMware vCenter Chargeback | Optional component that provides resource metering and reporting to facilitate resource showback/chargeback. Includes: vCenter Chargeback Server; Chargeback Data Collector; vCloud Data Collector; VSM Data Collector
VMware vCenter Orchestrator | Optional component that facilitates orchestration at the vCloud API and vSphere levels.
VMware vCloud Request Manager | Optional component that provides provisioning request and approval workflows, software license tracking, and policy-based cloud partitioning.
VMware vCloud Connector | Optional component to facilitate transfer of a powered-off vApp in OVF format from a local vCloud or vSphere to a remote vCloud.

*The fully licensed version of vShield Edge includes optional features such as VPN and load balancing that are not integrated with vCloud Director.

Table 2. vCloud Components

Other VMware or third-party products or solutions are not addressed in this iteration of a vCloud.

From an architectural view, the following diagram shows how the core vCloud components interrelate.

[Diagram labels, grouped by component:
VMware vCloud Director (vCD): vCloud Director Cell; NFS Server; vCloud API; vCloud Director Web Console (end users); vCloud Director Database
VMware vSphere: vCenter Service; vCenter Database; LDAP; vSphere Client; ESX/ESXi Hosts, each running a vCloud Agent; Datastores
vCenter Chargeback: vCenter Chargeback Server; Data Collectors; vCenter Chargeback Database; vCenter Chargeback Web Interface
vShield: vShield Manager and vShield Edge Virtual Appliances]
Figure 2. Core vCloud Logical Architecture

1.4 vCloud Infrastructure
From an infrastructure perspective, a vCloud is built on a foundation of virtual infrastructure, whose components are split between a management cluster and cloud consumer resources.
[Diagram labels: Cloud Consumer Resources; Management Cluster; Compute; Storage; Networking; Virtual Infrastructure]
Figure 3. vCloud Infrastructure

In building a vCloud, we assume that all management components, such as vCenter Server and vCenter Chargeback Server, will run in virtual machines. As a best practice of separating resources allocated for management functions from pure user-requested workloads, the underlying vSphere clusters will be split into two logical groups:
• A single management cluster running all core components and services needed to run the cloud.
• Remaining available vCenter clusters are aggregated into a pool called “cloud consumer resources”. These clusters will be under the control of VMware vCloud Director. Multiple clusters can be managed by the same vCenter Server or different vCenter Servers, but vCloud Director will be managing the clusters through the vCenter Servers.

Reasons for organizing and separating vSphere resources include:
• Ensuring that management components are separate from the resources they are managing.
• Minimizing overhead for cloud consumer resources. Resources allocated for cloud use have little overhead reserved. For example, cloud resource groups would not host vCenter virtual machines.
• Dedicating resources for the cloud. Resources can be consistently and transparently managed and carved up, and scaled horizontally.

The underlying vSphere infrastructure will follow vSphere best practices. Design considerations specific to a vCloud will be addressed accordingly in this document, organized by the vCloud management cluster and cloud consumer resources.

1.4.1 vCloud Management Cluster
The management cluster will follow vSphere best practices to facilitate load balancing, redundancy, and high availability.

1.4.2 Compute Resources
Compute resources for the management cluster will follow vSphere best practices where possible, including but not limited to VMware DRS, HA, and FT. To facilitate VMware HA, a cluster of three VMware ESXi hosts will be used. While additional hosts can be added, three hosts supporting just vCloud management components should be sufficient for typical vCloud environments. Detailed sizing guidance for the management cluster is provided later in this document.

Use a VMware HA percentage-based admission control policy in an “N+1” fashion instead of dedicating a single host for host failures or defining the number of host failures a cluster can tolerate. This allows the management workloads to run evenly across the hosts in the cluster without the need to dedicate a host strictly for host failure situations. Additional hosts can be added to the management cluster for N+2 or greater redundancy, but this is not required by the current vCloud Service Definitions.

Use VMware HA (including VM Monitoring) and/or FT, where possible, to protect the management virtual machines. vCenter Site Recovery Manager (SRM) can be used to protect components of the management cluster. At this time, vCenter Site Recovery Manager will not be used to protect vCloud Director cells because a secondary (DR) site is out of scope of the vCloud, and changes to IP addresses and schemas in recovered vCloud Director cells can result in problems.
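As an illustration of the percentage-based admission control guidance in section 1.4.2, the following minimal sketch computes the share of cluster capacity to reserve so that a cluster can tolerate a given number of host failures. The function name is hypothetical and the calculation assumes all hosts are the same size; it is not part of any VMware tool, and the resulting percentage would still need to be entered in the cluster's HA settings.

```python
def ha_reserved_percentage(host_count: int, host_failures_tolerated: int = 1) -> int:
    """Return the percentage of cluster capacity to reserve for failover.

    Assumes equally sized hosts (a simplification made for this sketch), so
    tolerating k failures out of n hosts means reserving k/n of the capacity.
    """
    if host_count < 2:
        raise ValueError("an HA cluster needs at least two hosts")
    if not 0 < host_failures_tolerated < host_count:
        raise ValueError("tolerated failures must be between 1 and host_count - 1")
    # Integer ceiling division: round up so the reservation never falls
    # short of the capacity of the hosts that may fail.
    return (100 * host_failures_tolerated + host_count - 1) // host_count


if __name__ == "__main__":
    # N+1 for the three-host management cluster described above.
    print("3 hosts, N+1:", ha_reserved_percentage(3), "%")
    # N+2 on a hypothetical five-host cluster.
    print("5 hosts, N+2:", ha_reserved_percentage(5, 2), "%")
```

Rounding up is a deliberate choice here: for three hosts, reserving 33% would leave slightly less than one host's worth of capacity in reserve.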

Unlike in a traditional vSphere environment, where vCenter Server is used by administrators to provision virtual machines, in a vCloud vCenter Server plays an integral role in end-user self-service provisioning by handling all virtual machine deployment requests from vCloud Director. Therefore, VMware recommends that vCenter Servers be made highly available with a solution such as vCenter Heartbeat. FT is not supported for multi-vCPU virtual machines, which is another reason to use vCenter Heartbeat for high resiliency.

1.4.3 Storage Resources
The shared storage configuration in the management cluster will include, but is not limited to, the following:
• Storage paths will be redundant at the host (connector), switch, and storage array levels.
• All hosts in a cluster will have access to the same datastores.

1.4.4 Networking Resources
The host networking configuration in the management cluster will include, but is not limited to, the following:
• Logical separation of network traffic by type (management, virtual machine, vMotion/FT, IP storage) for security and load considerations.
• Network component and path redundancy.
• Network speeds of at least GigE, or 10GigE if available.
• Use of vNetwork distributed switches where possible for network management simplification. The architecture calls for the use of vNetwork distributed switches in the user workload resource group, so it is a best practice to use the vNetwork Distributed Switch across all of your clusters, including the management cluster.
• Increasing the MTU size of the physical switches as well as the vNetwork distributed switches to at least 1524 (default is 1500) to accommodate the additional MAC header information used by vCloud Director Network Isolation links. vCloud Director Network Isolation is called for by the Service Definition and the architecture found later in this document. The MTU change needs to be made on the transport network for vCloud Director Network Isolation. Failure to increase the MTU size could cause packet fragmentation, reducing the network throughput of virtual machines hosted on the vCloud infrastructure.

1.4.5 Component Placement
Management components running as virtual machines in the management cluster include the following:
• vCenter Server(s) and vCenter Database
• vCloud Director Cell(s) and vCloud Director Database
• vCenter Chargeback Server(s)
• vShield Manager (one per vCenter Server)
Note: vShield Edge appliances are deployed automatically by vCloud Director through vShield Manager as needed and will reside in the vCloud consumer resource clusters, not in the management cluster. They will be placed in a system resource pool by vCloud Director and vCenter. For additional information on the vShield Edge appliance and its functions, refer to the vShield Manager Administrator guides.

Optional management functions, deployed as virtual machines, include:
• vCenter Update Manager
• vCenter Capacity IQ
• VMware Management Assistant
• vCenter Orchestrator (part of vCenter Server)
• vCloud Request Manager and associated database
The optional management virtual machines are not required by the Service Definition, but they are highly recommended to increase the operational efficiency of the solution. Database components, if running on the same platform, can be placed on the same database server. For example, the databases used by vCloud Director, vCenter Chargeback, and vCloud Request Manager can be consolidated on the same database server.
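Returning to the MTU guidance in section 1.4.4, the sketch below shows the arithmetic behind the 1524-byte figure and a simple check that every device on the transport network meets it. The device names and the dictionary format are hypothetical; in practice the MTU values would be read from the physical switch and vNetwork distributed switch configuration.

```python
from typing import Dict, List

DEFAULT_MTU = 1500   # default MTU, per the text above
REQUIRED_MTU = 1524  # minimum for the Network Isolation transport network, per the text above
ENCAP_OVERHEAD = REQUIRED_MTU - DEFAULT_MTU  # bytes of additional header information


def undersized_devices(mtu_by_device: Dict[str, int],
                       required: int = REQUIRED_MTU) -> List[str]:
    """Return the names of devices whose MTU is below the required minimum."""
    return sorted(name for name, mtu in mtu_by_device.items() if mtu < required)


if __name__ == "__main__":
    # Hypothetical inventory of the devices on one transport network.
    transport_network = {
        "physical-switch-a": 9000,
        "physical-switch-b": 1500,  # still at the default, so frames may fragment
        "distributed-switch-1": 1524,
    }
    print("Header overhead in bytes:", ENCAP_OVERHEAD)
    print("Devices to reconfigure:", undersized_devices(transport_network))
```

The check treats the path as only as good as its smallest MTU, which is why a single device left at the default is reported even when the others are already raised.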

For more information on the resources needed by the virtual machines in the management cluster, refer to the Sizing section later in this document.

1.4.6 vCloud Consumer Resources

The cloud consumer resources represent vCenter clusters to host cloud workloads. These resources will be carved up by vCloud Director. The next section covers vCloud Director constructs and definitions before drilling down into the compute, storage, and networking resources.

1.4.7 vCloud Logical Infrastructure
In summary, the vCloud logical architecture with vSphere resource separation is depicted as follows.

[Figure 4: diagram of two separate groups of vSphere resources. The management cluster runs the vCloud infrastructure virtual machines (vCenter Servers and vCenter Database, vCloud Director Cells and vCloud Director Database, vCenter Chargeback Servers, vShield Manager, one per vCenter Server), plus the optional management functions deployed as virtual machines (vCenter Update Manager, vCenter Capacity IQ, VMware Management Assistant, vCenter Orchestrator, vCloud Request Manager), and no user workloads. The vCloud consumer resources provide the space allocated to user workloads, along with a small footprint of vCloud infrastructure virtual machines (vShield Edge virtual appliances).]

Figure 4. vCloud Logical Architecture

The management cluster may also include virtual machines or have access to servers that provide infrastructure services such as directory (LDAP/AD), timekeeping (NTP), networking (DNS, DHCP), and security (certificate). Detailed considerations for sizing are addressed in the Sizing section.


The management cluster resides in a single physical site. vCloud consumer resources also reside within the same physical site, ensuring a consistent level of service. Otherwise, latency issues might arise if workloads need to be moved from one site to another, over a slower or less reliable network. For definition purposes, this “cloud” is defined under the context of a single physical site, and does not span multiple sites. Considerations for connecting clouds representing different sites are addressed later in this document. Secondary DR sites are discussed later in the Disaster Recovery section of this document.

2. vCloud Director Constructs
VMware vCloud Director introduces logical constructs, such as virtual datacenters, and security boundaries, such as organizations, to facilitate multi-tenancy consumption of resources. The following diagram depicts the logical constructs within vCloud Director that abstract underlying vSphere resources.
[Figure 5: diagram of two organizations (Organization A and Organization B), each with users, access control, catalogs, provisioning policies, user clouds, and organization vDCs containing vApps (VMs with a vApp network). On the vSphere side, vApp networks, organization networks, and external networks map to port groups or dvPort groups, and organization vDCs are drawn from provider vDCs (Gold, Silver, Bronze), which map to resource pools, host clusters, and datastores.]

Figure 5. vCloud Director Construct to vSphere Mapping

vCloud Director Construct: Description

Provider Virtual Datacenter: Logical grouping of vSphere compute resources (attached vSphere cluster and one or more datastores) for the purposes of providing cloud resources to consumers.

Organization: A unit of administration that represents a logical collection of users, groups, and computing resources, and also serves as a security boundary from which only users of a particular organization can deploy workloads and have visibility into such workloads in the cloud. In the simplest terms, an organization = an association of related end consumers.

Organization Virtual Datacenter: Subset allocation of a provider virtual datacenter's resources assigned to an organization, backed by a vCenter resource pool automatically created by vCloud Director. An organization virtual datacenter allocates resources using one of three models:
• Pay as you go
• Reservation
• Allocation

vApp Templates and Media Catalogs: A collection of available services for consumption. Catalogs contain vApp templates (preconfigured containers of one or more virtual machines) and/or media (ISO images of operating systems).

External Network: A network that connects to the outside using an existing vSphere network port group.

Organization Network: A network visible within an organization. It can be an external organization network with connectivity to an external network, and use a direct or routed connection, or it can be an internal network visible only to vApps within the organization.

vApp Network: A network visible within a vApp. It can be connected to other vApp networks within an organization and use a direct or routed connection, or it can be an internal network visible only to virtual machines within the vApp.

Network Pool (not shown in diagram): A set of pre-allocated networks that vCloud Director can draw upon as needed to create private networks and NAT-routed networks.

Table 3. vCloud Director Constructs

3. vCloud Consumer Resources
3.1 Cloud Consumer Resources

The cloud consumer resources are dedicated vCenter clusters that host cloud workloads. vCloud Director carves up these resources into one or more provider virtual datacenters, each of which is a vCenter cluster with one or more attached datastores. Networking for the resource group encompasses the vSphere networks visible to the hosts in that cluster. Provider virtual datacenters are further carved up into organization virtual datacenters, which are backed by vCenter resource pools.

[Figure 6: diagram of the mapping. vCloud consumer resource = host cluster (includes visible networking) + datastore = provider virtual datacenter. Organization virtual datacenter = resource pool.]

Figure 6. vCloud Consumer Resource Mapping

3.2 Establish Provider Virtual Datacenters
A provider virtual datacenter is a construct in vCloud Director that maps to a vSphere cluster or resource pool and one or more datastores. When creating a provider virtual datacenter, take the following rules and guidelines into consideration:
• At least one provider virtual datacenter is required for a vCloud.
• A provider virtual datacenter can map to one and only one cluster. Once a cluster is attached to a provider virtual datacenter, it is no longer available for attachment to another provider virtual datacenter.
• While it is possible to back a provider virtual datacenter with a resource pool instead of a cluster, the best practice is to use a cluster. If additional hosts are later added to the cluster, the backed provider virtual datacenter automatically grows as well. This is not the case if a resource pool is used. Also, since vCloud Director manages vSphere resources by proxy through a vCenter Server and automatically creates resource pools within vCenter as needed to instantiate organization virtual datacenters, using vCenter Server to create resource pools or nested pools can go against the efficient allocation of resources by vCloud Director. Multiple parent-level resource pools can also add unnecessary complexity and lead to unpredictable results or inefficient use of resources, if the reservations are not set appropriately.
• It is not possible to attach a second cluster to a provider virtual datacenter at this time. If additional compute capacity is required, add more hosts in the vCenter cluster on the vSphere end.
• One or more datastores can be attached to a provider virtual datacenter. A datastore can be assigned to multiple provider virtual datacenters. As a best practice in segmenting storage, datastores should not be shared by multiple provider virtual datacenters.
• Create multiple provider virtual datacenters to differentiate computing levels or performance characteristics of a service offering. Segment by capacity, availability, or performance type. An example of differentiating by availability would be N+1 for a Bronze provider virtual datacenter vs. N+2 for a Silver provider virtual datacenter.
• As the level of expected consumption increases for a given provider virtual datacenter, add additional hosts to the cluster from vCenter and attach more datastores.
• As the number of hosts in the cluster backing a provider virtual datacenter approaches the halfway mark of vSphere limits, consider implementing controls to preserve headroom. Do this well ahead of approaching the cluster limits. For example, do not allow additional tenants for this particular virtual datacenter and utilize the additional hosts to be added to address increased resource demand for the existing tenants.
• If the cluster backing a provider virtual datacenter has reached the maximum number of hosts per vSphere design guidelines, create a new provider virtual datacenter associated with a new cluster. A provider virtual datacenter cannot span multiple host clusters.
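The headroom guidelines in the last two bullets can be expressed as a simple check. The following is a minimal sketch only: the class and function names are hypothetical, the action labels are illustrative, and the 32-host cluster maximum is taken from the figure quoted in the Public Cloud Considerations section of this document.

```python
from dataclasses import dataclass

# Assumption: 32 hosts per cluster, the maximum cited in this document.
MAX_HOSTS_PER_CLUSTER = 32


@dataclass
class ProviderVdc:
    name: str              # hypothetical label, for example "Gold"
    hosts_in_cluster: int  # hosts in the single cluster backing this provider vDC


def headroom_action(pvdc: ProviderVdc, max_hosts: int = MAX_HOSTS_PER_CLUSTER) -> str:
    """Suggest an action based on how full the backing cluster is."""
    if pvdc.hosts_in_cluster >= max_hosts:
        # The cluster cannot grow and a provider vDC cannot span clusters.
        return "create-new-provider-vdc"
    if pvdc.hosts_in_cluster >= max_hosts / 2:
        # Past the halfway mark: keep remaining hosts for existing tenants.
        return "close-to-new-tenants"
    return "accept-new-tenants"
```

For example, a provider virtual datacenter backed by 16 hosts would be closed to new tenants, with the remaining 16 host slots held back for growth of the tenants already placed there.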


Refer to the Service Definition for guidance on the size of vSphere clusters and datastores to attach when creating a provider virtual datacenter. Consider:
• Expected number of virtual machines
• Size of virtual machines (CPU, RAM, disk)

3.2.1 Public Cloud Considerations
Considerations for a public vCloud include creating multiple provider virtual datacenters based on tiers of service that will be provided. Since provider virtual datacenters only contain CPU, memory, and storage resources and those are common across all of the requirements in the Service Definition for Public Cloud, you should create one large provider virtual datacenter attached to a vSphere cluster that has sufficient capacity to run 1,500 virtual machines. You should also leave overhead to grow the cluster with more resources up to the maximum of 32 hosts, should organizations need to grow in the future. If you determine that your hosts do not have sufficient capacity to run the maximum number of virtual machines called out by the Service Definition for Public Cloud, then you will need additional provider virtual datacenters.

3.2.2 Private Cloud Considerations
Given that a provider virtual datacenter represents a vSphere cluster, it is commonly accepted that a single provider virtual datacenter be established. Since provider virtual datacenters only contain CPU, memory, and storage resources and those are common across all of the requirements in the Service Definition for Private Cloud, you should create one large provider virtual datacenter attached to a cluster that has sufficient capacity to run 400 virtual machines. Refer to the Service Definition for Private Cloud for details on the service tier(s) called for. Should it be determined that existing host capacity cannot meet the requirement, or there is a desire to segment capacity along the lines of equipment type (for example, CPU types in different provider virtual datacenters), then establish a provider virtual datacenter for Pay-As-You-Go use cases and a separate provider virtual datacenter for the resource-reserved use cases.

3.2.3 Provider Virtual Datacenter Special Use Cases
There are instances where a provider virtual datacenter must be viewed as "special purpose" in one way or another. Special use-case provider virtual datacenters are a great example of what makes cloud computing so flexible and powerful. The primary driver behind this need is to satisfy the license restrictions imposed by a specific software vendor that stipulates that all the processors that could run specific software must be licensed for it, regardless of whether or not they actually are running that software. In order to meet the EULA requirements of such a software vendor, you can establish a purpose-specific provider virtual datacenter, populated with enough sockets of processing power to meet the need but limited to no more than what is needed in order to keep licensing costs down.
An example of this, in practice, is establishing an Oracle-only provider virtual datacenter. Since a provider virtual datacenter is backed by a cluster, you can name a cluster for a special purpose and publish it to any and all clouds that might need the service. You then maintain enough paid licenses to cover all the sockets in that cluster, and you are covered under the EULA, because the guests can only run on one of the sockets in that cluster/resource pool.
There is some level of user education needed to verify that all Oracle instances are deployed to the special purpose virtual datacenter; that is, vCloud Director does not provide a way to prevent someone from incorrectly deploying virtual machines. So, the enforcement has to be manual, typically through organizational processes. In the following example, you could name the virtual datacenter with a descriptive name to deploy all Oracle instances, or insert instructions in the vApp name so that users deploy to the correct virtual datacenter:
Oracle Database Use Only! PvDC
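Because enforcement is manual, a periodic audit can support the organizational process. The sketch below is illustrative only and does not call any vCloud Director or vCenter API: it assumes an inventory export is already available as a list of records, and the `runs_oracle` flag, the record fields, and both function names are hypothetical.

```python
from typing import List, NamedTuple


class Vm(NamedTuple):
    name: str
    provider_vdc: str
    runs_oracle: bool  # hypothetical flag taken from an inventory export


def misplaced_oracle_vms(vms: List[Vm], licensed_pvdc: str) -> List[str]:
    """Return names of Oracle VMs deployed outside the licensed provider vDC."""
    return [vm.name for vm in vms
            if vm.runs_oracle and vm.provider_vdc != licensed_pvdc]


def sockets_to_license(hosts: int, sockets_per_host: int) -> int:
    """Every socket in the backing cluster must be licensed."""
    return hosts * sockets_per_host
```

Any name returned by `misplaced_oracle_vms` indicates a workload that should be moved to the special-purpose provider virtual datacenter, since the sockets it currently runs on are not covered by the licenses held for that cluster.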


3.2.4 Compute Resources Considerations

All hosts in the vCloud consumer resource clusters will be configured per vSphere best practices, similar to the management cluster. VMware HA will also be used to protect against host and virtual machine failures. Provider vDCs can be of different compute capacity sizes (number of hosts, number of cores, performance of hosts) to support differentiation of compute resources by capacity or performance for service level tiering purposes. Organization vDCs in turn should be created based on what services are planned. For a detailed look at how to size the vCloud, refer to the Sizing section later in this document. The following table lists the requirements for each of the components that will run in the vCloud Director management cluster. For the number of virtual machines and organizations listed in the Service Definitions, you will not need to worry about scaling too far beyond the provided numbers.
Item | vCPU | Memory | Storage | Networking
vCenter Server | 2 | 8 GB | 20 GB | 100 MB
Oracle Database | 4 | 16 GB | 100 GB | 1 GigE
vCloud Director x 2 (stats for each) | 2 | 4 GB | 10 GB | 1 GigE
vCenter Chargeback | 2 | 8 GB | 30 GB | 1 GigE
vShield Manager | 1 | 4 GB | 512 MB | 100 MB
TOTAL | 11 | 40 GB | 161 GB* | 3 GigE*

* Numbers rounded up or down will not impact overall sizing

Table 4. Component Requirements for a Management Cluster
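The TOTAL row of Table 4 can be reproduced with a short calculation. In the sketch below, the row values are copied from the table; the `per_instance` switch is an assumption added for illustration, because the published totals match a sum that counts each row once, whereas provisioning both vCloud Director cells at the listed per-cell figures yields slightly higher numbers.

```python
# (item, instances, vCPU, memory GB, storage GB), copied from Table 4.
ROWS = [
    ("vCenter Server", 1, 2, 8, 20.0),
    ("Oracle Database", 1, 4, 16, 100.0),
    ("vCloud Director", 2, 2, 4, 10.0),   # "x 2 (stats for each)"
    ("vCenter Chargeback", 1, 2, 8, 30.0),
    ("vShield Manager", 1, 1, 4, 0.5),    # 512 MB
]


def totals(rows, per_instance=False):
    """Sum vCPU, memory, and storage; optionally multiply by instance count."""
    vcpu = 0
    memory_gb = 0
    storage_gb = 0.0
    for _item, instances, cpus, mem, disk in rows:
        n = instances if per_instance else 1
        vcpu += n * cpus
        memory_gb += n * mem
        storage_gb += n * disk
    return vcpu, memory_gb, storage_gb
```

Counting each row once gives 11 vCPU, 40 GB of memory, and 160.5 GB of storage, which rounds to the 161 GB shown in the table. Counting both vCloud Director cells gives 13 vCPU, 44 GB, and 170.5 GB, a difference small enough to fall within the rounding note under the table.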

For the table above, the Oracle Database will be shared between the vCenter Server, the vCloud Director cells, and the vCenter Chargeback Server. Different users and instances should be used for each database instance, in line with VMware best practices. In addition to the storage requirements above, an NFS volume must be mounted and shared by each vCloud Director cell to facilitate uploading of vApps from cloud consumers. The size of this volume will vary depending on how many concurrent uploads are in progress. Once an upload completes, the vApp is moved to permanent storage on the datastores backing the catalogs for each organization, and the data no longer resides on the NFS volume. The recommended starting size for the NFS transfer volume is 250 GB. You should monitor this volume and increase the size should you experience more concurrent or larger uploads in your environment.
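One way to reason about when to grow the transfer volume is a rough estimate from observed upload behavior. This is a sketch under stated assumptions, not VMware sizing guidance: the function name, the headroom percentage, and the idea of multiplying concurrent uploads by average upload size are all illustrative. Only the 250 GB starting size comes from this document.

```python
import math

RECOMMENDED_START_GB = 250  # starting size recommended in this section


def transfer_volume_gb(concurrent_uploads, avg_upload_gb, headroom_pct=20):
    """Estimate the NFS transfer volume size, never below the starting size.

    concurrent_uploads and avg_upload_gb are values you would observe by
    monitoring the volume; headroom_pct is an assumed safety margin.
    """
    needed = concurrent_uploads * avg_upload_gb * (100 + headroom_pct) / 100
    return max(RECOMMENDED_START_GB, math.ceil(needed))
```

With 5 concurrent uploads of 20 GB each the estimate stays at the 250 GB starting size, while 20 concurrent uploads of 25 GB each would suggest growing the volume to about 600 GB.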


3.2.5 Storage Resources Considerations

Shared storage in the consumer resources will be configured per vSphere best practices, similar to the management cluster. Storage types supported by vSphere will be used. The use of RDMs in the vCloud infrastructure is currently not supported and should be avoided. Creation of datastores will need to take into consideration Service Definition requirements and workload use cases, which will affect the number and size of datastores to be created. vCloud Director will assign datastores for use through provider virtual datacenters, and only existing vSphere datastores can be assigned. Datastores attached to provider virtual datacenters will be used for vCloud workloads, known as vApps. vSphere best practices apply for datastore sizing in terms of number and size. Vary datastore size or shared storage characteristic if providing differentiated or tiered levels of service. Sizing considerations include:
• Datastore storage expectations:
––Size a datastore sufficiently to allow for placement of multiple vApps, avoiding creating small datastores that can house only one or two vApps. A few large datastores are preferred over many small datastores, especially since consumers are not allowed to choose which datastores to place their workload on when selecting a virtual datacenter with more than one datastore; vCloud Director will choose the datastore with the most free space available.
––What is the average vApp size x number of vApps x spare capacity? For example: Average virtual machine size * # virtual machines * (1 + % headroom)
––What is the average virtual machine disk size?
––How many virtual machines are in a vApp?
––How many virtual machines are to be expected?
––How much spare capacity do you want to allocate for room for growth (expressed as a percentage)?
––Will expected workloads be transient or static?
• Datastore performance characteristics:
––Will expected workloads be disk intensive?
––What are the performance characteristics of the associated cluster?
Refer to the requirements in the Service Definition and size your datastores accordingly. Additionally, an NFS share must be set up and made visible to all hosts for use by vCloud Director for transferring files in a vCloud Director multi-cell environment. NFS is the required protocol for the transfer volume. Refer to the vCloud Director Installation and Configuration Guide for more information on where to mount this volume. See the Workload Availability section for additional storage and storage I/O factors to take into account.
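The sizing formula in this section (average virtual machine size * number of virtual machines * (1 + % headroom)) can be worked through as follows. This is a minimal sketch: the function names and the example figures for disk size, headroom, and datastore size are assumptions chosen for illustration, and `pick_datastore` only mimics the "most free space" placement behavior described above rather than calling vCloud Director.

```python
import math


def required_capacity_gb(avg_vm_disk_gb, vm_count, headroom_pct):
    """Average VM size * number of VMs * (1 + headroom)."""
    return avg_vm_disk_gb * vm_count * (100 + headroom_pct) / 100


def datastores_needed(total_gb, datastore_size_gb):
    """Number of equally sized datastores needed to hold total_gb."""
    return math.ceil(total_gb / datastore_size_gb)


def pick_datastore(free_gb_by_name):
    """Mimic placement on the datastore with the most free space."""
    return max(free_gb_by_name, key=free_gb_by_name.get)
```

As a worked example under these assumptions, 400 virtual machines averaging 60 GB each with 20% headroom require 28,800 GB, which fits in 15 datastores of 2,048 GB. Fewer, larger datastores keep that count low, in line with the preference stated above.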


3.2.6 Networking Resources Considerations

Host networking for hosts within a provider vDC will be configured per vSphere best practices in the same manner as the vCloud management cluster. In addition, the number of vNetwork Distributed Switch ports per host should be increased from the default value of 128 to the maximum of 4096. Increasing the ports will allow for vCloud Director to dynamically create port groups as necessary for the private organization networks created later in this document. Refer to the vSphere Administrator Guide for more information on increasing this value. Networking at the provider and organization virtual datacenter level is detailed in the next section on providing cloud resources.

3.3 Multi-Site/Multi-Geo Clouds
vCloud Director is neither designed nor supported for multi-site deployments in the currently shipping version of the product, due to potential issues with network latency and reliability. In an environment with multiple sites, each site should be a separate cloud that can be potentially interconnected, rather than having a single cloud that spans the sites. Multi-site can mean a lot of different things to different audiences. Some providers would like to have one user interface that encompasses all of their sites. Some providers do not mind having multiple interfaces, but they would like the same services available in each location. These scenarios are discussed here.

3.3.1 Scenario #1: Common User Interface
In our example for scenario 1, we have two physical locations where we are providing cloud service and we want both of them to be in the same vCloud Director interface. The user interface is provided from the vCloud Director cells, which can sit in either location or in both locations. One of the vCloud Director cells will serve as the proxy for a vCenter Server in one of the sites, as illustrated below.

[Figure 7: diagram of Site 1 and Site 2, each with vCloud Director cells (vCD) and a vCenter Server managing local virtual machines.]

Figure 7. Two Sites with Local vCloud Director Instances Managing Local vCenters


The local vCenter Servers control resources local to each site. This setup looks logical until you examine some of the user flows. A user who comes into site #1 requesting remote console access to a virtual machine in site #1 is not guaranteed to have all traffic stay in site #1, because we cannot control which vCloud Director cell is the proxy for which vCenter Server. The user could come into a vCloud Director cell in site #1, which would then have to talk to the proxy for vCenter Server #1, located in site #2. That vCloud Director cell would then talk back to the vCenter Server in site #1, which would finish setting up the remote console connection to the local ESXi host running the virtual machine in question in site #1. Traffic would then flow through the vCloud Director cell that initiated the request in site #1. This is illustrated below.

[Figure 8: diagram of Site 1 and Site 2, each with vCloud Director cells (vCD) and a vCenter Server managing local virtual machines, showing the remote console flow between the sites.]

Figure 8. Remote Console Flow

One of the problems with this setup is controlling which vCloud Director cell a user session terminates on, based on virtual machine and site-specific data. It is next to impossible to work this out continuously and provide that logic to a load balancer. Another problem with this scenario is that a central Oracle database is needed for all of the vCloud Director cells from both sites. This creates even more traffic on the link between the two sites, since the message bus in vCloud Director uses the Oracle database for communication. Overall, this solution is less than optimal and is only suggested for cross-campus multi-site configurations where site-to-site communication will not overwhelm the network and where network availability is highly reliable.

3.3.2 Scenario #2: Common Set of Services
A more pragmatic approach to multi-site setups is to have a single vCloud Director setup in each of the sites that is isolated from other sites. This solves the network cross-talk issue, but it introduces problems of its own. For example, how do you provide a common set of services across the different sites? How do you keep organization names and rights as well as catalogs, networks, storage, and other information common across the different sites? The currently shipping vCloud Director product has no mechanism to do this. Using other VMware technologies included in the vSphere suite of products, you can synchronize cloud deployments using automation scripts and provide common sets of services across locations. In an enterprise, a private vCloud maps to a single site. Multiple vClouds can be connected using vCloud Connector for offline vApp migrations. A public vCloud can be connected to form a hybrid cloud.

3.3.3 Suggested Deployment
Multi-site deployments are not officially supported by VMware at this time. However, if you are still going to create a multi-site deployment, the recommended approach is to set up an isolated vCloud Director instance in each location. This isolated vCloud Director instance would include local vCloud Director cells, vCenter Servers, an Oracle database instance, a vCenter Chargeback instance, and local vSphere resources, as illustrated in the picture below.

[Figure 9: diagram of Site 1 and Site 2, each with its own vCloud Director cells (vCD) and a vCenter Server managing local virtual machines, with no vCloud Director components shared between the sites.]

Figure 9. Two Sites with Isolated vCloud Director Instances

In order to keep the sites synchronized with organization and resource information, VMware encourages you to create a set of onboarding scripts and workflows. These workflows would be used anytime you need to create a new organization or a new resource for an organization and would drive that creation across all cloud sites. The VMware cloud services organization can assist you in the creation of these customer-specific workflows based on templates the cloud practice already has. VMware cloud services has created these template workflows using the vCenter Orchestrator product that is included with vSphere. By using the workflows for administrative resource creation, you can keep multiple clouds synchronized with organization resources.

3.3.4 Other Multi-Site Considerations
When creating multi-site configurations, there are resources outside the control of the vCloud setup that require some thought. How do you set up networking between the sites? How do you handle IP addressing? Because these physical resource decisions vary between customers, the reference architecture does not provide specific guidance on them. Setting up these physical resources is also not included in the sample scripts previously mentioned.

3.3.5 Merging Chargeback Reports
In our reference multi-site setup we included multiple vCenter Chargeback instances. In order to provide one common bill or usage report to your cloud consumer, you must aggregate all of the chargeback reports into one report. You can leverage the vCenter Chargeback API as well as vCenter Orchestrator to pull chargeback reports from each vCenter Chargeback server and consolidate them into one master report.

3.3.6 Synchronizing Catalogs
Synchronizing catalogs between sites is the most time-consuming task. When setting up multiple cloud sites, you should designate one site as the master site for template creation and have all other sites be replication peers. It is advisable to leverage native storage array replication to replicate the storage for the templates in each catalog. Array replication can provide several benefits for long-distance data movement, including data de-duplication and compression. Once the data is synchronized, you can leverage the catalog synchronization workflows provided by the VMware vCloud API to import the replicated templates into the appropriate catalogs in VMware vCloud Director. Synchronizing templates added at remote sites is out of scope for this version of the reference architecture. This feature can be added selectively by engaging VMware Professional Services.
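The report consolidation described in section 3.3.5 amounts to summing per-organization charges across sites. The sketch below shows only that aggregation step and makes no calls to the vCenter Chargeback API or vCenter Orchestrator: it assumes the per-site reports have already been retrieved and reduced to a mapping of organization name to cost, and the function name and data layout are hypothetical.

```python
def merge_reports(site_reports):
    """Combine per-site chargeback figures into one report per organization.

    site_reports maps a site name to a mapping of organization -> cost
    (a hypothetical, already-parsed form of each site's report).
    """
    merged = {}
    for site, report in site_reports.items():
        for org, cost in report.items():
            entry = merged.setdefault(org, {"total": 0, "by_site": {}})
            entry["total"] += cost
            entry["by_site"][site] = cost
    return merged
```

Keeping the per-site breakdown alongside the total lets the master report show consumers where their usage occurred, which matters when organization names are kept consistent across sites by the onboarding workflows.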


4. Providing Cloud Resources
4.1 Establish Organizations
A vCloud contains one or more organizations. Each organization represents a collection of end consumers, groups, and computing resources. Users authenticate at the organization level, using credentials established by an organization administrator locally within vCloud Director or LDAP. LDAP integration can be done at the cloud system level or per organization. Before you can configure LDAP on a per organization level, you must configure LDAP at the cloud system level. Set this up based on the cloud organization's requirements. Please see the vCloud Installation and Configuration Guide on how to set up the LDAP service with vCloud Director. Users in an organization consume resources by selecting vApps from a predefined catalog.
When creating organizations, the name of the organization will be used in the URL to access the GUI for that organization. As an example, ACME would be accessed at https://<hostname>/cloud/org/ACME. You should take care to avoid special characters or spaces in the organization name, since that will affect the URL in undesirable ways. You can use the system defaults for most of the other organization settings. The one exception is leases, quotas, and limits. There are no specific requirements called out by the Service Definition for leases, quotas, and limits. The provider should set these values to whatever works best in their cloud.

4.1.1 Administrative Organization
A vCloud requires at least one organization. As a best practice, the first organization to be created should be an administrative organization. This organization will own a master catalog of vApp templates that are published and shared with all other (standard) organizations. Administrators assigned to the administrative organization will also be responsible for creating official template virtual machines for placement in the master catalog for other organizations to use. Virtual machines in development should be stored in a separate development catalog that is not shared with other organizations. As a note of reference, there is already a default System organization in the vCloud Director environment. The administrative organization being created here is different from the built-in System organization, since it can actually create vApps and catalogs and share them. Make sure that when you create the administrative organization you set it up to allow publishing of catalogs.

4.1.2 Standard Organizations
Create an organization for each tenant of the vCloud as necessary. Each of the standard organizations should be created with the following considerations:
• Cannot publish global catalogs
• Use system defaults for SMTP
• Use system defaults for notification settings
• Use leases, quotas, and limits meeting the provider's requirements
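The naming advice in section 4.1 can be applied before an organization is created. The following is a minimal sketch: the URL pattern is the one given in this document, while the function name and the specific allowed character set (letters, digits, hyphen, underscore) are conservative assumptions for illustration, not a rule published by VMware.

```python
import re

# Assumption: a conservative allow-list; the document only says to avoid
# special characters and spaces.
_SAFE_ORG_NAME = re.compile(r"[A-Za-z0-9_-]+")


def org_url(hostname: str, org_name: str) -> str:
    """Build the organization URL, rejecting names that would distort it."""
    if not _SAFE_ORG_NAME.fullmatch(org_name):
        raise ValueError(f"organization name not URL-safe: {org_name!r}")
    return f"https://{hostname}/cloud/org/{org_name}"
```

Calling `org_url("vcd.example.com", "ACME")` produces a URL in the form shown above, whereas a name such as "ACME Corp" is rejected so that it can be corrected before the organization is created.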

4.2 Establish Networking Options
Workloads in the cloud consumer resources will have network connectivity at two levels:
• External networks, used to connect to the outside. These are mapped to vSphere networks.
• Internal or NAT-routed networks, used to facilitate VM-to-VM communication within a cloud. These are backed by vCloud Director network pools.


4.2.1 External Networks
• An external network provides connectivity "outside" an organization through an existing, preconfigured vSphere network port group. The vSphere port groups can be created using standard vSwitch port groups, vNetwork Distributed Switch port groups, or the Cisco Nexus 1000V.
• In a public vCloud, these preconfigured port groups will provide access through the Internet to customer networks, typically using VPN or MPLS terminations.
• When creating an external network, make sure to have sufficient vSphere port groups created and made available for virtual machine access in the vCloud.

4.2.2 Network Pools
vCloud Director creates a private network as needed from a pool of networks to facilitate VM-to-VM communication and NAT-routed networks. vCloud Director supports one of three methods to back network pools:
• vSphere port group. vCloud Director uses one of many existing, preconfigured vSphere networks. The networks themselves can have VLAN tagging for additional security.
• VLAN. vCloud Director automatically uses VLAN tagging from a range provided to segment networks to create internal networks (organization and vApp networks) as needed. This assumes that vCloud Director and all the managed hosts have access to the VLANs on the physical network.
• vCloud Director Network Isolation. vCloud Director automatically creates internal networks using MAC-in-MAC encapsulation.
The following table compares the three options for a network pool.
Consideration: How it works
• vSphere port group backed: Isolated port groups must be created and exist on all hosts in the cluster.
• VLAN backed: Uses a range of available, unused VLANs dedicated for vCloud.
• vCloud Network Isolation backed: Overlay network (with network ID) created for each isolated network.

Consideration: Advantages
• vSphere port group backed: Only option compatible with Cisco Nexus 1000V.
• VLAN backed: Best network performance. vCloud Director creates portgroups as needed.
• vCloud Network Isolation backed: Optionally requires one VLAN per vCloud Director Network Isolation backed network pool. More secure than VLAN backed option. vCloud Director creates portgroups as needed.

Consideration: Disadvantages
• vSphere port group backed: Requires manual creation and management of portgroups. Possible to use a portgroup that is in fact not isolated.
• VLAN backed: Chance of running out of VLAN IDs.
• vCloud Network Isolation backed: Overhead required for MAC-in-MAC encapsulation.

Table 5. Network Pool Options


Considerations when using a vSphere port group-backed network pool include:
• Standard or distributed virtual switches may be used.
• vCloud Director does not automatically create port groups. You must manually create these ahead of time for vCloud Director to use.

Considerations when using a VLAN-backed network pool include:
• Organization and vApp networks created by vCloud Director out of VLAN-backed network pools are private to an organization or vApp, respectively.
• Hosts in the cluster backing the provider vDC used by the organization vDC must be connected to VLAN trunk ports.
• vNetwork distributed switches are required for all hosts and the cluster backing the provider vDC used by the organization vDC that draws from the network pool.
• vCloud Director creates port groups automatically as needed.

Considerations when using a vCloud Director Network Isolation-backed network pool include:
• vNetwork distributed switches are required for all hosts and the cluster backing the provider vDC used by the organization vDC that draws from the network pool.
• Increase the MTU size of the physical switches as well as the vNetwork distributed switches to at least 1524 to accommodate the additional MAC header information used by vCloud Director Network Isolation links. Failure to increase the MTU size could cause packet fragmentation, which reduces the network throughput of virtual machines hosted on the vCloud infrastructure.
• Specify a VLAN ID for the MAC-in-MAC transport network (this is optional but recommended for security). Leaving this blank will default to VLAN 0.
• vCloud Director creates port groups automatically on vNetwork distributed switches as needed.
• Private networks backed by vCloud Director Network Isolation use fewer VLAN IDs.
• Organization and vApp networks created by vCloud Director out of vCloud Director Network Isolation-backed network pools are private to an organization or vApp, respectively.
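
The MTU guidance above can be turned into a simple pre-deployment check. The following is a minimal sketch, assuming a standard Ethernet MTU of 1500 (a value not stated in this paper) and a hypothetical inventory of device names and configured MTUs; only the 1524 minimum comes from the text.

```python
# Minimum MTU required for vCloud Director Network Isolation, from the text.
VCDNI_MIN_MTU = 1524
# Assumption: default Ethernet MTU; used only to illustrate the added overhead.
STANDARD_ETHERNET_MTU = 1500


def encapsulation_overhead():
    """Extra bytes the minimum MTU allows for on top of a standard frame."""
    return VCDNI_MIN_MTU - STANDARD_ETHERNET_MTU


def undersized_devices(mtu_by_device, required=VCDNI_MIN_MTU):
    """Return the names of devices whose MTU is below the required minimum."""
    return sorted(name for name, mtu in mtu_by_device.items() if mtu < required)


if __name__ == "__main__":
    # Hypothetical inventory: physical switches and vNetwork distributed switches.
    inventory = {"core-switch-1": 9000, "core-switch-2": 1524, "vds-resource-1": 1500}
    print("Overhead allowed for:", encapsulation_overhead(), "bytes")
    print("Needs a larger MTU:", undersized_devices(inventory))
```

Any device reported by such a check would need its MTU raised before the network pool is put into service.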

4.2.3 Cisco Nexus 1000V Considerations
In vCloud Director 1.0, the Cisco Nexus 1000V is only supported with the vSphere port group-backed option for network pools, which is also the least flexible option. The same is true for the VMware vNetwork Standard Switch, while the vNetwork Distributed Switch supports all network pool types. The Cisco Nexus 1000V requires a vNetwork Distributed Switch and therefore vSphere Enterprise Plus licensing.

The Cisco Nexus 1000V is typically deployed in a vSphere environment to provide increased network visibility, common management, and advanced layer 2 security and quality of service functionality for virtual networking. Because vCloud Director ideally uses dynamic, automatically provisioned, isolated networks internally, these requirements for management and security do not directly apply to network pools. However, they are relevant to the external networks where traffic enters and exits the vCloud.

The next sections go into detail on networking options for external networks and network pools, and discuss public vs. private vCloud perspectives.
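
The switch compatibility described in this section can be captured as a small lookup table. This is an illustrative sketch only; the identifiers are hypothetical names chosen for this example and are not vCloud Director settings.

```python
# Network pool backings supported per virtual switch type in vCloud Director 1.0,
# as described in the text. All keys and values are hypothetical labels.
POOL_BACKINGS_BY_SWITCH = {
    "vnetwork_standard_switch": {"port_group"},
    "cisco_nexus_1000v": {"port_group"},
    "vnetwork_distributed_switch": {"port_group", "vlan", "network_isolation"},
}


def supported_backings(switch_type):
    """Return the sorted pool backings a switch type supports."""
    return sorted(POOL_BACKINGS_BY_SWITCH[switch_type])


def can_back(switch_type, backing):
    """True if the switch type can back a network pool of the given kind."""
    return backing in POOL_BACKINGS_BY_SWITCH.get(switch_type, set())


if __name__ == "__main__":
    for switch in POOL_BACKINGS_BY_SWITCH:
        print(switch, "->", supported_backings(switch))
```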

4.3 Establish Networking Options—Public vCloud Example
4.3.1 External Networks
Per the Service Definition for a Public Cloud, all service tiers use a shared public Internet connection. To fulfill this, create a single external provider network. Give the network a descriptive name, such as Provider-Internet in this example. Connect this external network to a vSphere port group that is actually connected to the Internet. Make sure you have the IP information for the physical network you are attaching to, including the network mask, default gateway, and DNS information. Lastly, create a pool of static IP addresses that will be consumed by vShield Edge appliances (which facilitate a routed connection) each time you connect an organization network to this external network.

For sizing purposes, create an IP address pool large enough that each of your organizations can have access to an external network. Per the Service Definition, the estimated number of organizations for 1,500 virtual machines is 25, so make sure you have at least 25 IP addresses in your static IP pool. Set aside more IP addresses if you plan to allow inbound access into organizations.

4.3.2 Network Pools
In addition to access to external networks, each organization in a public vCloud will have organization-specific private networks. vCloud Director instantiates isolated Layer 2 networks through the use of network pools. Create a single large network pool for all organizations to share, and limit the use of this network pool when you create each individual organization. The network pool will use vCloud Director Network Isolation to separate the traffic, on an existing vNetwork Distributed Switch previously created for connecting hosts. Use a VLAN to further segregate all of the vCloud Director Network Isolation traffic in a transport network from the rest of the infrastructure.

Because the network pool will be used by both the external organization network and private vApp networks, you will need at least 11 networks in the network pool per organization. According to the Service Definition for a Public Cloud, ten of these are for private vApp networks; the remaining one is used for the protected external organization network. Given the estimate of 25 organizations, you need at least 275 networks in the pool (25 × 11). A network pool is limited to a maximum of 4096 networks due to the port limitation on the vNetwork Distributed Switch. Ephemeral ports in a vNetwork Distributed Switch are also limited to 1016 per switch and per vCenter Server, further limiting the number of networks that can be instantiated from a network pool. When connecting the network pool to a vNetwork Distributed Switch, make sure you have enough free ports left on the switch (at least 275).
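
The sizing arithmetic in sections 4.3.1 and 4.3.2 can be reproduced for other organization counts. The sketch below uses only the figures quoted in the text (10 vApp networks plus 1 external organization network per organization, one static IP per organization, and a 4096-network ceiling); the function names are illustrative.

```python
VAPP_NETWORKS_PER_ORG = 10        # private vApp networks per organization (from the text)
EXTERNAL_ORG_NETWORKS_PER_ORG = 1  # protected external organization network (from the text)
MAX_NETWORKS_PER_POOL = 4096       # network pool ceiling quoted in the text


def required_pool_networks(org_count):
    """Minimum number of networks the shared network pool must provide."""
    return org_count * (VAPP_NETWORKS_PER_ORG + EXTERNAL_ORG_NETWORKS_PER_ORG)


def required_static_ips(org_count):
    """Minimum static IPs on the external network: one per organization."""
    return org_count


def pool_fits(org_count):
    """True if the required networks stay within the per-pool maximum."""
    return required_pool_networks(org_count) <= MAX_NETWORKS_PER_POOL


if __name__ == "__main__":
    orgs = 25  # estimate for 1,500 virtual machines, per the Service Definition
    print("Networks needed:", required_pool_networks(orgs))
    print("Static IPs needed:", required_static_ips(orgs))
    print("Fits in one pool:", pool_fits(orgs))
```

The static IP figure is a floor; as noted above, more addresses are needed if inbound access into organizations is planned.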

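[Figure: A vCloud Datacenter containing the organization “ACME Corp.” and a network pool. The organization network “ACME-Private” is private internal; the organization network “ACME-Internet” is private routed and connects to the external network “Provider-Internet”.]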

Figure 10. Example Diagram of Provider Networking for a Public vCloud

4.3.3 Organization Networks
Create two different organization networks for each organization: one external organization network and one private internal organization network. You can do this as one step in the vCloud Director UI wizard by selecting the default (recommended) option when creating a new organization network. When naming an organization network, it is a best practice to start with the organization name and a hyphen, for example, ACME-Internet.

Per the Service Definition for Public Cloud, the external network will be connected as a routed connection that leverages vShield Edge for firewalling and NAT to keep traffic separated from other organizations on the same external provider network. Both the external organization network and the internal organization network will use the vCloud Director Network Isolation network pool previously established.

For both the internal network and the external network, you will need to provide a range of IP addresses and associated network information. Since both networks will be private networks, you can use RFC 1918 addresses for both static IP address pools. The Service Definition for Public Cloud limits external connections to a maximum of 8 IP addresses, so provide a range of only 8 IP addresses when creating the static IP address pool for the external network. For the private network, you can make the static IP address pool as large as desired. Typically, a full RFC 1918 class C is used for the private network IP pool.

The last step is to add external public IP addresses to the vShield Edge configuration on the external organization network. By selecting Configure Services on the external organization network, you can add 8 public IP addresses that can be used by that particular organization. These IP addresses should come from the same subnet as the network that you assigned to the system’s external network static IP pool.
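
As a planning aid, the 8-address static pool for the external organization network can be generated and checked against the RFC 1918 ranges with Python's standard `ipaddress` module. This is a sketch under stated assumptions: the starting address is a made-up example, and the pool is assumed to be contiguous.

```python
import ipaddress

# The three private address blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
EXTERNAL_POOL_SIZE = 8  # maximum per the Service Definition, as quoted in the text


def is_rfc1918(address):
    """True if the address falls inside one of the RFC 1918 private blocks."""
    return any(address in network for network in RFC1918_NETWORKS)


def static_pool(first_address, size=EXTERNAL_POOL_SIZE):
    """Return a contiguous pool of `size` private addresses starting at first_address."""
    start = ipaddress.IPv4Address(first_address)
    pool = [start + offset for offset in range(size)]
    if not all(is_rfc1918(address) for address in pool):
        raise ValueError("pool contains addresses outside RFC 1918 space")
    return pool


if __name__ == "__main__":
    pool = static_pool("192.168.100.10")  # hypothetical starting address
    print(f"{pool[0]} - {pool[-1]} ({len(pool)} addresses)")
```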

Figure 11. Configure External IPs

4.3.4 Cisco Nexus 1000V Considerations
It is important to note that vCloud Director is designed for secure multi-tenancy, and Layer 2 networks are not shared between customers for organization and vApp networks. External networks can be dedicated to customers (for example, MPLS VPNs), and vShield Edge is available to securely share networks such as a common Internet VLAN. These capabilities should be considered when determining whether the Cisco Nexus 1000V is required in a vCloud Director deployment.

Where it has been determined that the Cisco Nexus 1000V is a requirement, either operationally or technically, the recommended approach is to use the Nexus 1000V for external networks and a VMware vNetwork Distributed Switch for network pools. This approach has the advantage of providing advanced functionality where required, without limiting the flexibility of vCloud Director networking.

Using the Cisco Nexus 1000V for vCloud Director network pools is not recommended because it increases administrative overhead: the Cisco Nexus 1000V works only with vSphere port group-backed network pools and introduces scalability limits. In this model, organization and vApp networks cannot be created dynamically because port profiles need to be defined on the Cisco Nexus 1000V before being added to vCloud Director. To maintain isolation within these internal networks, each port group must be configured with a VLAN ID. Given the limit of 512 active VLANs across all virtual Ethernet modules (VEMs) managed by a Cisco Nexus 1000V, this also potentially limits the total number of networks available to network pools. This is particularly true if one virtual supervisor module (VSM) is managing VEMs in multiple vCloud resource groups. In addition, the 802.1q standard itself is limited to 4096 VLANs.

vCloud Director Network Isolation-backed network pools address these scalability limits by isolating internal vCloud Director networks using MAC-in-MAC encapsulation over a transport network. Instead of individual VLANs, vCloud Director Network Isolation network IDs are dynamically assigned to these encapsulated networks by vCloud Director.

VLAN-backed network pools are another option, in which a range of VLAN IDs is allocated to vCloud Director, which then assigns these VLANs to organization and vApp networks as required. While this option shares the same scalability limits as port group-backed pools, it reduces manual administrator setup.

VLAN and vCloud Director Network Isolation-backed network pools are only supported with the vNetwork Distributed Switch. The following diagram illustrates the recommended deployment model with the Cisco Nexus 1000V, used for external networks only.
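
To see how quickly the 512 active VLAN limit is reached when every internal network consumes its own VLAN ID, consider the following sketch. The 512 figure comes from the text; the `reserved_vlans` parameter (VLANs already used for management, external networks, and so on) is a hypothetical input added for illustration.

```python
NEXUS_1000V_ACTIVE_VLAN_LIMIT = 512  # across all VEMs managed by one Nexus 1000V (from the text)


def vlans_needed(org_count, networks_per_org):
    """One VLAN ID per port group-backed internal network."""
    return org_count * networks_per_org


def fits_port_group_pool(org_count, networks_per_org, reserved_vlans=0):
    """True if the required VLANs fit under the active VLAN limit.

    reserved_vlans is a hypothetical count of VLANs already in use elsewhere.
    """
    available = NEXUS_1000V_ACTIVE_VLAN_LIMIT - reserved_vlans
    return vlans_needed(org_count, networks_per_org) <= available


if __name__ == "__main__":
    # 25 organizations with 11 networks each, as in the public vCloud example.
    print(vlans_needed(25, 11), fits_port_group_pool(25, 11))
    # Doubling the organization count exceeds the limit.
    print(vlans_needed(50, 11), fits_port_group_pool(50, 11))
```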

[Figure: A resource group with three virtual switches, each uplinked through pNICs to core physical switching. VMware vSwitch: Mgmt port group (VLAN 10) and Service port group (VLAN 20). Nexus 1000V vDS: Org A External (VLAN 110), Org B External (VLAN 120), Org C External (VLAN 130), Org A VPN (VLAN 140), Org B VPN (VLAN 150). VMware vDS: OrgNet1 (VLAN 210), OrgNet2 (VLAN 220), OrgNet3 (VLAN 230), VCD-NI Pool A (VLAN 298), VCD-NI Pool B (VLAN 299).]

Figure 12. vCloud Director Logical Networking w/ Cisco Nexus 1000V


4.4 Establish Networking Options—Private vCloud Example
4.4.1 External Networks
In general, the networking needs of a private vCloud are comparable to those of a public vCloud, and are often a simplified subset. As such, direct connections from inside the organization to the networking backbone provided by the enterprise are all that is necessary. This is analogous to “extending a wire” from the network switch that contains the network or VLAN to be used, all the way through the cloud layers to the organization and into the vApp. One of these direct networks must be established for each network or VLAN to be used in the private vCloud.

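[Figure: An Enterprise vCloud containing the organization “Software Design” and a network pool. The organization network “Internal Network” is private internal (optional); the organization network “External Access” is private direct and connects to the external network “Corporate Backbone”.]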

Figure 13. Example Diagram of Provider Networking for a Private vCloud

An important difference between a private vCloud and a public vCloud lies in the external network and the organization external network. At least one external network is required to enable organization external networks to access resources outside of vCloud Director: the Internet for public cloud deployments, and an internal (local) network for private cloud deployments. In a private vCloud, this is a network that already exists within the address space used by the enterprise.

To establish this network, follow the wizard, filling in the network mask, default gateway, and other specifications of the LAN segment as required. When building this, specify enough address space for use as static assignments, as this is where vCloud Director draws “Public IP Pool” addresses from. A good starting range is 30 addresses that do not conflict with existing addresses in use or with ranges already committed for DHCP.

Note: The static IP pool address space is not used for DHCP, although it functions similarly. This pool will be used to provision NAT-type connectivity between the organizations and the cloud services below it.

4.4.2 Network Pools
You will need a network in the network pool for every private organization network and external organization network in the vCloud environment. The Service Definition for a Private Cloud calls for one external organization network and the ability for the organization to create private vApp networks. Since the Service Definition does not call out a minimum number of vApp networks, a good starting point is 10 networks per organization. Make your network pool as large as the number of organizations times 10.
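
The static IP pool guidance in section 4.4.1 (a starting range of 30 addresses that avoid in-use addresses and DHCP ranges) can be sketched with the standard `ipaddress` module. The subnet, DHCP range, and in-use addresses below are hypothetical inputs, and the resulting pool is not guaranteed to be contiguous.

```python
import ipaddress


def pick_static_pool(subnet, dhcp_first, dhcp_last, in_use, size=30):
    """Pick `size` host addresses from subnet, skipping DHCP and in-use addresses."""
    network = ipaddress.ip_network(subnet)
    dhcp_low = ipaddress.ip_address(dhcp_first)
    dhcp_high = ipaddress.ip_address(dhcp_last)
    taken = {ipaddress.ip_address(address) for address in in_use}

    pool = []
    for host in network.hosts():
        if dhcp_low <= host <= dhcp_high or host in taken:
            continue  # address is committed elsewhere
        pool.append(host)
        if len(pool) == size:
            return pool
    raise ValueError("not enough free addresses in the subnet")


if __name__ == "__main__":
    pool = pick_static_pool(
        "10.20.30.0/24",                 # hypothetical corporate LAN segment
        "10.20.30.100", "10.20.30.200",  # hypothetical DHCP scope
        in_use=["10.20.30.1"],           # hypothetical gateway address
    )
    print(f"{pool[0]} - {pool[-1]} ({len(pool)} addresses)")
```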


4.4.3 Organization Networks
At least one organization external network is required to connect vApps created within the organization to other vApps and/or the networking layers beyond the private vCloud. To accomplish this, create an external network in the Cloud Resources section (under Manage & Monitor of the System Administration section of the vCloud Director UI). In the wizard, be sure to select a direct connection. This external network maps to an existing vSphere network for virtual machine use, as defined in the External Networks section above.

Other networking options, such as a routed organization external network, are available and could be used, but they add complexity to the design that is normally not needed. For the purpose of this design there are no additional network requirements. For more information on adding network options, refer to the vCloud Director Administrator’s Guide.

4.4.4 Cisco Nexus 1000V Considerations
The Cisco Nexus 1000V is applicable as a switching fabric in the enterprise. The caveats expressed in the public vCloud section for the Cisco Nexus 1000V also apply to the private vCloud. The primary difference is that the networking rails used as external networks will likely be simpler than those shown for a public vCloud. For example, there is likely to be only a single “backbone” LAN to connect to, which is also the path to the Internet, whereas a public vCloud will have multiple, distinct network paths.

In summary, the Cisco Nexus 1000V is applicable to private vCloud implementations as a switching backbone for external networks only, to represent the one or more core LANs that lead to the rest of the business and/or the Internet.

4.5 Establish Organization Virtual Datacenters
An organization virtual datacenter allocates resources from a provider virtual datacenter and makes them available for use by a given organization. Multiple organization virtual datacenters can draw from the same provider virtual datacenter, and an organization can have multiple organization virtual datacenters.

Resources are taken from a provider virtual datacenter and allocated to an organization virtual datacenter using one of three resource allocation models:
• Pay as you go. Resources are only reserved and committed for vApps as vApps are created. There is no upfront reservation of resources.
• Allocation. A baseline amount (“guarantee”) of resources from the provider virtual datacenter is reserved for the organization virtual datacenter’s exclusive use. An additional percentage of resources is available to oversubscribe CPU and memory, but this taps into compute resources that are shared by other organization virtual datacenters drawing from the provider virtual datacenter.
• Reservation. All resources assigned to the organization virtual datacenter are reserved exclusively for the organization virtual datacenter’s use.

With all of the above models, the organization can be limited to deploying a certain number of virtual machines, or this can be set to unlimited.

The first organization virtual datacenter to be created should be an administration organization virtual datacenter for use by the administration organization. Its allocation model is set to “Pay as you go” so as not to take resources from other organization virtual datacenters until they are needed.

Subsequent organization virtual datacenters should be created to serve the organizations previously established. In selecting the appropriate allocation model, take into consideration the Service Definition and the organization’s workload use cases, as well as anticipated workload capacity, discussed further in the Capacity Management section.


4.5.1 Public vCloud Considerations
The organization virtual datacenter allocation model maps directly to a corresponding vCenter Chargeback billing model:
• Pay as you go. Pricing can be set per virtual machine, and a corresponding speed of a vCPU equivalent can be specified. Billing is unpredictable as it is tied directly to actual usage.
• Allocation. Consumers are allocated a baseline set of resources and can burst by tapping into additional resources as needed, but are typically charged at higher rates for exceeding baseline usage. This model results in more variable billing but allows variable workloads to be aligned more closely with their cost.
• Reservation. Consumers are allocated and billed for a fixed container of resources, regardless of usage. This model allows for predictable billing and level of service, but consumers may pay a premium if they do not consume all their allocated resources.

These allocation models also map directly to the service tiers found in the Service Definition for a Public Cloud. The Basic vDC model uses the Pay-as-you-go allocation model, since instances are only charged for the resources they consume and no commitment is required from the consumer. The Committed vDC model uses the Allocation Pool model, since the consumer is required to commit to a certain level of usage but is also allowed to exceed that usage. The Dedicated vDC model uses the Reservation Pool model, since this service tier requires dedicated and guaranteed resources for the consumer.

An option to “enable thin provision” allows provisioning virtual machines using thin disks to conserve disk usage. vSphere best practices apply in the use of thin-provisioned virtual disks.

The Service Definition for Public Cloud provides detailed and descriptive guidance on how much a provider should charge for each service tier. Chargeback functionality is provided by VMware vCenter Chargeback, which is integrated with VMware vCloud Director. Refer to the VMware vCenter Chargeback User’s Guide for information on how to customize the individual reports generated. For further information, refer to the vCloud Chargeback Models Implementation Guide, which details how to set up vCloud Director and vCenter Chargeback to accommodate instance-based pricing (pay as you go), reservation-based pricing, and allocation-based pricing.

4.5.2 Private vCloud Considerations
The organization virtual datacenter allocation model used depends on the type of workloads expected:
• Pay as you go. A transient environment where workloads are repeatedly deployed and undeployed, such as a demonstration or training environment, is suited to this model.
• Allocation. Elastic workloads that have a steady state but surge during certain periods due to special processing needs are suited to this model.
• Reservation. Since a fixed set of resources is guaranteed, infrastructure-type workloads that demand a predictable level of service run well using this model.

When an organization virtual datacenter is created in vCloud Director, vCenter Server automatically creates child resource pools with the appropriate resource reservations and limits, under the resource pool representing the provider virtual datacenter.

As part of creating an organization virtual datacenter, a storage limit must be set unless you are using the Pay as you go allocation model, which defaults to unlimited. For the purpose of this architecture there is no limit on storage consumed by the vApps, since static values are provided for individual virtual machine storage and the number of virtual machines in an organization is also limited.

An option to “enable thin provision” allows provisioning virtual machines using thin disks to conserve disk usage. vSphere best practices apply in the use of thin-provisioned virtual disks. This feature can save substantial amounts of storage and has very little performance impact on workloads in the vCloud infrastructure. It is recommended to enable this feature when creating each organization. For more information about this feature, refer to the vCloud Director Administrator’s Guide or the VMware knowledge base.
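
The relationship between public service tiers, allocation models, and the storage limit rule described in sections 4.5.1 and 4.5.2 can be summarized in a small data structure. This is only an illustrative sketch; the class and dictionary names are invented for this example and do not correspond to vCloud Director API objects.

```python
from enum import Enum


class AllocationModel(Enum):
    PAY_AS_YOU_GO = "Pay as you go"
    ALLOCATION = "Allocation"
    RESERVATION = "Reservation"


# Public cloud service tier -> allocation model, as described in the text.
TIER_TO_MODEL = {
    "Basic": AllocationModel.PAY_AS_YOU_GO,
    "Committed": AllocationModel.ALLOCATION,
    "Dedicated": AllocationModel.RESERVATION,
}


def storage_limit_required(model):
    """A storage limit must be set unless the model is Pay as you go."""
    return model is not AllocationModel.PAY_AS_YOU_GO


if __name__ == "__main__":
    for tier, model in TIER_TO_MODEL.items():
        print(f"{tier}: {model.value}, storage limit required: {storage_limit_required(model)}")
```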

In a private cloud, vCloud Request Manager can facilitate policy-based creation of organization virtual datacenters. It automates and standardizes their creation through an organization virtual datacenter template (or “blueprint”) from which new organization virtual datacenters are deployed.

4.6 Create vApp Templates and Media Catalogs
Services in a cloud environment are consumed from a catalog. Catalogs are stored in an organization virtual datacenter. The administrative organization virtual datacenter will have two catalogs:
• Internal. Used for developing and staging new vApps and media.
• Master. Published and shared to all other organization virtual datacenters.

Organizations will use the master catalog that has been published from the administrative organization virtual datacenter with the default cloud templates. In addition, organizations will have a private catalog created by the organization administrator and used for uploading new vApps or media to the individual organization.

vApp templates are used to deploy actual vApps. Guest customization can be applied during deployment and can facilitate the joining of vApps to domains. See the Deploy vApps section. There are no other configuration requirements for the catalogs or templates in this cloud architecture. Refer to the Service Definition for a full listing of recommended templates. Typically the templates can be blank virtual machines for installing operating systems, or virtual machines with applications.

4.6.1 Auto-Joining Active Directory Domains
vCloud Director supports guest customization when deploying vApps. Guest customization can be configured to automatically join an Active Directory domain for Windows guests.

4.6.2 Establish Policies
During the creation of an organization, you can set policies around the number of deployed and stored virtual machines:
• Deployed virtual machines refers to the number of running virtual machines.
• Stored virtual machines refers to the total number of virtual machines, including virtual machines that are not powered on.

You can also specify runtime policies to control vApps and vApp templates in an organization virtual datacenter. Specify the maximum length of time vApps and vApp templates can run and be stored in the organization virtual datacenters:
• The runtime lease can be set to allow vApps or vApp templates to run for a defined period of time, after which vApps will be powered off, or set to “never expire”.
• The storage lease can be set to allow vApps or vApp templates to be stored for a defined period of time, after which vApps or vApp templates will be automatically cleaned up, or set to “never expire”.

When any option for storage lease (with the exception of “never expire”) is selected, the storage will be automatically cleaned up. Additional options include:
• Permanently deleted. After the specified period of time, the vApps or vApp templates will automatically be deleted.
• Moved to expired items. This flags the vApps or vApp templates for deletion, which hides them from users so that they can no longer be used, allowing an administrator to remove them.

The Service Definition for the Public Cloud has specific requirements for the maximum number of virtual machines each organization can have based on size. Refer to the Service Definition for the Public Cloud for the maximum virtual machine count for each of the three tiers of reservation pools.
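
The lease behavior described in section 4.6.2 amounts to comparing a start time plus a lease duration against the current time. The sketch below illustrates that logic only; the function name and the use of `None` to represent “never expire” are assumptions made for this example and say nothing about how vCloud Director implements leases internally.

```python
from datetime import datetime, timedelta

NEVER_EXPIRE = None  # assumption: "never expire" is modeled as no lease duration


def lease_expired(started, lease, now):
    """True if a runtime or storage lease that began at `started` has run out."""
    if lease is NEVER_EXPIRE:
        return False
    return now >= started + lease


if __name__ == "__main__":
    deployed = datetime(2011, 1, 1, 9, 0)
    runtime_lease = timedelta(days=7)  # hypothetical policy value
    check_time = datetime(2011, 1, 9, 9, 0)

    if lease_expired(deployed, runtime_lease, check_time):
        print("Runtime lease expired: the vApp would be powered off")
    if not lease_expired(deployed, NEVER_EXPIRE, check_time):
        print("Storage lease set to never expire: nothing to clean up")
```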


4.6.3 Accessing your vCloud
Each organization should have a public URL configured to access the organization’s cloud portal using vCloud Director. These URLs have the format https://<vCD-cell-hostname>/cloud/org/<org-Name>. Each time a user of an organization logs in, they should point their browser to the organization-specific URL.

4.6.4 Deploy vApps
vApps can now be deployed from a catalog of vApp templates.

4.6.5 Employ Chargeback or Showback
In a public vCloud, chargeback is essential for accurately metering consumer usage and recouping costs to ensure profitability. In a private vCloud, IT does not necessarily have the same cost pressures as a public vCloud service provider. IT may also not have chargeback procedures or policies in place, as chargeback typically is a financial policy. An alternative to chargeback is showback, which raises awareness of consumption and cost without involving formal accounting procedures to bill the usage back to the consumer’s department.

To align consumer behavior with the actual cost of the resources being consumed, use the vCenter Chargeback reports to provide resource and financial transparency. Without showback or chargeback, consumers are not aware of the actual cost of the resources they have consumed and thus have little incentive to change their consumption patterns. Cloud computing resources can be easily spun up, and with the exception of deployment policies dictating resource leases, there are no disincentives or penalties to curb excessive use. Showback or chargeback will expose heavy or demanding users.
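
The organization URL format given in section 4.6.3 can be generated for each tenant when onboarding users. This is a minimal sketch: the host name is a placeholder, and percent-encoding the organization name is a defensive assumption rather than a documented requirement.

```python
from urllib.parse import quote


def org_portal_url(cell_hostname, org_name):
    """Build the organization-specific portal URL in the format given in the text."""
    # Assumption: the organization name is percent-encoded to keep the URL valid.
    return f"https://{cell_hostname}/cloud/org/{quote(org_name, safe='')}"


if __name__ == "__main__":
    # "vcd.example.com" and "ACME" are placeholder values.
    print(org_portal_url("vcd.example.com", "ACME"))
```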

5. Extending vCloud Capabilities
5.1 Core vCloud Components
The core vCloud components include vSphere, vCloud Director, vShield Edge, and vCenter Chargeback.
[Figure: vCloud component stack showing vCloud Request Manager, vCenter Chargeback, vCloud Director, vShield Edge, vCloud Connector, vCenter Orchestrator, and VMware vSphere, together with the vCloud API.]

Additional components to extend the vCloud capabilities are discussed further below.

5.2 vCloud Request Manager

vCloud Director provides a cloud portal through which end consumers can self-provision their own workloads. For environments where a formal provisioning process, including a request/approval mechanism, is needed, vCloud Request Manager can be used as a front end to vCloud Director. vCloud Request Manager 1.0 is largely intended for private vClouds, because it requires cloud system privileges to vCloud Director and therefore would not be ideal in a multi-tenant public cloud.

vCloud Request Manager’s primary capabilities include:
• Provisioning with approvals
• Software license tracking
• Policy-based cloud partitioning (new organization creation)

In relation to the reference architecture, this component sits in the functional stack above vCloud Director as a new user entry point/UI, without obscuring it entirely, so that vCloud Director can remain an entry point as well. vCloud Request Manager is a front end to the default vCloud Director portal, to be exposed to those users who need the policy-based approval, provisioning, and tracking functionality rather than the more freeform vCloud Director UI.

vCloud Request Manager adds a vApp request/approval cycle to vCloud Director. This provides a flexible mechanism by which vApps in catalogs can be requested, approved, and provisioned based on business needs. The included workflows can be modified as needed to match these requirements.

vCloud Request Manager adds simple software asset tracking and pairs these assets with vApps in the catalog. The quantity of available licenses of a tracked package is decremented each time a vApp with that package is provisioned, and incremented when the vApp is destroyed. This license-to-vApp relationship is managed manually inside vCloud Request Manager’s workflow engine.

Lastly, vCloud Request Manager provides easy creation tools for new private clouds, with approval cycles included. This enables the quick creation of new cloud instances (organizations) based on “blueprints” of cloud policies and parameters. One obvious application for this technology is adding lifecycle management to vApps provisioned in the private clouds, a solution formerly covered by vCenter Lifecycle Manager.

vCloud Request Manager also provides a simplified interface for requesting resources from multiple cloud installations, and can be explored as an option if you require a single interface for all of your cloud installations. It does not, however, provide the full richness of vApp management in a cloud environment. Several third-party solutions also exist that can provide one interface to multiple cloud installations.

Refer to the vCloud Request Manager Installation and Configuration Guide for specific installation requirements. vCloud Request Manager runs in its own virtual machine, separate from vCloud Director.

Note that several objects in vCloud Request Manager map to corresponding vCloud Director objects but do not use consistent terms:
• organization in vCloud Request Manager refers to a vCloud Director system instance
• location in vCloud Request Manager refers to an organization in vCloud Director
• cloud in vCloud Request Manager refers to an organization vDC in vCloud Director
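
The license tracking behavior described above (decrement on provisioning, increment when the vApp is destroyed) and the term mapping can be modeled as follows. This is a conceptual sketch only; the class, method, and dictionary names are hypothetical and are not part of vCloud Request Manager.

```python
# vCloud Request Manager term -> vCloud Director term, as listed in the text.
REQUEST_MANAGER_TO_VCD_TERMS = {
    "organization": "vCloud Director system instance",
    "location": "organization",
    "cloud": "organization vDC",
}


class LicensePool:
    """Tracks available licenses for one software package (hypothetical model)."""

    def __init__(self, package, total):
        if total < 0:
            raise ValueError("total licenses cannot be negative")
        self.package = package
        self.total = total
        self.available = total

    def provision_vapp(self):
        """Consume one license when a vApp containing the package is provisioned."""
        if self.available == 0:
            raise RuntimeError(f"no licenses left for {self.package}")
        self.available -= 1

    def destroy_vapp(self):
        """Return one license when a vApp containing the package is destroyed."""
        if self.available == self.total:
            raise RuntimeError("no provisioned vApp is holding a license")
        self.available += 1


if __name__ == "__main__":
    pool = LicensePool("example-database", total=2)  # hypothetical package name
    pool.provision_vapp()
    pool.provision_vapp()
    pool.destroy_vapp()
    print(pool.package, "licenses available:", pool.available)
```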


5.3 vCloud API
[Figure: vCloud component stack showing vCloud Request Manager, vCenter Chargeback, vCloud Director, vShield Edge, vCloud Connector, vCenter Orchestrator, and VMware vSphere, together with the vCloud API]

The vCloud API allows programs to interact with a vCloud, and can be used to communicate with vCloud Director through a UI other than the portal included with vCloud Director. For example, vCloud Request Manager communicates with vCloud Director using the vCloud API.

The vCloud API is the cornerstone of federation and ecosystem support in a vCloud environment. All of the current federation tools talk to the vCloud environment through the vCloud API. The ISV ecosystem also uses the vCloud API to allow their software to talk to vCloud environments. It is therefore very important that a vCloud environment exposes the vCloud API to the cloud consumer.

Currently, VMware vCloud Director is the only software package that exposes the vCloud API. In some environments, vCloud Director is deployed behind a portal or in another location not readily accessible to the cloud consumer. In this case, an API proxy or relay is needed in order to expose the vCloud API to the end consumer.

Because of the value of the vCloud API, some providers may wish to meter API usage and charge customers extra for it. Protecting the vCloud API through audit trails and API inspection is also a good idea. Lastly, there are several cases where cloud providers may wish to extend the vCloud API with new features.

To support several of the vCloud API use cases discussed above, the cloud provider may wish to implement an API proxy. The vCloud API is a REST-based service that uses XML payloads, so any suitable XML gateway can be used to proxy it. Several third-party solutions on the market provide XML gateway services. VMware has begun to partner with some of these vendors to develop joint guidance on how to deploy their solutions in a vCloud Director environment. For the latest information on these efforts and collateral, please contact your local VMware vCloud specialist.
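Because the vCloud API is REST-based with XML payloads, a client or proxy only needs an HTTP library and an XML parser. The following minimal sketch builds a login request without sending it and extracts organization links from a response body. The login path, the user@org credential form, and the shape of the sample payload are assumptions for illustration; confirm them against the vCloud API documentation for the release in use.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: a version 1.0 style login path; verify for your release.
LOGIN_PATH = "/api/v1.0/login"


def build_login_request(base_url: str, user: str, org: str, password: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that authenticates as user@org."""
    credentials = f"{user}@{org}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + LOGIN_PATH,
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def list_org_links(xml_payload: str) -> dict[str, str]:
    """Return {name: href} for every Org element in an XML response body."""
    root = ET.fromstring(xml_payload)
    orgs = {}
    for element in root.iter():
        # Strip any XML namespace so the sketch does not depend on one.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "Org":
            orgs[element.get("name")] = element.get("href")
    return orgs
```

An API proxy that meters or audits usage would sit between the consumer and vCloud Director and inspect requests and responses of this kind.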

5.4 vCenter Orchestrator

vCenter Orchestrator (vCO) is a system for assembling operational workflows. The primary benefit of vCenter Orchestrator is the ability to coordinate multiple systems to achieve a composite operation that would otherwise take several individual operations on different systems. It is not meant to replace the APIs that it calls; rather, it is meant to bring them together when necessary. In general, if an operation uses only one underlying system, direct access to that system should be considered for efficiency and reduced complexity. In the vCloud use case, orchestration can be used to automate highly repetitive tasks to avoid manual work and potential errors.

vCenter Orchestrator has a plug-in framework. There are plug-ins for VMware products such as vCenter Server, vCloud Director, and vCenter Chargeback. Thus, vCenter Orchestrator can orchestrate workflows at the VIM API, VIX API, vCloud API, and Chargeback API levels.


There are three main categories of use cases that vCenter Orchestrator can help satisfy:
• Cloud administration operations
• Organization administration operations
• Organization consumer operations

5.4.1 Cloud Administration Orchestration Examples
The following example use cases highlight the value of vCenter Orchestrator to the cloud owner. These use cases focus primarily on infrastructure management and control of the resource allocation process.

First, consider the case of a provider who wants to bring a new customer into their vCloud. The major steps would be to create a new organization, users (possibly imported from Active Directory), networks, virtual datacenters, and catalogs. The provider may also want to set up a recurring chargeback report so the tenant can be billed, and possibly send an email notification to the tenant advising them that their new cloud environment is ready.

Another example is a tenant request for additional external network capacity. In this case, the provider may want to automate the creation of the network, which would include name generation, identification and allocation of an available VLAN and IP address range, configuration of the network switch and cloud perimeter firewall, creation of the external network in vCenter, and finally allocation of the external network to the tenant’s organization.

5.4.2 Organization Administration Orchestration Examples
Operational tasks within the tenant’s organization can benefit from automation as well. These are typically vApp and virtual machine lifecycle management tasks, including creation, configuration, routine maintenance, and decommissioning.

Consider the case of virtual machine creation in an environment that uses Active Directory to identify services such as authentication and printing. After deployment, the virtual machine must join the Active Directory domain. In many cases it is preferable to use an organizational unit (OU) other than the default Computers container. vCenter Orchestrator could be used to create the virtual machine’s computer account in the proper OU prior to virtual machine deployment, ensuring that the computer account name is unique and resides in the proper OU. Similarly, when the virtual machine is decommissioned, the entry in the OU can be removed as part of the same workflow.

Another example is the case where an organization administrator would like to manage recurring updates to a software package or configuration element across several virtual machines in a single operation. In this case, a workflow could be created that accepts a list of systems and a source for the software or configuration as parameters, and carries out the activity on each system.

5.4.3 Cloud Consumer Operation Orchestration Examples
These operations generally fall into the category of tasks that the organization administrator wants to offload as a self-service operation. Performing the operation as a vCenter Orchestrator workflow provides an easy way to expose the operation to a customer via the built-in web portal or via a customized portal that leverages the web-services API. Many of the operations in this category can be satisfied directly via the vCloud Director UI; however, some candidates affect multiple systems or fit better into a customer portal, and may be better implemented as an orchestration workflow.

None of this is directly exposed to the cloud consumer, which complicates self-service: the workflows must be initiated by the cloud provider using the vCenter Orchestrator Client, unless the provider creates a portal to front-end vCenter Orchestrator.

Some examples of these types of use cases include:
• Resetting system or user account passwords on virtual machines using the VIX plug-in
• Putting a load balanced service into maintenance mode by stopping the service, removing it from the load balancing pool, and disabling monitors
• Loading certificates into virtual machines
• Deploying instances of custom applications from the organization’s catalog

vCenter Orchestrator can be used to create custom workflows at the vCloud API and VIM levels. vCloud Request Manager is an alternative to vCenter Orchestrator that has built-in workflow functionality and integrates with vCloud Director through the vCloud API.
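The tenant onboarding example from the cloud administration use cases can be pictured as an ordered list of steps run against shared state. This is a minimal sketch in Python rather than a vCenter Orchestrator workflow; the step names and helper functions are hypothetical, and each placeholder step only records its name where a real workflow would call the relevant system.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OnboardingContext:
    """State shared by the steps of one tenant onboarding run."""
    tenant: str
    completed: list[str] = field(default_factory=list)


Step = Callable[[OnboardingContext], None]


def make_step(name: str) -> Step:
    """Return a placeholder step that only records its own name."""
    def step(ctx: OnboardingContext) -> None:
        ctx.completed.append(name)
    step.__name__ = name
    return step


# Step order follows the onboarding example in the text.
ONBOARDING_STEPS: list[Step] = [
    make_step("create_organization"),
    make_step("create_users"),
    make_step("create_networks"),
    make_step("create_virtual_datacenters"),
    make_step("create_catalogs"),
    make_step("schedule_chargeback_report"),
    make_step("send_ready_notification"),
]


def run_workflow(ctx: OnboardingContext, steps: list[Step]) -> OnboardingContext:
    """Run steps in order and stop at the first failure, naming the step."""
    for step in steps:
        try:
            step(ctx)
        except Exception as exc:
            raise RuntimeError(f"{step.__name__} failed for {ctx.tenant}") from exc
    return ctx
```

Keeping the steps in a list makes the order explicit and lets a failure report exactly which system call did not complete.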


For additional information on vCO installation, configuration, and workflow solution development, refer to the vCenter Orchestrator v4.1 documentation set at http://www.vmware.com/support/pubs/orchestrator_pubs.html.

5.5 vCloud Connector

As more clouds are stood up, several clouds from different sites within a private enterprise can form a larger cloud, or a private and a public cloud can form a hybrid cloud. Cloud consumers need a way to migrate workloads in a federated cloud. vCloud Connector (vCC) solves this problem by allowing you to perform migrations across all of your public and private clouds and to obtain a consistent view of them from a single interface.

vCloud Connector must be installed by cloud administrators, but can be used by administrators and end users alike to view and manage workloads. Once vCloud Connector has been deployed to a vSphere host and registered with a vCenter Server, end users can access “vCloud Connector” under “Solutions and Applications” in the vSphere Client connected to the vCenter Server where the OVF was deployed.

5.5.1 vCloud Connector Placement
There are two considerations when deciding where to place your vCloud Connector appliance:
• The virtual appliance must be deployed to a vCenter Server that the target users can access via the vSphere Client. The only user access is via the vSphere Client, so users of vCloud Connector must have the right to log in to this vCenter Server.
• Workload copy operations use the vCloud Connector appliance as a middleman, so network latency and bandwidth between clouds need to be considered. In some cases it may be preferable to run multiple instances of vCloud Connector across multiple vCenter Servers to avoid network latency or excessive bandwidth consumption.

[Figure 14 diagram labels: Local vCloud or vSphere; Remote vCloud; vCloud Connector UI in the vSphere Client; vCloud Connector Appliance; vCloud Director X and vCloud Director Y (vCloud API, REST/HTTP(S)); vCenter Server A and vCenter Server B (VIM API, SOAP/HTTP(S)); ESXi Host]

Figure 14. vCloud Connector Architecture


5.5.2 vCloud Connector Example Usage Scenarios
vCloud Connector can support a number of workload migration use cases. The following examples assume migration of a vApp consisting of one or more virtual machines:
• Copying a vApp from vSphere to a vCloud
• Copying a vApp from a private vCloud to a public vCloud
• Copying a vApp from one vCenter Server to another. Even in environments not running vCloud Director, vCloud Connector can still be used to copy and move vApps. As long as both vCenter Servers are added as clouds in vCloud Connector, you can freely move workloads between them.

5.5.3 vCloud Connector Limitations
The use of vCloud Connector to copy and migrate vApps is subject to the following limitations:
• Currently there is no way to have predefined clouds appear in vCloud Connector. Each user must manually add all clouds that they intend to access. No clouds are defined by default, so that users can add only the clouds they care to see.
• Traffic to and from the vCloud Connector appliance is not WAN optimized, so it is not ideal to migrate workloads over WAN links even if sufficient bandwidth exists. To avoid this scenario, install vCloud Connector appliances in locations that minimize the need to traverse WAN links. There is currently no way to limit which clouds can be added to a vCloud Connector instance, so you must instruct your users to use only the proper vCloud Connector instance for their needs.
• All workloads being transferred are copied to a staging area on the vCloud Connector appliance before being copied to the destination cloud. This area is 20GB by default, which means that the disk space of the largest virtual machine that can be copied must be smaller than 20GB. The ability to easily resize this staging area will be available in an upcoming update.
• vCloud Connector is designed to give you a consistent view of your workloads across multiple clouds and to migrate those workloads. For other management tasks, you still need to use the vSphere Client and/or log in to vCloud Director.
• vApps and virtual machines must be powered off for migration; that is, the workloads must be offline. Hot migrations are currently not available. The vApp networking configuration also needs to be modified before the virtual machines are powered on.
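Two of the limitations above (the staging area size and the powered-off requirement) lend themselves to a pre-flight check before a copy is attempted. The following is a minimal sketch with hypothetical names; it does not call vCloud Connector, and it interprets the default 20GB staging area as 20 GiB.

```python
from dataclasses import dataclass

# Default staging area size quoted in the text; treated here as 20 GiB.
STAGING_AREA_BYTES = 20 * 1024**3


@dataclass
class VmInfo:
    """Facts about one virtual machine, gathered by the caller."""
    name: str
    disk_bytes: int
    powered_on: bool


def copy_blockers(vms: list[VmInfo], staging_bytes: int = STAGING_AREA_BYTES) -> list[str]:
    """Return human-readable reasons why the vApp cannot be copied yet."""
    problems = []
    for vm in vms:
        if vm.powered_on:
            problems.append(f"{vm.name}: must be powered off before migration")
        # The text requires the VM to be strictly smaller than the staging area.
        if vm.disk_bytes >= staging_bytes:
            problems.append(f"{vm.name}: disk usage does not fit in the staging area")
    return problems
```

An empty result means that neither of these two limitations blocks the copy; the other limitations, such as WAN links and network reconfiguration, still need to be reviewed separately.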


6. Managing the vCloud
6.1 Monitoring
Monitoring the components of a vCloud Director implementation is essential to the health of a vCloud environment and to meeting customer expectations. This section provides recommendations on which systems and associated objects to monitor, and which readily available tools can be used to extract health-related metrics. Specific limits and thresholds are not identified here, as they can be found in the detailed product documentation. This document does not provide specifics on setting up a monitoring solution, because service providers and enterprises may already have very different monitoring solutions in place that need to be integrated.

6.1.1 Management Cluster
Monitoring the management cluster components follows the same best practices as monitoring vSphere. As part of this, a centralized monitoring tool such as Hyperic HQ Enterprise can be used to monitor the core objects (Oracle Server, SQL Server, Active Directory Server, DNS Server, Red Hat Enterprise Linux Server, Windows Server) that are needed to run a vCloud environment.

SNMP and SMASH are not supported for monitoring vCloud Director cells; however, SNMP can be integrated from vCenter. Alternatively, cells can be monitored through integration with a third-party monitoring platform via JMX Beans. Note that JMX Beans monitoring is only the start. The vCloud and vSphere APIs provide a significant amount of detail that can be used for health and capacity management.

6.1.2 Cloud Consumer Resources and Workloads
Monitoring the cloud consumer resources and workloads also follows best practices for monitoring vSphere. However, the following additional cloud-specific considerations apply.

vShield Edge
vShield Edge appliances are self-contained environments that are stateless in nature. There is a “health check” API call you can make to a vShield Edge appliance to determine whether it is functioning correctly. If the API returns a negative result, you should initiate a reboot of the vShield Edge device. At the time of reboot, configuration information is updated from the vShield Manager and the vShield Edge device continues to function properly.

Cloud Consumer Workloads
It may be desirable to monitor workloads provisioned by cloud consumers. vCloud Director does not provide any built-in monitoring of workloads for availability or performance. Several third-party solutions are available to monitor vSphere resources and workloads running on vSphere; however, not all of these solutions work reliably when vCloud Director is in use. Isolated networking in vApps may prevent monitoring tools from acquiring performance or availability information for a vApp. Furthermore, vApps may be provisioned, de-provisioned, or power cycled at any time by a cloud consumer, and these actions may create false positives in the monitoring environment. Until solutions that are fully integrated with vCloud Director are on the market, it may be difficult to provide detailed monitoring for cloud consumer workloads.

For detailed guidance on how to monitor a vCloud environment, refer to the Monitoring a VMware vCloud technical white paper.
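The vShield Edge guidance above (run the health check, and reboot on a negative result) amounts to a simple remediation loop. The following minimal sketch assumes two caller-supplied callables, is_healthy and reboot, which are hypothetical stand-ins for whatever API client the environment uses; no real vShield API call is shown.

```python
from typing import Callable, Iterable


def remediate_edges(
    edge_ids: Iterable[str],
    is_healthy: Callable[[str], bool],
    reboot: Callable[[str], None],
) -> list[str]:
    """Reboot every edge whose health check fails; return the rebooted ids.

    is_healthy and reboot are hypothetical callables supplied by the caller.
    """
    rebooted = []
    for edge_id in edge_ids:
        if not is_healthy(edge_id):
            reboot(edge_id)
            rebooted.append(edge_id)
    return rebooted
```

Returning the list of rebooted devices gives the monitoring system something to log or alert on, which helps separate deliberate remediation from unexplained restarts.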

6.2 Logging
This section describes logging architecture concerns and logging as a service. Logging requirements and considerations for a public vCloud can be extensive. The requirements and considerations for a private vCloud are typically less rigorous. For a private vCloud, refer to enterprise-specific requirements and guidelines.

Logs should be available in a vCloud for numerous reasons, including the following:
• Regulatory Compliance. Collect logs to make them available for analysis, security review, and compliance requirements, as described in the Appendix: Security Considerations. Individual logs can then be used to satisfy specific compliance controls; for example, a user access log can be used to verify that only authorized users access a resource.


• Customer Requirements. End customers (tenants) can retrieve logs that pertain to their environment in order to satisfy their own requirements, many of which, such as compliance, will likely be similar to provider requirements.
• Operational Integrity. Operational alerts should be defined where specific logs trigger notifications for further remediation. This will typically be a backup alert, secondary to monitoring.
• Troubleshooting. Closely related to operational integrity, troubleshooting can be done with logs. For example, vShield Edge logs can show whether a specific external connection request is being passed through the firewall or NAT’ed by the firewall.

6.2.1 Logging Architectural Considerations

Redundancy
Many components rely on syslog for logging events. Syslog is a UDP-based protocol that lacks delivery guarantees. To help ensure delivery:
• Verify that infrastructure components have physically and logically redundant network interfaces.
• Send logs to two syslog targets.
• Where only a single syslog target is possible, it is recommended to log to a local syslog daemon configured to retransmit the logs to two remote syslog targets. For example, vCloud Director 1.0 only supports a single syslog target for its activity logs.
• Where possible, place log receivers on DRS so vCenter will restart them in the case of failure.

Scalability
vCloud infrastructure components generate a relatively low volume of logs for provider infrastructure. Customer components, especially vShield Edge firewalls, can generate a very high volume of logs. For log collection, IOPS performance is critical. CPU performance is negligible for collection, but matters for analysis. It is highly recommended that logs be collected to dedicated log partitions on collection servers.

Reporting
Logs need to be available to tenants. Tenants should be able to download, in raw format, all vCloud Director and vShield Edge (VSE) logs pertaining to their organizations and networks. Logs with customer identifiers should be flagged or indexed for retrieval.
• Customer activity in vCloud Director generates logs that are flagged with the customer’s organization ID.
• vShield Edge devices do not have unique identifiers in vCloud Director 1.0. Therefore, VMware recommends that you keep NAT-routed organization external networks and fenced vApps connected only to single-tenant provider networks. When the vShield Edge device is deployed by vCloud Director and its external IP address is allocated, the tenant can then be identified by the IP address.
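The redundancy recommendation to send logs to two syslog targets can be illustrated with Python's standard logging module. This is a minimal sketch of the idea from an application's point of view, not a description of how vCloud Director or a local syslog daemon is configured; the collector addresses are placeholders.

```python
import logging
import logging.handlers

# Hypothetical collector addresses; replace with real log collection nodes.
SYSLOG_TARGETS = [("127.0.0.1", 514), ("127.0.0.1", 1514)]


def build_dual_syslog_logger(name: str, targets=SYSLOG_TARGETS) -> logging.Logger:
    """Return a logger that sends each record to every syslog target over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    for host, port in targets:
        # SysLogHandler defaults to UDP datagrams, so delivery is best effort.
        handler = logging.handlers.SysLogHandler(address=(host, port))
        handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

Each record is sent once per handler, so the loss of a single datagram or a single collector does not lose the event as long as the other target receives it.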


The following is an example logging architecture.

[Figure 15 diagram labels: vCloud Management Cluster (vCO, vCD, vSM, vCenter, CBM); vCloud Resource Group – Customer Resource Pools (vSE, vCenter, ESXi hosts, Org vDC, Customer vApps, LaaS); Log Collection Nodes; DB Collection/Agents - HA; Web - Report/DL; Log Reporting/Analysis Tier - Virtual/Dedicated; Customer Access – Downloads, Customer Archiving]

Figure 15. Architectural Example Drawing

6.2.2 Logging as a Service
Logging as a service can be done in two directions:
• With customer collection and forwarding to provider servers for analysis and reporting.
• With customer collection, reporting, and analysis in the customer environment, and provider logs forwarded to the customer environment.

Customer Collection Forwarding to Provider
Pros:
• Logs can be sent directly to the collector, even on customer private IP space.
• Resources can be allocated at the customer level for collection, allowing more granular scaling of collection.
Cons:
• More difficult to scale analysis; challenges correlating customer activity to storage consumption.
• Collection node(s) still required even though utilization will be low.
• Most of the resource consumption is on the storage and analysis side, so the resources billed via the IaaS model will be minimal.

Customer Collection, with Provider Logs Forwarded Into Customer Environment
Pros:
• Distributed analysis relies on general cloud resources and can scale.
• Customer can employ their own analysis tools to organize and report on the data, or use a provider-supported package or appliance.
Cons:
• Provider needs a duplicate copy of infrastructure logs for provider purposes.
• Transmission of logs to the customer environment requires connectivity (either the Internet or a provider service network) and inbound traffic through a firewall into the customer environment, which adds risk.


6.3 End-to-End Security Considerations with vCloud
Security in a vCloud can be considered in three areas: the vCloud environment, user access, and network-level workloads.

6.3.1 vCloud Environment Security
While vCloud Director is designed for secure multi-tenancy, to reduce the impact of organizations on each other, additional steps can be taken to harden the environment. This is especially important in a service provider environment where multiple organizations will be connected directly to the Internet. Refer to the VMware vCloud Director Security Hardening Guide for detailed information on hardening VMware vCloud Director. Refer also to the Appendix on Signed Certificates for guidance on setting up signed certificates for vCloud Director.

6.3.2 User Access Security
Authentication and authorization mechanisms built into vCloud Director provide user security for vCloud resources. Integration with a directory service (LDAPv3), such as Active Directory, OpenLDAP, or Kerberos v5, can be configured. Refer to the VMware vCloud Director Administrator’s Guide for more information on how to set up the naming services (LDAPv3), Active Directory, or OpenLDAP and Kerberos v5 integration. It is up to the service provider to configure a directory service for vCloud Director. This is also true for a private vCloud.

User access and privileges within vCloud Director are controlled through role-based access control (RBAC). Refer to the VMware vCloud Director Administrator’s Guide for additional information on permissions, roles, and default settings.

6.3.3 Securing Workloads at the Network Level
Workload Security
Network visibility (external or internal to an organization or vApp) and connection types (direct or NAT routed) protect network-level workloads in a vCloud environment. vCloud Director automatically deploys vShield Edge devices to facilitate routed network connections. vShield Edge uses MAC encapsulation for NAT routing. This helps prevent Layer 2 network information from being seen by other organizations in the environment. vShield also provides a firewall service that can be configured to block inbound traffic to virtual machines connected to a public access organization network.

The Service Definition for Public Cloud specifies how service providers should set network options for security. Each organization network is connected to the shared public network through a routed connection. A maximum of 8 public IP addresses should be allowed inbound access to an organization’s virtual machines to meet the requirements of the Service Definition. The organization administrator is the user responsible for making this configuration change.

Once a vApp is created and virtual machines are added to it and connected to the public access organization network, the vApp obtains a private IP address from the static IP pool previously established. The organization administrator can then configure the firewall and the NAT external IP map for the newly created virtual machine, as well as the private IP address, using the network configure services wizard shown below.


Figure 16. Configure Firewall Services

Private vCloud network routing and firewall requirements depend on security policies, organizational requirements, and workloads of the enterprise. For further details, see the Appendix on Security.

6.4 Workload Availability Considerations
vCloud Director works transparently with vCenter Server to provision and deploy virtual machines on hosts. Therefore, it is imperative to architect redundancy and protect the infrastructure components. Provisioned virtual machines can be protected by VMware HA. Virtual machines can also be protected using backup tools within the guest OS or applications based on the vStorage APIs for Data Protection (VADP). See the Backup and Restore section for further information. At this time, virtual machines provisioned by vCloud Director cannot be protected by VMware FT or vCenter Site Recovery Manager. See the Disaster Recovery section for further information.

6.4.1 Uptime SLAs at 99.99%
VMware’s current Service Definition for a Public Cloud specifies a 99.9% uptime SLA. This may be sufficient for noncritical applications or applications that are inherently highly available. Private and public cloud providers have found that the market for an enterprise-ready cloud expects uptime SLAs of 99.99% (less than 5 minutes of downtime per month) or even greater. For vCloud verification, this means that:
• End customer workloads are running.
• End customer workloads are accessible (via the vCloud Portal and API, as well as through remote access protocols).
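The downtime budget implied by an uptime percentage is simple arithmetic. The following minimal sketch assumes a 30-day month; the function name is illustrative.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in a period while still meeting the SLA."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    period_minutes = days * 24 * 60
    return period_minutes * (100 - sla_percent) / 100


for sla in (99.9, 99.99):
    print(f"{sla}% over 30 days: {allowed_downtime_minutes(sla):.2f} minutes")
```

Under the 30-day assumption, 99.9% allows about 43.2 minutes of downtime per month and 99.99% allows about 4.3 minutes, which is why moving from three nines to four nines leaves little room for manual recovery.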


To address a 99.99% uptime SLA, VMware can only control the resiliency of its vCloud platform components and provide recommendations to mitigate single points of failure (SPOF) in the underlying infrastructure. A provider can eliminate SPOFs by ensuring redundancy, as listed below:
• Redundant power sourced from multiple feeds, with multiple whips to racks, as well as sufficient backup battery and generator capacity.
• Redundant network components.
• Redundant storage components.
  – Storage design needs to be able to handle the I/O load as well. Customer workloads may not be accessible under high disk latency, file locks, and so forth.
  – Storage design should also be tied to business continuity and disaster recovery efforts, possibly including array-level backups.
• Redundant server components (multiple independent power supplies, network interface cards (NICs), and, if appropriate, host bus adapters (HBAs)).
• Sufficient compute resources for a minimum of N+1 redundancy within a vSphere HA cluster, including sufficient capacity for timely recovery.
• Redundant databases and management.

Appropriate change, incident, problem, and capacity management processes must also be well defined and enforced to make sure that poor operational processes do not result in unnecessary downtime. In addition to a redundant infrastructure, employees or contractors responsible for operating and maintaining the environment and the supporting infrastructure must be adequately trained and skilled.

A vCloud is capable of supporting an uptime SLA of 99.99% by following the guidelines in the table in the Appendix on vCloud Availability. The availability recommendations in the table allow a vCloud to achieve a 99.99% uptime SLA, but only with no SPOFs in the underlying infrastructure, the required skills available, and suitable processes defined and followed.

6.4.2 Load Balancing of vCloud Director Cells
vCloud Director cells are stateless front-end processors for the vCloud. Each cell has a variety of purposes and self-manages various functions among cells, while connecting to a central database. The cell manages connectivity to the cloud and provides both API and UI endpoints for clients.

Multiple cells (a load balanced group) should be used to address availability and scale. This is typically achieved by load balancing or content switching this front-end layer. Load balancers present a consistent address for services regardless of the underlying node responding. They can spread session load across cells, monitor cell health, and add or remove cells from the active service pool. The group should not be considered a true cluster, since there is no failover from one cell to another.

In general, any load balancer that supports SSL session persistence and has network connectivity to the “public” facing Internet or internal service network can perform load balancing of vCloud Director cells. General concerns around performance, security, manageability, and so forth should be taken into account when deciding whether to share or dedicate load balancing resources. For the purposes of this reference architecture, the load balancer is assumed to be a dedicated virtual machine or hardware device.

For additional information on load balancing, see the Appendix on vCloud Availability, and the section on load balancers.
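The load balancer behavior described above (spread sessions across cells, keep a session on the same cell, and drop unhealthy cells from the pool) can be sketched as follows. This is a toy model with hypothetical names; a real load balancer would use SSL session persistence rather than the hash-based mapping used here as a stand-in.

```python
import hashlib


class CellPool:
    """Tracks cell health and maps each session to one healthy cell."""

    def __init__(self, cells: list[str]) -> None:
        self._healthy = {cell: True for cell in cells}

    def set_health(self, cell: str, healthy: bool) -> None:
        """Record the result of a health probe for one cell."""
        if cell not in self._healthy:
            raise KeyError(cell)
        self._healthy[cell] = healthy

    def pick(self, session_id: str) -> str:
        """Return the cell for this session, skipping unhealthy cells."""
        active = sorted(c for c, ok in self._healthy.items() if ok)
        if not active:
            raise RuntimeError("no healthy cells available")
        # Same session id and same active pool always yield the same cell.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        return active[int.from_bytes(digest[:8], "big") % len(active)]
```

When a cell is marked unhealthy, its sessions are redirected to a remaining cell and must be re-established there, which matches the note that the group is not a true cluster with session failover. In this simple scheme some sessions on healthy cells may also be remapped when the pool changes.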


6.4.3 I/O Considerations
vCloud Director offers controls for new organizations to guard against the misuse of resources by other organizations. These include:
• Quotas for running and stored virtual machines. Determines how many virtual machines each user in the organization can store and power on in the organization’s virtual datacenters. The quotas you specify act as the default for all new users added to the organization.
• Limits for resource-intensive operations. Prevents resource-intensive operations from affecting all the users in an organization and also provides a defense against denial-of-service attacks.
• Simultaneous VMware Remote Console (VMRC) connections. Limits the number of simultaneous connections for performance or security reasons.

NOTE: VMware currently does not recommend the use of SIOC/NIOC in vSphere beneath the cloud abstraction layer. vSphere contains options for storage I/O control (SIOC) and network I/O control (NIOC), but these functions are not integrated into vCloud Director and their use in a vCloud could cause unpredictable results.

6.4.4 Disaster Recovery
Disaster Recovery (DR) focuses on the recovery of systems and infrastructure after an incident that interrupts normal operations. A disaster can be defined as partial or complete unavailability of resources and services, including software, the virtualization layer, the cloud layer, and the workloads running in the resource groups. There are different approaches and technologies supported, but there are at least two areas that require disaster recovery: the management cluster and consumer resources. Consumer resources are described later in this document.

Management Cluster Disaster Recovery
Good practices at the infrastructure level lead to easier disaster recovery of the management cluster. This includes technologies such as HA and DRS for reactive and proactive protection at the primary site. vCenter Heartbeat can also be used to protect vCenter Server specifically at the primary site. For multi-site protection, vCenter Site Recovery Manager is VMware’s solution for protecting virtual machines, and it works normally for this use case because the management virtual machines are not part of any cloud instance; rather, they run the cloud instances.

Cloud Consumer Resources Disaster Recovery
This section focuses on disaster recovery of the cloud infrastructure to handle failover to an alternate site. vCenter Site Recovery Manager is not supported, although there are manual steps that can be applied as long as vApp metadata is saved, configuration information is matched between the primary site and the recovery site, and the documented steps are validated.

While Site Recovery Manager is vCenter Server-aware, vCloud Director is not Site Recovery Manager-aware. Without collaboration between vCloud Director and Site Recovery Manager, the mechanisms that synchronize virtual machines cannot keep vCloud Director in sync as well, and as a result the recovery of vCloud Director can be problematic.

It is possible to architect a solution in which one site’s total environment (100% of the operational parameters of that site, including IP addressing, start-up order of dependent systems, and the like) is duplicated to another site for recovery. However, this would be very difficult to implement and maintain, and is therefore out of scope for this document. VMware is actively working on a streamlined solution to this situation, to be addressed by future product enhancements.

6.4.5 Backup and Restore of vApps
This section focuses on backup and restore procedures for the vApps that are deployed into the cloud. Traditional backup tools do not capture the required metadata associated with a vApp, including owner, network, and organization. This results in recovery and restoration issues. Without this data, recovery must include manual steps, and configuration attributes must be manually re-entered.


Within a vCloud environment, a vApp can be a single virtual machine or a group of virtual machines, treated as one object. Backup of vApps on isolated networks must be supported. Identifying the inventories of individual organizations is challenging with current methods that enumerate the backup items using vSphere, because vSphere uses UUIDs to differentiate objects whereas vCloud Director uses object IDs.

For backing up and restoring vApps, VMware recommends the use of vStorage API Data Protection (VADP) based backup technologies. This technology has no agents on guest operating systems, is centralized for improved manageability, and has a reduced dependency on backup windows. Guest-based backup solutions may not work in a vCloud because not all virtual machines are accessible by network. Also, virtual machines may have identical IP addresses, which can cause problems. Therefore, backups of vCloud vApps require a virtual machine-level approach.

When deploying virtual machines (as part of a vApp), use the full name and computer name fields to specify realistic names that help describe the virtual machines. If this is not done, the generic information in these fields can make it hard to identify individual virtual machines. vApps and virtual machines that are provisioned by vCloud Director have a large GUID template_name, which means many virtual machines could appear to be very similar and make it hard for a user or administrator to ask for a specific virtual machine to be restored.

VMware Solutions
VMware Data Recovery is a vStorage API Data Protection based solution from VMware. Other vStorage API Data Protection based backup technologies are available from third-party backup vendors. Currently, due to the UUID versus object ID issue discussed above, VMware Data Recovery cannot be used with VMware vCloud Director.

Backup of vCloud workloads has a few requirements to address. VMware recommends that clients validate the level of support provided by the vendor to make sure client requirements are supported. Here is a list of cloud vApp requirements to ask your vendor about:
vApp Requirement: Detail

vStorage API Data Protection (VADP) integration
□ vStorage API Data Protection provides change-block tracking capability to reduce backup windows.
□ Integration to enable backup of isolated VMs and vApps.
□ Integration with vStorage API Data Protection to provide LAN-free and server-free backups to support better consolidation ratios for vCloud and the underlying vSphere infrastructure.
□ Use of the virtual machine UUID versus virtual machine name will support multi-tenancy and avoid potential name space conflicts.

vCloud Director integration
□ Interface support for cloud provider administrator teams. In the future, consumer (organization administrator and users) access will potentially be provided by some vendors.
□ Include vCloud metadata for the vApps. This includes temporary and permanent metadata per VM/vApp. This is required to make sure that recovery of the VM/vApp will have all data required to support resource requirements and SLAs.

vApp requirements
• Provide vApp granularity for backups.
• Support backup of multitiered vApps (for example, a Microsoft Exchange vApp that has multiple virtual machines included. Backup selection of the Exchange vApp would pick up all the underlying virtual machines that are part of the main vApp). This is something that is not available today, but is being developed by vendors.

Table 6. vCloud vApp Requirements Checklist


Challenges
The following is a list of backup and restore challenges:
• vApp naming posing conflict issues between tenants
• vApp metadata required for recovery
• Multi-object vApp backup (protection groups for multi-tiered vApps)
• Manual recovery steps in the cloud
• Support for backup of vApps on isolated networks or with no network connectivity
• Enumeration of vApps by organization for use by the organization administrator
• Enumeration of vApps by organization and provider for use by the organization provider
• User-initiated backup/recovery
• Support of provider (provider administrator) and consumer (organization administrator and user)

7. Sizing the vCloud
Sizing the management cluster is fairly predictable, with the main variables being the number of vCloud Director cells and the size of the vCloud Director database. Sizing guidelines for the management cluster were provided earlier in the section on establishing provider virtual datacenters. The vCloud consumer resources have unpredictable usage, and thus should be sized by making an estimate of the initial capacity required, and employing capacity management techniques. Capacity management techniques predict future usage needs based upon past usage trends.

7.1 Initial Sizing of Cloud Consumer Resources
Sizing a vCloud is dependent on the service definitions for that vCloud. For a private vCloud, the service definition may not specifically call out a required number of workloads to support. In that case, you may want to take into consideration how a public vCloud is initially sized. For a public vCloud, initial sizing for the cloud consumer resources can be difficult to predict since the provider is not in charge of what the consumers may run. The provider is also not aware of existing usage statistics for virtual machines that are run in the vCloud. The information below should assist in initial sizing of the vCloud environment and is based on information from the Service Definition for a Public Cloud. This information is being provided as examples. VMware recommends that you engage your local VMware representative for detailed sizing of your environment. The Service Definition states that 50% of the total number of virtual machines will be run in the reservation pool model and 50% will be run in the Pay-As-You-Go model. Furthermore, the reservation pool is split into small, medium, and large pools with a respective split of 75%, 20%, and 5%. Using the 50% above, this means that small represents 37.5% of the total, medium represents 10% of the total, and large represents 2.5% of the total number of virtual machines in the environment. The definition for these resource pools and the split with the virtual machines is listed below. The total number of virtual machines of 1,500 from the Service Definition for a Public Cloud is used in this example. You can change this total to reflect your own target virtual machine count.


Type of Resource Pool      Total Percentage   Total VMs
Pay-As-You-Go              50%                750
Small Reservation Pool     37.5%              563*
Medium Reservation Pool    10%                150
Large Reservation Pool     2.5%               37*
TOTAL                      100%               1,500

* Note: Some total virtual machines are rounded up or down due to percentages.
Table 7. Definition of Resource Pool and Virtual Machine Split
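The split in Table 7 can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only: the function name and parameters are assumptions made for this example, not part of any VMware tool. It reports unrounded counts, because the table rounds 562.5 up and 37.5 down so that the total stays at 1,500.

```python
def pool_split(total_vms, reservation_share=0.50,
               reservation_mix=(("Small", 0.75), ("Medium", 0.20), ("Large", 0.05))):
    """Return the unrounded virtual machine count per resource pool type."""
    # Everything that is not in the reservation pool model runs as Pay-As-You-Go.
    split = {"Pay-As-You-Go": total_vms * (1 - reservation_share)}
    reserved = total_vms * reservation_share
    for name, share in reservation_mix:
        split[f"{name} Reservation Pool"] = reserved * share
    return split


TOTAL_VMS = 1500  # example total from the Service Definition for a Public Cloud
split = pool_split(TOTAL_VMS)
for pool, vms in split.items():
    print(f"{pool}: {vms:g} VMs ({vms / TOTAL_VMS:.1%} of total)")
```

Changing TOTAL_VMS to your own target virtual machine count yields the corresponding split.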

The Service Definition for a Public Cloud also calls out the distribution for virtual machines in the environment with 45% small, 35% medium, 15% large, and 5% extra large. Below is a table that shows the total amount of memory, CPU, storage, and networking based on the assumptions and the total virtual machine count from the Service Definition for a Public Cloud.
Item            Percent   vCPUs   Memory     Storage    Networking
Small           45%       675     675 GB     40.5 TB    400 GB
Medium          35%       1,050   1,050 GB   31.5 TB    300 GB
Large           15%       900     900 GB     54 TB      400 GB
Extra Large     5%        600     600 GB     4.5 TB     200 GB
TOTAL (1,500)   100%      3,225   3,225 GB   130.5 TB   1,300 GB

Table 8. Memory, CPU, Storage, and Networking

The above numbers may shock you. Before you determine your final sizing numbers, you should refer to VMware best practices for common consolidation ratios on the above resources. An example table below shows what final numbers could look like using typical consolidation ratios seen in field deployments.
Resource   Before     Ratio   After
CPU        3,225      8:1     403 vCPUs
Memory     3,225 GB   1.6:1   2,016 GB
Storage    130.5 TB   2.5:1   52 TB
Network    1,300 GB   6:1     217 GB

Table 9. Example Consolidation Ratios
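The "After" column of Table 9 is each "Before" total divided by its consolidation ratio and rounded to the nearest whole unit. A minimal Python sketch of that arithmetic follows; the dictionary keys are labels chosen for this example.

```python
# "Before" totals from Table 8 and the example ratios from Table 9.
before = {"CPU (vCPUs)": 3225, "Memory (GB)": 3225, "Storage (TB)": 130.5, "Network (GB)": 1300}
ratios = {"CPU (vCPUs)": 8, "Memory (GB)": 1.6, "Storage (TB)": 2.5, "Network (GB)": 6}

# Divide each total by its ratio and round to the nearest whole unit.
after = {name: round(before[name] / ratios[name]) for name in before}

for name, value in after.items():
    print(f"{name}: {before[name]:g} at {ratios[name]:g}:1 -> {value}")
```

Substituting the ratios observed in your own environment shows how sensitive the final figures are to each assumption.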

The above calculations could be served by 16 of the following hosts.
• Socket count: 4
• Core count: 6
• Hyper-Threading: Yes
• Memory: 128 GB
• Networking: Dual 10 GigE
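As a rough cross-check of the host count, the sketch below divides the post-consolidation figures by the per-host specification. It rests on assumptions that the text does not state: that "Core count: 6" means six cores per socket, that Hyper-Threading exposes two logical processors per core, and that no spare (N+1) host is included.

```python
import math

required_memory_gb = 2016   # memory after consolidation (Table 9)
required_cpu = 403          # CPU figure after consolidation (Table 9)

host_memory_gb = 128
host_cores = 4 * 6               # assumption: 4 sockets x 6 cores per socket
host_threads = host_cores * 2    # assumption: 2 logical processors per core with Hyper-Threading

hosts_for_memory = math.ceil(required_memory_gb / host_memory_gb)
hosts_for_cores = math.ceil(required_cpu / host_cores)
hosts_for_threads = math.ceil(required_cpu / host_threads)

print(f"Hosts needed for memory:             {hosts_for_memory}")
print(f"Hosts needed for CPU (cores only):   {hosts_for_cores}")
print(f"Hosts needed for CPU (logical CPUs): {hosts_for_threads}")
```

Under these assumptions the memory requirement alone accounts for the figure of 16 hosts. Whether the CPU requirement fits depends on whether logical processors or physical cores are counted, so validate that choice against your own consolidation targets.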


The above calculations do not take into account the storage consumed by consumer’s or provider’s templates. The above calculations also do not take into account the resources consumed by the vShield Edge appliances that are deployed for each organization. There will be a vShield Edge for each private organization network and external organization network. Given the current Service Definition target of 25 organizations, a maximum of 275 vShield Edge appliances will be created. The specifications for each vShield Edge appliance are listed below.
• CPU: 1 vCPU
• Memory: 64 MB
• Storage: 16 MB
• Network: 1 GigE (this is already calculated in the throughput of the workloads and should not be added again)
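The additional footprint of the vShield Edge appliances follows directly from the per-appliance specifications. The Python sketch below multiplies them out for the stated maximum of 275 appliances; the variable names are chosen for this example.

```python
edge_count = 275        # maximum appliances for 25 organizations
organizations = 25

edges_per_org = edge_count / organizations
total_vcpus = edge_count * 1             # 1 vCPU per appliance
total_memory_mb = edge_count * 64        # 64 MB per appliance
total_storage_mb = edge_count * 16       # 16 MB per appliance

print(f"Appliances per organization (maximum): {edges_per_org:g}")
print(f"vCPUs:   {total_vcpus}")
print(f"Memory:  {total_memory_mb} MB ({total_memory_mb / 1024:.1f} GB)")
print(f"Storage: {total_storage_mb} MB ({total_storage_mb / 1024:.1f} GB)")
```

Memory and storage overhead are small compared with the workload totals, but the 275 vCPUs are worth including when you apply a CPU consolidation ratio.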

7.2 Capacity Management
One of the key benefits of implementing a vCloud is the ability for service provider customers (here equivalent to public cloud providers or private cloud internal IT) to rapidly provision vApps into the vCloud environment. The goal of capacity management is to make sure that sufficient capacity exists within the vCloud infrastructure to meet the current and future needs of the service provider customers under normal circumstances. Sufficient reserve capacity must be maintained within the vCloud infrastructure to prevent vApps from contending for resources, and thus potentially breaching service levels agreed upon for customers.

As vApps are provisioned and consumed within the vCloud infrastructure, capacity is reduced and additional capacity must be procured and provisioned. Capacity management processes should be instituted to make sure appropriate resources are available to support the service level requirements associated with vApp provisioning and performance. Proper capacity management will also prevent costly over-provisioning of hardware resources by balancing high resource utilization with agreed-upon levels of performance. As the vCloud is consumed, additional capacity must be added to the cloud consumer resources to allow for anticipated future demand while preserving sufficient headroom.

To predict future capacity needs, you’ll need to analyze current capacity usage and trends to determine growth rates as well as estimate future needs, largely coming from new consumers and projects. VMware vCenter CapacityIQ is a tool that can be used to monitor and predict capacity usage and requirements. In a vCloud environment, while CapacityIQ can provide details on capacity at the virtual machine and host levels, at this time it does not provide insight at the provider and organization virtual datacenter levels. For further information on capacity planning and management, refer to the VMware vCenter CapacityIQ Installation Guide. Alternatively, refer to the Capacity Planning Appendix for guidance on how to manually calculate capacity requirements and forecast capacity.
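As an illustration of trend-based forecasting, the following Python sketch fits a straight line to past usage samples and estimates how many periods remain before usage reaches a headroom threshold. It is a simple least-squares example, not the method used by CapacityIQ or the Capacity Planning Appendix. The sample data, the 20% headroom figure, and the function names are hypothetical.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for equally spaced samples (x = 0, 1, 2, ...)."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(samples, threshold):
    """Periods after the latest sample until the fitted trend reaches the threshold."""
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None  # usage is flat or shrinking; the trend never reaches the threshold
    return max(0.0, (threshold - intercept) / slope - (len(samples) - 1))


# Hypothetical monthly memory usage in GB for a 2,048 GB cluster.
monthly_usage_gb = [800, 860, 920, 980, 1040, 1100]
capacity_gb = 2048
threshold_gb = capacity_gb * 0.80   # assumption: keep 20% headroom

growth_per_month, _ = linear_trend(monthly_usage_gb)
months_left = periods_until(monthly_usage_gb, threshold_gb)
print(f"Growth: {growth_per_month:.0f} GB/month, threshold reached in about {months_left:.1f} months")
```

A linear fit only captures steady growth. Planned onboarding of new consumers or projects should be added to the forecast separately.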


8. Implementing Your vCloud
Use the documents in the vCloud Reference Architecture Kit to help you implement a vCloud. Implementation Examples are provided for a public and private vCloud and help you visualize the end result of applying this Architecting a vCloud guide with the appropriate Service Definitions.
• What is a vCloud? Document 1: Requirements for a vCloud.
• What should the vCloud offer? Document 2a: Service Definition for a Public Cloud. Document 2b: Service Definition for a Private Cloud. These cover the business requirements for the vCloud. The public cloud document draws upon the VMware vCloud Datacenter program. The private cloud document shows a typical use case.
• What to consider in building a vCloud. Document 3: Architecting a vCloud. This is an architect-level guide identifying what components go into a vCloud and design considerations. It uses the Service Definition as input into what to design.
• Reference Implementation Examples. Document 4a: Public vCloud Implementation Example. Document 4b: Private vCloud Implementation Example. Use these as a reference for what a vCloud architecture design document can or should look like.

Audience: vCloud Architect familiar with vSphere best practices (ideally VCP-level) and exposure to vCloud product components

Figure 17. Reference Architecture Kit


9. Appendix: vCloud Director Cell Monitoring
The following table represents a subset of MBeans that can be used for improving the monitoring performance of a vCloud instance.
Local User Sessions

MBean: com.vmware.vcloud.diagnostics.UserSessions
Description: Local (cell) user session statistics
Cardinality: 1
Instance ID: n/a
Attributes:
• totalSessions: Total number of sessions created on this cell
• successfulLogins: Total number of successful logins to this cell
• failedLogins: Total number of failed login requests to this cell

Global User Sessions

MBean: com.vmware.vcloud.GlobalUserSessionStatistics
Description: List of active user sessions by organization
Cardinality: 1
Instance ID: n/a
Attributes:
• organization: Database ID of the organization
• active: Number of active sessions
• Open_Session: Number of open sessions

Data Access Diagnostics

MBean: com.vmware.vcloud.diagnostics.DataAccess
Description: Local (cell) user session statistics
Cardinality: 1
Instance ID: Conversation
Attributes:
• lastAccessInfo.objectType: Object type of the last database object accessed
• lastAccessInfo.accessTime: Time taken to access the last database object accessed
• worstAccessInfo.objectType: Object type of the worst (slowest) database object access
• worstAccessInfo.accessTime: Time taken by the worst (slowest) database object access

Database Connection Pool

MBean: com.vmware.vcloud.datasource.globalDataSource
Description: Statistics and configuration information about the database connection pool. This information is currently specific to the database JDBC driver being used (Oracle).
Cardinality: 1
Instance ID: (none listed)
Attributes with descriptions:
• databaseName: Database connection database name (SID)
• maxPoolSize: Maximum number of connections allowed in the pool
• minPoolSize: Minimum number of connections that will exist in the pool
• networkProtocol: Network protocol used by JDBC driver
• portNumber: Database connection port number
• URL: Database connection URL
• user: Database connection username
Other attributes (no description given): abandonedConnectionTimeout, availableConnectionsCount, borrowedConnectionsCount, connectionHarvestMaxCount, connectionHarvestTriggerCount, connectionPoolName, connectionWaitTImeout, dataSourceName, fastConnectionFailoverEnabled, inactiveConnectionTimeout, initialPoolSize, loginTImeout, maxConnectionReuseCount, maxIdleTime, maxStatements, ONSConfiguration, SQLForValidateConnection, timeoutCheckInterval, timeToLiveConnectionTimeout, validateConnectionOnBorrow

VIM Operations

MBean: com.vmware.vcloud.diagnostics.VlsiOperations
Description: Local (cell) user session statistics
Cardinality: 1 per VIM end-point (VC or host agent)
Instance ID: VIM end-point URL
Attributes:
• ObjectType.MethodName.httpTime: The total network round-trip time taken to make the “MethodName” call on an object of type “ObjectType” in the VIM end-point

Presentation API Methods

MBean: com.vmware.vcloud.diagnostics.VlsiOperations
Description: Local (cell) user session statistics
Cardinality: 1 per presentation layer method
Instance ID: Method name
Attributes:
• currentInvocations: Currently active invocations
• totalFailed: Total number of failed executions
• totalInvocations: Total number of invocations over time
• executionTime: Total time taken to execute

Jetty

MBean: com.vmware.vcloud.diagnostics.Jetty
Description: Web server request statistics
Cardinality: 2 (1 for REST API and 1 for UI)
Instance ID: “UI Requests” for UI, “REST API Requests” for REST API
Attributes:
• Active: Number of web requests currently being handled

REST API

MBean: com.vmware.vcloud.diagnostics.VlsiOperations
Description: Local (cell) user session statistics
Cardinality: 1 per operation stage/granularity: RoundTrip, BasicLogin, Logout, Authentication, SecurityFilter, ConversationFilter, JAXRSServlet. RoundTrip is the most interesting, as it represents the overall REST API performance.
Instance ID: One of RoundTrip, BasicLogin, Logout, Authentication, SecurityFilter, ConversationFilter, JAXRSServlet
Attributes:
• currentInvocations: Currently active invocations
• totalFailed: Total number of failed executions
• totalInvocations: Total number of invocations over time
• executionTime: Total time taken to execute

Task Execution

MBean: com.vmware.vcloud.diagnostics.TaskExecutionJobs
Description: Statistics about long running tasks
Cardinality: 1 per task
Instance ID: Name of task
Attributes:
• currentInvocations: Currently active invocations
• totalFailed: Total number of failed executions
• totalInvocations: Total number of invocations over time
• executionTime: Total time taken to execute

Query Service (UI)

MBean: com.vmware.vcloud.diagnostics.QueryService
Description: Presentation layer query service statistics
Cardinality: 1 per query
Instance ID: Query name
Attributes:
• currentInvocations: Currently active invocations
• totalFailed: Total number of failed executions
• totalInvocations: Total number of invocations over time
• executionTime: Total time taken to execute
• returnedItems: Number of items returned by successful query executions

VC Task Manager

MBean: com.vmware.vcloud.diagnostics.VcTasks
Description: VC task management statistics
Cardinality: 1
Instance ID: (none listed)
Attributes:
• successfulTasksCount: Total successful tasks
• failedTasksCount: Total failed tasks
• waitForTaskInvocationsCount: Total invocations of VIM “wait for task”
• completedWaitForTasksCount: Total completed task waits
• historicalTasksCount: Total historical task updates received
• vcRetrievedTaskCompletionsCount: Total task completions received
• taskCompletionMessagesPublishedCount: Total task completion messages published on message bus
• taskCompletionMessagesReceivedCount: Total task completion messages received on message bus
• success_elapsedTaskWaitTime: Time elapsed for successful tasks
• failed_elapsedTaskWaitTIme: Time elapsed for failed tasks

VIM Inventory Update Processing - Object Update Statistics

MBean: com.vmware.vcloud.diagnostics.VimInventoryUpdates
Description: Inventory processing statistics
Cardinality: 3 (one each for ObjectUpdate, PropertyCollector, and UpdateSets)
Instance ID: ObjectUpdate
Attributes:
• totalUpdates: Total number of object updates received
• totalFailed: Total number of object updates that failed to be processed
• executionTime: Time taken for updates

VIM Inventory Events

MBean: com.vmware.vcloud.diagnostics.VimInventoryEvents
Description: VIM inventory event manager statistics. Tracks the frequency of common vCenter events.
Cardinality: 1 per folder per VC URL, 1 MBean per event name
Instance ID: Event name
Attributes:
• totalInvocations: (no description given)
• totalFailed: Total number of VIM inventory events that failed to be handled
• executionTime: Total time to handle VIM inventory events

VC Object Validations

MBean: com.vmware.vcloud.diagnostics.VcValidation
Description: VC object validation statistics
Cardinality: 1 global plus 1 per validator
Instance ID: null = global, validator name = per validator
Attributes:
• totalInvocations: Total number of validation executions
• executionTime: Total time spent in validator
• totalItemsInQueue: Total items currently queued for validation (global)
• objectsInQueue: Total items currently queued for validation (per validator)
• objectBusyRequeueCount: Total number of objects requeued for validation due to object being busy
• loadValidationObjectTime: Time taken to load validation object
• duplicatesDiscarded: Total number of discarded duplicate validations

VC Object Validation Reactions

MBean: com.vmware.vcloud.diagnostics.Reactions
Description: Validation reaction statistics
Cardinality: 1 global plus 1 per reaction
Instance ID: null = global, reaction name = per reaction
Attributes:
• totalReactionsFired: Total number of reaction executions
• requeueCount: Total number of reactions requeued due to objects being busy
• totalInvocations: Total number of executions of this reaction
• executionTime: Total time spent in reaction
• failedReactions: Total number of failed reactions
• objectRequeueCount: Number of times this reaction was requeued due to objects being busy

VC Connections

MBean: com.vmware.vcloud.diagnostics.VimConnection
Description: Local (cell) user session statistics
Cardinality: 1 per VC
Instance ID: “VC-VcInstanceId” where VcInstanceId is an integer identifying the virtual center
Attributes:
• Connected Count: Total successful connections
• Disconnected Count: Total disconnections
• Start Count: Total number of times the VC listener was started
• UI Vim Reconnect Count: Total number of times the VC was reconnected through the UI

ActiveMQ

MBean: com.vmware.vcloud.diagnostics.ActiveMQ
Description: ActiveMQ (message bus) statistics
Cardinality: 1 global and 1 per peer VCSD cell (each cell other than the current one)
Instance ID: “Global” = global statistics, “to_cellName_cellPrimaryIp_cellUUID” = per cell
Attributes:
• lastHealthCheckDate: Last time health check was performed
• messageRoundTripDurationMs: Time taken for an echo message to be sent and returned
• unreachableCells: Total number of unreachable cells
• isHealthy: Health of connections to peer
• reachableCells: Total number of reachable cells

Transfer Server

MBean: com.vmware.vcloud.diagnostics.TransferService
Description: Transfer server statistics
Cardinality: 1
Instance ID: (none listed)
Attributes:
• successfulPuts: Number of items successfully transferred
• failedPuts: Number of items that failed to be transferred
• successfulUploads: Number of successful upload operations

Table 10. MBeans Used To Monitor vCloud Cells
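Several of the MBeans above expose the same four counters (currentInvocations, totalFailed, totalInvocations, executionTime). The following Python sketch shows one way to turn two successive readings into a failure rate and an average execution time for the interval between them. It assumes the counters are cumulative since cell start, which the table suggests but does not state. It leaves the unit of executionTime unspecified, and it does not show how the values are collected over JMX. The sample readings are hypothetical.

```python
def interval_stats(previous, current):
    """Derive per-interval metrics from two readings of cumulative counters."""
    calls = current["totalInvocations"] - previous["totalInvocations"]
    if calls <= 0:
        return None  # no new invocations, or the counters were reset (for example, a cell restart)
    failed = current["totalFailed"] - previous["totalFailed"]
    elapsed = current["executionTime"] - previous["executionTime"]
    return {
        "calls": calls,
        "failure_rate": failed / calls,
        "avg_execution_time": elapsed / calls,  # same unit as executionTime
    }


# Hypothetical readings of one MBean instance taken a few minutes apart.
earlier = {"totalInvocations": 1000, "totalFailed": 10, "executionTime": 50000}
later = {"totalInvocations": 1200, "totalFailed": 14, "executionTime": 62000}

stats = interval_stats(earlier, later)
print(stats)
```

Comparing interval values rather than lifetime totals makes a recent slowdown visible even on a cell that has been running for a long time.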


10. Appendix: vCloud Availability Considerations
A vCloud is capable of supporting an uptime Service Level Agreement (SLA) of 99.99% by following the availability recommendations presented in the following table, but only if there are no single points of failure (SPOF) in the underlying infrastructure, the required skills are available, and suitable ITIL processes are defined and adhered to.
Maintaining Running Workload

Component: vSphere ESXi hosts
Availability: All VMware ESXi hosts will be configured in highly available clusters with a minimum of n+1 redundancy. This will provide protection not only for the customer’s virtual machines, but also the virtual machines hosting the platform portal/management applications and all of the vShield Edge appliances.
Failure Impact: In the event of a host failure, VMware HA will detect the failure within 13 seconds and commence powering on the failed virtual machines on other hosts within the cluster. (This is analogous to pressing the power button on a physical server and does not include the time to boot the OS or launch applications.) VMware HA Admission Control ensures sufficient resources are available to restart the virtual machines. The admission control policy “Percentage of cluster resources…” is recommended as it is flexible while guaranteeing resource availability. The following whitepaper contains best practices around increasing availability and resiliency: http://www.vmware.com/files/pdf/techpaper/VMW-Server-WPBestPractices.pdf
It is also recommended that vCenter is configured to proactively migrate virtual machines off a host in the event the host’s health becomes unstable. Rules can be defined in vCenter when monitoring host system health.

Component: Virtual machine resource consumption
Availability: VMware DRS will automatically migrate virtual machines between hosts to make sure the cluster is balanced, reducing the risk of a ‘noisy neighbor’ virtual machine monopolizing CPU and memory resources within a host at the expense of other virtual machines running on the same host. VMware Storage I/O Control will automatically throttle hosts and virtual machines when it detects that the datastore is congested/bottlenecked. This ensures that a ‘noisy neighbor’ virtual machine does not monopolize storage I/O resources. Storage I/O Control ensures each virtual machine will receive the resources it is entitled to by leveraging the shares mechanism.
Failure Impact: No impact. Virtual machines are automatically migrated between hosts with no downtime by VMware DRS. No impact. Virtual machines and ESXi hosts are throttled by Storage I/O Control automatically based on their entitlement relative to the amount of shares or the maximum amount of IOPS configured. For more information on Storage I/O Control, see the whitepaper: http://www.vmware.com/files/pdf/techpaper/VMW-vSphere41SIOC.pdf

Component: vSphere ESXi host network connectivity
Availability: ESXi hosts will be configured with a minimum of 2 physical paths to each required network (port group) to make sure a single link failure does not impact platform or virtual machine connectivity. This should include management and vMotion networks. The Load Based Teaming mechanism will be used to avoid oversubscribed network links.
Failure Impact: No impact. Failover will occur with no interruption to service. Configuration of failover and failback as well as corresponding physical settings such as Portfast are a requirement.

Component: vSphere ESXi host storage connectivity
Availability: ESXi hosts will be configured with a minimum of 2 physical paths to each LUN or NFS share to make sure a single storage path failure does not result in an impact to service. Path Selection Plug-in will be selected based on storage vendor’s best practices.
Failure Impact: No impact. Failover will occur with no interruption to service.

Maintaining Workload Accessibility

Component: VMware vCenter Server
Availability: vCenter Server will run as a virtual machine and make use of vCenter Server Heartbeat.
Failure Impact: vCenter Server Heartbeat provides a clustered solution for vCenter Server with fully automated failover between nodes, thereby providing near zero downtime.

Component: VMware vCenter Database
Availability: VMware vCenter Database resiliency is provided with vCenter Heartbeat if MS SQL is used or Oracle RAC if Oracle is used.
Failure Impact: vCenter Heartbeat or Oracle RAC provides a clustered solution for a vCenter database with fully automated failover between nodes, thereby providing zero downtime.

Component: vCloud component databases (vCloud Director and Chargeback)
Availability: VMware vCloud component database resiliency is provided with Oracle RAC.
Failure Impact: Oracle RAC supports the resiliency of the vCloud Director and Chargeback databases as it maintains vCloud Director state information and the critical Chargeback data required for customer billing respectively. While not required for maintaining workload accessibility, clustering the Chargeback database makes sure providers can accurately produce customer billing information.

Component: VMware vCenter Chargeback
Availability: vCenter Chargeback Data Collectors (vCenter Server, vCloud Director, vShield Manager) shall be distributed in the event of failure.
Failure Impact: In the event that one of the data collectors for a group should go offline, the others will pick up the load such that transactions are captured by vCenter Chargeback.

vCloud Infrastructure Protection

Component: vShield Manager
Availability: vShield Manager will receive the additional protection of VMware FT, resulting in seamless failover between hosts in the event of a host failure. VM Monitoring is enabled on a cluster level within HA and uses the VMware Tools heartbeat to verify vShield Manager is alive. When a virtual machine fails and thus the VMware Tools heartbeat is not updated, VM Monitoring will verify whether any storage or networking I/O has occurred over the last 120 seconds before the virtual machine is restarted. It is also recommended to create a scheduled backup of vShield Manager to an external FTP or SFTP server.
Failure Impact: Infrastructure availability yes, service availability no. vShield Edge devices will continue to run without the management control, but no additional Edge appliances can be deployed and no existing ones can be modified until the service comes back online.

Component: vCenter Chargeback
Availability: vCenter Chargeback virtual machines will be deployed as a two node, load balanced cluster. Multiple Chargeback data collectors can be deployed remotely to avoid a single point of failure.
Failure Impact: There is no impact on infrastructure availability or customer virtual machines. While not required for maintaining workload accessibility, clustering the vCenter Chargeback servers makes sure providers can accurately produce customer billing information and usage reports.

Component: vCloud Director
Availability: The vCloud Director virtual machines will be deployed as a load balanced, highly available clustered pair in an N+1 redundancy setup, with the option to scale out when the environment requires this.
Failure Impact: Session state of users connected via the portal to the failed instance will be lost. They will be able to reconnect immediately. No impact to customer virtual machines.

Component: vShield Edge
Availability: vShield Edge can be deployed through API and vCloud Director. To provide network reliability VM Monitoring will be enabled. In case of a vShield Edge “Guest OS” failure VM Monitoring will restart the vShield Edge device. vShield Edge appliances do not have VMware Tools and thus are not monitored as part of VMware HA guest OS monitoring.
Failure Impact: Partial temporary loss of service. vShield Edge is a possible connection into the organization. No impact to customer virtual machines or VM Remote Console (VMRC) access. All external network routed connectivity will be lost if a vShield Edge appliance

Table 11. vCloud Availability Considerations
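To put the 99.99% uptime target in perspective, the Python sketch below converts an SLA percentage into the downtime it permits over a given period. The function name and the choice of a 365-day year and 30-day month are assumptions for this example.

```python
def allowed_downtime_minutes(sla_percent, period_hours):
    """Minutes of downtime permitted by an uptime SLA over the given period."""
    return (1 - sla_percent / 100) * period_hours * 60


per_year = allowed_downtime_minutes(99.99, 365 * 24)
per_month = allowed_downtime_minutes(99.99, 30 * 24)
print(f"99.99% uptime allows about {per_year:.2f} minutes per year")
print(f"99.99% uptime allows about {per_month:.2f} minutes per 30-day month")
```

With less than an hour of permitted downtime per year, each recovery step (failure detection, virtual machine restart, guest OS boot, application start) consumes a meaningful share of the budget.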


11. Appendix: Security Considerations
11.1 Network Access Security
A VPN can be used to securely connect multiple vCloud deployments. vShield Edge’s VPN functionality allows the creation of site-to-site tunnels using IPSEC. It supports NAT traversal (NAT-T) for using IPSEC through Network Address Translation (NAT) devices. Typical use cases include the following:
Category - Description

Multi-site cloud deployment - vShield VPN can connect multiple cloud deployments. For example, an organization’s virtual datacenter at a service provider on the West Coast could be connected with the organization’s virtual datacenter at a service provider on the East Coast. NOTE: Each internal subnet must have a unique address space to connect successfully. Because vShield also provides address translation, it is possible to deploy multiple organization virtual datacenters at different providers using the same RFC1918 address space. Unique subnets are required to connect them.

Office to cloud VPN - It is possible to have a permanent VPN from a router-based VPN, such as a Cisco 2821 Integrated Services Router, to a cloud environment with the vShield Edge. This can also be accomplished via a Linux-based gateway since vShield VPN is compatible with Openswan, a Linux IPSEC implementation.

Client to cloud VPN - Client software is generally not supported, although robust clients with static IPs that support pre-shared key authentication can connect.

Table 12. Network Access Security Use Cases
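
The unique-subnet note in the multi-site use case can be checked before a tunnel is configured. The following is a minimal sketch, assuming the internal subnets of each site are known in CIDR form. The site labels and subnets are hypothetical, and the check relies only on Python's standard ipaddress module, not on any vShield interface.

```python
import ipaddress
from itertools import combinations


def find_overlaps(site_subnets):
    """Return pairs of (site, subnet) entries whose address spaces overlap.

    site_subnets maps a site label (hypothetical) to a list of CIDR strings.
    """
    entries = [
        (site, ipaddress.ip_network(cidr))
        for site, cidrs in site_subnets.items()
        for cidr in cidrs
    ]
    overlaps = []
    for (site_a, net_a), (site_b, net_b) in combinations(entries, 2):
        # Only subnets at different sites matter for a site-to-site tunnel.
        if site_a != site_b and net_a.overlaps(net_b):
            overlaps.append(((site_a, str(net_a)), (site_b, str(net_b))))
    return overlaps


# Hypothetical example: both providers reuse 192.168.1.0/24 internally.
sites = {
    "west-coast-vdc": ["192.168.1.0/24", "10.10.0.0/16"],
    "east-coast-vdc": ["192.168.1.0/24", "10.20.0.0/16"],
}
```

Calling find_overlaps(sites) reports the reused 192.168.1.0/24 range, which would have to be renumbered at one site before the two virtual datacenters could be connected.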

To configure vShield VPN, the endpoint connecting to the vShield Edge device must:
• Support the following:
–– IKE for ISAKMP/Oakley
–– Pre-Shared Key Authentication Mode
–– 3DES or AES128 encryption
–– SHA1 authentication
–– Diffie-Hellman Group 2/5 (1024 bit/1536 bit, respectively)
–– PFS (Perfect Forward Secrecy)
–– ESP Tunnel Mode
• Disable ISAKMP aggressive mode
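
The endpoint requirements above lend themselves to a simple pre-flight check. The sketch below is illustrative only: the dictionary keys and the shape of the endpoint profile are assumptions made for this example and do not correspond to any vShield or router configuration format.

```python
# Requirements transcribed from the list above. The key names and value
# spellings are illustrative assumptions, not a real configuration schema.
REQUIRED = {
    "key_exchange": {"IKE"},
    "auth_mode": {"pre-shared-key"},
    "encryption": {"3DES", "AES128"},
    "integrity": {"SHA1"},
    "dh_group": {2, 5},
    "esp_mode": {"tunnel"},
}


def check_endpoint(profile):
    """Return a list of problems; an empty list means the profile is compatible."""
    problems = []
    for setting, allowed in REQUIRED.items():
        value = profile.get(setting)
        if value not in allowed:
            problems.append(
                f"{setting}={value!r} not in {sorted(allowed, key=str)}"
            )
    if not profile.get("pfs", False):
        problems.append("PFS (Perfect Forward Secrecy) must be supported")
    if profile.get("aggressive_mode", False):
        problems.append("ISAKMP aggressive mode must be disabled")
    return problems


# Hypothetical profile for an office router.
office_router = {
    "key_exchange": "IKE",
    "auth_mode": "pre-shared-key",
    "encryption": "AES128",
    "integrity": "SHA1",
    "dh_group": 2,
    "esp_mode": "tunnel",
    "pfs": True,
    "aggressive_mode": False,
}
```

An empty result from check_endpoint(office_router) indicates that the hypothetical profile satisfies every item in the list; any other result names the settings to change.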

11.2 Compliance
Audit concepts applied to a cloud environment, such as segmentation and monitoring, reveal new challenges. Elasticity may break old segmentation controls and the ability to isolate sensitive data within a rapidly growing environment. Role-based access controls and virtual firewalls must also demonstrate compatibility with audit requirements for segmentation, including detailed audit trails and logs. Can a provider guarantee that an offline image with sensitive data in memory is accessible only by authorized users, and can a log tell who accessed it and when? Multiple admin-level roles are necessary for cloud resource management. The complexity of cloud environments, coupled with new and different technology, requires careful audits to document and detail compliance. The following is a set of common audit concerns within the cloud.

Concern - Detail

Hypervisor - An additional layer of technology is present in every cloud and therefore presents a different attack surface. It introduces a layer between the traditional processing environment and the physical layer, which brings vulnerabilities of its own as well as new paths of attack related to its communication with the layers above and below.

Segmentation and isolation - May expose sensitive data when not configured and monitored properly; physical and logical isolation has always been an audit concern. The ease and speed of change to a virtualized environment within cloud computing, often called elasticity, makes the setup and review of segmentation controls even more critical to compliance.

Different/multiple primary functions per host - A cloud can make much more efficient use of hardware, but this leads auditors to assess whether sensitive data is at risk just from the proximity of other virtual systems managed to a lesser level of security. Some compliance standards, for example, require one primary function per server (or virtual server), as illustrated below.

[Figure: applications holding regulated and nonregulated data, each behind FW, IDS, and AV controls, share one ESXi hypervisor cluster with management controls and a distributed network, on top of a storage fabric and physical servers, storage, routers, switches, firewalls, and load balancers]

Enforcement of least privilege - In a cloud environment, remote network access becomes the only path for customers to manage an environment. Previously, physical access was audited for equipment installation and modification; now software controls customer access capabilities. Authorization software has to be more sophisticated to handle every user, group, and role request for a cloud customer.

Machine state and migration - The ability of systems to quickly change and move within the cloud gives auditors a need to track this.

Data is much less permanent - Cloud environments make extensive use of short-lived instances. Virtual machines may have a lifecycle far shorter than physical systems because they are so easy to provision and then repurpose. The systems often share data across large arrays. Permanence of data is also affected by environments that push as much storage as possible through high-speed memory to avoid the latency of spinning disk.

Immaturity of monitoring solutions in cloud environments - Customers need a view of their audit trails that is unique to their own use of the cloud environment and that can be used for investigations. Providers must enhance the sophistication of existing log tools in order to keep up with the new technology and new management practices within a cloud environment.

Table 13. Audit Concerns Within The Cloud

11.3 Use Cases: Why Logs Should be Available
Recording and monitoring are important to mitigate damage and prevent future attacks. An audit log enables the organization to verify compliance, detect violations, and initiate remediation activities. It can help detect attempts, whether successful or not, at unauthorized access, information probes, or disruption. It is a best practice to regularly examine logs for suspicious, unusual, or unauthorized activity. External laws and regulations also require specific levels of monitoring and verification for access and authorization. Rules are necessary to restrict access, while routine log analysis also helps identify system configuration errors and failures and helps enforce adherence to SLAs. Logs serve an essential purpose and are a foundation for most controls in a regulation; they track and record changes and incidents to form an audit trail.
Log purposes include:
• Compliance requirements. Logs are required for all compliance regulations to assist with control auditing as well as breach review, analysis, and response. Individual logs often satisfy specific compliance controls; for example, an authentication log can verify that a resource was accessed only by authorized users.
• Customer requirements. End customers (tenants) can retrieve logs that pertain to their environment in order to meet their own requirements.
• Operational integrity. Operational alerts should be defined for logs to trigger notifications for remediation. This may be a backup alert, secondary to monitoring.
• Troubleshooting. Closely related to operational integrity, logs are essential for troubleshooting. The use of vShield Edge logs, for example, can show whether a specific external connection request is being passed through or NATed by the firewall.
The following list is the minimum set of data types required to adequately log cloud environment activity for regulatory compliance:
• User (including system account) access
• Action taken
• Use of identification and authentication mechanisms
• Start and stop of audit logs
• Creation or deletion of system-level objects
The audit trail entries recorded for each event must include the following six details:
• Identification (ID)
• Type of event
• Date and time
• Success or failure
• Origination of event
• ID of affected data or component
The logs must be reviewed at least daily for all system components, and especially systems that handle intrusion detection, authentication, and authorization. Because daily review of logs may not be sufficient on its own to detect incidents, logs must also be retained for a period consistent with “effective use” and legal regulations. The laws for log retention range from one year to more than twenty. Therefore, log archives should always be able to provide at least one year of history, typically scheduled to match financial calendar cycles, with a minimum of three months available for immediate response and review in case of an incident.
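
The six required audit trail details and the retention guidance can be expressed as a small data structure. This is a sketch under stated assumptions: the field names are invented for illustration, "three months" is approximated as 90 days, and one year as 365 days. It does not reflect the schema of any VMware log.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AuditEntry:
    """One audit trail entry carrying the six details listed above."""
    user_id: Optional[str]           # Identification (ID)
    event_type: Optional[str]        # Type of event
    timestamp: Optional[datetime]    # Date and time
    success: Optional[bool]          # Success or failure
    origin: Optional[str]            # Origination of event
    affected_object: Optional[str]   # ID of affected data or component


def missing_details(entry):
    """Return the names of required details that were not recorded."""
    return [f.name for f in fields(entry) if getattr(entry, f.name) is None]


ONLINE_WINDOW = timedelta(days=90)    # assumption: "three months" as 90 days
ARCHIVE_WINDOW = timedelta(days=365)  # assumption: one year as 365 days


def retention_tier(entry_time, now):
    """Classify an entry by age against the minimum retention guidance."""
    age = now - entry_time
    if age <= ONLINE_WINDOW:
        return "online"            # available for immediate review
    if age <= ARCHIVE_WINDOW:
        return "archive"           # still within the one-year minimum
    return "beyond-minimum"        # keep or purge per applicable regulation
```

A collector could reject or flag any entry for which missing_details returns a non-empty list, and use retention_tier to decide whether an entry belongs in immediately searchable storage or in the archive.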

11.3.1 Example Compliance Use Cases for Logs
The following use cases are a sample of events that benefit from careful logging and monitoring in the cloud environment. Other examples may include unauthorized services or protocols, remote login success, and certificate changes.
• Shared accounts. An investigation is initiated to review network outages and finds that multiple instances of an Administrator account had logged into critical servers before the failure. Shared accounts make it very difficult to trace fault to one individual; it is impossible to determine from the logs on that system which person was logged into the user account that made the error. Therefore, usage must be tied to an individual user ID and unique password, with correct time, to aid in investigations. Systems also should be configured to detect any and all use of generic IDs such as an administrator or root account and trace them to unique identities.
• User account changes. A malicious user finds an unpatched flaw in an environment that allows elevation of privileges. That user then uses system-level privileges to create a new bogus user object from which to launch further attacks. A user object is a Microsoft Windows Domain or Local user account, for example. User object logs can be used to figure out when a name was changed or an account added. This assists in detecting unauthorized actions or users trying to hide attacks.
• Unauthorized software. Malware, or a new virtual machine instance in the cloud, can be found in system object logs. A system must track system objects that are added, removed, or modified. This can be very helpful during installation to monitor system changes caused by software.

11.3.2 VMware vCloud Log Sources for Compliance
Customers should be able to retrieve logs from all areas that are relevant and unique to their organization. Retrieval should be possible programmatically, for example through an API, to allow for automated queries.
Log collection nodes must be added to a cloud environment, as illustrated below.
[Figure: regulated and nonregulated application workloads with FW, IDS, and AV controls run on an ESXi hypervisor cluster, with a log collection node attached to each ESXi host, above the storage fabric and the physical servers, storage, routers, switches, firewalls, and load balancers]

Figure 18. Log Collection in the Cloud Environment

Logs generated by VMware components must be maintained by the provider but must also be available to tenants. Tenants should be able to download, in raw format, all vCloud Director and vShield Edge (VSE) logs pertaining to their organizations and networks. Logs with customer identifiers should be flagged or indexed for retrieval. The following diagram illustrates the architecture of vCloud components and log collection.
[Figure: components shown are vShield Edge (vSE), VMware vCloud Director cells (vCD Server) and database (vCD DB), vCenter Orchestrator, vCC and vCC DB, vCenter Server and vCenter DB, vShield Manager, customer environment virtual machines, and VMware vSphere hosts with storage pools, together with a log collector, log analysis, log reporting, and archive]

Figure 19. Architecture of vCloud Components and Log Collection
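
The requirement that logs with customer identifiers be flagged or indexed for retrieval can be sketched as follows. The log line layout here, in particular the org=<name> token, is purely hypothetical and chosen for illustration. Actual vCloud Director and vShield Edge log formats differ, so the pattern would need to be adapted.

```python
import re
from collections import defaultdict

# Assumption: each collected line carries an "org=<name>" token identifying
# the tenant. This token is hypothetical, not a documented VMware log field.
ORG_PATTERN = re.compile(r"\borg=([\w-]+)")


def index_by_tenant(lines):
    """Group raw log lines by the tenant identifier found in each line."""
    index = defaultdict(list)
    for line in lines:
        match = ORG_PATTERN.search(line)
        if match:
            index[match.group(1)].append(line)
    return dict(index)


# Hypothetical collected lines from two tenants plus one provider-only line.
collected = [
    "2011-03-01T10:00:00Z vcd org=acme user=alice action=login result=success",
    "2011-03-01T10:00:05Z vse org=globex src=198.51.100.7 action=nat",
    "2011-03-01T10:00:09Z vcd org=acme user=bob action=logout result=success",
    "2011-03-01T10:00:12Z esxi host=esx01 action=vmotion",
]
```

With such an index, a tenant-facing API could return index_by_tenant(collected).get("acme", []) in raw form while leaving provider-only lines, such as the last one, out of any tenant view.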

The following table illustrates to which logs a vCloud tenant must have access.
VMware Component - Provider Logs - Tenant Logs

VMware vCloud Director (vCD)
vCenter Server (VC)
vSphere Server (ESXi)
Chargeback Manager (CBM)
vCenter Orchestrator (vCO)
vShield Manager (VSM)
vShield Edge (VSE)
Table 14. vCloud Component Logs

Other components in the cloud environment also generate logs that the provider must maintain, but direct tenant access to these logs is not required.

Other Component - Provider Logs - Tenant Logs

vCloud Director DB (Oracle)
VMware Virtual Center Database
VMware vCenter Chargeback Database
MS SQL Server
Linux (vCD)
Windows System Logs (CBM, vCO, VC Server)
Table 15. Other Component Logs

Logs in the vCloud Datacenter environment can further be categorized into four logical business layers:
1. Cloud Application: Represents the external interface that the enterprise administrators of the cloud are able to interact with. These administrators are authenticated and authorized at this layer, and have no (direct or indirect) access to the underlying infrastructure. They interact only with the Business Orchestration Layer.
2. Business Orchestration: Represents both the configuration entities of the cloud and the governance policies that control the cloud deployment. – vCenter Chargeback:
• Service Catalog: Presents the different service levels available and their configuration elements.
• Service Design: Represents the service level and specific configuration elements along with any policies defined.
• Configuration Management Database (CMDB): Represents the system of record, which may be federated with an enterprise CMDB.
• Service Provision: Represents the final configuration specification.
3. Service Orchestration: Represents the provisioning logic for the cloud infrastructure. This layer consists of an orchestration director system and automation elements for network, storage, security, and server/compute. – vCenter Server (VC), VMware vCloud Director (vCloud Director), vCenter Orchestrator (vCO).
4. Infrastructure Layer: Represents the physical and virtual compute, network, storage, hypervisor, security, and management components. – vSphere Server (ESXi), vShield Manager (VSM), vShield Edge (VSE).

[Figure: four stacked layers, each with its own administrator and management function: Cloud Application Layer (Application Admin, Application Management; API and custom/enterprise portal for administrators and users), Business Orchestration Layer (Business Admin, Business Management; service catalog, service design, CMDB and federated CMDB, service provision), Service Orchestration Layer (Orchestration Admin, Service Management; network, storage, security, and server automation), and Infrastructure Layer (Infrastructure Admin, Infrastructure Management; network, security, management, hypervisor, storage, compute, physical), supporting Cloud Deployment Tier 1 and Cloud Deployment Tier 2, each with firewall, load balancer, IDS, storage, and application VMs]

Figure 20. Infrastructure Layers

The abstraction of these four layers and their security controls helps illustrate audit and compliance requirements for proper authentication and segregation. Cloud provider administrator accounts, for example, should be maintained in a central repository integrated with two-factor authentication. Different tiers of cloud deployments (VPDCs) would be made available to enterprise users.

11.4 vCloud Director Diagnostic and Audit Logs
VMware vCloud Director includes two types of logs:
• Audit logs that are maintained in the database, and optionally, in a syslog server
• Diagnostic logs that are maintained in each cell’s log directory
The VMware vCloud Director system audit log is maintained in the Oracle database and can be monitored through the Web UI. Each organization administrator and the system administrator have a view into the log scoped to their specific area of control. A more comprehensive view of the audit log (and long-term persistence) is achieved through the use of remote syslog, described below. Log management products are available from a variety of vendors and open-source projects.

Audit events, as defined earlier, are not the only event types. Diagnostic logs, described below, contain information about system operation events and are stored as files in the local file system of each cell’s OS. Diagnostic logs can be useful for problem resolution but are not intended to preserve an audit trail of system interactions. Each VMware vCloud Director cell creates several diagnostic log files, described in the Viewing the vCloud Director Logs section of the VMware vCloud Director Administrator’s Guide. Audit logs, on the other hand, do record significant actions, including login and logout. A syslog server can be set up during installation as detailed in the vCloud Director Installation Guide. Exporting the logs to a syslog server is required for compliance for multiple reasons:
1. Database logs are not retained after 90 days, while logs transmitted via syslog can be retained as long as desired.
2. It allows audit logs from all cells to be viewed together in a central location at the same time.
3. It protects the audit logs from loss on the local system due to failure, a lack of disk space, compromise, and so on.
4. It supports forensics operations in the face of problems like those listed above.
5. It is the method by which many log management and Security Information and Event Management (SIEM) systems will integrate with vCloud Director. This enables:
a. Correlation of events and activities across vCloud Director, vShield, vSphere, and even the physical hardware layers of the stack
b. Integration of cloud security operations with the rest of the cloud provider’s or enterprise’s security operations, cutting across physical, virtual, and cloud infrastructures
6. Logging to a remote system, rather than the system the cell is deployed on, provides data integrity; that is, it inhibits tampering. A compromise of the cell does not necessarily enable access to or alteration of the audit log.
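
The idea of forwarding audit events to a remote syslog server, rather than keeping them only on the originating system, can be illustrated generically. The sketch below uses Python's standard logging.handlers.SysLogHandler over UDP. It is not how vCloud Director itself is configured (that is done during installation, as noted above); the logger names, message layout, and loopback collector are assumptions made so the example is self-contained.

```python
import logging
import logging.handlers
import socket


def make_audit_logger(host, port, name="audit"):
    """Build a logger that forwards records to a remote syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit records out of the root logger
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,
    )
    # Prefix each message with the logger name so the collector can tell
    # which source (for example, which cell) produced it.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def open_local_collector():
    """Open a UDP socket on an ephemeral loopback port as a stand-in syslog server."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)
    return sock
```

In use, several sources would each call make_audit_logger with the same collector address but a different name, so that events from all of them arrive in one place and survive a failure or compromise of any single source.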

11.5 Load Balancer Considerations
Load balancer considerations are discussed in the following table.
Consideration - Detail

Security - A front-end firewall is typically deployed in front of the load balancer. In some environments additional firewalls may be located between vCloud Director cells and the resource tiers, including vCenter. Load balancers may also provide NAT/SNAT (source network address translation) and are typically configured to provide this for the clustered cells. It is also recommended to secure access between cells and the other management and resource group components. Refer to the vCloud Director Installation and Configuration Guide for ports that must be opened.

Single vCloud Director site and scope - This architecture covers load balancing of a single vCloud Director site or instance. It does not cover client application load balancing or global load balancing.

Sizing recommendations for number of cells - In general, VMware recommends that the number of vCloud Director cell instances = n + 1, where n is the number of vCenter Server instances providing compute resources for cloud consumption. Based on the Service Definition, two vCloud Director cell instances should be sufficient and allow for upgradability (upgrading one vCloud Director cell, then the other) and high availability.

Requirements for multicell configurations - Multiple vCloud Director cells require NTP (Network Time Protocol), which is a best practice for all elements of the vCloud infrastructure. Consult www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf for more information on how to set up NTP.

Load balancer availability - At least two load balancers in an HA configuration should be used to reduce single points of failure. There are multiple strategies for this depending on the vendor or software used.

Proxy configuration - Each load-balanced vCloud Director cell requires setting a proxy console IP address, which should be provided by the load balancer in most cases.

REST API URL configuration - The cloud service URL should be mapped to the address provided via the load balancer. This is configured in the vCloud Director administrator GUI as well as in the load balancer configuration. This is the address that should be used to check the health status of the vCloud Director cell.

Awareness of multi-cell roles - Some vCloud Director cell roles may consume high resources, for example, image transfer. All cells can perform the same set of tasks, but it would be possible to set policies that affect which ones are used. See the advanced configuration settings.

Load balancer session persistence - Sessions are generally provided in secure methods and are terminated at the cells. Because of this, session persistence should be enabled using SSL.

Load balancing algorithm - Least connections or round-robin is generally acceptable.

vCloud Director cell status health checks - Each load balancer service should be configured to check the health of the individual vCloud Director cells. Since each cell responds via HTTPS, this can be configured quickly via the IP and API end point URL. Load balancers may support other types of health checks. Generally services are checked at intervals ranging from a few seconds to 30 seconds, depending on load. A good starting point is 5 seconds.
Example GUI URL - https://my.cloud.com/cloud/
Example API URL - https://my.cloud.com/api/versions
In the second example, the versions supported by this end point should be returned as XML.

Public IP/port - The service IP should be specified appropriately before cells are added to the service group. Typically port 443 (standard HTTPS) is the only port exposed.

Advanced configurations - Load balancers can also provide layer 7 content switching or direction, which may allow a vCloud Director configuration to send certain types of client traffic to “dedicated” cells. While cells can perform any function, they can be utilized in a dedicated fashion if they only receive those types of requests.

Connection mapping - When a cell joins an existing cluster, it may try to load balance sessions. This may impact connection mapping through the load balancer, as the load balancer will be unaware of the balancing happening within the cell cluster.

Session persistence - VMware vShield Edge’s load balancing functionality does not support SSL session persistence today and may not be suitable at this point for this application.

Table 16. Load Balancer Considerations
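
The health check described in the table, an HTTPS request to the API versions URL that should return XML, can be approximated with a short probe. This is a sketch, not a load balancer configuration: the element name Version and the probe timeout are assumptions, and a production check would normally be defined in the load balancer itself.

```python
import urllib.request
import xml.etree.ElementTree as ET


def supported_versions(xml_text):
    """Return the text of every element whose local name is 'Version'.

    Assumption: the versions document lists each supported API version in
    an element named Version; adjust the name if the real payload differs.
    """
    root = ET.fromstring(xml_text)
    versions = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop any XML namespace
        if local_name == "Version" and element.text:
            versions.append(element.text.strip())
    return versions


def cell_is_healthy(url, timeout=5.0):
    """Probe a versions URL; healthy means HTTP 200 with at least one version."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            body = response.read().decode("utf-8")
        return len(supported_versions(body)) > 0
    except (OSError, ValueError, ET.ParseError):
        # Network errors, malformed URLs, bad encodings, and unparseable XML
        # all count as an unhealthy cell.
        return False
```

A monitoring script could call cell_is_healthy("https://my.cloud.com/api/versions") at the chosen interval. Checking that the body parses, rather than only that the port answers, distinguishes a cell that is serving the API from one that merely accepts connections.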

12. Appendix: Signed Certificates with vCloud Director
A certificate is more than just an RSA key for authentication. Additionally, there is a unique serial number for easy referencing and an identity, which can be a person, computer, or group. The certificate binds the identity to the RSA key. Certificates also contain a time period for which this binding is considered valid. They can even contain information on where to verify the certificate. Finally, a purpose or group limitation can be included in the certificate. An issuer gives out a certificate. To prevent forgery, a cryptographic signature protects the entire certificate. Certificates enable us to hand out digitally signed identities for hosts and users.

Certificates are written in a specific machine-parsable way called Abstract Syntax Notation One (ASN.1). This is then wrapped in a specific format such as the Public-Key Cryptography Standards (PKCS), of which there are various versions. Because people wanted to be able to use more than just ASCII characters (and because some old network connections were not 8-bit clean and could not safely transport binary data), these ASN.1 notations are packed in either the binary DER format or the base-64 PEM format. The PEM format is commonly used for SSL/HTTPS certificates for websites.

Certificates with different purposes are also named differently. To make things more confusing, some vendors give these certificates or certificate containers different names. The most commonly known X.509 certificates are those used in web servers and web browsers. Web servers obtain signed X.509 certificates from a commonly known and trusted source. In practice this means companies like VeriSign or Thawte, but it could just as easily be your own certificate authority. Another example is the non-profit organization CAcert.org.

Certificate Type - Description
Certificate Authority (CA) - The start of a trust chain. The CA is the only certificate that signs itself.
(Trusted) Root certificate - Another name for a CA certificate.
Intermediate Certificate Authorities - A CA that is not signed by itself but by another (parent) CA.
CSR - Certificate Signing Request. This is a request, generated by a person, for a CA to sign a certificate. The request is then given to a CA to be signed.
Host (Local Computer) certificate - Certificate issued for a machine (usually signed by a CA).
User (Personal) certificate - Certificate issued for a person (usually signed by a CA).
Private Key - Private key file belonging to a public key of a .pem file. Can be pass-phrase protected.
PKCS#12 (.p12 file) - Certificate plus a private key issued and signed by a CA. Normally pass-phrase protected.
PKCS#15 - Certificate standard for cryptographic tokens (for instance, USB crypto tokens). Usually PIN protected. Sometimes multilevel PINs differentiate user and admin access.
CRL - Certificate Revocation List, a list of revoked serial numbers of issued certificates.
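
The relationship between the binary DER format and the base-64 PEM format described above can be shown with a few lines of code. This sketch handles only the CERTIFICATE label and uses arbitrary bytes in place of a real certificate. It illustrates the encoding, not certificate validation.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der_bytes):
    """Wrap DER bytes as PEM: base64 text in 64-character lines with markers."""
    b64 = base64.b64encode(der_bytes).decode("ascii")
    body_lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER, *body_lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem_text):
    """Strip the PEM markers and decode the base64 body back to DER bytes."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if len(lines) < 2 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("not a PEM-encoded certificate")
    return base64.b64decode("".join(lines[1:-1]))
```

Because PEM is only a text wrapping of the same DER bytes, pem_to_der(der_to_pem(data)) returns data unchanged, which is why a certificate received in one format can be converted to the other without involving the CA again.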
Step - Referenced Document

a - Generate and import the CA signed wildcard certificate for vCloud Director
b - Generate the wildcard untrusted certificate with the necessary details
c - Generate the Certificate Signing Request
d - Send the CSR to the Certificate Authority
e - The CA will send you back the cert with a root cert and possibly an intermediate certificate
f - Import the root certificate into your keystore
g - Import the intermediate certificate (if there is one, it depends on your CA)
h - When you run the vCloud configuration script it looks for the correct aliases, you need to create them by cloning the wildcard alias.
i - Now import the wildcard certificate for each one
j - If you need to delete an entry from the keystore

Table 17. Certificate Steps

For further information, refer to the keytool man page keytool(1).

Generating and Importing CA-Signed Wildcard Certificate for vCloud Director
Each vCloud Director host requires two TLSv1/SSL certificates, one for each of its IP addresses. You can use wildcard certificates to simplify the addition of new cells. The following procedure creates the keystore for vCloud Director to use. You create the keystore once and then copy it to each new cell added.
NOTE: The keytool(1) certificate management utility is installed with vCloud Director; the full path is /opt/vmware/cloud-director/jre/bin/keytool
Example:
$ mkdir -p /opt/keystore
$ chown vcloud:vcloud /opt/keystore
$ cd /opt/keystore

Generating the Wildcard Untrusted Certificate with the Necessary Details
When you run the following command, you are asked specific questions in order to generate a wildcard certificate.
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -keystore certificates.vmware -storetype JCEKS -storepass <certificate passwd> -genkey -keyalg RSA -alias wildcard

What is your first and last name?
[Unknown]: johndoe.example.com
What is the name of your organizational unit?
[Unknown]: Cloud Engineering
What is the name of your organization?
[Unknown]: Example, Inc.
What is the name of your City or Locality?
[Unknown]: Cambridge
What is the name of your State or Province?
[Unknown]: Massachusetts
What is the two-letter country code for this unit?
[Unknown]: US
Is CN=johndoe.example.com, OU=Cloud Engineering, O="Example, Inc.", L=Cambridge, ST=Massachusetts, C=US correct?
[no]: yes
Enter key password for <wildcard>
(RETURN if same as keystore password):

Generating the Certificate Signing Request
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -keystore certificates.vmware -storetype JCEKS -storepass <certificate passwd> -certreq -alias wildcard -file wildcard.csr

Send the CSR to the Certificate Authority
The CA will send you back the cert with a root cert and possibly an intermediate certificate. Once the certificate-signing request has been approved, you should be able to obtain the server certificate in ASCII format. This contains the public key of the certificate that corresponds with the private key that was created when you generated the certificate request. The ASCII representation of the approved certificate will look something like:
-----BEGIN CERTIFICATE-----
MIICtDCCAl6gAwIBAgIBHTANBgkqhkiG9w0BAQQFADB5MQswCQYDVQQGEwJVUzEO
MAwGA1UECBMFVGV4YXMxDzANBgNVBAcTBkF1c3RpbjEZMBcGA1UEChMQU3VuIE1p
Y3Jvc3lzdGVtczEQMA4GA1UECxMHaVBsYW5ldDEcMBoGA1UEAxMTQ2VydGlmaWNh
dGUgTWFuYWdlcjAeFw0wMTEyMTMyMjQ4MzRaFw0wMjEyMTMyMjQ4MzRaMH0xCzAJ
BgNVBAYTAlVTMQ4wDAYDVQQIEwVUZXhhczEPMA0GA1UEBxMGQXVzdGluMRkwFwYD
VQQKExBTdW4gTWljcm9zeXN0ZW1zMRAwDgYDVQQLEwdpUGxhbmV0MSAwHgYDVQQD
ExdzdW5maXJlLmNlbnRyYWwuc3VuLmNvbTCBnzANBgkqhkiG9w0BAQEFAAOBjQAw
gYkCgYEA45ji7uN6LqdCVehxPnuKzzqq2PfFaTaWZhqYro903bSdf9Qp+sGabDfJ
qrspwrgjE2Owwia4H3InHpvzkcf2O2uB89bwm/RyHhU5AGt3wVFmsgN16XIL+smk
CBBSJo31RTuIZw11ZYkkqMZzVY84sBpGJ0mtD1xnWhsb0MYN5bMCAwEAAaOBiDCB
hTARBglghkgBhvhCAQEEBAMCBsAwDgYDVR0PAQH/BAQDAgTwMB0GA1UdDgQWBBTM
q/dM6tawflKUfRnqKupfhU3HlDAfBgNVHSMEGDAWgBRCOcKaQjn6l7Ft1OqsPcji
gwlFuTAgBgNVHREEGTAXgRVuZWlsLmEud2lsc29uQHN1bi5jb20wDQYJKoZIhvcN
AQEEBQADQQApqNPdeDARy6xWu7/SfxAH12S/wPD43OYJqbt/R2y5/Zpde/arIyhk
fucakqo0Bk9DlI/A4IR+b9Q56k6Ce8tO
-----END CERTIFICATE-----

Import the Root Certificate into your Keystore
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -import -alias root -file EntrustRootCertificate.cer
This step or subsequent certificate import steps may result in a “trust this certificate?” prompt from keytool(1), so do not be surprised when this happens. Also be aware that the keystore (certificates.vmware) needs to be the same keystore that the CSR was produced from.
NOTE: The keystore needs to be in JCEKS format, as vCloud Director 1.0 does not support other formats.

Import the Intermediate Certificate
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -import -alias intermediate -file EntrustCrossCertificate.cer

Create Correct Aliases for vCloud Director’s Configuration Script
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -keyclone -alias wildcard -dest http
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -keyclone -alias wildcard -dest consoleproxy

Import the Wildcard Certificate
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -import -alias http -file wildcard.cer
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -import -alias consoleproxy -file wildcard.cer

Copy the Certificates
Take care not to copy the keystores into the $VCLOUD_HOME location, because this causes confusion when you run the vCloud Director configure tool, which creates the certificates and proxycertificates files in $VCLOUD_HOME/etc. More importantly, make sure that the permissions are set such that the vcloud user has read access, and that the files are not owned by root with the group and other bits unset.

Deleting an Entry from the Keystore
If you need to delete an entry from the keystore, you can use the following procedure to do so.
Example:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -storepass <certificate passwd> -keystore certificates.vmware -delete -alias consoleproxy

Viewing and Verifying the Keystore Entries
The keystore used by vCloud Director is a storage mechanism for the cryptographic tokens. These tokens are also known as entries, and we will see shortly how to view the certificate details of these keystore entries using keytool(1). Each entry in a keystore is identified by a different alias or entry name. Entries also store their last modified date/time. The keystore is also password protected. The password is required to load the keystore, and a password will be requested when saving a keystore for the first time. There are various types of keystores available, such as:
• JKS Java KeyStore. Sun’s keystore format.
• JCEKS Java Cryptography Extension KeyStore.
• PKCS #12 Public-Key Cryptography Standards
• BKS Bouncy Castle KeyStore.
• UBER Bouncy Castle UBER KeyStore.

However, currently vCloud Director v1.0 only supports JCEKS, which is a more secure version of JKS. As previously mentioned, to view the contents of a KeyStore that has already been created, you will need to know the password, which is a random encrypted string. You can only verify a KeyStore that you have created or know the password for. The following example shows you how to view the contents of the KeyStore entries. There are two ways of passing the password to keytool(1). They are as follows:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -list -v -keystore certificates.vmware
Enter keystore password: <enter keystore password>
and
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -list -storepass <keystore passwd> -v -keystore certificates.vmware
If you do not provide the correct KeyStore password you will see the following error message:
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -list -v -keystore certificates.vmware
Enter keystore password: <incorrect keystore password>
keytool error: java.io.IOException: Keystore was tampered with, or password was incorrect
java.io.IOException: Keystore was tampered with, or password was incorrect
        at com.sun.crypto.provider.JceKeyStore.engineLoad(DashoA13*..)
        at java.security.KeyStore.load(KeyStore.java:1185)
        at sun.security.tools.KeyTool.doCommands(KeyTool.java:620)
        at sun.security.tools.KeyTool.run(KeyTool.java:172)
        at sun.security.tools.KeyTool.main(KeyTool.java:166)
$
It is also worth mentioning that the vCloud Director ‘cells’ do not share a common keystore. As stated in the installation guide, each cell has its own keystore that contains two keys: one for the HTTP service and one for the console proxy. If you use a wildcard certificate, you still need two entries in the keystore with the appropriate aliases, and since there is no cell-specific information in the keystore, you could use it for the configure step on each cell. It is also important to note that the keystore file should be protected by restrictive operating system permissions. The password for the file should be stored securely or should be prompted for on server startup.

Things to Check
If you encounter any issues whilst setting up and implementing certificates in vCloud Director, such as the infamous ‘Cryptographic error’, here are a few things to check.
1. Make sure that certificates.vmware is a JCEKS keystore, and not the default JKS type.
2. Make sure that the keystore password does not contain special characters that the shell will interpret (namely the dollar sign followed by anything, the asterisk, and so on), or that the arguments to keytool(1) were quoted when the keystore was created to prevent the shell from interpreting them. There was a case previously where a customer used something along the lines of “-storepass ca$$” and wondered why the configure script complained that the password was wrong. It worked from one particular shell because bash would expand $$ to be the PID of the current shell. Obviously the vCloud Director configure script does not expand the value and kept using “ca$$” instead of “ca<someint>”.
3. Can you successfully list the certificates in the keystore or print it using /opt/vmware/cloud-director/jre/bin/keytool? Given that the same Java platform code will be used under the covers to read in the keystore, decrypt values, and so on, this is a good check to make sure that you have not done something wrong.
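The quoting problem described in item 2 can be illustrated with a short sketch. It is not part of vCloud Director: it only uses Python's standard shlex module to show how a password such as the “ca$$” example would have to be quoted before it is placed on a bash command line. The helper name build_keytool_list_command is hypothetical, and the keytool arguments are the ones already shown in this section.

```python
import shlex

# Path to the keytool binary shipped with vCloud Director, as used in this section.
KEYTOOL = "/opt/vmware/cloud-director/jre/bin/keytool"


def build_keytool_list_command(keystore, storepass):
    """Return a shell-safe 'keytool -list' command line as one string.

    Hypothetical helper for illustration only. Each argument is passed
    through shlex.quote so that characters such as '$' or '*' are not
    interpreted by the shell.
    """
    args = [
        KEYTOOL, "-storetype", "JCEKS", "-list", "-v",
        "-storepass", storepass,
        "-keystore", keystore,
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # The password is wrapped in single quotes, so bash will not expand $$.
    print(build_keytool_list_command("certificates.vmware", "ca$$"))
```

With the password single-quoted, the shell passes the literal characters to keytool(1), which is the same value the configure script will later use.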

Detailed Example and Output of the vCloud Director HTTP Certificate
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -list -storepass <keystore passwd> -v -keystore certificates.vmware

Keystore type: JCEKS
Keystore provider: SunJCE

Your keystore contains 1 entry

Alias name: http
Creation date: Jan 18, 2011
Entry type: PrivateKeyEntry
Certificate chain length: 3
Certificate[1]:
Owner: CN=*.vcloud.vmware.com.us, OU=VMware Architects, O=VMware Inc, L=Hillview Avenue, C=US
Issuer: CN=Entrust Certification Authority - L1C, OU="(c) 2009 Entrust, Inc.", OU=www.entrust.net/rpa is incorporated by reference, O="Entrust, Inc.", C=US
Serial number: 3f865g44
Valid from: Tue Jan 18 13:11:31 EST 2011 until: Sat Jan 19 16:28:25 EST 2013
Certificate fingerprints:
     MD5:  80:04:E8:70:0E:F1:E8:8B:68:A7:7B:16:C7:69:60:FF
     SHA1: B4:C8:82:88:63:2C:E6:08:6C:23:7D:5C:53:4A:C0:54:16:0A:08:88
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=false KeyUsage [ DigitalSignature Key_Encipherment ]

#2: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 18 F6 DA D8 68 E0 A5 F7 E3 CA 81 C8 56 3C 42 47 ....h.......V<BG 0010: 8F E0 DB F7 .... ] ]

#3: ObjectId: 1.3.6.1.5.5.7.1.1 Criticality=false AuthorityInfoAccess [ [ accessMethod: 1.3.6.1.5.5.7.48.1 accessLocation: URIName: http://ocsp.entrust.net] ]

#4: ObjectId: 2.5.29.31 Criticality=false CRLDistributionPoints [ [DistributionPoint: [URIName: http://crl.entrust.net/level1c.crl] ]]

#5: ObjectId: 2.5.29.32 Criticality=false CertificatePolicies [ [CertificatePolicyId: [1.2.840.113533.7.75.2] [PolicyQualifierInfo: [ qualifierID: 1.3.6.1.5.5.7.2.1 qualifier: 0000: 16 1A 68 74 74 70 3A 2F 2F 77 77 77 2E 65 6E 74 ..http://www.ent 0010: 72 75 73 74 2E 6E 65 74 2F 72 70 61 rust.net/rpa ]] ] ]

#6: ObjectId: 2.5.29.37 Criticality=false ExtendedKeyUsages [ serverAuth ]

#7: ObjectId: 2.5.29.19 Criticality=false BasicConstraints:[ CA:false PathLen: undefined ]

#8: ObjectId: 2.5.29.35 Criticality=false AuthorityKeyIdentifier [ KeyIdentifier [ 0000: 1E F1 AB 89 06 F8 49 0F 01 33 77 EE 14 7A EE 19 ......I..3w..z.. 0010: 7C 93 28 4D ..(M ] ]

Certificate[2]:
Owner: CN=Entrust Certification Authority - L1C, OU="(c) 2009 Entrust, Inc.", OU=www.entrust.net/rpa is incorporated by reference, O="Entrust, Inc.", C=US
Issuer: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Serial number: 7245h1sx
Valid from: Fri Dec 11 07:43:54 EST 2009 until: Wed Dec 11 08:13:54 EST 2019
Certificate fingerprints:
     MD5:  2F:B3:00:F2:FA:12:7B:BD:82:95:70:05:96:17:DB:BE
     SHA1: 61:43:AF:68:F7:B3:3A:47:94:04:74:98:8B:05:F7:B1:62:96:98:42
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ]

#2: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:2147483647 ]

#3: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 1E F1 AB 89 06 F8 49 0F 01 33 77 EE 14 7A EE 19 ......I..3w..z.. 0010: 7C 93 28 4D ..(M ] ]

#4: ObjectId: 1.3.6.1.5.5.7.1.1 Criticality=false AuthorityInfoAccess [ [ accessMethod: 1.3.6.1.5.5.7.48.1 accessLocation: URIName: http://ocsp.entrust.net] ]

#5: ObjectId: 2.5.29.31 Criticality=false CRLDistributionPoints [ [DistributionPoint: [URIName: http://crl.entrust.net/2048ca.crl] ]]

#6: ObjectId: 2.5.29.32 Criticality=false CertificatePolicies [ [CertificatePolicyId: [2.5.29.32.0] [PolicyQualifierInfo: [ qualifierID: 1.3.6.1.5.5.7.2.1 qualifier: 0000: 16 1A 68 74 74 70 3A 2F 2F 77 77 77 2E 65 6E 74 ..http://www.ent 0010: 72 75 73 74 2E 6E 65 74 2F 72 70 61 rust.net/rpa

]] ] ]

#7: ObjectId: 2.5.29.35 Criticality=false AuthorityKeyIdentifier [ KeyIdentifier [ 0000: 55 E4 81 D1 11 80 BE D8 89 B9 08 A3 31 F9 A1 24 U...........1..$ 0010: 09 16 B9 70 ...p ] ]

Certificate[3]:
Owner: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Issuer: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Serial number: 8392k3du
Valid from: Sat Dec 25 04:50:51 EST 1999 until: Wed Jul 25 00:15:12 EST 2029
Certificate fingerprints:
     MD5:  EE:29:31:BC:32:7E:9A:E6:E8:B5:F7:51:B4:34:71:90
     SHA1: 50:30:06:09:1D:97:D4:F5:AE:39:F7:CB:E7:92:7D:7D:65:2D:34:31
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ]

#2: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:2147483647 ]

#3: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 55 E4 81 D1 11 80 BE D8 89 B9 08 A3 31 F9 A1 24 U...........1..$ 0010: 09 16 B9 70 ...p ] ]

******************************************* *******************************************

Detailed Example and Output of the vCloud Director CONSOLEPROXY Certificate
$ /opt/vmware/cloud-director/jre/bin/keytool -storetype JCEKS -list -storepass <keystore passwd> -v -keystore consoleproxy.vmware

Keystore type: JCEKS
Keystore provider: SunJCE

Your keystore contains 1 entry

Alias name: consoleproxy
Creation date: Jan 18, 2011
Entry type: PrivateKeyEntry
Certificate chain length: 3
Certificate[1]:
Owner: CN=*.vcloud.vmware.com.us, OU=VMware Architects, O=VMware Inc, L=Hillview Avenue, C=US
Issuer: CN=Entrust Certification Authority - L1C, OU="(c) 2009 Entrust, Inc.", OU=www.entrust.net/rpa is incorporated by reference, O="Entrust, Inc.", C=US
Serial number: 3f865g44
Valid from: Tue Jan 18 13:11:31 EST 2011 until: Sat Jan 19 16:28:25 EST 2013
Certificate fingerprints:
     MD5:  80:04:E8:70:0E:F1:E8:8B:68:A7:7B:16:C7:69:60:FF
     SHA1: B4:C8:82:88:63:2C:E6:08:6C:23:7D:5C:53:4A:C0:54:16:0A:08:88
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=false KeyUsage [ DigitalSignature Key_Encipherment ]

#2: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 18 F6 DA D8 68 E0 A5 F7 E3 CA 81 C8 56 3C 42 47 ....h.......V<BG 0010: 8F E0 DB F7 .... ] ]

#3: ObjectId: 1.3.6.1.5.5.7.1.1 Criticality=false AuthorityInfoAccess [ [ accessMethod: 1.3.6.1.5.5.7.48.1 accessLocation: URIName: http://ocsp.entrust.net] ]

#4: ObjectId: 2.5.29.31 Criticality=false CRLDistributionPoints [ [DistributionPoint: [URIName: http://crl.entrust.net/level1c.crl] ]]

#5: ObjectId: 2.5.29.32 Criticality=false CertificatePolicies [ [CertificatePolicyId: [1.2.840.113533.7.75.2] [PolicyQualifierInfo: [ qualifierID: 1.3.6.1.5.5.7.2.1 qualifier: 0000: 16 1A 68 74 74 70 3A 2F 2F 77 77 77 2E 65 6E 74 ..http://www.ent 0010: 72 75 73 74 2E 6E 65 74 2F 72 70 61 rust.net/rpa

]] ] ]

#6: ObjectId: 2.5.29.37 Criticality=false ExtendedKeyUsages [ serverAuth ]

#7: ObjectId: 2.5.29.19 Criticality=false BasicConstraints:[ CA:false PathLen: undefined ]

#8: ObjectId: 2.5.29.35 Criticality=false AuthorityKeyIdentifier [ KeyIdentifier [ 0000: 1E F1 AB 89 06 F8 49 0F 01 33 77 EE 14 7A EE 19 ......I..3w..z.. 0010: 7C 93 28 4D ..(M ] ]

Certificate[2]:
Owner: CN=Entrust Certification Authority - L1C, OU="(c) 2009 Entrust, Inc.", OU=www.entrust.net/rpa is incorporated by reference, O="Entrust, Inc.", C=US
Issuer: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Serial number: 7245h1sx
Valid from: Fri Dec 11 07:43:54 EST 2009 until: Wed Dec 11 08:13:54 EST 2019
Certificate fingerprints:
     MD5:  2F:B3:00:F2:FA:12:7B:BD:82:95:70:05:96:17:DB:BE
     SHA1: 61:43:AF:68:F7:B3:3A:47:94:04:74:98:8B:05:F7:B1:62:96:98:42
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ]

#2: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:2147483647 ]

#3: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 1E F1 AB 89 06 F8 49 0F 01 33 77 EE 14 7A EE 19 ......I..3w..z.. 0010: 7C 93 28 4D ..(M ] ]

#4: ObjectId: 1.3.6.1.5.5.7.1.1 Criticality=false AuthorityInfoAccess [ [ accessMethod: 1.3.6.1.5.5.7.48.1 accessLocation: URIName: http://ocsp.entrust.net] ]

#5: ObjectId: 2.5.29.31 Criticality=false CRLDistributionPoints [ [DistributionPoint: [URIName: http://crl.entrust.net/2048ca.crl] ]]

#6: ObjectId: 2.5.29.32 Criticality=false CertificatePolicies [ [CertificatePolicyId: [2.5.29.32.0] [PolicyQualifierInfo: [ qualifierID: 1.3.6.1.5.5.7.2.1 qualifier: 0000: 16 1A 68 74 74 70 3A 2F 2F 77 77 77 2E 65 6E 74 ..http://www.ent 0010: 72 75 73 74 2E 6E 65 74 2F 72 70 61 rust.net/rpa

]] ] ]

#7: ObjectId: 2.5.29.35 Criticality=false AuthorityKeyIdentifier [ KeyIdentifier [ 0000: 55 E4 81 D1 11 80 BE D8 89 B9 08 A3 31 F9 A1 24 U...........1..$ 0010: 09 16 B9 70 ...p ] ]

Certificate[3]:
Owner: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Issuer: CN=Entrust.net Certification Authority (2048), OU=(c) 1999 Entrust.net Limited, OU=www.entrust.net/CPS_2048 incorp. by ref. (limits liab.), O=Entrust.net
Serial number: 8392k3du
Valid from: Sat Dec 25 04:50:51 EST 1999 until: Wed Jul 25 00:15:12 EST 2029
Certificate fingerprints:
     MD5:  EE:29:31:BC:32:7E:9A:E6:E8:B5:F7:51:B4:34:71:90
     SHA1: 50:30:06:09:1D:97:D4:F5:AE:39:F7:CB:E7:92:7D:7D:65:2D:34:31
Signature algorithm name: SHA1withRSA
Version: 3

Extensions:

#1: ObjectId: 2.5.29.15 Criticality=true KeyUsage [ Key_CertSign Crl_Sign ]

#2: ObjectId: 2.5.29.19 Criticality=true BasicConstraints:[ CA:true PathLen:2147483647 ]

#3: ObjectId: 2.5.29.14 Criticality=false SubjectKeyIdentifier [ KeyIdentifier [ 0000: 55 E4 81 D1 11 80 BE D8 89 B9 08 A3 31 F9 A1 24 U...........1..$ 0010: 09 16 B9 70 ...p ] ]

13. Appendix: Capacity Planning
Capacity forecasting provides an efficient way to acquire the appropriate amount of physical resources to support increased demand for the vCloud. This allows the growth of the vCloud to be planned and included in the service provider’s budgetary process, and reduces the likelihood of “panic buying,” which generally increases costs dramatically and undermines standardization efforts. It also reduces the likelihood of last-minute surprises, such as a lack of available space or power to support the new vCloud infrastructure components.
From a vCloud perspective, capacity management is simplified by the existence of the provider virtual datacenter and organization virtual datacenter constructs, but potentially complicated by the addition of three models of consumption: pay-as-you-go, allocation (committed), and reservation (dedicated). Finally, all of these capacity management aspects, within a vCloud context, must address both the cloud (service provider) administrator and the end-customer (organization) administrator perspectives.
Sizing for the workload resource group clusters can be difficult to predict, since the provider is not in charge of what the consumer may run. The provider is also not aware of existing usage statistics for virtual machines that are run in the cloud. The information below should assist in initial sizing of the vCloud environment and is based on information from the Service Definition. This information is provided as an example. It is highly recommended that you engage your local VMware representative for detailed sizing of your environment.

13.1 Cloud Administrator (Service Provider) Perspective
The primary capacity management concerns of the cloud administrator are:
• Capacity management of provider virtual datacenters and the service offerings backed by each
• Network capacity management (network bandwidth capacity management is beyond the scope of this section)
• Capacity forecasting
• Capacity monitoring and establishing triggers
VMware’s vCloud solution makes extensive use of reservations. As such, previous approaches to capacity management used in vSphere are not as applicable to a vCloud as one might think. CPU and memory overcommitment, for example, cannot be applied as extensively as it once was in a multi-tenant environment.
Capacity management in a vCloud focuses on the vSphere hosts and associated shared storage that comprise the virtual datacenters. Unlike managing capacity for vSphere, in a vCloud, the virtual machine is no longer the basis for resource consumption from a service provider perspective. In the case of a vCloud, the organization virtual datacenter becomes the basis for resource consumption.
Capacity management is further impacted by the introduction of multiple consumption models in the vCloud model. Each model requires its own capacity management approach. As a result, this section will provide guidance for capacity management, from a service provider cloud administrator perspective, as it applies to each of the consumption models: Pay-As-You-Go, Allocation, and Reservation.
Regardless of the particular consumption model being applied in a provider virtual datacenter, the common starting point of vCloud capacity management is to calculate the total amount of CPU and memory resources available for consumption. Since the underlying infrastructure provisioning unit of a provider virtual datacenter is a vSphere host, the first step is to determine the total CPU and memory at the vSphere host level. The following table shows the key vSphere host variables needed to calculate capacity, along with example values.

Item                 Variable     Value   Units
Processor Sockets    Nsocket,1    2       integer
Processor Cores      Ncores,1     4       integer
Processor Speed      Sproc,1      2.4     GHz
Host Memory          Mhost,1      64      GB
Table 18. vSphere Host Variables

Calculating the total memory available is very straightforward; it is simply the total amount of RAM for the vSphere host. Total CPU resources are calculated using the following formula:

$$P_{host} = N_{socket} \times N_{cores} \times S_{proc}$$
Using the example values from the table, the total CPU resources equals 19.2 GHz. Once the vSphere host capacity model has been defined, the next step is to determine the provider virtual datacenter (vSphere cluster) capacity. Determining the provider virtual datacenter capacity is critical as vCloud capacity management should be performed at the provider virtual datacenter level, not the vSphere host level. When considering vCloud provider virtual datacenter capacity, an additional step is required to ensure that redundancy has been accounted for. The provider virtual datacenter cluster redundancy may vary depending upon service levels offered. For the example below, we will assume N+2 cluster redundancy. This means that the provider virtual datacenter can absorb up to two vSphere host failures and continue to support all hosted virtual machines at the same level of performance. To accomplish this, there must be capacity available on the remaining vSphere hosts to take over all workloads. Based on a requirement for provider virtual datacenter cluster redundancy, the overall number of memory and CPU consumption units for the provider virtual datacenter (cluster) must be reduced. To determine the redundancy overhead, the number of vSphere hosts in the cluster and the desired number of redundant vSphere hosts need to be considered. This is described in the following table:
Redundancy Variable   Description
Nnodes                This represents the number of nodes in a cluster.
Nredundant            This represents the minimum number of redundant nodes.
Rredundancy,HA        This represents a targeted ratio of redundancy as indicated by a real
                      number greater than one. This ratio (such as 1.10) indicates that there
                      is a ten percent overhead committed to availability. For example, a
                      10 node provider vDC with a 1.10 redundancy ratio would require 11 nodes
                      to deliver the appropriate capacity. Note that this level of redundancy
                      may vary depending on the class of service offering being delivered on
                      that provider vDC. Redundancy variables can be determined with the
                      equation below.
Table 19. Determining Redundancy Overhead

Calculating Redundancy Ratio from Minimal Level of Redundancy

$$\frac{N_{nodes} + N_{redundant}}{N_{nodes}} = R_{redundancy}$$

For example, the level of redundancy is calculated below for a cluster of ten nodes, two of which are redundant (that is, $N_{nodes} = 8$ and $N_{redundant} = 2$).

$$\frac{N_{nodes} + N_{redundant}}{N_{nodes}} = \frac{8 + 2}{8} = 1.25 = R_{redundancy}$$

Once the ratio of redundancy is calculated, the number of units of consumption per provider vDC can be determined with the following equation:

CPU resources per Cluster

$$N_{CPU,cluster} = \frac{N_{hosts,cluster} \times P_{CPU,host}}{R_{redundancy,HA}}$$

For our example, where $P_{CPU,host} = 19.2\ \text{GHz}$, this results in:

$$N_{CPU,cluster} = \frac{8 \times 19.2}{1.25} = 122.88\ \text{GHz}$$

The number of memory units of consumption is calculated below. For our example, where $M_{mem,host} = 64\ \text{GB}$, this results in:

$$N_{mem,cluster} = \frac{N_{hosts,cluster} \times M_{mem,host}}{1.25} = \frac{8 \times 64}{1.25} = 409.6\ \text{GB}$$

So we’ve now established that our example provider virtual datacenter has 122.88GHz of available CPU and 409.6GB of available memory, taking a vSphere cluster redundancy of N+2 into account. Next we’ll look at some guidance for capacity management as it applies to each of the consumption models.
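The worked example above can also be expressed as a short script, which may help when repeating the calculation for other host configurations. This is a minimal sketch that mirrors the formulas and example values in this section; the function names are illustrative and are not part of any VMware tool.

```python
def host_cpu_ghz(sockets, cores_per_socket, ghz_per_core):
    """Total CPU per vSphere host: sockets x cores x clock speed."""
    return sockets * cores_per_socket * ghz_per_core


def redundancy_ratio(n_nodes, n_redundant):
    """Redundancy ratio: (Nnodes + Nredundant) / Nnodes."""
    return (n_nodes + n_redundant) / n_nodes


def cluster_capacity(n_hosts, per_host_capacity, ratio):
    """Units of consumption per cluster, reduced by the redundancy ratio.

    Follows the document's formula: Nhosts x per-host capacity / R.
    """
    return n_hosts * per_host_capacity / ratio


if __name__ == "__main__":
    # Example values from Table 18 and the N+2 example in this section.
    p_host = host_cpu_ghz(sockets=2, cores_per_socket=4, ghz_per_core=2.4)
    ratio = redundancy_ratio(n_nodes=8, n_redundant=2)
    cpu = cluster_capacity(n_hosts=8, per_host_capacity=p_host, ratio=ratio)
    mem = cluster_capacity(n_hosts=8, per_host_capacity=64, ratio=ratio)
    print(f"Host CPU: {p_host:.1f} GHz, redundancy ratio: {ratio:.2f}")
    print(f"Cluster CPU: {cpu:.2f} GHz, cluster memory: {mem:.1f} GB")
```

Keeping the redundancy ratio as a separate function makes it straightforward to model a different service level (for example N+1) without touching the capacity formula.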

Pay-As-You-Go Model
When an organization virtual datacenter is created in the Pay-As-You-Go model, a resource pool is instantiated with expandable reservations. As such, the customer organization virtual datacenters contained on that provider virtual datacenter can grow to consume all of the available provider virtual datacenter resources. While this could be true in any vSphere environment, the added challenge in a vCloud is the use of reservations at the vApp level.
When an organization virtual datacenter is created out of a provider virtual datacenter using the Pay-As-You-Go consumption model, a %guarantee is configured for CPU and memory. This is applied to each vApp or virtual machine within a vApp. For example, if the service provider configures the organization virtual datacenter with a 50% guarantee for CPU and a 75% guarantee for memory, and the customer then creates a virtual machine consuming 1 vCPU of 1GHz and 1GB of memory, a reservation for that virtual machine will be set at 50% of 1GHz, or 0.5GHz, and 75% of 1GB, or 0.75GB of memory.
Since there is no way of knowing how a customer will define their virtual machine templates in their private customer catalogs, coupled with the fact that organization virtual datacenters can expand on demand, VMware recommends the following:
• Calculate the total available CPU and memory resources (less an amount reserved for global catalog templates), adjusted by the cluster redundancy ratio, at the provider virtual datacenter level
• Establish a CPU and Memory %RESERVED threshold at the provider virtual datacenter level
• Establish the %RESERVED for the provider virtual datacenter at a number in the 60% range initially
• As the total amount of reserved CPU or reserved memory approaches the %RESERVED threshold, do not deploy new organization virtual datacenters in that provider virtual datacenter without adding additional resources. If the corresponding vSphere cluster has reached its maximum point of expansion, a new provider virtual datacenter should be deployed and any new organization virtual datacenters should be assigned to the new provider virtual datacenter. In this way there is 40% of expansion capacity for the existing organization virtual datacenters in the case where the provider virtual datacenter has reached its maximum point of expansion.
• CPU and memory over-commitment can be applied, and if so the %RESERVED value should be set lower than if no over-commitment is applied, due to the unpredictability of the virtual machine sizes being deployed (and hence reservations being established)
• Monitor the %RESERVED on a regular basis and adjust the value according to historical usage as well as projected demand

Allocation Model
When an organization virtual datacenter is created in the Allocation Model, a non-expandable resource pool is instantiated with the %guaranteed value for CPU and memory that was specified. Using a %guaranteed value of 75%, this means that if an organization virtual datacenter is created specifying 100GHz of CPU and 100GB of memory, a resource pool is created for that organization virtual datacenter with a reservation of 75GHz and a limit of 100GHz for CPU, and a reservation of 75GB with a limit of 100GB for memory. The additional 25%, in this example, is not guaranteed and can be accessed only if it is available across the provider virtual datacenter. In other words, the 25% can be over-committed by the provider at the provider virtual datacenter level and therefore may not be available, depending on how all of the organization virtual datacenters in that provider virtual datacenter are using it.
At the virtual machine level, when a virtual machine is deployed, it is instantiated with no CPU reservation but with a memory reservation equal to the virtual machine’s memory allocation multiplied by the %guaranteed. Despite the fact that no CPU reservation is set at the virtual machine level, the total amount of CPU allocated across all virtual machines in that organization virtual datacenter is still subject to the overall CPU reservation of the organization virtual datacenter established by the %guarantee value.
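The reservation arithmetic for the Pay-As-You-Go and Allocation models can be sketched as follows. This is an illustrative calculation only, using the example figures from the text; the function names and the dictionary keys are assumptions made for this sketch and do not correspond to a vCloud Director API.

```python
def payg_vm_reservation(vm_cpu_ghz, vm_mem_gb, cpu_guarantee, mem_guarantee):
    """Pay-As-You-Go: the %guarantee is applied to each virtual machine.

    Returns (cpu_reservation_ghz, mem_reservation_gb).
    """
    return vm_cpu_ghz * cpu_guarantee, vm_mem_gb * mem_guarantee


def allocation_pool(cpu_ghz, mem_gb, cpu_guarantee, mem_guarantee):
    """Allocation: a non-expandable resource pool per organization vDC.

    The reservation is the allocation multiplied by the %guaranteed value,
    and the limit is the full allocation.
    """
    return {
        "cpu_reservation_ghz": cpu_ghz * cpu_guarantee,
        "cpu_limit_ghz": cpu_ghz,
        "mem_reservation_gb": mem_gb * mem_guarantee,
        "mem_limit_gb": mem_gb,
    }


def allocation_vm_memory_reservation(vm_mem_gb, mem_guarantee):
    """Allocation: VMs get no CPU reservation, only a memory reservation."""
    return vm_mem_gb * mem_guarantee


if __name__ == "__main__":
    # 1 vCPU of 1GHz and 1GB of memory with 50% CPU / 75% memory guarantees.
    print(payg_vm_reservation(1.0, 1.0, 0.50, 0.75))
    # 100GHz / 100GB organization vDC with a 75% guarantee.
    print(allocation_pool(100, 100, 0.75, 0.75))
```

Summing the per-pool or per-virtual-machine reservations produced this way gives the total reserved CPU and memory that is compared against the %RESERVED threshold of the provider virtual datacenter.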

Based on this use of reservations in the Allocation Model, VMware recommends the following:
• Calculate the total available CPU and memory resources (less an amount reserved for global catalog templates), adjusted by the cluster redundancy ratio, at the provider virtual datacenter level
• Determine how much resource, at the provider virtual datacenter level, you want to make available for expanding organization virtual datacenters that are deployed to that provider virtual datacenter
• Establish a CPU and Memory %RESERVED (guaranteed, not allocated) threshold at the provider virtual datacenter level based on the %guaranteed less the amount reserved for growth. The remaining unreserved resources are available to all organization virtual datacenters for bursting.
• As the total amount of reserved CPU or reserved memory approaches the %RESERVED threshold, do not deploy new organization virtual datacenters in that provider virtual datacenter without adding additional resources. If the corresponding vSphere cluster has reached its maximum point of expansion, a new provider virtual datacenter should be deployed and any new organization virtual datacenters should be assigned to the new provider virtual datacenter. This leaves a predetermined amount of capacity available for expanding the existing organization virtual datacenters in the case where the provider virtual datacenter has reached its maximum point of expansion.
• CPU and memory over-commitment can be applied, but it should be based only on the amount of unreserved resources at the provider virtual datacenter level, allowing for over-committing the resources available for organization virtual datacenter bursting.
• Monitor the %RESERVED on a regular basis and adjust the value according to historical usage as well as projected demand

Reservation Model
When an organization virtual datacenter is created in the Reservation Model, a non-expandable resource pool is instantiated with the reservation and limit values equivalent to the amount of resources allocated. This means that if an organization virtual datacenter is created allocating 100GHz of CPU and 100GB of memory, a resource pool is created for that organization virtual datacenter with a reservation and limit of 100GHz for CPU and a reservation and limit of 100GB for memory. At the virtual machine level, when a virtual machine is deployed, it is instantiated with no reservation or limit for either CPU or memory.
Based on this use of reservations in the Reservation Model, VMware recommends the following:
• Calculate the total available CPU and memory resources (less an amount reserved for global catalog templates), adjusted by the cluster redundancy ratio, at the provider virtual datacenter level
• Determine how much resource, at the provider virtual datacenter level, you want to make available for expanding organization virtual datacenters that are deployed to that provider virtual datacenter
• Establish a CPU and Memory %RESERVED threshold at the provider virtual datacenter level equivalent to the capacity of the underlying vSphere cluster, taking into account HA redundancy
• As the total amount of reserved CPU or reserved memory approaches the %RESERVED threshold, do not deploy new organization virtual datacenters in that provider virtual datacenter without adding additional resources. If the corresponding vSphere cluster has reached its maximum point of expansion, a new provider virtual datacenter should be deployed and any new organization virtual datacenters should be assigned to the new provider virtual datacenter. In this way there is a predetermined amount of capacity available for expanding the existing organization virtual datacenters in the case where the provider virtual datacenter has reached its maximum point of expansion.
• No over-commitment can be applied to the provider virtual datacenter in the Reservation Model, due to the reservation being at the resource pool level
• Monitor the %RESERVED on a regular basis and adjust the value according to historical usage as well as projected demand
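All three consumption models share the same monitoring step: compare the total reserved CPU or memory against a %RESERVED threshold before placing a new organization virtual datacenter. The sketch below shows one way this check might be expressed. It is a simplification made for illustration: the function names are hypothetical, the reservation figures are made up, and treating "approaches the threshold" as "has reached the threshold" is an assumption of this sketch rather than guidance from the text.

```python
def reserved_fraction(org_vdc_reservations, provider_capacity):
    """Fraction of provider vDC capacity currently reserved.

    org_vdc_reservations: reserved amounts (GHz or GB), one per
    organization vDC. provider_capacity: redundancy-adjusted capacity
    of the provider vDC in the same unit.
    """
    return sum(org_vdc_reservations) / provider_capacity


def can_deploy_new_org_vdc(org_vdc_reservations, provider_capacity, threshold):
    """True while the reserved fraction is still below the threshold."""
    return reserved_fraction(org_vdc_reservations, provider_capacity) < threshold


if __name__ == "__main__":
    capacity_ghz = 122.88   # example provider vDC CPU capacity from this appendix
    threshold = 0.60        # initial Pay-As-You-Go %RESERVED suggested above
    reservations = [30.0, 25.0, 10.0]  # hypothetical per-organization figures
    print(round(reserved_fraction(reservations, capacity_ghz), 3))
    print(can_deploy_new_org_vdc(reservations, capacity_ghz, threshold))
```

The same check would be run separately for CPU and for memory, and the threshold would differ per model as described in the recommendations above.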

Storage
VMware vCloud Director uses a largest available capacity algorithm for deploying virtual machines to datastores. Storage capacity must be managed both on an individual datastore basis and in the aggregate for a provider virtual datacenter. In addition to considering VMware storage allocation best practices, manage capacity at the datastore level using the largest virtual machine storage configuration, in terms of units of consumption, offered in the service catalog when determining the amount of spare capacity to reserve.
For example, if using 1 TB datastores (100 storage units of consumption based on a 10 GB unit of consumption) and the largest virtual machine storage configuration is 6 storage units of consumption (60 GB), then applying the VMware best practice of approximately 80% datastore utilization would imply managing to 82 storage units of consumption. This would result in 82% datastore utilization and reserve capacity equivalent to 3 of the largest virtual machines offered in the service catalog in terms of storage.
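The datastore example can be reproduced with a few lines of integer arithmetic. This is a sketch of one reading of the example: it assumes that 1 TB is treated as 1000 GB (as implied by 100 units of 10 GB) and that the spare capacity is rounded down to a whole number of the largest virtual machine configuration. The function name and dictionary keys are illustrative.

```python
def datastore_plan(datastore_gb, unit_gb, largest_vm_units, target_utilization_pct):
    """Work out how many storage units to manage to on one datastore.

    The spare capacity implied by the utilization target is rounded down
    to a whole number of the largest VM storage configuration.
    """
    total_units = datastore_gb // unit_gb
    target_used_units = total_units * target_utilization_pct // 100
    spare_units = total_units - target_used_units
    reserve_vms = spare_units // largest_vm_units
    reserve_units = reserve_vms * largest_vm_units
    manage_to_units = total_units - reserve_units
    return {
        "total_units": total_units,
        "reserve_vms": reserve_vms,
        "reserve_units": reserve_units,
        "manage_to_units": manage_to_units,
        "utilization_pct": manage_to_units * 100 / total_units,
    }


if __name__ == "__main__":
    # 1 TB datastore, 10 GB units, largest VM of 6 units, 80% target.
    print(datastore_plan(1000, 10, 6, 80))
```

Changing the unit size or the largest catalog configuration shows how quickly the reserve grows relative to the datastore, which is useful when choosing a standard datastore size.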

13.2 Network Capacity Planning
A vCloud also brings network capacity planning to the forefront. Providers must consider IP address, VLAN, and ephemeral port capacity. The following table describes what needs to be managed from a capacity perspective and its impact:
Item to Manage     Impact
IP Addresses       • Available IP addresses to be assigned in support of a dedicated external
                     network for an organization, such as for Internet access or
                     hardware-based firewall rules
                   • Need to track IP addresses assigned to specific organizations to determine
                     what’s available for a shared external Organization network
VLANs              VLANs available for VLAN-backed pool assignment, if required
Ephemeral Ports    Additional vCloud Director Network Isolation networks that can be assigned
                   from the vNetwork Distributed Switch (only 1016 ephemeral ports per
                   vNetwork Distributed Switch), if used
Table 20. Network Capacity Planning Items


14. Appendix: Capacity Management
14.1 vCloud-Specific Capacity Forecasting (Demand Management)
Capacity forecasting consists of determining how many organization virtual datacenters are expected to be provisioned during a specific time period. Capacity provisioning is concerned with determining when vCloud Infrastructure components must be purchased in order to maintain capacity. From a financial budget perspective, the procurement of the vCloud Infrastructure requires more planning and understanding of customer future requirements. VMware recommends performing two forecasting functions over time.
• Capacity Trending. Using historical organization virtual datacenter capacity and utilization data, it is possible to predict future capacity requirements.
• Demand Pipeline. Understanding future customer requirements via the sales pipeline provides the necessary information to understand future capacity requirements, as does knowledge of marketing/business development functions bringing new service offerings to market.

Initially, no historical utilization metrics will be available, so it will not be possible to perform capacity trending for some period of time. During this initial period, a good understanding of the customer demand pipeline will need to be established. Over time, this pipeline can be combined with trending analysis to more accurately predict capacity requirements.

The customer demand pipeline must be established in conjunction with the service provider's sales teams, or lines of business (LOB) in the case of a private cloud, so that future vCloud capacity requirements can be determined. This demand pipeline must contain information on all known new customers, expansion of existing customer organization virtual datacenters, projected sizing metrics, and any new service offerings that are in development.

The forecasting plan must fit both the budgetary cycle and the procurement and provisioning timeframes. For example, if a quarterly budgetary cycle exists, and the procurement and provisioning timeframe is one month, it will be necessary to have a pipeline of at least four months to ensure all requests in the pipeline can be fulfilled.

Over time, capacity trending can be used to assist with the forecasting of organization virtual datacenter provisioning needs. It uses historical information to determine trends and validates the organization virtual datacenter forecast based on demand pipeline data.
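The four-month figure in the example is the budgetary cycle plus the procurement and provisioning timeframe. A minimal sketch of that arithmetic follows; the function name is hypothetical and simple addition of the two periods is assumed.

```python
def required_pipeline_months(budget_cycle_months: int,
                             provisioning_months: int) -> int:
    """Minimum demand-pipeline horizon, in months.

    Assumes the pipeline must cover one full budgetary cycle plus the
    time needed to procure and provision new capacity.
    """
    return budget_cycle_months + provisioning_months


# Quarterly budget cycle (3 months) and a one-month provisioning timeframe
horizon = required_pipeline_months(3, 1)
print(horizon)
```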

14.2 Capacity Monitoring and Establishing Triggers
Certain metrics should be monitored carefully to warn of approaching or exceeding consumption thresholds, and are listed in the following table. These metrics should be measured against each vCloud provider virtual datacenter and for each organization virtual datacenter within the provider virtual datacenter. To monitor for threshold breaches, and possible subsequent violation of service level commitments to the cloud consumer, the appropriate tools and triggers are needed for proper notification.
| Attribute | Monitored Per |
|---|---|
| %RESERVED CPU | provider vDC, organization vDC. Note: For the Pay-As-You-Go consumption model this will be the aggregation of reservation values for the contained virtual machines. |
| %RESERVED Memory | provider vDC, organization vDC |
| CPU utilization | provider vDC, organization vDC |
| Memory utilization | provider vDC, organization vDC |
| Datastore utilization | provider vDC |
| Transfer store utilization | vCloud |
| Network IP addresses available | vCloud |
| Network IP addresses consumed | organization |
| Network VLANs available | vCloud |
| Network ephemeral ports consumed | vNetwork Distributed Switch |

Table 21. Capacity Monitoring Metrics

Once thresholds have been exceeded, the group responsible for capacity management of the vCloud should be notified so that capacity can be added. Account for the time required to add the physical components necessary to increase the capacity of a provider virtual datacenter.

A vCloud-aware capacity management tool should be deployed. Whichever tool is chosen, the capacity model can be used to forecast capacity utilization for new provider virtual datacenters as well as for ongoing capacity management of existing provider virtual datacenters. It should also account for expansion triggers based on provisioning timeframes.

Once the total amount of available resources has been calculated for a provider virtual datacenter, no adjustments to that provider virtual datacenter, such as adding or removing hosts, should be made without updating the calculated value. This model may be altered if long-term CPU and memory reservations are not at the levels for which they were designed.

An increase in the resources allocated to an organization virtual datacenter can affect the remaining capacity of a “full” provider virtual datacenter. “Full” provider virtual datacenters should be monitored on a weekly basis. The resource consumption of virtual machines within an organization virtual datacenter should be reviewed for trends that indicate the resources purchased for that organization virtual datacenter are insufficient. vCenter CapacityIQ, while not vCloud Director aware, can be used to provide insight into provider virtual datacenter utilization and trends.
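As an illustration of the trigger logic, the sketch below compares collected metric values against per-attribute thresholds and lists the breaches that would warrant notifying the capacity management group. It is a sketch only: the function name, the scope names, the sample readings, and the 80% threshold values are all assumptions, and gathering the metrics themselves is out of scope.

```python
def find_breaches(metrics, thresholds):
    """Return (scope, attribute, value, limit) for each exceeded threshold.

    metrics    -- {(scope, attribute): percent observed}
    thresholds -- {attribute: percent limit}; attributes without a
                  configured limit are ignored
    """
    breaches = []
    for (scope, attribute), value in sorted(metrics.items()):
        limit = thresholds.get(attribute)
        if limit is not None and value > limit:
            breaches.append((scope, attribute, value, limit))
    return breaches


# Hypothetical readings for one provider vDC and one organization vDC
metrics = {
    ("pvdc-gold", "%RESERVED CPU"): 72.0,
    ("pvdc-gold", "Datastore utilization"): 84.0,
    ("org-vdc-a", "Memory utilization"): 65.0,
}
# Hypothetical thresholds; tune them to the service level commitments
thresholds = {
    "%RESERVED CPU": 80.0,
    "Datastore utilization": 80.0,
    "Memory utilization": 80.0,
}

for scope, attribute, value, limit in find_breaches(metrics, thresholds):
    print(f"{scope}: {attribute} at {value}% exceeds {limit}%")
```

Keeping the thresholds in a separate mapping makes it straightforward to review and adjust alert levels during the periodic planning activities.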

14.3 Capacity Management Manual Processes—Provider Virtual Datacenter
The following cloud administrator capacity management activities include simple, periodic planning activities supported by operational day-to-day activities. Periodic continuous improvement activities are critical to extracting the most value from your vCloud Infrastructure.

Planning activities (initially monthly, then quarterly):
• Determining usable capacity by provider virtual datacenter and organization virtual datacenter (taking into account vSphere overhead)
• Reviewing current utilization
• Reviewing provisioning timeframes for new provider virtual datacenter components (hosts, network, storage)
• Forecasting utilization growth over the coming period (preferably based on actual pipeline validated with historical trending)
• Planning for procurement and implementation of additional capacity over the coming period, including bills of materials and budgets
• Reviewing capacity alert threshold levels and setting alerts for capacity warnings

Operational activities (daily):
• Monitoring for alerts
• Investigating performance issues to determine whether capacity is the root cause
• Initiating and managing the procurement and provisioning of additional provider virtual datacenter capacity

Continuous improvement activities (quarterly/yearly):
• Comparing capacity model utilization levels to observed levels and tuning the model to drive greater utilization without sacrificing reliability
• Optimizing provisioning timeframes (shortening them and making them more predictable)

14.4 End-Customer (Organization) Administrator Perspective
The primary capacity management concern of the organization administrator is capacity management of the organization’s organization virtual datacenters. VMware recommends that all organizations establish a capacity management process based on a standard unit of consumption. The recommended base Unit of Consumption for each resource important to capacity management from an organization administrator perspective is shown below.
| Attribute | Variable | Value |
|---|---|---|
| vCPU | PUC | 1 GHz |
| Memory | MUC | 1 GB |
| Storage | DUC | 10 GB |

Table 22. Organization Virtual Datacenter Units of Consumption

Taking such an approach enables more efficient capacity management: because the virtual machine resource allocations for vApp components are predefined in the service catalog, vCloud Infrastructure resource consumption can be predicted more accurately.

Each organization is provided with a finite quantity of resources (in the case of the allocation and reservation consumption models) from one or more provider virtual datacenters in the form of organization virtual datacenters. As the organization consumes these resources, a trigger point must therefore be defined so that steps are taken to expand the organization virtual datacenter in time. First, define the resource consumption limits for the organization's organization virtual datacenters; these limits determine when action must be taken to head off a potential capacity issue.
| Attribute | Variable | Limit | Description |
|---|---|---|---|
| organization virtual datacenter CPU Peak Utilization | CCPULimit | 80% | The limit for allocating CPU resources within the organization virtual datacenter before expansion is required. This value will vary depending on the consumption model being used. From an organization virtual datacenter perspective, reservation values should be considered equal to the amount of CPU allocated, as reservation values are not available to the organization administrator. |
| organization virtual datacenter Memory Allocation Limit | CmemLimit | 80% | The limit for allocating memory resources within the organization virtual datacenter before expansion is required. This value will vary depending on the consumption model being used. From an organization virtual datacenter perspective, reservation values should be considered equal to the amount of memory allocated, as reservation values are not available to the organization administrator. |

Table 23. Recommended Organization Virtual Datacenter Capacity Thresholds


The amount of CPU and memory resources will vary depending on the size of the organization virtual datacenter contracted for. The table below provides an example of the resources needed to calculate the organization virtual datacenter's capacity.
| Item | Variable | Value | Units |
|---|---|---|---|
| Total organization virtual datacenter vCPU Units of Consumption | SorgVDC | 50 | GHz |
| organization virtual datacenter Memory Allocation in Units of Consumption | MorgVDC | 64 | GB |

Table 24. Sample Organization Virtual Datacenter Resource Allocation

The number of capacity units available within this organization virtual datacenter is found by using the following equations.

Determining organization virtual datacenter memory units of consumption

$$M_{UC,orgVDC} = \frac{C_{memLimit} \times M_{orgVDC}}{M_{UC}}$$

Based on the information from the above tables, the total memory unit of consumption for the organization virtual datacenter is calculated as shown below.

$$M_{UC,orgVDC} = \frac{C_{memLimit} \times M_{orgVDC}}{M_{UC}} = \frac{0.8 \times 64}{1} = 51.2\ \text{GB}$$

This results in 51.2 memory units of consumption for the sample organization virtual datacenter.

Determining organization virtual datacenter CPU units of consumption

$$P_{UC,orgVDC} = \frac{S_{orgVDC} \times C_{CPULimit}}{P_{UC}}$$

Based on the information from the above tables, the CPU units of consumption per organization virtual datacenter are calculated as shown below.

$$P_{UC,orgVDC} = \frac{S_{orgVDC} \times C_{CPULimit}}{P_{UC}} = \frac{50 \times 0.8}{1} = 40\ \text{GHz}$$

This results in 40 CPU units of consumption for this sample organization virtual datacenter.
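Both calculations follow the same pattern: the contracted allocation multiplied by the capacity limit, divided by the unit of consumption. A minimal Python sketch using the sample values from the tables above is shown here; the function name `usable_units` is hypothetical.

```python
def usable_units(allocation: float, limit: float, unit_size: float) -> float:
    """Units of consumption available before expansion is required.

    allocation -- contracted resource amount (GHz or GB)
    limit      -- capacity threshold as a fraction (0.8 for 80%)
    unit_size  -- size of one unit of consumption (1 GHz or 1 GB)
    """
    return allocation * limit / unit_size


# Sample organization virtual datacenter: 64 GB memory, 50 GHz CPU,
# 80% limits, and 1 GB / 1 GHz units of consumption
mem_units = usable_units(64, 0.8, 1)
cpu_units = usable_units(50, 0.8, 1)
print(mem_units, cpu_units)
```

With the sample inputs this gives approximately 51.2 memory units and 40 CPU units, in line with the worked equations.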


14.5 Organization Virtual Datacenter—Specific Capacity Forecasting
Capacity forecasting consists of determining how many virtual machines are expected to be deployed during a specific time period of the organization's choosing. The time period used for the virtual machine forecast should correspond to the budgetary process. Capacity provisioning is concerned with determining when an organization virtual datacenter must be expanded in order to maintain capacity. VMware recommends that organizations perform two forecasting functions over time.
• Capacity Trending. Using historical virtual machine capacity and utilization data, it is possible to predict future capacity requirements.
• Capacity Pipeline. Understanding future end-user virtual machine resource requirements, via IT and LOB projects, provides the necessary information to understand future capacity requirements.

Over time, capacity trending can be used to assist with the forecasting of virtual machine provisioning needs. It uses historical information to determine trends and validates the virtual machine forecast based on pipeline data.

Capacity provisioning depends on determining the point of expansion for the organization virtual datacenter. This is the point of resource consumption at which the process of procuring and expanding the organization virtual datacenter must begin so that reserve capacity is not exhausted before the additional capacity is available. In the vCloud context, this can be considered to be dependent upon the time it takes to process the purchase request for additional organization virtual datacenter resources. Provisioning time can be assumed to be zero but is dependent upon specific contractual agreements with the service provider. The recommended steps to perform capacity trending and to determine a point of organization virtual datacenter expansion follow.

Regularly Collect Organization Virtual Datacenter Consumption Information

The primary issue with the trending of organization virtual datacenter consumption is identifying the point of record for all new virtual machines. This can then be used to determine the capacity trends and therefore the overall need for purchasing additional organization virtual datacenter capacity. To establish the point of record for new virtual machines, the items listed in the following table should be tracked, ideally as virtual machine attributes in a Configuration Management or Capacity Planning Database.
| Variable | Name | Description | Units |
|---|---|---|---|
| orgvDC | Organization virtual datacenter | This is the organization virtual datacenter in which the virtual machine resides. | Identifier |
| Dbuild | Build Date | This is the date the virtual machine is built. | Date |
| NUC,cpu | CPU Units of Consumption | This is the number of CPU units of consumption allocated to the virtual machine. | CPU Units of Consumption |
| NUC,mem | Memory Units of Consumption | This is the number of memory units of consumption allocated to the virtual machine. | Memory Units of Consumption |
| NVGB | Storage | This is the amount of storage (GB) allocated to the virtual machine. | GB |

Table 25. Organization Virtual Datacenter Trending Information


Determine Trending Variables

With the information recorded as described above, it is possible to determine the rate of organization virtual datacenter consumption.

| Variable | Name | Description | Units |
|---|---|---|---|
| T | Time | This is the time between points of observation. | Weeks |
| NcpuUC | New CPU Units | This is the total number of CPU units of consumption required for the forecasted virtual machines. | CPU Units of Consumption |
| NmemUC | New Memory Units | This is the total number of memory units of consumption required for the forecasted virtual machines. | Memory Units of Consumption |
| NVGB | New Storage (GB) | This is the total amount of storage required for the forecasted virtual machines. | GB |
| Tpurchase | Organization Virtual Datacenter Expansion Purchase Time | The amount of time to procure additional organization virtual datacenter resources. | Weeks |

Table 26. Organization Virtual Datacenter Capacity Trending Variables

Determine Standard Rate of Consumption

Determining the Trended Growth Rate

$$\Delta N_{cpuUC} = \frac{N_{cpuUC}}{\Delta T}$$

$$\Delta N_{memUC} = \frac{N_{memUC}}{\Delta T}$$

$$\Delta N_{VGB} = \frac{N_{VGB}}{\Delta T}$$


Determining the Trend

It is important to understand that the rate of increase dictates how far in advance additional organization virtual datacenter resources need to be purchased. The following table presents a sample virtual machine forecast for a quarter along with a sample “time to purchase” value.

| Attribute | Value |
|---|---|
| ΔNcpuUC | 12 |
| ΔNmemUC | 12 |
| ΔNVGB | 360 GB |
| Tpurchase | 2 weeks |
| NcpuUC,cluster | 320 |
| NmemUC,cluster | 717 |
Table 27. Sample Organization Virtual Datacenter Trending Information

In this example, NcpuUC,free and NmemUC,free represent the number of free resources within an organization virtual datacenter at which additional organization virtual datacenter resources should be ordered. To determine the trigger point for ordering when no pipeline data exists, use equation 6, shown below.

Determining Trigger Point For Ordering Capacity Using Trends

$$N_{UC,free} = \Delta N_{UC} \times T_{purchase}$$

For example, from the data provided above, the needed free consumption units are calculated as follows, giving 24 units for both CPU and memory.

$$N_{cpuUC,free} = \Delta N_{cpuUC} \times T_{purchase} = 12 \times 2 = 24\ \text{GHz}$$

$$N_{memUC,free} = \Delta N_{memUC} \times T_{purchase} = 12 \times 2 = 24\ \text{GB}$$

For storage, in this example, the trigger point is calculated at 720 GB:

$$N_{VGB,free} = \Delta N_{VGB} \times T_{purchase} = 360 \times 2 = 720\ \text{GB}$$

Determine the Automatic Point of Expansion

Based on the example above, additional organization virtual datacenter resources would need to be ordered when the available units of CPU or memory fall to 24 GHz or 24 GB respectively, or when storage capacity falls to 720 GB. The additional capacity must be on order by that point, or it will not be available in time to meet demand. Currently there are no tools available to assist in organization virtual datacenter capacity management. However, it is possible to develop scripts to gather pertinent information with tools such as PowerCLI.
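The trending steps can be combined into a small script. The sketch below is illustrative only and written in Python rather than PowerCLI: the function names are hypothetical, growth rates are assumed to be expressed per week to match the purchase time, and the 48-units-over-4-weeks observation is made up for the example. It does not query vCloud Director.

```python
def growth_rate(new_units: float, delta_t_weeks: float) -> float:
    """Trended growth rate: new units consumed per week."""
    return new_units / delta_t_weeks


def trigger_point(rate_per_week: float, purchase_weeks: float) -> float:
    """Free capacity at which an expansion order must be placed."""
    return rate_per_week * purchase_weeks


def needs_order(free_units: float, rate_per_week: float,
                purchase_weeks: float) -> bool:
    """True once free capacity has fallen to the trigger point."""
    return free_units <= trigger_point(rate_per_week, purchase_weeks)


# Hypothetical observation: 48 CPU units added over 4 weeks
cpu_rate = growth_rate(48, 4)

# Sample rates from the text with a two-week purchase time
cpu_trigger = trigger_point(cpu_rate, 2)   # GHz
mem_trigger = trigger_point(12, 2)         # GB
disk_trigger = trigger_point(360, 2)       # GB
print(cpu_trigger, mem_trigger, disk_trigger)

# With 20 GHz free, the order is already due; with 30 GHz it is not
print(needs_order(20, cpu_rate, 2), needs_order(30, cpu_rate, 2))
```

Run on a schedule against recorded consumption data, a check of this kind could raise the expansion request before reserve capacity is exhausted.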


14.6 Capacity Management Manual Processes—Organization Virtual Datacenter
The following organization administrator capacity management activities include simple, periodic planning activities supported by operational day-to-day activities. Periodic continuous improvement activities are critical to extracting the most value from your vCloud.

Planning activities (initially monthly, then quarterly):
• Determining usable capacity by organization virtual datacenter
• Reviewing current utilization (and performance, where possible)
• Reviewing purchasing timeframes for expanding an organization virtual datacenter
• Forecasting utilization growth over the coming period (preferably based on actual pipeline validated by historical trending)
• Reviewing capacity alert threshold levels and setting alerts for capacity warnings

Operational activities (daily):
• Monitoring for alerts
• Investigating performance issues to determine whether capacity is the root cause
• Initiating and managing the procurement and provisioning of additional capacity

Continuous improvement activities (quarterly/yearly):
• Comparing capacity model utilization levels to observed levels and tuning the model to drive greater utilization without sacrificing reliability

VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 www.vmware.com
Copyright © 2010 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at http://www.vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies. Item No: VMW_11Q1_WP_ArchitectingvCloud_p100_R2
