Disaster Recovery_srinivas - Copy

Published on March 2017

DISASTER RECOVERY

A disaster is a situation in which critical components of the R/3 System environment become unavailable, so that service cannot be resumed within a short period of time. The critical components are the database and the R/3 application host instance that runs the message and enqueue services.

Risk Analysis

Before starting to build a disaster recovery site, identify the vulnerable aspects of the system environment and consider the system uptime the business requires, weighed against the cost of failure and the return on investment (ROI).

Factors Affecting the Business Decision for a Disaster Recovery Site

1. Environmental factors - the likelihood of a disaster such as an earthquake.
2. The time expected to replace critical hardware components or reconfigure critical software components is longer than the period that would put your entire business at risk.
3. Recovery time and recovery point - estimating the tolerable recovery time and recovery point.
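The weighing of failure cost against return on investment can be sketched as a small calculation. This is only an illustration under stated assumptions: the field names, the simple expected-loss formula (probability x outage hours x hourly cost), and every figure in the usage note are hypothetical and do not come from the text.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    # All fields are hypothetical inputs chosen for illustration.
    disaster_probability_per_year: float  # likelihood of a disaster in a given year
    rebuild_hours_without_dr: float       # time to replace hardware and restore from backup
    failover_hours_with_dr: float         # time to bring the standby site online
    downtime_cost_per_hour: float         # business cost of one hour of outage
    dr_site_cost_per_year: float          # yearly cost of running the DR site
    tolerable_outage_hours: float         # longest outage the business can tolerate


def expected_annual_loss(probability: float, outage_hours: float, cost_per_hour: float) -> float:
    """Expected yearly loss = probability of disaster x outage length x hourly cost."""
    return probability * outage_hours * cost_per_hour


def assess(r: RiskInputs) -> dict:
    """Compare the expected loss with and without a DR site."""
    loss_without = expected_annual_loss(
        r.disaster_probability_per_year, r.rebuild_hours_without_dr, r.downtime_cost_per_hour
    )
    loss_with = expected_annual_loss(
        r.disaster_probability_per_year, r.failover_hours_with_dr, r.downtime_cost_per_hour
    )
    avoided = loss_without - loss_with
    return {
        "avoided_loss_per_year": avoided,
        "net_benefit_per_year": avoided - r.dr_site_cost_per_year,
        # Factor 2 in the list above: does a rebuild take longer than the business can survive?
        "rebuild_exceeds_tolerable_outage": r.rebuild_hours_without_dr > r.tolerable_outage_hours,
    }
```

For example, `assess(RiskInputs(0.02, 120, 4, 50_000, 100_000, 24))` uses invented figures to compare a five-day rebuild with a four-hour failover. A purely financial comparison like this is only one input; the tolerable-outage check can justify a DR site even when the net benefit is small.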

Process

1. Protect the R/3 application host running the enqueue and message services. This can be achieved by having a standby system available (at a remote site) that can be started up in the event of a disaster.
2. Protect the database. The entire database can be replicated, but you have to use a method provided by the database vendor. This approach is known as hot site backup or standby database.

The products mentioned below follow the concept of replication transparency. This means that the functionality to achieve replication is built into the database service instead of having to be coded by the client applications.

The normal method of building a system from scratch after a disaster requires several steps. First, the hardware must be procured and configured. The operating system must then be installed to the pre-disaster OS configuration. After that the database can be recovered, depending on the availability of logs and offline or online tape backups, followed by the steps required to rebuild a local cluster, based on the business need. It takes a long time to fully recover the database and roll it forward. Sometimes a full recovery is not possible.

Campus Clusters

Clustering products can improve the availability of an SAP system by providing fast and automatic recovery from failures. A cost-effective way of stretching a cluster across a campus or larger site, up to 10 km, is to use software RAID 1 for the shared disks. In the campus cluster configuration with DB and CI packages or resource groups, the shared storage system is mirrored from within each cluster server's OS. The cluster quorum or lock disk is also mirrored. Dual, redundant Fibre Channel paths are used between the servers and the storage, and FDDI is used for the cluster IP networks so that they remain in the same IP subnet across the 10 km distance.

Because this configuration depends on software RAID 1, it requires a reliable file system. Presently such a file system is only available on Unix clusters. Windows 2000 clusters with MSCS cannot support software mirroring of file systems. Microsoft has announced support for the Veritas file system with clustering, which would allow a campus clustering configuration.

This solution is cost-effective because the shared disk systems can be mid-range systems and only two server nodes are required. Because software RAID 1 is used over a large distance, a fast storage system with a large cache and an optimum layout is needed to avoid unnecessary I/O delays and keep database response times low. A drawback of this solution is the split-brain syndrome problem.

Metro Clusters

This solution can span city-wide or metropolitan distances (less than 60 km). A metro cluster is designed for automatic failover in a disaster recovery environment. This solution gives the highest level of availability that hardware-based clustering can offer and is in production at many SAP customers. The important function of this solution is to automatically switch the remote DR storage system into read/write mode (write-protect turned OFF) so it can properly fail over the database in case of a primary site failure.

At least six server nodes are configured in the metro cluster, although more are allowed: two in the primary data center, two in the disaster recovery center, and two additional servers in a third location that act as cluster arbitrators. The arbitrator servers are required because there is no centralized cluster lock disk or quorum disk in a split cluster configuration. The distance is limited to the metropolitan range because the disk write I/O commands must be synchronized: the I/O cycle is complete only when both storage systems have written the I/O into their cache and acknowledged it.

Split Brain Syndrome

With geographically split data centers, the communication links between the cluster nodes may go down while the cluster nodes themselves keep functioning. In this case each cluster node thinks the other is down because the cluster heartbeat cannot make contact with the remote server node, and each attempts to take over the shared resource, resulting in an integrity problem. This is the reason for having arbitration servers in the third data center: they ensure membership consistency in the cluster. The arbitration servers can be workstations running other applications. They simply must run a small background task that acts as arbitrator whenever cluster communication or failure events occur.

DR Clusters with Microsoft Cluster Server

Disaster recovery can be configured using the enterprise storage copy in a Microsoft Cluster Server environment. The configuration can use two cluster server nodes that are configured to use the primary storage system. The secondary storage system maintains a copy of the database volumes, but in an unshared, read-only mode. During normal operation it is not visible to the server nodes. If the primary storage system fails, the disk volumes of the remote or standby system can be manually set to read/write mode by an IT administrator.

Continental Clusters

Clusters across distances greater than metropolitan areas can be supported with SAP clusters, which may be interesting for organizations that need a disaster recovery solution beyond the immediate geographic region. This protects against environmental events that affect an entire area, such as hurricanes and earthquakes. This solution uses an ESCON connection over the WAN to make physical disk copies and can therefore support both continental and intercontinental distances. The longer distances require the continental cluster solution to employ asynchronous disk I/O: the I/O is acknowledged as soon as the local disk storage system successfully writes it into cache, without waiting for the second disk system to acknowledge.

The ESCON over WAN links are slower than pure ESCON or Fibre Channel, so full data replication of entire disk volumes is not feasible in this configuration. Tape backups and restores are needed to initially synchronize large data volumes and for recovery. Failover is typically in one direction only, so this solution is intended for rarer but real disaster scenarios. Recovery is not automatic; some manual intervention is required.
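The difference between the synchronous disk I/O of a metro cluster and the asynchronous disk I/O of a continental cluster can be sketched as follows. This is a toy model, not vendor code: the class names `StorageArray`, `SynchronousMirror`, and `AsynchronousMirror` are hypothetical, and a real storage system handles acknowledgement inside its controllers rather than in application code.

```python
from collections import deque


class StorageArray:
    """Hypothetical stand-in for a disk storage system with a write cache."""

    def __init__(self, name):
        self.name = name
        self.cache = []

    def write(self, block):
        self.cache.append(block)
        return True  # acknowledgement that the block is in cache


class SynchronousMirror:
    """Metro style: the I/O completes only after both arrays acknowledge."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def write(self, block):
        local_ack = self.local.write(block)
        remote_ack = self.remote.write(block)  # the caller waits for the remote site
        return local_ack and remote_ack


class AsynchronousMirror:
    """Continental style: acknowledge after the local write, copy to remote later."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = deque()  # blocks not yet copied to the remote array

    def write(self, block):
        ack = self.local.write(block)
        self.pending.append(block)  # queued; the caller does not wait for the remote site
        return ack

    def drain(self, max_blocks):
        """Copy up to max_blocks queued blocks over the slow link."""
        sent = 0
        while self.pending and sent < max_blocks:
            self.remote.write(self.pending.popleft())
            sent += 1
        return sent

    def unreplicated(self):
        """Blocks that would be missing at the remote site if the primary failed now."""
        return list(self.pending)
```

In the asynchronous case, whatever is still in `pending` when the primary site is lost never reaches the remote site, which is the trade-off accepted in exchange for supporting longer distances.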

Database Failover

Database failover solutions make logical copies of the data. Failover to a standby or recovery database server can be used effectively as an alternative to HA clustering in an SAP production environment.

Remote standby database server - A cost-effective way to decrease the recovery time after a local disaster is to use a remote standby database server. This approach is based on sending the log files to a remote server that runs the database in recovery mode. It can be used to recover from disasters, to provide recovery from logical errors, for fast restores, and for decoupled backups. In a disaster scenario at the primary data center, the standby database can be recovered up to the last available log file and set online in read/write mode. This helps reduce the time needed to restore, and the standby can be used while the primary data center is being rebuilt.

This solution requires two copies of the database software, two database servers, and storage with identical copies of the data, achieved with a full backup and restore. The remote database can be built with a lower-performance, cost-effective disk layout. In a disaster situation the SAP application servers need to be pointed to the new database host, or a complete set of SAP application servers can be built and started alongside the remote database server. Manual intervention is required to switch the DR database server out of recovery mode into online read/write mode.

Only the archived or inactive logs are sent to the remote system. If a disaster hits the primary database server, recovery on the remote server is therefore possible only up to the last archived log. The advantage of this solution is its protection against logical errors, which can destroy the integrity of the primary database. In Microsoft SQL Server this technique is called log shipping.

One drawback of this solution is that structural changes are not reflected in the logs, so they are not sent to the standby database. Third-party solutions are available that track database changes or DB catalog files for structural changes and provide more control over the time delay of the log recovery.
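The behaviour of a standby database fed by shipped logs can be sketched with a small model. It is an illustration under assumptions, not any vendor's mechanism: the `StandbyDatabase` class, its `apply_delay` parameter, and the log names are hypothetical, and the delayed apply is one plausible reading of how such a standby protects against logical errors.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyDatabase:
    """Hypothetical standby that applies shipped archive logs after a delay."""

    apply_delay: int                               # time units to hold a log before applying it
    received: list = field(default_factory=list)   # (archived_at, log_name), in arrival order
    applied: list = field(default_factory=list)    # names of logs already rolled forward
    online: bool = False

    def receive(self, archived_at, log_name):
        """Accept an archived log; the active log on the primary is never shipped."""
        if self.online:
            raise RuntimeError("standby is online and no longer in recovery mode")
        self.received.append((archived_at, log_name))

    def apply_due_logs(self, now):
        """Roll forward, in order, every received log older than the delay."""
        while self.received and now - self.received[0][0] >= self.apply_delay:
            self.applied.append(self.received.pop(0)[1])

    def activate(self, apply_remaining=True):
        """Manual step: leave recovery mode and go online in read/write mode.

        Disaster: apply all shipped logs first (recover to the last archived log).
        Logical error: skip unapplied logs so the error is not replayed.
        """
        if apply_remaining:
            self.applied.extend(name for _, name in self.received)
        self.received.clear()
        self.online = True
        return self.applied[-1] if self.applied else None
```

With logs archived at times 0, 10, and 20 and a delay of 15, only the first log has been applied at time 20. Activating with `apply_remaining=True` recovers up to the third log, while `apply_remaining=False` leaves the standby at the state before the later logs, which is what makes the delay useful after a logical error.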
