How to Replace a Failed SVM Disk

Published on June 2016

Before you replace what you believe is a failed Solaris Volume Manager (SVM) disk, establish whether it has actually failed or is still in the process of failing. The distinction matters because replacing a failed SVM disk can take a little less time than replacing a failing one. Read How To Tell The Difference Between A Failed Disk And A Failing Disk to find out which one your disk is. If your disk hasn’t quite failed yet, see How To Replace A Failing SVM Disk instead.

Once you have established that the disk has failed, find out whether it contains any SVM metadatabase replicas, and delete them if it does. The examples in this article assume that the failed disk is c1t1d0.
# metadb | grep c1t1d0
      W   p  l          16              8192            /dev/dsk/c1t1d0s7
      W   p  l          8208            8192            /dev/dsk/c1t1d0s7
      W   p  l          16400           8192            /dev/dsk/c1t1d0s7
#
# metadb -d c1t1d0s7
#
# metadb
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c1t0d0s7
     a    p  luo        16400           8192            /dev/dsk/c1t0d0s7
#
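If you would rather not pick the slice name out of the metadb output by eye, the lookup can be scripted. The following is a minimal sketch and not part of the original procedure: replica_slices is a hypothetical helper name, and it assumes that the device path is the last field on each replica line, as in the output above.

```shell
# replica_slices DISK
# Reads `metadb` output on stdin and prints each slice on DISK that
# holds at least one replica (one slice name per line, no duplicates).
# Hypothetical helper; assumes the device path is the last field.
replica_slices() {
    awk -v d="/dev/dsk/${1}s" '
        # Prefix match on "/dev/dsk/<disk>s" so c1t1d0 does not match c1t10d0.
        index($NF, d) == 1 { s = $NF; sub("^/dev/dsk/", "", s); seen[s] = 1 }
        END { for (s in seen) print s }'
}

# Possible use on the host (commented out; requires SVM):
#   metadb | replica_slices c1t1d0      # prints c1t1d0s7 for the disk above
#   metadb -d c1t1d0s7
```

If the function prints nothing, the disk holds no replicas and the metadb -d step can be skipped.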

Unconfigure the failed SVM disk.
# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 CD-ROM       connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    configured   unknown
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
c2                             scsi-bus     connected    unconfigured unknown
c3                             fc-fabric    connected    configured   unknown
c3::5006016239a02018           disk         connected    configured   unknown
c3::5006016b39a02018           disk         connected    configured   unknown
c3::5006048452a70c17           disk         connected    configured   unknown
c3::5006048c52a70c07           disk         connected    configured   unknown
c4                             fc-fabric    connected    configured   unknown
c4::5006016339a02018           disk         connected    configured   unknown
c4::5006016a39a02018           disk         connected    configured   unknown
c4::5006048452a70c18           disk         connected    configured   unknown
c4::5006048c52a70c08           disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
#
# cfgadm -c unconfigure c1::dsk/c1t1d0
cfgadm: Component system is busy, try again: failed to offline:
     Resource              Information
------------------  ------------------------
/dev/dsk/c1t1d0s2   Device being used by VxVM
#

Note: This host uses SVM to manage its internal disks and Veritas Volume Manager (VxVM) to manage SAN-attached disks. VxVM keeps track of the internal disks, even though it doesn’t actually manage them, and may not allow you to unconfigure them. To get around this restriction, you may need to forcibly unconfigure the failed SVM disk by passing the -f option to cfgadm.
# cfgadm -f -c unconfigure c1::dsk/c1t1d0
#
# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 CD-ROM       connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    configured   unknown
c1::dsk/c1t1d0                 disk         connected    unconfigured unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
c2                             scsi-bus     connected    unconfigured unknown
c3                             fc-fabric    connected    configured   unknown
c3::5006016239a02018           disk         connected    configured   unknown
c3::5006016b39a02018           disk         connected    configured   unknown
c3::5006048452a70c17           disk         connected    configured   unknown
c3::5006048c52a70c07           disk         connected    configured   unknown
c4                             fc-fabric    connected    configured   unknown
c4::5006016339a02018           disk         connected    configured   unknown
c4::5006016a39a02018           disk         connected    configured   unknown
c4::5006048452a70c18           disk         connected    configured   unknown
c4::5006048c52a70c08           disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
#
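The Occupant column can also be checked from a script rather than by reading the whole listing. This is only a sketch under the assumption that cfgadm -al prints its columns in the order Ap_Id, Type, Receptacle, Occupant, Condition, as it does on this host; occupant_state is a hypothetical helper name.

```shell
# occupant_state AP_ID
# Reads `cfgadm -al` output on stdin and prints the Occupant column
# (4th field) for the attachment point AP_ID. Prints nothing if the
# attachment point is not listed.
occupant_state() {
    awk -v id="$1" '$1 == id { print $4 }'
}

# Possible use on the host (commented out; requires cfgadm):
#   state=$(cfgadm -al | occupant_state c1::dsk/c1t1d0)
#   [ "$state" = "unconfigured" ] && echo "safe to pull the disk"
```

The same helper can be reused after the new disk is configured, where the expected value is "configured".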

Verify that the failed SVM disk is marked “unconfigured”, as above. On Sun servers with hot-swappable disks, the disk’s blue “ready to remove” LED will also be lit. Pull the failed SVM disk out of the drive bay and insert the new disk. A message like the following will appear in /var/adm/messages.
Jul 20 14:46:09 eap52 rmclomv: [ID 978967 kern.error] DISK @ HDD1 has been inserted.

Configure the new disk.
# cfgadm -c configure c1::dsk/c1t1d0
#
# cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 CD-ROM       connected    configured   unknown
c1                             scsi-bus     connected    configured   unknown
c1::dsk/c1t0d0                 disk         connected    configured   unknown
c1::dsk/c1t1d0                 disk         connected    configured   unknown
c1::dsk/c1t2d0                 disk         connected    configured   unknown
c1::dsk/c1t3d0                 disk         connected    configured   unknown
c2                             scsi-bus     connected    unconfigured unknown
c3                             fc-fabric    connected    configured   unknown
c3::5006016239a02018           disk         connected    configured   unknown
c3::5006016b39a02018           disk         connected    configured   unknown
c3::5006048452a70c17           disk         connected    configured   unknown
c3::5006048c52a70c07           disk         connected    configured   unknown
c4                             fc-fabric    connected    configured   unknown
c4::5006016339a02018           disk         connected    configured   unknown
c4::5006016a39a02018           disk         connected    configured   unknown
c4::5006048452a70c18           disk         connected    configured   unknown
c4::5006048c52a70c08           disk         connected    configured   unknown
usb0/1                         unknown      empty        unconfigured ok
usb0/2                         unknown      empty        unconfigured ok
usb1/1                         unknown      empty        unconfigured ok
usb1/2                         unknown      empty        unconfigured ok
#

Verify that the new disk has been configured as above. Copy the volume table of contents (VTOC) from the other disk in the mirror set, c1t0d0, onto the new disk.
# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard:  New volume table of contents now in place.
#

If the command returns an error similar to “/dev/rdsk/c1t1d0s2: Cannot get disk geometry”, the new disk has no label yet. Run format to label the disk, then repeat the VTOC copy.

# format
Searching for disks...done

c1t1d0: configured with capacity of 72.36GB

AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@0,0
       1. c1t1d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@1,0
       2. c1t2d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@2,0
       3. c1t3d0 <SUN72G cyl 14087 alt 2 hd 24 sec 424>
          /pci@1f,700000/scsi@2/sd@3,0
Specify disk (enter its number): 1
selecting c1t1d0
[disk formatted]
Disk not labeled.  Label it now? y

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> q
#
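Once the VTOC has been copied (after labeling the disk, if that was necessary), you may want to confirm that both disks now carry the same partition table. The sketch below is one way to do that and is not part of the original procedure. It compares two saved prtvtoc listings while ignoring the comment lines, which begin with an asterisk and include the device name. The function names and the /tmp file names are hypothetical.

```shell
# vtoc_body: strip the "*" comment lines from prtvtoc output on stdin,
# leaving only the partition rows.
vtoc_body() {
    grep -v '^\*'
}

# same_vtoc FILE1 FILE2
# Succeeds (exit status 0) if the partition rows in the two saved
# prtvtoc listings are identical.
same_vtoc() {
    [ "$(vtoc_body < "$1")" = "$(vtoc_body < "$2")" ]
}

# Possible use on the host (commented out; file names are arbitrary):
#   prtvtoc /dev/rdsk/c1t0d0s2 > /tmp/vtoc.c1t0d0
#   prtvtoc /dev/rdsk/c1t1d0s2 > /tmp/vtoc.c1t1d0
#   same_vtoc /tmp/vtoc.c1t0d0 /tmp/vtoc.c1t1d0 && echo "partition tables match"
```

This comparison only makes sense when both disks have the same geometry, as the two SUN72G drives in this mirror do.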

Recreate the metadatabase replicas on the new disk.
# metadb -a -c 3 c1t1d0s7
#
# metadb
        flags           first blk       block count
     a m  p  luo        16              8192            /dev/dsk/c1t0d0s7
     a    p  luo        8208            8192            /dev/dsk/c1t0d0s7
     a    p  luo        16400           8192            /dev/dsk/c1t0d0s7
     a        u         16              8192            /dev/dsk/c1t1d0s7
     a        u         8208            8192            /dev/dsk/c1t1d0s7
     a        u         16400           8192            /dev/dsk/c1t1d0s7
#
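To confirm at a glance that both disks hold the same number of replicas again, the metadb listing can be summarized per disk. This is a minimal sketch that assumes the device path is the last field of each replica line, as in the output above; replicas_per_disk is a hypothetical helper name.

```shell
# replicas_per_disk
# Reads `metadb` output on stdin and prints "<disk> <replica count>"
# for every disk that holds replicas, sorted by disk name.
replicas_per_disk() {
    awk '
        $NF ~ "^/dev/dsk/" {
            d = $NF
            sub("^/dev/dsk/", "", d)   # drop the directory prefix
            sub("s[0-9]+$", "", d)     # drop the slice suffix, e.g. s7
            count[d]++
        }
        END { for (d in count) print d, count[d] }' | sort
}

# Possible use on the host (commented out; requires SVM):
#   metadb | replicas_per_disk
```

For the listing above this should report three replicas each for c1t0d0 and c1t1d0.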

Update the new disk’s device ID entry in SVM. This step may not be required but it’s a good idea to do it just in case.
# metadevadm -u c1t1d0
Updating Solaris Volume Manager device relocation information for c1t1d0
Old device reloc information:
        id1,sd@THITACHI_HUS103073FL3800_V3X6MDDA
New device reloc information:
        id1,sd@THITACHI_HUS103073FL3800_V3X6MDDA
#

Enable the submirrors on the replacement disk. Start with the swap partition, as this won’t affect any data if SVM runs into a problem. You may enable the submirrors on the new disk in parallel or in sequence: if the I/O load on the system is heavy, do it in sequence; otherwise, enable them in parallel.
# metareplace -e d1 c1t1d0s1
d1: device c1t1d0s1 is enabled
solaris_1# metastat d1
d1: Mirror
    Submirror 0: d11
      State: Okay
    Submirror 1: d21
      State: Resyncing
    Resync in progress: 0 % done
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 10491456 blocks (5.0 GB)

d11: Submirror of d1
    State: Okay
    Size: 10491456 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s1          0     No            Okay   Yes

d21: Submirror of d1
    State: Resyncing
    Size: 10491456 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s1          0     No       Resyncing   Yes

Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,sd@SFUJITSU_MAW3073NCSUN72G_000707B0KHT4____DAN0P720KHT4
c1t1d0   Yes    id1,sd@THITACHI_HUS103073FL3800_V3X6MDDA
#

SVM will resync the submirrors as soon as they are enabled. This is done in the background and may take a fair amount of time depending on the size of the submirrors. Now is a good time to go for a cup of coffee. Don’t forget to check the progress of the resync when you return.
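If you would rather have the system tell you when the resync has finished, the progress line in the metastat output can be polled. The following is a sketch only. It assumes that metastat prints a "Resync in progress: N % done" line while a mirror is resyncing and omits it afterwards, as in the output above. The function names and the METASTAT and POLL_SECONDS variables are hypothetical, introduced here so the command and the polling interval can be overridden.

```shell
# resync_percent
# Reads `metastat` output on stdin. Prints the percentage from the first
# "Resync in progress" line, or "done" if there is no such line.
resync_percent() {
    awk '/Resync in progress:/ && !found { print $4; found = 1 }
         END { if (!found) print "done" }'
}

# wait_for_resync MIRROR
# Polls the mirror until no resync is reported, printing progress.
# METASTAT (default: metastat) and POLL_SECONDS (default: 60) are
# optional overrides.
wait_for_resync() {
    mirror=$1
    while :; do
        pct=$("${METASTAT:-metastat}" "$mirror" | resync_percent)
        [ "$pct" = "done" ] && break
        echo "$mirror: resync ${pct}% complete"
        sleep "${POLL_SECONDS:-60}"
    done
    echo "$mirror: no resync in progress"
}

# Possible use on the host (commented out; requires SVM):
#   metareplace -e d1 c1t1d0s1 && wait_for_resync d1
```

Calling wait_for_resync between successive metareplace commands is also one way to enable the submirrors in sequence on a heavily loaded system.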
