Oracle 6140 Copyback doesn’t start automatically

Again this week we had a disk fail in one of our ageing 6140 arrays. No big deal, but once I’d pulled the failed disk, waited a few minutes and fitted a new one, copyback didn’t start of its own accord. I’ve seen this a number of times before, but not for a year or so now.
It mostly happens when the disk fails with:

Event Message: Drive by-passed
Component type: Drive

It can sometimes also declare the disk Missing. Here’s how to breathe some life into the process. Before we start: this will only work when the array is on 07.xx firmware.

You will have a disk looking something like this:

Replaced Drive - No copyback


First, in the Logical tab, click on the volume group (VG) that the disk belongs to.

Now right-click the volume group name and click on ‘Replace drives’.

Replace drives


This brings up a new window. The top panel shows the missing disk, and the bottom panel should list the in-use hot spare (HS), if you had one, along with any unassigned disks in the array.

Select drives to replace


The disk you replaced should show here as an unassigned disk. Click on it in the bottom panel and ‘Replace drive’ becomes available; click that and the copyback will start.

If you don’t want the copyback to happen and would rather keep the HS as part of the VG, click on the HS instead. That flips it from an ‘in-use hot spare’ to a member of the VG, but you will then be one HS short unless you make the other disk the HS.

Once you make your disk selection, copyback should begin.

Copyback Begins


Pretty simple, but perhaps not obvious. I certainly can’t remember it first time every time, so in some ways this post is for me.

Sun StorageTek/Oracle 6000 series firmware preparation

I recently had some good feedback from a couple of really nice folk asking for advice/help/comment on some Sun/STK/Oracle 6xxx series issues they were having. I’ll start to try and put more stuff up about the arrays.

One very decent document I read was the upgrade guide. It’s a pretty easy read and contains good information about the upgrade process.

Sun Storage 6000 Series Array Firmware Upgrade Guide

To that end, it’s not exactly obvious, since the move away from Sunsolve, how to find the software in the MOS portal. To locate the 6xxx series software for the firmware upgrade process, I’ve put together this ‘where is it’ step by step.

1. Log in to My Oracle Support at https://support.oracle.com/.
2. Along the top of the window that opens as your first page, click on the ‘Patches & Updates’ tab.
3. In the Patch Search pane, click on “Product or Family (Advanced Search)”.
4. Tick the box for “Include all products in a family”.
5. In the Product field, click the drop-down and select “Sun StorageTek 6000 Series Software”.
6. In the Release field, select “Sun StorageTek 6000 Series software 1.0”. It should already be selected, but just check.
7. Select the platform you will install the tool on and click Search. As I wanted to see everything, I just clicked Search.

My Oracle Support - Sun storagetek 6000 series firmware

8. This will take you to a new window with your search results. Look for Patch 10265930: “Sun StorageTek 6000 Series Array Firmware Upgrade Utility”.
9. Download the zip file and extract the executables.
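
If you would rather script the extraction in step 9, here is a minimal Python sketch. The function name is my own, and the archive path is whatever your download was saved as (I’m not assuming any particular file name). It checks the zip for corruption before unpacking, which is handy after a large download.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the name of the first corrupt member, or None.
        bad = archive.testzip()
        if bad is not None:
            raise zipfile.BadZipFile(f"corrupt member in download: {bad}")
        archive.extractall(dest)
        return archive.namelist()
```

Call it with the path to the downloaded zip and a target directory; the returned list shows what was unpacked so you can spot the installer.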

Once you have the file extracted and installed, we proceed to the firmware updates themselves.

vSphere4 causes AVT on Oracle (Sun/STK) 6140

Ever since we started our vSphere4 upgrade, our 6140 arrays have been behaving well, but differently. Each new vSphere4 server we bring online results in an AVT (automated volume transfer) of a random LUN. This process is ordinarily used for link failures or maintenance, when the array needs to serve a host via an alternate path, which is of course the beauty of multipath I/O. Recovery Guru throws out a message like this:

Volume not on preferred path

As soon as we bring a new vSphere4 server onto the bus, it ‘walks’ the bus, and this results in one or more random LUNs moving between controllers. The first time it did this I was very dubious and went ploughing through logs before reaching a conclusion. However, it’s repeatable and in reality fairly harmless. If you see it, don’t panic; just move the LUN back again and make sure your LUNs are balanced.
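
To make the ‘move it back and check the balance’ step concrete, here is a small Python sketch. The volume names and ownership values are made up for illustration; in practice you would read them off the array’s management software. It only shows the bookkeeping: which volumes are off their preferred controller, and how many each controller currently owns.

```python
from collections import Counter

# Hypothetical snapshot of volume ownership (illustrative values only).
VOLUMES = {
    "vmfs_lun_01": {"preferred": "A", "current": "A"},
    "vmfs_lun_02": {"preferred": "B", "current": "A"},  # moved by an AVT
    "vmfs_lun_03": {"preferred": "A", "current": "A"},
    "vmfs_lun_04": {"preferred": "B", "current": "B"},
}


def off_preferred_path(volumes):
    """Return the names of volumes whose current owner is not the preferred one."""
    return sorted(
        name for name, owner in volumes.items()
        if owner["current"] != owner["preferred"]
    )


def owner_counts(volumes):
    """Count how many volumes each controller currently owns."""
    return Counter(owner["current"] for owner in volumes.values())


if __name__ == "__main__":
    print("Move back:", off_preferred_path(VOLUMES))
    print("Current balance:", dict(owner_counts(VOLUMES)))
```

With the sample data, one volume has drifted to controller A, leaving A with three volumes and B with one; moving it back restores the two-and-two split.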

Why is it doing it? Well, vSphere4 does a discovery on any bus, so it knows about all the disks being offered or present on that bus. This is meant to be helpful, in that you don’t have to use the virtual centre to do storage adaptor refreshes; it’s done for you. vSphere4 also tries to encapsulate any volume or LUN it finds. All of this is meant to happen, but on the Sun/Oracle hardware, just be aware of this side effect.

This is also the reason why, with vSphere4 and Crystal firmware, you MUST delete LUN ID 31, to avoid any issues going forward with VMware having encapsulated your in-band SANtricity LUN (LUN 31).