Replacing a Failed Physical Disk with a New One on HP 3PAR Storage

On HP 3PAR Storage, disks are grouped into magazines. To replace a failed disk, the magazine that holds it must first be taken offline with the servicemag start command.


First, check whether the system has a failed disk:

cli% showpd -failed -degraded

                            -Size(MB)-- ----Ports----
 Id CagePos Type RPM State   Total Free A      B      Cap(GB)
125 3:1:1   FC    15 failed 278528    0 0:6:4  1:6:4*     300
-------------------------------------------------------------
  1 total                   278528    0

The output shows one failed 300 GB FC disk with ID 125.
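
If this check is scripted, the disk rows can be picked out of the text output. The following is only a minimal sketch in Python, assuming the output has already been captured as a string and keeps the column layout shown above; the function name parse_failed_pds and the dictionary keys are made up for illustration and are not part of the 3PAR CLI.

```python
def parse_failed_pds(showpd_output):
    """Return one dict per physical disk row in `showpd -failed -degraded` text.

    Assumption: columns are whitespace separated in the order
    Id, CagePos, Type, RPM, State, ... as in the sample below.
    """
    disks = []
    for line in showpd_output.splitlines():
        fields = line.split()
        # Disk rows start with a numeric Id followed by a cage:magazine:disk
        # position; header, separator and "total" rows do not match.
        if len(fields) >= 5 and fields[0].isdigit() and fields[1].count(":") == 2:
            disks.append({
                "id": int(fields[0]),
                "cagepos": fields[1],
                "type": fields[2],
                "state": fields[4],
            })
    return disks


SAMPLE = """\
 Id CagePos Type RPM State   Total Free A      B      Cap(GB)
125 3:1:1   FC    15 failed 278528    0 0:6:4  1:6:4*     300
-------------------------------------------------------------
  1 total                   278528    0
"""

failed = parse_failed_pds(SAMPLE)
print(failed)
```

The id value is what servicemag start -pdid expects, and the first two parts of cagepos are the cage and magazine numbers.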



Second, check whether a servicemag operation is already in progress:

cli% servicemag status

No servicemag operations logged.



No operation is in progress, so we can start the servicemag operation:

cli% servicemag start -pdid 125

Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag start -pdid 125
... servicing disks in mag: 3 1
...      normal disks:  WWN [XXXXXXXXXXXX] Id [126]  diskpos [2]
....................    WWN [XXXXXXXXXXXX] Id [127]  diskpos [3]
....................    WWN [XXXXXXXXXXXX] Id [206]  diskpos [0]
...  not normal disks:  WWN [XXXXXXXXXXXX] Id [125]  diskpos [1]

The servicemag start operation will continue in the background.
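
The cage and magazine numbers printed by servicemag start are the same two arguments that servicemag resume takes later in this post. As a small sketch, assuming the "servicing disks in mag" line keeps the wording shown above, they can be captured once so they are not mistyped afterwards; resume_command is a hypothetical helper name.

```python
import re


def resume_command(start_output):
    """Build the matching `servicemag resume <cage> <mag>` command line.

    Assumption: the start output contains a line such as
    "... servicing disks in mag: 3 1", as in the transcript above.
    Returns None when no such line is found.
    """
    match = re.search(r"servicing disks in mag:\s*(\d+)\s+(\d+)", start_output)
    if match is None:
        return None
    cage, mag = match.groups()
    return f"servicemag resume {cage} {mag}"


line = "... servicing disks in mag: 3 1"
print(resume_command(line))
```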



Keep track of the servicemag status and the physical disk state while the operation runs. Note that the magazine position is 3 1 (cage 3, magazine 1); we will need it later to bring the magazine back online:

cli% servicemag status

Cage 3, magazine 1:
The magazine is being brought offline due to a servicemag start.
The last status update was at Thu Sep 28 11:19:55 2017.
Unable to provide a relocation estimate
servicemag start -pdid 125 -- is in Progress

cli% servicemag status

Cage 3, magazine 1:
The magazine is being brought offline due to a servicemag start.
The last status update was at Thu Sep 28 11:19:55 2017.
Chunklets relocated: 306 in 11 minutes and 41 seconds
Chunklets remaining: 662
Chunklets marked for moving: 662
Estimated time for relocation completion based on 2 seconds per chunklet is: 22 minutes and 4 seconds
servicemag start -pdid 125 -- is in Progress

cli% showpd -state

 Id CagePos Type -State- ---------------------Detailed_State---------------------
  0 0:0:0   FC   normal  normal
  1 0:0:1   FC   normal  normal
  2 0:0:2   FC   normal  normal
  3 0:0:3   FC   normal  normal
..
..
125 3:1:1   FC   failed  vacated,invalid_media,smart_threshold_exceeded,servicing

cli% servicemag status

Cage 3, magazine 1:
The magazine was successfully brought offline by a servicemag start command.
The command completed Thu Sep 28 13:27:54 2017.
servicemag start -pdid 125 -- Succeeded
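
The relocation estimate in the status output is the number of remaining chunklets multiplied by the reported seconds per chunklet: 662 x 2 = 1324 seconds, which is 22 minutes and 4 seconds. A minimal sketch of that arithmetic, assuming the status text uses the same wording as above (the function names are illustrative only):

```python
import re


def relocation_seconds(status_text):
    """Recompute the relocation estimate from `servicemag status` text.

    Returns None when the output carries no estimate yet, for example
    "Unable to provide a relocation estimate".
    """
    remaining = re.search(r"Chunklets remaining:\s*(\d+)", status_text)
    rate = re.search(r"based on (\d+) seconds per chunklet", status_text)
    if remaining is None or rate is None:
        return None
    return int(remaining.group(1)) * int(rate.group(1))


def split_duration(seconds):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


STATUS = """\
Chunklets relocated: 306 in 11 minutes and 41 seconds
Chunklets remaining: 662
Chunklets marked for moving: 662
Estimated time for relocation completion based on 2 seconds per chunklet is: 22 minutes and 4 seconds
"""

estimate = relocation_seconds(STATUS)
print(estimate, split_duration(estimate))
```

Because the per-chunklet rate is only an average of the work done so far, the estimate should be treated as a rough guide and can change between status checks.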



Once the magazine has been brought offline successfully, we can physically remove the failed disk and insert the new one. After making sure the new disk is plugged into the correct position, we bring the magazine back online:

cli% servicemag resume 3 1

Are you sure you want to run servicemag?
select q=quit y=yes n=no: y
servicemag resume 3 1
... mag 3 1 already onlooped
... firmware is current on pd WWN [XXXXXXXXXXXX] Id [126]
... firmware is current on pd WWN [XXXXXXXXXXXX] Id [127]
... firmware is current on pd WWN [XXXXXXXXXXXX] Id [206]
... firmware is current on pd WWN [XXXXXXXXXXXX]
... firmware is current on pd WWN [XXXXXXXXXXXX] Id [125]
... checking for valid disks...
... checking for valid disks...
...   disks in mag  : 3 1
...      normal disks:  WWN [XXXXXXXXXXXX] Id [122]  diskpos [1]
....................    WWN [XXXXXXXXXXXX] Id [126]  diskpos [2]
....................    WWN [XXXXXXXXXXXX] Id [127]  diskpos [3]
....................    WWN [XXXXXXXXXXXX] Id [206]  diskpos [0]
...  not normal disks:  WWN [XXXXXXXXXXXX] Id [125]
... verifying spare space for disks 126 and 126
... verifying spare space for disks 127 and 127
... verifying spare space for disks 206 and 206
... verifying spare space for disks 125 and 122
... playback chunklets from pd WWN [XXXXXXXXXXXX] Id [122]
... playback chunklets from pd WWN [XXXXXXXXXXXX] Id [126]
... playback chunklets from pd WWN [XXXXXXXXXXXX] Id [127]
... playback chunklets from pd WWN [XXXXXXXXXXXX] Id [206]

The servicemag resume operation will continue in the background.



Recovery can take several hours; in this case the estimate is more than nine hours:

cli% servicemag status

Cage 3, magazine 1:
The magazine is being brought online due to a servicemag resume.
The last status update was at Thu Sep 28 15:58:25 2017.
Chunklets relocated: 6 in 2 minutes and 29 seconds
Chunklets remaining: 1442
Chunklets marked for moving: 1442
Estimated time for relocation completion based on 24 seconds per chunklet is: 9 hours, 36 minutes and 48 seconds
servicemag resume 3 1 -- is in Progress
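
Since the operation runs in the background for hours, the status check can be repeated by a script instead of by hand. The sketch below is an assumption-heavy illustration: fetch_status stands for any callable that returns the text of servicemag status (how the CLI is reached, for example over SSH, is left out), and only the three phrasings seen in this post are recognized. Anything else, including failure messages whose wording is not shown here, is reported as "unknown".

```python
import time


def servicemag_state(status_text):
    """Classify `servicemag status` text using the phrases seen in this post."""
    if "No servicemag operations logged" in status_text:
        return "idle"
    if "-- is in Progress" in status_text:
        return "in_progress"
    if "-- Succeeded" in status_text:
        return "succeeded"
    return "unknown"


def wait_for_servicemag(fetch_status, interval_s=300, max_polls=200, sleep=time.sleep):
    """Poll until the operation is no longer in progress.

    fetch_status is a hypothetical callable returning the status text.
    Returns the final state, or "timeout" after max_polls checks.
    """
    for _ in range(max_polls):
        state = servicemag_state(fetch_status())
        if state != "in_progress":
            return state
        sleep(interval_s)
    return "timeout"


# Usage with canned outputs in place of a real CLI call.
canned = iter([
    "servicemag resume 3 1 -- is in Progress",
    "servicemag resume 3 1 -- is in Progress",
    "servicemag resume 3 1 -- Succeeded",
])
result = wait_for_servicemag(lambda: next(canned), sleep=lambda s: None)
print(result)
```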

