A disk can be local, provided from a SAN, or provided from a VIO server.
1) Determining the problem
Look at the errpt output. You may see something like this:
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
B6267342 0424100608 P H hdisk10 DISK OPERATION ERROR
B6267342 0424100608 P H hdisk10 DISK OPERATION ERROR
C62E1EB7 0424100608 P H hdisk10 DISK OPERATION ERROR
B6267342 0424100608 P H hdisk10 DISK OPERATION ERROR
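When the listing is long, it can help to count how often each identifier hits each resource before opening individual entries. The following is a minimal sketch, not part of the original procedure: the function name errpt_summary is made up for illustration, and it assumes the six-column summary layout shown above.

```shell
# Count errpt summary lines per (identifier, resource) pair.
# Reads the errpt summary listing on stdin; the header line is skipped.
# Output: "<count> <identifier> <resource>", most frequent first.
errpt_summary() {
    awk '$1 != "IDENTIFIER" && NF >= 5 { n[$1 " " $5]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}

# Hypothetical usage on an AIX host:
#   errpt | errpt_summary
```

The identifier with the highest count is usually a reasonable first candidate to open in detail.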
Step 1: Open the error with the command errpt -aj <identifier>
Tip: Sometimes the earliest errors give you more information. errpt lists the newest entries first, so the oldest ones are at the end of the output, for example:
errpt -aj IDENTIFIER | tail -200
You can get output like the following:
errpt -aj B6267342
LABEL: EMCP_PATH_ALIVE
IDENTIFIER: C6E26F3B
Date/Time: Fri Apr 25 08:15:38 TAIST 2008
Sequence Number: 46034
Machine Id: 000A2F72D600
Node Id: MAXX1
Class: H
Type: INFO
Resource Name: hdisk7
Resource Class: disk
Resource Type: CLAR_FC_raid5
Location: U787B.001.DNW8357-P1-C3-T1-W5006016130202370-L1000000000000
VPD:
Manufacturer................DGC
Machine Type and Model......RAID 5
ROS Level and ID............0324
Serial Number...............CK200042500171
Device Specific.(SI)........CX3-2
Device Specific.(PQ)........00
Device Specific.(VS)........040000DA54CL
Device Specific.(UI)........6006016021701100948C45D4F1A9DC11
Device Specific.(FL)........0004
Device Specific.(Z0)........10
Device Specific.(Z1)........10
Description
BACK-UP PATH STATUS CHANGE
Probable Causes
DISK
SCSI ADAPTER
SCSI CABLE
Failure Causes
DISK
SCSI ADAPTER
CABLE LOOSE OR DEFECTIVE
Recommended Actions
PERFORM PROBLEM DETERMINATION ON SCSI TARGET DEVICE
PERFORM PROBLEM DETERMINATION ON HOST SCSI ADAPTER
REPLACE SCSI CABLE
Detail Data
RESOURCE NAME
---------------------------------------------------------------------------
LABEL: SC_DISK_ERR2
Run lspv on the PV to see whether it is available:
lspv <pvname>
Check the output for any stale partitions. If the disk is not being accessed, try to identify the disk using diag and reseat it. List all the LVs residing on the disk using
lspv -l <pvname>
and take a backup of those logical volumes.
If the error repeats, log a call with IBM to replace the hardware.
Run the lquerypv command to read from the disk.
The result should not be all zeros.
# lquerypv -h /dev/hdisk3
00000000 C9C2D4C1 00000000 00000000 00000000 |................|
00000010 00000000 00000000 00000000 00000000 |................|
00000020 00000000 00000000 00000000 00000000 |................|
00000030 00000000 00000000 00000000 00000000 |................|
00000040 00000000 00000000 00000000 00000000 |................|
00000050 00000000 00000000 00000000 00000000 |................|
00000060 00000000 00000000 00000000 00000000 |................|
00000070 00000000 00000000 00000000 00000000 |................|
00000080 00CF4DCC 463300A2 00000000 00000000 |..M.F3..........|
00000090 00000000 00000000 00000000 00000000 |................|
000000A0 00000000 00000000 00000000 00000000 |................|
000000B0 00000000 00000000 00000000 00000000 |................|
000000C0 00000000 00000000 00000000 00000000 |................|
000000D0 00000000 00000000 00000000 00000000 |................|
000000E0 00000000 00000000 00000000 00000000 |................|
000000F0 00000000 00000000 00000000 00000000 |................|
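The "not all zero" check can be scripted rather than read by eye. This is a sketch only: dump_has_data is a hypothetical helper name, and it assumes the dump layout shown above (an offset, four hex words, then the ASCII column).

```shell
# Return 0 if the hex dump on stdin contains at least one non-zero data word,
# 1 if every data word is zero (or if there is no input at all).
# Fields 2..5 are the four hex words; field 1 is the offset and is ignored.
dump_has_data() {
    awk '{ for (i = 2; i <= 5; i++) if ($i ~ /[1-9A-Fa-f]/) found = 1 }
         END { exit(found ? 0 : 1) }'
}

# Hypothetical usage on an AIX host:
#   lquerypv -h /dev/hdisk3 | dump_has_data && echo "disk is readable"
```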
REPLACING A DISK
Identify the disk with diag => identify hot plug devices.
Remove the disk definition from the system:
rmdev -Rdl hdisk#
Then remove the disk physically.
Step 1: If the disk is not local and is provided by SAN storage
[/home/stiwari]-> lsdev -Cc disk
hdisk0 Available 04-08-00-3,0 16 Bit LVD SCSI Disk Drive
hdisk1 Available 04-08-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk168 Available 07-08-02 EMC CLARiiON FCP RAID 5 Disk
hdisk169 Available 07-08-02 EMC CLARiiON FCP RAID 5 Disk
hdiskpower23 Available 07-08-02 PowerPath Device
hdiskpower24 Available 07-08-02 PowerPath Device
hdiskpower25 Available 07-08-02 PowerPath Device
hdiskpower26 Available 07-08-02 PowerPath Device
hdiskpower27 Available 07-08-02 PowerPath Device
hdiskpower28 Available 07-08-02 PowerPath Device
Note the difference between the local disks (LVD SCSI Disk Drive) and the SAN disks (EMC CLARiiON FCP RAID 5 Disk or PowerPath Device).
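The distinction can be made mechanically from the description column. The sketch below is an illustration under stated assumptions: classify_disks is a hypothetical name, and the patterns are taken only from the device descriptions that appear in this article (other storage vendors or drivers will need their own patterns).

```shell
# Label each disk in "lsdev -Cc disk" output (read from stdin) as
# vio, san, local or unknown, based on its description text.
# Assumption: only the description strings seen in this article are known.
classify_disks() {
    awk '{
        if ($0 ~ /Virtual SCSI/)         kind = "vio"
        else if ($0 ~ /FCP|PowerPath/)   kind = "san"
        else if ($0 ~ /SCSI Disk Drive/) kind = "local"
        else                             kind = "unknown"
        print $1, kind
    }'
}

# Hypothetical usage on an AIX host:
#   lsdev -Cc disk | classify_disks
```

The "Virtual SCSI" test comes first because that description also contains "SCSI Disk Drive".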
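Step 2: Check for the parent device of the disk. The parent of a SAN disk is an fscsi device (lsdev -l hdisk# -F parent), and the parent of that fscsi device is the fcs adapter: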
stiwari@machine1[/home/stiwari]-> lsdev -l fscsi1 -F parent
fcs1
Step 3: Check the statistics of the parent Fibre Channel adapter
stiwari@machine1[/home/stiwari]-> fcstat fcs1
FIBRE CHANNEL STATISTICS REPORT: fcs1
Device Type: FC Adapter (df1000fa)
Serial Number: 1F5060BE0E
Option ROM Version: 02881955
Firmware Version: T1D1.91A5
World Wide Node Name: 0x20000000C9449604
World Wide Port Name: 0x10000000C9449604
FC-4 TYPES:
Supported: 0x0000012000000000000000000000000000000000000000000000000000000000
Active: 0x0000010000000000000000000000000000000000000000000000000000000000
Class of Service: 3
Port Speed (supported): 2 GBIT
Port Speed (running): 2 GBIT
Port FC ID: 0x611413
Port Type: Fabric
Seconds Since Last Reset: 50102731
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 4294967295 4294967295
Words: 1099511627520 1099511627520
LIP Count: 0
NOS Count: 0
Error Frames: 0
Dumped Frames: 0
Link Failure Count: 0
Loss of Sync Count: 5
Loss of Signal: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 32729
Invalid CRC Count: 0
IP over FC Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 0
FC SCSI Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 2006
No Command Resource Count: 36900676
IP over FC Traffic Statistics
Input Requests: 0
Output Requests: 0
Control Requests: 0
Input Bytes: 0
Output Bytes: 0
FC SCSI Traffic Statistics
Input Requests: 493419410
Output Requests: 1139661947
Control Requests: 3346776
Input Bytes: 66221740349270
Output Bytes: 80335694275489
Check the transmitted and received data by running the following command several times:
stiwari@machine1[/home/stiwari]-> fcstat fcs1|awk '/Transmit/,/Words/'
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 4294967295 4294967295
Words: 1099511627520 1099511627520
stiwari@machine1[/home/stiwari]-> fcstat fcs1|awk '/Transmit/,/Words/'
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 4294967295 4294967295
Words: 1099511627520 1099511627520
stiwari@machine1[/home/stiwari]-> fcstat fcs0|awk '/Transmit/,/Words/'
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 4294967295 4294967295
Words: 1099511627520 1099511627520
stiwari@machine1[/home/stiwari]-> fcstat fcs1|awk '/Transmit/,/Words/'
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 4294967295 4294967295
Words: 1099511627520 1099511627520
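If the fcs adapter is handling normal data transfer, the transmit and receive counters should change between runs.
If they do not change, there may be a connectivity issue. Note that a Frames value of 4294967295 (2^32 - 1), as in the sample output above, suggests the counter has reached its maximum and will no longer change; in that case compare the Input Requests and Output Requests under FC SCSI Traffic Statistics instead.
If everything looks normal up to the fcs adapter, the problem is probably on the storage side.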
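Comparing two runs can be automated by saving each report to a file. This is a minimal sketch assuming the fcstat layout shown above; fc_counters and fc_traffic_state are hypothetical helper names, and the file paths in the usage comment are placeholders.

```shell
# Print only the Frames and Words lines of an fcstat report read from stdin.
fc_counters() {
    awk '$1 == "Frames:" || $1 == "Words:"'
}

# Compare two saved fcstat reports (file paths given as arguments).
# Prints "changing" if the Frames/Words counters differ, "static" otherwise.
fc_traffic_state() {
    if [ "$(fc_counters < "$1")" != "$(fc_counters < "$2")" ]; then
        echo changing
    else
        echo static
    fi
}

# Hypothetical usage on an AIX host:
#   fcstat fcs1 > /tmp/fc.1; sleep 10; fcstat fcs1 > /tmp/fc.2
#   fc_traffic_state /tmp/fc.1 /tmp/fc.2
```

A "static" result only means the two snapshots are identical; it still has to be interpreted together with the path and error-log checks.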
Step 2: If the disk is not local and is provided by a VIO server
On the client side, check whether the disk paths are Enabled or Failed:
# lspath
Enabled hdisk1 vscsi2
Enabled hdisk0 vscsi1
Enabled hdisk2 vscsi1
Enabled hdisk2 vscsi2
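On a client with many disks, the paths that need attention can be filtered out of the listing. A small sketch, assuming the three-column lspath layout shown above (status, disk, parent adapter); failed_paths is a hypothetical name.

```shell
# Print every path whose status is not "Enabled".
# Reads lspath output on stdin; output is "<disk> <adapter> <status>".
# No output means all paths are Enabled.
failed_paths() {
    awk 'NF >= 3 && $1 != "Enabled" { print $2, $3, $1 }'
}

# Hypothetical usage on the client LPAR:
#   lspath | failed_paths
```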
Check the disk, its LUN and its adapter ID:
# lspv
hdisk0 00ccffe222e4bec9 rootvg active
hdisk1 00ccffe24650d7a3 altinst_rootvg
hdisk2 00ccffe2152cb2e6 nimvg active
# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available Virtual SCSI Disk Drive
hdisk2 Available Virtual SCSI Disk Drive
# lscfg -vpl hdisk0
hdisk0 U8234.EMA.10CFFE2-V3-C200-T1-L8100000000000000 Virtual SCSI Disk Drive
PLATFORM SPECIFIC
Name: disk
Node: disk
Device Type: block
Check for the parent adapter
# lsdev -l hdisk0 -F parent
vscsi1
Check for card vital product information
# lscfg -vpl vscsi1
vscsi1 U8234.EMA.10CFFE2-V3-C200-T1 Virtual SCSI Client Adapter
Hardware Location Code......U8234.EMA.10CFFE2-V3-C200-T1
PLATFORM SPECIFIC
Name: v-scsi
Node: v-scsi@300000c8
Device Type: vscsi
Physical Location: U8234.EMA.10CFFE2-V3-C200-T1
Go to the LPAR properties to get the mapping of the virtual SCSI adapters.
On the VIO server, list all the vhost devices; this also shows the adapter slot (C###) and the client partition ID:
$ lsmap -all |grep vhost
vhost0 U8234.EMA.10CFFE2-V1-C11 0x00000003
vhost1 U8234.EMA.10CFFE2-V1-C200 0x00000003
vhost2 U8234.EMA.10CFFE2-V1-C240 0x00000004
vhost3 U8234.EMA.10CFFE2-V1-C260 0x00000005
vhost4 U8234.EMA.10CFFE2-V1-C280 0x00000006
vhost5 U8234.EMA.10CFFE2-V1-C300 0x00000007
vhost6 U8234.EMA.10CFFE2-V1-C320 0x00000008
vhost7 U8234.EMA.10CFFE2-V1-C340 0x00000009
vhost8 U8234.EMA.10CFFE2-V1-C360 0x0000000a
vhost9 U8234.EMA.10CFFE2-V1-C380 0x0000000b
vhost10 U8234.EMA.10CFFE2-V1-C400 0x0000000c
vhost11 U8234.EMA.10CFFE2-V1-C420 0x0000000d
vhost12 U8234.EMA.10CFFE2-V1-C450 0x00000010
vhost13 U8234.EMA.10CFFE2-V1-C460 0x00000000
vhost14 U8234.EMA.10CFFE2-V1-C430 0x0000000e
vhost15 U8234.EMA.10CFFE2-V1-C440 0x0000000f
vhost16 U8234.EMA.10CFFE2-V1-C470 0x00000012
vhost17 U8234.EMA.10CFFE2-V1-C410 0x00000013
vhost18 U8234.EMA.10CFFE2-V1-C480 0x00000014
vhost19 U8234.EMA.10CFFE2-V1-C490 0x00000015
Check the related vhost adapter for its underlying disks, and check that the LUN ID matches the LUN of the disk on the client:
$ lsmap -vadapter vhost0
SVSA Physloc Client Partition ID
--------------- -------------------------------------------- ------------------
vhost0 U8234.EMA.10CFFE2-V1-C11 0x00000003
VTD vcd0
Status Available
LUN 0x8100000000000000
Backing device cd0
Physloc U789D.001.DQD21MZ-P4-D1
$ lsmap -vadapter vhost1
SVSA Physloc Client Partition ID
--------------- -------------------------------------------- ------------------
vhost1 U8234.EMA.10CFFE2-V1-C200 0x00000003
VTD nim_hdisk81
Status Available
LUN 0x8300000000000000
Backing device hdisk81
Physloc U789D.001.DQD21MZ-P1-C2-T1-W5005076801105908-L4B000000000000
VTD vrootvg_nim
Status Available
LUN 0x8100000000000000
Backing device rootvgnim
Physloc
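The LUN match can be sketched as a lookup: take the slot and LUN from the client disk's location code, then search the lsmap listing for that pair. Both function names below are hypothetical, and the sketch assumes the output layouts shown in this article. The slot passed to find_backing must be the server-side slot taken from the LPAR properties; in this example the client and server slots both happen to be C200.

```shell
# Split a virtual disk location code into "<slot> <lun>", e.g.
# U8234.EMA.10CFFE2-V3-C200-T1-L8100000000000000 -> C200 0x8100000000000000
loc_slot_lun() {
    echo "$1" | sed -n \
      's/.*-\(C[0-9][0-9]*\)-T[0-9][0-9]*-L\([0-9A-Fa-f][0-9A-Fa-f]*\)$/\1 0x\2/p'
}

# Search lsmap output (stdin) for the given server slot and LUN.
# Prints "<vhost> <VTD> <backing device>" for each match.
find_backing() {
    awk -v slot="$1" -v lun="$2" '
        $1 ~ /^vhost/   { hit = ($2 ~ ("-" slot "$")); vhost = $1 }
        $1 == "VTD"     { vtd = $2 }
        $1 == "LUN"     { same_lun = (tolower($2) == tolower(lun)) }
        $1 == "Backing" { if (hit && same_lun) print vhost, vtd, $3 }
    '
}

# Hypothetical usage on the VIO server (from a shell where lsmap is available):
#   lsmap -all | find_backing C200 0x8100000000000000
```

With the sample output above, slot C200 and LUN 0x8100000000000000 resolve to vhost1, VTD vrootvg_nim, backing device rootvgnim.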
Check the backing disk and its parent, then check the parent fcs adapter in the same way as for a SAN disk:
$ lsdev -dev hdisk81 -field parent
parent
fscsi0
$ lsdev -dev fscsi0 -field parent
parent
fcs0
$ fcstat fcs0
FIBRE CHANNEL STATISTICS REPORT: fcs0
Device Type: 4Gb FC PCI Express Adapter (df1000fe) (adapter/pciex/df1000fe)
Serial Number: 1B84204D7A
Option ROM Version: 02E82752
Firmware Version: Z1F2.70A5
World Wide Node Name: 0x20000000C980F9B8
World Wide Port Name: 0x10000000C980F9B8
FC-4 TYPES:
Supported: 0x0000012000000000000000000000000000000000000000000000000000000000
Active: 0x0000010000000000000000000000000000000000000000000000000000000000
Class of Service: 3
Port Speed (supported): 4 GBIT
Port Speed (running): 4 GBIT
Port FC ID: 0x0a0007
Port Type: Fabric
Seconds Since Last Reset: 2428460
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 2816343231 4294967295
Words: 1099511627520 1099511627520
LIP Count: 0
NOS Count: 0
Error Frames: 0
Dumped Frames: 0
Link Failure Count: 1
Loss of Sync Count: 1
Loss of Signal: 2
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 0
Invalid CRC Count: 0
IP over FC Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 0
FC SCSI Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 0
No Command Resource Count: 0
IP over FC Traffic Statistics
Input Requests: 0
Output Requests: 0
Control Requests: 0
Input Bytes: 0
Output Bytes: 0
FC SCSI Traffic Statistics
Input Requests: 261680640
Output Requests: 181691062
Control Requests: 51720811
Input Bytes: 11908976819344