Postings may contain unverified user-created content and change frequently.
The content is provided as-is and is not warrantied by Cisco.

UCS Failure Scenarios Testing using CLI

Introduction
Testing Fabric Interconnect failure
Testing UCS Blade failure
Testing Server Port failure
Testing Uplink Port failure
Related Information
Introduction
This document describes several UCS failure scenarios and how to test each one. The tests verify that the UCS system behaves correctly when a component fails.

A UCS system consists of one or two UCS 6100 series fabric interconnects. A system with two fabric interconnects is typically deployed as an active-active pair that supports failover. The UCS 2100 fabric extender lets the UCS 6100 fabric interconnect provide all access-layer switching for the connected servers: the fabric extender performs no switching itself, and the fabric interconnect switches all traffic to its destination.

This document shows how to test the following failure scenarios using the UCS Manager CLI:
Fabric Interconnect failure
UCS Blade failure
Server Port failure
Uplink Port failure
Testing Fabric Interconnect failure
Reboot one fabric interconnect, preferably the subordinate. Expected result: on the primary, the "show cluster state" command reports the primary fabric interconnect as "UP" and the subordinate as "DOWN", and the cluster as "HA NOT READY".
Sample Test
On Fabric Interconnect A (Primary):
UCS1-FA-A# show cluster state
Cluster Id: 0x603bca7e0b7311e1-0x9987547fee1fd575
A: UP, PRIMARY
B: UP, SUBORDINATE
HA READY
On Fabric Interconnect B (Subordinate):
UCS1-FA-B(local-mgmt)# reboot
The switch will be rebooted. Are you sure? (yes/no):yes
Read from remote host 10.134.166.89: Connection reset by peer
Connection to 10.134.166.89 closed.
Verify on the Primary Fabric Interconnect:
UCS1-FA-A# show cluster state
Cluster Id: 0x603bca7e0b7311e1-0x9987547fee1fd575
A: UP, PRIMARY
B: DOWN, INAPPLICABLE
HA NOT READY
Peer Fabric Interconnect is down
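The check above can be automated once the CLI text has been captured by some other means. The following Python sketch parses "show cluster state" output and tests for the expected post-reboot condition. The function names parse_cluster_state and failover_test_passed are hypothetical and not part of any Cisco tool, and the sketch assumes the output keeps the line layout shown in the sample.

```python
import re

def parse_cluster_state(output):
    """Parse captured `show cluster state` text into a small dict."""
    state = {"members": {}, "ha_ready": None}
    for raw in output.splitlines():
        line = raw.strip()
        # Member lines look like "A: UP, PRIMARY" or "B: DOWN, INAPPLICABLE".
        match = re.match(r"^([AB]):\s*(\w+),\s*(\w+)$", line)
        if match:
            state["members"][match.group(1)] = (match.group(2), match.group(3))
        elif line == "HA READY":
            state["ha_ready"] = True
        elif line == "HA NOT READY":
            state["ha_ready"] = False
    return state

def failover_test_passed(state, rebooted="B"):
    """True if the surviving FI is UP/PRIMARY, the rebooted FI is DOWN, and HA is not ready."""
    survivor = "A" if rebooted == "B" else "B"
    return (
        state["members"].get(survivor) == ("UP", "PRIMARY")
        and state["members"].get(rebooted, ("", ""))[0] == "DOWN"
        and state["ha_ready"] is False
    )

# Text copied from the sample outputs in this document.
BEFORE_REBOOT = """\
Cluster Id: 0x603bca7e0b7311e1-0x9987547fee1fd575
A: UP, PRIMARY
B: UP, SUBORDINATE
HA READY
"""

AFTER_REBOOT = """\
Cluster Id: 0x603bca7e0b7311e1-0x9987547fee1fd575
A: UP, PRIMARY
B: DOWN, INAPPLICABLE
HA NOT READY
Peer Fabric Interconnect is down
"""

if __name__ == "__main__":
    print(failover_test_passed(parse_cluster_state(AFTER_REBOOT)))
```

Matching whole lines rather than substrings keeps "HA NOT READY" from being mistaken for "HA READY".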
Testing UCS Blade failure
Reboot a UCS blade server. Expected result: while the blade is rebooting, the Overall Status column of the "show server status" command shows "Power Off" for that blade.
Sample Test
[UCS1-80 ~]$ reboot
Verify on the Primary Fabric Interconnect:
UCS1-FA-A# show server status
Server  Slot Status                       Availability Overall Status        Discovery
------- --------------------------------- ------------ --------------------- ---------
<--output omitted-->
10/1    Equipped                          Unavailable  Ok                    Complete
10/2    Equipped                          Unavailable  Ok                    Complete
10/3    Equipped                          Unavailable  Ok                    Complete
10/4    Equipped                          Unavailable  Ok                    Complete
10/5    Equipped                          Unavailable  Ok                    Complete
10/6    Equipped                          Unavailable  Ok                    Complete
10/7    Equipped                          Unavailable  Ok                    Complete
10/8    Equipped                          Unavailable  Power Off             Complete
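To pick the rebooting blade out of a long listing, the captured "show server status" text can be filtered for rows whose Overall Status is not "Ok". The Python sketch below is illustrative only: parse_server_status is a hypothetical helper, and it assumes, based solely on the sample above, that the Slot Status, Availability and Discovery values are single words, so that everything between them is the Overall Status.

```python
def parse_server_status(output):
    """Return one dict per server row of captured `show server status` text."""
    rows = []
    for raw in output.splitlines():
        parts = raw.split()
        # Data rows start with a chassis/slot id such as "10/8".
        if len(parts) < 5 or "/" not in parts[0]:
            continue
        chassis, _, slot = parts[0].partition("/")
        if not (chassis.isdigit() and slot.isdigit()):
            continue
        rows.append({
            "server": parts[0],
            "slot_status": parts[1],
            "availability": parts[2],
            # Overall Status may span several words, e.g. "Power Off".
            "overall_status": " ".join(parts[3:-1]),
            "discovery": parts[-1],
        })
    return rows

# Rows copied from the sample output in this document.
SAMPLE = """\
Server  Slot Status  Availability Overall Status Discovery
------- ------------ ------------ -------------- ---------
<--output omitted-->
10/7    Equipped     Unavailable  Ok             Complete
10/8    Equipped     Unavailable  Power Off      Complete
"""

not_ok = [r for r in parse_server_status(SAMPLE) if r["overall_status"] != "Ok"]

if __name__ == "__main__":
    for row in not_ok:
        print(row["server"], row["overall_status"])
```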
Testing Server Port failure
Disable a server port (interface), e.g. server port 1/9 on fabric interconnect B. Expected result: the "show interface" command under the eth-server scope for that fabric shows the port's Admin State as "Disabled" and its Oper State as "Failed".
Sample Test
UCS1-FA-A# scope eth-server
UCS1-FA-A /eth-server # scope fabric b
UCS1-FA-A /eth-server/fabric # show interface
Interface:
Slot Id    Port Id    Admin State Oper State       Lic State            Grace Prd       State Reason
---------- ---------- ----------- ---------------- -------------------- --------------- ------------
1          1          Enabled     Up               License Ok           0
1          10         Enabled     Up               License Ok           0
1          11         Enabled     Up               License Ok           0
<--output omitted-->
1          6          Enabled     Up               License Ok           0
1          7          Enabled     Up               License Ok           0
1          8          Enabled     Up               License Ok           0
1          9          Enabled     Up               License Ok           0
To Verify:
UCS1-FA-A /eth-server/fabric # show interface
Interface:
Slot Id    Port Id    Admin State Oper State       Lic State            Grace Prd       State Reason
---------- ---------- ----------- ---------------- -------------------- --------------- ------------
1          1          Enabled     Up               License Ok           0
1          10         Enabled     Up               License Ok           0
1          11         Enabled     Up               License Ok           0
<--output omitted-->
1          6          Enabled     Up               License Ok           0
1          7          Enabled     Up               License Ok           0
1          8          Enabled     Up               License Ok           0
1          9          Disabled    Failed           Unknown              0               Admin config change
UCS1-FA-A /eth-server # exit
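Several values in the "show interface" table contain spaces ("License Ok", "Admin config change"), so splitting rows on whitespace is unreliable. One alternative is to take column positions from the dashed ruler line. The Python sketch below shows that approach; parse_fixed_width is a hypothetical helper, and it assumes the captured text keeps the fixed-width column alignment the CLI prints on screen.

```python
import re

def parse_fixed_width(table):
    """Parse a CLI table whose columns are marked by a ruler line of dashes."""
    lines = [ln.rstrip() for ln in table.splitlines() if ln.strip()]
    # The ruler is the first line made up only of dashes and spaces.
    ruler_idx = next(i for i, ln in enumerate(lines) if set(ln.strip()) <= {"-", " "})
    starts = [m.start() for m in re.finditer(r"-+", lines[ruler_idx])]
    # Each column runs from its start to the next start; the last runs to end of line.
    bounds = list(zip(starts, starts[1:] + [None]))
    names = [lines[ruler_idx - 1][a:b].strip() for a, b in bounds]
    rows = []
    for ln in lines[ruler_idx + 1:]:
        if ln.lstrip().startswith("<--"):  # skip "<--output omitted-->" markers
            continue
        rows.append({name: ln[a:b].strip() for name, (a, b) in zip(names, bounds)})
    return rows

# Rows copied from the verification output in this document.
SAMPLE = """\
Slot Id    Port Id    Admin State Oper State       Lic State            Grace Prd       State Reason
---------- ---------- ----------- ---------------- -------------------- --------------- ------------
1          8          Enabled     Up               License Ok           0
1          9          Disabled    Failed           Unknown              0               Admin config change
"""

rows = parse_fixed_width(SAMPLE)
port_9 = next(r for r in rows if (r["Slot Id"], r["Port Id"]) == ("1", "9"))
server_port_test_passed = (
    port_9["Admin State"] == "Disabled" and port_9["Oper State"] == "Failed"
)

if __name__ == "__main__":
    print(server_port_test_passed)
```

Letting the last column run to the end of the line keeps long State Reason values intact even when they are wider than the ruler.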
Testing Uplink Port failure
Disable an uplink port, e.g. uplink port 1/39 on fabric B. Expected result: the "show interface" command under the eth-uplink scope for that fabric shows the port's Admin State as "Disabled" and its Oper State as "Admin Down". In the sample output, port 1/39 also drops out of the Member Port list for port-channel 1.
Sample Test
UCS1-FA-A# scope eth-uplink
UCS1-FA-A /eth-uplink # scope fabric b
UCS1-FA-A /eth-uplink/fabric # show interface
Interface:
Slot Id    Port Id    Admin State Oper State       Lic State            Grace Period    State Reason
---------- ---------- ----------- ---------------- -------------------- --------------- ------------
1          36         Disabled    Admin Down       Unknown              0               Administratively down
1          40         Disabled    Admin Down       Unknown              0               Administratively down
Member Port:
Port-channel Slot  Port  Oper State      State Reason                        Lic State            Grace Period
------------ ----- ----- --------------- ----------------------------------- -------------------- ------------
1            1     35    Up                                                  License Ok           0
1            1     39    Up                                                  License Ok           0
2            1     33    Sfp Not Present Unknown                             Unknown              0
2            1     37    Sfp Not Present Unknown                             Unknown              0
To verify:
UCS1-FA-A /eth-uplink/fabric # show interface
Interface:
Slot Id    Port Id    Admin State Oper State       Lic State            Grace Period    State Reason
---------- ---------- ----------- ---------------- -------------------- --------------- ------------
1          36         Disabled    Admin Down       Unknown              0               Administratively down
1          39         Disabled    Admin Down       Unknown              0               Administratively down
1          40         Disabled    Admin Down       Unknown              0               Administratively down
Member Port:
Port-channel Slot  Port  Oper State      State Reason                        Lic State            Grace Period
------------ ----- ----- --------------- ----------------------------------- -------------------- ------------
1            1     35    Up                                                  License Ok           0
2            1     33    Sfp Not Present Unknown                             Unknown              0
2            1     37    Sfp Not Present Unknown                             Unknown              0
UCS1-FA-A /eth-uplink/fabric # exit
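Comparing the Member Port section before and after the change shows which port left its port channel. The Python sketch below does this with a set difference. The helper name member_ports is hypothetical, and the sketch relies only on the first three fields of each member row (port-channel, slot, port) being integers, as in the sample output.

```python
def member_ports(output):
    """Collect (port_channel, slot, port) tuples listed under "Member Port:"."""
    members = set()
    in_members = False
    for raw in output.splitlines():
        line = raw.strip()
        if line == "Member Port:":
            in_members = True
            continue
        if not in_members:
            continue  # ignore the Interface table that precedes the member list
        parts = line.split()
        if len(parts) >= 3 and all(p.isdigit() for p in parts[:3]):
            members.add((int(parts[0]), int(parts[1]), int(parts[2])))
    return members

# Member rows copied from the sample outputs in this document.
BEFORE = """\
Member Port:
1 1 35 Up License Ok 0
1 1 39 Up License Ok 0
2 1 33 Sfp Not Present Unknown Unknown 0
2 1 37 Sfp Not Present Unknown Unknown 0
"""

AFTER = """\
Member Port:
1 1 35 Up License Ok 0
2 1 33 Sfp Not Present Unknown Unknown 0
2 1 37 Sfp Not Present Unknown Unknown 0
"""

# Ports that were members before the change but are no longer listed.
removed = member_ports(BEFORE) - member_ports(AFTER)

if __name__ == "__main__":
    print(sorted(removed))
```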
Related Information
Understanding Fabric Failure and Failover in UCS
How to recover from a software failure on the 6120 Fabric Interconnect
Procedure to Gracefully Shutdown and Powerup UCS system