December 14, 2015

InfiniBand Diagnostic Data Collection

From Database Servers:
 # /opt/oracle.SupportTools/ibdiagtools/verify-topology  
 # ibqueryerrors.pl -rR -s PortRcvSwitchRelayErrors,PortXmitDiscards,PortXmitWait,VL15Dropped  
 # ibstat  
 # ibv_devinfo -v  
 # /opt/oracle.SupportTools/CheckSWProfile.sh -I <switch1>,<switch2>,<switch3>  
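
The database-server commands above can be captured into one directory per host so the outputs are easy to upload together. The following is a minimal sketch, not an Oracle-supplied tool: the `run_and_save` and `collect_db_server` helpers, the `OUTDIR` location, and the output file names are assumptions introduced here for illustration. The switch list is passed in as an argument and is left as a placeholder.

```shell
#!/bin/bash
# Hypothetical output directory (an assumption): one directory per host and run.
OUTDIR="${OUTDIR:-/tmp/ib-diag-$(hostname)-$(date +%Y%m%d-%H%M%S)}"

# Hypothetical helper: run one command and save stdout+stderr to $OUTDIR/<label>.out.
# A failing or missing command is recorded in the file instead of aborting the run.
run_and_save() {
    local label="$1" rc=0
    shift
    mkdir -p "$OUTDIR" || return 1
    {
        echo "### $*"
        "$@" || rc=$?
        echo "### exit status: $rc"
    } > "$OUTDIR/$label.out" 2>&1
}

# Runs the database-server commands listed above.
# $1 = comma-separated switch names, e.g. <switch1>,<switch2>,<switch3>
collect_db_server() {
    local switches="$1"
    run_and_save verify-topology /opt/oracle.SupportTools/ibdiagtools/verify-topology
    run_and_save ibqueryerrors ibqueryerrors.pl -rR -s PortRcvSwitchRelayErrors,PortXmitDiscards,PortXmitWait,VL15Dropped
    run_and_save ibstat ibstat
    run_and_save ibv_devinfo ibv_devinfo -v
    run_and_save CheckSWProfile /opt/oracle.SupportTools/CheckSWProfile.sh -I "$switches"
}
```

After sourcing the script as root on a database server, calling `collect_db_server <switch1>,<switch2>,<switch3>` writes one `.out` file per command under `$OUTDIR`. Recording the exit status in each file makes it clear when a tool was missing rather than silently producing no output.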


From InfiniBand Switches:
 -- On every InfiniBand switch in the network, collect the outputs of the following commands:  
 # nm2version  
 # version  
 # env_test  
 # listlinkup  
 # showunhealthy  
 # ibdiagnet -c 1000        (or: ibdiagnet -c 500)  
 -- Copy the following files from the InfiniBand switches:  
 /var/log/messages   
 /var/log/opensm.log   
 /var/log/opensm-subnet.lst   
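
Copying these three files from several switches is easy to script. The sketch below assumes root SSH access to each switch and that `scp` is available on the collecting host. The `fetch_switch_logs` function, the `SCP` override variable, and the destination layout are hypothetical names introduced here, not part of any Oracle tooling.

```shell
#!/bin/bash
# Copy command to use; can be overridden, e.g. SCP="echo scp" for a dry run.
SCP="${SCP:-scp}"

# Hypothetical helper: copy the three log files from one switch into <dest>/<switch>/.
# $1 = switch hostname, $2 = local destination directory
fetch_switch_logs() {
    local switch="$1" dest="$2/$1" f
    mkdir -p "$dest" || return 1
    for f in /var/log/messages /var/log/opensm.log /var/log/opensm-subnet.lst; do
        # -p preserves the files' modification times
        $SCP -p "root@$switch:$f" "$dest/" \
            || echo "WARN: could not copy $f from $switch" >&2
    done
}
```

For example, `for sw in <switch1> <switch2> <switch3>; do fetch_switch_logs "$sw" /tmp/ib-switch-logs; done` keeps each switch's files in its own subdirectory, so identically named logs do not overwrite one another.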
 -- Collect the outputs of the following commands on any leaf switch:  
 # ibswitches  
 # ibnetdiscover  
 # sminfo  
 # getmaster -l  
 -- Run the following commands on a leaf switch:  
 # ibqueryerrors.pl -rR -s RcvSwRelayErrors,XmtDiscards,XmtWait,VL15Dropped  
 # ibdiagnet -skip dup_guids -pm -P all=1  
             -- This command creates several files in the /tmp directory. Archive them for upload.  
             -- Example:  
                        # cd /tmp  
                        # tar cvf pre-clear-ibdiagnet.tar ibdiagnet*  
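
The archive step above can be wrapped in a small function so that an empty result is caught before anything is uploaded. This is only a sketch: the `archive_ibdiagnet` name and its arguments are hypothetical, and it assumes the ibdiagnet output files are in /tmp unless another directory is given.

```shell
#!/bin/bash
# Hypothetical helper: bundle ibdiagnet* files into <dir>/<label>-ibdiagnet.tar.
# $1 = label such as pre-clear, $2 = directory holding the files (default /tmp)
archive_ibdiagnet() {
    local label="$1" dir="${2:-/tmp}"
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$dir" || exit 1
        # Stop with an error if ibdiagnet has not produced any files here.
        ls ibdiagnet* >/dev/null 2>&1 || {
            echo "no ibdiagnet* files found in $dir" >&2
            exit 1
        }
        tar cf "$label-ibdiagnet.tar" ibdiagnet*
    )
}
```

Calling `archive_ibdiagnet pre-clear` produces the same `/tmp/pre-clear-ibdiagnet.tar` as the manual example, but returns a non-zero status if no ibdiagnet files are present.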
 -- Collect the IB ILOM snapshot:  
 - Open the web ILOM interface at: http://<hostname of switch>  
 - Go to the 'Maintenance' tab  
 - Go to the 'Snapshot' tab  
 - Select Data Set = Normal and choose the preferred Transfer Method  
 - Click 'Run'  
