bityard Blog

// Display the AIX devices in a tree format

In AIX there is the proctree (or ps -fT 0) command to display the currently running processes in a tree format. This is very helpful when one is primarily interested in the parent and child relationship between the individual processes. Unfortunately, a similar command for the parent and child relationship of devices is still missing from stock AIX. There are already several script implementations out on the net to fill that particular gap. As a scripting exercise I wanted to do my own version of a devtree command. It was included in the aaa_base RPM package. The output for e.g. an LPAR with virtual ethernet and virtual SCSI devices looks like this:

|
|-- inet0
|   |-- en1
|   |-- et1
|   |-- lo0
|-- iocp0
|-- lvdd
|-- pty0
|-- rootvg
|   |-- hd1
|   |-- hd2
|   |-- hd3
|   |-- hd4
|   |-- hd5
|   |-- hd6
|   |-- hd8
|   |-- hd10opt
|   |-- hd9var
|   |-- lv_srv
|-- sfw0
|-- sys0
|   |-- sysplanar0
|   |   |-- L2cache0
|   |   |-- mem0
|   |   |-- pci0
|   |   |   |-- pci4
|   |   |   |-- pci5
|   |   |   |-- pci6
|   |   |-- pci1
|   |   |   |-- pci7
|   |   |   |-- pci8
|   |   |-- pci2
|   |   |   |-- pci9
|   |   |   |-- pci10
|   |   |   |-- pci11
|   |   |-- pci3
|   |   |   |-- pci12
|   |   |   |-- pci13
|   |   |-- pci14
|   |   |-- proc0
|   |   |-- proc4
|   |   |-- vio0
|   |   |   |-- ent1
|   |   |   |-- vsa0
|   |   |   |   |-- vty0
|   |   |   |-- vscsi0
|   |   |   |   |-- hdisk0
|   |   |   |-- vscsi1
|
`-- End of the device tree

in the regular mode, or like this:

|
|-- inet0                                         Available                   Internet Network Extension
|   |-- en1                                       Available                   Standard Ethernet Network Interface
|   |-- et1                                         Defined                   IEEE 802.3 Ethernet Network Interface
|   |-- lo0                                       Available                   Loopback Network Interface
|-- iocp0                                           Defined                   I/O Completion Ports
|-- lvdd                                          Available                   LVM Device Driver
|-- pty0                                          Available                   Asynchronous Pseudo-Terminal
|-- rootvg                                          Defined                   Volume group
|   |-- hd1                                         Defined                   Logical volume
|   |-- hd2                                         Defined                   Logical volume
|   |-- hd3                                         Defined                   Logical volume
|   |-- hd4                                         Defined                   Logical volume
|   |-- hd5                                         Defined                   Logical volume
|   |-- hd6                                         Defined                   Logical volume
|   |-- hd8                                         Defined                   Logical volume
|   |-- hd10opt                                     Defined                   Logical volume
|   |-- hd9var                                      Defined                   Logical volume
|   |-- lv_srv                                      Defined                   Logical volume
|-- sfw0                                          Available                   Storage Framework Module
|-- sys0                                          Available                   System Object
|   |-- sysplanar0                                Available                   System Planar
|   |   |-- L2cache0                              Available                   L2 Cache
|   |   |-- mem0                                  Available                   Memory
|   |   |-- pci0                                    Defined                   PCI Bus
|   |   |   |-- pci4                                Defined            00-10  PCI Bus
|   |   |   |-- pci5                                Defined            00-12  PCI Bus
|   |   |   |-- pci6                                Defined            00-16  PCI Bus
|   |   |-- pci1                                    Defined                   PCI Bus
|   |   |   |-- pci7                                Defined            02-10  PCI Bus
|   |   |   |-- pci8                                Defined            02-12  PCI Bus
|   |   |-- pci2                                    Defined                   PCI Bus
|   |   |   |-- pci9                                Defined            03-10  PCI Bus
|   |   |   |-- pci10                               Defined            03-12  PCI Bus
|   |   |   |-- pci11                               Defined            03-16  PCI Bus
|   |   |-- pci3                                    Defined                   PCI Bus
|   |   |   |-- pci12                               Defined            01-10  PCI Bus
|   |   |   |-- pci13                               Defined            01-12  PCI Bus
|   |   |-- pci14                                   Defined                   PCI Bus
|   |   |-- proc0                                 Available            00-00  Processor
|   |   |-- proc4                                 Available            00-04  Processor
|   |   |-- vio0                                  Available                   Virtual I/O Bus
|   |   |   |-- ent1                              Available                   Virtual I/O Ethernet Adapter (l-lan)
|   |   |   |-- vsa0                              Available                   LPAR Virtual Serial Adapter
|   |   |   |   |-- vty0                          Available                   Asynchronous Terminal
|   |   |   |-- vscsi0                            Available                   Virtual SCSI Client Adapter
|   |   |   |   |-- hdisk0                        Available                   Virtual SCSI Disk Drive
|   |   |   |-- vscsi1                            Available                   Virtual SCSI Client Adapter
|
`-- End of the device tree

in the detailed output mode.
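
The idea behind the regular mode can be sketched in a few lines of Python. This is not the devtree script from the aaa_base package, only a minimal illustration under assumptions: the input is taken to be a list of (device name, parent name) pairs, as could be collected from the ODM, with an empty parent for top-level devices. The function name render_tree and the sample data are hypothetical.

```python
from collections import defaultdict


def render_tree(pairs):
    """Render (name, parent) pairs as an indented tree.

    An empty parent string marks a top-level device. Children are
    printed in the order in which they appear in the input.
    """
    children = defaultdict(list)
    for name, parent in pairs:
        children[parent].append(name)

    lines = ["|"]

    def walk(parent, depth):
        # One "|   " prefix per nesting level, then the branch marker.
        for name in children[parent]:
            lines.append("|   " * depth + "|-- " + name)
            walk(name, depth + 1)

    walk("", 0)
    lines.append("|")
    lines.append("`-- End of the device tree")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical excerpt of the device hierarchy shown above.
    sample = [
        ("inet0", ""), ("en1", "inet0"), ("lo0", "inet0"),
        ("sys0", ""), ("sysplanar0", "sys0"), ("vio0", "sysplanar0"),
        ("vscsi0", "vio0"), ("hdisk0", "vscsi0"),
    ]
    print(render_tree(sample))
```

The detailed mode would only need additional columns (status, location, description) appended to each line.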

// Live Partition Mobility (LPM) with Debian on an IBM Power LPAR - Part 1

Live partition mobility (LPM) in the IBM Power environment is roughly the same as vMotion in the VMware ESX world. It allows the movement of an LPAR from one IBM Power hardware system to another without (major) interruption to the system running within the LPAR. We use this feature on a regular basis for our AIX LPARs and it has simplified our work to a great extent, since we no longer need downtimes for a lot of our regular administrative work. LPM also works for LPARs running Linux as an OS, but since IBM only supports the SuSE and Red Hat enterprise distributions, the service and productivity tools necessary to successfully perform LPM (and also DLPAR) operations are not readily available to users of the Debian distribution. I still wanted to be able to do DLPAR and LPM operations on our Debian LPARs as well. To that end, I converted the necessary RPM packages from the service and productivity tools mentioned before to the DEB package format. Most of the conversion work was done by the alien tool provided by the Debian distribution. Beyond that, some manual patches were still necessary to adjust the components to the specifics of the Debian environment. A current version of the Linux kernel and a rebuild of the kernel package with some PPC specific options enabled were also necessary. Here are the individual steps:

  1. Install the prerequisite Debian packages:

    libstdc++5
    ksh
    uuid
    libsgutils1
    libsqlite3-0
    $ apt-get install libstdc++5 ksh uuid libsgutils1 libsqlite3-0

    The libsqlite3-0 package is currently only necessary for the libservicelog-1-1-1-32bit package. The packages libservicelog and servicelog, which also have a dependency on libsqlite3-0, contain binaries and libraries that are built for a 64bit userland (ppc64), which is currently not available for Debian. Using the binaries or libraries from libservicelog or servicelog will therefore result in an error about unresolved symbols.

  2. Convert or download the IBM service and productivity tools:

    Option 1: Download the already converted IBM service and productivity tools:

    Option 2: Convert the IBM service and productivity tools from RPM to DEB:

    1. Install the prerequisite Debian packages:

      alien
      $ apt-get install alien
    2. Download the necessary patch files:

    3. Convert librtas:

      $ alien -gc librtas-32bit-1.3.6-4.ppc64.rpm
      $ patch -p 0 < librtas-32bit.patch
      $ cd librtas-32bit-1.3.6
      $ ./debian/rules binary
      
      $ alien -gc librtas-1.3.6-3.ppc64.rpm
      $ patch -p 0 < librtas.patch
      $ cd librtas-1.3.6
      $ ./debian/rules binary
    4. Convert src:

      $ alien -gc src-1.3.1.1-11277.ppc.rpm
      $ patch -p 0 < src.patch
      $ rm src-1.3.1.1/debian/postrm
      $ cd src-1.3.1.1
      $ perl -i -p -e 's/ppc64/powerpc/' ./debian/control
      $ ./debian/rules binary
    5. Convert RSCT core and utils:

      $ alien -gc rsct.core.utils-3.1.0.7-11277.ppc.rpm
      $ patch -p 0 < rsct.core.utils.patch
      $ cd rsct.core.utils-3.1.0.7
      $ ./debian/rules binary
      
      $ alien -gc rsct.core-3.1.0.7-11277.ppc.rpm
      $ patch -p 0 < rsct.core.patch
      $ cd rsct.core-3.1.0.7
      $ ./debian/rules binary
    6. Convert ServiceRM:

      $ alien -gc devices.chrp.base.ServiceRM-2.3.0.0-11231.ppc.rpm
      $ patch -p 0 < devices.chrp.base.ServiceRM.patch
      $ cd devices.chrp.base.ServiceRM-2.3.0.0
      $ ./debian/rules binary
    7. Convert DynamicRM:

      $ alien -gc DynamicRM-1.3.9-7.ppc64.rpm
      $ patch -p 0 < DynamicRM.patch
      $ cd DynamicRM-1.3.9
      $ ./debian/rules binary
    8. Convert lsvpd and libvpd:

      $ alien -gc libvpd2-2.1.3-3.ppc64.rpm
      $ patch -p 0 < libvpd.patch
      $ cd libvpd2-2.1.3
      $ ./debian/rules binary
      
      $ alien -gc lsvpd-1.6.11-4.ppc64.rpm
      $ patch -p 0 < lsvpd.patch
      $ cd lsvpd-1.6.11
      $ ./debian/rules binary
    9. Build, package and/or install PowerPC Utils:

      Install from sources at sourceforge.net or use the custom-built DEB package from:

    10. Convert libservicelog and servicelog:

      $ alien -gc libservicelog-1_1-1-1.1.11-9.ppc64.rpm
      $ patch -p 0 < libservicelog-1_1-1.patch
      $ cd libservicelog-1_1-1-1.1.11
      $ ./debian/rules binary
      
      $ alien -gc libservicelog-1.1.11-9.ppc64.rpm
      $ patch -p 0 < libservicelog.patch
      $ cd libservicelog-1.1.11
      $ ./debian/rules binary
      
      $ alien -gc libservicelog-1_1-1-32bit-1.1.11-9.ppc.rpm
      $ patch -p 0 < libservicelog-1_1-1-32bit.patch
      $ cd libservicelog-1_1-1-32bit-1.1.11
      $ ./debian/rules binary
      
      $ alien -gc servicelog-1.1.9-8.ppc64.rpm
      $ patch -p 0 < servicelog.patch
      $ cd servicelog-1.1.9
      $ ./debian/rules binary
  3. Install the IBM service and productivity tools DEB packages:

    $ dpkg -i librtas_1.3.6-4_powerpc.deb librtas-32bit_1.3.6-5_powerpc.deb \
        src_1.3.1.1-11278_powerpc.deb rsct.core.utils_3.1.0.7-11278_powerpc.deb \
        rsct.core_3.1.0.7-11278_powerpc.deb devices.chrp.base.servicerm_2.3.0.0-11232_powerpc.deb \
        dynamicrm_1.3.9-8_powerpc.deb libvpd2_2.1.3-4_powerpc.deb lsvpd_1.6.11-5_powerpc.deb \
        powerpc-ibm-utils_1.2.12-1_powerpc.deb libservicelog-1-1-1-32bit_1.1.11-10_powerpc.deb \
        libservicelog_1.1.11-10_powerpc.deb servicelog_1.1.9-9_powerpc.deb
  4. Rebuild the stock Debian kernel package as described in HowTo Rebuild An Official Debian Kernel Package. I've confirmed that DLPAR and LPM work with at least the Debian kernel package versions 2.6.39-3~bpo60+1 and 3.2.46-1~bpo60+1. In the make menuconfig step, make sure the following kernel configuration options are selected:

    CONFIG_MIGRATION=y
    CONFIG_PPC_PSERIES=y
    CONFIG_PPC_SPLPAR=y
    CONFIG_LPARCFG=y
    CONFIG_PPC_SMLPAR=y
    CONFIG_PPC_RTAS=y
    CONFIG_RTAS_PROC=y
    CONFIG_NUMA=y
    # CONFIG_SPARSEMEM_VMEMMAP is not set
    CONFIG_MEMORY_HOTPLUG=y
    CONFIG_MEMORY_HOTPLUG_SPARSE=y
    CONFIG_MEMORY_HOTREMOVE=y
    CONFIG_ARCH_MEMORY_PROBE=y
    CONFIG_HOTPLUG_PCI=y
    CONFIG_HOTPLUG_PCI_RPA=y
    CONFIG_HOTPLUG_PCI_RPA_DLPAR=y

    Or use one of the following trimmed down kernel configuration files:

    Install the newly built kernel package, reboot and select the new kernel to be loaded.

  5. A few minutes after the system has been started, DLPAR and LPM operations on the LPAR should be possible from the HMC. A good indication in the HMC GUI is a properly filled field in the “OS Version” column. From the HMC CLI you can check with:

    $ lspartition -dlpar
    ...
    <#108> Partition:<45*8231-E2D*06AB35T, ststnagios02.lan.ssbag, 10.8.32.46>
           Active:<1>, OS:<Linux/Debian, 3.2.0-0.bpo.4.ssb.1-powerUnknown, Unknown>, DCaps:<0x2c7f>, CmdCaps:<0x19, 0x19>, PinnedMem:<0>
    ...

    The LPAR should show up in the output and the value of DCaps should be different from 0x0.

    After a successful LPM operation the output of dmesg from within the OS should look like this:

    ...
    [539043.613297] calling ibm,suspend-me on cpu 4
    [539043.960651] EPOW <0x6240040000000b8 0x0 0x0>
    [539043.960665] ibmvscsi 30000003: Re-enabling adapter!
    [539043.961606] RTAS: event: 21, Type: EPOW, Severity: 1
    [539043.962920] ibmvscsi 30000002: Re-enabling adapter!
    [539044.175848] property parse failed in parse_next_property at line 230
    [539044.485745] ibmvscsi 30000002: partner initialization complete
    [539044.485815] ibmvscsi 30000002: host srp version: 16.a, host partition vios1-p730-222 (1), OS 3, max io 262144
    [539044.485892] ibmvscsi 30000002: Client reserve enabled
    [539044.485907] ibmvscsi 30000002: sent SRP login
    [539044.485964] ibmvscsi 30000002: SRP_LOGIN succeeded
    [539044.525723] ibmvscsi 30000003: partner initialization complete
    [539044.525779] ibmvscsi 30000003: host srp version: 16.a, host partition vios2-p730-222 (2), OS 3, max io 262144
    [539044.525884] ibmvscsi 30000003: Client reserve enabled
    [539044.525897] ibmvscsi 30000003: sent SRP login
    [539044.525943] ibmvscsi 30000003: SRP_LOGIN succeeded
    [539044.884514] property parse failed in parse_next_property at line 230
    ...
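
The DCaps check from step 5 can also be scripted. The following is a minimal Python sketch that assumes the lspartition -dlpar output has been captured as text and looks like the sample above (a "Partition:<...>" line followed by a line containing "DCaps:<0x...>"). The field layout is inferred from that one sample and the function name dlpar_capable is hypothetical.

```python
import re


def dlpar_capable(lspartition_output):
    """Map partition host name -> True if its DCaps value is non-zero.

    Assumes the second comma-separated field inside "Partition:<...>"
    is the host name, as in the sample output above.
    """
    result = {}
    name = None
    for line in lspartition_output.splitlines():
        m = re.search(r"Partition:<[^,>]*,\s*([^,>]*),", line)
        if m:
            name = m.group(1)
            continue
        m = re.search(r"DCaps:<(0x[0-9a-fA-F]+)>", line)
        if m and name is not None:
            # A DCaps value of 0x0 means no DLPAR capabilities were reported.
            result[name] = int(m.group(1), 16) != 0
            name = None
    return result


if __name__ == "__main__":
    sample = (
        "<#108> Partition:<45*8231-E2D*06AB35T, ststnagios02.lan.ssbag, 10.8.32.46>\n"
        "       Active:<1>, OS:<Linux/Debian, 3.2.0-0.bpo.4.ssb.1-powerUnknown, Unknown>, "
        "DCaps:<0x2c7f>, CmdCaps:<0x19, 0x19>, PinnedMem:<0>\n"
    )
    for host, ok in dlpar_capable(sample).items():
        print(host, "DLPAR capable" if ok else "DCaps is 0x0")
```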

Although this was done some time ago and several new versions of the packages from the service and productivity tools have been released since, DLPAR and LPM still work with this setup. There will be another installment of this post in the future with updated package versions. Another item on my ToDo list is to provide those components from the service and productivity tools which are available in source code as native Debian packages.
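
As a footnote to step 4, the list of kernel options can be checked mechanically instead of by eye. The sketch below is only an illustration: it takes the text of a kernel configuration file and reports which of the options listed in step 4 are not set to "y", and whether CONFIG_SPARSEMEM_VMEMMAP is set although it should not be. The function name check_config is hypothetical, and the path in the usage comment assumes the usual Debian location of the installed kernel configuration.

```python
# Options that step 4 lists as "=y".
REQUIRED_Y = [
    "CONFIG_MIGRATION", "CONFIG_PPC_PSERIES", "CONFIG_PPC_SPLPAR",
    "CONFIG_LPARCFG", "CONFIG_PPC_SMLPAR", "CONFIG_PPC_RTAS",
    "CONFIG_RTAS_PROC", "CONFIG_NUMA", "CONFIG_MEMORY_HOTPLUG",
    "CONFIG_MEMORY_HOTPLUG_SPARSE", "CONFIG_MEMORY_HOTREMOVE",
    "CONFIG_ARCH_MEMORY_PROBE", "CONFIG_HOTPLUG_PCI",
    "CONFIG_HOTPLUG_PCI_RPA", "CONFIG_HOTPLUG_PCI_RPA_DLPAR",
]
# Option that step 4 lists as "is not set".
MUST_BE_UNSET = ["CONFIG_SPARSEMEM_VMEMMAP"]


def check_config(config_text):
    """Return (missing, wrongly_set) option lists for a kernel config text."""
    values = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Lines like "# CONFIG_FOO is not set" start with "#" and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            key, _, value = line.partition("=")
            values[key] = value
    missing = [opt for opt in REQUIRED_Y if values.get(opt) != "y"]
    wrongly_set = [opt for opt in MUST_BE_UNSET if opt in values]
    return missing, wrongly_set


# Usage (assumed path; the version suffix depends on the installed kernel):
#   with open("/boot/config-3.2.0-0.bpo.4-powerpc64") as fh:
#       missing, wrongly_set = check_config(fh.read())
```

Both returned lists being empty means the configuration matches the list from step 4.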

// SSB puts reliable public transport on the fast track with IBM and SAP

Well, I guess this could very well be construed as my 15 minutes of fame ;-) See the IBM case study “SSB puts reliable public transport on the fast track with IBM and SAP” (PDF) about our IBM Power, AIX, storage, SAN, backup and SAP environment or directly download the PDF here. It was an amazing experience, especially learning how to write well and precisely without losing the reader's interest or compromising on technical accuracy. Many thanks to everyone involved!

// HMC Update to 7.7.7.0 SP2

Updating the HMC from v7.7.7.0 SP1 to v7.7.7.0 SP2 was once again, like the previous HMC Update to 7.7.6.0 SP1, very painless. The service pack MH01354 was easily installable from the ISO images via the HMC GUI. The previous restriction of having to disconnect one of the HMCs in a redundant setup seems to have been dropped; at least it wasn't mentioned in the release notes anymore. This time the readme of the eFixes (MH01367 and MH01373) to be installed on top of the service pack was very clear on MH01373 superseding MH01367:

This package includes a fix for HMC Version 7 Release 7.7.0 Service Pack 2. You can reference this package by APAR# MB03714. This fix must be installed on top of HMC Version 7 Release 7.7.0 Service Pack 2 with or without PTF MH01367 installed. PTF MH01373 supersedes PTF MH01367.

The service pack and the additional eFixes showed the following output during the update process:

  1. MH01354:

    Mangement console corrective service installation in progress. Please wait...
    Corrective service file offload from remote server in progress...
    The corrective service file offload was successful. Continuing with HMC service installation...
    Verifying Certificate Information
    Authenticating Install Packages
    Installing Packages
    --- Installing ptf-req ....
    --- Installing RSCT ....
    src-3.1.4.4-13032
    rsct.core.utils-3.1.4.4-13032
    rsct.core-3.1.4.4-13032
    rsct.service-3.5.0.0-1
    rsct.basic-3.1.4.4-13032
    --- Installing CSM ....
    csm.core-1.7.1.20-1
    csm.deploy-1.7.1.20-1
    csm_hmc.server-1.7.1.20-1
    csm_hmc.hdwr_svr-7.0-3.4.0
    csm_hmc.client-1.7.1.20-1
    csm.server.hsc-1.7.1.20-1
    --- Installing LPARCMD ....
    hsc.lparcmd-3.0.0.1-1
    ln: creating symbolic link `/usr/hmcrbin/lsnodeid' : File exists
    ln: creating symbolic link `/usr/hmcrbin/lsrsrc-api' : File exists
    ln: creating symbolic link `/usr/hmcrbin/mkrsrc-api' : File exists
    ln: creating symbolic link `/usr/hmcrbin/rmrsrc-api' : File exists
    --- Installing InventoryScout ....
    --- Installing Pegasus ....
    --- Installing service documentation ....
    --- Updating baseOS ....
    Corrective service installation was successful.
  2. MH01367: Skipped, because - as mentioned in the release notes of MH01373 - it is superseded by MH01373.

  3. MH01373:

    Mangement console corrective service installation in progress. Please wait...
    Corrective service file offload from remote server in progress...
    The corrective service file offload was successful. Continuing with HMC service installation...
    Verifying Certificate Information
    Authenticating Install Packages
    Installing Packages
    --- Installing ptf-req ....
    Corrective service installation was successful.

The MH01354 update still shows error messages with regard to symlink creation during the update process. This (admittedly minor) issue and the associated PMR I opened with IBM support are virtually racing towards my all-time top ten of downright ridiculous experiences. After a lengthy back and forth with L2 support/development, things concluded in their position:

Most of our customers want to see the verbose output during upgrades/updates. There is no defect here.
If the customer wishes to do so he can open a DCR, but there is no defect.

Well, verbose output would indeed be nice, I suppose. An option to choose whether to show or to hide said verbose output would be even nicer. What I really don't care for are constant visual reminders of your lack of knowledge in proper shell scripting, not to mention your awful coding style! So here we are again: a simple, non-critical issue which could be fixed by a trivial change in a matter of minutes, and still we're spending a disproportionate amount of time arguing about whether it's really an issue or not. But alright, let's waste the time of even more people and open a DCR (MR0809134336) on this issue. And while we're at it, have another one (MR0618131954) for botching the mksysplan command in case a failover SEA is used on the VIO server.

Aside from that, up to now no issues with the new HMC version.

// Ganglia Fibre Channel Power/Attenuation Monitoring on AIX and VIO Servers

Although usually only available upon request via IBM support, efc_power is quite a handy tool when it comes to debugging or narrowing down fibre channel link issues. It provides the transmit and receive power and attenuation values for a given FC port on an AIX or VIO server. Fortunately the output of efc_power:

$ /opt/freeware/bin/efc_power /dev/fscsi2
TX: 1232 -> 0.4658 mW, -3.32 dBm
RX: 10a9 -> 0.4265 mW, -3.70 dBm

is very parser-friendly, so it can easily be read by a script for further processing. In this case further processing means continuous Ganglia monitoring of the fibre channel transmit and receive power and attenuation values for each FC port on an AIX or VIO server. This is accomplished by the two RPM packages ganglia-addons-aix and ganglia-addons-aix-scripts:

RPM packages

Source RPM packages

The package ganglia-addons-aix-scripts is to be installed on the AIX or VIO server which has the FC adapter installed. It depends on the aaa_base package for the efc_power binary and on the ganglia-addons-base package, specifically on the cronjob (/opt/freeware/etc/run_parts/conf.d/ganglia-addons.sh) defined by this package. In the context of this cronjob all available scripts in the directory /opt/freeware/libexec/ganglia-addons/ are executed. This specific Ganglia addon iterates over all fscsi devices in the system and calls efc_power for each of them. Devices can be excluded by assigning a regex pattern to the BLACKLIST variable in the configuration file /opt/freeware/etc/ganglia-addons/ganglia-addons-efc_power.cfg. The output of each efc_power call is parsed and fed via the gmetric command into a Ganglia monitoring system that has to be set up already.
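
To illustrate the parsing and blacklist steps described above, here is a minimal Python sketch. It is not the script shipped in ganglia-addons-aix-scripts. It assumes the two-line efc_power output format shown earlier, and it assumes that the BLACKLIST pattern is matched anywhere in the device name, which may differ from the real script. The function names are hypothetical, and actually calling efc_power and gmetric is left out.

```python
import re

# Matches lines such as "TX: 1232 -> 0.4658 mW, -3.32 dBm".
LINE_RE = re.compile(
    r"^(TX|RX):\s+([0-9a-fA-F]+)\s+->\s+([0-9.]+)\s+mW,\s+(-?[0-9.]+)\s+dBm$"
)


def parse_efc_power(output):
    """Return {"TX": (mW, dBm), "RX": (mW, dBm)} from efc_power output."""
    values = {}
    for line in output.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            values[m.group(1)] = (float(m.group(3)), float(m.group(4)))
    return values


def filter_devices(devices, blacklist):
    """Drop devices matching the blacklist regex; an empty pattern keeps all."""
    if not blacklist:
        return list(devices)
    pattern = re.compile(blacklist)
    return [dev for dev in devices if not pattern.search(dev)]


if __name__ == "__main__":
    sample = "TX: 1232 -> 0.4658 mW, -3.32 dBm\nRX: 10a9 -> 0.4265 mW, -3.70 dBm\n"
    print(parse_efc_power(sample))
    print(filter_devices(["fscsi0", "fscsi1", "fscsi2"], "fscsi[01]"))
```

Each parsed value would then be handed to gmetric as one metric per FC port and direction.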

The package ganglia-addons-aix is to be installed on the host running the Ganglia web interface. It contains templates for the customization of the FC power and attenuation metrics within the Ganglia Web 2 interface. See the README.templates file for further installation instructions. Here are samples of the two graphs created with those Ganglia monitoring templates:

Example of FC power and attenuation with a bad cable

In section “1” of the graphs, the receive attenuation on FC port fscsi2 was about -7.7 dBm, which means that of the 476.6 uW sent from the Brocade switchport:

$ sfpshow 1/10

Identifier:  3    SFP
Connector:   7    LC
Transceiver: 540c404000000000 200,400,800_MB/s M5,M6 sw Short_dist
Encoding:    1    8B10B
Baud Rate:   85   (units 100 megabaud)
Length 9u:   0    (units km)
Length 9u:   0    (units 100 meters)
Length 50u:  5    (units 10 meters)
Length 62.5u:2    (units 10 meters)
Length Cu:   0    (units 1 meter)
Vendor Name: BROCADE
Vendor OUI:  00:05:1e
Vendor PN:   57-1000012-01
Vendor Rev:  A
Wavelength:  850  (units nm)
Options:     003a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max:      0
BR Min:      0
Serial No:   UAF1112600001JW
Date Code:   110619
DD Type:     0x68
Enh Options: 0xfa
Status/Ctrl: 0x82
Alarm flags[0,1] = 0x5, 0x40
Warn Flags[0,1] = 0x5, 0x40
                                          Alarm                  Warn
                                      low        high       low         high
Temperature: 41      Centigrade     -10         90         -5          85
Current:     7.392   mAmps          1.000       17.000     2.000       14.000
Voltage:     3264.9  mVolts         2900.0      3700.0     3000.0      3600.0
RX Power:    -4.0    dBm (400.1 uW) 10.0   uW   1258.9 uW  15.8   uW   1000.0 uW
TX Power:    -3.2    dBm (476.6 uW) 125.9  uW   631.0  uW  158.5  uW   562.3  uW

only about 200 uW actually made it to the FC port fscsi2 on the VIO server. Section “2” shows even worse values during the time the FC connections and cables were checked, which basically means that the FC link was down during that time period. Section “3” shows the values after the bad cable was found and replaced. Receive attenuation on FC port fscsi2 went down to about -3.7 dBm, which means that of the now 473.6 uW sent from the Brocade switchport, 427.3 uW actually make it to the FC port fscsi2 on the VIO server.
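
The dBm and uW figures quoted above are two views of the same quantity: a power level in dBm is 10 * log10 of the power in mW. A small Python sketch of the conversion, with hypothetical helper names, makes it easy to cross-check such readings. For example, -3.7 dBm corresponds to roughly 427 uW, in line with the efc_power output shown earlier, and -7.7 dBm works out to roughly 170 uW, in the same range as the "about 200 uW" read off the graph.

```python
import math


def dbm_to_uw(dbm):
    """Convert a power level in dBm to microwatts (0 dBm = 1 mW = 1000 uW)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def uw_to_dbm(uw):
    """Convert a power in microwatts to dBm."""
    return 10.0 * math.log10(uw / 1000.0)


def link_loss_db(tx_dbm, rx_dbm):
    """Loss between transmitter and receiver in dB (positive means loss)."""
    return tx_dbm - rx_dbm


if __name__ == "__main__":
    for level in (-3.2, -3.7, -7.7):
        print("%5.1f dBm = %6.1f uW" % (level, dbm_to_uw(level)))
    # Switch port TX power versus the receive level on the FC port.
    print("loss: %.1f dB" % link_loss_db(-3.2, -7.7))
```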

The goal of continuously monitoring the fibre channel transmit and receive power and attenuation values is to catch slowly deteriorating situations early on, before they become a real issue or even a service interruption. As shown above, this can be accomplished with Ganglia and the two RPM packages ganglia-addons-aix and ganglia-addons-aix-scripts. For ad hoc checks, e.g. during the debugging of the components in a suspicious FC link, efc_power is still best called directly from the AIX or VIO server command line.
