Graphing ilcs in Grafana

Graphing of ilcs on AIX is not possible from the nimon data output, because ilcs comes from the mpstat command and is not part of the perfstat library that njmon/nimon uses.

ilcs stands for “Involuntary Logical CPU switches”. These occur when the LPAR is running on processor time in the shared pool but is suddenly kicked off the processor due to contention, which seriously hurts performance.

The script below runs mpstat -d in the background and sends the data to the ‘mpstat’ database, into the ‘mpstat’ measurement. You will need to adapt this to your environment.

Run as ‘python mpstat.py 5 17280’ to get 5-second data points for 24 hours (17280 × 5 s = 86400 s).


from influxdb import InfluxDBClient     # For InfluxDB connection
import subprocess       # For background process piping
import re               # For regexp matching
import sys              # For reading command line arguments
import socket           # For gethostname function

"""
Program name : mpstat.py

Syntax       : mpstat.py {interval} {count}

Author       : Henrik Morsing

Date         : 09/10/2023

Description  : Script starts mpstat in the background and
               Sends ilcs data to the "mpstat" database for Grafana,
               in the "mpstat" measurements.

               It sends data with the tag "cswitch=ilcs" and
               parameter "switches".

               The interval parameter is in seconds.
"""

host=socket.gethostname()
server="<server>"       # InfluxDB server hostname - replace with your own

# Load command arguments
count = sys.argv[2]
interval = sys.argv[1]  # In seconds

# Define the connection string
conn = InfluxDBClient(host=server, port='8086', username='user', password='password123')

# Define function to run and return mpstat values continuously
def mpstat(interval, count):
    """
    This function calls mpstat to pick out ilcs values.
    It then returns them via 'yield', which enables the function
    to continue listening, and returning, mpstat output.
    """

    # Define process handle
    popen = subprocess.Popen(["/usr/bin/mpstat", "-d", str(interval), str(count)], stdout=subprocess.PIPE, universal_newlines=True)

    # Now we can read the output lines from the mpstat function.
    for line in iter(popen.stdout.readline, ""):
        match = re.search("ALL", line)          # "match" becomes true if line contains the word "ALL"

        if match:
            fields = line.split()   # Split the line into a list of words
            ilcs = fields[15]       # ilcs is at index 15 of the mpstat -d output fields
            yield ilcs              # Return ilcs

    popen.stdout.close()



for ilcs in mpstat(interval, count):
    line = 'mpstat,host=%s,cswitch=ilcs switches=%s' % (host,ilcs)
    conn.write([line], {'db': 'mpstat'}, 204, 'line')
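
Once data is flowing, a Grafana graph panel just needs a simple InfluxQL query against the ‘mpstat’ database. A sketch you can test from the influx CLI first (assuming InfluxDB 1.x, with the database, measurement, tag and field names used by the script above):

influx -database mpstat -execute "SELECT mean(switches) FROM mpstat WHERE cswitch = 'ilcs' AND time > now() - 1h GROUP BY time(1m)"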

The guideline for what level of ilcs hurts performance is given in Earl Jew’s POWER VUG presentation ‘Simplest starting tactic for Power10 AIX exploitation V1.2‘.

Relieving the performance impact is done by increasing processor entitlement until the ilcs numbers drop.
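
If dynamic LPAR operations are available, entitlement can be raised from the HMC command line without an outage; a sketch, with the HMC, managed system and LPAR names as placeholders:

ssh <hmc> 'chhwres -r proc -m <system> -o a -p <lpar> --procunits 0.5'    # add 0.5 processing units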


Unable to increase MaxLPs for LV

As part of growing filesystems on AIX, you often hit the message:

Error Message:  0516-787 extendlv: Maximum allocation for logical volume is 512

which is a bit of a legacy setting from the days of small disks, when you could not shrink filesystems.

This is easily overcome by running chlv -x <number>, but I was recently thwarted by another error when trying this fairly routine command:

# chlv -x 65536 data2_lv
0516-1389 chlv: The -x parameter for MaxLPs must be between 1 and 32768.
        It cannot be smaller than the existing number of LP's.

It turns out that “MAX PPs per VG” limits the “MaxLPs” setting.

# lsvg data2_vg
VOLUME GROUP:       data2_vg                 VG IDENTIFIER:  00c85c6700004c000000015c5d99c816
VG STATE:           active                   PP SIZE:        128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      31992 (4094976 megabytes)
MAX LVs:            256                      FREE PPs:       0 (0 megabytes)
LVs:                1                        USED PPs:       31992 (4094976 megabytes)
OPEN LVs:           1                        QUORUM:         5 (Enabled)
TOTAL PVs:          8                        VG DESCRIPTORS: 8
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         8                        AUTO ON:        yes
MAX PPs per VG:     32768                    MAX PVs:        1024

To fix this, run chvg -P 64 <VGname> (the -P value is in units of 1024 PPs), which will set “MAX PPs per VG” to 65536.
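
Putting it together for the volume group above, the sequence looks roughly like this (a sketch; the extendlv size is only an example, and with FREE PPs at 0 you would first need to add disks to the VG):

# chvg -P 64 data2_vg        # raises "MAX PPs per VG" to 65536
# chlv -x 65536 data2_lv     # now within the new limit
# extendlv data2_lv 100      # grow the LV again (needs free PPs)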


Yubikey un-block PIN

Un-blocking the Yubikey PIN seems a bit under-documented. Here is how. Bear in mind that the password (-P option) is 8 characters max; if you type anything longer, the tool will tell you the PIN can be max 8 characters, when it actually means the password. Totally stupid.

% /usr/local/bin/yubico-piv-tool -P <8 chr password> -N <PIS> -a unblock-pin

PIS = Personal Identification String (the manual calls this a PIN, which is wrong). This is the string you type in every time you use the key.
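
To check whether the PIN is actually blocked, and how many retries remain, the status action helps; the output includes the remaining PIN tries:

% /usr/local/bin/yubico-piv-tool -a status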


Capacity and performance check script

Another little script I wrote, this one to check capacity aspects of an AIX LPAR. I call them capacity checks as most of them are based on counters, averaged out over 90 days. Some of this is based on Earl Jew’s excellent vmstat presentation to the IBM POWER VUG.

The script checks memory and I/O buffer overflow counters as well as LPAR SRAD spreading.
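
As a worked example of the 90-day scaling the script applies: with 45 days of uptime and 2,000,000 ‘paging space page outs’, the counter normalises to a seven-digit figure, which lands in the “very memory constrained” tier:

echo $(( 2000000/45*90 ))    # 3999960 -> 7 digits => "very memory constrained"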

#!/bin/ksh93

# Performance recommendation tool
#
# Copyright Henrik Morsing, 2022
#
# Initial version 1.0
# 09-11-2022    Henrik Morsing  1.1     Added more informative output
#                                       and correct when to alert (6 digits, not 5)

# Set a reference to current days up

ref="$(uptime | grep days | awk '{ print $3 }')"

#

# If less than a day or two, exit, less than twenty, warn

if [[ "${ref}" == "" ]]
then
   echo "System uptime too low."
   exit 1
elif [[ "${ref}" -lt 20 ]]
then
   echo "System uptime too low to give accurate results."
fi

echo
echo "Starting System Performance Analyser v1.0"
echo
echo "System Name: $(uname -n) - System Uptime Days: ${ref}"
echo
echo "Please bear in mind, as stats used are accumulated over time,"
echo "they can be a view of the past and issues may already have been rectified."
echo
echo

#####################
# MEMORY
#####################

echo "\t *** MEMORY CHECKS ***"
echo
echo "Add more memory to rectify these"
echo

# Start by checking some memory variables
# Read paging space page outs, revolutions of the clock hand, free frame waits

vmstat -s | grep -E 'paging space page outs|revolutions of the clock hand|free frame waits' | awk '{ print $1 } ' | tr '\n' ' ' | read page_outs revolutions frame_waits

# First, convert to 90 day reference
page_outs_90=$(( ${page_outs}/${ref}*90 ))
revolutions_90=$(( ${revolutions}/${ref}*90 ))
frame_waits_90=$(( ${frame_waits}/${ref}*90 ))

# echo ${page_outs_90}
# echo ${revolutions_90}
# echo ${frame_waits_90}

# Then, find number of digits
page_outs_digits=${#page_outs_90}
frame_waits_digits=${#frame_waits_90}

# echo "${page_outs_digits}"
# echo "${frame_waits_digits}"

# Check on numbers and warn as needed

if [[ ${page_outs_digits} -gt 7 || ${revolutions} -gt $(( ${ref}*100 )) || ${frame_waits_digits} -gt 6 ]]
then
   echo "You are extremely memory constrained:"
   [[ ${page_outs_digits} -gt 7 ]] && echo "- \033[1;31m'paging space page outs' is extremely high:\033[m ${page_outs} -> ${page_outs_90} per 90 days (${page_outs_digits} digits)"
   [[ ${revolutions} -gt $(( ${ref}*100 )) ]] && echo "- \033[1;31m'revolutions of the clock hand' is extremely high:\033[m ${revolutions} -> ${revolutions_90} per 90 days"
   [[ ${frame_waits_digits} -gt 6 ]] && echo "- \033[1;31m'free frame waits' is extremely high:\033[m ${frame_waits} -> ${frame_waits_90} per 90 days (${frame_waits_digits} digits)"

elif [[ ${page_outs_digits} -gt 6 || ${revolutions} -gt $(( ${ref}*10 )) || ${frame_waits_digits} -gt 5 ]]
then
   echo "You are very memory constrained:"
   [[ ${page_outs_digits} -gt 6 ]] && echo "- \033[1;33m'paging space page outs' is very high:\033[m ${page_outs} -> ${page_outs_90} per 90 days (${page_outs_digits} digits)"
   [[ ${revolutions} -gt $(( ${ref}*10 )) ]] && echo "- \033[1;33m'revolutions of the clock hand' is very high:\033[m ${revolutions} -> ${revolutions_90} per 90 days"
   [[ ${frame_waits_digits} -gt 5 ]] && echo "- \033[1;33m'free frame waits' is very high:\033[m ${frame_waits} -> ${frame_waits_90} per 90 days (${frame_waits_digits} digits)"

elif [[ ${page_outs_digits} -gt 5 || ${revolutions} -gt ${ref} || ${frame_waits_digits} -gt 4 ]]
then
   echo "You could benefit from adding more memory:"
   [[ ${page_outs_digits} -gt 5 ]] && echo "- 'paging space page outs' is high: ${page_outs} -> ${page_outs_90} per 90 days  (${page_outs_digits} digits)"
   [[ ${revolutions} -gt ${ref} ]] && echo "- 'revolutions of the clock hand' is high: ${revolutions} -> ${revolutions_90} per 90 days"
   [[ ${frame_waits_digits} -gt 4 ]] && echo "- 'free frame waits' is high: ${frame_waits} -> ${frame_waits_90} per 90 days (${frame_waits_digits} digits)"
fi


#####################
# PROCESSOR
#####################

echo
echo "\t *** PROCESSOR CHECKS ***"
echo

# Checking for LPAR SRAD spreading

num_srads="$(lssrad -a | grep -v SRAD | wc -l)"
vCPUs_online="$(lparstat -i | grep 'Online Virtual CPUs' | awk '{ print $NF }').0"
vCPUs_max="$(lparstat -i | grep "Maximum Virtual CPUs" | awk '{ print $NF }')"
Entitlement="$(lparstat -i | grep "Entitled Capacity" | grep -v "Pool" | awk '{ print $NF }')"

if [[ ${num_srads} -gt "2" ]]
then
        echo "LPAR is spread across multiple SRADs (${num_srads}). If memory (2TB?) and max processor allocations (less than 15 vCPUs, currently ${vCPUs_max}) suggests it can be contained within one SRAD, powering the LPAR off and on again might align it correctly."
fi

echo
printf "*** Checking spreading factor ***"

# Spreading factor: online virtual CPUs per unit of entitlement
spreading=$(( ${vCPUs_online} / ${Entitlement} ))

if [[ ${vCPUs_online} -gt "1" ]]
then
   if [[ ${spreading} -gt 2 ]]
   then
      echo "\t[\033[1;33mWARNING\033[m]"
      echo "Number of virtual processors is high compared to entitlement."
   else
      echo "\t[\033[1;32mOK\033[m]"
   fi
fi


#####################
# I/O
#####################

# Starting from the top, VGs first

echo
echo "\t *** I/O CHECKS ***"
echo

for volgroup in $(lsvg -o)
do

   printf "*** Checking ${volgroup} ***"
   msg=false

   ##################
   # Checking pbufs #
   ##################

   # Count blocked I/Os with no pbuf
   pervg_blocked_io_count=$(/usr/sbin/lvmo -v ${volgroup} -o pervg_blocked_io_count)

   # Reference to 90 days
   pbio_90=$(( ${pervg_blocked_io_count}/${ref}*90 ))

   # Find number of digits
   pbio_digits=${#pbio_90}

   # Recommendation based on number of digits
   if [[ ${pbio_digits} -gt 6 ]]
   then
      url=true
      echo "\t[\033[1;33mWARNING\033[m]"

      # Calculate recommended pv_pbuf_count for VG
      pbuf_curr=$(lvmo -v ${volgroup} -o pv_pbuf_count)
      pbuf_vg=$(( ${pbuf_curr}+16384 ))

      echo "Volume group ${volgroup} is extremely low on pbufs"
      echo "- \033[1;31m'pending disk I/Os blocked with no pbuf' is extremely high:\033[m ${pbuf_curr}. Increase 'pv_pbuf_count' to ${pbuf_vg}.\n"
   else
      echo "\t[\033[1;32mOK\033[m]"
   fi
done


###################
# Checking psbufs #
###################

# These checks are system-wide, so they run outside the per-VG loop.
# Count blocked paging space I/O with no psbuf

vmstat -v | grep -E 'paging space I/Os blocked with no psbuf|external pager filesystem I/Os blocked with no fsbuf' | awk '{ print $1 } ' | tr '\n' ' ' | read psbuf fsbuf

# Reference to 90 days
psio_90=$(( ${psbuf}/${ref}*90 ))

# Any psbufs blocked is bad
if [[ ${#psio_90} -gt 1 ]]
then
   url=true
   printf "[\033[1;33mWARNING\033[m] "
   echo "\033[1;31mpsbufs is above 10\033[m, indicating severe memory restriction causing excessive paging. If you cannot add memory, alleviate by adding parallel paging spaces."
fi


###################
# Checking fsbufs #
###################
echo
# Count blocked external pager filesystem I/O with no fsbuf

# Reference to 90 days
fsio_90=$(( ${fsbuf}/${ref}*90 ))

# Any fsbufs blocked is bad
if [[ ${#fsio_90} -gt 2 ]]
then
   url=true
   printf "[\033[1;33mWARNING\033[m] "
   echo "\033[1;31mfsbufs is above 100\033[m, indicating filesystem I/O over-load. Increase j2_dynamicBufferPreallocation with ioo to fix this. Start by doubling the value."
   echo "Also consider splitting into smaller file systems."
fi

[[ "${url}" == "true" ]] && echo "Info on I/O buffers: https://www.ibm.com/support/pages/blocked-ios-due-buffers-shortage"

###################
# Fibre Adapters  #
###################

adapters=$(lsdev -Ccadapter | grep fcs | awk '{ print $1 }')

# Check No Command Resource Count (Update num_cmd_elems)

for adapter in ${adapters}
do
   ncrc=$(fcstat -D ${adapter} | grep "No Command Resource Count" | awk '{ print $NF }')

   # Reference to 90 days
   ncrc_90=$(( ${ncrc}/${ref}*90 ))

   # Not sure how many are bad, let's start with 6 digits

   if [[ ${#ncrc_90} -gt 6 ]]
   then
      url=true
      printf "[\033[1;33mWARNING\033[m] "
      echo "- \033[1;31mNo Command Resource Count for adapter ${adapter} is extremely high:\033[m ${ncrc} -> ${ncrc_90} per 90 days (${#ncrc_90} digits)"
      echo "Increase num_cmd_elems on ${adapter} to fix, but not higher than num_cmd_elems on the VIO physical adapter."
   elif [[ ${#ncrc_90} -gt 5 ]]
   then
      url=true
      printf "[\033[1;33mWARNING\033[m] "
      echo "- \033[1;31mNo Command Resource Count for adapter ${adapter} is very high:\033[m ${ncrc} -> ${ncrc_90} per 90 days (${#ncrc_90} digits)"
      echo "Increase num_cmd_elems on ${adapter} to fix, but not higher than num_cmd_elems on the VIO physical adapter."
   fi
done

[[ "${url}" == "true" ]] && echo "Info on fcs buffers: https://www.ibm.com/support/pages/no-command-resource-count-and-high-water-mark-active-and-pending-commands"
url=false

echo


# Check High water mark of active/pending commands (Update num_cmd_elems)

for adapter in ${adapters}
do
   hwmac=$(fcstat -D ${adapter} | grep -p "FC SCSI Adapter Driver Queue" | grep "High water mark  of active commands" | awk '{ print $NF }')
   hwmpc=$(fcstat -D ${adapter} | grep -p "FC SCSI Adapter Driver Queue" | grep "High water mark of pending commands" | awk '{ print $NF }')

   # Reference to 90 days
   hwmac_90=$(( ${hwmac}/${ref}*90 ))
   hwmpc_90=$(( ${hwmpc}/${ref}*90 ))

   hwm_summ=$(( ${hwmac} + ${hwmpc} ))

   # We need the current num_cmd_elems setting for this adapter
   nce=$(lsattr -El ${adapter} -a num_cmd_elems -F value)

   if [[ ${hwm_summ} -gt ${nce} ]]
   then
      url=true
      printf "[\033[1;33mWARNING\033[m] "
      echo "- \033[1;31mHigh water mark for active/pending commands for adapter ${adapter} is higher than num_cmd_elems:\033[m ${hwm_summ} vs. ${nce}"
      echo "Increase num_cmd_elems on ${adapter} to fix, but not higher than num_cmd_elems on the VIO physical adapter."
   fi
done

# Link to helpful web page.
echo
[[ "${url}" == "true" ]] && echo "Info on fcs buffers: https://www.ibm.com/support/pages/no-command-resource-count-and-high-water-mark-active-and-pending-commands"
url=false
echo


# Check No DMA Resource Count (Update max_xfer_size)

for adapter in ${adapters}
do
   nodma=$(fcstat -D ${adapter} | grep "No DMA Resource Count" | awk '{ print $NF }')

   # Reference to 90 days
   nodma_90=$(( ${nodma}/${ref}*90 ))

   if [[ ${#nodma_90} -gt 3 ]]
   then
      url=true
      printf "[\033[1;33mWARNING\033[m] "
      echo "- \033[1;31mNo DMA Resource Count for adapter ${adapter} is higher than 3 digits per 90 days:\033[m ${nodma_90}"
      echo "Increase max_xfer_size on ${adapter} to fix, but not higher than max_xfer_size on the VIO physical adapter."
   fi
done

# Link to helpful web page.
echo
[[ "${url}" == "true" ]] && echo "Info on fcs buffers: https://www.ibm.com/support/pages/no-command-resource-count-and-high-water-mark-active-and-pending-commands"
url=false

echo
exit 0

Fixing Alt-Enter maximising XTerm

Using Org-mode, it was incredibly inconvenient that, on both FreeBSD and Linux, XTerm stole the Alt-Enter (or Alt-Return, as XTerm calls it) key combination and just maximised itself.

I am not sure if this is i3wm related; it could just be that other window managers re-configure this.

I can’t remember how I fixed this on Linux a few years back, but I just had to go through it all again on FreeBSD with i3. XTerm reads .Xdefaults, and adding the following to the bottom fixed it:

*VT100.Translations: #override \
        Alt <Key>Return:        \n\

There are Google hits claiming XTerm*fullscreen: never or XTerm*omitTranslation: fullscreen fixes it, but that was not the case for me.
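
A quick way to verify the fix: run cat in the XTerm and press Alt-Return. If XTerm now forwards the key to the application instead of maximising, you should typically see the escape echoed as ^[ followed by a newline, which is what Emacs decodes as M-RET:

% cat
^[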

Hope this gets you back using Org-mode!


Checking SUSE bootability on POWER

This little script checks the GRUB boot image for the VG/LV/file string pointing to the correct place, e.g. /boot/grub/grub.cfg.

This works on SUSE / SLES on POWER.

# SLES bootability monitor
#
# Henrik Morsing 1.0 24-FEB-2023 Initial

ver=1.0

[[ "${1}" == "DEBUG" ]] && DEBUG=1

unset ERR

# Find rootvg disks

rootlv="$(mount | grep ' / ' | awk '{ print $1 }')"
rootvg=$(mount | grep ' / ' | awk '{ print $1 }' | awk -F- '{ print $1 }' | awk -F/ '{ print $NF }')
disk="$(/usr/sbin/pvs | grep ${rootvg} | awk -F- '{ print $1 }' | awk -F/ '{ print $NF }')"

PV="$(/usr/sbin/pvs | grep ${rootvg} | awk '{ print $1 }')"
boot_LV="$(/usr/sbin/fdisk -l | grep PReP | grep mapper | awk '{ print $1 }')"


# This checks that the PVid and LVid in the boot LV
# point to the disk in the rootvg.
# It also checks that the grub image pointed to exists.
# dd_wrapper is a local helper (not shown here) that reads the boot LV.

/usr/local/bin/dd_wrapper ${boot_LV} 2>/dev/null | strings | grep lvmid | grep grub2 | awk -F / '{ print $2,$3 }' | tr -d ')' | while read vgid lvid
do

   checkVG="$(/sbin/vgdisplay -c | grep ${vgid} | awk -F: '{ print $1 }' | tr -d ' ')"

   if [[ "${checkVG}" == "${rootvg}" ]]
   then
      [[ ${DEBUG} ]] && printf "Boot image points to correct volume group\t\t\033[1;32m[OK]\033[0m\n"
   else
      [[ ${DEBUG} ]] && printf "Boot image does NOT point to correct volume group\t\t\033[1;31m[FAIL]\033[0m\n"
      ERR=1
   fi

   # This is painfully annoying. The lvdisplay command just lacks sensible output.

   while read lv parm value
   do

      # If parm is "Path", store value in case the UUID matches what we are looking for.

      if [[ "${parm}" == "Path" ]]
      then
         final_path="$(ls -l ${value} | awk '{ print $NF }')"
      fi

      if [[ "${parm}" == "UUID" ]]
      then
         if [[ "${value}" == "${lvid}" ]]
         then
            break
         fi
      fi
   done < <(/sbin/lvdisplay | grep -E "LV Path|LV UUID")

   if [[ "${final_path}" == "$(ls -l ${rootlv} | awk '{ print $NF }')" ]]
   then
      [[ ${DEBUG} ]] && printf "Boot image points to correct logical volume\t\t\033[1;32m[OK]\033[0m\n"
   else
      [[ ${DEBUG} ]] && printf "Boot image does NOT point to correct logical volume\t\t\033[1;31m[FAIL]\033[0m\n"
      ERR=1
   fi

done

# Last check, grub file

grub_file="$(/usr/local/bin/dd_wrapper ${boot_LV} 2>/dev/null | strings | grep lvmid | grep grub2 | awk -F / '{ print "/"$(NF-1)"/"$NF"/grub.cfg" }')"

if [[ -s ${grub_file} ]]
then
   [[ ${DEBUG} ]] && printf "grub.cfg exists in correct location\t\t\t\033[1;32m[OK]\033[0m\n"
else
   [[ ${DEBUG} ]] && printf "grub.cfg does NOT exist in correct location\t\t\033[1;31m[FAIL]\033[0m\n"
   ERR=1
fi

[[ "${ERR}" == "" ]] && echo "Success" || echo "Fail"

exit 0
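
By default the script only prints Success or Fail at the end, which suits monitoring use; pass DEBUG as the first argument to print each individual check. For example (the script name here is my own placeholder):

# ./sles_boot_check.sh DEBUG
Boot image points to correct volume group		[OK]
Boot image points to correct logical volume		[OK]
grub.cfg exists in correct location			[OK]
Success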


Zabbix API shell script to show maintenance association

Just a quick note on how you can get a server to show, at login, which Zabbix maintenance group it is in (a sketch of the login hook follows after the script).

#!/bin/bash

#
# Script to show if host is in maintenance.

host="$(hostname)"

# First a little function to generate output for the curl call. This is to allow the use of a variable in the JSON string.

generate_post_data()
{
  cat <<EOF
{
"jsonrpc": "2.0",
"method": "host.get",
"params": {
    "filter": { "host": ["${host}"] },
    "output": "extend" },
"auth": "5159d718ca9a12a932a8b34506575a9ce0f58ff115da7f1d12a6feed957b90fd",
"id": 1
}
EOF
}

# Now send the API call and store the maintenance ID

maintID=$(curl -s -X POST -H "Content-Type: application/json-rpc" -d "$(generate_post_data)" http://zabbix-dev.eur-d.howdev.corp/zabbix/api_jsonrpc.php | jq '."result" | .[] | ."maintenanceid"' | tr -d '"')

typeset -A maint

# Create an associative array of maintenance groups for later lookup

for period in $(curl -s -X POST -H "Content-Type: application/json-rpc" -d '{
    "jsonrpc": "2.0",
    "method": "maintenance.get",
    "params": {
        "output": "extend"
    },
    "auth": "5159d718ca9a12a932a8b34506575a9ce0f58ff115da7f1d12a6feed957b90fd",
    "id": 1
}'  http://zabbix-dev.eur-d.howdev.corp/zabbix/api_jsonrpc.php | jq '."result" | .[] | .maintenanceid + "-" + .name' | tr ' ' '_')
do
    maint[$(echo ${period} | cut -d- -f 1 | tr -d '"')]="$(echo ${period} | cut -d- -f 2 | tr -d '"')"
done

# And we're ready to output which group this system is in

[[ "${maintID}" != "" ]] && echo "This server is in maintenance group ${maint[${maintID}]}"

exit 0
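
To trigger this at login, drop a hook into the shell profile; a minimal sketch (the path and script location are my assumptions):

# cat /etc/profile.d/zabbix-maint.sh
[ -x /usr/local/bin/zabbix-maint.sh ] && /usr/local/bin/zabbix-maint.sh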

Using Ansible command return code

I was using the command module in Ansible, but the command I was calling at the other end would cause the module to fail even on informational messages. I believe this was partly due to the ‘rc’ value in the output being non-zero.

Googling and reading documentation turned up a multitude of complex methods for responding to this return code, most of them inaccurate or simply not working.

The module output was:

{"changed": true, "cmd": ["/opt/IBM/ldap/V6.4/bin/idslink", "-g", "-i", "-l", "32", "-f"], "delta": "0:00:01.881979", "end": "2022-12-21 15:08:38.942974", "failed_when_result": false, "msg": "non-zero return code", "rc": 4, […]

The solution was quite simple:

- name: Relink 64 bit libraries
  when: ansible_os_family == "AIX"
  command: /opt/IBM/ldap/V6.4/bin/idslink -g -i -l 64 -f
  register: output
  failed_when: output.rc not in [0, 2, 4]

This ensures return codes 0, 2 and 4 are not treated as errors.


Script to change WWPNS in LPAR profile

This is a script to interactively change WWPNs on adapters in an IBM Power Systems LPAR profile. It is useful if you have to re-create adapters or even whole profiles, as the HMC GUI does not allow you to set or change WWPNs, and the HMC command line is not exactly elegant.
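
The script reminds you to back up the profile first; one way to do that from the command line, using the same lssyscfg call the script itself relies on, is to save the full profile listing to a file (HMC, system and LPAR names are placeholders):

ssh <hmc> 'lssyscfg -r prof -m <system> --filter "lpar_names=<lpar>"' > <lpar>_profile.backup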

# Copyright AIXperts Consultancy ltd, Henrik Morsing - November 2022

echo
echo "You are about to write new WWPNs to the adapters for this profile. Please backup the profile first"
echo

# Prompt for hmc and host

printf "Please enter HMC to connect to: "; read hmc
printf "Please enter host to change: "; read host
echo

# Find LPAR and system names
while read sys
do
   #echo "Checking system "${sys}
   lpar="$(echo $(ssh -n ${hmc} lssyscfg -r lpar -m ${sys} -F name) | tr ' ' '\n' | grep ${host})"
   if [[ ${lpar} != "" ]]
   then
      #echo "Break-out"
      export system=${sys}
      break
   fi
done <<< "$(ssh ${hmc} 'lssyscfg -r sys -F name')"

# Find profile name
profile_name="$(echo $(ssh ${hmc} lssyscfg -r prof -m ${system} --filter \"lpar_names=${lpar}\" -F name) | cut -f1 -d' ')"

# Find WWPNs
typeset -A value        # For WWPNs
typeset -i f=0          # Counter for adapter
typeset -i field=1      # Counter for adapter
value[${f}]="kjfhkes"   # Initial, non-empty value

# Define some arrays to read output values into
typeset -A adapter_id
typeset -A rlpar_id
typeset -A rlpar_name
typeset -A rslot_no

# Let's fetch the output from the HMC
cmd_out="$(ssh ${hmc} lssyscfg -m ${system} -r prof --filter \"lpar_names=${lpar},profile_names=${profile_name}\" -F virtual_fc_adapters)"

# Loop to read every adapter output
while [[ -n ${value[${f}]} ]]
do
   f+=1
   # Read values for each adapter in turn into the arrays
   echo ${cmd_out} | tr -d '"' | cut -d, -f${field},$(( ${field}+1 )) | tr '/' ' '| read adapter_id[${f}] client rlpar_id[${f}] rlpar_name[${f}] rslot_no[${f}] value[${f}] req

   field+=2
done

# Before starting this loop, start building up a command to submit
cmd="ssh ${hmc} chsyscfg -r prof -m ${system} --force -i \"lpar_name=${lpar}, name=${profile_name}, \\\"virtual_fc_adapters="

f=1     # Adapter counter

while [[ -n ${value[${f}]} ]]
do
   printf "Enter new WWPN or hit <ENTER> to keep WWPN ["${value[${f}]}"]: "; read wwpns_input

   if [[ "${wwpns_input}" == "" ]]
   then

      cmd_build=${cmd}\\\"\\\"${adapter_id[${f}]}/client/${rlpar_id[${f}]}/${rlpar_name[${f}]}/${rslot_no[${f}]}/${value[${f}]}/0\\\"\\\",
      cmd=${cmd_build}

      # Increment f
      f+=1

   elif [[ "${wwpns_input}" =~ [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f],[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f] ]]
   then
      value[${f}]=${wwpns_input}

      cmd_build=${cmd}\\\"\\\"${adapter_id[${f}]}/client/${rlpar_id[${f}]}/${rlpar_name[${f}]}/${rslot_no[${f}]}/${value[${f}]}/0\\\"\\\",
      cmd=${cmd_build}

      # Increment f
      f+=1

   else
      echo "Invalid entry, please try again"
   fi
done

# Now, tidy up the end of the command
cmd_end=$(echo ${cmd_build} | sed 's/,$//')     # Remove trailing comma
cmd_complete=${cmd_end}\\\"\"                   # Finish off quotes

echo
echo "Running: "${cmd_complete}

${cmd_complete}

exit 0
