File s390-tools-sles15sp1-ziomon-fix-utilization-data-recording-with-multi-dig.patch of Package s390-tools.15658
Subject: ziomon: fix utilization data recording with multi-digit scsi hosts
From: Fedor Loshakov <loshakov@linux.ibm.com>
Description:   ziomon: fix utilization recording with multi-digit scsi hosts
Symptom:       While running the ziomon script, the user receives the
               ziomon error: "ziomon: Number of LUNs does not match
               number of devices: 2 devices and 1 LUNs". The user may
               also receive the ziomon_util error: "ziomon_util: Path
               does not exist:
               /sys/class/scsi_host/host0/utilization - correct kernel
               version?". After records have been collected,
               ziorep_utilization or ziorep_traffic fails with the
               error: "ziorep_traffic: Could not retrieve initial data
               - data files corrupted or broken, or the .agg file is
               missing."
Problem:       s390-tools-1.9.0 introduced a new way of recognizing
               multipath device paths, using a sed command in the
               ziomon script. If a multipath device has paths whose
               SCSI host ID is longer than one digit, ziomon parses
               the multipath -l command output incorrectly: it cuts
               off all but the least significant digit of the SCSI
               host ID (H) of paths in H:B:T:L format
               (Host:Bus:Target:Lun). As a result, hosts (-a) and
               paths (-l) with a wrong SCSI host ID are passed to
               ziomon_util, which cannot find hosts with a
               non-existing SCSI host ID and issues an error.
               The wrong sed command can also cause ziomon to obtain
               duplicate LUNs when parsing the multipath -l command
               output. ziomon then removes the duplicates from
               WRP_LUNS, so the number of LUNs no longer matches the
               number of detected block devices. ziomon reports a
               script error without starting ziomon_util and without
               writing to the specified log file.
Solution:      The regular expression that matches a path in H:B:T:L
               format started with a greedy ".*", which erroneously
               consumed parts of the SCSI host ID (H). The solution
               is to replace the greedy ".*" with "[^0-9]*", so that
               the sed command no longer consumes parts of the SCSI
               host ID.
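               The effect of the two patterns can be checked in
               isolation with the sketch below. It is only an
               illustration, not ziomon code: the helper names
               strip_prefix_greedy and strip_prefix_fixed and the
               variable hbtl are made up here, and the input is one
               path line from the multipath -l example, with
               whitespace already collapsed as an unquoted echo
               would do.

```shell
# Hypothetical helpers (not part of ziomon): each applies one of the two
# sed expressions from this patch to a single multipath -l path line.

# Basic regular expression for a path in H:B:T:L format.
hbtl='[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}'

# Old expression: the greedy ".*" also swallows leading host ID digits.
strip_prefix_greedy() {
    echo "$1" | sed "s/.*\($hbtl\)/\1/"
}

# New expression: "[^0-9]*" stops at the first digit of the host ID.
strip_prefix_fixed() {
    echo "$1" | sed "s/[^0-9]*\($hbtl\)/\1/"
}

path_line='|- 10:0:0:1089814659 sdb 8:16 active undef running'

old=$(strip_prefix_greedy "$path_line")
new=$(strip_prefix_fixed "$path_line")

# ziomon derives the host adapter name with the same ${line%%:*} expansion.
echo "old: host${old%%:*}"   # prints "old: host0"
echo "new: host${new%%:*}"   # prints "new: host10"
```

               Only the greedy variant loses the leading digit of
               the host ID; the rest of the line, including the
               device name in the second field, is unchanged in
               both cases.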
Reproduction:  Create a multipath device that has at least one path
               with a SCSI host ID longer than one digit.
               Example for wrong SCSI host ID:
               $ multipath -l
               mpathc (36005076307ffc5e300000000000083f5) dm-2 IBM ...
               size=20G features='1 queue_if_no_path' hwhandler='0'...
               `-+- policy='service-time 0' prio=0 status=active
                 |- 10:0:0:1089814659 sdb 8:16  active undef running
                 `- 11:0:0:1089814659 sdf 8:80  active undef running
               Example for duplicate SCSI host ID:
               $ multipath -l
               mpathd (36005076307ffc5e300000000000083f4) dm-3 IBM ...
               size=20G features='1 queue_if_no_path' hwhandler='0'...
               `-+- policy='service-time 0' prio=0 status=active
                 |- 10:0:0:1089749123 sda 8:0  active undef running
                 `- 0:0:0:1089749123 sdd 8:48  active undef running
               Use the ziomon tool with one of the described multipath
               devices as input.
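               The duplicate case can be reproduced without a
               multipath setup using the sketch below. It is an
               illustration under stated assumptions, not ziomon
               code: the helper count_unique_luns is hypothetical,
               and "sort -u" merely stands in for whatever ziomon
               does to exclude duplicates from WRP_LUNS.

```shell
# Hypothetical sketch (not part of ziomon): count the distinct H:B:T:L
# values that remain after parsing two path lines with a given prefix.

# Basic regular expression for a path in H:B:T:L format.
hbtl='[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}'

# $1 is the prefix pattern placed in front of the H:B:T:L group.
# "sort -u" is an assumption standing in for ziomon's duplicate removal.
count_unique_luns() {
    sed "s/$1\($hbtl\)/\1/" | awk '{print $1}' | sort -u | wc -l | tr -d ' '
}

# The two path lines of the duplicate example, whitespace collapsed.
paths='|- 10:0:0:1089749123 sda 8:0 active undef running
`- 0:0:0:1089749123 sdd 8:48 active undef running'

old_count=$(printf '%s\n' "$paths" | count_unique_luns '.*')
new_count=$(printf '%s\n' "$paths" | count_unique_luns '[^0-9]*')

echo "greedy pattern: 2 devices and $old_count LUNs"
echo "fixed pattern:  2 devices and $new_count LUNs"
```

               With the greedy pattern both paths collapse to
               0:0:0:1089749123, so only one LUN remains for two
               devices, which corresponds to the mismatch that
               ziomon reports.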
Upstream-ID:   f2dee9f542439cf07e00df4296b05a47b81e2469
Problem-ID:    176863
Upstream-Description:
              ziomon: fix utilization data recording with multi-digit scsi hosts
              s390-tools-1.9.0 introduced a new way of recognizing multipath
              device paths, using a sed command in the ziomon script. If a
              multipath device has paths whose SCSI host ID is longer than
              one digit, ziomon parses the multipath -l command output
              incorrectly: it cuts off all but the least significant digit
              of the SCSI host ID (H) of paths in H:B:T:L format
              (Host:Bus:Target:Lun). As a result, hosts (-a) and paths (-l)
              with a non-existing SCSI host ID are passed to ziomon_util,
              which cannot find such hosts and issues an error.
              The wrong sed command can also cause ziomon to obtain
              duplicate LUNs when parsing the multipath -l command output.
              ziomon then removes the duplicates from WRP_LUNS, so the
              number of LUNs no longer matches the number of detected block
              devices. ziomon reports a script error without starting
              ziomon_util and without writing to the specified log file.
              The regular expression that matches a path in H:B:T:L format
              started with a greedy ".*", which erroneously consumed parts
              of the SCSI host ID (H). This patch replaces the greedy ".*"
              with "[^0-9]*", so that the sed command no longer consumes
              parts of the SCSI host ID.
              Test example with unique SCSI host IDs:
              $ multipath -l
              ...
              mpathc (36005076307ffc5e300000000000083f5) dm-2 IBM     ,2107900
              size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
              `-+- policy='service-time 0' prio=0 status=active
                |- 10:0:0:1089814659 sdb 8:16  active undef running
                `- 11:0:0:1089814659 sdf 8:80  active undef running
              ...
              Behavior without fix applied:
              $ ziomon -d 5 -o log /dev/mapper/mpathc
              Check devices...done
              NOTE: No size limit specified, run without a limit.
              Estimated maximum disk space required for log data: approx. <1 MBytes
              Collecting configuration data...done
              Start data collection processes...ziomon_util: Path does not exist: /sys/class/scsi_host/host0/utilization - correct kernel version?
              ziomon_util: Path does not exist: /sys/class/scsi_host/host0/utilization - correct kernel version?
              ziomon_util: Path does not exist: /sys/class/scsi_host/host0/queue_full - correct kernel version?
              ziomon_util: Path does not exist: /sys/class/scsi_host/host1/utilization - correct kernel version?
              ziomon_util: Path does not exist: /sys/class/scsi_host/host1/queue_full - correct kernel version?
              failed
              ziomon: Failed to determine ziomon_util pid
              Shutting down
              Shutting down blktrace process
              Shutting down blkiomon process
              Shutting down ziomon_zfcpdd process
              blkiomon: terminated by signal
              Shutting down data manager
              More information is available when running ziomon with the -V option:
              ...
              === WRP_LUNS         : 0:0:0:1089814659 1:0:0:1089814659
              === WRP_HOST_ADAPTERS: host0 host1
              ...
              === starting ziomon_util: ziomon_util -V  -a 0 -a 1  -l 0:0:0:1089814659 -l 1:0:0:1089814659 ...
              ...
              The user can also see errors of this type when using the
              ziorep_traffic or ziorep_utilization tools:
              $ ziorep_traffic -t1 log.log
              Extracting config data...done
              ziorep_traffic: Could not retrieve initial data - data files corrupted or broken, or the .agg file is missing.
              Behavior with fix applied:
              $ ziomon -d 5 -o log /dev/mapper/mpathc
              Check devices...done
              NOTE: No size limit specified, run without a limit.
              Estimated maximum disk space required for log data: approx. <1 MBytes
              Collecting configuration data...done
              Start data collection processes...done
              Collecting data...done
              Shutting down
              Shutting down data manager
              More information is available when running ziomon with the -V option:
              ...
              === WRP_LUNS         : 10:0:0:1089814659 11:0:0:1089814659
              === WRP_HOST_ADAPTERS: host10 host11
              ...
              === starting ziomon_util: ziomon_util -V  -a 10 -a 11  -l 10:0:0:1089814659 -l 11:0:0:1089814659 ...
              ...
              Test example with duplicate SCSI host IDs:
              $ multipath -l
              ...
              mpathc (36005076307ffc5e300000000000083f5) dm-1 IBM     ,2107900
              size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
              `-+- policy='service-time 0' prio=0 status=active
                |- 0:0:0:1089814659  sdb 8:16  active undef running
                `- 10:0:0:1089814659 sdf 8:80  active undef running
              ...
              Behavior without fix applied:
              $ ziomon -d 5 -o log /dev/mapper/mpathc
              Check devices...done
              ziomon: Number of LUNs does not match number of devices: 2 devices and 1 LUNs
              More information is available when running ziomon with the -V option:
              ...
              === #Devices total   : 2
              === WRP_DEVICES      : /dev/sdb /dev/sdf
              === WRP_LUNS         : 0:0:0:1089814659
              === WRP_HOST_ADAPTERS: host0
              ...
              Behavior with fix applied:
              $ ziomon -d 5 -o log /dev/mapper/mpathc
              Check devices...done
              NOTE: No size limit specified, run without a limit.
              Estimated maximum disk space required for log data: approx. <1 MBytes
              Collecting configuration data...done
              Start data collection processes...done
              Collecting data...done
              Shutting down
              Shutting down data manager
              More information is available when running ziomon with the -V option:
              ...
              === #Devices total   : 2
              === WRP_DEVICES      : /dev/sdb /dev/sdf
              === WRP_LUNS         : 0:0:0:1089814659 10:0:0:1089814659
              === WRP_HOST_ADAPTERS: host0 host10
              ...
              === starting ziomon_util: ziomon_util -V  -a 0 -a 10  -l 0:0:0:1089814659 -l 10:0:0:1089814659 ...
              ...
              Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com>
              Reviewed-by: Steffen Maier <maier@linux.ibm.com>
              Reviewed-by: Jens Remus <jremus@linux.ibm.com>
              Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com>
---
 ziomon/ziomon |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/ziomon/ziomon
+++ b/ziomon/ziomon
@@ -514,7 +514,7 @@ function check_for_multipath_devices() {
                (( i+=2 ));
                while [[ `echo "${mp_arr[$i]:0:1}" | grep -ve "[0-9a-zA-Z]"` ]] && [ $i -lt ${#mp_arr[@]} ]; do
                   if [ `echo ${mp_arr[$i]} | grep -e "[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}" | wc -l` -ne 0 ]; then
-	             line="`echo ${mp_arr[$i]} | sed 's/.*\([0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}\)/\1/'`";
+	             line="`echo ${mp_arr[$i]} | sed 's/[^0-9]*\([0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}\)/\1/'`";
                      checked_devs[${#checked_devs[@]}]=`echo $line | awk '{print "/dev/"$2}'`;
                      ddebug "      adding ${checked_devs[${#checked_devs[@]}-1]}";
                      WRP_HOST_ADAPTERS[${#WRP_HOST_ADAPTERS[@]}]="host${line%%:*}";