File s390-tools-sles15sp1-ziomon-fix-utilization-data-recording-with-multi-dig.patch of Package s390-tools.14411
Subject: ziomon: fix utilization data recording with multi-digit scsi hosts
From: Fedor Loshakov <loshakov@linux.ibm.com>
Description: ziomon: fix utilization recording with multi-digit scsi hosts
Symptom: While the ziomon script is running, the user receives,
depending on the SCSI host IDs involved, either the
ziomon error "ziomon: Number of LUNs does not match
number of devices: 2 devices and 1 LUNs" or the
ziomon_util error "ziomon_util: Path does not exist:
/sys/class/scsi_host/host0/utilization - correct kernel
version?". After records have been collected,
ziorep_utilization or ziorep_traffic fails with an
error such as: "ziorep_traffic: Could not retrieve
initial data - data files corrupted or broken, or the
.agg file is missing."
Problem: s390-tools-1.9.0 changed the ziomon script to recognize
multipath device paths by invoking sed. If a multipath
device has paths whose SCSI host ID is longer than one
digit, ziomon parses the multipath -l command output
incorrectly: it cuts off all but the least significant
digit of the SCSI host ID (H) of paths in H:B:T:L
format (Host:Bus:Target:Lun). As a result, hosts (-a)
and paths (-l) with a wrong SCSI host ID are passed to
ziomon_util, which cannot find hosts with a
non-existing SCSI host ID and issues an error. The
wrong sed invocation can also make ziomon obtain
duplicate LUNs from the multipath -l command output.
ziomon then removes the duplicates from WRP_LUNS, so
the number of LUNs no longer matches the number of
detected block devices. The ziomon script exits with an
error without starting ziomon_util and without writing
to the specified log file.
Solution: The regular expression that matches a path in H:B:T:L
format started with a greedy ".*", which erroneously
consumed parts of the SCSI host ID (H). The solution
is to replace the greedy ".*" with "[^0-9]*", so that
the sed command no longer consumes parts of the SCSI
host ID.
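The effect of the two expressions can be checked outside of ziomon. The
following is a minimal sketch, not part of the patch: the variable names
path_line, hbtl, old and new are made up for illustration, and the input
is one path line from the multipath -l example with whitespace collapsed.
The two sed expressions are the ones from the removed and the added line
of the patch.

```shell
#!/bin/sh
# One path line as printed by "multipath -l" (leading whitespace collapsed).
path_line='|- 10:0:0:1089814659 sdb 8:16 active undef running'

# H:B:T:L pattern as used in the ziomon script (POSIX basic regex).
hbtl='[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}'

# Old expression: the greedy ".*" also swallows the leading "1" of "10".
old=$(printf '%s\n' "$path_line" | sed "s/.*\($hbtl\)/\1/")

# Fixed expression: "[^0-9]*" stops in front of the first digit.
new=$(printf '%s\n' "$path_line" | sed "s/[^0-9]*\($hbtl\)/\1/")

# ziomon derives the host name from everything before the first colon.
echo "old: host${old%%:*}"   # prints "old: host0"
echo "new: host${new%%:*}"   # prints "new: host10"
```

Because ".*" is greedy, sed picks the latest position at which the rest of
the pattern still matches, which is one digit into the host ID.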
Reproduction: Create a multipath device with at least one path
whose SCSI host ID is longer than one digit.
Example for wrong SCSI host ID:
$ multipath -l
mpathc (36005076307ffc5e300000000000083f5) dm-2 IBM ...
size=20G features='1 queue_if_no_path' hwhandler='0'...
`-+- policy='service-time 0' prio=0 status=active
|- 10:0:0:1089814659 sdb 8:16 active undef running
`- 11:0:0:1089814659 sdf 8:80 active undef running
Example for duplicate SCSI host ID:
$ multipath -l
mpathd (36005076307ffc5e300000000000083f4) dm-3 IBM ...
size=20G features='1 queue_if_no_path' hwhandler='0'...
`-+- policy='service-time 0' prio=0 status=active
|- 10:0:0:1089749123 sda 8:0 active undef running
`- 0:0:0:1089749123 sdd 8:48 active undef running
Run the ziomon tool with one of the multipath devices
described above as input.
Upstream-ID: f2dee9f542439cf07e00df4296b05a47b81e2469
Problem-ID: 176863
Upstream-Description:
ziomon: fix utilization data recording with multi-digit scsi hosts
s390-tools-1.9.0 changed the ziomon script to recognize multipath
device paths by invoking sed. If a multipath device has paths whose
SCSI host ID is longer than one digit, ziomon parses the
multipath -l command output incorrectly: it cuts off all but the
least significant digit of the SCSI host ID (H) of paths in
H:B:T:L format (Host:Bus:Target:Lun). As a result, hosts (-a) and
paths (-l) with a non-existing SCSI host ID are passed to
ziomon_util, which cannot find these hosts and issues an error.
The wrong sed invocation can also make ziomon obtain duplicate LUNs
from the multipath -l command output. ziomon then removes the
duplicates from WRP_LUNS, so the number of LUNs no longer matches
the number of detected block devices. The ziomon script exits with
an error without starting ziomon_util and without writing to the
specified log file.
The regular expression that matches a path in H:B:T:L format
started with a greedy ".*", which erroneously consumed parts of the
SCSI host ID (H). This patch replaces the greedy ".*" with
"[^0-9]*", so that the sed command no longer consumes parts of the
SCSI host ID.
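The duplicate-LUN case can be illustrated in the same way. This is only a
sketch under stated assumptions: the helper count_unique_luns and the
variables paths and hbtl are hypothetical, and "sort -u" stands in for
whatever de-duplication ziomon applies to WRP_LUNS. The input is the pair
of path lines with SCSI host IDs 0 and 10.

```shell
#!/bin/sh
# H:B:T:L pattern as used in the ziomon script (POSIX basic regex).
hbtl='[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}'

# Two paths of one multipath device, SCSI host IDs 0 and 10.
paths='|- 0:0:0:1089814659 sdb 8:16 active undef running
`- 10:0:0:1089814659 sdf 8:80 active undef running'

# Hypothetical helper: count distinct H:B:T:L values for a given
# sed prefix pattern ($1) placed in front of the H:B:T:L group.
count_unique_luns() {
    printf '%s\n' "$paths" |
        sed "s/$1\($hbtl\)/\1/" |
        awk '{print $1}' |
        sort -u |
        wc -l | tr -d ' '
}

old_count=$(count_unique_luns '.*')       # both paths collapse to 0:0:0:...
new_count=$(count_unique_luns '[^0-9]*')  # 0:0:0:... and 10:0:0:... stay distinct

echo "devices: 2, LUNs with old expression: $old_count"
echo "devices: 2, LUNs with new expression: $new_count"
```

With the old expression only one distinct LUN remains for two block
devices, which corresponds to the "2 devices and 1 LUNs" mismatch.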
Test example with unique SCSI host IDs:
$ multipath -l
...
mpathc (36005076307ffc5e300000000000083f5) dm-2 IBM ,2107900
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
|- 10:0:0:1089814659 sdb 8:16 active undef running
`- 11:0:0:1089814659 sdf 8:80 active undef running
...
Behavior without fix applied:
$ ziomon -d 5 -o log /dev/mapper/mpathc
Check devices...done
NOTE: No size limit specified, run without a limit.
Estimated maximum disk space required for log data: approx. <1 MBytes
Collecting configuration data...done
Start data collection processes...ziomon_util: Path does not exist: /sys/class/scsi_host/host0/utilization - correct kernel version?
ziomon_util: Path does not exist: /sys/class/scsi_host/host0/utilization - correct kernel version?
ziomon_util: Path does not exist: /sys/class/scsi_host/host0/queue_full - correct kernel version?
ziomon_util: Path does not exist: /sys/class/scsi_host/host1/utilization - correct kernel version?
ziomon_util: Path does not exist: /sys/class/scsi_host/host1/queue_full - correct kernel version?
failed
ziomon: Failed to determine ziomon_util pid
Shutting down
Shutting down blktrace process
Shutting down blkiomon process
Shutting down ziomon_zfcpdd process
blkiomon: terminated by signal
Shutting down data manager
More information is shown when ziomon is run with the -V option:
...
=== WRP_LUNS : 0:0:0:1089814659 1:0:0:1089814659
=== WRP_HOST_ADAPTERS: host0 host1
...
=== starting ziomon_util: ziomon_util -V -a 0 -a 1 -l 0:0:0:1089814659 -l 1:0:0:1089814659 ...
...
The ziorep_traffic and ziorep_utilization tools then report
errors such as the following:
$ ziorep_traffic -t1 log.log
Extracting config data...done
ziorep_traffic: Could not retrieve initial data - data files corrupted or broken, or the .agg file is missing.
Behavior with fix applied:
$ ziomon -d 5 -o log /dev/mapper/mpathc
Check devices...done
NOTE: No size limit specified, run without a limit.
Estimated maximum disk space required for log data: approx. <1 MBytes
Collecting configuration data...done
Start data collection processes...done
Collecting data...done
Shutting down
Shutting down data manager
More information is shown when ziomon is run with the -V option:
...
=== WRP_LUNS : 10:0:0:1089814659 11:0:0:1089814659
=== WRP_HOST_ADAPTERS: host10 host11
...
=== starting ziomon_util: ziomon_util -V -a 10 -a 11 -l 10:0:0:1089814659 -l 11:0:0:1089814659 ...
...
Test example with duplicate SCSI host IDs:
$ multipath -l
...
mpathc (36005076307ffc5e300000000000083f5) dm-1 IBM ,2107900
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=0 status=active
|- 0:0:0:1089814659 sdb 8:16 active undef running
`- 10:0:0:1089814659 sdf 8:80 active undef running
...
Behavior without fix applied:
$ ziomon -d 5 -o log /dev/mapper/mpathc
Check devices...done
ziomon: Number of LUNs does not match number of devices: 2 devices and 1 LUNs
More information is shown when ziomon is run with the -V option:
...
=== #Devices total : 2
=== WRP_DEVICES : /dev/sdb /dev/sdf
=== WRP_LUNS : 0:0:0:1089814659
=== WRP_HOST_ADAPTERS: host0
...
Behavior with fix applied:
$ ziomon -d 5 -o log /dev/mapper/mpathc
Check devices...done
NOTE: No size limit specified, run without a limit.
Estimated maximum disk space required for log data: approx. <1 MBytes
Collecting configuration data...done
Start data collection processes...done
Collecting data...done
Shutting down
Shutting down data manager
More information is shown when ziomon is run with the -V option:
...
=== #Devices total : 2
=== WRP_DEVICES : /dev/sdb /dev/sdf
=== WRP_LUNS : 0:0:0:1089814659 10:0:0:1089814659
=== WRP_HOST_ADAPTERS: host0 host10
...
=== starting ziomon_util: ziomon_util -V -a 0 -a 10 -l 0:0:0:1089814659 -l 10:0:0:1089814659 ...
...
Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Signed-off-by: Fedor Loshakov <loshakov@linux.ibm.com>
---
ziomon/ziomon | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/ziomon/ziomon
+++ b/ziomon/ziomon
@@ -514,7 +514,7 @@ function check_for_multipath_devices() {
(( i+=2 ));
while [[ `echo "${mp_arr[$i]:0:1}" | grep -ve "[0-9a-zA-Z]"` ]] && [ $i -lt ${#mp_arr[@]} ]; do
if [ `echo ${mp_arr[$i]} | grep -e "[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}" | wc -l` -ne 0 ]; then
- line="`echo ${mp_arr[$i]} | sed 's/.*\([0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}\)/\1/'`";
+ line="`echo ${mp_arr[$i]} | sed 's/[^0-9]*\([0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}:[0-9]\{1,\}\)/\1/'`";
checked_devs[${#checked_devs[@]}]=`echo $line | awk '{print "/dev/"$2}'`;
ddebug " adding ${checked_devs[${#checked_devs[@]}-1]}";
WRP_HOST_ADAPTERS[${#WRP_HOST_ADAPTERS[@]}]="host${line%%:*}";