This patch addresses a scenario observed in production where disk links go
down. After a system reboot, depending on which disk becomes available first,
the IMSM RAID array may either fully assemble or come up with missing disks.

Below is an example of the production case, simulating disk link failures and
a subsequent system reboot. (Note: 'echo "1" | sudo tee
/sys/class/scsi_device/x:x:x:x/device/delete' is used here to
fail/unplug/disconnect disks.)

RAID configuration: IMSM RAID1 with two disks.

- When sda is unplugged first, then sdb, and after reboot sdb is reconnected
  first followed by sda, the container (/dev/md127) and subarrays (/dev/md125,
  /dev/md126) assemble correctly and become active.
- However, when sda is reconnected first, then sdb, the subarrays fail to
  fully reconstruct: sda remains missing from the assembled subarrays because
  of its stale metadata.

The above behavior is driven by udev event handling:

- When a disk disconnects, the rule
  ACTION=="remove", ENV{ID_PATH}=="?*", RUN+="/usr/sbin/mdadm -If $devnode --path $env{ID_PATH}"
  is triggered to inform mdadm of the removal.
- When a disk reconnects (i.e., ACTION!="remove"), the rule
  IMPORT{program}="/usr/sbin/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}"
  is triggered to incrementally assemble the RAID arrays.

During assembly, the array may not come up fully because some disks carry
stale metadata. This patch adds a udev-triggered script that detects this
failure and brings the missing disks back into the array. The script inspects
the RAID configuration reported by /usr/sbin/mdadm --detail --scan --export,
identifies disks that belong to a container array but are missing from the
corresponding member (sub)arrays, and restores them by performing a hot
remove-and-re-add cycle.

The patch improves resilience by ensuring consistent array reconstruction
regardless of disk detection order. This aligns system behavior with the
expected RAID redundancy and reduces the risk of unnecessary manual recovery
steps after reboots in degraded hardware environments.
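For reference, the detection keys off the per-array properties in the
--export output. The sketch below shows the kind of listing involved for the
failing two-disk case; the array names, key spellings, and values are
illustrative, and only the keys the script parses (MD_LEVEL, MD_DEVNAME,
MD_DEVICES, MD_CONTAINER, MD_DEVICE_dev*_DEV) are shown:

  # /usr/sbin/mdadm --detail --scan --export   (abridged, illustrative)
  MD_LEVEL=container              <- container, e.g. /dev/md127
  MD_DEVNAME=imsm0
  MD_DEVICE_dev0_DEV=/dev/sda     <- container still lists both disks
  MD_DEVICE_dev1_DEV=/dev/sdb
  MD_LEVEL=raid1                  <- member (sub)array, e.g. /dev/md126
  MD_DEVNAME=vol0
  MD_DEVICES=2                    <- expects two member disks ...
  MD_CONTAINER=/dev/md/imsm0
  MD_DEVICE_dev1_DEV=/dev/sdb     <- ... but only sdb is present

A member array that reports fewer MD_DEVICE_*_DEV entries than MD_DEVICES,
while its container still lists the disk, marks that disk (here sda) as
missing; the script then hot-removes it (mdadm -If) and re-adds it to the
container's resolved device node.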
Signed-off-by: Richard Li <tianqi.li@xxxxxxxxxx>
---
 imsm_rescue.sh              | 148 ++++++++++++++++++++++++++++++++++++
 udev-md-raid-assembly.rules |   3 +
 2 files changed, 151 insertions(+)
 create mode 100644 imsm_rescue.sh

diff --git a/imsm_rescue.sh b/imsm_rescue.sh
new file mode 100644
index 00000000..7dcb0773
--- /dev/null
+++ b/imsm_rescue.sh
@@ -0,0 +1,148 @@
+#!/bin/bash
+# Check IMSM Raid array health and bring up failed/missing disk members
+
+mdadm_output=$(/usr/sbin/mdadm --detail --scan --export)
+export MDADM_INFO="$mdadm_output"
+
+lines=$(echo "$MDADM_INFO" | grep '^MD_')
+
+arrays=()
+array_indexes=()
+index=0
+current=()
+
+# Parse mdadm_output into arrays
+while IFS= read -r line; do
+    if [[ $line == MD_LEVEL=* ]]; then
+        if [[ ${#current[@]} -gt 0 ]]; then
+            arrays[index]="${current[*]}"
+            array_indexes+=($index)
+            current=()
+            index=$((index + 1))
+        fi
+    fi
+    current+=("$line")
+done <<< "$lines"
+
+if [[ ${#current[@]} -gt 0 ]]; then
+    arrays[index]="${current[*]}"
+    array_indexes+=($index)
+fi
+
+# Parse containers and map them to disks
+container_names=()
+container_disks=()
+
+for i in "${array_indexes[@]}"; do
+    IFS=' ' read -r -a props <<< "${arrays[$i]}"
+
+    level=""
+    devname=""
+    disks=""
+
+    for entry in "${props[@]}"; do
+        key="${entry%%=*}"
+        val="${entry#*=}"
+
+        case "$key" in
+            MD_LEVEL) level="$val" ;;
+            MD_DEVNAME) devname="$val" ;;
+            MD_DEVICE_dev*_DEV) disks+=" $val" ;;
+        esac
+    done
+
+    if [[ "$level" == "container" && -n "$devname" ]]; then
+        container_names+=("$devname")
+        container_disks+=("${disks# }")
+    fi
+done
+
+# Check and find missing disks of each container and their subarrays
+containers_with_missing_disks_in_subarray=()
+missing_disks_list=()
+
+for i in "${array_indexes[@]}"; do
+    IFS=' ' read -r -a props <<< "${arrays[$i]}"
+
+    level=""
+    container_path=""
+    devname=""
+    devices=""
+    present=()
+
+    for entry in "${props[@]}"; do
+        key="${entry%%=*}"
+        val="${entry#*=}"
+
+        case "$key" in
+            MD_LEVEL) level="$val" ;;
+            MD_DEVNAME) devname="$val" ;;
+            MD_DEVICES) devices="$val" ;;
+            MD_CONTAINER) container_path="$val" ;;
+            MD_DEVICE_dev*_DEV) present+=("$val") ;;
+        esac
+    done
+
+    if [[ "$level" == "container" || -z "$devices" ]]; then
+        continue
+    fi
+
+    present_count="${#present[@]}"
+    if (( present_count < devices )); then
+        container_name=$(basename "$container_path")
+        # if MD_CONTAINER is empty, then it's a regular raid
+        if [[ -z "$container_name" ]]; then
+            continue
+        fi
+
+        container_real=$(realpath "$container_path")
+
+        if [[ -z "$container_real" ]]; then
+            continue
+        fi
+
+        # Find disks in container
+        container_idx=-1
+        for j in "${!container_names[@]}"; do
+            if [[ "${container_names[$j]}" == "$container_name" ]]; then
+                container_idx=$j
+                break
+            fi
+        done
+
+        if (( container_idx >= 0 )); then
+            container_disk_line="${container_disks[$container_idx]}"
+            container_missing=()
+
+            for dev in $container_disk_line; do
+                found=false
+                for pd in "${present[@]}"; do
+                    [[ "$pd" == "$dev" ]] && found=true && break
+                done
+                $found || container_missing+=("$dev")
+            done
+
+            if (( ${#container_missing[@]} > 0 )); then
+                containers_with_missing_disks_in_subarray+=("$container_real")
+                missing_disks_list+=("${container_missing[*]}")
+            fi
+        fi
+    fi
+done
+
+# Perform a hot remove-and-re-add cycle to bring missing disks back
+for idx in "${!containers_with_missing_disks_in_subarray[@]}"; do
+    container="${containers_with_missing_disks_in_subarray[$idx]}"
+    missing_disks="${missing_disks_list[$idx]}"
+
+    for dev in $missing_disks; do
+        id_path=$(udevadm info --query=property --name="$dev" | grep '^ID_PATH=' | cut -d= -f2)
+
+        if [[ -z "$id_path" ]]; then
+            continue
+        fi
+
+        /usr/sbin/mdadm -If "$dev" --path "$id_path"
+        /usr/sbin/mdadm --add --run --export "$container" "$dev"
+    done
+done
diff --git a/udev-md-raid-assembly.rules b/udev-md-raid-assembly.rules
index 4cd2c6f4..fc210437 100644
--- a/udev-md-raid-assembly.rules
+++ b/udev-md-raid-assembly.rules
@@ -41,6 +41,9 @@ ACTION=="change", KERNEL!="dm-*|md*", GOTO="md_inc_end"
 ACTION!="remove", IMPORT{program}="BINDIR/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}"
 ACTION!="remove", ENV{MD_STARTED}=="*unsafe*", ENV{MD_FOREIGN}=="no", ENV{SYSTEMD_WANTS}+="mdadm-last-resort@$env{MD_DEVICE}.timer"
+# do a health check and try to bring up missing disk members
+ACTION=="add", RUN+="./imsm_rescue.sh"
+
 ACTION=="remove", ENV{ID_PATH}=="?*", RUN+="BINDIR/mdadm -If $devnode --path $env{ID_PATH}"
 ACTION=="remove", ENV{ID_PATH}!="?*", RUN+="BINDIR/mdadm -If $devnode"
 
 LABEL="md_inc_end"
--
2.43.5