Linux multipath partitions unavailable after cold restart

Hello guys

Yesterday I shut down one of our legacy Linux servers for the first time in a long while, maybe almost 1.5 years. Once it restarted, I started getting alert emails from cron jobs that depend on mount points on that server. After a quick check I found that a few of the UUIDs I had referenced in /etc/fstab were missing from the output of the “blkid” command…
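A quick way to spot that kind of mismatch (my own sketch, not part of the original troubleshooting; the helper function is hypothetical) is to pull the UUIDs out of an fstab-style file and compare them with what `blkid` currently reports:

```shell
# Hypothetical helper: print the UUIDs an fstab-style file references,
# one per line. Comment lines do not start with "UUID=", so they are skipped.
list_fstab_uuids() {
    grep -o '^UUID=[^[:space:]]*' "$1" | cut -d= -f2
}

# On a live system, something like:
#   comm -23 <(list_fstab_uuids /etc/fstab | sort) <(blkid -s UUID -o value | sort)
# would print UUIDs that fstab expects but blkid no longer sees.
```

Any UUID printed by that `comm` line is a mount that will fail at boot.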

Please note that I am not a Linux/storage expert by any stretch. “Finding” this solution was a blind shot, as our technical support people were too busy to answer calls. Please do not copy these steps on a production instance! I took the risk because neither the data nor the server itself is significant for us, and we had the freedom to rebuild it whenever needed.

The setup: an IBM x3560 Linux server running OEL 6, attached over an HBA interface to an IBM DS3200 storage device.

Running fdisk -l listed the following:

Disk /dev/mapper/mpathb: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 

             Device Boot      Start         End      Blocks   Id  System

Disk /dev/mapper/mpathc: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 
			
             Device Boot      Start         End      Blocks   Id  System

whereas I remembered that the listings used to look like this:

Disk /dev/mapper/mpathb: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 

             Device Boot      Start         End      Blocks   Id  System
/dev/mapper/mpathbp1               1       78325   629145531   83  Linux

Disk /dev/mapper/mpathc: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 

             Device Boot      Start         End      Blocks   Id  System
/dev/mapper/mpathcp1               1       26108   209712478+  83  Linux

In addition, I noticed two devices, /dev/sdd and /dev/sde, with the same sizes as the multipath devices, but without any partitions.

After rebooting twice, I decided to create new partitions on /dev/sdd and /dev/sde, which succeeded. However, when I tried to format these newly created partitions, I got “/dev/sdd1 is apparently in use by the system; will not make a filesystem here” and “/dev/sde1 is apparently in use by the system; will not make a filesystem here”, which forced me to restart the server once again.
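In hindsight, the “apparently in use” message was most likely device-mapper still holding the underlying SCSI disks: the kernel knows sdd and sde are paths under the multipath maps, so mkfs refuses to write to them directly. One way to check this (my sketch; the helper is hypothetical, and the sysfs path is the layout modern kernels use) is to look at a disk's holders:

```shell
# Sketch: list the device-mapper nodes that are holding a SCSI disk busy.
# Pass a bare device name such as "sdd".
show_holders() {
    dev=$1
    # Each entry under holders/ is a dm-N device stacked on top of this disk.
    ls "/sys/block/$dev/holders" 2>/dev/null || echo "no holders for $dev"
}

# e.g. "show_holders sdd" would have listed dm-0 (mpathb) on my box,
# explaining why mkfs refused to touch /dev/sdd1.
```

If a disk has a dm holder, the filesystem belongs on the /dev/mapper/ device, not on the raw /dev/sdX path.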

To my utter surprise, once the machine booted up, all my mount points were back online without my doing anything else.

[root@xyz multipath]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdk2              59G   11G   46G  20% /
tmpfs                  10G  228K   10G   1% /dev/shm
/dev/sdk1             2.0G  330M  1.5G  18% /boot
/dev/sdk5             738G  632G   69G  91% /u01
/dev/mapper/mpathbp1  591G  332G  229G  60% /u02
/dev/mapper/mpathcp1  197G  134G   54G  72% /u03
/dev/sdg1             591G   70M  561G   1% /u04
/dev/sdj1             269G   59M  256G   1% /u05
/dev/sda1             917G  765G  107G  88% /usbdrive
/dev/sdb1             1.8T  642G  1.1T  37% /RDX
Disk /dev/sdd: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00035652

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1       78325   629145531   83  Linux

Disk /dev/sde: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002b47b

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1       26108   209712478+  83  Linux

Disk /dev/mapper/mpathb: 644.2 GB, 644245094400 bytes
255 heads, 63 sectors/track, 78325 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00035652

             Device Boot      Start         End      Blocks   Id  System
/dev/mapper/mpathbp1               1       78325   629145531   83  Linux

Disk /dev/mapper/mpathc: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0002b47b

             Device Boot      Start         End      Blocks   Id  System
/dev/mapper/mpathcp1               1       26108   209712478+  83  Linux

Before deciding to create the partitions once again, I had cross-verified that the multipath daemon was loaded and that I could see the path information. To someone who is primarily a Windows person, the whole thing looked like a messed-up “File Allocation Table”.

[root@erp-prodbak ~]# multipath -ll
mpathc () dm-1 IBM,1726-2xx  FAStT
size=200G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| `- 2:0:0:2  sde 8:64  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:2  sdi 8:128 active ghost running
mpathb () dm-0 IBM,1726-2xx  FAStT
size=600G features='1 queue_if_no_path' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| `- 2:0:0:1  sdd 8:48  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 3:0:0:1  sdh 8:112 active ghost running

I also confirmed that the WWIDs in use matched the entries in the /etc/multipath/wwids and bindings files.
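My best guess at what the final reboot actually did: multipathd rebuilt the maps and kpartx re-read the freshly written partition tables, producing the mpathbp1/mpathcp1 nodes again. The same might have been achievable online. The helper below is a hypothetical dry run that only prints the commands it would issue; drop the `echo`s to actually run them (and again, not on anything you care about):

```shell
# Hypothetical dry-run helper: print (rather than execute) the commands
# that would re-create partition mappings for a multipath map.
refresh_mpath_partitions() {
    map=$1
    # Ask the kernel to re-read the map's partition table.
    echo "partprobe /dev/mapper/$map"
    # Add partition device-mapper mappings such as mpathbp1.
    echo "kpartx -a /dev/mapper/$map"
}

refresh_mpath_partitions mpathb
refresh_mpath_partitions mpathc
```

Had I known this at the time, I might have avoided the extra reboots entirely.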

Well, maybe I was just lucky to “get it done” this time without understanding what actually went wrong. Do not apply this “solution” to a production environment if you are dealing with important data!

regards,

rajesh