Device: /dev/sda, 1 Currently unreadable (pending) sectors
I sometimes get these annoying messages from smartd. They only occur with Samsung HD103SIS SATA disks. According to my Google research, these messages are harmless and a Samsung peculiarity, which is one reason for me to avoid Samsung in the future. The messages should disappear after a reboot. However, since the disks are part of a RAID1 on a 24x7 server, I want to avoid rebooting the server. Instead, I can get rid of these messages without a reboot by overwriting the whole disk with zeros.
Removing the disk from the array
The first step is to remove the affected disk from the array. This is done by first failing the disk and then removing it. In my case, the disk is part of the three RAID1 volumes md0, md1 and md2:
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md2 --fail /dev/sda3
mdadm --manage /dev/md2 --remove /dev/sda3
mdadm --manage /dev/md1 --fail /dev/sda4
mdadm --manage /dev/md1 --remove /dev/sda4
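The fail/remove pairs above can also be generated in a loop. This is only a minimal sketch, assuming the same array-to-partition mapping as in my setup. The names `run`, `md_detach` and `DRY_RUN` are made up for this example and are not part of mdadm. By default the commands are only printed, not executed.

```shell
# Hypothetical helper: print the command (default) or execute it
# when DRY_RUN is explicitly set to 0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: fail and then remove each partition from its array.
# Arguments are "array:partition" pairs, e.g. md0:sda1.
md_detach() {
    for pair in "$@"; do
        md=${pair%%:*}      # part before the colon, e.g. md0
        part=${pair#*:}     # part after the colon, e.g. sda1
        run mdadm --manage "/dev/$md" --fail "/dev/$part"
        run mdadm --manage "/dev/$md" --remove "/dev/$part"
    done
}
```

With my layout the call would be `md_detach md0:sda1 md2:sda3 md1:sda4`. Check the printed commands first, then repeat with `DRY_RUN=0` to really run them.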
Zeroing the disk
Since the disk has been removed from the RAID, it is now safe to overwrite it with zeros. As soon as the disk is completely zeroed out, the messages disappear. In my case, the disk is 1TB:
cat /dev/zero | pv -s 1000G | dd of=/dev/sda bs=100M
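To check the result without waiting for the next smartd message, the pending-sector counter can be read directly from the SMART attribute table. A minimal sketch, assuming the usual `smartctl -A` table layout, where the raw value is the tenth column and the attribute is called `Current_Pending_Sector`. The function name `pending_sectors` is made up for this example.

```shell
# Hypothetical helper: read a "smartctl -A" attribute table on stdin and
# print the raw value (10th column) of the Current_Pending_Sector attribute.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage (needs root and smartmontools):
#   smartctl -A /dev/sda | pending_sectors
```

After zeroing, this should print 0 if the pending sector was cleared.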
Recreating the partition layout
Now you have to recreate the partition layout with fdisk or cfdisk. In my case, the two RAID1 disks are identical, so I can copy the partition layout from the other disk:
sfdisk -d /dev/sdb | sfdisk /dev/sda
sfdisk -R /dev/sda
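Before adding the disk back, it can be worth checking that both disks really have the same layout now. A minimal sketch, assuming MBR-style `sfdisk -d` dumps and plain `/dev/sdX` device names. The function names `normalize_dump` and `same_layout` are made up for this example.

```shell
# Hypothetical helper: strip everything from an sfdisk dump that
# legitimately differs between two disks (comment lines, the disk
# identifier, the device line, and the /dev/sdX prefix of each partition).
normalize_dump() {
    sed -e '/^#/d' -e '/^label-id:/d' -e '/^device:/d' -e 's#^/dev/[a-z]*##'
}

# Hypothetical helper: succeeds (exit status 0) if the two dump files
# describe the same partition layout.
same_layout() {
    [ "$(normalize_dump < "$1")" = "$(normalize_dump < "$2")" ]
}

# Usage:
#   sfdisk -d /dev/sda > sda.dump
#   sfdisk -d /dev/sdb > sdb.dump
#   same_layout sda.dump sdb.dump && echo "layouts match"
```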
Adding the disk to the RAID
This is as simple as before:
mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md2 --add /dev/sda3
mdadm --manage /dev/md1 --add /dev/sda4
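After the partitions are added, md rebuilds them from the other disk, and the arrays stay degraded until that has finished. A minimal sketch for listing the arrays that are still degraded, assuming the usual /proc/mdstat format where a missing or rebuilding member shows up as `_` in a status like `[_U]`. The function name `degraded_arrays` is made up for this example.

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# name of every array whose member status contains an underscore.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/         { name = $1 }    # remember the current array
        /\[[U_]*_[U_]*\]/     { print name }   # status such as [_U] or [U_]
    '
}

# Usage:
#   degraded_arrays < /proc/mdstat
```

An empty output means that all arrays are complete again.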
Zeroing the disk also wiped its boot loader, so grub has to be reinstalled. In my case I installed grub on both disks, so that if a single disk fails, the system is still fully bootable and running.
If you use symbolic names (like me), just reinstall on both devices:
grub-install /dev/disk/by-id/scsi-SATA_SAMSUNG_HD103SIS1VSJ1KS300499
grub-install /dev/disk/by-id/scsi-SATA_SAMSUNG_HD103SIS1VSJ1KS300505
Niki Hammler said …
<comment date="2012-08-30T08:30:04Z" name="Niki Hammler"> One additional comment: The problem is NOT solved by rebooting. It seems that zeroing out is necessary :-( </comment>
mike said …
<comment date="2012-12-26T17:07:31Z" name="mike"> Hello! I have /dev/sdb with 3TB. How do I zero the disk for 3TB?
cat /dev/zero | pv -s 3000G | dd of=/dev/sda bs=100M
Or is this not right? Please show the correct command.
I'm afraid of making a mistake.
Niki said …
<comment date="2012-12-29T17:29:49Z" name="Niki"> Are you sure you have the same config as I (RAID1)? Only in this case my howto makes sense, otherwise you will destroy all your data!!
If you are sure (really sure!) then the disk in the command must match:
cat /dev/zero | pv -s 3000G | dd of=/dev/sdb bs=100M
James Hightower said …
<comment date="2013-06-13T15:37:50Z" name="James Hightower"> I think you can skip the zeroing part. Unless you have write-intent bitmaps enabled, just removing and re-adding a drive will cause MD to overwrite the entire disk, including your pending bad blocks. This has worked for me many times. Remember, though, that the SMART Offline_Uncorrectable won't get updated until the next offline collection occurs, like with smartctl -t offline. </comment>
Niki said …
<comment date="2013-06-14T05:27:11Z" name="Niki"> Dear James,
Thank you for your comment! I will try it out the next time I encounter the message!