The Linux computer crashed. Upon restart, it wanted a disk check. Fair enough. But then when it rebooted, it went to the recovery console. Uh-oh, something is up. I went to Advanced Options and ran a dpkg check, which found a few things to correct before I could reboot back into the GUI. At first I thought the OS drive was bad, but it turned out that the data drive was the one with the error.
Upon the next reboot, my RAID card gave me a warning, “HDD may be not available. Please contact…” but when I went into the RAID menu, all drives were good. Hmmm. Does the ASMedia really read the disks’ SMART status? Once inside Ubuntu, I checked the SMART status of my drives using smartctl:
sudo smartctl -d sat --all /dev/sdx -H
The OS drive was fine, but the RAID volume came back as DISK IS LIKELY TO FAIL SOON, even though the RAID menu reported both disks as fine. While smartctl is very useful, it cannot look behind the ASMedia controller to tell me which disk was failing. The card said fine, the OS said not fine. Who do I trust? Ubuntu. Bottom line: SMART is not to be ignored.
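If you would rather script this check than read the report by eye, smartctl also encodes its findings in its exit status, a bitmask described in the EXIT STATUS section of its man page. Below is a minimal sketch of a decoder for that bitmask. The function name explain_smartctl_status is my own invention, the messages are paraphrased from the man page, and /dev/sdX in the comment is a placeholder for your device, so check it against your smartctl version before relying on it.

```shell
#!/bin/sh
# Sketch: translate smartctl's exit status bitmask into plain English.
# explain_smartctl_status is a hypothetical helper, not part of smartmontools.
# Bit meanings are paraphrased from the EXIT STATUS section of smartctl(8).
explain_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems flagged"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "command line did not parse"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "device could not be opened"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "a SMART command failed or returned bad data"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "SMART status: DISK FAILING"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "prefail attribute at or below threshold"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "attribute was at or below threshold in the past"; fi
    if [ $((status & 64)) -ne 0 ]; then echo "device error log contains errors"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "self-test log contains errors"; fi
    return 0
}

# Typical use (needs root and a real device, so it is left commented out):
#   sudo smartctl -d sat -H /dev/sdX >/dev/null
#   explain_smartctl_status $?
```

Run from cron, something like this gives you a warning that does not depend on what the RAID card's own menu chooses to show.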
First, I immediately did a backup. Success. I then popped down to my local Microcenter, purchased two new (price-matched!) 4TB Seagate IronWolf drives, and set up a new RAID1. Why? For one, all the drives were still working and no data had been lost. So why not start fresh, reset the clock on the drives to late 2021, and gain an extra TB of space?
A full restore just takes a lot of time, but everything is safe again.