I am using Debian 11.7 with a 4 TB Western Digital hard drive. One of my partitions, /dev/sdb15, supposedly had up to eight bad sectors; the errors showed up in the system journal. That file system holds virtual machines and is not required for the general operation of the Linux computer. The drive's logical block size and physical sector size are both 4096 bytes.

I have tried running badblocks via mke2fs on both Debian 10 and Debian 11, but it always locked up the machine and caused trouble, so that really isn't an option for me. Instead, I wrote a bash script that used dd and /dev/zero to create a huge number of 1 MiB files, using up all of the free blocks on the file system. Then I read all of those files back, and only one of them appeared to have a bad sector. I read every file 100 times in a row using direct I/O (iflag=direct with dd) so dd wouldn't keep re-reading the same file from the system cache. When it reached the file with the bad sector, the computer locked up because it couldn't access the hard drive.

I isolated the file with the bad sector and finished reading all of the other files without any trouble. I could use /usr/bin/cat to verify that the 1 MiB file had a bad sector before removing it. Then I erased the problematic file and created 256 new files, each 4096 bytes, in a different subdirectory, so I could read those and isolate the bad block in a single, smaller file. These two directories were already using quite a lot of space, much as lost+found does; I was able to read the directory just fine with /usr/bin/ls, and e2fsck -fv /dev/sdb15 didn't complain about any problems with it. After creating the 256 4096-byte files, none of them appeared to have any bad sectors, even when I read each one hundreds of times using direct I/O. How can a sector go bad, test as bad numerous times, and then suddenly not be bad anymore?
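The fill-and-read script went roughly like this. This is a sketch, not the exact script: TARGET and MAX are demonstration defaults I've put in as stand-ins, so point TARGET at the affected file system's mount point and raise MAX high enough to actually exhaust its free space.

```shell
#!/bin/bash
# Sketch of the fill-and-read approach. TARGET and MAX are demonstration
# defaults (stand-ins): point TARGET at the affected filesystem and raise
# MAX high enough to exhaust its free space.
TARGET=${TARGET:-$(mktemp -d)}
MAX=${MAX:-8}

# Phase 1: create 1 MiB files of zeros until the count limit is reached
# or dd fails because the filesystem is full.
i=0
while [ "$i" -lt "$MAX" ] &&
      dd if=/dev/zero of="$TARGET/fill.$i" bs=1M count=1 status=none; do
    i=$((i + 1))
done

# Phase 2: read each file back with direct I/O (iflag=direct) so the page
# cache is bypassed and every pass actually touches the drive. Fall back
# to a cached read where the filesystem (e.g. tmpfs) lacks O_DIRECT.
for f in "$TARGET"/fill.*; do
    dd if="$f" of=/dev/null bs=1M iflag=direct status=none 2>/dev/null ||
        dd if="$f" of=/dev/null bs=1M status=none ||
        echo "read error: $f" >&2
done
echo "wrote and read back $i files in $TARGET"
```

A file that repeatedly fails the read pass is the one sitting on the suspect sector.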
I have since erased the 256 smaller files and created a 1 MiB file again to hold on to the possibly bad sector. I have also written a bash script to erase all of the other 1 MiB files, because the command line can't handle over five hundred thousand arguments, and I don't want to remove the directory itself; I might need it again later. Removing them this way will take hours, so it is currently a work in progress. The crash didn't cause any real harm, because not much was happening on the system when it locked up and I had to shut it down, and the files were only being read, not written to.
When I tried to create a VM, one of the files supposedly had a bad sector, and /usr/bin/tar complained about this, saying that the file had shrunk, even though the VM was not running. The tar file itself then appeared to have a bad sector, but I didn't know that right away. Next I made a compressed file from the tar file using /usr/bin/gzip, but when I ran /usr/bin/gzip -t to test the file's integrity, it failed; the compressed file appeared to have a bad sector too. There was already another file that supposedly had a bad sector and had been isolated. I erased all of the files related to that VM, along with the other isolated file, and created all of those 1 MiB files in a directory set aside for that purpose. Only one of them had a bad sector when I tested them, and none of the 256 small files had any bad sectors at all. How can this be explained?

I believe I purchased the hard drive in 2021. A good quality hard drive should last ten to twenty years without any problems. This is not at all unreasonable, especially considering the advances in technology over the years. Anyone who believes otherwise has learned to accept a lower standard of quality in our throw-away culture. Western Digital is supposed to be a good quality brand. I sure wish I could get a new hard drive from Quantum; I hear those were really nice. Let me know if you want a copy of any of those bash scripts.
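For reference, the archive-and-verify sequence was along these lines. The paths and payload here are stand-ins, not the VM's real files; gzip -t reads the whole compressed stream and exits nonzero if the data is damaged, which is how the bad sector showed up.

```shell
#!/bin/bash
# Sketch of the tar + gzip integrity check, on a scratch directory.
# WORK and disk.img are stand-ins for the real VM files.
WORK=$(mktemp -d)
echo "stand-in VM payload" > "$WORK/disk.img"

# Archive and compress, then test integrity: gzip -t decompresses the
# whole stream without writing output and fails on any damaged data.
tar -C "$WORK" -cf "$WORK/vm.tar" disk.img
gzip -9 "$WORK/vm.tar"            # produces vm.tar.gz
if gzip -t "$WORK/vm.tar.gz"; then
    echo "archive OK"
else
    echo "archive damaged" >&2
fi
```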
Signed,
Matthew Campbell