ext3 nastiness
Argh, I feel like pulling out some hair! After wasting a few days shuffling hardware between my file server and my workstation, I have finally removed the parallel ATA disk that held the OS on my file server. I just went to delete a stack of files and *BAM!* same fucking error again!@# *bashes head against the wall*
EXT3-fs error (device sda1): ext3_free_blocks_sb: bit already cleared for block 16899952
Remounting filesystem read-only
EXT3-fs error (device sda1): ext3_free_blocks_sb: bit already cleared for block 16899953
…and so on…
Google thus far has been totally crapola and I’m stuck for the moment as to what to do next… Anyway, time for another damn fsck! That’ll give me some time to think about things…
Update!
One of the ways I was reproducing the error was by trying to delete some stray /var/log/apt-proxy.log* files which appeared not to have been picked up by logrotate and were all between 1 and 2 GB in size (yes, I had an excessively high debug level set at one stage). As soon as I ran rm on these logfiles, the error would occur and the filesystem would be thrown into read-only mode. The way I got around this was to zero the files using something like this:
for file in $(find /var/log/ -iname "apt-proxy.log.*"); do : > "${file}"; rm -f "${file}"; done
The rm of the files once they had been zeroed out seemed to work without reproducing the errors. Since then the box has been quite stable, so I can only presume that this was the root cause of the problem. Interesting, though, that the multitude of fscks didn’t find or fix the apparent ext3 problem. Really weird, I must say… Needless to say, I’ll be closely monitoring the situation.
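For what it’s worth, here is a slightly more careful sketch of the same truncate-then-delete trick. It is only an illustration: the function name zero_then_remove is made up for this post, and it assumes a POSIX shell and a find that supports -iname. It does not explain why zeroing first avoided the error; it just handles odd filenames (spaces and so on) that the one-liner above would trip over.

```shell
#!/bin/sh
# Sketch only: zero_then_remove is a hypothetical helper, not a standard tool.
# Usage: zero_then_remove DIR PATTERN
# Truncates every regular file under DIR whose name matches PATTERN
# (case-insensitively), then removes it.
zero_then_remove() {
    dir=$1
    pattern=$2
    # "-exec ... {} +" hands the filenames to the inner shell as arguments,
    # so names containing spaces or newlines are not split apart.
    find "$dir" -type f -iname "$pattern" -exec sh -c '
        for f in "$@"; do
            # Truncate to zero length first; only remove if that succeeded.
            : > "$f" && rm -f -- "$f"
        done
    ' sh {} +
}

# Example call (not run here):
#   zero_then_remove /var/log "apt-proxy.log.*"
```

Quoting the pattern matters: without the quotes the shell may expand the glob itself before find ever sees it.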
One thing that only dawned on me not too long ago was that when I did the whole disk shuffle I only created new partitions and filesystems on the new 160 GB SATA2 disk for my machine, not on the one that went into my file server. The steps were:
- insert 160 GB SATA2 into my machine (to replace the existing 120 GB SATA)
- create partitions & filesystems of the relevant size
- mount old & new filesystems while booted off alternate media (read: knoppix)
- rsync data from old disk to new disk
- move 120 GB SATA to fileserver
- dd if=/dev/sdb of=/dev/sda bs=4096 # /dev/sda being the 120 GB PATA & /dev/sdb being the 120 GB SATA
- fix grub & fstab entries and remove old PATA disk
It was obviously the dd step that caused the issues: dd copies the disk block for block, so whatever filesystem errors were introduced by the original motherboard dying in the ass were carried straight over to the new disk. I knew I should have just created the filesystems and used rsync like I did on my workstation. *slaps back of head* Oh well, it’s too late now. At least the problem appears to be resolved.
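For next time, the create-a-filesystem-and-rsync route could be sketched roughly as below. Everything specific here is an assumption for illustration: the function names run and migrate_fs, the DRY_RUN switch, the /mnt/old and /mnt/new mount points, and the placeholder device names are not what I actually typed. The idea is that rsync copies files into a freshly made filesystem, so the new disk gets its own clean block bitmaps instead of a clone of the old ones. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: all names, devices and mount points below are hypothetical.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: migrate_fs OLD_PARTITION NEW_PARTITION
# Creates a fresh ext3 filesystem on NEW_PARTITION and copies the files
# across from OLD_PARTITION at the file level, rather than cloning blocks.
migrate_fs() {
    old_part=$1
    new_part=$2
    run mkfs.ext3 "$new_part"
    run mkdir -p /mnt/old /mnt/new
    run mount -o ro "$old_part" /mnt/old
    run mount "$new_part" /mnt/new
    # -a: preserve permissions, owners and times; -H: keep hard links;
    # -x: stay on one filesystem; --numeric-ids: do not remap uid/gid by name.
    run rsync -aHx --numeric-ids /mnt/old/ /mnt/new/
    run umount /mnt/old
    run umount /mnt/new
}

# Example dry run (prints the commands, touches nothing):
#   migrate_fs /dev/sdX1 /dev/sdY1
```

Set DRY_RUN=0 only when booted off alternate media and after double-checking the device names, since mkfs.ext3 destroys whatever is on the target partition. Partitioning, grub and fstab fixes are left out of the sketch.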