The horrible lurking bug
I tried to be conservative and thus gave a generous 40GB to my root filesystem. On the other hand, I used btrfs. I recently updated to F22 and this morning tried to yum update (yes, dnf, I know…). You probably already know what’s coming:
Cannot open /var/cache/dnf/x86_64/22/updates/packages/fakeroot-libs-1.20.2-1.fc22.x86_64.rpm: No space left on device: u'Input/Output error'
write error
And now, of course, the usual spew about corrupt database.
I’m sure it doesn’t surprise you that df shows 18GB still free, nor that btrfs fi show says that all 39.06GiB are used. What might surprise you is that btrfs fi df / doesn’t show any resource as fully utilized or even close.
btrfs fi show
Label: 'system' uuid: ba2520ab-7fcf-4601-bf0f-b4d1fc10258c
Total devices 1 FS bytes used 20.41GiB
devid 1 size 39.06GiB used 39.06GiB path /dev/sda5
btrfs fi df /
Data, single: total=36.53GiB, used=18.14GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.25GiB, used=889.53MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=304.00MiB, used=0.00B
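One way to read the two outputs together is to add up the raw space that the chunks in btrfs fi df occupy on the device. The sketch below is only that, a sketch, and it rests on three assumptions that are not stated in the post: that the "used" figure in btrfs fi show means raw space allocated to chunks, that DUP chunks take twice their logical size on disk, and that GlobalReserve is carved out of metadata rather than allocated separately. The names raw_allocated and to_bytes are made up for illustration.

```python
import re

# Output of `btrfs fi df /` exactly as quoted above.
FI_DF = """\
Data, single: total=36.53GiB, used=18.14GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, DUP: total=1.25GiB, used=889.53MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=304.00MiB, used=0.00B
"""

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# Raw copies stored per profile. Only the two profiles that appear in this
# output are listed; the factor of 2 for DUP is an assumption of this sketch.
COPIES = {"single": 1, "DUP": 2}

LINE = re.compile(
    r"(\w+), (\w+): total=([\d.]+)(B|[KMGT]iB), used=([\d.]+)(B|[KMGT]iB)"
)


def to_bytes(number, unit):
    """Convert a value such as ('36.53', 'GiB') to bytes."""
    return float(number) * UNITS[unit]


def raw_allocated(fi_df_text):
    """Sum the raw device space taken by every chunk type listed."""
    total = 0.0
    for kind, profile, t_num, t_unit, _u_num, _u_unit in LINE.findall(fi_df_text):
        if kind == "GlobalReserve":
            # Assumption: the reserve lives inside metadata, so skip it
            # to avoid counting that space twice.
            continue
        total += to_bytes(t_num, t_unit) * COPIES[profile]
    return total


allocated_gib = raw_allocated(FI_DF) / 2**30
print(f"raw space allocated to chunks: {allocated_gib:.2f} GiB")
```

If those assumptions hold, the sum comes to about 39.06 GiB, the same as the "used 39.06GiB" in btrfs fi show. In other words, the whole device may already be handed out to chunks even though the chunks themselves are far from full.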
I’m a little scared to even try to rpm --rebuilddb before figuring out the discrepancy between btrfs fi show and btrfs fi df /.
Clearly I’m missing something and so far the google searches I’ve done are not enlightening me. It would be nice to get my computer back. ☺
Eugene Crosser June 30, 2015 07:28
I recall similar problems discussed on the btrfs mailing list a couple of years ago.
Kevin Otte June 30, 2015 07:52
I had btrfs on my netbook and its measly 4GB drive so I could leverage compression. At some point along the line it imploded. Since with btrfs it’s not ‘if’ but ‘when’ you’ll get corruption, I switched back to ext4.
Michael K Johnson June 30, 2015 08:00
+Eugene Crosser I was assuming that it was just something I don’t know about btrfs rather than a bug. Do you remember whether the btrfs mailing list discussion treated it as a bug or a feature?
+Kevin Otte I have an SSD in this machine so a filesystem with checksums seems valuable…
Eugene Crosser June 30, 2015 08:44
It was a bug, I think, but there may have been a way to mitigate. Don’t remember the details, but it must be googlable.
Michael K Johnson June 30, 2015 09:12
All the searches I have tried so far are swamped by results that assume the normal case that you need to rebalance metadata. Searching some more and trying a hint from another page:
btrfs fi balance /
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
(There is of course no additional useful information in dmesg output; just btrfs relocating block groups and then failing due to ENOSPC.)
Michael K Johnson June 30, 2015 10:22
Aha! I can stop trying to search for a non-bug source for this problem. +Eugene Crosser is right.
Can’t mention the IRC channel by name without G+ deciding that it is a hashtag, but it appears that I have encountered “this ENOSPC problem” aka “the horrible lurking bug”. No data loss, just the annoyance of a filesystem that thinks it is full.
The downside of using a filesystem that its own developers say isn’t quite ready for production use is, of course, that you have to be prepared for the possibility that they are right. Worst case, I need to recreate my root filesystem from a complete backup that I just made. Oh, well. At least I have a metadata image for them if they need another one to help debug the problem.
Michael K Johnson June 30, 2015 11:21
Worked around!
I had another 40GB partition on the second drive in the system. It was my old root partition from when the second drive was the primary drive. /dev/sda5 is my root btrfs filesystem that showed the problem. /dev/sdb1 is the old root partition. I unmounted that (haven’t needed it for a long time) and ran:
btrfs device add -f /dev/sdb1 /
btrfs device delete /dev/sda5 /
btrfs device add /dev/sda5 /
btrfs device delete /dev/sdb1 /
I needed the -f because /dev/sdb1 had an ext4 filesystem and btrfs device add protected it from being overwritten by default. I was then able to complete a system upgrade. I didn’t even need rpm --rebuilddb because once the filesystem had space it recovered without help.
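For anyone wanting to see the order of operations spelled out, here is a small dry-run sketch that only builds and prints the four commands; it does not run them. The function name swap_plan is hypothetical, and the mount point "/" is an assumption based on this being the root filesystem, since btrfs device add and btrfs device delete both take the mounted path as their last argument.

```python
def swap_plan(stuck_dev, spare_dev, mountpoint, force=False):
    """Return the device-swap commands in order, as argument lists.

    stuck_dev  -- the device whose filesystem reports ENOSPC
    spare_dev  -- a temporary device at least as large as the data in use
    mountpoint -- where the btrfs filesystem is mounted (assumed "/" here)
    force      -- pass -f when the spare still carries another filesystem
    """
    add_spare = ["btrfs", "device", "add"]
    if force:
        add_spare.append("-f")
    add_spare += [spare_dev, mountpoint]
    return [
        add_spare,                                              # grow onto the spare
        ["btrfs", "device", "delete", stuck_dev, mountpoint],   # migrate everything off
        ["btrfs", "device", "add", stuck_dev, mountpoint],      # bring the original back
        ["btrfs", "device", "delete", spare_dev, mountpoint],   # migrate back, free the spare
    ]


plan = swap_plan("/dev/sda5", "/dev/sdb1", "/", force=True)
for cmd in plan:
    print(" ".join(cmd))
```

Each delete step moves all data off the named device before removing it, so it can take a while and needs the other device to have room for everything in use.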
Eugene Crosser June 30, 2015 14:07
Btrfs has been “not quite ready” for how long, ten years? I use it because I like to play and agree to pay with extra risk, but for “real work” btrfs looks less and less compelling.
Michael K Johnson June 30, 2015 17:51
Well, it’s a question of what tradeoffs you are making and what risks you are mitigating. On my laptop with SSD and what I think are good backups, I’ll take the risk of filesystem failure in exchange for data checksums that notice underlying device data corruption. On servers with IO subsystems I trust more to give me good data or explicit errors, I use ext4 or xfs.
It may be that btrfs was harder than they originally expected. :-)
Imported from Google+ — content and formatting may not be reliable