fsync() after open() is an elaborate no-op

I have spent the last couple of years of my life trying to make sense of fsync() and bringing OpenZFS up to code. I’ve read a lot of horror stories about this apparently-simple syscall in that time, usually written by people who tried very hard to get it right but ended up losing data in different ways. I hesitate to say I enjoy reading these things, because they usually start with some catastrophic data loss situation and that’s just miserably unfair. At least, I think they’re important reads, and I’m always glad to see another story of fsync() done right, or done wrong.

Why fsync() on OpenZFS can’t fail, and what happens when it does

This presentation was given at BSDCan 2024. Abstract: On OpenZFS, fsync() cannot fail; it waits until the application’s changes are on disk before it returns. If a problem such as a hardware failure causes the pool to suspend, fsync() blocks until the pool returns. That could take seconds, hours, or forever, depending on the nature of the failure. Modern distributed systems can often cope with this type of failure by redirecting requests to another node, but they can only do this if fsync() returns an error instead of blocking.
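To make the last point concrete, here is a minimal sketch of the caller's side, assuming an fsync() that reports failure by returning an error. The helper name `durable_write` is hypothetical and not from the talk. Per the abstract, on a suspended OpenZFS pool the call would block rather than reach the error branch, which is exactly the behavior that prevents this kind of handling.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path; return True only if fsync() confirmed it.

    Hypothetical helper for illustration. A False result is the signal a
    distributed system could use to redirect the request to another node.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError:
        return False
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Python raises OSError if the underlying fsync() call fails.
        os.fsync(fd)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        if durable_write(target, b"payload"):
            print("write confirmed on stable storage")
        else:
            print("write not confirmed; fail over to another node")
```

The sketch only covers the case where an error comes back. It does nothing about a call that never returns, which would need a timeout or a separate watchdog around the write.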
