2004 Edition Free and Open Source Software Developer's European Meeting






Interviews

2003-12-14 - Hans Reiser

ReiserFS

An interview conducted by Alain Buret
FOSDEM - First, the traditional question: please introduce yourself.

Hans Reiser - Hi, I am Hans Reiser, architect of the Reiser Filesystem which is in the Linux Kernel.


FOSDEM - How, when and why did you start writing a filesystem?

Hans Reiser - In 1984 I decided that naming was the key issue in OS design, that it needed to be integrated throughout the OS to be effective, and that the filesystem was the best starting place for someone who wanted to create a name space that would erode the barriers between the different namespaces of the operating system. I also felt that the fundamental theory of databases, known as relational algebra, needed generalizing because it was ill-suited for data with a non-table structure, and much of the world's data does not have a table structure. Yet I also felt that it had many advantages over boolean algebra because it COULD represent some forms of structure.
These are the sorts of grand ambitions 21-year-olds tend to have, and of course I never imagined that 19 years later all I would have working was just the storage layer, but oh well.... maybe next year....;-)

I was a sysadmin at the time, and my employer (Microsoft) preferred that I focus my life on creating vi tutorials for the technical writers and performing sysadmin tasks, or at least that I spend not less than 60 hours a week on it. I was willing to work 44 hours a week for them, but wanted time off on the weekends to develop my filesystem ideas. We agreed that I should leave.
I then spent until 1993 developing my ideas, sending papers off to conferences and getting them rejected (one such rejecting conference sent me a request that I submit a paper to them recently..... I was tempted to send them the old paper.....), and generally failing to get the corporate world to fund my ideas. There was a good reason for this failure: I was completely unqualified for the work. So finally I had no choice but to just do it on my own because clearly nobody was going to think me qualified to be paid to do it.

In 1993 I went to Russia to see if it was at all reasonable to hire any of these researchers I had heard were subsisting on $100 a month. It was, and I hired a team of 5. I then spent nights and weekends arguing over algorithms, and spent 5.5 years in a state of pure hell, as any significant time spent not working meant that the filesystem could not happen at all, and I was always worried that 40-44 hours a week was not doing justice to some of the day jobs I had, much less allowing me to properly supervise the code.
When people talk about how money is not important to free software writers, and that they can just have other jobs: those people are clueless. The difference between a programmer working 60 hours a week, and one working 20 hours a week, is the difference between the NBA and the gifted guys at the local gym. Fortunately for our movement, monopolies are so devastatingly unproductive, and barriers to entry are so destructive of innovation, that GNU/Linux grew enough to eventually allow many GNU/Linux authors to go pro. I was one of those fortunate enough to be able to quit the day job in 1999.

Now free software is becoming like the music business: controlled by the major distributors, but at least providing some sort of living to a few in the business. It would be nice if it got to be better than the music business, maybe someday reaching the level of broadcast television. I wish it could exceed even that, but for that to happen our fundamental economic structure for software would have to change to one in which users rate free software and tax dollars go to what they rate as useful. Unfortunately I don't think anyone else supports me on this.....


FOSDEM - What's the main difference between Reiser3 and Reiser4?

Hans Reiser - Version 4 is built to be highly extensible using plugins. The quality of the code is much higher from top to bottom, because we are now qualified for the work.;-) The performance is dramatically higher, 2-5x higher than V3.
We fixed every mistake we made in V3. We got rid of BLOBs, we replaced balanced trees with dancing trees, we made the filesystem atomic without losing performance, and we added features like encryption and compression. It is going to make a great code base for V5 and V6, which will add the enhanced semantics we need for competing with the Longhorn FS. It is amusing to see Microsoft embracing namespace unification years after I left them. They've got the wrong algorithms and the wrong semantics, but hey, that is what will make the competition fun.

FOSDEM - Did that imply a complete rewrite?

Hans Reiser - We threw V3 in the trash and rewrote it completely from scratch, this time the right way.


FOSDEM - In your view, which is the fastest filesystem (not only on Linux/BSD platforms)?

Hans Reiser - Linux has probably the four or five fastest filesystems in the industry. NTFS performance used to be the worst in the industry. I haven't benchmarked it recently, but I know it still isn't wondrous. UFS and FFS are old and slower than ext2. I haven't benchmarked soft updates, so I won't say a lot about it. I would be curious to compare. We have benchmarks on our webpage at http://www.namesys.com/benchmarks.html comparing the Linux filesystems.


FOSDEM - How do you respond to the rumours that ReiserFS is dangerous because you can't fix it when 'Something Bad (TM)' happens, and that on a number of occasions even Namesys can't fix it? This is a much-heard comment in favour of ext3 nowadays.

Hans Reiser - There is a lot of FUD floating around about us. This particular rumor has the advantage that it sounds right if you don't look at the code. You know, balanced trees, sounds complex, must be fragile. When we run fsck --rebuild-tree, it throws the entire internal tree structure away and is able to rebuild it from scratch by looking at the leaves, so I don't think that is very fragile at all. I don't know of any cases using a recent fsck where we could not get data back and ext3 could have. If you know of such cases, let me know.
Now before I say too much, let me acknowledge a few things. When we first shipped in 1999, we had no fsck at all. That was pretty fragile.....;-) Then we had an fsck that was very buggy. Now, we finally have a decent fsck. It just took time.

With V4 I had the luxury of a budget that allowed me to pay for fsck to be developed at the same time as the filesystem itself, and so V4 is going to have a much more mature fsck at a much earlier stage. Also, with V4 we have specially designed our disk format to make fsck more robust by providing more information in the node header that it can use when reconstructing the filesystem. Having done fsck once, we better understand what info it needs to work well.
None of this will fully protect you against a dead hard drive, and probably all of the Linux filesystems leave you more at risk of hardware failure than their software failure, but at least we can improve the odds a little.
Part of the reason for this FUD is that one of the distros understands open source so poorly that they think they are a competitor (hint, the one that recently oh so cleverly committed seppuku on the desktop, thank you bean counters for surrendering your market leverage you chose to use against us.....), and is contributing to it. They recently told one of our users that ReiserFS would cause database corruption for Oracle. So we emailed them about it, and they admitted they had no idea what they were talking about and could not substantiate it. Oracle is certified on SuSE, and reiserfs is the default filesystem for SuSE, so go figure. Also note that RedHat propagates our bugfixes to its flavor of the kernel only when it syncs with the official kernel, which is not often. RedHat simply refuses to take bugfixes even if we send them. The other distros are all responsible about ReiserFS bugfixes, as they aren't trying to use market leverage. Oh well, at least we shamed RedHat into no longer turning on the reiserfs debug option in their kernel to make users think we are slow. I wish they would understand that this isn't what open source is supposed to be about. They are a good company in so many other ways. Ah well, real human beings with egos on the line and all that. I still keep trying to convince them that we should all just get along and kill off Microsoft....;-)
If something bad happens, the most important thing to do is to upgrade to the latest fsck.

ReiserFS V3 in the official kernel is much more stable than ext3 for a simple reason: we shipped first, and had longer to stabilize. I would say that historically we have always been more stable than ext3 was at the same moment in time in the official kernel, but that both filesystems were unstable in their beginnings. We don't modify V3 anymore, and this also makes it more stable. If you want exciting unstable improvements, use V4....


FOSDEM - Do you pay attention to software patents? Did you ever have problems in that regard?

Hans Reiser - It is Namesys policy to not respond to questions about software patents posed by persons other than those holding the patents in question. I would suggest this policy to Linus, except that I haven't the heart for it.


FOSDEM - In a few words, tell us what you're going to talk about during your presentation.

Hans Reiser - Reiser4 blows V3 away. 2-5x faster. I am going to tell you why.


FOSDEM - What are you expecting from your talk at FOSDEM and from the interactions with other developers present at the event?

Hans Reiser - I never know what I will learn from conferences until after the conference; I just know that I usually learn very important things at them, form unexpected connections that are crucial to the project's survival, and develop long-lasting friendships. Europe seems to be leading the way on open source now, and FOSDEM will be exciting.

 







© FOSDEM 2003-2004