06 June 2010

Serious performance issues with ext4fs barriers

Lately, more and more users ask for support because they experience horrendously slow performance. In almost all cases they run Liferea on ext4fs, where sqlite becomes quite slow when it performs a lot of small update operations that each trigger fsync() calls. This is not a problem specific to sqlite, though; it is just very visible with applications that write as frequently as Liferea does through sqlite.
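To get a feeling for the effect on your own disk, here is a minimal sketch in Python (not Liferea's actual code). It runs the same small UPDATE many times, once with every statement as its own transaction and once wrapped in a single transaction. The table name "items", the column "reads" and the helper names are made up for illustration, and the timings depend entirely on your filesystem and mount options.

```python
import os
import sqlite3
import tempfile
import time


def timed_updates(path, n, batched):
    """Run n small UPDATEs; return (elapsed seconds, final counter value)."""
    # isolation_level=None puts the sqlite3 module in autocommit mode, so
    # every statement outside an explicit BEGIN is its own transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA synchronous=FULL")  # ask sqlite to sync on commit
    # Hypothetical schema, only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, reads INTEGER)"
    )
    conn.execute("INSERT OR REPLACE INTO items VALUES (1, 0)")

    start = time.perf_counter()
    if batched:
        conn.execute("BEGIN")
    for _ in range(n):
        conn.execute("UPDATE items SET reads = reads + 1 WHERE id = 1")
    if batched:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start

    (reads,) = conn.execute("SELECT reads FROM items WHERE id = 1").fetchone()
    conn.close()
    return elapsed, reads


def compare(n=200):
    """Time both variants on fresh database files in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        slow, a = timed_updates(os.path.join(tmp, "single.db"), n, batched=False)
        fast, b = timed_updates(os.path.join(tmp, "batched.db"), n, batched=True)
    return slow, fast, a, b


if __name__ == "__main__":
    slow, fast, _, _ = compare()
    print(f"one transaction per update: {slow:.3f}s, single batch: {fast:.3f}s")
```

Both variants end with the same data; only the number of commits differs. Note that the temporary directory may live on a different filesystem than your Liferea data, so point the paths at the partition you actually want to measure.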

The important difference between ext4fs and ext3fs in this respect is that ext4fs comes with barriers enabled by default. Barriers are a filesystem feature (optional in ext3fs) that tries to improve filesystem integrity. But this comes at a cost: depending on your application's use case it might decrease filesystem throughput a lot, which is what many Liferea users experience.

A workaround is to disable the ext4fs barriers by adding "barrier=0" to the mount options in /etc/fstab and remounting the partition.
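If you want to double-check that the option really ended up in the right place, a small Python helper like the following can inspect a line in /etc/fstab or /proc/mounts format. This is only a sketch: the function name is made up, the device and mount point in the example line are placeholders, and it assumes that barriers are switched off by either the "barrier=0" or the "nobarrier" spelling.

```python
def barriers_disabled(mount_line):
    """Return True if an fstab or /proc/mounts style line disables barriers."""
    fields = mount_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least: device mountpoint fstype options")
    options = fields[3].split(",")
    # Assumption: both spellings turn barriers off on ext4. This simplified
    # check does not handle contradictory options on the same line.
    return "barrier=0" in options or "nobarrier" in options


# Hypothetical fstab entry; device and mount point are placeholders.
EXAMPLE = "/dev/sda2  /home  ext4  defaults,barrier=0  0  2"

if __name__ == "__main__":
    print(barriers_disabled(EXAMPLE))
```

To check the live state after remounting, feed the function the matching line from /proc/mounts instead of the fstab entry.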

16 comments:

Anonymous said...

Maybe Liferea doesn't need all of these "fsync" if it doesn't need integrity.

beroal said...

This is not a specific problem of sqlite though
I am not so sure. Tell me, is each of those updates in a separate transaction? Also, see “An Asynchronous I/O Module For SQLite”.

Lars said...

@beroal: I'm not sure who's at fault with the fsync() performance/semantics, so you might be right. Concerning asyncvfs: we added an implementation to SVN trunk several days ago. This should improve performance in 1.8.

beroal said...

Let me explain. I think that fsync is not to blame; rather, Liferea forces sqlite to issue fsync more often than needed. Checking this requires digging into the source code; guessing is not enough. Can you please point to the specific source files that contain the "update" SQL commands in question?

Lars said...

@beroal: If you want to analyze our access pattern, I'd suggest running Liferea with --debug-db, which makes all DB accesses visible. As for the source, have a look at src/db.c, which contains all DB code.

beroal said...

Thanks, Lars.

Anonymous said...

Same issue for btrfs

Ari said...

anyone tried SVN trunk on EXT4 and could verify if there is improvement or not?
On ubuntu 10.04, SSD and EXT4, liferea stable and unstable performance is dismal.
I've been looking for an RSS reader alternative for a few months now due to this.

beroal said...

2 Ari:
anyone tried SVN trunk on EXT4 and could verify if there is improvement or not?
Yes, I am using a quite old SVN version and it is faster because it uses sqlite's asynchronous I/O module. I must confirm, though, that that version is very buggy.

Jean-François said...

Just wanted to report my findings with Liferea 1.6.3 in Ubuntu 10.10 on a Dell Mini 9 (slow SSD, ext4 partition).

Cold startup times for liferea:
* With barriers (default): 52 seconds
* With barrier=0: 48 seconds
* With barriers, noatime: 52 seconds

Observations/conclusions:
- the barriers are not a "significant" performance problem on this bench machine
- I have noticed that hard disk I/O only happens at the very end, in the last ~5 seconds or so. During the first ~45 seconds, it doesn't seem to do anything (even the CPU is not maxed out). It just seems to be idling or doing menial tasks.
- therefore: the problem lies elsewhere.

I'd be happy to provide more info if you need it. This problem is quite visible on netbooks.

Jean-François said...

Here's an even more interesting observation: I made an ext3 partition just for the ~/.liferea_1.6 folder, moved the data there and measured the cold startup time...

52 seconds.

Thus, ext4 was not the culprit at all on my netbook. Something else is going on.

stelmed said...

The difference between barriers enabled and disabled is ~10% in start-up time. This doesn't solve the problem at all. Is there any other possible solution? It is impossible to use Liferea during my daily work anymore...

beroal said...

2 Jean-François, stelmed:
First, run "liferea" with "--debug-db", "--debug-trace", then you can see which step is slow.

Regarding the startup time, I observed that most of it is spent in the "delete" SQL statement and in a lot of short transactions in the function "import_OPML_feedlist".

I have a patch dealing with "import_OPML_feedlist". It is for the git version.

On "delete", ask the developers. It seems that this statement reads the entire database at startup. Or decrease the size of your database.

Jonas Finnemann Jensen said...

Hi, I've just installed liferea stable... And noticed some performance issues (possibly related).
From looking at my disk utilization LED, I think liferea might be doing a commit with every sqlite request (sqlite does this by default, auto-committing).
In my experience with sqlite this is bad. Sqlite does a flush and file swap for every commit, which is quite expensive.

The trick is to initiate a new transaction when the program starts, then commit and begin a new transaction no more than every 3 minutes (assuming there's something to commit).
This could cause you to lose 3 min of flagged RSS entries in case of system failure, but given the nature of the data and the frequency of system failures this is hardly an issue.

I've checked out the unstable version, and the asynchronous sqlite hack does make things better, but there's still a lot of unnecessary disk I/O. Just commit every 3 minutes (not during sync) and after every completed synchronization (when no more feeds are pending).
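The commit-every-3-minutes idea could be sketched roughly like this in Python (Liferea itself is written in C, so this only illustrates the pattern). The class and method names are made up, the 180 second default mirrors the 3 minutes suggested above, and the connection is assumed to be opened with isolation_level=None so that BEGIN and COMMIT are issued explicitly.

```python
import sqlite3
import time


class PeriodicCommitter:
    """Keep one transaction open; commit at most every `interval` seconds."""

    def __init__(self, conn, interval=180.0, clock=time.monotonic):
        # conn is assumed to be in autocommit mode (isolation_level=None).
        self.conn = conn
        self.interval = interval
        self.clock = clock
        self.last_commit = clock()
        self.dirty = False

    def execute(self, sql, params=()):
        """Run a statement inside the long-lived transaction."""
        if not self.conn.in_transaction:
            self.conn.execute("BEGIN")
        cursor = self.conn.execute(sql, params)
        self.dirty = True
        return cursor

    def tick(self):
        """Commit if there is pending work and the interval has elapsed."""
        if self.dirty and self.clock() - self.last_commit >= self.interval:
            self.flush()
            return True
        return False

    def flush(self):
        """Commit now, e.g. after a completed synchronization or on exit."""
        if self.conn.in_transaction:
            self.conn.execute("COMMIT")
        self.dirty = False
        self.last_commit = self.clock()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE flags (id INTEGER PRIMARY KEY, flagged INTEGER)")
    committer = PeriodicCommitter(conn)
    committer.execute("INSERT INTO flags VALUES (?, ?)", (1, 1))
    committer.flush()  # nothing is lost once this returns
```

Calling tick() only between complete logical updates keeps a commit from landing halfway through a multi-statement change, and flush() covers the "after every completed synchronization" case.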

Anyways, thanks for an awesome RSS reader. Synchronization through Google Reader to gReader on my phone is a killer feature :)

beroal said...

@Jonas Finnemann Jensen: The problem is not just lost changes. If you commit at arbitrary moments, integrity of a database may be lost.

Jonas Finnemann Jensen said...

@beroal
Disclaimer: I'm obviously not familiar with all the details in the code, and I've just done a quick grep through the code for any database related stuff.
But as far as I can see from db.h, every single update to an item is a transaction (and the app could crash between any one of these).

The only place I see multi-statement transactions is in db.c:db_init and db.c:db_item_update.
But since all the database work is done in the UI-thread, g_timeout_add() could be used to commit every 3 min, then the commit cannot occur inside db_init or db_item_update.
- Unless g_main_loop_run is called inside db_init or db_item_update, but that's probably not an issue :)