GlusterFS bit by ext4 structure change
On Sunday, March 18th, Fan Yong committed a patch against ext4 to “return 32/64-bit dir name hash according to usage type”. Prior to that, ext2/3/4 would return a 32-bit hash value from telldir()/seekdir(), as NFSv2 wasn’t designed to accommodate anything larger. This broke the distribute translator, as suddenly the dirent structure was returning 64-bit d_off values. When DHT (Distributed Hash Translator) applied dht_itransform() to those values, it would overflow. Since the dictionary entry did not have a cached offset, it would try to create one again and would end up in an endless loop.
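To make the overflow concrete, here is a minimal sketch in Python. It assumes the transform folds a subvolume index into the brick's d_off with a multiply-and-add in 64-bit unsigned arithmetic; that formula is an assumption for illustration, not a copy of the real dht_itransform(), and the function names below are hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate uint64_t wraparound


def itransform(d_off, subvol_count, subvol_index):
    """Fold a subvolume index into a brick offset (assumed formula),
    wrapping the result the way a uint64_t would."""
    return (d_off * subvol_count + subvol_index) & MASK64


def deitransform(value, subvol_count):
    """Split a transformed offset back into (brick offset, subvolume index)."""
    return value // subvol_count, value % subvol_count


def round_trips(d_off, subvol_count, subvol_index):
    """True if the offset survives the transform and its inverse unchanged."""
    transformed = itransform(d_off, subvol_count, subvol_index)
    return deitransform(transformed, subvol_count) == (d_off, subvol_index)


if __name__ == "__main__":
    # A 32-bit hash leaves plenty of headroom for the multiplication.
    print(round_trips(0x7FFFFFFF, 4, 3))
    # A 64-bit hash does not: the high bits are lost when the product wraps.
    print(round_trips(0x7FFFFFFFFFFFFFFF, 4, 3))
```

With a 32-bit offset the value comes back intact; with a 64-bit offset the product wraps and the recovered offset no longer matches what the brick handed out, which is consistent with the lookup never finding its cached position.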
That patch was for kernel v3.3-rc2. To make things more fun, Jarod Wilson merged that patch into 2.6.32-268.el6 (from “rpm -q --changelog kernel | less”). My personal feeling on this is that structure changes shouldn’t have been backported into Enterprise kernels. This has caused a lot of frustrated users on the IRC channel. Most have just reformatted with xfs, which is a valid solution and falls in line with the officially recommended configuration. For some, however, that’s just not possible.
Distributions known to be affected by this change are:
- Fedora >= 17
- Red Hat Enterprise Linux (RHEL) 6.3
- CentOS 6.3
- Debian Sid
- Debian Wheezy
The workaround is either to reformat your bricks with xfs or to downgrade your kernel: to 2.6.32-267 on RHEL/CentOS, or to 3.2.9 for everybody else.
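If you want a quick check on a RHEL/CentOS 6 box before deciding, something like the following Python sketch may help. It only compares the build number in an el6 kernel release string against 268, the build mentioned in the changelog above; the function name is hypothetical, and treating every later el6 build as affected is an assumption.

```python
import platform
import re

# First el6 build reported to carry the backported ext4 patch (see above).
FIRST_PATCHED_EL6_BUILD = 268

# Matches strings such as "2.6.32-279.el6.x86_64" or "2.6.32-279.5.2.el6.x86_64".
EL6_RELEASE = re.compile(r"^2\.6\.32-(\d+)(?:\.\d+)*\.el6")


def el6_kernel_has_patch(release):
    """Return True or False for an el6 2.6.32 release string,
    or None if the string is not an el6 kernel at all."""
    match = EL6_RELEASE.match(release)
    if match is None:
        return None
    return int(match.group(1)) >= FIRST_PATCHED_EL6_BUILD


if __name__ == "__main__":
    running = platform.release()
    print(running, el6_kernel_has_patch(running))
```

A result of None means the check does not apply (Fedora, Debian, or anything else that is not an el6 kernel), so fall back to the distribution list above.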
The patches that are related to this issue can be tracked at http://review.gluster.com/
UPDATE 2012-08-17 04:02 GMT
Spoke briefly with Vijay ‘hagarth’ Bellur, one of the lead developers, who said, “there are some problems getting NFS and ext3/4 to work with this patch .. hence it is sitting in the queue.”
It is still being actively worked on, though, and is a high priority.