Monday, October 21, 2013

Large filesystem on CentOS-6

I had the pleasure of testing out a new server which came with 10 x 4Tb drives. Configured on an HP Smart Array P420 in RAID 50, that gave me a 32Tb drive to use.

I installed CentOS-6 but the installer would not allow me to use the full amount of space. It turns out the maximum ext4 filesystem on CentOS-6 is 16Tb. This is the limit of 32bit block addressing and 4k blocks. (2^32 * 4k = 2^32 * 2^12 = 2^44 = 2^40 * 2^4 = 16T)

That did not make for a very exciting test. Only half of the available disk was usable in a single filesystem. Sure, I could create two logical volumes and then create two filesystems, but that seemed a bit like a DOS solution.

Some research turned up the option of using ext4 48bit block addressing. This enables a filesystem up to 1Eb in size and allows for future 64bit addressing for an even larger limit.
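To put the two limits side by side, here is a minimal sketch of the arithmetic in bash. The helper name max_fs_bytes is my own invention for illustration, and it assumes 4k blocks throughout.

```shell
# Hypothetical helper: largest filesystem size in bytes for a given
# number of block-address bits and a given block size in bytes.
max_fs_bytes() {
    local addr_bits=$1 block_bytes=$2
    echo $(( (1 << addr_bits) * block_bytes ))
}

limit32=$(max_fs_bytes 32 4096)   # 32bit block addressing, 4k blocks
limit48=$(max_fs_bytes 48 4096)   # 48bit block addressing, 4k blocks

echo "32bit: $limit32 bytes = $(( limit32 >> 40 ))T"
echo "48bit: $limit48 bytes = $(( limit48 >> 60 ))E"
```

The 48bit result still fits in a signed 64bit shell integer. A full 64bit block address with 4k blocks would overflow it, so I have left that case out.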

The catch was of course that 48bit addressing is not supported by the tools which come with CentOS-6.

Fedora 20 (rawhide) does come with the required updates to e2fsprogs (e2fsprogs-1.42.8-3) which enable 48bit addressing (somehow future-proofed by being called 64bit addressing). I then set out to rebuild the Fedora 20 rpm for CentOS-6. Amazingly, the compile was clean, though I did have to disable some tests, which is a tad alarming. All in all it did not take long to produce a set of replacement rpms for e2fsprogs.

After installing the new tools I was able to create a new filesystem larger than 16Tb. I started at 17Tb to give me room to play. It seems that the CentOS-6 kernel does already have support for 48bit addressing. I ran a number of workloads and I could not find any problems. Still, I don't know what bugs may be lurking in there.

dumpe2fs shows the new filesystem feature '64bit'.
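If you want to see the feature flag without touching a real disk, something like the following sketch should work with any e2fsprogs that understands 64bit. It uses a small scratch image file rather than my actual volume, so the file name and the 64M size are only placeholders.

```shell
# mkfs.ext4 and dumpe2fs often live in /sbin, which may not be on PATH.
PATH="$PATH:/sbin:/usr/sbin"

img=$(mktemp)                     # scratch image standing in for a real volume
truncate -s 64M "$img"            # sparse file, uses almost no real space

# -F: the target is a regular file, not a block device
# -O 64bit: ask for the 64bit feature explicitly
mkfs.ext4 -q -F -O 64bit "$img"

# -h prints only the superblock information, including the feature list
features=$(dumpe2fs -h "$img" 2>/dev/null | grep -i '^Filesystem features:')
echo "$features"

rm -f "$img"
```

On a real system the same mkfs.ext4 and dumpe2fs options would be pointed at the logical volume instead of an image file.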

I also attempted an offline resize of the filesystem to the maximum size of my disk. This worked as expected. Online resize is not available until a much later kernel version, and I did not attempt it because there are known bugs.
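The offline resize can be rehearsed on a scratch image as well. This is only a sketch under the same assumptions as before: a small image file stands in for the unmounted volume, and the blocks helper is something I made up to read the block count out of dumpe2fs.

```shell
PATH="$PATH:/sbin:/usr/sbin"

# Hypothetical helper: print the block count from the superblock.
blocks() {
    dumpe2fs -h "$1" 2>/dev/null |
        awk -F: '/^Block count/ { gsub(/ /, "", $2); print $2 }'
}

img=$(mktemp)
truncate -s 64M "$img"
mkfs.ext4 -q -F -O 64bit "$img"
before=$(blocks "$img")

truncate -s 128M "$img"              # grow the underlying "device"
e2fsck -f -p "$img" >/dev/null       # resize2fs wants a freshly checked filesystem
resize2fs "$img" >/dev/null 2>&1     # no size given: grow to fill the device
after=$(blocks "$img")

echo "blocks before: $before, after: $after"
rm -f "$img"
```

On real hardware the filesystem has to be unmounted first, and e2fsck and resize2fs are run against the block device.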

The last limit I wanted to test was the file size limit. Even on my new filesystem the individual file size limit is 16Tb. It is not every day that I get to make one of them, so I did:
 dd if=/dev/zero of=big bs=1M count=$(expr 16 \* 1024 \* 1024)

At ~500Mb/s it still took 9 hours to complete.
And it turns out that the maximum file size is actually 17592186040320 bytes and not 17592186044416. That is 4k (one block) short of 16Tb. (You can check that on any ext4 filesystem with 4k blocks: the command truncate big --size 17592186040321 fails.)
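The limit is simply 2^44 minus one 4k block, and the truncate test can be scripted without waiting 9 hours because it only creates a sparse file. A rough sketch follows. The accepted/refused result depends on which filesystem the temporary file lands on, so the behaviour described here only applies to ext4 with 4k blocks.

```shell
block=4096
limit=$(( (1 << 44) - block ))    # 16Tb minus one 4k block
echo "expected maximum file size: $limit"

f=$(mktemp)                       # lands wherever TMPDIR points, maybe not ext4
for size in "$limit" "$(( limit + 1 ))"; do
    if truncate --size "$size" "$f" 2>/dev/null; then
        echo "$size: accepted"
    else
        echo "$size: refused"
    fi
done
rm -f "$f"
```

On ext4 with 4k blocks I would expect the first size to be accepted and the second refused. Other filesystems have their own limits and may accept both.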

That also raised an issue which was a problem on ext3: how long does it take to delete a large file? Well, this is the largest possible file and it took 13 minutes.
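As a sanity check on those two timings, here is the back-of-the-envelope arithmetic. It assumes the ~500Mb/s figure means roughly 500 MiB/s, which is my reading rather than a measured value.

```shell
file_mib=$(( 16 * 1024 * 1024 ))          # the dd count: 16Tb in 1M blocks
write_rate=500                            # MiB/s (assumed meaning of ~500Mb/s)

write_secs=$(( file_mib / write_rate ))
write_time="$(( write_secs / 3600 ))h $(( write_secs % 3600 / 60 ))m"
echo "estimated write time: $write_time"

delete_secs=$(( 13 * 60 ))                # the 13 minute delete
delete_rate=$(( file_mib / 1024 / delete_secs ))
echo "space freed while deleting: ~$delete_rate GiB/s"
```

The write estimate lands a little over 9 hours, which matches what I saw.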

In conclusion, the 16Tb filesystem limit is easily raised on CentOS-6, but it comes at the expense of using untested tools and kernel features. Although I did not find any problems in my testing, this could pose a substantial risk if you have over 16Tb of data which you do not want to lose.

If anyone is interested in my rpms you can get them from http://www.chrysocome.net/download