Monday, October 21, 2013

I had the pleasure of testing out a new server which came with 10 x 4TB drives. Configured on an HP Smart Array P420 in RAID 50, that gave me a 32TB volume to use.
I installed CentOS-6 but the installer would not allow me to use the full amount of space. It turns out the maximum ext4 filesystem size on CentOS-6 is 16TB. This is the limit of 32-bit block addressing with 4k blocks (2^32 blocks x 4k per block = 2^32 x 2^12 bytes = 2^44 bytes = 2^4 x 2^40 = 16TB).
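You can sanity-check that arithmetic in any shell:
echo $(( 2 ** 32 * 4096 ))
17592186044416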
That did not make for a very exciting test, with only half the disk available in a single filesystem. Sure, I could create two logical volumes and then create two filesystems, but that seemed a bit like a DOS solution.
Some research turned up the option of using ext4 48-bit block addressing. This enables a filesystem of up to 1EB in size and allows for future 64-bit addressing for an even larger limit.
The catch, of course, was that 48-bit addressing is not supported by the tools which come with CentOS-6.
Fedora 20 (rawhide) does come with the required updates to e2fsprogs (e2fsprogs-1.42.8-3) which enable 48-bit addressing (somehow future-proofed by being called 64-bit addressing). I then set out to rebuild the Fedora 20 rpm for CentOS-6. Amazingly, the compile was clean, though I did have to disable some tests, which is a tad alarming. All in all, it did not take long to produce a set of replacement rpms for e2fsprogs.
After installing the new tools I was able to create a new filesystem larger than 16TB. I started at 17TB to give me room to play. It seems that the CentOS-6 kernel already has support for 48-bit addressing. I ran a number of workloads and could not find any problems. Still, I don't know what bugs may be lurking in there.
dumpe2fs shows the new filesystem feature '64bit'.
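For reference, creating and inspecting such a filesystem looks something like this (the device name here is just an example, not my actual layout):
mkfs.ext4 -O 64bit /dev/sdb1
dumpe2fs -h /dev/sdb1 | grep -i features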
I also attempted an offline resize of the filesystem to the maximum size of my disk. This just worked as expected. Online resize is not available until a much later kernel version; I did not attempt it because there are known bugs.
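For anyone wanting to repeat the offline resize, the sequence is roughly this (again, device and mount point are examples; with no size argument, resize2fs grows the filesystem to fill the device):
umount /mnt/test
e2fsck -f /dev/sdb1
resize2fs /dev/sdb1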
The last limit I wanted to test was the file size limit. Even on my new filesystem, the individual file size limit is 16TB. It is not every day that I get to make one of those, so I did:
dd if=/dev/zero of=big bs=1M count=$(expr 16 \* 1024 \* 1024)
At ~500MB/s it still took 9 hours to complete.
And it turns out that the maximum file size is actually 17592186040320 and not 17592186044416. That is 4k (one block) short of 16TB. (You can actually check that on any machine with the command truncate big --size 17592186040321.)
That also raised an issue which was a problem on ext3: how long does it take to delete a large file? Well, this is the largest possible file, and it took 13 minutes.
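If you want to time it yourself (assuming the file created by the dd above):
time rm big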
In conclusion, the 16TB filesystem limit is easily raised on CentOS-6, but it comes at the expense of using untested tools and kernel features. Although I did not find any problems in my testing, this could pose a substantial risk if you have over 16TB of data which you do not want to lose.
If anyone is interested in my rpms you can get them from http://www.chrysocome.net/download
Thursday, July 18, 2013
Sparse files on Windows
Once again I am drawn away from Linux to solve a Windows problem. The source of the problem is Hyper-V which (as always) has a cryptic error message about 'cannot open attachment' and 'incorrect file version'.
The source of the error was tracked down to the file being flagged as Sparse.
What is a sparse file?
Under UNIX/Linux, a sparse file is a file where not all of the storage for the file has been allocated. Handling of sparse files is normally transparent but some tools like file copy and backup programs can handle sparse files more efficiently if they know where the sparse bits are. Getting that information can be tricky.
In contrast, under Windows, a sparse file is a file which has the sparse flag set. Presumably the sparse flag is set because not all of the storage for the file has been allocated (much like under Linux). Interestingly, even if all the storage is allocated, the sparse flag may still be set. (It seems the flag indicates the potential to be sparse rather than actually being sparse. There is an API to find out which parts are actually sparse.)
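fsutil can actually show you that information from the command line; it reports the allocated (non-sparse) ranges of a file:
fsutil sparse queryrange <filename>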
The problem started when I happened to download a Hyper-V virtual machine using BitTorrent. While the files are being created, not all of the content exists, so they are indeed sparse. Once all the content has been supplied, the file is (to my mind anyway) no longer sparse. However, under Windows it seems, once a sparse file, always a sparse file.
Microsoft provide a tool to check and set the sparse flag:
fsutil sparse queryflag <filename>
fsutil sparse setflag <filename>
Note 1: Have they not heard of get and set?
Note 2: You can't use a wildcard for <filename>
The amazing thing to note here is that there is no clearflag option. This might lead you to believe that you cannot clear it. In fact you can. For users in a pickle, there is a program called Far Manager which can (among other things) clear the flag. Far Manager is open source, and a quick peek at the code shows that it uses a standard IOCTL named FSCTL_SET_SPARSE to do this.
So with that knowledge, it is actually quite easy to make a file not be sparse any more. In fact, I wrote a program called unsparse.
Not only does the tool have the ability to clear the sparse flag, it can recursively process a directory and unsparse all the sparse files found, making it perfect to fix up a Hyper-V download.
Look for the program soon on my chrysocome website http://www.chrysocome.net/unsparse
Friday, April 19, 2013
Visual Studio 2010
I am not a fan of Visual Studio. Unfortunately I must use it for some projects. Recently I was forced to upgrade to Windows 7 and Visual Studio 2010. Not wanting to duplicate all my files, I decided to leave them on a network drive and just access them via the network.
Seems like a good idea; after all, why have a network if you store everything locally? Well, it seems that Visual Studio does not like that.
For some settings it will decide to find a suitable local directory for you. For some other settings it leaves you high and dry.
For example, when I try and build my program I get the error:
Error 1 error C1033: cannot open program database '\\server\share\working\project\debug\vc100.pdb' \\server\share\working\project\stdafx.cpp 1 1 project
The internet was of little use, which is why I thought I would put it in my blog.
This goes a long way towards explaining my criticism of Visual Studio. After installing several gigs of software, is that the best error message it can come up with? Well, I will try the help. Oh, that is online only. After jumping through some hoops, the help tells me that:
This error can be caused by disk error.
Well, no disk errors here. Perhaps it means that it does not like saving the .pdb on a network share. What is a .pdb anyway?
In the end, my solution (can I call it that, or has Visual Studio hijacked that word?) was to save intermediate files locally:
- Open Project -> Properties...
- Select Configuration Properties\General
- Select Intermediate Directory
- Select <Edit...>
- Expand Macros>>
- Edit the value (by double-clicking the macros or just typing it in; see the example value after this list):
- Select OK
- Select OK
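For the value itself, any local path will do. An illustrative example built from the standard property macros (environment variables such as TEMP are also available as macros; this is a sketch, not necessarily the exact value):
$(TEMP)\$(ProjectName)\$(Configuration)\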
And that seems to sort it out.
Tuesday, March 26, 2013
scp files with spaces in the filename
scp is a great tool. Built to run over ssh, it maintains a good unix design. It does however cause a few problems with its nonchalant handling of file names.
For example, transferring a file which has a space in the name causes problems because of the amount of escaping required to get the space to persist on the remote side of the connection.
Copying files from local to remote is easy enough: just quote or escape the file name using your shell.
scp Test\ File remote:
scp 'Test File' remote:
scp "Test File" remote:
Copying files from remote to local can be more tricky:
scp remote:Test\ File .
scp: Test: No such file or directory
scp: File: No such file or directory
scp stimulus:"Test File" .
scp: Test: No such file or directory
scp: File: No such file or directory
scp "stimulus:Test File" .
scp: Test: No such file or directory
scp: File: No such file or directory
The escaping is working on your local machine but the remote is still splitting the name at the space.
To solve that, we need an extra level of escaping, one for the remote server. Remember that your shell will eat the \, so you need a double \\ to send one to the remote:
scp remote:Test\\\ File .
Test File 100% 0 0.0KB/s 00:00
You can also combine the remote escape with local quoting:
scp remote:"Test\ File" .
or
scp "remote:Test\ File" .
Best solution of all: avoid spaces in file names. If you can't avoid them, I find the easiest solution is to replace ' ' with '\\\ '.
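If you have to do this often, you can let the shell do the replacement; a small bash sketch (file and host names are just examples):
f='Test File'
scp "remote:${f// /\\ }" .
The ${f// /\\ } expansion replaces every space with a backslash-escaped space, which is exactly the remote-side escaping described above.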
Sunday, December 23, 2012
SMART errors and software RAID
For a while now I have been having an intermittent problem with some of my drives. They seem to be working but stop responding to SMART. If I reboot they don't show up any more but if I power off and back on they work fine.
I have two 640Gig WD Green drives in a software RAID1 and another WD Blue drive. After a suspend/resume, sometimes one of the drives is MIA. The kernel thinks it is still there but my SMART tools can't talk to it and start to complain.
The Blue drive has a bad (unreadable) sector but, touch wood, that has not caused a problem yet. SMART knows about this and tells me about it frequently. This, however, does not appear to be the problem. There is a telling message in dmesg:
sd 0:0:0:0: [sda] START_STOP FAILED
sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
PM: Device 0:0:0:0 failed to resume: error 262144
So for some reason, the disk is not responding when the computer resumes. I guess it is a timeout (and I wonder if I can extend it?). Anyway, now I have the situation where my disk is not working even though I know there is nothing actually wrong with it.
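One knob that might be worth experimenting with (I have not verified that it helps across a resume) is the SCSI command timeout exposed in sysfs, in seconds:
cat /sys/block/sda/device/timeout
echo 60 > /sys/block/sda/device/timeout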
Luckily I am using software RAID and the other disk is working, so I can continue about my business without crashing or losing data. After poking and prodding a few different things, I have worked out a solution:
- Hot-remove the device (from the kernel)
echo 1 > /sys/block/sda/device/delete
- Rescan for the device to hot-add it to the kernel
echo "- - -" > /sys/class/scsi_host/host0/scan
- Add the 'failed' drive back into the RAID set
mdadm /dev/md127 --re-add /dev/sda2
You must remove the existing (sda) device first or the disk will be re-detected and added with a new name (sde in my case).
Because I have a write-intent bitmap, the RAID set knows what has changed since the drive was failed and only the changes must be re-synced which is quite fast.
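If an array does not already have a write-intent bitmap, one can be added in place (the array name matches the example above):
mdadm --grow --bitmap=internal /dev/md127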
There seems to be a 'vibe' that green drives are not good for RAID. I don't really think this is a problem because the drive is green; I think it is a problem because the driver is not trying hard enough to restart the disk.
So in the end this was not a SMART problem after all. Not that there are no bugs to fix there, particularly in udisks-helper-ata-smart-collect, which keeps running and locking up, sending the load average into the hundreds. For a tool designed to detect error conditions, it probably needs a bit more work.
My next job is to select a replacement drive for the faulty WD Blue...
Saturday, November 10, 2012
Samba guest access
I was trying to share some photos for my wife to download from my Linux desktop machine (CentOS 6). I had ~10Gig of photos but she only wanted a handful. I thought the best option would be to share out the folder using samba so she could use Windows Explorer to pick the ones she wants.
Well it seems simple now but it took a few tries to get it working.
I use security=server which is not the most secure method but it is normally easy & convenient. It also seems to be poorly documented, particularly for more recent samba releases and versions of Windows.
My wife has an account on the password server but not on my new desktop. Guest access, I thought, would be a simple solution here, but not so. The problem is that guest access will not work by default when using security=server (despite what the man page says). There is a new setting called map to guest which defaults to Never. I had to change this to Bad User to get it to work.
Here are the relevant parts from my working smb.conf:
security = server
password server = myserver
map to guest = Bad User
[photos]
comment = John's photos
path = /home/jnewbigin/photos
guest ok = yes
writable = no
force user = jnewbigin
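You can verify the configuration and the guest login from the Linux side before involving Windows at all (testparm checks smb.conf syntax; -N makes smbclient skip password prompting):
testparm
smbclient -N //localhost/photos -c ls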
So now it is working and I am happy and my wife is happy too.
Wednesday, August 29, 2012
Virtual Volumes and Dokan
It has been a holy grail for many years now, and it is very close: the ability to assign a drive letter to your Linux filesystems under Windows.
I am doing some final tests on my first release of Virtual Volumes which includes support for Dokan. Dokan is the Windows equivalent of FUSE under Linux. This means you can implement a filesystem in a userspace application.
There are some known issues, but my Windows XP install test was successful and I am currently doing some large-file testing. If all goes well I might get a beta version released this weekend.