Saving Disk Space on your Linux Server with Squashfs
We have been running our browsable code repository code.woboq.org for quite a while now. As we added more and more projects, we noticed at some point that we were getting low on disk space. In this blog post, we explain how we saved a huge amount of the disk space that holds our static HTML files.
First, some background on how we do things on the server: The subdirectories (like /linux) are mounted file system images (loop mounts). This allows easy upload from a powerful machine where we generate the HTML and reference files for the code browser. There is a huge number of small files, so using a file system image makes uploading easier and also allows us to update code.woboq.org in a more transactional way: You can just remount the image!
To improve our (lack of) disk space situation, we thought about how we could use compression. The uncompressed size of code.woboq.org was about 25 GB, stored in ext4 file system images. A natural idea would be to switch to Btrfs images, which can do compression. However, our kernel does not support Btrfs.
The next idea was to use the power of FUSE, the file system in user space. Our kernel supports FUSE, so we didn't have to recompile and reboot in this case.
We looked at fuse-zip first, a way to mount ZIP archives as a directory. However, we found out after some time that the fuse-zip version in our Linux distro does not support ZIP64 yet. This means the huge (in terms of inode count) directories that the code browser generator can create were not supported.
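To make the ZIP64 limit concrete: a classic ZIP archive stores its entry count in a 16-bit field, so it holds at most 65,535 entries. The following is a minimal sketch for checking whether a tree exceeds that count; the function name `needs_zip64` is made up for illustration, and the check ignores the separate 4 GB size limits that also require ZIP64.

```shell
# Classic (non-ZIP64) archives store the entry count in a 16-bit field.
ZIP_MAX_ENTRIES=65535

# Hypothetical helper: succeeds if the tree below $1 has more entries
# than a classic ZIP archive can hold.
needs_zip64() {
    # Every file and directory below $1 becomes one archive entry.
    count=$(find "$1" -mindepth 1 | wc -l | tr -d ' ')
    [ "$count" -gt "$ZIP_MAX_ENTRIES" ]
}
```

For example, `needs_zip64 qt5/ && echo "too many entries for plain ZIP"` would flag a generated tree before you try to archive it.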
Since we would have had to compile a more current version of fuse-zip ourselves anyway, we thought: Maybe there is an even better way than mounting a ZIP archive. After all, the ZIP format was never intended to be used as a file system.
Turns out there is a better way! We remembered that a lot of embedded devices and Linux Live CDs also need to save space. They often use Squashfs for that, so that's what we decided to use too. Our kernel does not support Squashfs, so we are using the FUSE module squashfuse, and so far we are quite happy with it.
Generating the image (on the local machine) is as simple as:
mksquashfs qt5/ qt5.img
Then we just have to upload it to the server and mount it with:
squashfuse -o allow_other qt5.img ~/public_html/qt5
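Updating a project then comes down to unmounting the old image and mounting the new one. Below is a rough sketch of how such a swap could look; it is not our actual deployment script. The helper names `run` and `remount_image` and the `DRY_RUN` switch are made up for illustration, and `fusermount -u` is the usual way to unmount a FUSE mount without root.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: replace a mounted Squashfs image with a new one.
remount_image() {
    new_image=$1    # freshly uploaded image (assumed name, e.g. qt5.img.new)
    image=$2        # image that is currently mounted, e.g. qt5.img
    mountpoint=$3   # e.g. ~/public_html/qt5

    run fusermount -u "$mountpoint" &&   # unmount the old FUSE mount
    run mv "$new_image" "$image" &&      # put the new image in place
    run squashfuse -o allow_other "$image" "$mountpoint"
}
```

Calling `DRY_RUN=1 remount_image qt5.img.new qt5.img ~/public_html/qt5` prints the three commands without touching anything. Note that the directory is briefly empty between the unmount and the new mount.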
A size comparison of the /qt5 tree:
|Format|Size|Relative size|
|Original ext4 image|~5 GB|█████████████████████████|
|ZIP file|~470 MB|██|
|Squashfs image|~280 MB|█|
Yes, that is a compression factor of roughly 18 for Squashfs!
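The factor follows directly from the approximate sizes in the table. As a quick check, assuming 5 GB is treated as 5000 MB (the helper name `ratio` is ours):

```shell
# Print the compression factor for an original and a compressed size (in MB).
ratio() {
    awk -v orig="$1" -v comp="$2" 'BEGIN { printf "%.1f\n", orig / comp }'
}

ratio 5000 280   # Squashfs: prints 17.9
ratio 5000 470   # ZIP:      prints 10.6
```

Since the sizes are rounded, the result is only a ballpark figure, but it shows Squashfs coming out well ahead of the ZIP file.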
Regarding performance, we have not found any drawbacks yet. Squashfs is possibly even faster, since less data needs to be read from the slow hard drive, which would make the slowdown caused by decompression irrelevant.
If you want to look at the implementation of the Squashfs Linux driver, you can browse it in our code browser.