<html> <head> <!-- Support idiotic mobile browsers that are incapable of rendering straightforward HTML properly --> <meta name="viewport" content="width=device-width, initial-scale=1"> <title>Keeping filesystem images sparse</title> </head> <body> <h2>Keeping filesystem images sparse</h2> <p> Filesystem images in local files can be used by many PC emulators and virtual machines (<a href="http://user-mode-linux.sourceforge.net/">user-mode Linux</a>, <a href="http://fabrice.bellard.free.fr/qemu/">QEMU</a> and <a href="http://wiki.xensource.com/xenwiki/">Xen</a>, to name but three). Typically these filesystems are created as sparse files using commands like: <p> <pre> dd if=/dev/zero of=fs.image bs=1024 seek=2000000 count=0 /sbin/mke2fs fs.image </pre> where the enormous <code>seek</code> value causes <code>dd</code> to move forward by 2GB before writing nothing at all. This results in the creation of a sparse file which takes disk space only for blocks which are actually used: <p> <pre> $ ls -l fs.image -rw-rw-r-- 1 rmy rmy 2048001024 Apr 18 19:10 fs.image $ du -s fs.image 31692 fs.image </pre> As the filesystem is used, more and more of the non-existent blocks are filled with data and the size of the file on disk grows. Sometimes it would be nice to be able to reclaim unused blocks from a filesystem image. However, deleting files from the image doesn't return the space to the underlying filesystem: even free blocks in the image still consume space. Reclaiming the space can be achieved in two stages: <ul> <li>Fill unused blocks with zeroes <li>Make the file sparse again </ul> <p> One traditional way to zero unused blocks is to create a file that fills all the free space: <p> <pre> dd if=/dev/zero of=junk sync rm junk </pre> <p> The disadvantage of <code>dd</code> in this context is that it destroys any sparseness that exists: free blocks that were originally represented as holes in the image file are replaced with actual blocks containing zeroes. 
Also, filling up a live filesystem is probably a bad idea. <p> As an alternative approach, and as practice in mucking about with ext2 filesystems, I've written a utility which scans the free blocks in an ext2 filesystem and fills any non-zero blocks with zeroes. The source, <a href="zerofree-1.1.1.tgz">zerofree-1.1.1.tgz</a>, is available for download. <p> <ul> <li>A cautious user would run fsck on the filesystem both before and after running zerofree. <li>The filesystem to be processed should be unmounted or mounted read-only. <li>The utility also works on ext3 or ext4 filesystems. <li>Binary packages are available in the standard repositories for Debian and Fedora. <li>The <a href="http://www.sysresccd.org/SystemRescueCd_Homepage">SystemRescueCd</a> live distribution includes zerofree. <li><a href="http://libguestfs.org/">guestfish</a> can run zerofree on many types of virtual machine filesystems. </ul> <p> However, this is only half the story: the zeroed free blocks still consume space in the underlying filesystem, so something must be done to reclaim that space. <p> A common suggestion is to use the sparse file handling capabilities of the GNU <code>cp</code> command to take a copy of the filesystem image with <code>cp --sparse=always</code> (though this does require the original and sparse files to exist at the same time, which may be inconvenient). <p> If your kernel and util-linux are sufficiently modern and you have a supported filesystem, you can use <code>fallocate -d</code> to 'dig holes' in a file. This makes the file sparse in-place, without using extra disk space. <p> <hr> <address> <a href="mailto:rmy@pobox.com">Ron Yorston</a><br> 18th April 2004 (updated 19th February 2018)<br> Some <a href="obsolete.html">obsolete</a> information has been moved to a separate page. </address> </body> </html>