You are correct in your assumption that while all directory entries are deleted immediately after calling unlink(), the actual blocks that physically make up the file are only cleared on disk when nothing is using the inode anymore. (I say "directory entries" because in vfat, a file can actually have several of those, because of how vfat's long file name support is implemented.)
In this context, by inode, I mean the structure in memory that the Linux kernel uses for handling files. It is used even when the filesystem is not "inode based". In the case of vfat, the inode is simply backed by some blocks on disk.
Taking a look at the Linux kernel source code, we see that vfat_unlink, which implements the unlink() system call for vfat, does roughly the following (extremely simplified for illustration):
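    static int vfat_unlink(struct inode *dir, struct dentry *dentry)
    {
        struct inode *inode = d_inode(dentry);
        struct fat_slot_info sinfo;

        /* ... sinfo is filled in by a directory lookup (omitted) ... */
        fat_remove_entries(dir, &sinfo);  /* drop the directory entries */
        clear_nlink(inode);               /* link count becomes 0 */
        return 0;
    }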
So what happens is:
- fat_remove_entries simply removes the entry for the file in its directory.
- clear_nlink sets the link count for the inode to 0, which means that no file (i.e. no directory entry) points to this inode anymore.
Note that at this point, neither the inode nor its physical representation are touched in any way (except for the decreased link count), so it still happily exists in memory and on disk, as if nothing happened!
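You can observe this state from user space as well. Below is a minimal sketch, not taken from the kernel: the helper name unlink_while_open and its parameters are made up for illustration. It creates a file, unlinks it while the descriptor is still open, and then reads the data back through that descriptor. On Linux I would expect fstat() to report a link count of 0 afterwards, but treat that as an assumption to verify on your own filesystem.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `msg` to a new file at `path`, unlink the
 * file while it is still open, then read the data back via the open fd.
 * Stores the bytes read in `out` (NUL-terminated) and returns the link
 * count reported by fstat() after the unlink, or -1 on error.
 */
long unlink_while_open(const char *path, const char *msg,
                       char *out, size_t outsz)
{
    struct stat st;
    size_t len = strlen(msg);
    ssize_t n;
    int fd;

    if (outsz == 0)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len)
        goto fail;
    if (unlink(path) != 0)          /* the name is gone from here on */
        goto fail;
    if (fstat(fd, &st) != 0)        /* the inode is still alive */
        goto fail;
    if (lseek(fd, 0, SEEK_SET) != 0)
        goto fail;
    n = read(fd, out, outsz - 1);   /* the data blocks are still there */
    if (n < 0)
        goto fail;
    out[n] = '\0';
    close(fd);                      /* last user gone: eviction may follow */
    return (long)st.st_nlink;

fail:
    close(fd);
    unlink(path);                   /* best-effort cleanup */
    return -1;
}
```

Calling this with a path on the vfat mount and a short message should hand the message back in `out` even though the path no longer exists once the call returns.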
(By the way, it's also interesting to note that vfat_unlink always sets the link count to 0 instead of just decrementing it using drop_nlink. This should tip you off that FAT filesystems do not support hard links! It is also a further indication that FAT itself does not know of any separate inode concept.)
Now let's take a look at what happens when the inode is evicted. evict_inode is called when we do not want the inode in memory anymore. At the earliest, this can of course only happen once no process holds an open file descriptor for that inode anymore (though in theory it may also happen later). The FAT implementation of evict_inode looks (again, simplified) like this:
    static void fat_evict_inode(struct inode *inode)
    {
        truncate_inode_pages(&inode->i_data, 0);  /* drop cached pages */
        if (!inode->i_nlink) {                    /* no directory entry left */
            inode->i_size = 0;
            fat_truncate_blocks(inode, 0);        /* free the blocks on disk */
        }
        invalidate_inode_buffers(inode);
        clear_inode(inode);
    }
The magic happens exactly within the if-clause: if the inode's link count is 0, no directory entry points to it anymore. So we set its size to 0 and truncate it down to 0 bytes, which is what really deletes it from disk by freeing the blocks it was made of.
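To make the two-step lifecycle concrete, here is a toy user-space model of it. This is only a sketch under my own assumptions: struct toy_fs and all toy_* names are hypothetical, the "FAT" is a small array, and there is exactly one file. It mirrors the logic above (unlink removes the name and zeroes the link count, eviction frees the cluster chain) but none of the kernel's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CLUSTERS 16
#define TOY_FREE 0       /* FAT entry value: cluster is free */
#define TOY_EOC  0xFFFF  /* FAT entry value: end of cluster chain */

/* Hypothetical single-file model, not the kernel's structures. */
struct toy_fs {
    unsigned short fat[TOY_CLUSTERS]; /* next cluster, TOY_EOC or TOY_FREE */
    bool dirent_present;              /* is there a directory entry? */
    unsigned short start;             /* first cluster of the file */
    unsigned nlink;                   /* in-memory link count */
    unsigned open_count;              /* open file descriptors */
};

/* Create a file occupying clusters start .. start + n - 1 (start >= 2). */
void toy_create(struct toy_fs *fs, unsigned short start, unsigned short n)
{
    memset(fs, 0, sizeof *fs);
    for (unsigned short i = 0; i < n; i++)
        fs->fat[start + i] = (i == n - 1) ? TOY_EOC : start + i + 1;
    fs->dirent_present = true;
    fs->start = start;
    fs->nlink = 1;
}

/* Like fat_evict_inode: free the chain only if the link count is 0. */
void toy_evict(struct toy_fs *fs)
{
    unsigned short c = fs->start;

    if (fs->nlink != 0)
        return;
    while (c >= 2 && c < TOY_CLUSTERS) {
        unsigned short next = fs->fat[c];
        fs->fat[c] = TOY_FREE;
        c = next;
    }
    fs->start = TOY_FREE;
}

/* Like vfat_unlink: remove the name, zero the link count. */
void toy_unlink(struct toy_fs *fs)
{
    fs->dirent_present = false;
    fs->nlink = 0;
    if (fs->open_count == 0)
        toy_evict(fs);      /* nobody is using the inode: evict now */
}

void toy_open(struct toy_fs *fs)
{
    fs->open_count++;
}

/* Closing the last descriptor lets the inode be evicted. */
void toy_close(struct toy_fs *fs)
{
    if (fs->open_count > 0 && --fs->open_count == 0)
        toy_evict(fs);
}

/* Number of clusters still marked as used in the toy FAT. */
size_t toy_used_clusters(const struct toy_fs *fs)
{
    size_t used = 0;

    for (size_t c = 2; c < TOY_CLUSTERS; c++)
        if (fs->fat[c] != TOY_FREE)
            used++;
    return used;
}
```

If you create a three-cluster file, open it, and unlink it, toy_used_clusters still reports three clusters with no directory entry left. Only the final toy_close brings it down to zero. The model evicts immediately on the last close, which the real kernel is not obliged to do.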
So, the corruption you are experiencing in your experiments is easily explained: just as you suspected, the directory entry has already been removed (by vfat_unlink), but because the inode wasn't evicted yet, the actual blocks were still untouched, and were still marked in the FAT (an acronym for File Allocation Table) as used. fsck.vfat, however, detects that there is no directory entry pointing to those blocks anymore, complains, and repairs it.
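The kind of check involved can be sketched in a few lines. This is not fsck.vfat's actual code, just an illustration under my own assumptions: the function name count_lost_clusters, the tiny fixed-size FAT, and the flat list of start clusters standing in for the directory tree are all hypothetical. The idea is to follow the cluster chain of every directory entry, mark what is reachable, and then count clusters that the FAT marks as used but that nothing reaches.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_CLUSTERS 16
#define FAT_FREE 0       /* FAT entry value: cluster is free */
#define FAT_EOC  0xFFFF  /* FAT entry value: end of cluster chain */

/*
 * Hypothetical checker: `fat[c]` holds the next cluster of the chain,
 * FAT_EOC, or FAT_FREE; `starts` lists the first cluster of every file
 * that still has a directory entry. Returns the number of clusters that
 * are allocated in the FAT but not reachable from any directory entry.
 */
size_t count_lost_clusters(const unsigned short fat[N_CLUSTERS],
                           const unsigned short *starts, size_t n_starts)
{
    bool reachable[N_CLUSTERS] = { false };
    size_t lost = 0;

    /* Pass 1: walk each file's chain and mark its clusters. */
    for (size_t i = 0; i < n_starts; i++) {
        unsigned short c = starts[i];

        /* The reachable[] test also stops us on a looping chain. */
        while (c >= 2 && c < N_CLUSTERS && !reachable[c]) {
            reachable[c] = true;
            c = fat[c];
        }
    }

    /* Pass 2: anything used but unmarked is a lost cluster. */
    for (size_t c = 2; c < N_CLUSTERS; c++)
        if (fat[c] != FAT_FREE && !reachable[c])
            lost++;
    return lost;
}
```

With two files in the FAT and both start clusters passed in, the result is 0. Leave one start cluster out, as if its directory entry had been removed while the file was still open, and that file's clusters are reported as lost.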
By the way, CHKDSK would not just clear those blocks by marking them as free, but create new files in the root directory pointing to the first block in each chain, with names like FILE0001.CHK.