
Ire_and_curses's suggestion of using tar c <dir> has some issues:

- tar processes directory entries in the order in which they are stored in the filesystem, and there is no way to change this order. This can yield completely different results if you have the "same" directory in different places, and I know of no way to fix this (tar cannot "sort" its input files in a particular order).
- I usually care about whether the group ID and owner ID numbers are the same, not necessarily whether the string representations of group and owner are the same. This is in line with what, for example, rsync -a --delete does: it synchronizes virtually everything (minus xattrs and ACLs), but it syncs owner and group based on their ID, not on their string representation. So if you synced to a different system that doesn't necessarily have the same users/groups, you should add the --numeric-owner flag to tar.
- tar will include the filename of the directory you're checking itself; just something to be aware of.

As long as there is no fix for the first problem (or unless you're sure it does not affect you), I would not use this approach.

The proposed find-based solutions are also no good because they only include files, not directories, which becomes an issue if the checksum should take empty directories into account.

Finally, most suggested solutions don't sort consistently, because the collation might be different across systems.

This is the solution I came up with:

    dir=<mydir>; (find "$dir" -type f -exec md5sum {} +; find "$dir" -type d) | LC_ALL=C sort | md5sum

The LC_ALL=C is to ensure a reliable sorting order across systems.
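
For repeated use, the find-and-sort pipeline could be wrapped in a small shell function. This is only a sketch. The function name dir_checksum is hypothetical, and the cd into the directory is my own variation rather than part of the one-liner: it makes find print paths relative to the directory, so two copies stored in different locations produce the same hash. It assumes GNU md5sum and a find that supports -exec ... {} +.

```shell
#!/bin/sh
# dir_checksum DIR
# Hypothetical helper: prints a single MD5 over the contents of DIR.
# Covers file contents, file names, and directory names (including
# empty directories). Ownership and permissions are NOT covered.
dir_checksum() {
    (
        # Assumption: hash relative paths, so the location of DIR
        # itself does not influence the result.
        cd "$1" || exit 1
        {
            # One "hash  ./path" line per regular file.
            find . -type f -exec md5sum {} +
            # One "./path" line per directory, so empty ones count too.
            find . -type d
        } | LC_ALL=C sort | md5sum   # byte-wise sort, independent of locale
    )
}
```

Calling dir_checksum on the original and on the synced copy and comparing the two output lines then tells you whether file contents and directory layout match.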
