rsync does not detect when local file changes content without changing size
Hey all,
I work on a decent-sized team and we use gsutil across a wide number of projects to support our devops and deploy tasks. If the -m flag is left off, there will be a warning when using commands like gsutil rsync. This leads a reasonable developer to do the right thing and update any scripts which shell out to gsutil to use the -m flag.
A problem with this scenario, which has bitten us numerous times, is that the content matching causes edits like single-line config changes to be treated as if there were no change at all. The gsutil tool will treat locally updated files as if they were the same as the outdated content in the storage buckets. Sometimes we catch this early, but often it takes quite a bit of debugging and head scratching to trace the unexpected behavior back to the -m flag.
Sometimes the flag is removed from our code, only to be added again later by a different, well-meaning developer.
I don’t have a specific solution to this, but we have been bitten by it numerous times. I think that, at the very least, the warning/suggestion to use the flag should clearly mention this undesired side effect and which other flags should be added to work around it.
Issue Analytics
- Created 8 years ago
- Comments:6 (4 by maintainers)
Top GitHub Comments
This behavior doesn’t have anything to do with the presence of the -m option. Instead, it is related to the -c option; see the “CHANGE DETECTION ALGORITHM” section of gsutil help rsync.
By default, gsutil does not perform a full checksum of local files, but checks only their sizes. If you expect that files will change contents without changing size, you’ll need to use the -c option to trigger checksumming that will detect these changes. The downside is that this generates a lot of local disk I/O, since every local file within the scope of the rsync must be read. This is why the option is turned off by default.
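For scripts that shell out to gsutil, one way to keep the flags from being dropped or re-added by accident is to build the command in a single place. The following is only a sketch: the helper names build_rsync_command and sync, the paths, and the bucket name are hypothetical, and the use of -r (recursive) is an assumption about the typical deploy case. The -m, -c, and -r options themselves are the standard gsutil ones.

```python
import subprocess
from typing import List


def build_rsync_command(src: str, dst: str,
                        parallel: bool = True,
                        checksum: bool = True) -> List[str]:
    """Assemble a gsutil rsync command line (hypothetical helper)."""
    cmd = ["gsutil"]
    if parallel:
        cmd.append("-m")   # top-level option: run operations in parallel
    cmd.append("rsync")
    if checksum:
        cmd.append("-c")   # rsync option: compare checksums, not only sizes
    cmd.append("-r")       # rsync option: recurse into subdirectories
    cmd.extend([src, dst])
    return cmd


def sync(src: str, dst: str) -> None:
    """Run the sync; needs gsutil on PATH and raises on a non-zero exit."""
    subprocess.run(build_rsync_command(src, dst), check=True)


if __name__ == "__main__":
    # Placeholder source and destination, shown without executing anything.
    print(" ".join(build_rsync_command("./config", "gs://example-bucket/config")))
```

Keeping -c next to -m in one helper means a later edit that adds or removes -m does not silently change how changes are detected.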
When copying project files, for example JavaScript files, it is very important to compare checksums, because size alone won’t do the job. A simple example: swap the arguments in a function call. This won’t change the size of the file, but it will change the logic of the function, and the change won’t be copied by rsync without the -c flag. I spent some time figuring this one out (the behaviour of rsync)…
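The swapped-arguments case can be reproduced locally. The sketch below is not gsutil's implementation; it only contrasts a size-only comparison with a content-hash comparison on two temporary files. The function names, the JavaScript snippet, and the choice of SHA-256 as the digest are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def size_differs(path_a: str, path_b: str) -> bool:
    """Size-only check, analogous to the default comparison described above."""
    return os.path.getsize(path_a) != os.path.getsize(path_b)


def checksum_differs(path_a: str, path_b: str) -> bool:
    """Content check: hash both files in full and compare the digests."""
    def digest(path: str) -> str:
        with open(path, "rb") as handle:
            return hashlib.sha256(handle.read()).hexdigest()
    return digest(path_a) != digest(path_b)


def demo():
    # Same length, different behaviour: only the argument order changes.
    old_source = "function area(w, h) { return scale(w, h); }\n"
    new_source = "function area(w, h) { return scale(h, w); }\n"
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "old.js")
        new_path = os.path.join(tmp, "new.js")
        with open(old_path, "w", encoding="utf-8") as handle:
            handle.write(old_source)
        with open(new_path, "w", encoding="utf-8") as handle:
            handle.write(new_source)
        return size_differs(old_path, new_path), checksum_differs(old_path, new_path)


if __name__ == "__main__":
    print(demo())  # (False, True): the size check misses the edit, the hash catches it
```

The hash comparison has to read every byte of both files, which is the extra disk I/O cost mentioned for -c.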