Thursday, April 23, 2009

How to create patches with diff or “svn diff” and apply them with patch

Nearly all of this information is available elsewhere on the internet. However, I am going to demonstrate it in the way I wish others had. I am primarily concerned with recursively patching directories (patching a single file is very simple), and that is where I ran into some problems that I didn't see anyone else address. Those solutions are the main reason I chose to write yet another diff/patch how-to.

I like to create demonstrations that you (or I, 6 months from now when I forget all of this) can copy/paste on your own system to prove that it works. (The whole freaking thing, or line-by-line.) I learn best by doing, so I encourage you to "do" these proofs instead of trusting me.

First exercise: Let us create our test environment.

mkdir -p difftest/date
date >> difftest/date/time.txt
find difftest > difftest/files.txt
# Okay, now you have a project folder called difftest

cp -r difftest difftest_new
date >> difftest_new/date/time.txt
mkdir difftest_new/files
mv difftest_new/files.txt difftest_new/files/list.txt
# Now we have made a few modifications to a new revision called difftest_new

# Let's not touch the original project folder. We will make a copy to hack on.
cp -r difftest difftest-to-be-patched

# The following command creates a patch
diff -ruN difftest difftest_new > difftest-to-difftest_new.patch
# That is it. You have a patch file that you can share with anyone without embarrassment.
# Note: It is customary to create the patch from the parent folder of both the old and
# new folders. It is bad form to create the patch from either of those folders
# with ../path to get to the other.

# A customary diff patch can be applied with -p1, which strips the first dir off each path
patch -E -p1 -d difftest-to-be-patched < difftest-to-difftest_new.patch
# The -E is not important here, but it is needed for patches created by svn. I suggest you
# always use it.

# The following command will tell you 2 directories are identical. You just have to trust me.
# We will use this more later.
diff -r difftest_new difftest-to-be-patched
# You want it to return nothing. (It is a diff command, not a same command.)
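If you would rather not take the empty output on trust, you can check the exit status of diff instead. Here is a minimal sketch, assuming a throwaway folder called same-demo (a name I made up for this illustration). diff exits with 0 when it finds no differences, 1 when it finds some, and 2 when something went wrong.

```shell
# Hypothetical scratch folder, used only for this illustration.
rm -rf same-demo && mkdir -p same-demo/a same-demo/b
echo "hello" > same-demo/a/file.txt
cp same-demo/a/file.txt same-demo/b/file.txt

# Exit status 0 means diff found nothing to report.
if diff -r same-demo/a same-demo/b > /dev/null; then
    echo "identical"
else
    echo "different"
fi

# Change one side and look at the exit status again.
echo "changed" >> same-demo/b/file.txt
status=0
diff -r same-demo/a same-demo/b > /dev/null || status=$?
echo "exit status after the change: $status"
```

The same check works in a script after applying a patch, where nobody is around to eyeball the output.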

You need to look up the meanings of the arguments given to the diff and patch commands in their man pages. I hate telling people to RTFM because most manuals are so long that you cannot find what you need. I am telling you what you need and asking you to learn why.
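To see what -p1 is doing, it helps to look at the paths inside the patch and then let patch rehearse the job. This is a small sketch, assuming GNU patch (for the --dry-run option) and a throwaway folder called p-demo with old, new and target trees. Those names are mine and are not part of the exercises.

```shell
# Hypothetical scratch tree, used only for this illustration.
rm -rf p-demo && mkdir -p p-demo/old/date p-demo/new/date p-demo/target/date
echo "first" > p-demo/old/date/time.txt
cp p-demo/old/date/time.txt p-demo/target/date/time.txt
printf 'first\nsecond\n' > p-demo/new/date/time.txt

# Create the patch from the common parent, so paths start with old/ and new/.
# diff exits 1 when it finds differences, which is expected here.
(cd p-demo && diff -ruN old new > old-to-new.patch) || true

# The header paths carry one leading directory (old/ or new/)...
grep -E '^(---|\+\+\+) ' p-demo/old-to-new.patch

# ...so -p1 drops it and patch looks for date/time.txt inside target/.
# --dry-run (GNU patch) reports what would happen without changing any file.
patch --dry-run -E -p1 -d p-demo/target < p-demo/old-to-new.patch

# Apply for real, then confirm the target now matches the new tree.
patch -E -p1 -d p-demo/target < p-demo/old-to-new.patch
diff -r p-demo/new p-demo/target && echo "target matches new"
```

With -p0 the same patch would look for old/date/time.txt inside target, which does not exist there.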

The mysterious argument is "N" on the diff command. The manual says, "Treat absent files as empty." That meant nothing to me at first, but I learned that I need it. Without it, diff only reports that a file exists in one tree and not the other; with it, the full addition or removal goes into the patch. That is what lets the patch handle the files you move/remove/rename in your revision. That's important and hard to troubleshoot.
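Here is a quick way to see the difference for yourself. It is only a sketch, assuming a throwaway folder called n-demo (my name, not part of the exercises) where one file exists in the old tree and is gone from the new one.

```shell
# Hypothetical scratch tree, used only for this illustration.
rm -rf n-demo && mkdir -p n-demo/old n-demo/new
echo "hello" > n-demo/old/files.txt    # exists only in the old tree

# Without -N, diff just notes that the file is missing from one side.
# diff exits 1 when it finds differences, which is expected here.
(cd n-demo && diff -ru old new > without-N.patch) || true

# With -N, the missing file is treated as empty, so its removal becomes a hunk.
(cd n-demo && diff -ruN old new > with-N.patch) || true

echo "--- without -N ---"
cat n-demo/without-N.patch
echo "--- with -N ---"
cat n-demo/with-N.patch
```

Only the second file contains a hunk that patch can act on, which is why the removal would be silently skipped if you left off the N.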

Second exercise: Let's put our test into an svn repo and create the patch from there.

# These steps assume you have completed the steps above.
svnadmin create repo
svn mkdir file://$PWD/repo/tags -m 'initial setup'
svn import -m "initial import" difftest file://$PWD/repo/trunk
# Now we have our original project in a new local repo

svn copy file://$PWD/repo/trunk file://$PWD/repo/tags/release1 -m "deployed $(date)"
# Now we have it tagged

# We will make the same changes to trunk that we made in the first exercise.
svn checkout file://$PWD/repo/trunk
date >> trunk/date/time.txt
mkdir trunk/files
mv trunk/files.txt trunk/files/list.txt
svn add trunk/files
svn remove trunk/files.txt
svn commit trunk -m "new"
# Changes are made and committed

svn copy file://$PWD/repo/trunk file://$PWD/repo/tags/release2 -m "deployed $(date)"
# Now we have this one tagged

# Just to prove that you do not need to have a local copy of the code to do this...
rm -rf trunk

# You can create a patch that would bring release1 up to release2 like so...
svn diff file://$PWD/repo/tags/release1 file://$PWD/repo/tags/release2 > release1-to-release2.patch

# So, if your client has release1... created like so...
svn export file://$PWD/repo/tags/release1

# You can tell them to apply the patch like so...
patch -d release1 -p0 -E < release1-to-release2.patch

# I can prove it to you like so...
svn export file://$PWD/repo/tags/release2
diff -r release1 release2
# Again, you have to trust that diff works.

# Finally, if you do not have tags to work with you can do a similar thing with revision numbers
svn diff -r2:4 file://$PWD/repo/trunk > revision2-to-revision4.patch
# That patch is functionally identical to release1-to-release2.patch
# To discover which -r range to use, examine: svn log file://$PWD/repo/trunk

The argument "-E" is important when applying patches created from svn. (There is a subtle difference between patches created with diff and svn diff.) If you leave off the -E, you will notice that files you remove (and that includes the original of any file you rename or move) will still exist after applying the patch, but they will have a size of 0 bytes. This is a booger to troubleshoot, and none of the other how-tos are going to help you much.
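You can watch this happen without a repository. The sketch below writes a small patch by hand in roughly the style svn diff uses; the Index line and the revision labels are my own stand-ins, not real svn output, and e-demo is a made-up folder name. The exact behavior without -E may depend on your version of patch, so look at the listing rather than taking my word for it.

```shell
# Hypothetical scratch folders, used only for this illustration.
rm -rf e-demo && mkdir -p e-demo/without-E e-demo/with-E
for d in e-demo/without-E e-demo/with-E; do
    echo "hello" > "$d/files.txt"   # the file the patch will empty out
    echo "keep"  > "$d/keep.txt"    # an untouched file, so the folder is never empty
done

# Hand-written patch imitating svn-style headers (assumed labels, no timestamps).
# The only hunk deletes every line of files.txt.
{
    printf '%s\n' 'Index: files.txt'
    printf '%s\n' '==================================================================='
    printf '%s\t%s\n' '--- files.txt' '(revision 2)'
    printf '%s\t%s\n' '+++ files.txt' '(revision 4)'
    printf '%s\n' '@@ -1 +0,0 @@'
    printf '%s\n' '-hello'
} > e-demo/remove-file.patch

# Apply once without -E and once with it.
patch -p0 -d e-demo/without-E < e-demo/remove-file.patch
patch -p0 -E -d e-demo/with-E < e-demo/remove-file.patch

# Compare the two folders: look for a 0 byte files.txt in the first one.
ls -l e-demo/without-E e-demo/with-E
```

In the with-E folder, files.txt should be gone entirely, which is what you want the recipient of your patch to end up with.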

So, why would you do this? Why not just send the complete revision to your client, production server, etc.? In many cases your project's code can be hundreds of megabytes. These patches are usually small enough to email. Even if you only modified a single file, following the customary form will save the recipient from having to locate the file that needs to be replaced/patched.

If that doesn't convince you, check out "Transferring large PSD files quickly using Diff-Patch".