Note that this attack is now the basis of Scott's paper, "Cryptanalysis of the Mercy block cipher", which was presented to FSE 2001.
While you are busy having fun gathering statistics on RC4 tetragraphs, I've been dinking around with Mercy, and I happened to stumble across a differential that holds with probability 2**-94 (according to my calculations). I'll draw out the bitwise differences. I'm using Fig 2 of your paper as a reference, so it proceeds from ciphertext to plaintext (not that it matters to the attack).
The numbers are the bitwise (xor) differences for each 32 bit word (XXXXXXXX means the difference could be anything), and the top set of numbers is the left-hand side half-block, and the bottom set of numbers is the right-hand side. In other words, for each round, the bottom figures are sent through the Q array, and xored into the top, and then the bottom and top are swapped.
The ellipsis on either side of the half-block refer to the rest of the data in the half-block, and is either 00000000 (no differences) or XXXXXXXX (unknown difference) -- the first/last word in the half-block will tell you which.
The actual differences occur somewhere in the middle of the half-block (exactly where doesn't matter too much, but it shouldn't be too close to either side).
Before/after initial whitening, it looks like:
After round 1, it (naturally) looks like
After round 2, it looks like:
The trick here is that the 128 output coming down out of the Q box after the last delta has a high probability of having zero difference, and so the difference does not propagate to the rest of the block. Continuing,
After round 3, it looks like:
(Same trick as last round)
After round 4, it looks like:
Now, at round 5, it starts to break down:
And, after round 6,
And, after post-whitening, that's what it outputs. If we place the initial difference near the middle of the half-blocks, then the whole front of the left hand half block (which might be 1024 bytes) is absolutely identical between the two resulting plaintexts, and that is not profoundly likely to happen with a truly random function.
This differential assume that the adds in the Q box treat the bit differences exactly like an XOR would -- this actually occurs with probability 0.5, and since I count a total of 94 nontrivial bitpositions added, this results in the differential probability. If the adds were replaced by xors, this should be a probability 1 differential.
One thing I haven't done is taken a copy of Mercy, replaced the adds with xors, and expirementally tried it. Just too lazy at the moment.
Much more efficient differentials exist for 5 rounds (probability 2**-25), and even better for 4 rounds (probability 2**-7). The technique I'm using runs out of gas after 6 rounds -- no similar differential exists at any probability level for 7 rounds.
Another method of generating differentials would involve starting off with nonzero differences in both halves. I haven't investigated this -- you don't get the first round "for free", but it may lead to higher differential probabilities overall.
I've been working a bit more with it, and I've found a much better differential. It holds with probability 2**-30, and it can be used against 7 round Mercy (!). Against standard (6 round) Mercy, you can lop off a round, increasing the probability to 2**-26, or you can keep it, which allows you to isolate individual T box accesses. In addition, I suspect that it can be used as the basis of a boomarang attack against 10 round Mercy with probability slightly better than brute force.
This differential uses the same basic approach. Here is how it looks (using the same syntax as before):
Before/after initial whitening, it looks like:
After round 1, with probability 2**-4,
After round 2, with probability 2**-11,
After round 3, with probability 1,
After round 4, with probability 2**-11,
After round 5, with probability 2**-4
After round 6, with probability 1
After round 7, and after post whitening, with probability 1
Again, this differential works by cleverly arranging things so that the differential of the vertical components within the Q-array is 0 with high probability immeditately after the difference, preventing the differences from avalanching.
I also moved the differences from the least significant bit in each byte to the most significant -- this is a gain because an add always treats a difference in the MSBit exactly as an xor does, and so any add in that bit position does not change the overall probability.
I forgot to mention it, but these differentials are sensitive to the overall Festel structure (Fig 2 of the paper), and the exact Q box and M box definition. It does not care what the T box does, or the spice/key schedule. For the T box, it maintains the inputs for all T boxes until the last two rounds (when it is too late to matter).