BDiff / BPatch Utilities

   
Release: 0.2.8
Date: 19 September 2016
O/S: All Windows versions

About BDiff

The information in this tab is based on the bdiff (1) man page written by Stefan Reuther, updated as necessary.

Synopsis

bdiff [options] old-file new-file [ >patch-file ]

Description

bdiff computes differences between two binary files. Output can be either a somewhat human-readable protocol, or a binary file readable by bpatch. Output is sent to standard output unless the --output option is used to specify an output file.

bdiff handles insertion and deletion of data as well as changed bytes.

Options

-q
Use QUOTED format (default).
-f
Use FILTERED format.
-b
Use BINARY format.
--format=FMT
Select format by name: (binary, filtered, quoted).
-m N
--min-equal=N
Two chunks of data are recognized as being identical if they are at least N bytes long, the default is 24.
-o FILENAME
--output=FILENAME
Specify output file name. Specifying --output=- does nothing. Can be used as an alternative to shell redirection.
-V
--verbose
Print status messages while processing input.
-h
--help
Show help screen and exit.
-v
--version
Show version number and exit.

Algorithm

bdiff tries to locate maximum-length substrings of the new file in the old data. Substrings shorter than N (argument to the -m option) are not considered acceptable matches. Everything covered by such a substring is transmitted as a position, length pair, everything else as literal data.

bdiff uses the block-sort technique to allow O(lgN) searches in the file, giving an estimated O(NlgN) algorithm on average.

The program requires about five times as much memory as the old file, plus storage for the new file. This should be real memory, bdiff accesses all of it very often.

Output formats

Quoted format

The quoted format (default) is similar to diff output in unified format: '+' means added data, and a space means data kept from the old file. Lines prefixed with '@' inform you about the position of the next 'space' line in the source file byte offset).

Unlike in diff, there's no implicit line feed after each line of output. Non-printable characters (see isprint (3)) and the backslash character are represented by a \ character followed by the octal three-digit character code.

Filtered format

The filtered format is like the quoted format, but non-printable characters are replaced by dots (.).

Binary format

The binary format is machine-readable, and omits details for common blocks. All words are in little-endian format (low byte first). The format is:

8 bytes
Signature "bdiff02\x1A", where 02 is kind-of a version number. An earlier version (with an O(n^3) algorithm) used the number 01. \x1A is ASCII 26 (Control-Z, an MS-DOS end-of-file marker).
4 bytes
Length of old file in bytes.
4 bytes
Length of new file in bytes.
n bytes
The patch itself, a sequence of the following records:
literally added data:
1 byte
ASCII 43 ('+')
4 bytes
number of bytes
n bytes
data
common block:
1 byte
ASCII 64 ('@')
4 bytes
file position in old file
4 bytes
number of bytes
4 bytes
checksum

The checksum is computed using the following algorithm:

long checksum(char* data, size_t len)
{
    long l = 0;
    while(len--) {
        l = ((l >> 30) & 3) | (l << 2);
        l ^= *data++;
    }
    return l;
}

(rotate current checksum left by two and xor in the current byte).

Administrativia

This manual page is for version 0.2.6 or later of bdiff.

See the license for details of licensing and copyright.

THIS SOFTWARE IS PROVIDED "AS-IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. IN NO EVENT WILL THE AUTHORS BE HELD LIABLE FOR ANY DAMAGES ARISING FROM THE USE OF THIS SOFTWARE.

See also: