regregex.bbcmicro.net

[ Disclaimer | Changelog | Home ]

README for CRC RevEng v0.60

CRC RevEng, an arbitrary-precision CRC calculator and algorithm finder
Copyright (C) 2010, 2011, 2012 Gregory Cook

This file is part of CRC RevEng.

CRC RevEng is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRC RevEng is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRC RevEng. If not, see <http://www.gnu.org/licenses/>.

CRC REVENG

CRC RevEng is an arbitrary-precision, machine word length-independent, byte order-independent CRC calculator and algorithm finder in ANSI C. It is a port and expansion of the author's crcbfs.pl script from 2007, and runs up to 200 times faster on equivalent problems. It is also a reference implementation of the author's "Catalogue of parametrised CRC algorithms", with the 61 currently listed models available as presets.

To the author's knowledge CRC RevEng is the first published compiled application to address the general case of CRC algorithm reversal and reverse engineering, its predecessor crcbfs.pl being the first published application of any type to do so. Greg Ewing of Canterbury University in New Zealand solved a CRC algorithm manually on similar principles in 2010, but partly by feeding chosen plaintexts into an implementation at hand.

COMPILING

Compiling CRC RevEng is straightforward: in the i386 GNU/Linux and MinGW environments, simply `cd' to the directory containing the source files, and enter

        make

Otherwise, enter commands similar to the following to compile CRC RevEng on any ANSI C compliant compiler:

        gcc -O3 -Wall -ansi -c bmpbit.c cli.c model.c poly.c reveng.c
        gcc -o reveng bmpbit.o cli.o model.o poly.o reveng.o

If your environment does not include a getopt library, an ANSI C implementation is available under the GPL in several places, for instance the ASPEX project:

        <http://aspex.sourceforge.net/>

The platform-independent profile does not include the preset models. To compile them, you will need to edit the configuration options in config.h to suit your architecture. Having done so, define the macro PRESETS in config.h and recompile as above.

Some enterprise users may wish to disable the -F switch to minimise CPU usage. To do this, define the macro NOFORCE in config.h or on the command line.

SPECIFYING A MODEL

You can use one of the preset models or specify your own.

        reveng -m crc-32

selects the CRC-32 algorithm used in PKZIP and elsewhere. You can dump any preset model as a Williams model record using -d:

        reveng -m crc-32 -d

           Name   : "CRC-32"
           Width  : 32
           Poly   : 04C11DB7
           Init   : FFFFFFFF
           RefIn  : True
           RefOut : True
           XorOut : FFFFFFFF
           Check  : CBF43926

You can specify the parameters of the model instead:

        reveng -w 32 -p 04c11db7 -i ffffffff -l -x ffffffff

This is equivalent to

        reveng -m crc-32

except the model will have no name when dumped. (The -l switch sets both RefIn = True and RefOut = True. To set RefOut separately, use switches -L and -B.)

The options and switches for specifying a model are:

-b
Big-endian input and output (RefIn = False, RefOut = False). Implies -B and -r. This is the default.
-B
Big-endian output (RefOut = False). Implies -r.
-i INIT
The Init value, that is, the initial value to set in the shift register before computing the CRC.
The width must be specified with -k, -m or -w before -i.
-k KPOLY
The generator polynomial (see -p), written in the reversed-reciprocal notation found in Philip Koopman's papers. The highest-order term is included and the lowest-order term is omitted, shifting the other terms to the right. Thus 0x18005 is specified as C002. This automatically provides the Width value.
-l
Little-endian input and output (RefIn = True, RefOut = True). Implies -L and -t.
-L
Little-endian output (RefOut = True). Implies -t.
-M
Specifies that the algorithm does not multiply the message polynomial by xn before division. The resulting algorithm does not comply with the Williams model.
-m MODEL
Set the Width, Poly, Init, RefIn, RefOut and XorOut values to the preset named MODEL. MODEL is matched case-insensitively with the Name and Alias fields in the author's "Catalogue of parametrised CRC algorithms", as of the date of this release. The Name fields can be listed with -D.
-p POLY
The Poly value, that is, the generator polynomial that sets the feedback tap positions of the shift register. Poly is written in the direct notation found in MSB-first code. The highest-order term is omitted, thus 0x18005 is specified as 8005.
The width must be specified with -k, -m or -w before -p.
-P RPOLY
The generator polynomial (see -p), written in the reversed notation found in LSB-first code. The highest-order term is omitted before reversal, thus 0x18005 is specified as A001.
The width must be specified with -k, -m or -w before -P.
-V
Reverse the current model. If RefOut = True, the Init value is reversed otherwise the XorOut value is reversed. Then the Init and XorOut values are swapped, the generator polynomial is reciprocated and the RefIn and RefOut values are negated.
This switch is distinct from -v, see below.
-w WIDTH
The width, that is, the number of bits in the shift register.
-x XOROUT
The XorOut value, that is, the value to be added to the final shift register value before output.
The width must be specified with -k, -m or -w before -x.

Other model-related options:

-d
Dump a Williams model record of the specified model on standard output. Disabled for the non-compliant models created by -M.
-D
List the names of all preset models on standard output.

INPUT AND OUTPUT

Messages for CRC RevEng to process can be specified as files or as numerical (typically hexadecimal) string arguments on the command line. Output from CRC RevEng is either as Williams model records (having their own fixed format) or as numerical string arguments printed one per line on standard output.

When passing numerical arguments on the command line, each argument is conceptually divided into characters, each character consisting of one or more hexadecimal digits. For each character, enough hex digits to specify it are read then a number of bits (specified by the -a option) are taken from the least significant end, reversed (if RefIn = True) and appended to the binary representation of the argument; any excess bits are discarded. The -a option (q.v.) permits a number of useful representations of a given underlying binary sequence.

When passing messages as files, the same division into characters applies; bytes of the file are read until enough bits have been collected, then a specified number of bits are taken from the least significant end, reflected (if RefIn = True) and added to the binary representation.

When printing output, the binary representation is again divided into characters, each of which is reversed (if RefOut = True) and printed with the minimum sufficient number of hex digits.

INPUT/OUTPUT OPTIONS

There are a few more options for controlling the presentation of input and output:

-a BITS
Specifies the number of bits per character in input and output. Implies -A BITS. Particular values of BITS cause the input to be interpreted as a sequence of:
-a 16
16-bit words, raw (in files) or as quartets of hex digits. When RefIn or RefOut are True, this is equivalent to swapping the bytes of each pair before input to an 8-bit calculator.
-a 8
8-bit characters, raw (in files) or as pairs of hex digits (in arguments). This is the default.
-a 7
7-bit ASCII characters, one per 8-bit byte (in files) or as pairs of hex digits (in arguments).
-a 4
Single hex digits (not recommended for files).
-a 3
Single octal digits 0-7.
-a 1
Single binary digits 0-1.
-A OBITS
Specifies the number of bits per character in output.
-f
Arguments are file names; read input messages from the named files.
-r
Right-justified output. If a binary output message does not consist of a whole number of characters, this switch arranges for padding zeroes to be added to the start of the message. The padding will appear in the MSB of the first character (RefOut = False) or the LSB of the first character (RefOut = True). -r is the default when RefOut = False.
-S
Print spaces between the characters of the output string(s).
-t
Left-justified output. If a binary output message does not consist of a whole number of characters, this switch arranges for padding zeroes to be added to the end of the message. The padding will appear in the LSB of the last character (RefOut = False) or the MSB of the last character (RefOut = True). -t is the default when RefOut = True.
-X
Print uppercase hexadecimal characters.
-y
When reading messages from files and the -a option is more than 8, this option specifies that the first byte of each character contains the LSB.

CALCULATING AND REVERSING CRCs

When a model has been specified, use -c or -v to calculate CRCs for input messages.

-c
Calculate a CRC for each argument and print it on standard output, one per line.
-v
Reverse the current model (as for -V) AND the order of characters in each argument, calculate a CRC for each reversed argument, reverse the order of characters in each CRC and print it on standard output, one per line.

If -V and -v are given together, their respective model reversals cancel out. CRC RevEng then calculates an ordinary CRC for each argument, processing the characters from right to left.
Correspondingly, to obtain the same effect as -v using a model reversed by -V, the user must present the characters of his or her message, and process those of the returned CRC, in reverse order.

Take care when the CRC width (-w) is not a multiple of the character width (-a). If the result of a calculation is not what you expect, try selecting left-justification (with -t) or right-justification (with -r).

The -c mode is, of course, useful for creating a checksum to append to a message so that the combination will pass a particular CRC check. The -v mode, on the other hand, is useful for editing a message so as to force its checksum to a desired or at least predetermined value. In order to do this there must be some part of the message's data that can be modified freely without observable effect; many network protocols and file formats, including executables, images and word processor documents, have (or can be altered to have) reserved fields or comment sections that cannot be easily viewed, and whose contents are entirely ignored.

Among the simplest ways to control a CRC calculation is to find one such unused space that is both contiguous and large enough to hold a checksum. For example, suppose we have an existing message with an X.25 CRC:

  0: 44 6F 67 73 2F 2A 12 34  2A 2F 72 6F 63 6B 4E 47 | Dogs/*.4*/rockNG

Here, 4E 47 is the X.25 checksum, and we wish to alter the message without either changing the checksum or failing the CRC. We notice that the 7th and 8th bytes can be replaced at will, and these can contain a calculated value to force the CRC. Firstly we change the text as we wish:

  0: 43 61 74 73 2F 2A 12 34  2A 2F 72 75 6C 65 4E 47 | Cats/*.4*/ruleNG

Calculate the CRC of the part on the left of the unused space with XorOut = 0:

        reveng -m x-25 -x 0 -c 436174732f2a
        9dc5

Then reverse-calculate the CRC of the part on the right, including the old CRC, with Init = 0:

        reveng -m x-25 -i 0 -v 2a2f72756c654e47
        1505

Now exclusive-OR the two returned CRCs together, and insert the result in the unused space. CRC RevEng can be used to do the exclusive-OR if a hex calculator is not to hand:

        reveng -w 16 -p 0001 -c 9dc51505
        88c0

Our edited message now looks like this:

  0: 43 61 74 73 2F 2A 88 C0  2A 2F 72 75 6C 65 4E 47 | Cats/*.A*/ruleNG

To confirm that it still passes the X.25 CRC:

        reveng -m x-25 -c 436174732f2a88c02a2f72756c65
        4e47

* * *

In Stigge et al. (section 4.1) a polynomial q(x) is calculated as the multiplicative inverse of XN such that (xN) q(x) = 1 (mod pCRC(x)). The authors promote the extended Euclidean algorithm as a means of calculating q(x), however any CRC calculator can also produce it. The reciprocal of the CRC-32 polynomial is 0xdb710641, as output by:

        reveng -w 32 -p 04c11db7 -V -d

The authors' constant CRCINV, a reflected representation of q(x), is the CRC of the reversal of the desired remainder, 0x00000001:

        reveng -w 32 -p db710641 -c 80000000
        5b358fd3

Equivalently, the -v function returns q(x) in direct order from the unreflected parameters:

        reveng -w 32 -p 04c11db7 -v 00000001
        cbf1acda

SEARCHING FOR CRC MODELS

The most important feature of CRC RevEng is the ability to recover the parameters of a CRC algorithm from a handful of codewords created by that algorithm. In general at least four data points are needed, either as known parameters or as message-CRC pairs. Extra data points help to eliminate false results and to confirm models that are found.

Known parameters are specified using -w, -p, -i and -x (see SPECIFYING A MODEL above). The width, -w, is a required parameter for all searches and counts as one of the data points. The size of characters (words) in the protocol must also be known and set with -a if this is not 8 bits.

The search function is selected with -s, and message-CRC pairs are given as arguments, each message and CRC combined into one argument. There must not be any non-participating characters between each message and its CRC, or the search will not work. Typically, end-of-message markers do not participate in the CRC.

As non-standard algorithms are comparatively rare, the program first tries all the preset models of the given width, reporting and exiting if one is found. Otherwise it will commence a full search.

If -b or -l are specified, CRC RevEng only searches for algorithms with that bit ordering. Otherwise, it tries RefIn = False, RefOut = False then RefIn = True, RefOut = True. Crossed-endian algorithms are also uncommon and the program will only search for them if requested (with -L).

To find the Poly value when Init is not known, at least two arguments must have the same length.

To find both the Init and XorOut values, at least two arguments must have different lengths; otherwise there is only enough information to determine one value, given the other. If all arguments have the same length then, by default, CRC RevEng fixes XorOut at zero and calculates Init accordingly. (In hardware it is easier to set a non-zero Init than to apply a non-zero XorOut.) To set XorOut to another value, specify -x XOROUT; to fix Init and calculate XorOut instead, use -i INIT.

The full list of search options is as follows:

-F
Do not check against the preset models before searching. (Not recommended.)
-s
Search for CRC models using the given model parameters and arguments as hints.

OTHER FEATURES

CRC RevEng provides a few additional options for convenience:

-e
Echo arguments to standard output. Useful to check that files are being read correctly and, together with -a, -A, -b, -B, -l, -L, -r, -s, -t and -y, to reformat argument strings.
The Init value is exclusive-ORed with the beginning of each argument, so that when the argument is not a whole number of bytes long, an equivalent string can be produced for input to a bytewise calculator (which has Init set to 0).
-h
-u
-?
Print a summary of options and switches to standard error, and exit.

SEARCH EXAMPLES

        reveng -w 16 -l -F -s 31816b 32c16a 31326a0a
        reveng -w 32 -p 04c11db7 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0
        reveng -w 32 -l -s c98964f6b9 a5fa49f2fd 13370aee7df0
                (may take several days)

A comprehensive list is being compiled.

CAVEATS

In addition to the disclaimers listed at the top of this file and in the GNU General Public Licence (see file COPYING), remember that CRC RevEng is merely a search tool, and not authoritative. Searching is only statistical and any particular result may be a fluke, especially from a small number of samples. Also any output is only as accurate as the input.

Model reversal (-V, -v) makes little sense on crossed-endian models.

REFERENCES

Bies, Lammert; et al. "Computer Interfacing Forum" (section "Error detection and correction").

Cook, Greg (20 February 2012). "Catalogue of parametrised CRC algorithms".

Ewing, Gregory C. (March 2010). "Reverse-Engineering a CRC Algorithm". Christchurch: University of Canterbury.

Koopman, Philip (July 2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks: 459-468. doi:10.1109/DSN.2002.1028931.

Koopman, Philip; Chakravarty, Tridib (June 2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". The International Conference on Dependable Systems and Networks: 145-154. doi:10.1109/DSN.2004.1311885.

Stigge, Martin; Ploetz, Henryk; Mueller, Wolf; Redlich, Jens-Peter (May 2006). "Reversing CRC – Theory and Practice". Berlin: Humboldt University Berlin.

Williams, Ross N. (24 September 1996). "A Painless Guide to CRC Error Detection Algorithms V3.00".

TO DO

INSPIRATION

CRC RevEng came about from the coincidence of four events:

AUTHOR

Greg Cook
<[email address]>
<http://regregex.bbcmicro.net/>

-END-

[ Top of page ]

Changelog for CRC RevEng 0.60

CRC RevEng, an arbitrary-precision CRC calculator and algorithm finder
Copyright (C) 2010, 2011, 2012 Gregory Cook

This file is part of CRC RevEng.

CRC RevEng is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CRC RevEng is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CRC RevEng. If not, see <http://www.gnu.org/licenses/>.

Revision history of CRC RevEng

v0.60 2012-02-20

v0.50 2011-09-07

v0.44 2011-08-28

v0.43 2011-08-27

v0.42 2011-05-03

v0.41 2011-04-30

v0.40 2011-02-10

v0.31 2011-02-04

v0.30 2011-01-18

v0.21 2011-01-15

v0.20 2011-01-11

v0.13 2011-01-08

v0.12 2011-01-07

v0.11 2011-01-05

v0.10 2011-01-05

Greg Cook, [email address]
http://regregex.bbcmicro.net/reveng-readme.htm Last updated 2012-02-20

Valid HTML 4.01 StrictHosted by Netnorth

[ Top of page ]