Cyclic Redundancy Check: Mathematical and Hardware Overview
ELECTGON
www.electgon.com
09.10.2020
Contents
1 Introduction
2 Error Detection Techniques
3 CRC
4 Modular Arithmetic
5 CRC Arithmetic
6 CRC Mathematics
7 CRC Parameters
8 Hardware Implementation
9 Serial CRC
10 XOR First Serial CRC
11 External Linear Feedback Shift Register
12 Parallel CRC
12.1 Example
12.2 Input Data Order
12.3 Accumulated CRC Calculation
Abstract
Although the CRC technique is found in almost every digital system on chip, the theory and proof behind this processing step remain unclear to many engineers. This document explains how CRC works and how it can be adapted and used with different parameters to check the accuracy of data. The mathematics may be confusing for some readers, but it is necessary for understanding how a CRC validates the checksum of a train of binary bits. The purpose of this document is not to explore the deep background of the CRC but to explain its theory in an understandable way, to help engineers build or adapt their own CRC according to their system needs.
CRC Overview 1. Introduction
1 Introduction
In digital data processing, data are handled as chunks of bytes, where each byte is 8 bits. Thus any set of data can be considered as a sequence of bytes or bits. Transferring these data between two points therefore requires transferring them in the right sequence. Since bits are considered to be independent and a bit sequence is limited by the size of the data chunks (or frames, packets, datagrams, etc.), the problem we consider is to check whether a finite sequence of bits has been changed unintentionally, e.g. by transmission errors or faulty storage in memories.
2 Error Detection Techniques
To verify whether data were received in the right sequence, we add some redundant information to the bit sequence. This redundancy information is derived from the transmitted data by performing some operations on the bit sequence. The result of these operations is added as redundancy information to the actual data.
The receiver of a message, or the reader of a memory, can perform the same operations on the received data, compare the result against the stored redundancy information and decide whether an error has occurred in the data.
3 CRC
Cyclic Redundancy Check is a process based on division between two binary vectors: the first binary vector is the data (the dividend), which is divided by another fixed binary number (the divisor). The remainder of this division is the check value that we use to validate the accuracy of the data. This operation is mostly used during transmission of data, where it is performed at both the transmitter and the receiver. At the transmitter side, the CRC (which is the remainder of the division) is calculated by dividing the data by the fixed binary number, which is usually called the generator polynomial. After the division, the resulting CRC is appended to the data and transmitted to its destination.
At the receiver side, the receiver either computes the CRC again for the data only (without the appended CRC) or computes the CRC for the whole received message (the original data and its appended CRC). When the CRC is computed for the data only, the receiver compares the computed CRC against the appended CRC. If both are the same, the message has been received without errors. When the CRC is computed for both the data and the appended CRC, the result should be zero. If not, some error has been introduced in the received message. In practice, this last statement is not always correct, because the appended CRC is often modified slightly before transmission: it is ones-complemented, then appended to the data, then transmitted. In this case, calculating the CRC for the received data and its appended CRC gives another fixed binary number called the residual constant. CRC can be explained better in a mathematical way, which shows how it is calculated and why the receiver obtains either zero or a constant value. But before that, we have to introduce some arithmetic basics that are used in CRC computation.
4 Modular Arithmetic
Basic arithmetic is built around addition, subtraction, multiplication and division, but other kinds of arithmetic are derived from it to perform more advanced operations. Logarithms are one example: a logarithm is applied to one operand only, according to the base of the logarithm. Modulo operations are a similar set of operations: they are applied to one operand only, and the base of the modulo operation defines how the operation is performed.
A modulo operation finds the remainder of the division of the operand by the base of the modulo. For example, modulo-two of 5 is the remainder when dividing 5 by 2 (5 is the operand, 2 is the base of the modulo operation), so the result is 1. Modulo-three of 8 is 2, since dividing 8 by 3 gives remainder 2. Modulo-five of 15 is 0, since dividing 15 by 5 gives no remainder.
In modular arithmetic we also perform addition, subtraction and multiplication. Modular addition means finding the modulo of the sum of two or more numbers. Modular subtraction means finding the modulo of the difference of two or more numbers. Modular multiplication means finding the modulo of the product of two or more numbers. Modular division is avoided, because the quotient of the divided numbers (before the modulo operation) is not guaranteed to be an integer.
5 CRC Arithmetic
Since CRC is a digital signal processing scheme, it has its own rules for applying mathematical operations to groups of digital bits. These rules are derived from modular arithmetic in a way that allows appending or inverting some bits. Since we are dealing with digital bits, CRC arithmetic is derived from modulo-2 arithmetic with the following rules:
• The binary numbers (dividend and divisor) are considered as polynomials. Therefore, when we add or subtract, we add or subtract the corresponding bits. When we multiply or divide, we multiply or divide the polynomials.
• Addition and subtraction are performed as modulo-2 operations.
Let us now explain the first rule, which says that binary numbers shall be considered as polynomials. Assume we have the binary number 10111. This is a 5-digit number, so it is represented as a polynomial of the fourth order, $x^4 + x^2 + x + 1$. Any arithmetic operation on it is then handled as polynomial arithmetic.
The second rule tells us that addition or subtraction is performed as a modulo-2 operation. Addition or subtraction in modulo-2 arithmetic is as follows:
\[ 0 \pm 1 = 1 \qquad 1 \pm 0 = 1 \qquad 0 \pm 0 = 0 \qquad 1 \pm 1 = 0 \]
which is exactly the XOR logic between the two bits.
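Example - Multiplication:
\[
\begin{aligned}
(x^3 + x + 1) \cdot (x^2 + 1) &= x^5 + x^3 + x^2 + x^3 + x + 1 \\
&= x^5 + x^2 + x + 1 \\
&= 100111
\end{aligned}
\]
Example - Addition:
\[
\begin{aligned}
(x^3 + x + 1) + (x^2 + 1) &= x^3 + x^2 + x + 1 + 1 \\
&= x^3 + x^2 + x \\
&= 1110
\end{aligned}
\]
Example - Division:
\[
\begin{aligned}
\text{Dividend: } & x^8 + x^5 + x^2 + x \qquad \text{Divisor: } x^4 + x + 1 \\
\text{Subtract } x^4 \cdot (x^4 + x + 1) = x^8 + x^5 + x^4: \quad & x^4 + x^2 + x \\
\text{Subtract } 1 \cdot (x^4 + x + 1) = x^4 + x + 1: \quad & x^2 + 1 \\
\text{Quotient: } & x^4 + 1 \qquad \text{Remainder: } x^2 + 1 \Rightarrow 101
\end{aligned}
\]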
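The three worked examples can be reproduced with a short Python sketch. It assumes a representation that the text does not prescribe: each polynomial is stored as an integer whose bit i is the coefficient of x^i. The function names `gf2_add`, `gf2_mul` and `gf2_divmod` are illustrative only.

```python
# Polynomials over GF(2) stored as ints: bit i is the coefficient of x^i.
# This encoding and the function names are assumptions made for illustration.

def gf2_add(a, b):
    """Modulo-2 addition (identical to subtraction): bitwise XOR."""
    return a ^ b


def gf2_mul(a, b):
    """Carry-less multiplication of two polynomials."""
    result = 0
    while b:
        if b & 1:          # current coefficient of b is 1
            result ^= a    # add (XOR) the shifted copy of a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(dividend, divisor):
    """Modulo-2 long division, returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor polynomial is zero")
    quotient = 0
    divisor_degree = divisor.bit_length() - 1
    while dividend.bit_length() - 1 >= divisor_degree:
        shift = dividend.bit_length() - 1 - divisor_degree
        quotient ^= 1 << shift          # record the quotient term x^shift
        dividend ^= divisor << shift    # subtract the aligned divisor
    return quotient, dividend


print(format(gf2_mul(0b1011, 0b101), "b"))   # multiplication example
print(format(gf2_add(0b1011, 0b101), "b"))   # addition example
q, r = gf2_divmod(0b100100110, 0b10011)      # division example
print(format(q, "b"), format(r, "b"))
```

The dividend of the division example, x^8 + x^5 + x^2 + x, is the bit string 100100110 that is reused in the hardware sections.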
6 CRC Mathematics
Performing the CRC computation on the message at the receiver and at the transmitter gives the same CRC on both sides. But how do we get '0' (or a constant value) if we compute the CRC for the transmitted message together with its appended CRC value? To answer that question we will demonstrate the CRC computation in a mathematical way. In general, an n-bit CRC is calculated by representing the data stream as a polynomial $M(x)$. This $M(x)$ is multiplied by $x^n$, where n is the degree of the generator polynomial $G(x)$, and the result is divided by $G(x)$. The resulting remainder is appended to the polynomial $M(x)$ and transmitted. At the receiver side, the complete transmitted polynomial is divided by the same generator polynomial, and the resulting remainder shall be a constant. Before we explain that mathematically, we need to introduce some concepts.
Multiplication:
Multiplying $M(x)$ by $x^n$ means appending n zero bits to the end of $M(x)$. For example, assume that $M(x)$ is 10111, which as a polynomial is written $M(x) = x^4 + x^2 + x^1 + 1$. Multiplying it by $x^4$ results in
\[
\begin{aligned}
F(x) &= M(x) \cdot x^4 \\
&= (x^4 + x^2 + x^1 + 1) \cdot x^4 \\
&= x^8 + x^6 + x^5 + x^4
\end{aligned}
\]
which is 101110000 in binary.
Complement:
It is known that to get the ones-complement of a binary vector, you just need to invert every bit from 1 to 0 or from 0 to 1. In digital logic, this can be performed by XORing each bit with '1'. If we XOR with '0', the binary vector does not change. This is important to note because we start the CRC calculation with some initial value in the CRC registers. If this initial value is a sequence of '1's, the first part of the message is ones-complemented. So let us assume that the CRC registers are loaded initially with $L(x)$, which might be $[0\;0 \cdots 0]$ or $[1\;1 \cdots 1]$; in both cases it has n bits, which is the CRC width.
These two concepts are needed before discussing the mathematics behind CRC. As mentioned previously, the CRC is eventually appended to the end of the message. Therefore, the first step is to shift the message up by n bits in order to make room for the CRC, where n is the width of the CRC:
\[ x^n \cdot M(x) \]
If the message has length k, this multiplication results in a message of length k + n. To start the CRC computation, the first step is to XOR the message with the initial value of the CRC registers (why we XOR first will become clear in the section "XOR First Serial CRC"). But in order to XOR the initial value $L(x)$ with the message, both must have the same length. This means we need to increase the length of $L(x)$ to k + n as well:
\[ x^k \cdot L(x) \]
The next operation is to divide by the generator polynomial $G(x)$ using modulo-2 division, which can be described mathematically as follows
\[ \frac{x^n M(x) + x^k L(x)}{G(x)} = Q(x) + \frac{R(x)}{G(x)} \]
where $Q(x)$ is the quotient and $R(x)$ is the remainder of the division operation. From this equation we can see that $R(x)$ is the CRC value, but we will not take it as it is, because some implementations of the CRC require taking the ones-complement of the remainder. So we will consider the CRC as follows:
\[ CRC = R(x) + L(x) \tag{3} \]
The value of $L(x)$ depends on the implementation. The transmitted message $T(x)$ can be represented by
\[ T(x) = x^n M(x) + CRC \]
Note that here $T(x)$ is k + n bits long. Substituting the CRC as in equation 3, we can write $T(x)$ as:
\[ T(x) = x^n M(x) + R(x) + L(x) \]
At the receiver, we are interested in obtaining the remainder of the received message, to determine whether it has been received correctly. So the same CRC technique is applied at the receiver.
\[ CRC_{rx} = \operatorname{rem}\left( x^n \left[ \left\{ \frac{x^n M(x) + x^k L(x)}{G(x)} \right\} + \left\{ \frac{L(x)}{G(x)} \right\} + \left\{ \frac{R(x)}{G(x)} \right\} \right] \right) \tag{10} \]
\[ CRC_{rx} = \operatorname{rem}\left( x^n \left\{ Q(x) + \frac{R(x)}{G(x)} \right\} \right) + \operatorname{rem}\left( x^n \left\{ \frac{L(x)}{G(x)} \right\} \right) + \operatorname{rem}\left( x^n \left\{ \frac{R(x)}{G(x)} \right\} \right) \tag{12} \]
In the first term, $x^n Q(x)$ is a polynomial and contributes nothing to the remainder, so the first term reduces to the remainder of $x^n R(x)/G(x)$. The last term is exactly the same quantity. Since we are using modulo-2 arithmetic, these two identical terms cancel each other. This means that the CRC operation at the receiver side depends only on $L(x)$ and $G(x)$. The result is the remainder of
\[ \frac{x^n L(x)}{G(x)} \tag{15} \]
which is a constant value known as the residual value. This explains why a constant value is always obtained at the receiver side, no matter what the content of the message is. The appendix contains MATLAB or Octave code that can be used to perform modulo-2 division and to obtain this residual value.
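The constant residue can also be checked numerically. The following Python sketch is a minimal model of the derivation in this section, not the appendix code. It assumes polynomials are stored as integers (bit i is the coefficient of x^i) and that the transmitted CRC is taken as R(x) + L(x). The helper names `gf2_mod`, `transmit` and `receive` are hypothetical.

```python
def gf2_mod(value, generator):
    """Remainder of the modulo-2 division of value by generator."""
    degree = generator.bit_length() - 1
    while value.bit_length() - 1 >= degree:
        value ^= generator << (value.bit_length() - 1 - degree)
    return value


def transmit(message, k, generator, init):
    """Build T(x) = x^n M(x) + R(x) + L(x) for a k-bit message M(x)."""
    n = generator.bit_length() - 1
    # Initial value L(x) is XORed onto the first n bits: x^n M(x) + x^k L(x).
    remainder = gf2_mod((message << n) ^ (init << k), generator)
    return (message << n) ^ remainder ^ init


def receive(codeword, k, generator, init):
    """Apply the same CRC operation to the whole (k + n)-bit codeword."""
    n = generator.bit_length() - 1
    return gf2_mod((codeword << n) ^ (init << (k + n)), generator)


G = 0b10011          # example generator polynomial used in this document
L_ONES = 0b1111      # all-ones initial value, n = 4

# Residual value predicted by the derivation: rem(x^n L(x) / G(x)).
residue = gf2_mod(L_ONES << 4, G)

for k, message in [(9, 0b100100110), (9, 0b111111111), (5, 0b10111)]:
    assert receive(transmit(message, k, G, L_ONES), k, G, L_ONES) == residue

# With an all-zero initial value the receiver obtains zero instead.
assert receive(transmit(0b100100110, 9, G, 0), 9, G, 0) == 0
print(format(residue, "04b"))
```

For the 32-bit generator 104C11DB7 with an all-ones L(x), the same `gf2_mod` call gives the residual value C704DD7B quoted in the next section.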
7 CRC Parameters
As seen in the previous section, a CRC computation scheme can be specified with some parameters, for example the width of the CRC, the width of the polynomial and the residual value. Other parameters depend on how the CRC is implemented, for example a ones-complemented initial value or a reversed CRC. So a CRC has some basic parameters and some implementation parameters. All in all, the following parameters define a CRC scheme.
• Width: This is the basic identifier of a CRC scheme. It expresses the width of the CRC value, i.e. how many bits the CRC has.
• Polynomial: The value of the divisor, which is the polynomial, also determines the CRC scheme. If the CRC width is n bits, then the polynomial is (n+1) bits wide; in other words it has n+1 coefficients and is of degree n. Choosing the polynomial value is a big research topic, but the theoretical idea is based on the concept of prime numbers. Since prime numbers have no underlying factors, dividing by a prime number should give fewer common factors than dividing by a non-prime number, which lowers the probability of getting the same CRC for different numbers. In practice, the polynomial does not have to represent a prime number; the main concern is to have a better chance of detecting errors, and the polynomial is chosen for that purpose.
• Residual Value: At the receiver side, where we perform the same CRC operation, the result of the operation is called the residual value, and it indicates whether the message has been received correctly. For example, the residual value for CCITT CRC-32 is C704DD7B, so any correctly received message should give this value.
• Initial Value: Looking at the implementation of the CRC procedure, we find that we perform an XOR operation between the message (dividend) and the remainder of the division (eventually the CRC). So the computation performs a repetitive XOR between a message variable and a CRC variable until all bits of the message have been processed. The CRC variable can be initialized to a certain value (usually all zeros or all ones, e.g. 0x00 or 0xFF) before starting the computation.
• Reverse Input: We just mentioned that the CRC calculation is performed via an iterative XOR operation. Usually each iteration step performs the XOR operation byte by byte. Some computations require each byte of the message (dividend) to be reversed, i.e. the bit sequence is swapped to start with the LSB instead of the MSB.
• Reverse CRC: Some CRC computations also require the final CRC value to be reversed so that it is ready for transmission.
• Complement CRC: Some computations require the obtained CRC value to be ones-complemented before transmission.
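The parameters listed above can be collected in one bitwise routine. The following Python sketch is one possible arrangement under stated assumptions: the function name `crc`, its argument names, the byte-oriented input and the restriction to widths of at least 8 bits are all choices made for illustration, not something this document prescribes. The polynomial is passed without its MSB.

```python
def reflect(value, width):
    """Reverse the order of the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def crc(data, width, poly, init, reverse_input, reverse_crc, complement_crc):
    """Bitwise CRC over a bytes object (assumes width >= 8).

    poly is the generator polynomial without its MSB.
    """
    mask = (1 << width) - 1
    top_bit = 1 << (width - 1)
    register = init & mask                       # Initial Value
    for byte in data:
        if reverse_input:                        # Reverse Input
            byte = reflect(byte, 8)
        register ^= byte << (width - 8)          # XOR message byte into the CRC
        for _ in range(8):
            if register & top_bit:
                register = ((register << 1) ^ poly) & mask
            else:
                register = (register << 1) & mask
    if reverse_crc:                              # Reverse CRC
        register = reflect(register, width)
    if complement_crc:                           # Complement CRC
        register ^= mask
    return register


# CRC-32 as used in Ethernet: all parameters switched on.
print(hex(crc(b"123456789", 32, 0x04C11DB7, 0xFFFFFFFF, True, True, True)))
# A 16-bit scheme with polynomial 0x1021, all-ones initial value, nothing reversed.
print(hex(crc(b"123456789", 16, 0x1021, 0xFFFF, False, False, False)))
```

Changing any single parameter gives a different scheme with a different check value, which is why all of them have to be agreed between transmitter and receiver.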
8 Hardware Implementation
Processing of binary data is usually done at the hardware level, therefore the implementation of CRC processing is discussed here from a hardware point of view. CRC processing, as discussed before, is a mathematical long division; however, we are not interested in the quotient of this division. We are interested only in the remainder, which is the CRC value we intend to obtain. This lets us simplify the long division: we do not need to execute every step of the long division, only the steps where we apply an XOR between the dividend and the divisor. This can be seen in the following long division example:
                    1 0 0 0 1
1 0 0 1 1 ) 1 0 0 1 0 0 1 1 0
            1 0 0 1 1
            ---------
              0 0 0 1 0
              0 0 0 0 0
              ---------
                0 0 1 0 1
                0 0 0 0 0
                ---------
                  0 1 0 1 1
                  0 0 0 0 0
                  ---------
                    1 0 1 1 0
                    1 0 0 1 1
                    ---------
                      0 1 0 1
The procedure can be reduced to two steps only, since we are not interested in the quotient.
1 0 0 1 1 ) 1 0 0 1 0 0 1 1 0
            1 0 0 1 1
            ---------
                    1 0 1 1 0
                    1 0 0 1 1
                    ---------
                      0 1 0 1
The key point is that we look at the MSB after the XOR operation. If this MSB is 0, we drop it and look at the next bit until we meet a '1'. When we find this '1', we shift in the next bits from the original dividend (n bits in this example), where n is the length of the CRC (or "length of divisor - 1"). If there are not enough bits, we stop, because the division is finished. In real applications, however, there will be enough bits, as every dividend is extended with n zero bits. In other words, after the first XOR operation we align the divisor (polynomial) with the first '1' we see in the XOR result, then perform the next XOR operation.
To translate the previous description into hardware, the process of looking for the next '1' is done simply by shifting the dividend until its MSB becomes '1'. This means that the dividend is held in a shift register and XORed with the polynomial only when the MSB of the dividend is '1'.
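The shift-and-XOR behaviour described here can be modelled in a few lines of Python. This is a behavioural sketch only, assuming an (n+1)-bit register held in an integer and the dividend supplied as a list of bits, MSB first. The function name `shift_xor_divide` is illustrative.

```python
def shift_xor_divide(dividend_bits, divisor):
    """Divide a bit list (MSB first) by `divisor` using shift and XOR only.

    Returns (remainder, number_of_xor_steps).
    """
    n = divisor.bit_length() - 1         # CRC width = length of divisor - 1
    register = 0                         # models an (n + 1)-bit shift register
    xor_steps = 0
    for bit in dividend_bits:
        register = (register << 1) | bit     # shift the next dividend bit in
        if (register >> n) & 1:              # MSB of the register is '1'
            register ^= divisor              # XOR with the polynomial
            xor_steps += 1
    return register, xor_steps


remainder, steps = shift_xor_divide([1, 0, 0, 1, 0, 0, 1, 1, 0], 0b10011)
print(format(remainder, "04b"), steps)   # prints "0101 2"
```

The two XOR steps counted by the sketch are the two steps that remain in the reduced long division above.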
9 Serial CRC
The previous steps show that a hardware implementation of the CRC calculation needs at least an n-bit shift register, where n is the degree of the generator polynomial. If we implemented the previous description literally, the hardware logic would be as in figure 1.
The first thing we can conclude from figure 1 is that the output of the last XOR operation (between G4 and M4) is not used anywhere. Therefore we can ignore this operation and drop G4 from our calculation entirely. This explains why we always discard the MSB of the polynomial although it is part of its definition. The new hardware logic is shown in figure 2.
More simplification can be done in the hardware implementation if we know the polynomial (and indeed we do). In our example the polynomial is 10011. After discarding the MSB we get 0011, which means G3 is 0, G2 is 0, G1 is 1 and G0 is 1. Knowing that XORing any value with 0 gives the same value, we can simplify the hardware of G3 and G2, as their multiplexers always output M3 and M2. This results in the hardware of figure 3.
In the previous step we simplified the circuit based on the fact that XORing anything with 0 gives the same value. On the other hand, XORing any value with 1 gives the inverted value. This helps in simplifying the circuit to that of figure 4, in which we don't need any registers for storing the values of the polynomial.
Looking at the remaining multiplexers, we can see that they output M1 (or M0) when M4 is 0, and the inverted value of M1 (or M0) when M4 is 1, which is exactly the logic of an XOR operation. So we can replace these two multiplexers with two XOR gates, as in figure 5.
10 XOR First Serial CRC
This circuit does not seem practical, since we always drop the MSB (M4 in our example) of the registers to obtain the correct result. M4 is used only to decide whether to shift only or to shift and XOR. In addition, there is some extra logic before M0 that is used only in the last step, in which we do not shift and XOR but XOR only. This circuit can be simplified further.
1 0 0 1 1 ) 1 0 0 1 0 0 1 1 0
            1 0 0 1 1
            ---------
                    1 0 1 1 0
                    1 0 0 1 1
                    ---------
                      0 1 0 1
Because the MSB is trimmed in every step and not used after the XOR operation, this operation can be improved further to avoid the need for the MSB, as follows
1 0 0 1 1 ) 0 1 0 0 1 0 0 1 1 0
              1 0 0 1 1                  (XOR, then shift)
              -----------------
              0 0 0 0 1 0 1 1 0
                      1 0 0 1 1          (XOR, then shift)
                      ---------
                      0 0 1 0 1
This means that we don't need to shift the dividend until its MSB is aligned with the MSB of the divisor. Instead, we can predict that the MSB of the dividend is going to be '1' if the (MSB-1) bit was '1' in the preceding shift operation. In that way we can eliminate the need for the MSB of the dividend and align the (MSB-1) bit of the dividend with the divisor. This means that we perform the XOR operation before the shift operation. To see how this saves one more register in the circuit, let us describe the division operation according to our modified process.
As we can see in figure 6, M4 no longer affects anything in the circuit, so we can remove it, and G4 as well.
Following the same simplifications as before, we eventually reach the circuit in figure 8. The circuit obtained here is called a Linear Feedback Shift Register (LFSR).
It is clear that this circuit also eliminates the extra logic before M0.
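A behavioural model of such an n-bit LFSR fits in a few lines of Python. The sketch below is an assumption-laden illustration rather than a description of a specific figure. The register is held in an integer, the serial input enters at M0, and the polynomial is given without its MSB. The name `internal_lfsr` is hypothetical.

```python
def internal_lfsr(bits, poly_low, n, state=0):
    """Serial CRC with an n-bit register.

    bits     : input bits, MSB first
    poly_low : generator polynomial without its MSB (0b0011 for 10011)
    state    : initial register content
    """
    mask = (1 << n) - 1
    for bit in bits:
        feedback = (state >> (n - 1)) & 1        # bit about to leave the register
        state = ((state << 1) | bit) & mask      # shift, new bit enters at M0
        if feedback:
            state ^= poly_low                    # XOR taps G1, G0 in the example
    return state


message = [1, 0, 0, 1, 0, 0, 1, 1, 0]
# Nine cycles give the plain remainder of the division example.
print(format(internal_lfsr(message, 0b0011, 4), "04b"))
# Appending n = 4 zeros (13 cycles) gives the remainder of x^n M(x).
print(format(internal_lfsr(message + [0, 0, 0, 0], 0b0011, 4), "04b"))
```

Only n register bits are modelled here; the extra MSB register of the earlier circuits is replaced by the `feedback` value that is read before the shift.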
11 External Linear Feedback Shift Register
\[ M(x) = \sum_{c=1}^{k/w} S_c(x) \tag{16} \]
where $S_c(x)$ is a sub-message of $M(x)$ with a specific width w (8 bits for example) and k is the length of the message $M(x)$, so the number of chunks is $k/w$. But note that each $S_c(x)$ has a specific location in $M(x)$, so it is better to decompose it and represent $M(x)$ as:
\[ M(x) = \sum_{c=1}^{k/w} x^{(c-1) \cdot w} \cdot s_c(x) \tag{17} \]
where $x^{(c-1) \cdot w}$ shifts each sub-message to its correct position. To add the CRC to this message we multiply it by $x^n$:
\[
\begin{aligned}
M(x) \cdot x^n &= \sum_{c=1}^{k/w} \{ x^{(c-1) \cdot w} \cdot s_c(x) \} \cdot x^n \\
&= \sum_{c=1}^{k/w} x^{(c-1) \cdot w + n} \cdot s_c(x)
\end{aligned}
\tag{18}
\]
So if we are going to use the circuit of figure 8, we have to take into consideration that this circuit takes (c.w + n) cycles until it finishes, where c is the number of chunks. As a side note, the circuit of figure 5 takes (c.w + n + 1) cycles to finish.
Since it is an LFSR circuit, we can use another form of it, called the External (or Fibonacci) LFSR, which is an equivalent circuit to that of figure 8. The circuit in figure 8 is called the Internal (or Galois) LFSR.
In the External LFSR, we can provide the data chunk serially without the need to append n zeros. The architecture of the External LFSR produces the required CRC as if we had appended n zeros to the data chunk. Therefore the External LFSR takes only (length of data chunk) cycles to complete the CRC calculation. To understand that, let us use the previous example and calculate its CRC after appending zeros to the message.
1 0 0 1 1 ) 1 0 0 1 0 0 1 1 0 0 0 0 0
            1 0 0 1 1
            ---------
                    1 0 1 1 0 0 0 0 0
                    1 0 0 1 1
                    ---------
                      0 1 0 1 0 0 0 0
                        1 0 0 1 1
                        ---------
                          0 1 1 1 0 0
                            1 0 0 1 1
                            ---------
                              1 1 1 1
Using the circuit in figure 8, we can expect that it takes 13 cycles until we get the CRC, as shown below.
This can be improved if we consider the External LFSR depicted in figure 9, which is very similar to the Internal LFSR except that the XOR gates are controlled by the output of the first XOR operation.
The design of this circuit can be understood from the next section, which discusses the parallel CRC and how to compute the CRC for part of the message. For the moment we only need to know that this circuit can calculate the CRC in 9 cycles, which is the length of our data, as shown below.
This External LFSR is commonly used when the input stream has to be provided bit by bit, as it is optimized in terms of the number of registers used and execution cycles. For example, the CRC-32 used in Ethernet packets has the polynomial 104C11DB7. A CRC circuit for such a polynomial can be built using the circuit in figure 10.
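The behaviour attributed to the External LFSR can be sketched in Python as well. This is a minimal model under stated assumptions: the input bit is XORed with the MSB of the register, and the result of that first XOR controls the polynomial taps. The name `external_lfsr` is illustrative, and the polynomial is again given without its MSB.

```python
def external_lfsr(bits, poly_low, n, state=0):
    """Serial CRC where the input bit is XORed with the register MSB.

    No zeros need to be appended to the message.
    """
    mask = (1 << n) - 1
    for bit in bits:
        feedback = ((state >> (n - 1)) & 1) ^ bit   # output of the first XOR gate
        state = (state << 1) & mask                 # shift, zero enters at M0
        if feedback:
            state ^= poly_low                       # taps controlled by feedback
    return state


message = [1, 0, 0, 1, 0, 0, 1, 1, 0]
crc = external_lfsr(message, 0b0011, 4)
print(format(crc, "04b"), "after", len(message), "cycles")
```

For the example message this reaches 1111 after 9 cycles, the same value the long division above obtains for the message extended with four zeros.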
12 Parallel CRC
In real applications, received message packets are parsed and stored in internal FIFOs for further processing. This means that if we need to check the CRC of these received messages, it is not efficient to provide the message to the CRC logic serially. Providing it as chunks of data is more practical. Therefore, we need logic that can process a chunk of data (for instance 8 bits of data) at once.
The derivation of the parallel CRC is based on considering the LFSR as a discrete-time, linear time-invariant system, which is modeled by a certain equation. In our context here we will discuss this equation in reverse, i.e. we will arrive at the equation starting from the proposed hardware depicted in figure 9.
Let us describe the state of the circuit in figure 9 in matrix form
\[ M[t] = \begin{bmatrix} M_3[t] \\ M_2[t] \\ M_1[t] \\ M_0[t] \end{bmatrix} \]
where $M_3[t]$, $M_2[t]$, $M_1[t]$, $M_0[t]$ are the states of each register at cycle t. Now let us derive the next state with respect to this state t.
\[
\begin{aligned}
M_3[t+1] &= M_2[t] \\
M_2[t+1] &= M_1[t] \\
M_1[t+1] &= M_0[t] \oplus M_3[t] \oplus d \\
M_0[t+1] &= M_3[t] \oplus d
\end{aligned}
\]
where d is the serial input bit before the first XOR gate. Writing $M[t+1]$ in matrix form gives:
\[
M[t+1] =
\begin{bmatrix} M_3[t+1] \\ M_2[t+1] \\ M_1[t+1] \\ M_0[t+1] \end{bmatrix}
=
\begin{bmatrix} M_2[t] \\ M_1[t] \\ M_0[t] \oplus M_3[t] \oplus d \\ M_3[t] \oplus d \end{bmatrix}
=
\begin{bmatrix} M_2[t] \\ M_1[t] \\ M_0[t] \oplus M_3[t] \\ M_3[t] \end{bmatrix}
\oplus
\begin{bmatrix} 0 \\ 0 \\ d \\ d \end{bmatrix}
\]
In the last step we separated the register coefficients M from the data coefficient d. This form is better represented as follows
\[
M[t+1] =
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}
\cdot
\begin{bmatrix} M_3[t] \\ M_2[t] \\ M_1[t] \\ M_0[t] \end{bmatrix}
\oplus
\begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}
\cdot d
\]
\[ M[t+1] = F \cdot M[t] \oplus G \cdot d \tag{19} \]
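In general, F has the form
\[
F = \begin{bmatrix}
g_{n-1} & 1 & 0 & \cdots & 0 \\
g_{n-2} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
g_1 & 0 & 0 & \cdots & 1 \\
g_0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\]
So the first column represents the polynomial without its MSB, and the other columns have the value 1 diagonally. The last row consists of the LSB of the polynomial followed by zeros. For further simplification, we will denote F here as:
\[ F = [A_0 \; A_1 \; A_2 \cdots A_{n-1}] \]
where
\[ A_0 = \begin{bmatrix} g_{n-1} \\ g_{n-2} \\ \vdots \\ g_1 \\ g_0 \end{bmatrix} \tag{20} \]
\[ A_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix} \tag{21} \]
\[ A_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \\ 0 \end{bmatrix} \tag{22} \]
\[ A_{n-1} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ 0 \end{bmatrix} \tag{23} \]
Let us also define $A_n$, although it is not used in F
\[ A_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{24} \]
The matrix G actually represents the polynomial if we consider the External LFSR. So in the case of the External LFSR, G is described by
\[ G = \begin{bmatrix} g_{n-1} \\ g_{n-2} \\ \vdots \\ g_1 \\ g_0 \end{bmatrix} \tag{25} \]
In the case of the Internal LFSR, G is described by
\[ G = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{26} \]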
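The matrices can be built and exercised with a small Python sketch. It assumes matrices are plain lists of 0/1 rows and that state vectors are ordered from the most significant register down to M0. The names `build_f`, `mat_vec` and `step` are illustrative.

```python
def build_f(poly_low_bits):
    """Build F = [A0 A1 ... A(n-1)] from the bits [g(n-1), ..., g1, g0]."""
    n = len(poly_low_bits)
    f = [[0] * n for _ in range(n)]
    for row in range(n):
        f[row][0] = poly_low_bits[row]     # first column is A0 (the polynomial)
        if row + 1 < n:
            f[row][row + 1] = 1            # ones on the shifted diagonal
    return f


def mat_vec(matrix, vector):
    """Matrix-vector product over GF(2): AND for multiply, XOR for add."""
    return [sum(a & b for a, b in zip(row, vector)) % 2 for row in matrix]


def step(f, g, state, d):
    """One clock cycle: M[t+1] = F.M[t] xor G.d."""
    return [x ^ (gi & d) for x, gi in zip(mat_vec(f, state), g)]


F = build_f([0, 0, 1, 1])          # polynomial 10011 without its MSB
G_EXTERNAL = [0, 0, 1, 1]          # G equals the polynomial column A0
G_INTERNAL = [0, 0, 0, 1]          # G has a single 1 in the last row

state = [0, 0, 0, 0]
for d in [1, 0, 0, 1, 0, 0, 1, 1, 0]:
    state = step(F, G_EXTERNAL, state, d)
print(state)
```

With `G_EXTERNAL` the nine-bit example message leaves all four registers at 1, matching the CRC 1111 obtained by long division in the previous section.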
So far we have derived from equation 19 the register content at time t + 1 with respect to time t. The difference between times t and t + 1 is that we have provided only 1 bit to the system. If we provide 2 bits, we are at time t + 2. If we provide w bits, we are at time t + w. Thus, if we obtain the content of the system at time t + w with respect to t, we can provide w bits to the system at time t and calculate the CRC by substituting the content of the system at time t + w. This can be explained mathematically as below.
Since it is a time-invariant system and only the serial input changes at each time step, we can write:
\[
\begin{aligned}
M(t+2) &= F \cdot M(t+1) \oplus G \cdot d_{t+1} \\
&= F \cdot F \cdot M(t) \oplus F \cdot G \cdot d_t \oplus G \cdot d_{t+1} \\
&= F^2 \cdot M(t) \oplus (F \cdot G \cdot d_t \oplus G \cdot d_{t+1})
\end{aligned}
\tag{28}
\]
\[
\begin{aligned}
M(t+3) &= F \cdot M(t+2) \oplus G \cdot d_{t+2} \\
&= F \cdot F \cdot F \cdot M(t) \oplus F \cdot F \cdot G \cdot d_t \oplus F \cdot G \cdot d_{t+1} \oplus G \cdot d_{t+2} \\
&= F^3 \cdot M(t) \oplus (F^2 \cdot G \cdot d_t \oplus F \cdot G \cdot d_{t+1} \oplus G \cdot d_{t+2})
\end{aligned}
\tag{29}
\]
\[ M(t+3) = F^3 \cdot M(t) \oplus \left( \left[ F^2 G \;\; F G \;\; G \right] \cdot [d_t \; d_{t+1} \; d_{t+2}]^T \right) \tag{32} \]
\[ M(i) = F^i \cdot M(0) \oplus \left[ F^{i-1} G \; \cdots \; F G \;\; G \right] \cdot [D(0) \cdots D(i-1)]^T \tag{33} \]
where $D(0)$ is the first bit provided to the system ($d_t$), $D(1)$ is the second bit ($d_{t+1}$) and so on.
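Equation 33 can be cross-checked against bit-by-bit stepping. The Python sketch below uses the example matrices of this section with G taken as the polynomial column. The list-of-rows matrix representation and the names `mat_vec`, `mat_mul`, `mat_pow`, `serial` and `unrolled` are assumptions made for illustration.

```python
F = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0]]
G = [0, 0, 1, 1]


def mat_vec(matrix, vector):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, vector)) % 2 for row in matrix]


def mat_mul(a, b):
    """Matrix-matrix product over GF(2)."""
    columns = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in columns]
            for row in a]


def mat_pow(matrix, exponent):
    """Repeated multiplication, starting from the identity matrix."""
    n = len(matrix)
    result = [[int(r == c) for c in range(n)] for r in range(n)]
    for _ in range(exponent):
        result = mat_mul(result, matrix)
    return result


def serial(state, data):
    """Apply M[t+1] = F.M[t] xor G.d once per input bit."""
    for d in data:
        state = [x ^ (g & d) for x, g in zip(mat_vec(F, state), G)]
    return state


def unrolled(state, data):
    """Evaluate F^i.M(0) xor sum over j of F^(i-1-j).G.D(j) directly."""
    i = len(data)
    result = mat_vec(mat_pow(F, i), state)
    for j, d in enumerate(data):
        column = mat_vec(mat_pow(F, i - 1 - j), G)
        result = [x ^ (c & d) for x, c in zip(result, column)]
    return result


data = [1, 0, 0, 1, 0, 0, 1, 1, 0]
print(serial([0, 0, 0, 0], data) == unrolled([0, 0, 0, 0], data))
```

Both routes give the same register content. The unrolled form does more arithmetic in software, but it is the form that maps onto a single block of combinational XOR logic.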
Equation 33 may look complex and hard to calculate, especially as it contains many matrices. To simplify it, let us consider equations 21, 22 and 23, with which we make trivial additions to equations 30, 31 and 32 so that they become
\[ M(t+3) = F^3 \cdot M(t) \oplus \left( \left[ F^2 G \;\; F G \;\; G \mid A_3 \cdots A_{n-1} \right] \cdot [d_t \; d_{t+1} \; d_{t+2} \mid 0 \; 0 \cdots 0]^T \right) \tag{36} \]
\[ M(n-1) = F^{n-1} \cdot M(0) \oplus \left( \left[ F^{n-2} G \; \cdots \; F G \;\; G \mid A_{n-1} \right] \cdot [d_0 \cdots d_{n-3} \; d_{n-2} \mid 0]^T \right) \tag{37} \]
\[ M(n) = F^n \cdot M(0) \oplus \left( \left[ F^{n-1} G \; \cdots \; F G \;\; G \right] \cdot [d_0 \cdots d_{n-1}]^T \right) \tag{38} \]
So what we did is add the A matrices (equations 21, 22, 23, ...) and then add zeros in the corresponding positions of the data matrix $[d_0 \; d_1 \ldots]^T$, so that the value is not affected. Then we can write equation 33 as:
\[
M(i) =
\begin{cases}
F^i \cdot M(0) \oplus \left( \left[ F^{i-1} G \; \cdots \; F G \;\; G \mid A_i \cdots A_{n-1} \right] \cdot [d_0 \; d_1 \cdots d_{i-1} \mid 0 \; 0 \cdots 0]^T \right) & i < n \\
F^i \cdot M(0) \oplus \left( \left[ F^{i-1} G \; \cdots \; F G \;\; G \right] \cdot [d_0 \cdots d_{n-1}]^T \right) & i = n
\end{cases}
\tag{39}
\]
where n is the width of the CRC. To understand why we did that, let us calculate $F^2$ and $F^3$ in our example
0 1 0 0
0 0 1 0
F = 1 0
0 1
1 0 0 0
0 1 0 0 0 1 0 0 0 0 1 0
0 0 1
0 0 0 1 0 1 0 0 1
F =
2
1 0 0 . =
1
1 0 0 1
1 1 0 0
1 0 0 0 1 0 0 0 0 1 0 0
www.electgon.com 18
CRC Overview 12. Parallel CRC
0 1 0 0 0 0 1 0 1 0 0 1
0 0 1 0 1 0 0 1 1 1 0 0
F =
3
1 0 0 1
.
1 1 0 0
=
0 1 1 0
1 0 0 0 0 1 0 0 0 0 1 0
According to [4], the powers of F can be calculated by
\[ F^i = \left[ \, F^{i-1} \otimes A_0 \mid \text{first } n-1 \text{ columns of } F^{i-1} \, \right] \tag{40} \]
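The recurrence can be tried on the example polynomial. The Python sketch below assumes the product sign in the recurrence denotes an ordinary matrix-vector product over GF(2) and that matrices are stored as lists of rows. The name `next_power` is illustrative.

```python
def next_power(f_prev, a0):
    """Return F^i given F^(i-1).

    The new first column is F^(i-1).A0; the remaining columns are the
    first n-1 columns of F^(i-1).
    """
    n = len(f_prev)
    first_column = [sum(f_prev[r][k] & a0[k] for k in range(n)) % 2
                    for r in range(n)]
    return [[first_column[r]] + f_prev[r][:n - 1] for r in range(n)]


A0 = [0, 0, 1, 1]                                   # polynomial 10011 without MSB
F1 = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0]]

F2 = next_power(F1, A0)
F3 = next_power(F2, A0)
F4 = next_power(F3, A0)
for row in F4:
    print(row)
```

Each new power costs one matrix-vector product plus a column shift, instead of a full matrix multiplication.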
Here $D = [d_0 \; d_1 \cdots d_{w-1} \mid 0 \cdots 0]^T$ is the input data vector padded with zeros. When we provide w bits in parallel to the External LFSR, the equation becomes:
\[ \hat{M} = F^w \cdot (M \oplus D) \tag{44} \]
where $\hat{M}$ is the next state after M, which is the CRC if we provided w bits in one cycle.
In the case of the Internal LFSR, G is as denoted in equation 26. Then:
\[ F \cdot G = A_{n-1}, \quad \ldots, \quad F^{n-1} \cdot G = A_1 \tag{45} \]
where $A_{n-1}$, $A_1$ are as denoted in equations 23 and 21. Then we can simplify equation 33 directly, and when we provide w bits in parallel to the Internal LFSR, the equation becomes:
\[ \hat{M} = F^w \cdot M \oplus \tilde{D} \tag{47} \]
where $\tilde{D} = [0 \cdots 0 \mid d_0 \; d_1 \cdots d_{w-1}]^T$.
12.1 Example
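Let us now use our example to see how we can build a parallel CRC instead of the serial LFSR. In our example we have n = 4. So if we are going to calculate for w = 4 bits of parallel data, we need to find $F^4$, which is:
\[ F^4 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix} \]
with $\hat{M} = [\hat{m}_3 \; \hat{m}_2 \; \hat{m}_1 \; \hat{m}_0]^T$ and $M = [m_3 \; m_2 \; m_1 \; m_0]^T$.
If we use the External LFSR
\[
\begin{bmatrix} \hat{m}_3 \\ \hat{m}_2 \\ \hat{m}_1 \\ \hat{m}_0 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} m_3 \oplus d_3 \\ m_2 \oplus d_2 \\ m_1 \oplus d_1 \\ m_0 \oplus d_0 \end{bmatrix}
=
\begin{bmatrix}
m_3 \oplus d_3 \oplus m_2 \oplus d_2 \\
m_2 \oplus d_2 \oplus m_1 \oplus d_1 \\
m_3 \oplus d_3 \oplus m_1 \oplus d_1 \oplus m_0 \oplus d_0 \\
m_3 \oplus d_3 \oplus m_0 \oplus d_0
\end{bmatrix}
\]
Note here that the order of the parallel data is $D = [d_3 \; d_2 \; d_1 \; d_0]^T$. You can choose any order you like, but you have to be careful about the order of the resulting CRC, which will be discussed in the next sub-section.
What we can conclude from our example is that the parallel CRC can be built from this combination of signals and XOR logic. You can also try to build it using the Internal LFSR, which will give a different combination.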
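The four XOR equations of this example can be compared with four clock cycles of the serial circuit. The Python sketch below assumes state and data vectors are ordered [bit 3, bit 2, bit 1, bit 0] and that d3 is the first bit shifted in. The names `parallel_step` and `serial_step` are illustrative.

```python
F4 = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 1, 1], [1, 0, 0, 1]]


def parallel_step(state, data):
    """One parallel cycle: M_hat = F^4.(M xor D)."""
    mixed = [m ^ d for m, d in zip(state, data)]
    return [sum(f & x for f, x in zip(row, mixed)) % 2 for row in F4]


def serial_step(state, d):
    """One serial cycle, using the next-state equations of section 12."""
    m3, m2, m1, m0 = state
    return [m2, m1, m0 ^ m3 ^ d, m3 ^ d]


state = [0, 0, 0, 0]
data = [1, 0, 0, 1]                 # d3, d2, d1, d0

one_cycle = parallel_step(state, data)
four_cycles = state
for d in data:
    four_cycles = serial_step(four_cycles, d)
print(one_cycle, four_cycles)
```

The single parallel cycle and the four serial cycles end in the same register state, so the XOR network is a drop-in replacement that consumes four bits per clock.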
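12.2 Input Data Order
Assume a generated LFSR logic with a CRC width of 8 bits and 4 bits of parallel input data. The system may then be expressed as:
\[
\begin{bmatrix} \hat{m}_7 \\ \hat{m}_6 \\ \hat{m}_5 \\ \hat{m}_4 \\ \hat{m}_3 \\ \hat{m}_2 \\ \hat{m}_1 \\ \hat{m}_0 \end{bmatrix}
= F^4 \cdot
\begin{bmatrix} m_7 \oplus d_3 \\ m_6 \oplus d_2 \\ m_5 \oplus d_1 \\ m_4 \oplus d_0 \\ m_3 \\ m_2 \\ m_1 \\ m_0 \end{bmatrix}
\]
Here we assumed that the user considers the chunk of data as $D = [d_3 \; d_2 \; d_1 \; d_0]^T$. But what if the provided chunk of data is considered in the inverse order, $D = [d_0 \; d_1 \; d_2 \; d_3]^T$? The LFSR is then described by:
\[
\begin{bmatrix} \hat{m}_7 \\ \hat{m}_6 \\ \hat{m}_5 \\ \hat{m}_4 \\ \hat{m}_3 \\ \hat{m}_2 \\ \hat{m}_1 \\ \hat{m}_0 \end{bmatrix}
= F^4 \cdot
\begin{bmatrix} m_7 \oplus d_0 \\ m_6 \oplus d_1 \\ m_5 \oplus d_2 \\ m_4 \oplus d_3 \\ m_3 \\ m_2 \\ m_1 \\ m_0 \end{bmatrix}
\]
which gives a different combination in the resulting XOR circuit, and accordingly a different CRC value.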
12.3 Accumulated CRC Calculation
So far we have seen how to calculate the CRC for a parallel chunk of data. The question now is whether this is applicable to a large amount of data (Ethernet packets, for instance). The answer is yes. We can divide, for example, an Ethernet packet (which is usually about 1500 bytes) into data chunks of 8 bits each, then calculate the CRC of each 8-bit chunk without resetting the CRC registers between chunks. So the CRC of each chunk is accumulated on the CRC of the previous chunk. In other words, the first data chunk has zeros or ones as the initial value in the CRC registers, the second data chunk has the previous CRC as the initial value of the registers, and so on. In the case of the External LFSR we can provide each data chunk directly. In the case of the Internal LFSR, n zero bits still have to be appended so that the result corresponds to $x^n M(x)$; in an accumulated calculation they are appended once, after the last data chunk.
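The accumulation can be illustrated with the 4-bit example polynomial instead of a full Ethernet CRC. The Python sketch below is a minimal model assuming the External LFSR form, a chunk width of w = 4 and a message whose length is a multiple of 4. The function names are illustrative.

```python
F4 = [[1, 1, 0, 0], [0, 1, 1, 0], [1, 0, 1, 1], [1, 0, 0, 1]]


def parallel_step(state, chunk):
    """Process one 4-bit chunk: M_hat = F^4.(M xor D)."""
    mixed = [m ^ d for m, d in zip(state, chunk)]
    return [sum(f & x for f, x in zip(row, mixed)) % 2 for row in F4]


def crc_by_chunks(bits, state):
    """Accumulate the CRC chunk by chunk without resetting the registers."""
    for start in range(0, len(bits), 4):
        # The CRC of the previous chunk is the initial value for this chunk.
        state = parallel_step(state, bits[start:start + 4])
    return state


def crc_serial(bits, state):
    """Reference: one bit per cycle through the serial External LFSR."""
    for d in bits:
        m3, m2, m1, m0 = state
        state = [m2, m1, m0 ^ m3 ^ d, m3 ^ d]
    return state


message = [1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1]     # three 4-bit chunks
for initial in ([0, 0, 0, 0], [1, 1, 1, 1]):
    print(crc_by_chunks(message, initial) == crc_serial(message, initial))
```

Three parallel cycles replace twelve serial ones here; the only state carried between chunks is the n-bit CRC register.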
Appendix
MATLAB/Octave code to perform polynomial division.
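clear
% Modulo-2 polynomial long division on bit vectors written MSB first.
% The two vectors below are illustrative values (the example of the
% Hardware Implementation section); replace them with your own data.
dividend = [1 0 0 1 0 0 1 1 0];
divisor  = [1 0 0 1 1];

len_end  = length(dividend);
len_sor  = length(divisor);
len_diff = len_end - len_sor;
div_res  = [];                          % quotient bits, MSB first

if (len_diff >= 0)
  % Align the divisor with the MSB of the dividend (pad zeros on the right).
  divisor = [divisor zeros(1, len_diff)];
  for div_idx = 1:len_diff+1
    if (dividend(div_idx) == 1)
      dividend = double(xor(dividend, divisor));
      div_res(div_idx) = 1;
    else
      div_res(div_idx) = 0;
    end
    if (div_idx == len_diff+1)
      break;
    end
    % Move the divisor one position to the right and clear the wrapped bit.
    divisor = circshift(divisor, [0 1]);
    divisor(:,1) = 0;
  end
end

quotient  = div_res
% The remainder occupies the last (len_sor - 1) positions of the dividend.
remainder = dividend(max(1, len_end-len_sor+2):end)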
Bibliography
[3] https://en.wikipedia.org/wiki/Cyclic_redundancy_check.
[4] Giuseppe Campobello, Giuseppe Patane, Marco Russo, "Parallel CRC Realization", 2003.
[5] Wu Chuxiong, Shi Haifeng, "Design and implementation of parallel CRC algorithm for fibre channel on FPGA", 2019.
[6] Dawood Alnajjar, Mauricio Suguiy, "A Comprehensive Guide for CRC Hardware Implementation".
[7] https://www.cl.cam.ac.uk/research/srg/bluebook/21/crc/node2.html.