Improve Integrated SRAM Reliability With Hamming Error-Correction Code

Many error-correcting codes (ECCs) are proposed in the industry’s literature for correcting bit errors present in the received data. We will discuss Hamming codes that are used to correct singlebit errors and detect all double-bit errors that could occur during data transmission or residing in the memory. How much improvement we can get in the bit-errorrate (BER) performance curves using onebit error correction depends on the raw BER (RBER) without error correction and the codeword length used with the singlebit error-correction coder.

In Figure 1, the BER performance curves illustrate the improvement (i.e., UBER (uncorrectable BER)) with 0- to 6-bit error correction for given raw BER values. The codeword length used to generate the BER curves is 2048 bits. For example, with the given BER = 10-7, we can achieve an UBER of 10-11 with singlebit error correction. For a given RBER, a shorter codeword will provide better error-correcting capability or a higher UBER (Fig. 2).

Given the RBER (P), codeword length (N), and the number of error bits (n), we get the UBER with onebit error correction using the equation shown at the bottom of the page.

MEMORY ERROR CORRECTION In automotive applications, software integrity level, or ASIL (memory with error-correction capabilities), is one of the most important issues when choosing embedded processors. Software-based Hamming codes can be used to improve the reliability of the most important sections of memory, thus improving the ASIL metric. Memory is used to store information of various types. Some types of information require strong protection against errors, while other types do not.

For example, application software code, data structures, parameters, lookup tables, etc., are very sensitive and any content alteration may end up with catastrophic errors. On the other hand, information such as data samples, image pixels, etc., isn’t as sensitive and may not require error protection.

A typical automotive application can be broken into different sections of memory consisting of different types of information: constant data such as software code, lookup tables, etc.; slowly varying data such as application parameters, data, etc.; and continuously varying data such as audio/video data, navigation data, etc. (Fig. 3). Therefore, a software ROMbased error-correction approach that uses Hamming code to correct the single-bit errors in the first two sections of memory can be implemented using a very small percentage of processor resources.

In the first case, since the data is constant, the extra error-correction information is constant and can be generated once. Every time information is retrieved from this memory section, an ECC decoder is applied to correct the possible errors. In the second case, we call the decoder for each memory load, and the encoder is called to update the error-correction information only when the new data is ready for storing to memory. In these two cases, a software-based single-bit error correction can be implemented using a very small percentage of processor resources.

Figure 4 illustrates a schematic diagram of the software-based memory error correction. With a Hamming (n, k) coder, we divide the data into k bits long, compute the n-k parity bits, and store the n-bits long block to memory area. The parity overhead percentage with respect to data length can be computed as (n-k)*100/k.

When the data is retrieved, the decoder uses the n-k parity bits to detect and correct errors that corrupted during the time when data was residing in memory. We verify the computed parity with the received parity data. If the parity data matches, then no error bits are present in the received data, otherwise there will be error bits in the received data. The Hamming decoder can detect and correct all single-bit errors or detect all double-bit errors.

Because error-correction software is permanently stored in the ROM and uses the core resources whenever memory is accessed, an ECC solution that employs a very small amount of memory and processor cycles is the preferable scenario.

THE HAMMING(2072, 2048) ENCODER The Hamming(2072, 2048) coder divides the input data stream into 256-byte blocks and computes 24 bits of parity that are used to correct all single-bit errors, detect all double-bit errors, and detect singlebit errors in the parity data. To compute the parity, the 256 bytes of input data are arranged as shown in Figure 5. The bits xP0 to xP15 represent the parity computed row-wise, and bits yP0 to yP5 represent the parity computed column-wise. And, bit dP6 is XOR of all bits of 256 bytes of input data, while bit dP7 is XOR of all 22 parity bits xP0 to xP15, yP0 to yP5, and dP6.THE HAMMING(2072, 2048) DECODER The Hamming decoder computes all parity bits for the retrieved data bits in the same way as the encoder using Figure 5. Before computing these parity bits, the decoder checks the parity of the retrieved 24 parity bits computed at the encoder. If this parity fails, then decoding is skipped because there are errors in the retrieved parity data itself. Otherwise, the decoder proceeds to compute the 23 parity bits xP0 to xP15, yP0 to yP5, and dP6. After computing the parity at the decoder, the decoder parity bits are XORed with the retrieved encoder parity bits. Then, it performs the error-correction process using the decoder flow chart (Fig. 6).

The single-bit errors are corrected using the following steps. Allow P0 to P15 to represent the XORed row parity bits and C0 to C5 represent the XORed column parity bits. Subsequently, correct the single-bit errors. This is accomplished by locating the byte address and bit position in the byte using the row and column parity bits.

The byte address is given by the offset obtained from the eight bits “P15 P13 P11 P9 P7 P5 P3 P1,” and the bit position is obtained from the three bits “C5 C3 C1.” Once the decoder comes to know the error bit’s exact location, it corrects the bit value by toggling that bit.

HAMMING(2072, 2048) CODER IMPLEMENTATION The costliest part of the Hamming coder is the computation of row and column parity bits. The parity bits of the Hamming(2072, 2048) coder are computed by arranging the data as shown in Figure 5. It will be very expensive if one parity bit is computed at a time in a loop covering all the 256 data bytes.

Continue to page 2

The computational complexity of parity bit computation can be reduced by using the Blackfin general-purpose LFSR instructions. For example, using the BXOR instruction, the parity bits of the Hamming coder can be efficiently computed as discussed below.

Given the 32-bit data and the 32-bit mask, the BXOR instruction computes the XOR of all bits defined by the mask in the 32-bit data. For example, if the 32-bit data is stored in an accumulator register A = 0xABCDEFAB, and 32-bit mask value is stored in a register R = 0x33333333, then BXOR(A, R) outputs the XOR of all 2-LSB bits in nibbles of register A. That’s because the mask is defined as 1s in 2-LSB bits of all eight nibbles.

COMPUTATION OF COLUMN PARITY BITS Because the Blackfin processor is 32 bits, the column parity bits can be efficiently computed in very few steps:

• Pack the bytes into 32-bit words. (It’s free because we can load 4 bytes or 32 bits at a time.)

• XOR all 64 32-bit words (i.e., 256 bytes) and say register R contains the result.

• Then shrink the result from 32-bit to 8-bit column parity as S = R>>16, R = R ^ S, S = R>>8, R = R ^ S, R = R & 0xff. Now, R contains the column wise XORed bits of 256 bytes.

• Then, compute the individual parity bits using the register R content.

• Using the masks and BXOR instruction, compute the individual parity bits and the masks used for computing the parity bits yP0, yP1, yP2, yP3, yP4, yP5, which are 0x55, 0xaa, 0x33, 0xcc, 0x0f, 0xf0.

• Compute the parity dP6 using the mask 0xff.

All of the above steps consume less than 100 cycles on the Blackfin processor.

COMPUTATION OF ROW PARITY BITS The 32-bit-word method used to compute column parity bits can’t be applied to row-parity-bit computation. Each data byte independently contributes toward the parity information, so we must handle each byte separately. For this, the parity of each byte is computed using the mask value 0xff and the BXOR instruction (see “Sample Code Using BXOR Instruction”).

The computed 256 parity bits of 256 bytes are organized into eight 32-bit words. Now, to compute the column parity on eight 32-bit words using the masks, first compute the 10 parity bits (xP0 to xP9). The values of the 10 masks are 0x55555555, 0xaaaaaaaa, 0x33333333, 0xcccccccc, 0x0f0f0f0f, 0xf0f0f0f0, 0x00ff00ff, 0xff00ff00, 0x0000ffff, and 0xffff0000. The reset of six parity bits is computed by shrinking the 256-bit parity data to a byte parity data using the mask 0xffffffff.

Once we have 8-bit parity, compute the last six row parity bits (xP10 to xP15) in the same way that the six column parity bits (yP0 to yP5) were computed using the mask values 0x55, 0xaa, 0x33, 0xcc, 0x0f, and 0xf0. All of the steps present in the xP0 to xP15 row-parity-bit computation consume about 850 cycles on a Blackfin processor.

Finally, the parity of parity bit dP7 is computed by XORing all parity bits yP0 to yP5 and xP0 to xP15. If all 22 parity bits are present in a 32-bit register starting from LSB side, compute the parity bit dP7 with the BXOR instruction using the mask value 0x3fffff.

With the techniques described above, the parity-bit computation of the Hamming coder (2072, 2048) can be implemented with less than 0.5 cycles per bit.

By using the techniques suggested in this article, one can efficiently implement a Hamming (2072, 2048) coder on a Blackfin processor for single-bit error-correction applications at less than 0.5 cycles per bit. The memory used in this implementation is approximately 640 bytes.

REFERENCES:

Analog Devices Inc., “ADSPBF53x/ 56x Blackfin Processor Programming Reference,” Revision 1.0, June 2005.

Neal Mielke, Todd Marquart, Ning Wu, Jeff Kessenich, Hanmant Belgal, Eric Schares, Falgun Trivedi, Evan Goodness, and Leland R. Nevill, “Bit Error Rate in NAND Flash Memories,” IEEE CFP08, 46th AIRPS, Phoenix, Ariz., 2008.