Many state-of-the-art embedded systems use “platform” FPGAs such as Xilinx Virtex 4/5 class devices or Altera Stratix III/IV class devices. Until recently, it wasn’t possible to deploy keyed applications in these devices, where keys are unique to each device.

Although these FPGAs do have bitstream decryption keys— whether battery-backed or fuse-based—that can be chosen by the user to be unique to each device, these keys can’t be accessed from an FPGA’s programmable fabric. As a result, users can’t employ them to deploy keyed applications.

With recent advances in physical-unclonable-function (PUF) circuits and the availability of Soft-PUFs as a new class of design primitives for FPGAs, FPGA designers now can deploy keyed applications in a way that previously wasn’t possible. For example, an identical bitstream can produce device-unique keys accessible from the FPGA’s programmable fabric (*Fig. 1*).

The PUF output can be used to feed AES, RSA, ECC (elliptic curve cryptography), key derivation function, hash, and various standard and custom cryptographic functions. These functions can then be implemented either as gates in the programmable fabric, as ROM code executed by a processor inside an FPGA, or by using a hybrid solution.

PHYSICAL UNCLONABLE FUNCTIONS

PUFs are circuits that extract chip-unique signatures based on semiconductor fabrication variations, which are very difficult to control or reproduce. These chip-unique signatures can help identify chips (a form of “silicon biometrics”), as well as generate “volatile” keys. These keys disappear when a device is powered off, and can be bit-accurately restored, with the aid of error correction, on subsequent power-ups.

Many different types of silicon-based PUFs have been realized in ASICs and in FPGAs. A sample PUF circuit consists of N stages, followed by an arbiter (*Fig. 2*). Each stage includes a crossbar switch, with a top output and a bottom output. The top output is connected to the previous stage’s top output if the Challenge bit is 1, and it is connected to the previous stage’s bottom output if the Challenge bit is 0. The bottom output is connected to the value not routed to the top output.

A race condition is created between two paths, and N Challenge bits that feed the multiplexer select lines for each stage determine the paths. An arbiter digitizes the race condition to determine a “1” output bit or “0” output bit for each Challenge applied. What is formed is the following function:

F_{PUF_INSTANCE}: \\{0,1\\}^{N} → \\{0,1\\}

Here, F_{PUF_INSTANCE} depends on a manufacturing instance of a device containing a PUF circuit, and it is different for each device. Multiple single-bit outputs can be concatenated together to form a multi-bit response. Challenge bits C_{0} to C_{N-1} can be based on an initial Challenge phase that’s run through a mixer (for example, a linear feedback shift register) to produce instantaneous Challenge phases, with each of these phases being used as C_{0} to C_{N-1} to produce a 1-bit output for that phase.

SOFT-PUFS

Soft-PUFs are PUF circuits that can be implemented in existing FPGA devices, using programmable resources such as lookup tables, registers, and memories. These circuits can be implemented without modifications to existing FPGA silicon or existing FPGA design tools. Figure 3 shows the results of Soft-PUFs implemented in Xilinx Virtex-4 FPGAs using standard Xilinx ISE design tools.

Each of the three plots contains two sets of curves. The middle curves, centered around block size / 2, are obtained by comparing the Hamming distances of Responses from different FPGA devices given the same Challenge applied across these devices. These curves, called “Inter-PUF” curves, are a measure of the cross-correlation of different Responses across different chips given the same Challenge. The left curves are the “Intra- PUF” curves, which show the Hamming distance of responses on a device given a repeated Challenge on that same device. It is an auto-correlation measurement.

The chasm between the two curves indicates that the design implementation is useful as a PUF circuit. Specifically, a Hamming threshold can be set to identify one FPGA device from another. Keys/seeds derived from a Response are unique to each device. Xilinx Virtex-5 FPGAs as well as Altera FPGAs also produce chasms between Intra-PUF and Inter-PUF curves, suiting these devices for implementing PUF-based authentication and PUF-derived (device-unique) keys.

Because PUF output bits are the result of random manufacturing variations that are difficult to control or predict, it’s not surprising that the output sequence of bits in a properly designed PUF is reasonably random. In a test setup where the output of a properly designed PUF was treated as a stream of bits captured by the National Institute of Standards and Technology (NIST) Statistical Tests for Randomness, the output sequence of bits was tested to be reasonably random.

In the table, columns 3 and 4 are based on several hundred million bits of PUF responses across representative Xilinx Virtex-4 FPGAs. Notice that the success ratios are in line with those from reference random bits from George Marsaglia’s *Random Number CD-ROM* (column 7). From high-pass rates, one can conclude that PUF output bits are reasonably random and can be used as cryptographic keys or as seeds to derive cryptographic keys.

Returning to Figure 3, note that the Inter-PUF curve peaks are very close to the ideal value of block size / 2, indicating that a PUF is a good entropy source with excellent dc bias to provide virtually uncorrelated keys across different FPGAs. This Inter- PUF behavior is in line with the results of standard-based NIST Statistical Tests for Randomness, which show that PUF output bits are fairly indistinguishable from reference random bits. With a properly designed Soft-PUF, an identical bitstream programmed across different devices produces different and virtually uncorrelated keys on those devices.

ERROR CORRECTION AND MINIMIZING INFORMATION LEAKAGE

After obtaining a properly designed Soft-PUF with suitable output characteristics, as quantified by metrics represented by Inter- PUF/Intra-PUF curves and randomness test results, the next step is to apply error correction. Cryptographic key generation is achieved using error-correction code that accounts for Intra-PUF (auto-correlation) variations. A syndrome encoder/decoder is required.

This differs from a conventional error-correction codec in the information leaked via “helper” bits. In a conventional error-correction code encoder, k data bits are applied as input, and n-k parity bits are outputted. These “helper” n-k parity bits, used to correct and detect bit corruptions, reveal information about k data bits.

Continue to page 2

As an example, if an error-correction code has (hypothetically) three bits of k and four bits of parity, chances are that parity completely reveals k. That’s because 2^{3} = 8 possible values of k are mapped to 2^{4} = 16 possible values of parity, resulting in a likely one-to-one mapping between k and parity (with some parity combinations unmapped).

Suppose (hypothetically) that four bits of k are mapped to three bits of parity, resulting in 16 possible values of k mapped to eight possible values of parity. In this case, parity information would likely narrow the search space of k from 16 down to two. This concept can be applied to error-correction codes of a higher order (larger blocks sizes as reflected by larger n).

To keep keys secret, it’s important to minimize the information leak via “helper” bits. There are numerous methods for doing so. One method encapsulates a conventional error-correction-code encoder in a syndrome encoder block, as well as a conventional error-correction-code decoder in a syndrome decoder block (*Fig. 4*).

Unlike an error-correction-code encoder that takes k bits input and outputs n-k bits, the syndrome encoder now takes n bits input and n-k bits output. On the decode side, instead of taking n bits input and outputs k bits, like the error-correction-code decoder, the syndrome decoder takes n bits input and outputs n bits. The end result of encapsulation is to produce helper (syndrome) bits that reveal much less information than plain-text parity bits.

Employing a properly designed Soft- PUF, such as one with Inter/Intra-PUF curves and NIST randomness test results like that in Figure 3, one could analyze randomness of a particular syndrome coding scheme given a certain sequence of parity bits. Columns 5 and 6 in the table show the results of applying randomness testing to syndrome bits. In this case, syndrome is generated using a more sophisticated syndrome coding scheme than the one just described.

Specifically, the 0th order “dc” sequence of all zeros and all ones and the first order “ac” sequence of alternating 1010s and 0101s were used as paritybit sequences. NIST Statistical Tests for Randomness analyzed the resulting syndrome from a properly designed syndrome encoder’s output.

Again, given the physical random nature of the PUF output in a properly designed Soft-PUF, syndrome bits from a properly designed syndrome encoder are reasonably random. This implies that it’s very difficult to infer information about parity (and, in turn, data bits k) from syndrome bits.

Custom correlation tests analyzing syndrome encoder input versus output confirm results from the earlier NIST randomness tests—over 95% of correlation results were within two standard errors of ideal uncorrelated value. In addition, the few present outliers didn't stray far from two standard errors away from ideal.