Latest from Embedded

Machine Learning

Speculations on AI, Co-Evolution, and the Future of Our Species

April 24, 2024

Machine Learning

Quick Poll: How Are You Using Generative AI and LLMs?

April 24, 2024

EDA

UMI Scales the Memory Wall in the Chiplet/Multi-Die Era

April 22, 2024

FPGA

Adaptive SoC Delivers Single-Chip Automotive Solution

April 22, 2024

Embedded

AMD’s New Family of Low-Cost FPGAs is All About Flexibility

April 19, 2024

Embedded

More Product Highlights from embedded world 2024

April 19, 2024

Embedded

Top Strategies to Improve Microcontroller Security (Part 1)

April 18, 2024

Embedded

What Caught Our Attention at embedded world 2024? @ mwrf.com

April 17, 2024

Automotive

NXP Outfits Its First 5-nm Automotive SoC with Real-Time CPU Cores

April 16, 2024

What’s the Difference Between Fixed-Point, Floating-Point, and Numerical Formats? (.PDF Download)

Aug. 31, 2017

Embedded C and C++ programmers are familiar with signed and unsigned integers and floating-point values of various sizes, but a number of numerical formats can be used in embedded applications. Here we take a look at all of these formats and where they might be found.

One reason for examining different formats is to understand how they work and where they can be applied. For example, fixed-point values can often be used when floating-point support isn’t available. Fixed point may be preferable in some instances, while floating-point support is available for other reasons, such as precision or representation.

Developers may be using single- and double-precision IEEE 754 standard formats, but what about 16-bit half precision or even 8-bit floating point? The latter is being used in deep neural networks (DNNs), where small values are useful. Small integers and fixed point can be used with DNN weights as well, depending on the application and hardware.

There are a variety number of ways to represent numbers. However, the layouts tend to vary only in the number of bits involved (see the figure). The use of the sign bit in binary-encoded values differs depending on whether 1’s or 2’s complement encoding is used. The 1’s complement approach uses the same encoding for the integer portion, which means there is actually a positive and negative zero value. A 2’s complement number has a single zero value, but there’s one more negative value than positive value. For example, an 8-bit signed integer includes values −128 to −1, 0 and 1 to 127.

Download

Latest from Embedded

Speculations on AI, Co-Evolution, and the Future of Our Species

Quick Poll: How Are You Using Generative AI and LLMs?

UMI Scales the Memory Wall in the Chiplet/Multi-Die Era

Adaptive SoC Delivers Single-Chip Automotive Solution

AMD’s New Family of Low-Cost FPGAs is All About Flexibility

More Product Highlights from embedded world 2024

Top Strategies to Improve Microcontroller Security (Part 1)

What Caught Our Attention at embedded world 2024? @ mwrf.com

NXP Outfits Its First 5-nm Automotive SoC with Real-Time CPU Cores

What’s the Difference Between Fixed-Point, Floating-Point, and Numerical Formats? (.PDF Download)

Comments

To join the conversation, and become an exclusive member of Electronic Design, create an account today!