Text Encoding Simplifies Microcontroller Command Parsing

May 9, 2014
Parsing text, such as the SCPI commands sent to a PIC microcontroller, consumes considerable processor time and resources. Translating the first four letters of each command word into 4-bit codes with a simple encoding table, and packing those codes into a 16-bit number, yields a much more efficient parsing scheme.

This article is part of the Ideas for Design Series: Vol. 3, No. 5

While working on a Microchip PIC project, I created a set of commands in the style of SCPI (Standard Commands for Programmable Instruments) to control the PIC. These commands are built from text words separated by colons, and only the first four characters of each word are used.

In previous projects, I found that parsing text consumes significant computing time and code space. Typically, text parsing is handled by string comparisons or by building a parse tree. Neither technique is simple to design and implement on a microcontroller.

I knew that it would be faster to parse commands if I could convert the text into 16-bit numbers. So, I developed a method that converts the first four characters of each command word to uppercase and then encodes them as a 16-bit number. Each character is translated into a four-bit code, and the four codes are packed into one 16-bit number, with the first character in the most significant four bits.

But don’t you need five bits to represent 26 letters? Yes, if each letter gets its own code. To reduce the letters to four bits, I analyzed two-letter pairs and grouped the letters based on how often they are used, so that some letters share a code. This encoding worked out well for the 25 or so commands I needed. (More extensive command sets may need to be checked for duplicate encodings and the encoding changed accordingly.)

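The encoding gives the space character and the letters A, E, I, O, U, Y, and S their own codes, since they are very common. The remaining consonants are then grouped together in sets that share a code. The table shows the encoding for the letters.

Code   Characters
0x0    space
0x1    A
0x2    E
0x3    I
0x4    O
0x5    U
0x6    Y
0x7    S
0x8    L, T
0x9    N, R
0xA    B, D
0xB    C, G
0xC    K, P
0xD    F, H
0xE    M, V, W
0xF    J, Q, X, Z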
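Because several letters share a code, two different command words can end up with the same 16-bit number. The following is a minimal sketch of a duplicate check that could be run over a command list, assuming the lookup array from the article's C implementation and a padding code of 0x0. The function names `encode_keyword` and `find_collision`, and the keywords DISPlay and DISK used below, are hypothetical and chosen only for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* 4-bit codes for 'A'..'Z' (the article's lookup array). */
static const unsigned char LookUpTable[] = {
    0x1, 0xA, 0xB, 0xA, 0x2, 0xD, 0xB, 0xD, 0x3, 0xF, 0xC, 0x8, 0xE,
    0x9, 0x4, 0xC, 0xF, 0x9, 0x7, 0x8, 0x5, 0xE, 0xE, 0xF, 0x6, 0xF
};

/* Pack the first four characters of a keyword into 16 bits.
   Assumptions: ASCII input; non-letters and padding encode as 0x0. */
static uint16_t encode_keyword(const char *s)
{
    uint16_t word = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t code = 0x0;
        if (*s != '\0' && *s != ':') {
            int u = toupper((unsigned char)*s++);
            if (u >= 'A' && u <= 'Z')
                code = LookUpTable[u - 'A'];
        }
        word = (uint16_t)((word << 4) | code);
    }
    return word;
}

/* Return 1 and report the first pair of keywords that encode to the
   same number, or return 0 if all encodings are distinct. */
int find_collision(const char *const *words, size_t n,
                   size_t *first, size_t *second)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (encode_keyword(words[i]) == encode_keyword(words[j])) {
                *first = i;
                *second = j;
                return 1;
            }
        }
    }
    return 0;
}
```

With this table, the hypothetical pair DISPlay and DISK would be flagged, since P and K share code 0xC and both words encode to 0xA37C. A check like this is probably best run on a PC during development rather than on the microcontroller itself.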

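One implementation in C, as a lookup table indexed by letter from A to Z with the space character handled separately, is:

/* 4-bit codes for 'A'..'Z', indexed by (letter - 'A') */
const unsigned char LookUpTable[] = {
    0x1, 0xA, 0xB, 0xA, 0x2, 0xD, 0xB, 0xD,   /* A-H */
    0x3, 0xF, 0xC, 0x8, 0xE, 0x9, 0x4, 0xC,   /* I-P */
    0xF, 0x9, 0x7, 0x8, 0x5, 0xE, 0xE, 0xF,   /* Q-X */
    0x6, 0xF                                  /* Y-Z */
};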
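The article does not list the encoding routine itself, so the following is only a sketch of how the lookup table might be used. It assumes ASCII input, treats the end of the word (or a colon) as padding, and encodes padding and any non-letter as 0x0. The names `char_code` and `encode_keyword` are hypothetical.

```c
#include <ctype.h>
#include <stdint.h>

/* 4-bit codes for 'A'..'Z' (the article's lookup array). */
static const unsigned char LookUpTable[] = {
    0x1, 0xA, 0xB, 0xA, 0x2, 0xD, 0xB, 0xD, 0x3, 0xF, 0xC, 0x8, 0xE,
    0x9, 0x4, 0xC, 0xF, 0x9, 0x7, 0x8, 0x5, 0xE, 0xE, 0xF, 0x6, 0xF
};

/* Map one character to its 4-bit code. Letters are folded to
   uppercase first; anything else is assumed to encode as 0x0. */
static uint8_t char_code(char c)
{
    int u = toupper((unsigned char)c);
    if (u >= 'A' && u <= 'Z')
        return LookUpTable[u - 'A'];   /* relies on contiguous ASCII letters */
    return 0x0;
}

/* Pack the first four characters of a keyword into a 16-bit number,
   first character in the most significant four bits. Keywords shorter
   than four characters are padded with 0x0. */
uint16_t encode_keyword(const char *s)
{
    uint16_t word = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t code = 0x0;            /* padding once the keyword has ended */
        if (*s != '\0' && *s != ':')
            code = char_code(*s++);
        word = (uint16_t)((word << 4) | code);
    }
    return word;
}
```

Under these assumptions, `encode_keyword("CLS")` returns 0xB870 and `encode_keyword("CALCulate")` returns 0xB18B, in agreement with the article's examples.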

These two examples show the encoding of SCPI commands:

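CLS translates to 0xB870 (C = 0xB, L = 0x8, S = 0x7, and a space code of 0x0 to fill the fourth position)

CALCulate:AVERage:COUNt translates to 0xB18B, 0x1E29, 0xB459 (one 16-bit number each for CALC, AVER, and COUN)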
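A multi-word command like the second example can be encoded by walking the string and encoding the word that follows each colon. The sketch below is one possible way to do this, not the author's routine. It assumes ASCII input and a padding code of 0x0, and the names `encode_keyword` and `encode_command` are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* 4-bit codes for 'A'..'Z' (the article's lookup array). */
static const unsigned char LookUpTable[] = {
    0x1, 0xA, 0xB, 0xA, 0x2, 0xD, 0xB, 0xD, 0x3, 0xF, 0xC, 0x8, 0xE,
    0x9, 0x4, 0xC, 0xF, 0x9, 0x7, 0x8, 0x5, 0xE, 0xE, 0xF, 0x6, 0xF
};

/* Pack the first four characters of one keyword into 16 bits.
   Assumption: non-letters and padding encode as 0x0. */
static uint16_t encode_keyword(const char *s)
{
    uint16_t word = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t code = 0x0;
        if (*s != '\0' && *s != ':') {
            int u = toupper((unsigned char)*s++);
            if (u >= 'A' && u <= 'Z')
                code = LookUpTable[u - 'A'];
        }
        word = (uint16_t)((word << 4) | code);
    }
    return word;
}

/* Encode each colon-separated keyword of cmd into out[].
   Returns the number of 16-bit words written (at most max_words). */
size_t encode_command(const char *cmd, uint16_t *out, size_t max_words)
{
    size_t n = 0;
    while (*cmd != '\0' && n < max_words) {
        out[n++] = encode_keyword(cmd);
        while (*cmd != '\0' && *cmd != ':')   /* skip the rest of this keyword */
            cmd++;
        if (*cmd == ':')                      /* step over the separator */
            cmd++;
    }
    return n;
}
```

Called on "CALCulate:AVERage:COUNt" with room for at least three words, this sketch writes 0xB18B, 0x1E29, and 0xB459 and returns 3. Note that a leading colon would produce an empty first keyword, which this version encodes as 0x0000.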

After encoding the incoming text, parsing is just a matter of comparing 16-bit numbers rather than text strings. This can be done with a switch (CASE) statement or a series of if statements, either of which is much simpler (and usually faster) than handling text strings on a microcontroller. In my project, this approach greatly reduced the amount of code needed.
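As a rough illustration of the switch-based approach, the sketch below dispatches on already-encoded words. The four keyword values come from the article's examples. The command identifiers, the `parse_command` name, and the rule that a command must match an exact word count are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encoded keywords, taken from the article's two examples. */
#define KW_CLS  0xB870u
#define KW_CALC 0xB18Bu
#define KW_AVER 0x1E29u
#define KW_COUN 0xB459u

/* Hypothetical command identifiers, for illustration only. */
typedef enum { CMD_UNKNOWN, CMD_CLS, CMD_CALC_AVER_COUN } command_t;

/* Identify a command from its n encoded keywords. */
command_t parse_command(const uint16_t *w, size_t n)
{
    if (n == 0)
        return CMD_UNKNOWN;

    switch (w[0]) {
    case KW_CLS:
        return (n == 1) ? CMD_CLS : CMD_UNKNOWN;
    case KW_CALC:
        if (n == 3 && w[1] == KW_AVER && w[2] == KW_COUN)
            return CMD_CALC_AVER_COUN;
        return CMD_UNKNOWN;
    default:
        return CMD_UNKNOWN;
    }
}
```

Each comparison here is a single 16-bit integer test, so no string functions are needed at parse time. A larger command set would likely nest a second switch on `w[1]` inside each first-level case.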

David Hunter is an electrical engineer with First Consulting Inc. in Rochester, N.Y. He has a BSEE and an MSEE from the Rochester Institute of Technology and has worked for more than 25 years as a design engineer in embedded-systems software, digital, analog, and RF circuit hardware design.
