Electronic Design
    Programming Efficiency: Part 2

    March 6, 2015
By Wilton Helm
    Image courtesy of Thinkstock.

    A number of years ago I did some work on a PABX. The device was electromechanical and far from state of the art, but the owner had been talked into spending a bunch of money expanding it and couldn’t justify scrapping it. I was also young and worked for a relatively low salary, so they could afford the luxury of my design efforts.

The system was given most of the features of a state-of-the-art PABX, thanks to an 8-bit micro and about four rack panels of custom interface cards. The part of the project that is interesting here is that it provided call detail recording (CDR) output data that the IT mainframe could query and bill to departments.

As a result, I became intimately familiar with phone company tariffs and call zones. It was about the time FAX machines, modems, and cell phones started exploding the numbering plan, so the area codes and prefixes changed frequently. The mainframe program had to create tables for charging and look each call up by area code and prefix, as well as how it had been placed.

    Probably today this would have been done using a database, and the gory details left to the OS, but it was before SQL became a household word. Any readers who are familiar with optimizing searches on databases with tens of thousands of records should instantly recognize this as an optimal fit for hashing.

    Hashing is of no value when data needs to be sorted in some order and lookups done on the sorted data, but this project had no such requirement. It simply had to create a new database every month (because of the significant changes that occurred) and then look up each phone call in that database.

Hashing was well documented by Donald Knuth in The Art of Computer Programming, first published in the late 1960s. I am not aware of any significant new insights that have come to light since.

    I was not (and still am not) a mainframe or IT programmer. The program needed to be written in RPG II, a “language” that wasn’t even intended to be a full programming language. In fact, it has often been referred to as “write only” because reading and understanding a program listing in RPG is very difficult.

    So I enlisted the help of a very bright medical student who had done summer work for the IT department. I gave him the basics of Knuth’s algorithm and how it worked, along with some recommendations on creating a hash function for this particular application, based on what I knew about the data.

    The basic idea behind Knuth’s approach is to create a “pseudo-random” number from the key field of the data (in this case the area code and prefix)—pseudo-random, in that it should result in as even and wide a distribution of records as possible. The next step is to divide that number by a prime number equal to the size allocated for the table, which should be about 20% larger than needed.

The remainder from that division is the index into the table where the data record is stored (or retrieved). If storing and that slot is already taken, add the quotient from the first division, take a new remainder (in case the index wraps off the end of the table), and try again.

When retrieving, if the record at the initial index doesn’t match the key, do the same thing. Taking a new remainder keeps the index inside the table, so there is no falling off the end, and adding the quotient moves to a new location an arbitrary distance down the table. Because the table size is a prime number, this stride will eventually visit every slot before it ever repeats, so the worst degenerate case is guaranteed to be a complete linear search.

The useful part, though, based on Knuth’s analysis, is that no matter how large the table, if its size is about 20% larger than needed, the average access takes under six tries. That beats a binary search of anything larger than 64 items, since a binary search needs more than six comparisons once the list exceeds 2^6 = 64 entries!
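For readers who want to see the mechanics, here is a minimal C sketch of that scheme. It is only a sketch: the original program was RPG II on a mainframe, and the key packing, table size, hash function, and record layout below are my own illustrative assumptions. Only the remainder/quotient probing follows the description above.

```c
#include <stdint.h>

/* TABLE_SIZE is a prime about 20% larger than the ~10,000 records
 * expected; the key packs the area code and prefix into one integer,
 * with 0 meaning "empty slot". */
#define TABLE_SIZE 12007u            /* prime, ~20% above 10,000 records */

typedef struct {
    uint32_t key;                    /* area code + prefix; 0 = empty    */
    double   rate;                   /* billing data for that key        */
} Record;

static Record table[TABLE_SIZE];

/* Any mixing function that spreads the keys evenly will do; this one
 * multiplies by a large odd constant (a common multiplicative hash) so
 * the quotient and remainder below look pseudo-random. */
static uint32_t hash(uint32_t key)
{
    return key * 2654435761u;
}

static int insert(uint32_t key, double rate)
{
    uint32_t h    = hash(key);
    uint32_t idx  = h % TABLE_SIZE;            /* remainder: first slot   */
    uint32_t step = h / TABLE_SIZE;            /* quotient: probe stride  */
    if (step % TABLE_SIZE == 0) step = 1;      /* guard: always advance   */

    for (uint32_t tries = 0; tries < TABLE_SIZE; tries++) {
        if (table[idx].key == 0 || table[idx].key == key) {
            table[idx].key  = key;
            table[idx].rate = rate;
            return 1;                          /* stored                  */
        }
        idx = (idx + step) % TABLE_SIZE;       /* wrap off the end safely */
    }
    return 0;                                  /* table completely full   */
}

static const Record *lookup(uint32_t key)
{
    uint32_t h    = hash(key);
    uint32_t idx  = h % TABLE_SIZE;
    uint32_t step = h / TABLE_SIZE;
    if (step % TABLE_SIZE == 0) step = 1;

    for (uint32_t tries = 0; tries < TABLE_SIZE; tries++) {
        if (table[idx].key == key) return &table[idx];
        if (table[idx].key == 0)   return 0;   /* empty slot: not stored  */
        idx = (idx + step) % TABLE_SIZE;
    }
    return 0;
}
```

Building the monthly table is then one insert() per area-code/prefix record, and rating a call is one lookup(); the probe stride comes from the quotient of the same division that produced the starting index, exactly as described.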

    The finished program took about an hour on a mainframe to process around 10,000 records. It didn’t seem reasonable to me, and my department was paying for the CPU time, so I looked at the program. The programmer had done his own research from less enlightened sources and decided to forgo the part about adding the quotient; instead, he created an overflow table for cases where the first try was already taken.

    The result was that a lot of the records ended up in the overflow table, requiring a linear search. I knew enough RPG to modify his code, so I implemented it as described above. The run time went down to a few minutes!

    The moral of the story is that there are a lot of things in programming that can be subtle: a small misunderstanding, a wrong assumption about how something works, a high-level construct used without knowing the code it generates…any of these can make a huge difference in performance. That is why I maintain that any good programmer should be able to describe the assembly language code that the compiler will generate from any construct in his program.

    I often look at disassembly listings of key pieces of the code I write. Sometimes I marvel at the tricks optimization can play. Other times I am shocked at what the compiler had to do to implement a seemingly simple concept. And occasionally I discover that if I change the source code slightly, the compiler can generate a significantly better executable.

    Anyone who has used SQL knows that, especially in joins, the details and order of operations can sometimes make a huge difference in how long it takes the command to execute and how much memory is required. Experience teaches the programmer to have a feel for what does and does not work, but that experience is acquired by trying several versions and timing them.

In C, at least on older compilers without good optimization, there is a difference between ++i and i++. I’m referring to cases where there are no side effects, where the pre- or post-increment value isn’t part of another calculation or stored somewhere else. The net effect is the same, but the process may not be. The i++ form is widely, almost instinctively, used by most C programmers; I have trained myself to use ++i instead. The C standard specifies that the result of either expression can be used in a subsequent calculation. In essence, it is left in an accumulator.

However, for ++i the value after the increment is used, while for i++ the value before the increment is used. That means for i++ a copy of the old value has to be stored somewhere before the variable is incremented, because once the increment happens the variable no longer holds the value the expression must yield. So an extra register or memory location is tied up and an extra instruction is required.

    Now a good optimizing compiler can recognize when the value is not needed, and optimize out those steps, but with optimization off (or inadequate) the compiler must generate the extra code in case it is needed. Back when I learned this secret, that level of optimization was often omitted in the interest of keeping the compile times manageable. Even today, it is often easier to debug code that hasn’t been optimized, so at least during development, the trick may result in savings.
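A tiny sketch of the point (hypothetical; what actually gets emitted depends entirely on the compiler and its optimization level):

```c
#include <stdio.h>

/* Neither statement below uses the value of the increment expression,
 * so a good optimizer emits identical code for both.  A naive compiler
 * may not. */
int main(void)
{
    int i = 0;

    i++;   /* expression value is the OLD i: a naive compiler copies i
              to a temporary, increments i, then discards the copy --
              an extra register or instruction doing nothing useful   */

    ++i;   /* expression value is the NEW i: the compiler can simply
              increment i; the result is already "in the accumulator"
              with no saved copy                                      */

    printf("%d\n", i);   /* prints 2 either way */
    return 0;
}
```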

Here’s another one that tripped me up recently. I’ve long known the advantages of lookup tables. Working mostly with small 8- or 16-bit integer CPUs, I’ve found that a lot of stuff can be sped up by replacing a complex calculation with a table lookup (assuming the size of the table is affordable).
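As a hypothetical sketch of the trick (the gamma correction here is just a stand-in for any per-sample calculation too expensive to repeat; it is not from either project): build the table once, and the per-sample work collapses to a single indexed read.

```c
#include <stdint.h>
#include <stdio.h>
#include <math.h>

/* 256-entry lookup table: one byte in, one corrected byte out.
 * pow() is far too slow to call per sample on a small CPU, but it is
 * perfectly fine to call 256 times at startup. */
static uint8_t gamma_lut[256];

static void build_gamma_lut(void)
{
    for (int i = 0; i < 256; i++)
        gamma_lut[i] = (uint8_t)(255.0 * pow(i / 255.0, 2.2) + 0.5);
}

/* Per-sample work is now one array index instead of a pow() call. */
static inline uint8_t correct(uint8_t raw)
{
    return gamma_lut[raw];
}

int main(void)
{
    build_gamma_lut();
    printf("%u -> %u\n", 128u, (unsigned)correct(128));
    return 0;
}
```

Whether the table actually wins depends, as the next paragraphs explain, on whether it stays in cache.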

    I was part of a discussion recently in the Asterisk community about ulaw and alaw transcoding and various ways to accomplish it. I was shocked that given the amount of memory available today and the small size of the required table, anyone would even consider an approach other than indexing into an array.

    I was enlightened by others that all this memory is available to us at the price of multilevel caches for performance. While the table size was insignificant for memory utilization, it was not insignificant compared to cache size, and a cache miss was more costly than the algorithm that might replace the table. Looks like life has come full circle.

    What optimization tricks have you learned? Or what pitfalls have you encountered by painful experience?
