Electronic Design

ARM On-Chip Debug Hardware

ARM, one of the standard microprocessor architectures of the twenty-first century, has a major presence in systems-on-chip (SoCs). A 32-bit RISC architecture, it was designed for low- to mid-range applications and for low-power operation.

Vendors can license the ARM architecture, and some licensees can modify it. Intel has its own variation on ARM, the StrongARM (originally developed by Digital Equipment Corp.), along with communications-oriented derivatives like XScale. ARM has enhanced the architecture with higher-performance cores, as well as with DSP capability and Java code execution.

Because its cores are deployed in ASSPs and ASICs, ARM has developed on-chip debugging resources for them. The latest is the combination of EmbeddedICE and ETM. The first, EmbeddedICE, defines a debug architecture built into the CPU core. It includes a Debug state, hardware comparators, the ability to load instructions and to read and write memory from the JTAG port, and a JTAG/TAP port. The second, the Embedded Trace Macrocell (ETM), defines the on-chip hooks needed to implement an on-chip instruction and data-access trace facility.

EmbeddedICE is an optional block available for ARM cores. The ETM specification has been released, and the block is available.

A third debug tool, the Multi-ICE emulator box, supports multiple EmbeddedICE TAP controllers on multiple chips. The TAP controllers are daisy-chained, like multiple TAP controllers on a single ASIC. Not every processor on the chain has to be an ARM core; the others are bypassed.
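
The bypass arrangement can be sketched as follows. This is a minimal illustration, not Multi-ICE's actual behavior: it assumes only the IEEE 1149.1 conventions that the BYPASS instruction is all ones and that a bypassed TAP contributes a single-bit data register. The function names, chain layout, and opcode value are hypothetical.

```python
# Illustrative model of addressing one TAP in a JTAG daisy chain.
# Function names and the example chain are hypothetical.

BYPASS_DR_BITS = 1  # IEEE 1149.1: a bypassed TAP presents a 1-bit data register


def build_ir_scan(ir_lengths, target, opcode):
    """Return per-device IR bit lists (each LSB first) for one IR scan.

    ir_lengths[i] is the instruction-register length of device i, counted
    from the TDI end of the chain. Every device except `target` receives
    the all-ones BYPASS instruction.
    """
    scan = []
    for index, length in enumerate(ir_lengths):
        if index == target:
            scan.append([(opcode >> bit) & 1 for bit in range(length)])
        else:
            scan.append([1] * length)
    return scan


def dr_padding(num_devices, target):
    """Bypass bits on the TDI side and the TDO side of the target's DR."""
    tdi_side = target * BYPASS_DR_BITS
    tdo_side = (num_devices - 1 - target) * BYPASS_DR_BITS
    return tdi_side, tdo_side
```

For a three-device chain with the debug target in the middle, `dr_padding(3, 1)` reports one pad bit on each side of the target's data register, which the host has to account for when it shifts data in and out.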

The ARM CPU implements both Debug and Monitor modes. Debug Mode takes control of the CPU, stopping all software, while Monitor Mode lets the real-time operating-system (RTOS) software keep running. Debug Mode is entered by a breakpoint/watchpoint exception, by executing the breakpoint instruction, or by external signals.

EmbeddedICE supports four primary external signals for debug: DBGIEBKPT (hardware breakpoint), DBGDEWPT (data breakpoint), EDBGRQ (request to enter debug), and DBGACK (acknowledge, CPU in Debug state).

Instructions fetched from memory are sampled at the end of a cycle for breakpoints/watchpoints. If an instruction doesn't reach the execute stage, it won't trigger a breakpoint/watchpoint. If an interrupt arrives on an instruction boundary, the interrupt is taken first and the instruction is repeated afterward, at which point its breakpoint occurs.
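
The rule that only instructions reaching the execute stage can trigger a breakpoint can be illustrated with a toy prefetch model. This is a simplified sketch, not a model of any particular ARM pipeline: the program encoding, the prefetch depth, and the function name are all assumptions made for the example.

```python
from collections import deque


def first_breakpoint_hit(program, breakpoints, start=0, prefetch=2, max_steps=100):
    """Return the address of the first breakpoint that actually fires, or None.

    program maps a word address to ("nop",) or ("branch", target).
    Instructions are tagged when fetched, but a tag only takes effect if
    the instruction reaches execute. A taken branch discards everything
    that was prefetched behind it.
    """
    queue = deque()
    fetch_pc = start
    for _ in range(max_steps):
        # Fetch ahead, tagging each instruction as it enters the queue.
        while len(queue) <= prefetch and fetch_pc in program:
            queue.append((fetch_pc, fetch_pc in breakpoints))
            fetch_pc += 4
        if not queue:
            return None
        addr, tagged = queue.popleft()
        if tagged:
            return addr  # the tagged instruction reached execute
        op = program[addr]
        if op[0] == "branch":
            queue.clear()      # prefetched instructions never execute
            fetch_pc = op[1]
    return None
```

With a branch at address 4 that jumps to 16, a breakpoint on address 8 is fetched and tagged but never fires, because the instruction is flushed before it executes.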

Watchpoints generate a data abort, and breakpoints generate a prefetch abort. Users can combine two compare units in a Chain or a Range arrangement. The watchpoints can include a masked data compare. In Debug Mode, users can load instructions or read/write memory via the TAP controller. They can also single-step the CPU.
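
A behavioral sketch of the masked compare and of the Chain and Range pairings follows. It is not a register-level description of EmbeddedICE: the class and function names are hypothetical, and the convention that a 1 in the mask marks a don't-care bit is an assumption of this example.

```python
WORD = 0xFFFFFFFF


def unit_match(value, mask, observed):
    """Compare one unit; mask bits set to 1 are treated as don't-care (assumed)."""
    return ((observed ^ value) & ~mask & WORD) == 0


class ChainedPair:
    """Unit b can fire only after unit a has matched at least once."""

    def __init__(self, a, b):
        self.a = a  # (value, mask) tuple
        self.b = b  # (value, mask) tuple
        self.armed = False

    def observe(self, addr):
        hit = self.armed and unit_match(*self.b, addr)
        if unit_match(*self.a, addr):
            self.armed = True
        return hit


def range_match(outer, inner, addr):
    """Match addresses inside the outer block but outside the inner block."""
    return unit_match(*outer, addr) and not unit_match(*inner, addr)
```

For example, `range_match((0x0, 0xFF), (0x0, 0x1F), addr)` matches the first 256 bytes of memory except the first 32, using only two compare units.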

Monitor Mode lets the user handle breakpoints and watchpoints with a software monitor. It allows the processor to keep running and to service interrupts, but stops the main application thread. This lets the CPU interface with on-chip functions and peripherals while debugging. It doesn't support external watchpoints/data breakpoints or single-step. An RTOS monitor allows the kernel to continue to run. A hardware breakpoint will trigger a prefetch/data abort exception into the monitor, rather than into Debug Mode.

ETM is a separate, optional module. It incorporates its own address/data comparators, context ID (process) comparators, range logic, a three-state sequencer, external inputs, 16-bit counters, and trace start/stop control. Users can configure complex trigger and filter combinations. Each context ID comparator has its own 32-bit value register, and the comparators share a mask register. Defined trace signals include PIPSTAT, TRACEPKT, TRACESYNC, and TRACECLK.
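
How a three-state sequencer and a 16-bit counter might combine into a trigger condition can be sketched as below. This is an illustrative model only; the class names, event labels, and the count-down-to-zero behavior are assumptions, not the ETM's programming interface.

```python
class Sequencer:
    """Three-state sequencer: states 1, 2, 3, driven by named events."""

    def __init__(self, transitions):
        # transitions maps (current_state, event) -> next_state
        self.transitions = transitions
        self.state = 1

    def step(self, events):
        """Apply the first event that has a transition from the current state."""
        for event in events:
            nxt = self.transitions.get((self.state, event))
            if nxt is not None:
                self.state = nxt
                break
        return self.state


class Counter16:
    """16-bit down-counter that reports when it has reached zero."""

    def __init__(self, reload):
        self.value = reload & 0xFFFF

    def tick(self, enable):
        if enable and self.value > 0:
            self.value -= 1
        return self.value == 0


# Hypothetical trigger: comparator A hit, then comparator B hit,
# then three further cycles counted while in state 3.
seq = Sequencer({(1, "A"): 2, (2, "B"): 3})
delay = Counter16(3)


def trigger(events):
    return delay.tick(seq.step(events) == 3)
```

Calling `trigger` once per cycle with the list of comparator events seen in that cycle gives a "when" condition that depends on both event order and elapsed count.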

The ETM was designed to support full-speed trace, at least for the instruction trace. It traces PC changes and branch addresses. In addition, the ETM traces data accesses (address, data, or both), and it can filter the trace and compress addresses. ETM implements two kinds of trace filtering: start/stop, and include/exclude per element.
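
The idea behind address compression can be shown with a small sketch: send only the low-order bits of an address that differ from the previously traced one. The 7-bit grouping with a continuation flag used here is an illustrative encoding chosen for the example, not the exact ETM packet format, and the function names are hypothetical.

```python
WORD = 0xFFFFFFFF


def compress_address(prev, new):
    """Encode `new` relative to `prev` as 7-bit groups, low-order first.

    Only the groups up to the highest one that changed are sent. Bit 7
    of each byte is set when another byte follows.
    """
    diff = (prev ^ new) & WORD
    groups = max(1, (diff.bit_length() + 6) // 7)
    out = []
    for i in range(groups):
        chunk = (new >> (7 * i)) & 0x7F
        more = 0x80 if i < groups - 1 else 0
        out.append(chunk | more)
    return bytes(out)


def decompress_address(prev, packet):
    """Rebuild the full address from the previous one and the packet."""
    value, nbits = 0, 0
    for byte in packet:
        value |= (byte & 0x7F) << nbits
        nbits += 7
    keep_mask = ~((1 << nbits) - 1) & WORD  # upper bits reused from prev
    return ((prev & keep_mask) | value) & WORD
```

A short branch, such as from 0x8000 to 0x8010, then costs one byte on the trace port instead of four, which is the kind of saving that makes full-speed tracing over a narrow port practical.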

See associated figure

  • Two watchpoints (can be breakpoints)
  • Can add more breakpoints/watchpoints
  • Supports Debug and Monitor modes
  • Takes breakpoint exception in the Monitor Mode
  • Multi-ICE supports multiple CPUs
  • JTAG/TAP port (read/write registers and memory, insert instructions)
  • High-level, full-duplex communications channel over JTAG

  • Compresses trace data
  • Supports instruction and data traces
  • Complex Triggering (when), Filtering (type)
  • Address and data comparators, 16-bit counters,
    three-state sequencers
  • Transmits trace packets
  • Trace port: 2X multiplexed, or ½X; 4/8/16-bit ports
