The Processing System


Architectural Issues

Two architectural solutions were considered and evaluated: a special-purpose and a standard processing system. In particular, a massively parallel architecture was developed during the project: PAPRICA, based on a square array of 256 single-bit Processing Elements, delivered enough computational power for real-time image processing at a time when general-purpose processors could not (only the 80286 and 80386 were commercially available then).

The advantages offered by the first solution, such as an ad hoc design of both the processing paradigm and the overall system architecture, are diminished by the need to manage the complete project: from the hardware level (design of the ASICs), through the design of the architecture and of the programming language with its optimizing compiler, up to the development of applications using the specific computational paradigm. Conversely, the latter takes advantage of standard development tools and environments but suffers from a less specific instruction set and a less application-oriented system architecture.

In addition, the following technological aspects need to be considered:

  • the fast pace of technological improvement, which tends to reduce the lifetime of the system;
  • the costs of system design and engineering, which are justified only for large production volumes.

For these reasons the architectural solution currently under evaluation on the ARGO vehicle is based on a standard 200 MHz MMX Pentium processor.

The MMX Technology

MMX technology is an enhancement of the Intel processor family that adds instructions, registers, and data types specifically designed for multimedia data processing. Software performance is boosted by exploiting a SIMD technique: multiple data elements are processed in parallel by a single instruction. The new general-purpose instructions supported by MMX technology perform arithmetic and logical operations on multiple data elements packed into 64-bit quantities. These instructions accelerate applications based on compute-intensive algorithms that perform localized, recurring operations on small native data types. More specifically, in the processing of grey-level images, data is represented in 8-bit quantities, hence an MMX instruction can operate on 8 pixels simultaneously.
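The idea of operating on 8 pixels at once can be illustrated with a minimal portable C sketch. It does not use MMX instructions; it only emulates the effect of one packed logical operation by treating eight 8-bit pixels as a single 64-bit quantity. The function name invert8 and the choice of operation (image negative via XOR) are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: invert 8 grey-level pixels (v -> 255 - v)
 * with a single 64-bit XOR, emulating what one packed logical
 * instruction does on eight 8-bit elements. */
void invert8(const uint8_t src[8], uint8_t dst[8])
{
    uint64_t packed;

    memcpy(&packed, src, sizeof packed);     /* pack 8 pixels into 64 bits */
    packed ^= UINT64_C(0xFFFFFFFFFFFFFFFF);  /* one operation, 8 pixels    */
    memcpy(dst, &packed, sizeof packed);     /* unpack the 8 results       */
}
```

A bitwise operation is used here because it never carries between byte lanes; packed arithmetic needs the dedicated per-lane handling that MMX instructions provide in hardware.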

Basically the MMX extensions provide the programmers with the following new features:

MMX Registers:
the MMX technology provides eight new general-purpose 64-bit registers. The MMX registers are mapped onto the floating-point registers to ensure backward compatibility with existing software, and specifically with multitasking operating systems[3].

Unfortunately this solution has two drawbacks:

  • the programmer should not mix MMX instructions and floating-point code, and must issue a specific instruction (EMMS) at the end of every MMX-enhanced routine. The EMMS instruction empties the floating-point tag word, thus allowing floating-point operations to execute correctly;
  • frequent transitions between the MMX and floating-point instructions may cause significant performance degradation.

MMX Data Types:
the MMX instructions can handle four different 64-bit data types:
  • 8 bytes packed into one 64-bit quantity,
  • 4 words packed into one 64-bit quantity,
  • 2 double-words packed into one 64-bit quantity, or
  • 1 quadword (64-bit).
This allows multiple data elements to be processed with a single instruction, or 64-bit data to be handled directly.
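The four layouts can be pictured as different views of the same 64 bits. The following C sketch is only an illustration: the type name packed64 and the helper splat_byte are hypothetical, and the union models the layouts in ordinary memory rather than in an actual MMX register.

```c
#include <stdint.h>

/* Hypothetical type: one 64-bit quantity seen as the four packed layouts. */
typedef union {
    uint8_t  bytes[8];   /* 8 packed bytes        */
    uint16_t words[4];   /* 4 packed words        */
    uint32_t dwords[2];  /* 2 packed double-words */
    uint64_t qword;      /* 1 quadword            */
} packed64;

/* Fill every byte lane with the same value, e.g. to build a constant
 * that will be applied to 8 pixels at once. */
packed64 splat_byte(uint8_t v)
{
    packed64 p;

    for (int i = 0; i < 8; ++i)
        p.bytes[i] = v;
    return p;
}
```

The same bits can then be read back through any of the four members, depending on the element width an operation expects.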

MMX Arithmetic:
the main innovation of the MMX technology consists of the two different methods used to process the data:
  • saturation arithmetic and
  • wraparound mode.
They differ in how the overflow or underflow caused by mathematical operations is handled. In both cases MMX instructions neither generate exceptions nor set flags. In wraparound mode, results that overflow or underflow are truncated and only the least significant part of the result is returned; conversely, with saturation, a result that overflows is set to the maximum value of the range, and a result that underflows is set to the minimum value. For example, for packed unsigned bytes, results that overflow or underflow are saturated to 0xFF or to 0x00 respectively.

Saturation arithmetic is very useful for grey-level image processing: it brings an out-of-range grey value to pure white or pure black, whereas wraparound would produce an inversion (for example, an overly bright pixel would wrap to a dark one).
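The difference between the two modes can be sketched for a single unsigned byte in plain C. This is a scalar emulation under stated assumptions, not MMX code: the function names are hypothetical, and each function models what the corresponding packed instruction (PADDB and PSUBB for wraparound, PADDUSB and PSUBUSB for unsigned saturation) does to each of the 8 byte lanes.

```c
#include <stdint.h>

/* Wraparound: only the least significant 8 bits of the result are kept. */
uint8_t add_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

uint8_t sub_wrap_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a - b);
}

/* Unsigned saturation: overflow clamps to 0xFF, underflow to 0x00. */
uint8_t add_sat_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;

    return sum > 0xFFu ? (uint8_t)0xFF : (uint8_t)sum;
}

uint8_t sub_sat_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : (uint8_t)0x00;
}
```

For instance, brightening a pixel of value 200 by 100 yields 44 (a dark grey) with add_wrap_u8 but 255 (pure white) with add_sat_u8; darkening a pixel of value 50 by 100 yields 206 with sub_wrap_u8 but 0 (pure black) with sub_sat_u8.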

MMX Instructions:
MMX technology adds 57 new instructions, which may be grouped into the following functional categories: arithmetic instructions, comparison instructions, conversion instructions, logical instructions, shift instructions, data transfer instructions, and the EMMS instruction.