Two different architectural solutions have been identified and
evaluated: a special-purpose and a standard processing system.
In particular, a massively parallel architecture was developed during
the project: PAPRICA, based on a square array of 256 single-bit
Processing Elements, delivered enough computational power for
real-time image processing at a time when general-purpose processors
could not (only the 80286 and 80386 were then commercially available).
The advantages offered by the first solution, such as an
ad-hoc design of both the processing paradigm and the overall system
architecture, are offset by the need to manage the complete
project: from the hardware level (design of the ASICs), through
the design of the architecture and of the programming language with its
optimizing compiler, to the development of applications
using the specific computational paradigm.
Conversely, the latter
takes advantage of standard development tools and environments but suffers
from a less specialized instruction set and a system architecture
less tailored to the task.
In addition, the following technological aspects need to be considered:
- fast technological improvements, which tend to shorten the
lifetime of the system;
- the costs of system design and engineering, which are justified
only for large production volumes.
For these reasons, the architectural solution currently under evaluation on
the ARGO vehicle is based on a standard 200 MHz MMX Pentium processor.
MMX technology represents an enhancement of the Intel processor family,
adding instructions, registers, and data types specifically
designed for multimedia data processing.
Specifically, software performance is boosted by exploiting a
SIMD technique: multiple
data elements can be processed in parallel using a single instruction.
The new general-purpose instructions supported by MMX technology
perform arithmetic and logical operations on multiple data elements
packed into 64-bit quantities.
These instructions accelerate applications
based on compute-intensive algorithms that
perform localized, recurring operations on small native data types.
More specifically, in the processing of gray-level images
each pixel is represented as an 8-bit quantity, hence an MMX instruction can
operate on 8 pixels simultaneously.
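
The packing of eight gray-level pixels into one 64-bit quantity can be illustrated with a minimal sketch. It is written in plain Python rather than MMX assembly, so it only emulates the idea; the helper names `pack_pixels`, `unpack_pixels`, and `invert_packed` are hypothetical and chosen for illustration. A single 64-bit XOR here plays the role that one packed logical instruction (such as PXOR) would play on MMX hardware.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # all ones across the eight 8-bit lanes


def pack_pixels(pixels):
    """Pack eight 8-bit gray levels into one 64-bit integer (pixel 0 in the low byte)."""
    if len(pixels) != 8:
        raise ValueError("exactly 8 pixels are required")
    # bytes() rejects values outside 0..255, so each pixel is a valid 8-bit quantity.
    return int.from_bytes(bytes(pixels), "little")


def unpack_pixels(value):
    """Split a 64-bit integer back into its eight 8-bit lanes."""
    return list(value.to_bytes(8, "little"))


def invert_packed(value):
    """Invert all eight pixels with one 64-bit XOR (255 - p in every lane)."""
    return value ^ MASK64


row = [0, 1, 2, 3, 4, 5, 6, 255]
print(unpack_pixels(invert_packed(pack_pixels(row))))
```

One operation on the packed value transforms all eight pixels, which is the source of the speed-up on 8-bit image data.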
In essence, the MMX extensions provide programmers with the following
new features:
- MMX Registers:
-
MMX technology provides eight new general-purpose 64-bit registers.
The MMX registers are aliased onto the
floating-point registers to ensure
backward compatibility with existing software, and specifically
with multitasking operating systems[3].
Unfortunately, this solution has two drawbacks:
- the programmer must not interleave MMX instructions with
floating-point code, and must issue a specific
instruction (EMMS) at the end of every MMX-enhanced routine.
The EMMS instruction marks all entries of the floating-point tag word
as empty, thus allowing
the correct execution of subsequent floating-point operations;
- frequent transitions between MMX and
floating-point instructions
may cause significant performance degradation.
- MMX Data Types:
-
MMX instructions can handle four different 64-bit data types:
- 8 bytes packed into one 64-bit quantity,
- 4 words (16 bits each) packed into one 64-bit quantity,
- 2 double-words (32 bits each) packed into one 64-bit quantity, or
- 1 quadword (64-bit).
This makes it possible to process multiple data elements with a
single instruction or to handle 64-bit data directly.
- MMX arithmetics:
-
The main innovation of MMX technology lies in the two different
methods used to process data:
- saturation arithmetic and
- wraparound mode.
They differ in how an overflow or underflow caused
by an arithmetic operation is handled.
In both cases, MMX instructions neither generate exceptions nor
set flags.
In wraparound mode, a result that overflows or underflows
is truncated, and only its least significant bits
are returned;
conversely, in saturation mode, a result that overflows
is set to the maximum value of the range, and
a result that underflows is set to the minimum value.
For example, for packed unsigned bytes, results that
overflow or underflow are saturated to
0xFF or to 0x00
respectively.
Saturation is very useful for
gray-level image processing: it
drives a gray value to pure black or pure white,
without the inversion that wraparound can produce.
- MMX instructions:
-
MMX processors feature 57 new
instructions, which may be grouped into
the following functional categories:
arithmetic instructions,
comparison instructions,
conversion instructions,
logical instructions,
shift instructions,
data transfer instructions,
and the EMMS instruction.
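
The difference between the wraparound and saturation modes described above can be made concrete with a short emulation on packed unsigned bytes. This is only a sketch in plain Python, processing one lane at a time, not MMX code; the function names are hypothetical. On MMX hardware the corresponding operations would be single packed instructions (for instance PADDB and PSUBB for wraparound, PADDUSB and PSUBUSB for unsigned saturation).

```python
def add_wrap(a, b):
    """Lane-wise add, keeping only the low 8 bits of each sum (wraparound)."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def add_sat(a, b):
    """Lane-wise add, clamping each sum to the unsigned-byte maximum 0xFF."""
    return [min(x + y, 0xFF) for x, y in zip(a, b)]


def sub_wrap(a, b):
    """Lane-wise subtract, keeping only the low 8 bits of each difference."""
    return [(x - y) & 0xFF for x, y in zip(a, b)]


def sub_sat(a, b):
    """Lane-wise subtract, clamping each difference to the minimum 0x00."""
    return [max(x - y, 0x00) for x, y in zip(a, b)]


pixels = [200, 250, 10, 0, 128, 255, 30, 90]
brighten = [100] * 8   # add 100 to every pixel
darken = [20] * 8      # subtract 20 from every pixel

print(add_wrap(pixels, brighten))  # bright pixels wrap around to dark values
print(add_sat(pixels, brighten))   # bright pixels clamp to pure white (255)
print(sub_wrap(pixels, darken))    # dark pixels wrap around to bright values
print(sub_sat(pixels, darken))     # dark pixels clamp to pure black (0)
```

With wraparound, brightening the pixel 200 by 100 yields 44, a dark value; with saturation it yields 255. This is the inversion effect that saturation avoids.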