Many applications,
such as multimedia processing, digital signal, image, and vision
processing, and scientific computing, generate computation tasks
that are structured and repetitive and that exhibit varying degrees of
instruction-level parallelism (ILP). Examples of the
structured computations embedded in these applications include matrix-vector
products, convolution, and various digital signal processing (DSP)
filters. These tasks apply the same computation to large blocks
of data. Multiple such tasks are active simultaneously.
Moreover, the set of active tasks varies over time. These needs
dictate that the following architecture-level parameters be dynamically
variable: the number and types of function units (FUs), the amount of
on-chip register and cache memory, the ability to trade on-chip
memory resources against the number of FUs, and the interconnectivity
between these units. We propose to use reconfigurable logic blocks
to make the architecture adaptive to the applications' computing
and memory bandwidth needs.
Our goal in this research is to study and evaluate the effects of integrating
reconfiguration schemes into the on-chip FUs
and cache memories of a processor chip. Specifically, the following
architecture components would be reconfigurable:
- Dynamically programmable FUs and effective management of data
  and configuration flow.
- FUs reconfigurable as queues embedded in the register address space.
- Dynamically variable mapping of memory address space onto cache memory
  space to reduce the application's I/O bandwidth requirement.
- Caches reconfigurable as multi-function FUs.
- Reconfigurable multiple-buses (RMB) to provide effective connectivity
  among the FUs and cache memories.
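To make the idea of queues embedded in the register address space more concrete, the following is a minimal Python sketch of one possible software model. It is an illustration under stated assumptions, not the proposed hardware design: the class name `RegisterFile`, the choice of which register numbers are queue-mapped, and the push-on-write, pop-on-read semantics are all hypothetical.

```python
from collections import deque


class RegisterFile:
    """Toy register file in which some register numbers act as FIFO queues.

    Assumption (hypothetical): a write to a queue-mapped register appends a
    value, and a read from it removes and returns the oldest value.
    Ordinary registers keep normal overwrite/read semantics.
    """

    def __init__(self, num_regs, queue_regs):
        self.regs = [0] * num_regs
        # One FIFO per queue-mapped register number.
        self.queues = {r: deque() for r in queue_regs}

    def write(self, r, value):
        if r in self.queues:
            self.queues[r].append(value)   # enqueue
        else:
            self.regs[r] = value           # plain overwrite

    def read(self, r):
        if r in self.queues:
            # Raises IndexError if the queue is empty; real hardware
            # would presumably stall instead.
            return self.queues[r].popleft()  # dequeue
        return self.regs[r]


# Example: stream three operands through queue register 6.
rf = RegisterFile(num_regs=8, queue_regs={6, 7})
for x in (1, 2, 3):
    rf.write(6, x)
first, second = rf.read(6), rf.read(6)   # values come out in FIFO order
```

In this model, a producer could stream operands by writing repeatedly to one register number while a consumer reads them back in order from the same number, so no separate addressing scheme is needed for the queue.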
Our objective
is to use these novel ideas (reconfigurable FUs, reconfigurable cache memory,
and the RMB) to achieve balanced computing. A computation is balanced
if the computing bandwidth matches the memory bandwidth. We propose
to use reconfigurable function units and caches to offer fast
dynamic reconfiguration of the computing and memory bandwidth.
The RMB will be used to meet variable bus traffic
demands dynamically. We also propose to modify the compiler and operating
system so that applications can use the new features.
We will develop and evaluate the ABC (Adaptive Balanced Computing)
architecture and a high-level design for the entire system.
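
The balance condition can be illustrated with a small Python sketch. It assumes a simplified cost model that is not taken from the proposal: each FU completes a fixed number of operations per cycle, memory delivers a fixed number of words per cycle, and an n-by-n matrix-vector product costs 2n^2 operations and n^2 + 2n word transfers. All function names and parameters are hypothetical.

```python
def matvec_workload(n):
    """Operation and word-transfer counts for an n-by-n matrix-vector product.

    Assumed model: one multiply and one add per matrix element; the matrix
    and input vector are each read once and the result vector written once.
    """
    ops = 2 * n * n
    words = n * n + 2 * n
    return ops, words


def balance_ratio(ops, words, num_fus, ops_per_fu_per_cycle=1.0,
                  words_per_cycle=1.0):
    """Ratio of compute cycles to memory cycles; 1.0 means balanced."""
    compute_cycles = ops / (num_fus * ops_per_fu_per_cycle)
    memory_cycles = words / words_per_cycle
    return compute_cycles / memory_cycles


def best_fu_count(ops, words, max_fus, ops_per_fu_per_cycle=1.0,
                  words_per_cycle=1.0):
    """Pick the FU count in 1..max_fus whose ratio is closest to 1.0."""
    return min(
        range(1, max_fus + 1),
        key=lambda k: abs(
            balance_ratio(ops, words, k, ops_per_fu_per_cycle,
                          words_per_cycle) - 1.0
        ),
    )


ops, words = matvec_workload(64)
fus = best_fu_count(ops, words, max_fus=8)
```

Under these assumed unit rates, a 64-by-64 product is compute-bound with one FU (ratio about 1.94) and closest to balance with two FUs (ratio about 0.97). A reconfigurable design could, in principle, use a calculation of this kind to decide how many FUs to configure for a given task.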