
2012 IEEE Computer Society Annual Symposium on VLSI

A Dataflow Framework for DSP Algorithm Refinement


Youngsoo Kim and Winser E Alexander
Dept. of Electrical and Computer Engineering
North Carolina State University
Raleigh, USA
Email: {youngsoo kim, winser}@ncsu.edu

William W Edmonson
Dept. of Electrical and Computer Engineering
North Carolina A&T State University
Greensboro, USA

Abstract: Current video compression algorithms are increasingly complicated and difficult to analyze and profile. Design tools and system-level languages often prove inefficient and incapable of providing complexity analysis as a first step toward the implementation of video compression algorithms. This paper proposes a framework that helps develop a methodology for deriving analytical dataflow models. The framework proposes dataflow models for quantifying the underlying algorithm's memory complexity and related timing considerations, and for verifying the correctness of the video compression algorithm.

Keywords: refinement; DSP; dataflow; framework

978-0-7695-4767-1/12 $26.00 © 2012 IEEE, DOI 10.1109/ISVLSI.2012.74

I. INTRODUCTION

Current digital signal processing algorithms, such as H.264 video compression and BioImage Signal Processing (BISP), are increasingly complex, and their final performance is difficult to analyze from an abstract design description [1]. Various design environments have been used for analyzing and mapping DSP algorithms onto an appropriate hardware architecture. These design environments and system-level languages were often incapable of providing upfront performance estimates as a first step in the implementation of DSP algorithms, even while speeding up the design process. It was a challenge for us to design, within a short amount of time, hardware that is fast, small, and power-efficient enough for practical use. Rapid performance estimation has gained prominence as a challenging problem to solve in a multidisciplinary team environment. These design processes involved a team effort similar to that of the MPEG standard working group: an imaging scientist (mathematical formulation, imaging profile design, and the interpretation of results), an architect (architectural decisions), and a hardware designer (implementation); and/or that of a BISP team: a biologist or medical doctor (protocol preparation and interpretation of results), biochemists (e.g., fluorescent marker design), an imaging scientist (formulation), an image processing architect, and a designer [1].

The main advantage of dataflow is its modeling and estimating capability, which formulates DSP algorithms the way insightful designers do, without additional translation or transformation steps. A DSP system is typically described using a block diagram connected by inherent dataflow. The DSP algorithms process the data in sets of samples. Their stream is controlled by a single sample period or multiples of periods. Dataflow naturally exhibits this streaming nature in its algorithm description and estimations. The proposed framework estimates the final performance through designer-defined metrics with dataflow models, which are used to quantify and evaluate the designated algorithms. These metrics include performance parameters such as throughput, latency, and area; memory parameters such as peak memory bandwidth and number of memory accesses; dataflow parameters such as the number of actors and the number of produced actions; and communication parameters.

II. RELEVANT WORK

One difficulty that both hardware and software designers share is the problem of constructing a reasonably accurate design model with high-level abstractions at an early stage of the design process. There are several tools for traditional profiling, including software profiling [2]. The basic idea of these tools is that applications spend a large share of their execution time in a kernel or in inner loops. Intel VTune and GNU gprof were the standard tools for this purpose, but they focused more on instruction-level complexity in a program than on a potential measure for the final implemented system in terms of memory complexity or other timing considerations [2]. Based on a survey of the open literature, these common tools did not provide customized design metrics, such as memory complexity and bandwidth information beyond memory access counts, for selecting among hardware implementation alternatives. Therefore, designers were hesitant to use these tools to assess algorithm candidates or to provide architecture candidates for specific DSP algorithms. The hardware/software co-design community's focus had been directed toward estimating the early performance of applications.
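To make the gap between a raw memory access count and a bandwidth figure concrete, the following is a minimal sketch and is not taken from the framework itself. The class name, its fields, and the QCIF-sized example (176 x 144 pixels, one byte per access, 30 frames per second) are all assumptions chosen for illustration. It computes an average sustained bandwidth, which is a simpler quantity than the peak memory bandwidth listed among the framework's metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamAccessProfile:
    """Hypothetical per-frame memory profile of one streaming block."""

    reads_per_frame: int    # read accesses issued for each frame
    writes_per_frame: int   # write accesses issued for each frame
    bytes_per_access: int   # width of a single access in bytes
    frame_rate_hz: float    # frames processed per second

    def accesses_per_frame(self) -> int:
        # The kind of raw count a conventional profiler can report.
        return self.reads_per_frame + self.writes_per_frame

    def bandwidth_bytes_per_s(self) -> float:
        # Average sustained bandwidth: count x access width x frame rate.
        return (self.accesses_per_frame()
                * self.bytes_per_access
                * self.frame_rate_hz)


# Assumed example: one read and one write per pixel of a 176 x 144 frame.
qcif = StreamAccessProfile(
    reads_per_frame=176 * 144,
    writes_per_frame=176 * 144,
    bytes_per_access=1,
    frame_rate_hz=30.0,
)

print(qcif.accesses_per_frame())     # 50688
print(qcif.bandwidth_bytes_per_s())  # 1520640.0
```

The same access count yields a different bandwidth requirement as soon as the frame rate or access width changes, which is why a count alone says little about a hardware implementation alternative.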
The work on HW/SW co-design tools such as Polis, Ptolemy II, and Synopsys CoWARE led to effective design environments that co-simulated and/or co-synthesized heterogeneous systems, and to techniques for optimizing and reducing memory requirements [3]. However, these tools relied on time-consuming RTL simulations with incremental refinement. They focused on more accurate memory metrics with a specific algorithm-architecture binding, such as a domain-specific platform.

Figure 1. Our Dataflow Framework Design Flow
[Block diagram. Blocks: Application CAL Description; CAL Dataflow Template Library; Application CAL Pre-processing; User Pragma; Dataflow Transform (dataflow actor pipeline, merging, split); Xilinx FPGA Libraries Parameters; Presynthesized Netlist and Cores Parameters; .mhs, .mss; Statistic Files; FPGA Area Model; FPGA Power Model; Trade-off Curves (Area, Power, Speed, PSNR); Simulation-based Verification.
Input Parameters (e.g. H.264 Algorithm): H/V frame size in pixels, Frame Rate, Size of Motion Block (MB), Search Range, Local/External Memory Parameters.
Output Parameters (e.g. H.264 Algorithm): Visual Quality (PSNR), Processing Speed, Regularity of Address, Data Area, Memory Bandwidth, IO Bandwidth.]

Figure 2. Design Exploration Pareto curve of PSNR vs. QP

Table I. EARLY ESTIMATION TAT REDUCTION

                   HDL modeling       This framework
Estimation Time    4 weeks            3 days
Lines of code      3,900 (Verilog)    920

III. ANALYTICAL DATAFLOW FRAMEWORK

We observe that video coding algorithm models typically involve two stream memories and controlled conditions on those stream memories. In this context, the Caltrop language (CAL) was a good candidate for studying and analyzing the behavior of video compression models for early estimation [4]. We determined that CAL had efficient modeling capability because of its language properties and automated tool sets. We selected the CAL Actor Dataflow Language as the analytical modeling language for the operation of the Function Units (FUs) in the dataflow template library.

Fig. 1 illustrates an overall view of the design flow employed in an H.264 design space exploration loop. We developed a CAL description to model an algorithm. The description included input parameters such as the maximum width of the frame in macroblocks, the macroblock size in bits, and the horizontal and vertical search ranges in pixels for the H.264 algorithm. We generated the extended CAL model in an automated manner using our framework by combining a set of dataflow templates, such as parameterized motion estimation sub-blocks. This allowed us to reuse CAL dataflow templates for different video algorithms and provided designers with trade-off curves based upon design metrics. Trade-off curves with Design Space Exploration (DSE) statistical files were generated for designers to assess different algorithms in the early stages of design. Fig. 2 illustrates the DSE curve generated by our framework for H.264 encoding in terms of the Quantization Parameter (QP) and reconstructed image quality.

Table I confirms the effective use of this framework for saving performance estimation time. Using our framework, we reduced the Turn Around Time (TAT) of building dataflow models and estimating performance roughly ninefold compared to HDL modeling (3 days versus 4 weeks). The time for modifying a simulation model description and simulating it grew linearly with the size of the description. Additionally, it was time-consuming and error-prone to build test benches and verification scripts to run conventional simulation models. The results show the clear advantage of using our dataflow framework for initial DSP specification. Developing dataflow models required less time, and a dataflow model can be expanded and reused more easily than Verilog/C descriptions for early estimation.

IV. CONCLUSION

This paper presented a framework and methodology for efficient dataflow performance estimation, capable of accurately profiling various DSP algorithms. This design flow can link the DSE steps by offering a centralized framework tool to the community while reducing the design effort among designers.

REFERENCES
[1] L. Blanc-Feraud, A. Laine, B. Lelieveldt, J.C. Olivio-Marin, M. Unser, "Trends in bioimaging and signal processing," 2011.
[2] E.M. Saad, M.H.A. Awadalla, K.E. El-Deen, "FPGA-based software profiler for Hardware/Software co-design," Radio Science Conference, 2009. NRSC 2009. National, pp. 1-8, 17-19 March 2009.
[3] A. Sangiovanni-Vincentelli, M. Di Natale, "Embedded System Design for Automotive Applications," Computer, vol. 40, no. 10, pp. 42-51, Oct. 2007.
[4] http://embedded.eecs.berkeley.edu/caltrop/language.html
