Architectures
(and how they'll affect parallel
programming models)
Simon McIntosh-Smith simonm@cs.bris.ac.uk
Head of Microelectronics Research
University of Bristol, UK
[Figure: ITRS transistor-count trends for a high-performance MPU (e.g. Intel Nehalem) and a cost-performance MPU (e.g. Nvidia Tegra). Annotations: 2x/3yrs, 2x/2yrs, ~2B transistors.]
http://www.itrs.net/Links/2011ITRS/2011Chapters/2011ExecSum.pdf
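The doubling rates quoted on the chart (2x/2yrs and 2x/3yrs) compound very differently over time. The sketch below works through that arithmetic; the function names are illustrative, and the ~2B starting count and six-year horizon are used only as example inputs, not as figures read off the ITRS roadmap.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` when the count doubles every
    `doubling_period_years` years: 2 ** (years / period)."""
    return 2.0 ** (years / doubling_period_years)


def projected_transistors(start_count, years, doubling_period_years):
    """Project a transistor count forward under a fixed doubling period."""
    return start_count * growth_factor(years, doubling_period_years)


if __name__ == "__main__":
    start = 2e9  # ~2B transistors, as annotated on the chart
    for period in (2, 3):
        count = projected_transistors(start, years=6, doubling_period_years=period)
        print(f"2x/{period}yrs over 6 years: {count / 1e9:.0f}B transistors")
```

Over six years, doubling every two years gives an 8x increase, while doubling every three years gives only 4x.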
Interconnect asymmetry
Memory hierarchies
Software (OS, middleware, tools, ...)
! Heterogeneous Systems
AMD Llano Fusion APUs
Intel MIC
! Heterogeneity is mainstream
! Current limitations
Disjoint view of memory spaces between
CPUs and GPUs
Hard partition between host and
devices in programming models
Dynamically varying nested parallelism
almost impossible to support
Large overheads in scheduling
heterogeneous, parallel tasks
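The cost of a disjoint memory view can be illustrated with a toy model. This is a sketch in plain Python, not a real GPU API: `scale_disjoint` and `scale_shared` are hypothetical names, a Python list stands in for device memory, and the "kernel" is an ordinary loop. It only counts how many elements must be copied in each style.

```python
def scale_disjoint(host_data, factor):
    """Disjoint memory spaces: copy in, compute on the device copy, copy out.

    Returns the number of elements copied between the two spaces.
    """
    copied = 0
    device_buffer = list(host_data)       # host -> device copy
    copied += len(device_buffer)
    for i in range(len(device_buffer)):   # stand-in for the device kernel
        device_buffer[i] *= factor
    host_data[:] = device_buffer          # device -> host copy
    copied += len(device_buffer)
    return copied


def scale_shared(shared_data, factor):
    """Shared view of memory: the kernel works on the caller's data in place.

    Returns the number of elements copied, which is always zero here.
    """
    for i in range(len(shared_data)):     # stand-in for the device kernel
        shared_data[i] *= factor
    return 0


if __name__ == "__main__":
    data = [1, 2, 3]
    print("disjoint copies:", scale_disjoint(data, 2), "result:", data)
    print("shared copies:  ", scale_shared(data, 2), "result:", data)
```

Both functions produce the same result; the disjoint version moves every element twice, which is the overhead a unified view of memory would avoid.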
The emerging
Heterogeneous System
Architecture (HSA) standard
Promoters
Supporters
Contributors
Academic
University of Illinois
Computer Science
! HSA overview
The HSA Foundation launched mid 2012
HSA is a new, open architecture specification
HSAIL virtual (parallel) instruction set
HSA memory model
HSA dispatcher and run-time
! HSA dispatch
HSA designed to enable heterogeneous
task queuing
A work queue per core (CPU, GPU, ...)
Distribution of work into queues
Load balancing by work stealing
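The queue-per-core and work-stealing idea above can be sketched as a small, single-threaded simulation. This is not the HSA dispatcher or any HSA run-time API: `run_with_work_stealing` is a hypothetical helper, the "cores" are just dictionary keys, and stealing from the fullest queue is one assumed policy among several possible ones.

```python
from collections import deque


def run_with_work_stealing(queues):
    """Simulate workers that each own a queue and steal when idle.

    queues: dict mapping a worker name to a list of zero-argument callables.
    Returns a dict mapping each worker name to the results it produced.
    """
    work = {name: deque(tasks) for name, tasks in queues.items()}
    done = {name: [] for name in queues}
    names = list(queues)

    # Keep taking round-robin turns until every queue is empty.
    while any(work.values()):
        for name in names:
            if work[name]:
                # The owner takes work from the tail of its own queue.
                task = work[name].pop()
            else:
                # An idle worker steals from the head of the fullest queue.
                victim = max(names, key=lambda n: len(work[n]))
                if not work[victim]:
                    continue  # no work left anywhere this turn
                task = work[victim].popleft()
            done[name].append(task())
    return done


if __name__ == "__main__":
    # All six tasks start on CPU1; CPU2 and GPU begin idle.
    tasks = {
        "CPU1": [lambda i=i: i * i for i in range(6)],
        "CPU2": [],
        "GPU": [],
    }
    for worker, results in run_with_work_stealing(tasks).items():
        print(worker, results)
```

In this example all work is initially queued on CPU1, yet each of the three workers ends up running two tasks. Owners and thieves take from opposite ends of a queue, the usual way to reduce contention in work-stealing schedulers.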
[Figure: command and data flow from Applications A, B and C, each passing through Direct3D, a user-mode driver (soft queue, command buffer) and a kernel-mode driver (DMA buffer), into the hardware queue of the GPU hardware.]
[Figure labels: CPU1, CPU2, GPU]
Purpose
Enable research
LLVM Contributions
HSA Assembler
HSA Runtime
HSA Finalizer
! Conclusions
Heterogeneity is an increasingly important trend
The market is finally starting to create and adopt
the necessary open standards
Proprietary models likely to start declining now
Don't get locked into any one vendor!
www.cs.bris.ac.uk/Research/Micro