AXI Overview

Upscale Training

2010 Wipro Ltd - Confidential

AMBA
Advanced Microcontroller Bus Architecture
On-chip bus protocol from ARM
On-chip interconnect specification for the connection and management of
functional blocks including processor and peripheral devices

Introduced in 1996
AMBA is a registered trademark of ARM Limited
AMBA is an open standard

Evolution of AMBA Standard

Course Summary
This presentation outlines the specific topics/sections that need to be
understood.
For each topic, the corresponding section number in the AMBA
AXI and ACE Protocol Specification (Issue E, 22 February 2013)
is provided.
The material is divided into 3 parts:
Part A: AMBA AXI3 and AXI4 Protocol Specifications
Part B: AMBA AXI4-Lite
Part C: ACE Protocol Specification

Please go through the details in the AMBA specification.

Part A: AMBA AXI3 and AXI4 Protocol Specifications

Introduction to AXI Protocol

Features (A1.1)

Revisions (A1.2)

AXI Architecture (A1.3)


Signal Descriptions

Global Signals (A2.1)

Write Address Channel Signals (A2.2)

Write Data Channel Signals (A2.3)

Write Response Channel Signals (A2.4)

Read Address Channel Signals (A2.5)

Read Data Channel Signals (A2.6)

Low Power Interface Signals (A2.7)


Signal Interface Requirements

Clock and Reset (A3.1)

Basic read and write transactions (A3.2)

Relationship between channels (A3.3)

Transaction Structure (A3.4)


Transaction Attributes

Transaction Attributes (A4.1)

AXI3 memory attribute signaling (A4.2)

AXI4 changes to memory attribute signaling (A4.3)

Memory Types (A4.4)

Mismatched memory attributes (A4.5)

Transaction Buffering (A4.6)

Access Permissions (A4.7)

Legacy Considerations (A4.8)

Multiple Transactions

AXI Transaction Identifiers (A5.1)

Transaction ID (A5.2)

Transaction Ordering (A5.3)

Removal of Write Interleaving Support (A5.4)


AXI4 Ordering Model

Definition of the ordering model (A6.1)

Master Ordering (A6.2)

Interconnect Ordering (A6.3)

Slave Ordering (A6.4)

Response before final destination (A6.5)

Ordered write observation (A6.6)


Atomic Accesses

Single Copy Atomicity Size (A7.1)

Exclusive Accesses (A7.2)

Locked Accesses (A7.3)

Atomic Access Signaling (A7.4)


AXI4 Additional Signalling

QoS Signaling (A8.1)

Multiple Region Signaling (A8.2)

User defined Signaling (A8.3)


Low Power Interface

About Low Power interface (A9.1)

Low Power Clock Control (A9.2)


Default signaling and Interoperability

Interoperability principles (A10.1)

Major interface categories (A10.2)

Default Signal Values (A10.3)

Part B: AMBA AXI4-Lite

Definition of AXI4-Lite (B1.1)


Interoperability (B1.2)
Defined Conversion Mechanism (B1.3)
Conversion, Protection and Detection (B1.4)

Part C: ACE Protocol Specification

About ACE (C1)


Signal Descriptions (C2)
Channel Signaling (C3)
Coherency Transactions on Read and Write Address Channels (C4)
Snoop Transactions (C5)
Interconnect Requirements (C6)
Cache Maintenance (C7)
Barrier Transactions (C8)
Exclusive Accesses (C9)
Optional External Snoop Filtering (C10)
ACE-Lite (C11)
Distributed Virtual Memory (C12)
Interface Control (C13)
Master Design Recommendations (C14)

Part A
AMBA AXI3 and AXI4 Protocol Specifications

Advanced eXtensible Interface (AXI):

AMBA AXI protocol is targeted at high-performance, high-frequency system designs

AXI key features

Support for separate channels for:
  Read Address
  Read Data
  Write Address
  Write Data and
  Write Response
Support for unaligned data transfers using byte strobes
Ability to issue multiple outstanding addresses
Out of order (OO) transaction completion
Support for data interleaving
Advanced system cache support
  Specify if transaction is cacheable/bufferable
  Specify attributes such as write-back/write-through
Enhanced protection support
  Secure/non-secure transaction specification
Exclusive access (for semaphore operations)
Register slice can be easily added for timing-closure

AXI System Components

5 Independent Channels

Read address channel and Write address channel
  Convey address and other control information
  Variable length burst: 1 ~ 16 data transfers
  Exception: In AXI4, INCR bursts can have lengths up to 256 transfers.
  Burst with a transfer size of 8 ~ 1024 bits (i.e. 1 byte ~ 128 bytes)

Write data channel
  Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide

Read data channel
  Conveys data and any read response info.
  Data bus can be 8, 16, 32, 64, 128, 256, 512, or 1024 bits wide
  Read response is signaled per transfer.

Write response channel
  Write response info, signaled for the entire burst.

NOTES:
Each channel is independent and uses a 2-way flow-control.

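AXI Read Channels

Read address channel (master to slave): "Give me some data"
Read data channel (slave to master): "Here you go"
The channels are independent; they are synchronized with ID numbers (tags).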

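AXI Write Channels

Write address channel (master to slave): "I'm sending data. Please store it."
Write data channel (master to slave): "Here is the data."
Write response channel (slave to master): "I received that data correctly."
The channels are independent; they are synchronized with ID numbers (tags).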
AXI Flow-Control

AXI uses a valid/ready handshake acknowledge
Each channel has its own valid/ready
Information moves only when:
  Source has Valid information and
  Destination is Ready
On each channel the master or slave can limit the flow
Flexible signaling functionality:
  Inserting wait states
  Always Ready
  Same Cycle Acknowledge
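
The valid/ready rule can be illustrated with a minimal sketch. It assumes VALID and READY are given as per-cycle sample lists (hypothetical test vectors, not taken from any real design) and reports the cycles in which information moves.

```python
def simulate_channel(valid, ready):
    """Return the cycle indices on which a transfer happens.

    valid[i] and ready[i] are the VALID and READY levels sampled at the
    rising clock edge of cycle i (illustrative inputs only).
    """
    # Information moves only when the source is valid AND the destination is ready.
    return [i for i, (v, r) in enumerate(zip(valid, ready)) if v and r]


# Destination inserts two wait states before accepting the transfer.
wait_states = simulate_channel(valid=[1, 1, 1, 0], ready=[0, 0, 1, 0])

# Destination is always ready, so the transfer completes in the cycle VALID is raised.
always_ready = simulate_channel(valid=[0, 1, 0, 0], ready=[1, 1, 1, 1])
```

In the first case the source must hold VALID until READY is seen; in the second the transfer is acknowledged in the same cycle.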

AXI Read
Read Address Channel

Read Data Channel

Read Burst Operation

Read request is initiated
Slave is ready
Read request is accepted
1st data is transferred
The last data is transferred

Note: data transfer only when valid = ready = 1

One Address for Burst


Separation of address and data channel
Master provides only the start address of burst
Slave needs to generate the remaining addresses based on
burst type (FIXED, INCR, WRAP)

ADDRESS: A11 A21 A31
DATA:    D11 D12 D13 D14 D21 D22 D23 D31

Overlapping Read Bursts

Read request A is accepted
Read request B is accepted via AR channel while data A(0) is transferred via R channel

AMBA AXI Write


Write Address Channel
Write Data Channel
Write Response Channel

Write Burst Operation

Write request A is accepted
Response completes write operation

Dependencies between Channel Handshake Signals (AXI3)

To prevent a deadlock situation, you must observe the dependencies that exist between the handshake signals
In any transaction:
  The VALID signal of one AXI component must not be dependent on the READY signal of the other component in the transaction
  The READY signal can wait for assertion of the VALID signal

Dependencies between Channel Handshake Signals (AXI4)

[AXI4]

The AXI3 protocol requires that the write response for all transactions must not be given until the clock cycle after the acceptance of the last data transfer
In addition, the AXI4 protocol requires that the write response for all transactions must not be given until the clock cycle after address acceptance
Figure: handshake dependency diagrams for AXI3 and AXI4 (WLAST)

Use of IDs
AXI gives an ID tag to every transaction

Write Data Channel
Write Address Channel
Write Response Channel
Read Address Channel
Read Data Channel

Transaction ID Implementation
Real implementation
Transaction ID = <master ID, channel ID>
Channel ID = original AXI transaction id
Master ID is needed to identify the initiating master among all the masters

Figure: masters (CPU, Video Decoder, 3D Graphics, LCD Control, Video Process, Mixer, DMA), each labeled with its own ID width of 0 to 4 bits, connect through the Interconnect to the Memory Controller
Interconnect ID: 4 + ceil(log2(7)) = 7 bits
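
The ID width arithmetic on this slide, 4 + ceil(log2 7) = 7 bits, can be reproduced with a short sketch. The function name and arguments are hypothetical; the assumption is that the interconnect appends just enough bits to distinguish the masters on top of the widest master-side ID.

```python
def interconnect_id_bits(max_master_id_bits, num_masters):
    """ID width seen on the slave side of the interconnect (illustrative).

    (num_masters - 1).bit_length() equals ceil(log2(num_masters)) for
    num_masters >= 1, without floating-point rounding.
    """
    master_select_bits = (num_masters - 1).bit_length()
    return max_master_id_bits + master_select_bits


# Seven masters whose widest original AXI ID is 4 bits, as in the slide.
slave_side_bits = interconnect_id_bits(max_master_id_bits=4, num_masters=7)
```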

Use of IDs
Multiple Outstanding Addresses:
By using IDs, a master can issue transactions without waiting for earlier
transactions to complete.

Write Data Interleaving in AXI3 Slaves:


With Write Data Interleaving, an AXI3 slave can accept interleaved write-data
with different AWID values.
This feature is not supported in AXI4
All Write Data for a transaction must be provided in consecutive transfers on the
write data channel.
WID signal is not supported in AXI4

Out of Order completion


Transactions with the same ID are completed in order
Transactions with different IDs can be completed out of order
Fast-responding slaves respond in advance of earlier transactions with slower
slaves
This is not a required feature. Simple masters and slaves can process one
transaction at a time in the order they are issued
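
As a rough illustration of the out-of-order rule, the sketch below checks a completion order against the issue order. The tuple layout (ID, label) and the function name are assumptions made for this example only.

```python
def respects_id_ordering(issued, completed):
    """True if transactions sharing an ID complete in their issue order.

    issued and completed are lists of (txn_id, label) tuples; transactions
    with different IDs may appear in any relative order.
    """
    for txn_id in {i for i, _ in issued}:
        want = [label for i, label in issued if i == txn_id]
        got = [label for i, label in completed if i == txn_id]
        if want != got:
            return False
    return True


issued = [(1, "A1"), (2, "A2"), (1, "A3")]
# ID 2 overtakes ID 1: allowed, because the IDs differ.
ok = respects_id_ordering(issued, [(2, "A2"), (1, "A1"), (1, "A3")])
# A3 overtakes A1 although both use ID 1: not allowed.
bad = respects_id_ordering(issued, [(1, "A3"), (2, "A2"), (1, "A1")])
```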
Out-of-Order Transaction
Ordering by transaction ID
Slave can handle data transfers with different transaction IDs out of order
The order within a single burst is maintained
ADDRESS: A11 A21 A31
RDATA:   D21 D22 D23 D31 D11 D12 D13 D14

Ordering Rules #1: Request and Response


Write request and data
The write data can appear at an interface before the write
address that relates to it

Two relationships that must be maintained are:


Read data must always follow the address to which the
data relates
A write response must always follow the last write
transfer in the write transaction to which the write
response relates

Ordering Rules #2: Read & Write Transactions with Same ID

There are no ordering restrictions between read and write transactions with the same AWID and ARID. If a master requires an ordering restriction then it must enforce the ordering.

Ordering Rules #3: Multiple reads with same ARID


The data for a sequence of read transactions with the same ARID must be returned in order:
  When reads with the same ARID are from the same slave, the slave must ensure that the read data is returned in the same order that the addresses are received
  When reads with the same ARID are from different slaves, the interconnect must ensure that the read data is returned in the same order that the master issued the addresses
Figure: a Master IP connected to a single Slave IP, and a Master IP connected through the Interconnect to Slave IP 1 and Slave IP 2

Ordering Rules #4: Write Interleaving

Interleaving rule
Write Data with different ID can be interleaved.
This is supported only in AXI3
The order within a single burst is maintained

ADDRESS: A11 A21 A31
WDATA:   D11 D21 D22 D12 D23 D31 D13 D14

Write Interleaving in AXI4


No support of write interleaving in AXI4
Master must ensure same order for write data as that of address
Removal of WID in AXI4, why?
  Write data with different AWIDs follows the address order and there is no write interleaving, so there is no need for WID
  Responses to multiple writes with different IDs can be out of order relative to the address order, so BID remains

[AXI4]

AXI Protocol Responses


OKAY
Normal access success / exclusive access failure / exclusive access to a slave that does not support it

EXOKAY
Exclusive access success

SLVERR
Slave generates an error response: unsupported transfer size, write access to a read-only location, timeout condition in the slave, access to an address where no register is present, or access to a disabled or powered-down function

DECERR
Generated when the interconnect cannot decode a slave for the access; a default slave typically gives the DECERR response

(Note: For a write transaction, there is just one response given for the entire
burst and not for each data transfer within the burst. In a read transaction,
the slave can give different responses for different transfers within a
burst.)
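
The four responses map onto the 2-bit RRESP/BRESP encoding of the AXI specification. The sketch below lists that encoding and shows the per-transfer nature of read responses; the helper name is hypothetical.

```python
from enum import IntEnum


class Resp(IntEnum):
    """2-bit xRESP encoding used on the read data and write response channels."""
    OKAY = 0b00
    EXOKAY = 0b01
    SLVERR = 0b10
    DECERR = 0b11


def read_burst_ok(rresp_per_beat):
    """A read burst carries one response per transfer, so every beat is checked."""
    return all(Resp(r) in (Resp.OKAY, Resp.EXOKAY) for r in rresp_per_beat)


clean_read = read_burst_ok([0b00, 0b00, 0b00])
# The second beat reports SLVERR; the remaining beats are still transferred.
faulty_read = read_burst_ok([0b00, 0b10, 0b00])
```

A write burst would need only a single `Resp` value, since one BRESP covers the whole burst.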
Register Slice for Timing Isolation


AXI enables the insertion of a register slice in any channel at the cost of
an additional cycle latency
Trade-off between latency and maximum frequency

Register slice can be used at any channel independently


Register slice incurs one cycle latency per insertion
Figure: register slices between the AXI Master and the AXI Slave on each channel: Write Address/Control (AWREADY), Write data (WREADY), Response (BREADY), Read Address/Control (ARREADY), Read data (RREADY)

Some More things to know

A single clock signal, ACLK.
  All input signals are sampled on the rising edge of ACLK. All output signal changes must occur after the rising edge of ACLK.
  There must be no combinatorial paths between input and output signals on both master and slave interfaces.

A single active-low reset, ARESETn.

AXI Additional Features


Transaction burst type determines address bus behavior
  Fixed, increment, or wrap
Unaligned Access
  Master uses lower address bits; the byte lane strobes must be consistent with the lower address bits
Optional address Lock signals facilitate exclusive and atomic access protection
System cache support
Protection unit support

Burst Length, Size and Type

Data Bus Usage: Narrow Transfers

In a Narrow Transfer, the address and control (WSTRB) determine which byte
lanes the transfer uses.
WSTRB[n:0] signals, when high specify which byte-lanes are used

Example 1: A narrow transfer with 8-bit transfers
  Burst has 5 transfers
  Starting address is 0
  Transfer size is 8 bits
  Data bus width is 32 bits
  Burst type is INCR

Example 2: A narrow transfer with 32-bit transfers
  Burst has 3 transfers
  Starting address is 4
  Transfer size is 32 bits
  Data bus width is 64 bits
  Burst type is INCR
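
The byte lanes used by the two examples can be worked out with a small sketch. It assumes an INCR burst whose start address is aligned to the transfer size; the function name and argument names are illustrative.

```python
def byte_lanes(start_addr, size_bytes, bus_bytes, beats):
    """Active byte lanes (WSTRB bit positions) for each beat of an INCR burst.

    Assumes start_addr is aligned to size_bytes, so every beat uses exactly
    size_bytes consecutive lanes.
    """
    lanes = []
    for n in range(beats):
        addr = start_addr + n * size_bytes
        low = addr % bus_bytes  # lowest lane driven on this beat
        lanes.append(list(range(low, low + size_bytes)))
    return lanes


# Example 1: 5 transfers, start address 0, 8-bit transfers, 32-bit bus.
example1 = byte_lanes(start_addr=0, size_bytes=1, bus_bytes=4, beats=5)

# Example 2: 3 transfers, start address 4, 32-bit transfers, 64-bit bus.
example2 = byte_lanes(start_addr=4, size_bytes=4, bus_bytes=8, beats=3)
```

Example 1 walks through lanes 0 to 3 and returns to lane 0; Example 2 alternates between the upper and lower half of the 64-bit bus.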

Unaligned Transfer
AXI supports unaligned transfers
An unaligned transfer is a transfer in which the 1st byte accessed is not aligned with the natural address boundary
e.g. A 32-bit transfer that starts at address 0x1002 is not aligned to the natural 32-bit address boundary

Master can:
  Use low-order address lines to signal an unaligned start address
  OR
  Provide an aligned address and use byte-lane strobes to signal the unaligned start address

Unaligned Transfer (Contd)


INCR burst case

Unaligned Transfer (Contd)


Wrapping burst case
The wrap boundary is aligned to the total size of the data to be transferred
That is, to ( (size of each transfer in burst) x (number of transfers in burst) )

After each transfer, the address increments same as for INCR Burst
If incremented address is ( (wrap boundary) + ( total size of data to be
transferred) ), then the address wraps around to wrap-boundary.
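
The address rules for the three burst types can be sketched as follows. This is a simplified model that assumes the start address is aligned to the transfer size (and, for WRAP, a burst length of 2, 4, 8 or 16); the function name is hypothetical.

```python
def burst_addresses(start, size_bytes, beats, burst):
    """Per-beat addresses for a FIXED, INCR or WRAP burst (aligned start assumed)."""
    if burst == "FIXED":
        return [start] * beats
    if burst == "INCR":
        return [start + n * size_bytes for n in range(beats)]
    if burst == "WRAP":
        total = size_bytes * beats           # total size of the data to be transferred
        boundary = (start // total) * total  # wrap boundary, aligned to the total size
        return [boundary + (start - boundary + n * size_bytes) % total
                for n in range(beats)]
    raise ValueError(f"unknown burst type: {burst}")


# 4 beats of 4 bytes starting at 0x18: the wrap boundary is 0x10 and the
# address wraps when it would reach 0x10 + 16 = 0x20.
wrap = burst_addresses(0x18, 4, 4, "WRAP")
incr = burst_addresses(0x100, 4, 3, "INCR")
```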

[AXI4]

Burst Length (AXI4)


AR(W)LEN[7:0] allows INCR bursts of up to 256 beats
Burst in AXI3 protocol:
  Early termination of bursts is not supported.
  A burst must not cross a 4KB boundary. This ensures that a burst is only destined for a single slave.

AXI4 protocol longer burst support:


Bursts longer than 16 beats are only supported for the INCR burst
type. Both WRAP and FIXED burst types remain constrained to a
maximum burst length of 16 beats.
Exclusive accesses are not permitted to use a burst length greater than
16.
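
The length limits and the 4KB rule can be combined into a simple legality check. This is a sketch under the assumptions stated in the comments; the function name, the string burst types and the `exclusive` flag are illustrative, and other rules of the specification (for example the permitted WRAP lengths) are not covered.

```python
def burst_is_legal_axi4(start, size_bytes, beats, burst, exclusive=False):
    """Check AXI4 burst length limits and the 4KB boundary rule (partial check)."""
    # Only INCR bursts may exceed 16 beats, and never for exclusive accesses.
    max_beats = 256 if (burst == "INCR" and not exclusive) else 16
    if not 1 <= beats <= max_beats:
        return False
    if burst == "INCR":
        aligned = start - (start % size_bytes)
        last_byte = aligned + beats * size_bytes - 1
        # First and last byte must fall in the same 4KB region.
        if start // 4096 != last_byte // 4096:
            return False
    return True


long_incr = burst_is_legal_axi4(0x0000, 4, 256, "INCR")
long_wrap = burst_is_legal_axi4(0x0000, 4, 32, "WRAP")
crossing = burst_is_legal_axi4(0x0FF0, 4, 8, "INCR")
```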

[AXI3]

System Cache Support


ARCACHE[3:0] / AWCACHE[3:0]
AxCACHE[3:0] signals define the transaction attributes of a transfer
Transaction attributes control:
  How a transaction progresses through the system
  How any system-level cache handles the transaction
Figure: CPU (register file) with L1 Instruction Cache and L1 Data Cache, a Unified L2 Cache, and Memory

System Cache Support


[AXI3]

ARCACHE[3:0] / AWCACHE[3:0]

Bufferable bit (B): AxCACHE[0]
  The interconnect or any component can delay the transaction for any number of cycles. This is usually only relevant to writes.
  The transaction response may not come from the final destination, but from an intermediate point such as a cache. The cache is then responsible for updating the memory.

Cacheable bit (C): AxCACHE[1]
  The characteristics of the transaction at the final destination do not have to match the characteristics of the original transaction.
  For writes this means that a number of different writes can be merged together.
  For reads this means that a location can be pre-fetched or can be fetched just once for multiple read transactions.
  To determine if a transaction should be cached, this bit should be used in conjunction with the Read Allocate (RA) and Write Allocate (WA) bits.

Read Allocate bit (RA): AxCACHE[2]
  If 1, the transaction should be looked up in the cache
  In case of a read miss, it is recommended to allocate an entry in the cache
  If C=low, RA=low

Write Allocate bit (WA): AxCACHE[3]
  If 1, the transaction should be looked up in the cache
  In case of a write miss, it is recommended to allocate an entry in the cache
  If C=low, WA=low
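
A minimal decoder for the AXI3 AxCACHE bit positions listed on this slide is sketched below. The function name and dictionary keys are hypothetical; the check enforces the slide's rule that RA and WA are low whenever C is low.

```python
def decode_axcache_axi3(axcache):
    """Split a 4-bit AXI3 AxCACHE value into its B, C, RA and WA attributes."""
    bits = {
        "bufferable": bool(axcache & 0b0001),      # AxCACHE[0]
        "cacheable": bool(axcache & 0b0010),       # AxCACHE[1]
        "read_allocate": bool(axcache & 0b0100),   # AxCACHE[2]
        "write_allocate": bool(axcache & 0b1000),  # AxCACHE[3]
    }
    if not bits["cacheable"] and (bits["read_allocate"] or bits["write_allocate"]):
        raise ValueError("RA and WA must be low when C is low")
    return bits


# Cacheable, bufferable, allocate on reads only.
attrs = decode_axcache_axi3(0b0111)
```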

Changes to Transaction Attribute Signaling

[AXI4]

The AxCACHE[1] bits are renamed:


From Cacheable to Modifiable to better describe the required functionality
Actual Functionality is unchanged

Ordering requirements are defined for Non-Modifiable transactions


Ordering between transactions should be maintained, if the transactions satisfy
all of the following conditions:
Transactions are Non-Modifiable
Transactions use the same ID
Transactions target the same slave device

Meanings of RA and WA bits are updated:
  One bit indicates whether this transaction should be allocated in the cache
  The other bit indicates whether an allocation could have been made due to another transaction

[AXI4]

RA and WA Bits
For Read Transactions:
  RA bit means the same: The location could have been previously allocated in the cache. It is recommended that this transaction is allocated in the cache.
  WA bit is redefined: The location could have been previously allocated in the cache because of another transaction, either a write transaction or a transaction by another master

For Write Transactions:
  WA bit means the same: The location could have been previously allocated in the cache. It is recommended that this transaction is allocated in the cache.
  RA bit is redefined: The location could have been previously allocated in the cache because of another transaction, either a read transaction or a transaction by another master

This change means:
  For the same location, a read and a write transfer may have different values for AxCACHE

Protection Support
(AWPROT[2:0], ARPROT[2:0])
Normal or Privileged Mode: AxPROT[0]
Indicates whether an access was done by a Master in Privilege Mode or
in Unprivileged Mode
LOW indicates an access done by a Master in Unprivileged Mode
HIGH indicates an access done by a Master in Privileged Mode
A privileged processing mode typically has a greater level of access within a
system.

Secure or Non-secure: AxPROT[1]
LOW indicates a Secure access
HIGH indicates a Non-secure access
Used where a greater degree of differentiation between processing modes is required.

Instruction or data: AxPROT[2]
LOW indicates a data access
HIGH indicates an instruction access.
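
The three AxPROT bits described above can be decoded as in the sketch below; the function name and dictionary keys are chosen for illustration only.

```python
def decode_axprot(axprot):
    """Decode a 3-bit AWPROT/ARPROT value into its access attributes."""
    return {
        "privileged": bool(axprot & 0b001),   # AxPROT[0]: HIGH = privileged
        "non_secure": bool(axprot & 0b010),   # AxPROT[1]: HIGH = non-secure
        "instruction": bool(axprot & 0b100),  # AxPROT[2]: HIGH = instruction
    }


# 0b010: unprivileged, non-secure, data access.
prot = decode_axprot(0b010)
```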
Atomic Access
(ARLOCK[1:0], AWLOCK[1:0])
Normal access, AxLOCK[1:0]=b00
Exclusive access, b01
Exclusive read
Exclusive write
If no intervening write to the address
region, EXOKAY response. If not,
OKAY response.
Usually used for read-modify-write

Locked access, b10


Start with b10, and end with b00
During the period, only the lock
initiating master can access the
address region
Exclusive Access
Semaphore type operation without requiring bus to remain locked to a
particular master for the duration of the operation
Usually used for read-modify-write kind of operations
Slave must have additional logic to support exclusive access.
The basic process for an exclusive access is:
A master performs an exclusive read from an address location.
At some later time, the master attempts to complete the exclusive operation
by performing an exclusive write to the same address location.
The exclusive write access of the master is signaled as:
Successful (EXOKAY) if no other master has written to that location between the
read and write accesses.
Failed (OKAY) if another master has written to that location between the read and
write accesses. In this case the address location is not updated
Example (over time, all accesses to Slave 1): Master 1 E.RD 0x100, Master 2 WR 0x100, Master 1 E.WR 0x100, which receives OKAY
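
The exclusive access sequence can be mimicked with a toy monitor. This is a deliberately simplified model and not how any particular slave implements it: it tracks one reserved address per master (the class and method names are hypothetical) and ignores IDs, sizes and address ranges.

```python
class ExclusiveMonitor:
    """Toy single-slave model of exclusive access monitoring."""

    def __init__(self):
        self.memory = {}
        self.reservation = {}  # master name -> reserved address

    def exclusive_read(self, master, addr):
        self.reservation[master] = addr
        return self.memory.get(addr, 0)

    def _clear_others(self, master, addr):
        # A write to the location ends other masters' reservations on it.
        for m in [m for m, a in self.reservation.items() if a == addr and m != master]:
            del self.reservation[m]

    def write(self, master, addr, data):
        self.memory[addr] = data
        self._clear_others(master, addr)
        return "OKAY"

    def exclusive_write(self, master, addr, data):
        if self.reservation.pop(master, None) != addr:
            return "OKAY"  # failed: the location is not updated
        self.memory[addr] = data
        self._clear_others(master, addr)
        return "EXOKAY"


def run_example_sequence():
    """Master 1 E.RD, Master 2 WR, Master 1 E.WR, all to address 0x100."""
    slave = ExclusiveMonitor()
    slave.exclusive_read("Master 1", 0x100)
    slave.write("Master 2", 0x100, 7)
    return slave.exclusive_write("Master 1", 0x100, 9), slave.memory[0x100]
```

Because Master 2 writes between the exclusive read and the exclusive write, the final access returns OKAY and the value written by Master 2 is kept.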

[AXI3]

Locked Access

Interconnect must ensure that only that master is allowed access to the
slave region until an unlocked transfer from the same master completes
Master should have no other outstanding transactions waiting to complete
before issuing locked sequence
Final transaction effectively removes the lock

Figure (over time): Master 1 access to 0x100 (Lock), Master 2 access to 0x100, Master 1 access to 0x100 (Unlock)

Atomic Access in AXI4


No support of locked access
All locked accesses from AXI3 masters need to be converted to normal accesses

[AXI4]

Additional Signaling (Optional)

Quality of Service Signaling (AxQOS[3:0])

[AXI4]

AxQOS 4-bit signals sent on address channel for each transaction


This protocol does not specify exact use of QoS identifiers
Recommendation: Can be used as a priority indicator for that transaction
Default value of b0000 indicates no participation in QoS scheme

Region Identifier Signals (AxREGION[3:0]):
  The 4-bit signals can uniquely identify up to 16 different regions
  The region identifier provides a decode of higher order address bits
  Using region identifiers, a single physical interface on a slave can mimic multiple (up to 16) logical interfaces, each with a different location in the system address map
  Interconnect should produce AxREGION signals when performing the address decode function for a single slave that has multiple logical interfaces

User Signals on each AXI Channel for User Defined Signaling (AWUSER, WUSER, BUSER, ARUSER, RUSER)
  Specification recommends not using them, to avoid interoperability issues

Low Power Interface (C channel)


Optional Extension to AXI protocol
Uses 3 level signals for handshake between the system-clock-controller and
the peripheral

Signals:
CACTIVE: (driven by peripheral)
High => Peripheral requires a clock signal. Clock-Controller must enable the clock
immediately.
Low => Peripheral does not require the clock

CSYSREQ: (driven by clock-controller)


Low => Request for the peripheral, to enter a low-power state
High => Request for the peripheral, to exit a low-power state

CSYSACK: (driven by peripheral)


Low => Low-power entry request acknowledged by peripheral
High => Low-power exit request acknowledged by peripheral
Low Power Interface (C channel)

The peripheral can accept or deny the request, from the system-clock-controller, to enter low-power state.
The level of the CACTIVE signal when the peripheral acknowledges the request by driving CSYSACK low indicates the acceptance or denial of the request: CACTIVE low means the request is accepted, CACTIVE high means it is denied.
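
The acceptance rule can be captured in a few lines. The sketch assumes the clock controller samples CACTIVE at the moment the peripheral drives CSYSACK low; the function name is hypothetical.

```python
def low_power_request_result(cactive_at_ack):
    """Interpret CACTIVE sampled when CSYSACK goes low after a low-power request.

    CACTIVE low: the peripheral no longer needs the clock, request accepted.
    CACTIVE high: the peripheral still needs the clock, request denied.
    """
    return "denied" if cactive_at_ack else "accepted"


accepted = low_power_request_result(cactive_at_ack=0)
denied = low_power_request_result(cactive_at_ack=1)
```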

Low Power Interface (C channel)


Acceptance of low-power request

Denial of low-power request

AXI4 Updates over AXI3: Summary


Additional QoS Signaling: AxQOS[3:0]
Additional 4-bit interface signals AxREGION
  Allow 16 different regions to be uniquely identified
  Region identifier should be constant within a 4KB address space
Added USER signals
  AxUSER, RUSER, WUSER, BUSER
Removes support for locked transfers, so the AxLOCK signal is a single bit (Normal/Exclusive)
Removal of Write Interleaving support
  Removes WID signal
Write response requirements are updated:
  AXI3: clock cycle after last data transfer
  AXI4: additionally, clock cycle after address acceptance
Support for burst lengths of up to 256 beats (for INCR bursts)
AWCACHE and ARCACHE signaling is updated

Part B
AMBA AXI4-Lite

AXI4 Lite
AXI4-Lite is a subset of the AXI4 protocol intended for communication
with simpler, smaller control-register style interfaces in components.
AXI4-Lite is a simpler AXI4 for onchip devices requiring a more powerful
interface than APB.
Features:
  All transactions have a burst length of 1
  All data accesses are the same size as the width of the data bus
  Support for a data bus width of 32 bits or 64 bits
  All accesses are equivalent to AWCACHE or ARCACHE equal to b0000 (i.e. non-modifiable and non-bufferable)
  Exclusive accesses are not supported
  AXI IDs are not supported, so all transactions must be in order
  As a result, the signal list is reduced

AXI Lite Signal list


Subset of AXI signal set
Simple traditional signaling
Targeted applications: simple, low-performance peripherals
  GPIO
  UART

Signals not supported in AXI4-Lite:
  AWLEN, ARLEN
  AWSIZE, ARSIZE
  AWBURST, ARBURST
  AWLOCK, ARLOCK
  AWCACHE, ARCACHE
  WLAST, RLAST

Part C
ACE Protocol Specification

Coherency Problem
Two problems for systems that contain caches:
1) Memory may be updated (by another master) after a cached master has taken a copy
   The cache no longer contains up-to-date data
2) In systems that contain write-back caches, if the master writes to its local cached copy
   The memory no longer contains up-to-date data.
   A 2nd master reading from memory will see stale data.
Figure: Master1, Master2 and Master3, each with a Cache, connected through the Interconnect to Main Memory

Hardware based Coherency Approaches

Snooping Cache Coherency Protocols


Transactions to a shared-region are broadcast to all masters
All masters listen-in to all shared data-transactions originating from other
masters
When the master detects a read transaction for which it has the most up-to-date data, it provides the data to the other master requesting it; or in the case
of a write, it invalidates its own copy.

Directory based Cache Coherency Protocols


A single directory is maintained, which contains a list of where every cached
line within the system is held.
A master initiating a transaction first consults the directory to find where the
data is cached and then directs cache coherency traffic to only those masters
containing cached copies.
Features
ACE is an extension to AXI

Aims at providing Hardware based cache coherency


Adds 3 new Snoop Channels
Adds additional signals to existing AXI channels
Also adds barrier support to enforce ordering of multiple outstanding
transactions

Specification

ACE Protocol is realized using:


A 5-State Cache Model to define the state of a Cache Line in the coherent system
Additional Snoop Channels that enable communication with a cached master when
another master is accessing a shared address location
Read Channels: (AR, R)
Write Channels: (AW, W, B)
Snoop Channels: (AC, CR, CD)

Additional Signaling on the existing AXI4 channels that enables new transactions and
information to be conveyed

ACE Supported Policies


100% Snoop
Directory Based
Anything in-between (Snoop Filter)

ACE adds following to the AXI


Support for hardware coherency
Support for cache maintenance
Support for Barriers

ACE Cache Line States


Terms used to describe the state of a cache line are:
  Unique: The cache line resides ONLY in this cache
  Shared: The cache line MAY be in other caches
  Dirty: The cache controller is responsible for updating the main memory
  Clean: The cache controller does not have to update the main memory
  Invalid: The cache line is not being used for caching data

Devices are not required to support all 5 states internally
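
The 5 states follow from combining the terms above: a valid line is either Unique or Shared and either Clean or Dirty, and Invalid is the fifth state. The sketch below enumerates them; the class and helper names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineState:
    valid: bool
    unique: bool = False
    dirty: bool = False

    def name(self):
        if not self.valid:
            return "Invalid"
        return ("Unique" if self.unique else "Shared") + ("Dirty" if self.dirty else "Clean")


ALL_STATES = [LineState(False)] + [
    LineState(True, unique, dirty) for unique in (True, False) for dirty in (True, False)
]


def must_update_memory(state):
    """A dirty line makes this cache responsible for updating main memory."""
    return state.valid and state.dirty


state_names = [s.name() for s in ALL_STATES]
```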

Shareability Domains defined in ACE

Non-Shareable
  The domain contains a single master
Inner Shareable
  The domain can include additional masters
Outer Shareable
  The domain contains at least all masters in the inner domain
  Can include additional masters
System
  This domain includes all masters in the system

Example of Shareability Domains

Use of Shareability Domains


For Coherency Transactions:
A master uses a shareability domain to determine which other masters might have a copy
of the addressed location in their local cache
Interconnect uses this information to determine, for any given transaction, which other
masters must be snooped

For Barrier Transactions:
The domain of a barrier transaction can be used to determine how far a barrier
transaction must propagate
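
As a rough illustration of how an interconnect might turn a domain into a set of masters to snoop, the sketch below uses the AxDOMAIN[1:0] encodings from the ACE specification. The system topology (`ALL_MASTERS`, `INNER`, `OUTER`) and the function names are assumptions made up for this example. Treating only Inner and Outer Shareable transactions as snooped is also a simplification of this sketch.

```python
from enum import IntEnum


class Domain(IntEnum):
    """AxDOMAIN[1:0] encodings."""
    NON_SHAREABLE = 0b00
    INNER_SHAREABLE = 0b01
    OUTER_SHAREABLE = 0b10
    SYSTEM = 0b11


# Hypothetical system: two coherent CPU clusters and a GPU.
ALL_MASTERS = {"cpu0", "cpu1", "gpu"}
INNER = {"cpu0": {"cpu0", "cpu1"}, "cpu1": {"cpu0", "cpu1"}, "gpu": {"gpu"}}
OUTER = {m: {"cpu0", "cpu1", "gpu"} for m in ALL_MASTERS}


def domain_members(initiator: str, domain: int) -> set:
    """Return every master that belongs to the initiator's domain."""
    domain = Domain(domain)
    if domain is Domain.NON_SHAREABLE:
        return {initiator}            # a single master
    if domain is Domain.INNER_SHAREABLE:
        return set(INNER[initiator])
    if domain is Domain.OUTER_SHAREABLE:
        return set(OUTER[initiator])  # at least the inner domain
    return set(ALL_MASTERS)           # System: all masters


def snoop_candidates(initiator: str, domain: int) -> set:
    """Masters the interconnect would snoop in this sketch."""
    domain = Domain(domain)
    if domain not in (Domain.INNER_SHAREABLE, Domain.OUTER_SHAREABLE):
        return set()
    return domain_members(initiator, domain) - {initiator}


print(snoop_candidates("cpu0", Domain.INNER_SHAREABLE))
```

The same `domain_members` lookup could serve a barrier: it names the masters that must observe the barrier, and so how far it has to propagate.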


Additional Snoop Channels

Snoop Channels enable communication with a cached master when another master is
accessing a shared address location

AC Channel (Coherent Address Channel): Input to Master
ACADDR is used for sending the address of a snoop request to a cached master,
accompanied by control signals

CR Channel (Coherent Response Channel): Output from Master
CRRESP is used by the master to signal the response to a snoop to the interconnect
A narrow, 5-bit response indicating whether an associated data transfer is expected
on the CD channel

CD Channel (Coherent Data Channel): Output from Master
CDDATA is used by the master to provide the data in response to a snoop
Optional for write-through caches
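
The 5-bit CRRESP value can be unpacked field by field. The sketch below assumes the bit assignment given in the ACE specification (bit 0 DataTransfer, bit 1 Error, bit 2 PassDirty, bit 3 IsShared, bit 4 WasUnique). The type and function names are hypothetical.

```python
from typing import NamedTuple


class SnoopResponse(NamedTuple):
    data_transfer: bool  # CRRESP[0]: data will follow on the CD channel
    error: bool          # CRRESP[1]: the snooped line is in error
    pass_dirty: bool     # CRRESP[2]: responsibility for updating memory is passed on
    is_shared: bool      # CRRESP[3]: the snooped cache keeps a copy
    was_unique: bool     # CRRESP[4]: the line was held Unique before the snoop


def decode_crresp(crresp: int) -> SnoopResponse:
    """Split a 5-bit CRRESP value into its named fields."""
    if not 0 <= crresp < 32:
        raise ValueError("CRRESP is a 5-bit value")
    return SnoopResponse(*(bool((crresp >> bit) & 1) for bit in range(5)))


resp = decode_crresp(0b01001)
print(resp)
if resp.data_transfer:
    print("interconnect should expect CDDATA beats")
```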


Additional Signals to Existing AXI Channels


ACE adds additional signals to existing AXI Channels:

Read Address Channel and Write Address Channel
ARSNOOP [3:0] / AWSNOOP [2:0]
- Indicate the type of snoop transaction for shareable transactions
ARBAR [1:0] / AWBAR [1:0]
- Are used for barrier signaling
ARDOMAIN [1:0] / AWDOMAIN [1:0]
- Indicate which masters should be snooped for snoop transactions
and which masters must be considered for ordering of barrier
transactions

Read Data Channel and Write Data Channel
RRESP [3:2]
- Additional response bits, for shared read transactions, that are
indirectly driven by CRRESP outputs from a snooped master
RACK / WACK
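
To make the extended read response concrete, the sketch below splits a 4-bit RRESP into the standard AXI response in bits [1:0] and the two ACE bits, taken here as RRESP[2] = PassDirty and RRESP[3] = IsShared per the ACE specification. Mapping the two bits straight to a cache line state is a simplification assumed for this example, since the permitted states also depend on the transaction type. The function name is hypothetical.

```python
# Standard AXI response encodings carried in RRESP[1:0].
BASE_RESP = {0b00: "OKAY", 0b01: "EXOKAY", 0b10: "SLVERR", 0b11: "DECERR"}


def decode_rresp(rresp: int):
    """Return (base response, pass_dirty, is_shared, suggested line state)."""
    if not 0 <= rresp <= 0b1111:
        raise ValueError("ACE RRESP is a 4-bit value")
    base = BASE_RESP[rresp & 0b11]
    pass_dirty = bool((rresp >> 2) & 1)  # RRESP[2]
    is_shared = bool((rresp >> 3) & 1)   # RRESP[3]
    # Simplified mapping, an assumption of this sketch.
    state = ("Shared" if is_shared else "Unique") + ("Dirty" if pass_dirty else "Clean")
    return base, pass_dirty, is_shared, state


print(decode_rresp(0b1000))
print(decode_rresp(0b0100))
```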


ACE Protocol Design Principles


In ACE, snoop requests must be responded to in order (the snoop channels do not have ID
signals)
The system interconnect is responsible for coordinating the progress of all
shared (coherent) transactions:
e.g. The interconnect may present snoop addresses to all masters in parallel,
OR it may present snoop addresses one at a time, serially
Access to system memory can be issued upon a snoop miss, or speculatively
before all snoop responses have arrived
One example of such a coherent interconnect is the CCI-400 Interconnect
developed by ARM
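
Because the snoop channels carry no ID, a response can only be matched to a request by its position in the stream. The sketch below shows one way an interconnect model might do that bookkeeping with a FIFO per snooped master. The class and method names are hypothetical and not taken from any ARM deliverable.

```python
from collections import deque


class SnoopPort:
    """Hypothetical interconnect-side bookkeeping for one master's snoop port."""

    def __init__(self):
        self._outstanding = deque()  # snoop addresses awaiting a response

    def send_snoop(self, addr: int) -> None:
        # An address goes out on the AC channel.
        self._outstanding.append(addr)

    def receive_response(self, crresp: int):
        # A response arrives on the CR channel. With no ID to inspect,
        # it must belong to the oldest outstanding snoop.
        if not self._outstanding:
            raise RuntimeError("snoop response without an outstanding snoop")
        return self._outstanding.popleft(), crresp


port = SnoopPort()
port.send_snoop(0x1000)
port.send_snoop(0x2000)
print(port.receive_response(0b00001))  # matched to address 0x1000
print(port.receive_response(0b00000))  # matched to address 0x2000
```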


Different kinds of Components

Interconnect: CCI (Cache Coherent Interconnect)
ACE Masters: Masters with caches
ACE-Lite Masters: Components without caches, snooping other caches
Slaves: Components not initiating snoop transactions

Example Cortex-A15 Coherent System with CCI-400 Interconnect


Transaction Groups

ACE introduces a large number of new transactions to AMBA4.

Non-Shared Transactions
These are the existing AXI read and write transactions
Used for non-coherent, non-snooped transactions

Non-Cached Transactions
ReadOnce
WriteUnique
WriteLineUnique

Shareable Read Transactions
ReadShared
ReadNotSharedDirty

Shareable Write Transactions
MakeUnique
ReadUnique
CleanUnique

Write-back Transactions
WriteBack
WriteClean
Evict

Cache Maintenance Transactions
CleanShared
CleanInvalid
MakeInvalid
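
For quick reference, the grouping can be written down as a lookup table, pairing each group name with its transactions. The sketch below is only a convenience structure. The dictionary and function names are hypothetical, and the two Non-Shared entries use the names ReadNoSnoop and WriteNoSnoop that the ACE specification gives to the existing AXI read and write.

```python
TRANSACTION_GROUPS = {
    # Existing AXI read/write, called ReadNoSnoop / WriteNoSnoop in the ACE spec.
    "Non-Shared": ["ReadNoSnoop", "WriteNoSnoop"],
    "Non-Cached": ["ReadOnce", "WriteUnique", "WriteLineUnique"],
    "Shareable Read": ["ReadShared", "ReadNotSharedDirty"],
    "Shareable Write": ["MakeUnique", "ReadUnique", "CleanUnique"],
    "Write-back": ["WriteBack", "WriteClean", "Evict"],
    "Cache Maintenance": ["CleanShared", "CleanInvalid", "MakeInvalid"],
}


def group_of(transaction: str):
    """Return the group a transaction belongs to, or None if unknown."""
    for group, names in TRANSACTION_GROUPS.items():
        if transaction in names:
            return group
    return None


print(group_of("MakeUnique"))
print(group_of("Evict"))
```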


Transaction Processing
Initiating Master component issues a transaction
Depending on whether coherency support is required, either:
The transaction is passed directly to a slave component, or
The transaction is passed to the coherency support logic within the
interconnect
Interconnect initiates the snoop transactions that are required
Each cached master that receives a snoop transaction provides a snoop
response
The interconnect determines whether a main memory access is required
The interconnect collates snoop responses and any required data
The initiating master completes the transaction


Example: Load operation from Shareable Location


Master Component issues a read transaction on the Read Address channel
Interconnect determines whether any other cache holds a copy of the
location, by snooping:
i.e. It passes the shareable address to other caching masters that can hold a
copy, on the Snoop Address Channel
If any snooped master holds the requested cache line, then it:
Responds on the Snoop Response Channel
Provides the snoop data to the interconnect on the Snoop Data Channel
If no snooped master component holds the requested cache line:
The interconnect initiates a transaction to main memory
The read data is supplied back to the initiating master on the AXI Read Data
channel, as for standard transactions
The master component indicates that the transaction has completed, using the
RACK signal
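
The load sequence can be mimicked with a toy model in which each cache is a dictionary keyed by address. This is a behavioral sketch only: channel handshakes, line states and RACK are not modeled, and the function and variable names are invented for the example.

```python
def shareable_read(initiator, addr, caches, memory):
    """Return (data, source) for a read of a shareable location."""
    # The interconnect snoops every other caching master.
    for master, lines in caches.items():
        if master == initiator:
            continue
        if addr in lines:
            # Snoop hit: the data comes from the snooped cache.
            data, source = lines[addr], master
            break
    else:
        # Snoop miss everywhere: fetch from main memory.
        data, source = memory[addr], "memory"
    # The data returns to the initiator, which now caches the line.
    caches[initiator][addr] = data
    return data, source


caches = {"cpu0": {}, "cpu1": {0x80: 0xAB}}
memory = {0x80: 0x11, 0x90: 0x22}
print(shareable_read("cpu0", 0x80, caches, memory))  # served by cpu1
print(shareable_read("cpu0", 0x90, caches, memory))  # served by memory
```

In the first call the value held by cpu1 is returned rather than the older value in memory, which is the reason for snooping before going to memory.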


Example: Store operation to a Shareable Location


Initiating master component requests a unique copy of the cache line by
issuing a MakeUnique transaction on the AXI Read Address Channel
Interconnect passes the transaction to other caches on the Snoop Address
Channel
Snooped masters respond on the Snoop Response Channel to indicate that the
cache line has been removed from their local caches
A response is provided to the initiating master, using the AXI Read Data channel
(no data transfer occurs)
MakeUnique removes copies of the cache line from other caching masters
Initiating master performs the store using standard AXI write channels
Initiating master asserts the RACK signal, to indicate that the transaction
has completed
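
The effect of MakeUnique can be shown with the same kind of toy model, where each cache is a dictionary keyed by address. The sketch assumes that the store is just an update of the initiator's own copy, which is a simplification. The function name is hypothetical, and handshakes and RACK are not modeled.

```python
def shareable_store(initiator, addr, value, caches):
    """Invalidate other copies of the line, then store; return who was invalidated."""
    invalidated = []
    # MakeUnique: every other caching master drops its copy of the line.
    for master, lines in caches.items():
        if master != initiator and addr in lines:
            del lines[addr]
            invalidated.append(master)
    # The initiator now holds the only copy and performs the store.
    caches[initiator][addr] = value
    return invalidated


caches = {"cpu0": {0x80: 1}, "cpu1": {0x80: 1}, "cpu2": {}}
print(shareable_store("cpu0", 0x80, 2, caches))
print(caches)
```

No data moves during the MakeUnique step, since the initiator is about to overwrite the line anyway.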


Example: Transaction Execution Scenario


Barrier Instructions

ARM Architecture supports 2 types of Barrier Instructions:

DMB (Data Memory Barrier):
Ensures that all memory transactions prior to the barrier are visible to other masters
This prevents re-ordering about the DMB
Everything before the DMB must be complete before anything after the DMB
This was ensured by the ARM MPCore processor cluster
In ACE, the DMB Barriers may define a subset of masters that must be able to observe the barrier:
- This is indicated on the AxDOMAIN signals. These can indicate: Inner, Outer, System or Non-Shareable.
The DMB transaction can flow on the pipelined interconnect but no re-ordering is allowed about the DMB.

DSB (Data Synchronization Barrier):
DSB is used to stall the processor until previous transactions have completed
Can be used, for example, to ensure data written to a DMA command buffer in memory has reached its
destination before kicking off the DMA via a peripheral register
Is the most time-consuming barrier since it stops the processor until transactions are complete

A master issues a Barrier on both the Read Address Channel and the Write Address Channel
simultaneously, using ARBAR and AWBAR signaling.
A barrier transaction has an address phase and a response phase, but no data transfer occurs.
Barriers enforce ordering because a master must not issue any read or write transaction
until the master has received a response for the barrier on both the read and write channels
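
The barrier signaling can be sketched as follows. The AxBAR[1:0] encodings are those of the ACE specification (0b01 memory barrier, 0b11 synchronization barrier). The mapping of DMB and DSB onto them, the dictionary layout and the function names are assumptions made for this example.

```python
from enum import IntEnum


class Bar(IntEnum):
    """AxBAR[1:0] encodings."""
    NORMAL_RESPECTING_BARRIERS = 0b00
    MEMORY_BARRIER = 0b01
    NORMAL_IGNORING_BARRIERS = 0b10
    SYNCHRONIZATION_BARRIER = 0b11


def barrier_pair(kind: str, domain: int) -> dict:
    """Build the matching barrier requests for the AR and AW channels."""
    axbar = {"DMB": Bar.MEMORY_BARRIER, "DSB": Bar.SYNCHRONIZATION_BARRIER}[kind]
    # The same barrier type and domain go out on both address channels.
    return {
        "AR": {"ARBAR": int(axbar), "ARDOMAIN": domain},
        "AW": {"AWBAR": int(axbar), "AWDOMAIN": domain},
    }


def may_issue_next(read_response_seen: bool, write_response_seen: bool) -> bool:
    # Further transactions wait for the barrier response on both channels.
    return read_response_seen and write_response_seen


print(barrier_pair("DMB", 0b01))    # memory barrier, Inner Shareable
print(may_issue_next(True, False))  # still waiting for the write response
```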


Types of Master Interfaces


Full-ACE Master:
Contains all ACE Channels
Can issue snoop requests and can be snooped by the interconnect
e.g. ARM Cortex-A15 processor cluster

ACE-Lite Master:
Does not include the AC, CR and CD channels
But has the additional coherency signals on existing AXI channels
Can issue snoop requests but cannot itself be snooped
e.g. a GPU or a coherent I/O device
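
The difference between the two interface types comes down to which channels are present. A minimal sketch, with hypothetical names, that derives "can be snooped" from the presence of the AC channel:

```python
AXI_CHANNELS = frozenset({"AR", "R", "AW", "W", "B"})
SNOOP_CHANNELS = frozenset({"AC", "CR", "CD"})

INTERFACES = {
    "ACE": AXI_CHANNELS | SNOOP_CHANNELS,  # full ACE master
    "ACE-Lite": AXI_CHANNELS,              # no snoop channels
}


def can_be_snooped(kind: str) -> bool:
    # A master can only receive snoops if it has the snoop address channel.
    return "AC" in INTERFACES[kind]


for kind in INTERFACES:
    print(kind, sorted(INTERFACES[kind]), "snoopable:", can_be_snooped(kind))
```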


ACE-Lite
ACE-Lite is a subset of ACE
Enables uncached masters to snoop ACE Coherent masters
e.g. An AXI Master interface like GigabitEthernet that shares data
with CPU can directly read/write cached data shared within the CPU.

ACE-Lite masters have additional signals on AXI Channels, but do
not have the additional three ACE Snoop channels.


ACE protocol does not guarantee Coherency!


ACE defines the hardware infrastructure required for Coherency


References
AMBA AXI and ACE Protocol Specification (Issue E, Date 22
February 2013)
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0022e/index.html


Thank You
Yashdeep Mahajani
yashdeep.mahajani@wipro.com
