TMS320C54X ARCHITECTURE
UG Consultants, Bangalore
Introduction to TMS320C54x
- Lowest power consumption among DSPs: 0.54 mW/MIPS
- Acceleration for FIR and LMS filtering, code book search, polynomial evaluation, and Viterbi decoding
- Roadmap

TMS320C54X ARCHITECTURE
- Advanced Multibus Architecture With Three Separate 16-Bit Data Memory Buses and One Program Memory Bus
- 40-Bit Arithmetic Logic Unit (ALU), Including a 40-Bit Barrel Shifter and Two Independent 40-Bit Accumulators
- 17 x 17-Bit Parallel Multiplier Coupled to a 40-Bit Dedicated Adder for Non-Pipelined Single-Cycle Multiply/Accumulate (MAC) Operation
- Compare, Select, and Store Unit (CSSU) for the Add/Compare Selection of the Viterbi Operator
- Exponent Encoder to Compute an Exponent Value of a 40-Bit Accumulator Value in a Single Cycle
- Two Address Generators With Eight Auxiliary Registers and Two Auxiliary Register Arithmetic Units (ARAUs)
- Extended Addressing Mode for 8M x 16-Bit Maximum Addressable External Program Space
- 128K x 16-Bit On-Chip RAM Composed of:
  - Eight Blocks of 8K x 16-Bit On-Chip Dual-Access Program/Data RAM
  - Eight Blocks of 8K x 16-Bit On-Chip Single-Access Program RAM
- 16K x 16-Bit On-Chip ROM Configured for Program Memory
- Single-Instruction-Repeat and Block-Repeat Operations for Program Code
- Block-Memory-Move Instructions for Better Program and Data Management
- Instructions With a 32-Bit Long Word Operand
- Instructions With Two- or Three-Operand Reads
- Arithmetic Instructions With Parallel Store and Parallel Load
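The single-cycle MAC and the 40-bit accumulators in the feature list above are what make FIR filtering cheap on this core. As a rough illustration only, the Python sketch below models one FIR output as a chain of signed 16-bit multiplies summed into a 40-bit two's-complement accumulator. The names `wrap_signed`, `mac40`, and `fir_output` are hypothetical helpers for this sketch, not TI APIs, and the model assumes plain integer arithmetic with no fractional mode, saturation, or rounding.

```python
ACC_BITS = 40  # accumulator width named in the feature list


def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def mac40(acc, x, h):
    """One multiply/accumulate step: acc + x*h, kept to 40 bits.

    x and h are treated as signed 16-bit operands (an assumption of this sketch).
    """
    product = wrap_signed(x, 16) * wrap_signed(h, 16)
    return wrap_signed(acc + product, ACC_BITS)


def fir_output(samples, coeffs):
    """Compute one FIR output y = sum(h[k] * x[k]) with a 40-bit accumulator."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = mac40(acc, x, h)
    return acc


if __name__ == "__main__":
    # Four full-scale products: the sum no longer fits in a signed 32-bit
    # register, but it still fits in the 40-bit accumulator.
    y = fir_output([-32768] * 4, [-32768] * 4)
    print(hex(y))
```

The full-scale case in the usage lines is the situation the extra accumulator bits are meant for: the running sum grows past 32 bits without wrapping.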
Overview of C54x
Overview of the C54x device:
- Different memory and peripheral options.
- Performance can go up to 500 MIPS (C5441).
- Highly specialized instruction set.
- Super modified Harvard architecture.
- Low-power devices, well suited for cellular applications and battery-operated devices.
- 16-bit fixed-point DSP (CISC processor).

TMS320C54x Key features
Total memory is divided into three spaces:
- 64K words of program memory (548, 549, 5402, 5410, and 5420 devices support extended program memory of up to 8M words)
- 64K words of data memory (on-chip and external)
- 64K words of I/O memory
ROM, DARAM, and SARAM are the types of on-chip memory supported:
- DARAM (dual-access RAM) can be accessed twice per machine cycle.
- SARAM (single-access RAM) can be accessed once per machine cycle.
- ROM contains the boot loader and data tables. ROM can be customized by submitting a ROM mask to TI.
- The CPU and peripherals (e.g., BSP, HPI) can write to and read from DARAM in the same cycle.

Architecture
Internal Bus Structures:
There are eight internal buses in total.
- A separate output data bus, the E-Bus, is used to write to memory.
- A dual data-bus scheme, the C-Bus and D-Bus, permits fetching two operands in the same cycle, or one 32-bit long operand in one cycle.
- The P-Bus carries instruction code and immediate data operands from program memory, and is also connected to the multiplier input.
- Four address buses (PAB, CAB, DAB, and EAB) carry the addresses needed for instruction execution.

CPU and status registers:
- 40-bit ALU
- Two 40-bit accumulators
- 40-bit barrel shifter
- 17 x 17 multiplier and accumulate unit (MAC)
- 40-bit adder
- Data address generation unit (ARAU0 and ARAU1)
- Program address generation unit (PAGEN)
- Compare, select, and store unit (CSSU)
- Exponent encoder

Status and control registers:
- Status registers indicate the condition of the processor. There are two status registers, ST0 and ST1.
- The control register used to configure the processor is called PMST.
- The contents of these registers can be changed using the SSBX, RSBX, and LD instructions.
- Status register ST0 (register layout figure)
- Status register ST1 (register layout figure)
- PMST (processor mode status register): the contents of PMST determine the DSP mode and memory configuration.

TMS320C54x Pipelining
The TMS320C54x has a six-stage pipeline. Each stage is independent, which allows overlapped execution of instructions. One to six different instructions can be active simultaneously, each at a different stage. The pipeline provides very fast throughput, but requires some attention to detail in programming.
- Prefetch (P): PAB is loaded with the contents of the PC.
- Fetch (F): The opcode is fetched from the program bus (PB) and loaded into the IR.
- Decode (D): The contents of the IR are decoded.
- Access (A): DAB is loaded with the address if a read access is required. If a second operand is required, CAB is loaded with its address. Auxiliary registers are updated in indirect addressing mode.
- Read (R): The data operand(s), if any, are read from the data buses DB and CB. At the same time, the write data address, if any, is placed on EAB.
- Execute/write (E/W): The instruction is executed, and EB is loaded with the write data, if any.
Stage sequence: P F D A R E/W (figure labels: instruction fetch, operand read, operand write)

C54x Pipelining: Bus/hardware Use
Overlapped execution of six instructions over time:
P F D A R E/W
  P F D A R E/W
    P F D A R E/W
      P F D A R E/W
        P F D A R E/W
          P F D A R E/W

Stage  Action                    Bus      Hardware
P      Generate program address  PAB      PC
F      Get opcode                PB       Program memory
D      Decode instruction        -        Decoder
A      Generate read address     DAB/CAB  ARs and ARAU
R      Read operands             DB/CB    Data memory
R      Generate write address    EAB      ARs, ARAU
E/W    Execute instruction       -        MAC, ALU
E/W    Write result              EB       Data memory

Arithmetic Logic Unit (ALU)
ALU Inputs:
X input source:
1. Shifter output: a 32-bit or 16-bit data memory operand, or a shifted accumulator value.
2. Data memory operand from the D-bus.
Y input source:
1. Accumulator A or B
2. Data bus CB
3. T register
Note: Neither accumulator A nor B is connected directly to the X input of the ALU.
Ex: ADD A,0,B ; how is this instruction executed?
The instruction computes B = B + (A << 0). The source accumulator A passes through the shifter and reaches the X input as the shifter output. The destination accumulator B is supplied on the Y input.

Accumulators A and B:
- Destination registers for MAC or ALU operations.
- Each accumulator is divided into three parts:
  - Guard bits: AG and BG (bits 39-32)
  - High word: AH and BH (bits 31-16)
  - Low word: AL and BL (bits 15-0)
- Guard bits provide headroom that prevents overflow in iterative computations.
- Bits 32-16 of A can be used as an input to the multiplier in the MAC unit.

Barrel Shifter:
- Shifts data by -16 to 31 bit positions in a single operation
- Pre-scaling before an ALU operation
- Shift operations
- Normalizing
- Post-scaling before storing the accumulator

Input sources:
- DB for a 16-bit data input operand
- DB and CB for a 32-bit data input operand
- Either one of the two 40-bit accumulators
Output destinations:
- One of the ALU inputs
- E-bus through the MSW/LSW write select unit
Shift value: ranges from -16 to 31 and is taken from one of:
- An immediate operand
- The ASM field of ST1
- The T register

Memory Organization
Memory Map:
The memory space is divided into three individually selectable spaces:
- Program memory of 64K words
- Data memory of 64K words
- I/O space of 64K words
Some devices have more than 64K words of program memory, referred to as paged extended program memory. This program memory is divided into 64K-word blocks called pages.

- Program memory: contains instructions, immediate data operands, and tables.
- Data memory: stores data used by the instructions. It can also be used to store code.
- I/O memory: used for addressing memory-mapped peripherals. It can also serve as extra memory storage.

Memory Configuration
Three bits affect the memory configuration: MP/MC, OVLY, and DROM. All three are located in the PMST register. MP/MC is also an external pin on the processor.

Program memory, page 0 (addresses in hex):
Range      MP/MC = 1 (microprocessor mode)                  MP/MC = 0 (microcomputer mode)
0000-007F  Reserved (OVLY = 1) / External (OVLY = 0)        Reserved (OVLY = 1) / External (OVLY = 0)
0080-3FFF  On-chip DARAM (OVLY = 1) / External (OVLY = 0)   On-chip DARAM (OVLY = 1) / External (OVLY = 0)
4000-EFFF  External                                         External
F000-FEFF  External                                         On-chip ROM (4K x 16)
FF00-FF7F  External                                         Reserved
FF80-FFFF  Interrupt vectors (external)                     Interrupt vectors (on-chip)

Data memory (addresses in hex):
Range      Contents
0000-007F  Memory-mapped registers and scratch-pad registers
0080-3FFF  On-chip DARAM (16K words)
4000-EFFF  External
F000-FEFF  ROM (DROM = 1) / External (DROM = 0)
FF00-FFFF  Reserved (DROM = 1) / External (DROM = 0)

On-chip ROM organization:
- Subdivided into blocks to enhance performance
- Allows one access per block
- On some devices, the on-chip ROM contains:
  - A boot loader that boots from a serial port, external memory, or the HPI
  - A 256-word µ-law expansion table
  - A 256-word A-law expansion table
  - A 256-word sine table
  - The interrupt vector table

On-chip ROM layout for the 542/543/548/549/5402 (start addresses in hex):
F800  Boot loader code
FC00  µ-law expansion table
FD00  A-law expansion table
FE00  Sine lookup table
FF00  Reserved
FF80  Interrupt vector table (IVT)

Memory Mapped Registers:
- A portion of data memory is used as registers. These are called memory-mapped registers (MMRs).
- The MMRs reside in data page 0 (0000h-007Fh).
- The CPU registers (26 in total) require no wait states.
- Peripheral registers reside at addresses 0020h-005Fh.
- Scratch-pad registers are used for temporary variable storage. They reside in the range 0060h-007Fh.
- Each MMR is associated with a memory address. For example, ST0 is at address 06h and AR0 is at address 10h.

Memory Mapped Registers Tables (slide figure)

Temporary Register (T)
- Holds one of the multiplicands.
- Holds a dynamic (execution-time programmable) shift count for instructions with a shift operation, such as ADD, LD, and SUB.
- Holds a dynamic bit address for the BITT instruction.
- Used as one of the operands by CMPS, the double-precision instructions, EXP, and NORM.
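The dynamic shift count held in T, together with the barrel shifter range and the guard/high/low accumulator fields described earlier, can be sketched in Python. This is an illustrative model only. It assumes the shift count is the low six bits of T read as a two's-complement number, that positive counts shift left and negative counts shift right arithmetically, and that bits shifted out past bit 39 are simply discarded (no saturation). The function names are hypothetical and do not correspond to any TI tool or library.

```python
ACC_BITS = 40  # accumulator width used throughout the slides


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def shift_count_from_t(t_reg):
    """Derive a shift count from a 16-bit T value (assumed: bits 5-0, signed)."""
    count = to_signed(t_reg, 6)
    if not -16 <= count <= 31:
        # Behaviour outside the stated range is not modelled in this sketch.
        raise ValueError("shift count outside the -16..31 range")
    return count


def barrel_shift(acc, count):
    """Shift a 40-bit value: left if count > 0, arithmetic right if count < 0."""
    acc = to_signed(acc, ACC_BITS)
    shifted = acc << count if count >= 0 else acc >> -count
    return to_signed(shifted, ACC_BITS)


def split_accumulator(acc):
    """Return the (guard, high, low) fields of a 40-bit value as unsigned ints."""
    raw = acc & ((1 << ACC_BITS) - 1)
    return (raw >> 32) & 0xFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


if __name__ == "__main__":
    t = 0x0004  # low six bits give a left shift of 4
    result = barrel_shift(0x4000_0000, shift_count_from_t(t))
    print([hex(field) for field in split_accumulator(result)])
```

In the usage lines, the shifted value moves out of the high word and into the guard field, which is the kind of headroom the guard bits provide during scaling.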