Introduction to VLIW
VLIW processors issue a fixed number of instructions, formatted either as one large instruction or as a fixed instruction packet, with the parallelism among the instructions explicitly indicated by the instruction.

1. Multiple operations are packed into one instruction.
2. Each operation slot is dedicated to a fixed function (see the sketch below).
3. Constant operation latencies are specified.
4. The architecture requires the compiler to guarantee:
   + parallelism within an instruction, and
   + no use of data before the data is ready.
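As a rough illustration (not any particular ISA's encoding; the struct name, slot set, and field widths are assumptions), a long instruction can be pictured as a fixed record with one operation slot per functional unit:

```c
#include <stdint.h>

/* Hypothetical 4-issue VLIW bundle: each slot is dedicated to one
 * functional unit, and an unused slot must still be encoded,
 * typically as a NOP. Layout is illustrative only. */
typedef struct {
    uint32_t int_op;    /* integer ALU slot    */
    uint32_t fp_op;     /* floating-point slot */
    uint32_t mem_op;    /* load/store slot     */
    uint32_t branch_op; /* branch slot         */
} vliw_bundle_t;
```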
VLIW Equals(EQ) Scheduling Model
- Each operation takes exactly its specified latency.
- Efficient register usage
- No need for register renaming or buffering.
- The compiler depends on results not becoming visible early, i.e., before the specified latency has elapsed (see the sketch below).
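A minimal sketch of the EQ semantics, assuming a hypothetical 2-cycle multiply latency and made-up register contents: a read of the destination register before the latency has elapsed still sees the old value, which is what lets the compiler keep reusing that register without renaming or buffering.

```c
#include <stdio.h>

#define MUL_LATENCY 2   /* assumed latency for this illustration */

int main(void) {
    int regs[8] = {0, 3, 4, 7, 0, 0, 0, 0};

    /* cycle 0: issue  r3 = r1 * r2 ; under the EQ model the result is
     * written exactly MUL_LATENCY cycles later, never earlier. */
    int pending_val = regs[1] * regs[2];
    int writeback_cycle = 0 + MUL_LATENCY;

    for (int cycle = 1; cycle <= MUL_LATENCY; cycle++) {
        if (cycle == writeback_cycle)
            regs[3] = pending_val;          /* result appears exactly now */
        printf("cycle %d: r3 = %d\n", cycle, regs[3]);
    }
    /* cycle 1 still prints the OLD r3 (7): the compiler may schedule a
     * last use of that old value in the multiply's shadow, so the
     * register can be reused without renaming. */
    return 0;
}
```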
VLIW Compiler Optimizations
The responsibilities of a VLIW compiler:

1. Schedule operations to maximize parallel execution.
2. Guarantee intra-instruction parallelism.
3. Schedule to avoid data hazards.
Loop Unrolling
Consider a simple loop and its assembly-language form: unrolling replicates the loop body so that independent iterations can be scheduled together (a source-level sketch follows).
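The original notes presumably work from a specific loop and its assembly; as a stand-in, here is a source-level sketch of unrolling an assumed SAXPY-style loop by a factor of 4 (the function names and the unroll factor are illustrative):

```c
/* Baseline: one multiply-add per iteration, so each iteration's basic
 * block exposes very little ILP to the VLIW scheduler. */
void saxpy(int n, float a, const float *x, float *y) {
    for (int i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Unrolled by 4 (assuming n % 4 == 0 for brevity): the four
 * independent bodies give the scheduler more operations to pack into
 * each long instruction and amortize the loop-control overhead. */
void saxpy_unrolled(int n, float a, const float *x, float *y) {
    for (int i = 0; i < n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}
```

In practice a compiler also emits an epilogue loop for the leftover iterations when `n` is not a multiple of the unroll factor.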
What if there are no loops?
First, define the concept of a basic block: a sequence of code with a single entry and a single exit.
- Branches limit basic block size in control-flow-intensive, irregular code (see the example below).
- It is difficult to find ILP within individual basic blocks.
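For example, a small branch-heavy function (the `clamp` routine below is an assumed illustration) fragments into several tiny basic blocks, each exposing only one or two independent operations:

```c
/* Branch-heavy code: each arm is its own basic block of just one or
 * two operations, leaving the scheduler almost nothing to pack. */
int clamp(int v, int lo, int hi) {
    if (v < lo)        /* basic block 1: compare + branch */
        return lo;     /* basic block 2: single return    */
    if (v > hi)        /* basic block 3: compare + branch */
        return hi;     /* basic block 4: single return    */
    return v;          /* basic block 5: single return    */
}
```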
Classic VLIW Challenges
- Object-code compatibility: all code must be recompiled for every machine, even for two machines in the same generation.
- Object code size: instruction padding (NOPs in unused slots) wastes instruction memory, and loop unrolling/software pipelining replicates code.
- Scheduling variable-latency memory operations: caches and memory bank conflicts introduce statically unpredictable variability.
- Knowing branch probabilities: profiling adds a significant extra step to the tool chain.
- Scheduling for statically unpredictable branches: the optimal schedule varies with the branch path.
- Precise interrupts can be challenging.
How can these challenges be solved?

1. VLIW instruction encoding: