Target PC :: Intel’s Pentium 2200A/2000A MHz / AMD’s Athlon XP2000+

Hyper Pipelined Technology

To reach the highest clock rates, the Pentium 4 uses a pipeline twice as deep as the Pentium III's. The original Pentium processor, based on the P5 architecture, had a 5-stage pipeline. Intel doubled that to 10 stages with the P6 architecture used in the Pentium Pro and Pentium III, and doubled it again with its latest architecture, NetBurst: the Pentium 4 has a 20-stage pipeline. This 20-stage pipeline is what Intel calls Hyper Pipelined Technology.

Advanced Dynamic Execution

Intel describes Advanced Dynamic Execution as an out-of-order, speculative execution engine. Its job is to keep the execution units busy, which it does by providing a large window of instructions from which the execution units can choose. The large out-of-order instruction window lets the processor avoid stalls that would otherwise occur while instructions wait for their dependencies to resolve. Intel’s previous P6 architecture had a small window of 42 instructions; the NetBurst architecture can have up to 126 instructions in this window (in flight).
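The effect of a larger instruction window can be illustrated with a toy model. The sketch below is not how the Pentium 4 scheduler actually works: it assumes unlimited execution units, in-order retirement, and a made-up eight-instruction program, and the name cycles_to_finish is purely illustrative. It only shows why a wider window lets independent work proceed while one slow instruction is outstanding.

```python
def cycles_to_finish(program, window_size):
    """Toy out-of-order model (illustrative assumption, not Intel's design).

    program: list of (latency, deps); deps are indices of earlier instructions.
    Only the oldest `window_size` unretired instructions may issue.
    Instructions retire in program order once their result is available.
    """
    n = len(program)
    finish = [None] * n  # cycle at which each result becomes available
    retired = 0
    cycle = 0
    while True:
        # Retire finished instructions in program order.
        while retired < n and finish[retired] is not None and finish[retired] <= cycle:
            retired += 1
        if retired == n:
            return cycle
        # Issue every instruction in the window whose dependencies are done.
        for i in range(retired, min(retired + window_size, n)):
            if finish[i] is None:
                latency, deps = program[i]
                if all(finish[d] is not None and finish[d] <= cycle for d in deps):
                    finish[i] = cycle + latency
        cycle += 1


# Hypothetical program: one slow instruction (10 cycles), one instruction
# that depends on it, then six independent single-cycle instructions.
program = [(10, []), (1, [0])] + [(1, [])] * 6

small = cycles_to_finish(program, window_size=2)
large = cycles_to_finish(program, window_size=8)
print(f"window of 2: {small} cycles, window of 8: {large} cycles")
```

With the small window, the independent instructions cannot even enter the window until the slow one retires; with the large window they complete in its shadow, so the program finishes in fewer cycles.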

This technology also features improved branch prediction. The Pentium 4 is estimated to reduce branch mispredictions by around 33% compared to the P6 architecture. This is achieved by implementing a 4K branch target buffer, which stores more detail on the history of past branches, and by using a more advanced branch prediction algorithm.

Rapid Execution Engine

The new architecture allows the Pentium 4 to run its Arithmetic Logic Units (ALUs) at twice the frequency of the processor core itself. This means that the ALUs on a Pentium 4 running at 2.2GHz operate at 4.4GHz, with a latency of half a core clock. This translates directly into higher throughput and reduced execution latency.
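The arithmetic behind the doubled ALU clock is easy to check. The short sketch below assumes only the figures quoted above (a 2.2GHz core and a 2x ALU multiplier); the function name alu_timing is an illustrative choice.

```python
def alu_timing(core_ghz, alu_multiplier=2):
    """Return (ALU clock in GHz, core period in ns, ALU period in ns)."""
    alu_ghz = core_ghz * alu_multiplier
    core_period_ns = 1.0 / core_ghz  # 1 / GHz gives nanoseconds
    alu_period_ns = 1.0 / alu_ghz
    return alu_ghz, core_period_ns, alu_period_ns


alu_ghz, core_ns, alu_ns = alu_timing(2.2)
print(f"ALU clock: {alu_ghz:.1f} GHz")
print(f"Core period: {core_ns:.3f} ns, ALU period: {alu_ns:.3f} ns")
```

The ALU period comes out at exactly half the core period, which is where the "half a core clock" latency figure comes from.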

400MHz Front Side Bus

The most talked-about feature of the Pentium 4 is its 400MHz bus. The Pentium III’s 133MHz bus, which is 64 bits wide, can deliver 1.06GB/s of data. The Pentium 4’s architecture is somewhat different. Its bus is clocked at only 100MHz and is also 64 bits wide; the difference is that the 100MHz bus is quad pumped, transferring data four times per clock, and so reaches a whopping 3.2GB/s peak.
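The two bandwidth figures follow from clock rate, transfers per clock, and bus width. A minimal sketch of that calculation, assuming 1GB = 10^9 bytes and using an illustrative function name:

```python
def peak_bandwidth_gb_s(clock_mhz, transfers_per_clock, width_bits):
    """Peak bus bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = width_bits // 8
    bytes_per_second = clock_mhz * 1_000_000 * transfers_per_clock * bytes_per_transfer
    return bytes_per_second / 1e9


p3 = peak_bandwidth_gb_s(clock_mhz=133, transfers_per_clock=1, width_bits=64)
p4 = peak_bandwidth_gb_s(clock_mhz=100, transfers_per_clock=4, width_bits=64)
print(f"Pentium III bus: {p3:.2f} GB/s")
print(f"Pentium 4 bus:   {p4:.2f} GB/s")
```

The "400MHz" label is the effective transfer rate (100MHz x 4); the physical bus clock is lower than the Pentium III's.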

Advanced Transfer Cache

Intel’s Pentium 4 features 8KB of L1 data cache, half the size of the Pentium III’s. This may seem confusing at first, but smaller caches have lower latencies. The reduction was made to lower L1 latency, which should improve the transfer rate, though the small size (8KB) might not be enough for some specific tasks.

This is where the L2 cache comes in. The Pentium 4 Willamette, like the Pentium III (Coppermine), features 256KB of on-die cache on a 256-bit bus, while the Pentium 4 Northwood features a total of 512KB of L2 cache.

Execution Trace Cache

This technology caches decoded x86 instructions (micro-ops), thus removing the latency associated with the instruction decoder from the execution loop. The Execution Trace Cache stores the micro-ops in the path of program execution flow, where the results of branches in the code are integrated into the same cache line.
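The benefit of caching decoded instructions can be sketched with a deliberately simplified model. The real Execution Trace Cache stores whole traces of micro-ops along the execution path; the toy below caches one entry per instruction address, and the addresses, instruction strings, and the run_loop name are all hypothetical. It only counts how often the decoder has to run.

```python
def run_loop(instructions, iterations, use_cache):
    """Return how many times the (toy) decoder is invoked."""
    decoded = {}  # address -> micro-ops; stand-in for the trace cache
    decodes = 0
    for _ in range(iterations):
        for address, text in instructions:
            if use_cache and address in decoded:
                continue  # hit: micro-ops are supplied without decoding
            decodes += 1
            micro_ops = ("uop", text)  # placeholder for real x86 decoding
            if use_cache:
                decoded[address] = micro_ops
    return decodes


# Hypothetical four-instruction loop body executed 100 times.
loop_body = [
    (0x1000, "mov eax, [esi]"),
    (0x1002, "add eax, ebx"),
    (0x1004, "dec ecx"),
    (0x1005, "jnz 0x1000"),
]

without_cache = run_loop(loop_body, iterations=100, use_cache=False)
with_cache = run_loop(loop_body, iterations=100, use_cache=True)
print(f"decoder invocations without cache: {without_cache}, with cache: {with_cache}")
```

In this model each instruction is decoded once and then served from the cache on every later iteration, which is the sense in which the decoder is taken out of the execution loop.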

The Execution Trace Cache is another technique Intel implemented in its new architecture to ease the penalty of mispredicted branch instructions. On older Intel processors, if a branch was mispredicted, the processor had to start the process again from the beginning. The NetBurst architecture can instead retrieve the micro-ops directly from the Execution Trace Cache and send them through the execution pipeline without restarting from the first phase.

Streaming SIMD Extensions 2 (SSE2)

Intel’s Pentium 4 architecture features 144 new instructions that provide 128-bit SIMD integer arithmetic and 128-bit SIMD double-precision floating-point operations.
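What "128-bit SIMD" means in practice is that one register holds several values that are processed together. The sketch below is plain Python rather than real SSE2 code: it works out how many elements of each size fit in 128 bits and imitates a packed add on two doubles. The names lanes and packed_add are illustrative.

```python
import struct

REGISTER_BITS = 128  # width of an SSE2 register


def lanes(element_bits):
    """Number of elements of the given size that fit in one register."""
    return REGISTER_BITS // element_bits


def packed_add(a, b):
    """Imitate a SIMD add: one operation applied to every lane."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


# Two 64-bit doubles fill a 128-bit register exactly.
two_doubles = struct.pack("<2d", 1.5, 2.5)
register_fill_bits = len(two_doubles) * 8

sums = packed_add([1.5, 2.5], [10.0, 20.0])

for bits in (8, 16, 32, 64):
    print(f"{bits}-bit integers per register: {lanes(bits)}")
print(f"doubles per register: {lanes(64)}, packed add result: {sums}")
```

So a single double-precision SSE2 instruction works on two values at once, and a single integer instruction on up to sixteen bytes.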
