
Q10,12

The Pentium 4 processor utilizes the Intel NetBurst micro-architecture, designed for high performance with features like hyper-pipelined technology, rapid execution engine, and aggressive branch prediction. It includes a three-section pipeline and Hyper-Threading Technology, allowing two logical processors to operate simultaneously on a single core. Additionally, the processor employs a Translation Lookaside Buffer (TLB) to enhance memory access speed, although it can experience performance issues due to cache thrashing.

Uploaded by

rashi doiphode
- The Pentium 4 processor is the first hardware implementation of a new micro-architecture, the Intel NetBurst micro-architecture.
- The Intel NetBurst micro-architecture is designed to achieve high performance for both integer and floating-point computations at very high clock rates.
- It has the following features:
  1. Hyper-pipelined technology to enable high clock rates and frequency headroom to well above 1 GHz.
  2. Rapid execution engine to reduce the latency of basic integer instructions.
  3. High-performance, quad-pumped bus interface to the 400 MHz Intel NetBurst micro-architecture system bus.
  4. Execution trace cache to shorten branch delays.
  5. Cache line sizes of 64 and 128 bytes.
  6. Hardware prefetch.
  7. Aggressive branch prediction to minimize pipeline delays.
  8. Out-of-order speculative execution to enable parallelism.
  9. Superscalar issue to enable parallelism.
  10. Hardware register renaming to avoid register name space limitations.

Intel® NetBurst™ Micro-architecture

- The pipeline of the Intel NetBurst micro-architecture contains three sections:
  1. The in-order issue front end
  2. The out-of-order superscalar execution core
  3. The in-order retirement unit
- The front end of the Intel NetBurst micro-architecture consists of two parts:
  1. Fetch/decode unit
  2. Execution trace cache
- The front end performs several basic functions:
  - Prefetches IA-32 instructions that are likely to be executed.
  - Fetches instructions that have not already been prefetched.
  - Decodes instructions into µops.
  - Generates microcode for complex instructions and special-purpose code.
  - Delivers decoded instructions from the execution trace cache.
  - Predicts branches using a highly advanced algorithm.
- The core's ability to execute instructions out of order is a key factor in enabling parallelism.
- This feature enables the processor to reorder instructions so that if one µop is delayed while waiting for data or a contended resource, other µops that appear later in the program order may proceed around it.
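
The reordering described above can be illustrated with a toy scheduler. This is only a sketch under stated assumptions, not the NetBurst scheduler: the `Uop` class, the `run_out_of_order` function, the register names, and the latencies are all hypothetical, each register is assumed to be written once (as if renaming had already happened), and at most `width` µops issue per cycle.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    name: str
    dest: str        # register written by this µop
    srcs: tuple      # registers read by this µop
    latency: int     # cycles until the result is available (assumed value)


def run_out_of_order(program, width=1):
    """Issue up to `width` ready µops per cycle; return (cycle, name) pairs.

    Assumes every register is written by at most one µop and that there
    are no cyclic dependencies (illustrative simplifications).
    """
    # Results of not-yet-issued µops are unavailable; any other register
    # is treated as an input that is ready from cycle 0.
    ready_at = {u.dest: float("inf") for u in program}
    waiting = list(program)
    issued = []
    cycle = 0
    while waiting:
        picked = []
        for uop in waiting:                      # scan in program order
            if len(picked) == width:
                break
            # A µop may start once all of its source values exist.
            if all(ready_at.get(r, 0) <= cycle for r in uop.srcs):
                picked.append(uop)
        for uop in picked:
            waiting.remove(uop)
            ready_at[uop.dest] = cycle + uop.latency
            issued.append((cycle, uop.name))
        cycle += 1
    return issued


program = [
    Uop("load", "r1", ("r0",), latency=5),       # slow, e.g. waiting for data
    Uop("add", "r2", ("r1",), latency=1),        # depends on the load
    Uop("mul", "r3", ("r4", "r5"), latency=1),   # independent of both
]
schedule = run_out_of_order(program)
print(schedule)  # [(0, 'load'), (1, 'mul'), (5, 'add')]
```

In this model the independent `mul` proceeds around the stalled `add`; a strictly in-order machine would have held it back until the `add` had issued.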
- The processor employs several buffers to smooth the flow of µops.
- The core is designed to facilitate parallel execution.
- Most execution units can start a new µop every cycle, so several instructions can be in flight at a time in each pipeline.
- A number of arithmetic logical unit (ALU) instructions can start two per cycle, and many floating-point instructions can start one every two cycles.
- µops can begin execution, out of program order, as soon as their data inputs are ready and resources are available.
- The retirement section receives the results of the executed µops from the execution core and processes them so that the architectural state is updated in the original program order.
- For semantically correct execution, the results of IA-32 instructions must be committed in original program order before the instructions are retired.
- Exceptions may be raised as instructions are retired.
- Thus, exceptions cannot occur speculatively; they occur in the correct order, and the machine can be correctly restarted after an exception.
- When a µop completes and writes its result to the destination, it is retired.
- Up to three µops may be retired per cycle.
- The Reorder Buffer (ROB) is the unit in the processor which buffers completed µops and updates the architectural state in order.

Hyper-Threading Technology

- Hyper-Threading Technology is a form of simultaneous multithreading technology introduced by Intel.
- A processor with Hyper-Threading Technology consists of two logical processors per core, each of which has its own processor architectural state.
- Each logical processor can be individually halted, interrupted, or directed to execute a specified thread, independently of the other logical processor sharing the same physical core.
- Unlike a traditional dual-processor configuration that uses two separate physical processors, the logical processors in a hyper-threaded core share the execution resources.
- These resources include the execution engine, caches, and system bus interface.
- The sharing of resources allows two logical processors to work with each other more efficiently.
- Hyper-threading works by duplicating certain sections of the processor (those that store the architectural state) but not duplicating the main execution resources.
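
The split between duplicated architectural state and shared execution resources can be sketched as a toy model. Everything here is hypothetical: the class names, the single `acc` register, and the round-robin choice of which logical processor gets the execution slot are assumptions made for illustration, and real hardware can issue µops from both threads in the same cycle.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcessor:
    """Architectural state, duplicated once per logical processor."""
    name: str
    registers: dict = field(default_factory=lambda: {"acc": 0})
    pc: int = 0
    halted: bool = False


class PhysicalCore:
    """One set of execution resources shared by two logical processors."""

    def __init__(self):
        self.logical = [LogicalProcessor("LP0"), LogicalProcessor("LP1")]
        self.executed = 0   # work done by the single shared execution engine
        self._next = 0      # round-robin pointer (an assumption of this sketch)

    def step(self):
        """Give this cycle's slot to the next logical processor not halted."""
        for offset in range(2):
            idx = (self._next + offset) % 2
            lp = self.logical[idx]
            if not lp.halted:
                lp.registers["acc"] += 1   # touches only this thread's state
                lp.pc += 1
                self.executed += 1         # shared resource does the work
                self._next = (idx + 1) % 2
                return lp.name
        return None  # both halted: the shared engine idles


core = PhysicalCore()
both_running = [core.step() for _ in range(4)]   # slots alternate
core.logical[1].halted = True                    # halt LP1 independently
one_running = [core.step() for _ in range(2)]    # LP0 takes every slot
print(both_running, one_running, core.executed)
```

Halting one logical processor leaves the other's state untouched and frees the shared engine for it, which is the behaviour the bullets above describe.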
- This technology is transparent to operating systems and programs.
- It is possible to optimize operating system behavior on multi-processor, Hyper-Threading-capable systems.
- Software applications that have been written to use multiple pieces of code called "threads" view the Pentium 4 processor at 3.06 GHz with HT Technology as two processors.
- HT Technology allows the processor to work on two separate threads at the same time rather than one at a time.

- A translation lookaside buffer (TLB) is a memory cache that is used to reduce the time taken to access a user memory location.
- It stores recent translations of virtual memory addresses to physical addresses for faster retrieval.
- A TLB may reside between the CPU and the CPU cache, between the CPU cache and main memory, or between the different levels of a multi-level cache.
- The majority of desktop, laptop, and server processors include one or more TLBs in the memory-management hardware.
- A TLB is nearly always present in any processor that uses paged or segmented virtual memory.
- When a virtual memory address is referenced by a program, the search starts in the CPU. First, instruction caches are checked.
- If the required memory is not in these very fast caches, the system has to look up the memory's physical address.
- At this point, the TLB is checked for a quick reference to the location in physical memory.
- When an address is searched for in the TLB and not found, the page tables in physical memory must be searched with a memory page crawl operation (commonly called a page walk).
- As virtual memory addresses are translated, the referenced translations are added to the TLB.
- When a translation can be retrieved from the TLB, access is faster because the address mapping is already stored on the processor.
- TLBs can suffer performance issues from multitasking and code errors.
- This performance degradation is called a cache thrash.
- Cache thrash is caused by an ongoing computer activity that fails to progress due to excessive use of resources or conflicts in the caching system.
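
The lookup path and the thrashing effect can be sketched with a small model. This is an illustration under assumptions, not how any particular processor implements its TLB: the `TLB` class, the 4 KiB `PAGE_SIZE`, the flat dictionary standing in for the page tables, and the least-recently-used replacement policy are all hypothetical choices.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class TLB:
    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table   # virtual page number -> frame number
        self.entries = OrderedDict()   # cached translations, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)          # mark as recently used
        else:
            self.misses += 1
            frame = self.page_table[vpn]           # slow path: page walk
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)   # evict least recently used
            self.entries[vpn] = frame              # cache the translation
        return self.entries[vpn] * PAGE_SIZE + offset


def run(capacity, pages, rounds):
    """Touch each page in `pages` once per round; return (hits, misses)."""
    page_table = {vpn: vpn + 100 for vpn in range(8)}  # made-up mapping
    tlb = TLB(capacity, page_table)
    for _ in range(rounds):
        for vpn in pages:
            tlb.translate(vpn * PAGE_SIZE)
    return tlb.hits, tlb.misses


fits = run(capacity=4, pages=range(4), rounds=5)      # working set fits
thrashes = run(capacity=4, pages=range(5), rounds=4)  # one page too many
print(fits, thrashes)  # (16, 4) (0, 20)
```

With four entries and four pages, only the first round misses. Adding a fifth page to the cycle makes every access evict the entry that will be needed soonest, so the hit count drops to zero: a simple instance of thrashing.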
