Archive for November 9, 2008
Intel’s 15 Most Unforgettable x86 CPUs-Part 2
Pentium M: Laptops Flex Their Muscles
In 2003 the market for portable PCs was booming and Intel had only two processors for them: the aging Pentium III Tualatin and the Pentium 4, whose high power consumption made it unsuitable. But a savior was to arrive from Israel: the Banias (alias Pentium M). This processor, based on the P6 architecture (the same as the Pentium Pro) had high performance and low power consumption. It even beat the Pentium 4, while consuming a lot less power. This was the processor used in the Centrino platform and it was quickly followed (in 2004) by the (faster) Dothan model. The Pentium M left its mark on the world of mobility, and the Stealey (A100) still uses the Dothan architecture (with lower frequencies and TDP).
| Code name | Banias | Dothan |
| Date released | 2003 | 2004 |
| Architecture | 32 bits | 32 bits |
| Data bus | 64 bits | 64 bits |
| Address bus | 32 bits | 32 bits |
| Maximum memory | 4 GB | 4 GB |
| L1 cache | 32 KB + 32 KB | 32 KB + 32 KB |
| L2 cache | 1,024 KB | 2,048 KB |
| Clock frequency | 0.9–1.7 GHz | 1–2.13 GHz |
| FSB | 400 MHz | 400, 533 MHz |
| SIMD | MMX, SSE, SSE2 | MMX, SSE, SSE2 |
| SMT/SMP | no | no |
| Fabrication process | 130 nm | 90 nm |
| Number of transistors | 77 million | 140 million |
| Power consumption | 9-30 W | 6-35 W |
| Voltage | 0.9–1.5 V | 0.9–1.4 V |
| Die surface area | 82 mm² | 87 mm² |
| Connector | Socket 479 | Socket 479 |
As with the Pentium 4, the FSB actually operated at a quarter of the nominal frequency (QDR). The connector used, the Socket 479, actually had 478 pins, but they were arranged differently from the Pentium 4 Socket 478 (though adapters were made).
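The relationship between the advertised FSB figure and the underlying bus clock can be sketched in a few lines of Python. This is only an illustration: the `PUMPS` table and the `base_clock_mhz` function are names invented for this example, not anything taken from Intel documentation.

```python
# Transfers per clock cycle for the bus signalling schemes named in these articles.
# The table and function names are illustrative assumptions.
PUMPS = {"SDR": 1, "DDR": 2, "QDR": 4}


def base_clock_mhz(nominal_mhz: float, scheme: str = "QDR") -> float:
    """Return the physical bus clock behind an advertised FSB rating."""
    return nominal_mhz / PUMPS[scheme]


# FSB ratings quoted in the tables of this article.
for nominal in (400, 533, 667, 800):
    print(f"FSB {nominal} MHz (QDR) -> {base_clock_mhz(nominal):.0f} MHz clock")
```

The same division explains why a "533 MHz" bus is described elsewhere as 133 MHz QDR.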
Pentium 4 Gets 64-bit And Another Core
In 2005, Intel improved its Pentium 4 twice. First, with the Prescott-2M, and then with Smithfield. The former was a 64-bit processor, based on the Prescott design, and the latter was a dual-core processor. They are fairly similar and have the same problems as other Pentium 4s: low instructions per cycle (IPC) throughput and difficulty in increasing the clock frequency due to leakage current. These two processors, intended to limit losses while awaiting the Core 2 Duo, are not among Intel’s most highly regarded. And while the Pentium D (the commercial name of the Smithfield) does have two cores, in reality it’s little more than two Prescott cores placed side by side, each with its own L2 cache.
| Code name | Prescott-2M | Smithfield |
| Date released | 2005 | 2005 |
| Architecture | 64 bits | 64 bits |
| Data bus | 64 bits | 64 bits |
| Address bus | 64 (actual 36) bits | 64 (actual 36) bits |
| Maximum memory | 64 GB | 64 GB |
| L1 cache | 16 KB + 12 Kµops | 2 x 16 KB + 12 Kµops |
| L2 cache | 2,048 KB | 2 x 1,024 KB |
| Clock frequency | 3–3.6 GHz | 2.8–3.2 GHz |
| FSB | 800 MHz | 800 MHz |
| SIMD | MMX, SSE, SSE2, SSE3 | MMX, SSE, SSE2, SSE3 |
| SMT/SMP | Hyper-Threading | dual cores (Hyper-Threading on certain models) |
| Fabrication process | 90 nm | 90 nm |
| Number of transistors | 169 million | 230 million |
| TDP | 84-115 W | 95-130 W |
| Voltage | 1.2 V | 1.2 V |
| Die surface area | 135 mm² | 206 mm² |
| Connector | LGA775 | LGA775 |
An interesting point is that whereas the Pentium 4 processors intended for the consumer market did not use the PAE technology (which enables 36-bit, as opposed to 32-bit memory management) and were therefore limited to 4 GB of RAM, these models can go beyond that limit. In practice, the address bus is still limited to 36 bits (40 bits on the Xeon), but PAE (management in 4 GB pages) is now ancient history—a 64-bit program is capable of making full use of the available memory.
Hyper-Threading, an Intel SMT technology, was available on certain models (Xeon and Extreme Edition). Finally, a 65 nm version (the 9×0 series) of the Pentium 4 was released later, but made no major improvements.
The First Mobile Dual-Core
In 2006, Intel announced the Core Duo. The first dual-core processor for portable PCs boasted excellent performance—much better than the Pentium 4. It was also one of the first x86 processors to be truly dual-core. The cache, for example, is shared (whereas the Pentium D was more like an assembly of two processors in the same package). This processor was part of the Centrino Duo platform and was a huge success. The only drawback was that it was still a 32-bit processor, unlike the Pentium 4.
| Code name | Yonah |
| Date released | 2006 |
| Architecture | 32 bits |
| Data bus | 64 bits |
| Address bus | 32 bits |
| Maximum memory | 4 GB |
| L1 cache | 32 KB + 32 KB |
| L2 cache | 2,048 KB shared |
| Clock frequency | 1.06–2.33 GHz |
| FSB | 667 MHz |
| SIMD | MMX, SSE, SSE2, SSE3 |
| SMT/SMP | Dual core |
| Fabrication process | 65 nm |
| Number of transistors | 151 million |
| TDP | 9-31 W |
| Voltage | 0.9–1.3 V |
| Die surface area | 91 mm² |
| Connector | Socket 479 |
A Core Solo version with one core was also made available, and the low-power-consumption versions used a 533 MHz bus (133 MHz QDR) instead of 667 MHz. This processor was used in servers (code name Sossaman), which was a first for a processor originally intended for the mobile world. Note that this processor didn’t officially use the Core architecture of the Core 2 Duo, and it was quickly replaced by the Core 2 Duo (Merom) in portable PCs. Also, the Yonah’s Socket 479 is different from the Socket 479 of other Pentium M processors.
Today’s Hotness: The Core 2 Duo
In 2006, Intel released a processor that quickly became a best-seller: the Core 2 Duo. Derived from work done on the Pentium M, this processor uses a new Core architecture. Before, Intel had two lines of processors—the Pentium 4 for desktops, Pentium M for mobiles, and both lines for servers. In contrast, Intel now has a single micro-architecture on which all of its product lines draw. The 64-bit Core 2 Duo is represented from the low end to the high end, for desktop computers, portables and servers.
There are many versions of the architecture, resulting in configurations with a different number of cores (one to four, yielding everything from Solos to Quads), cache memory (512 KB to 12 MB), and the FSB (between 400 and 1600 MHz). The model shown here is the original Core 2 Duo, but faster versions (at 45 nm) exist.
| Code name | Conroe |
| Date released | 2006 |
| Architecture | 64 bits |
| Data bus | 64 bits |
| Address bus | 64 (actual 36) bits |
| Maximum memory | 64 GB |
| L1 cache | 32 KB + 32 KB |
| L2 cache | 2,048 KB shared |
| Clock frequency | 1.8-3 GHz |
| FSB | 800-1066-1333 MHz |
| SIMD | MMX, SSE, SSE2, SSE3, SSSE3 |
| SMT/SMP | Dual core |
| Fabrication process | 65 nm |
| Number of transistors | 291 million |
| TDP | 65 W |
| Voltage | 1.5 V |
| Die surface area | 143 mm² |
| Connector | LGA 775 |
The mobile versions (Merom) are basically identical (but not as fast, with a slower FSB) whereas the Extreme Edition versions are faster. The Core 2 Duo also exists in a four-core version, which is, in fact, two Conroes in the same package. The 45 nm version of the Core 2 Duo (Penryn) has a larger cache and generates less heat, but is still fundamentally similar to this model.
The Future: Nehalem, Atom, Etc.
Obviously, this is only the first part of a series of articles. The second part, on AMD processors, will follow (along with a piece on AMD’s ATI graphics cards). But the story of the Intel x86 processors doesn’t end with the Core 2 Duo, and obviously other models are planned for the future. Nehalem and Atom are also x86 processors. And a little bird tells us that Intel’s upcoming entry into the graphics market, Larrabee, is also based on a number of x86 cores.
Imitation To Innovation: AMD’s Best CPUs – Part 1
AMD Clones Intel
The year is 1981, and Intel (see history of Intel processors) has just been chosen by IBM to supply the processor for the first personal computer. IBM wanted at least two CPU suppliers for its PC, and forced Intel to license its technology. And so it was that AMD became one of the first companies to sell an 8086 clone. AMD’s first processor went on sale in 1982. Because it was a licensed processor, the AMD 8086 (and 8088) was identical to Intel’s model.
| Code name | ? |
| Date released | 1982 |
| Architecture | 16-bits |
| Data bus | 16-bits |
| Address bus | 20-bits |
| Maximum memory | 1 MB |
| L1 cache | no |
| L2 cache | no |
| Clock frequency | 5-10 MHz |
| FSB | same as clock frequency |
| FPU | 8087 |
| SIMD | no |
| Fabrication process | 3,000 nm |
| Number of transistors | 29,000 |
| Power consumption | ? |
| Voltage | 5 V |
| Die surface area | 16 mm² |
| Connector | 40 pins |
Note the “© Intel” on the processor, made by AMD.
Am286: Manufactured Under License, But Faster
AMD’s Am286, a clone of the Intel 80286 manufactured under license, was identical to the chip from Intel, but it had a big advantage: its higher clock speed. Whereas Intel’s 286s topped out at 12.5 MHz, AMD sold 20 MHz versions. Because the 286 was more economical than the 386, whose innovations weren’t fully exploited for several years, AMD was already the value choice more than 20 years ago.
| Code name | ? |
| Date released | 1983 |
| Architecture | 16-bits |
| Data bus | 16-bits |
| Address bus | 24-bits |
| Maximum memory | 16 MB |
| L1 cache | no |
| L2 cache | no |
| Clock frequency | 8-20 MHz |
| FSB | same as clock frequency |
| FPU | 80287 |
| SIMD | no |
| Fabrication process | 1,500 nm |
| Number of transistors | 134,000 |
| Power consumption | ? |
| Voltage | 5 V |
| Die surface area | 49 mm² |
| Connector | 68 pins |
Am386: A 40-MHz 386
In 1991, AMD released its 386 processor. Like its predecessors, this model was identical to the Intel versions. AMD was licensed to produce clones of Intel products, right down to the microcode (the CPU’s firmware). This processor had two notable features. First, it was faster than the Intel model (40 MHz compared to a top speed of 33 MHz at Intel). Second, it was the first to sport the Windows Compatible logo on the package.
| Code name | ? |
| Date released | 1991 |
| Architecture | 32-bits |
| Data bus | 32-bits |
| Address bus | 32-bits |
| Maximum memory | 4,096 MB |
| L1 cache | no |
| L2 cache | no |
| Clock frequency | 12-40 MHz |
| FSB | same as clock frequency |
| FPU | 80387 |
| SIMD | no |
| Fabrication process | 1,500 – 1,000 nm |
| Number of transistors | 275,000 |
| Power consumption | 2 W (33 MHz) |
| Voltage | 5 V |
| Die surface area | 42 mm² |
| Connector | 132 pins |
Am486: The Last Clone
The 486 was the last clone of an Intel processor. AMD produced 486s in two different versions—one with microcode by Intel and another with microcode by AMD, because the company was having legal hassles with Intel by that point. In addition to processors sold under the 486 designation, AMD also marketed an AMD 5×86, which was a 486 with a 4x clock multiplier. Running at 133 MHz, this model was compatible with 486 motherboards, but had the performance of a Pentium 75. It was with the 5×86 that AMD began using the famous “Pentium Rating” (5×86 PR 75), which it would stay with up to and including the Athlon 64 X2.
| Code name | ? | X5 |
| Date released | 1993 | 1995 |
| Architecture | 32-bits | 32-bits |
| Data bus | 32-bits | 32-bits |
| Address bus | 32-bits | 32-bits |
| Maximum memory | 4,096 MB | 4,096 MB |
| L1 cache | 8 KB | 16 KB |
| L2 cache | motherboard (FSB frequency) | motherboard (FSB frequency) |
| Clock frequency | 16-120 MHz | 133 MHz |
| FSB | 16-50 MHz | 33 MHz |
| FPU | built-in | built-in |
| SIMD | no | no |
| Fabrication process | 1,000 – 800 nm | 350 nm |
| Number of transistors | 1,185,000 | ? |
| Power consumption | ? | ? |
| Voltage | 5 V–3.3 V | 3.45 V |
| Die surface area | 81 – 67 mm² | ? |
| Connector | 168 pins | 168 pins |
The K5: AMD’s Very Own Processor
In 1996, AMD released its fifth-generation processor, the K5. Compared to Intel’s Pentium, the K5 was technically more advanced, though it did have some faults. It’s especially interesting because of its RISC-based internal architecture that decoded x86 instructions into micro-instructions before executing them. The K5 had difficulty reaching high clock speeds and its FPU was a little weak. Still, in normal use, the K5 was a better performer than the Pentium and its PR was not just hype—a K5 clocked at 100 MHz was sold as a PR133 chip, meaning that AMD considered it as being equivalent in performance to a 133 MHz Pentium.
| Code name | SSA/5, 5k86 |
| Date released | 1996 |
| Architecture | 32-bits |
| Data bus | 64-bits |
| Address bus | 32-bits |
| Maximum memory | 4,096 MB |
| L1 cache | 16 KB + 8 KB |
| L2 cache | motherboard (FSB frequency) |
| Clock frequency | 75-133 MHz (PR75 – PR200) |
| FSB | 50-66 MHz |
| FPU | built-in |
| SIMD | no |
| Fabrication process | 500 – 350 nm |
| Number of transistors | 4.3 million |
| Power consumption | 11-16 W |
| Voltage | 3.52 V |
| Die surface area | 251 – 181 mm² |
| Connector | Socket 5 or 7 |
The use of the PR resulted in such oddities as a K5 PR90 and PR120 running at the same frequency (90 MHz) and a PR100 and PR133 both clocked at 100 MHz. Notice also that the CPU package informed buyers that a heat sink and fan were required—at that time, the use of such cooling devices was not yet common practice.
The K6: AMD Extends Its Range
In 1997, AMD released a new processor: the K6. Unlike the K5, which was created by AMD, the K6 was the result of the work done by NexGen on the Nx686. This processor was compatible with Socket 7 (Pentium) motherboards and offered very good performance compared to Intel’s Pentium II processors, at a much lower price. The K6’s FPU was still a little weak compared to Intel’s. A 250 nm version of the K6, called Little Foot, came out in 1998.
Also in 1998, AMD announced the K6-2, a processor that used a faster bus (100 MHz) and had improved SIMD performance. It also had one more MMX unit than the K6 and a new instruction set, 3DNow!, for floating-point calculations (MMX handled only integers). The K6-2 (400 and up) was a big success because it was a good upgrade solution for owners of Pentium MMX platforms—by using the 2X multiplier on a motherboard with a 66 MHz bus, the processor was in fact operating at 6X (400 MHz), which permitted a significant gain in speed at a lower upgrade cost.
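The upgrade trick is simple arithmetic: core clock equals bus clock times multiplier. Here is a minimal sketch, assuming the remapping works exactly as described above (a 2X board setting read as 6X). The function names are made up for the example, and letting every other setting pass through unchanged is a simplifying assumption.

```python
# The "66 MHz" bus is nominally 66.67 MHz (200/3).
BUS_MHZ = 200 / 3


def k6_2_multiplier(board_setting: float) -> float:
    """Multiplier the CPU applies for a given motherboard setting.

    Per the article, a 2X setting is interpreted as 6X. Passing other
    settings through unchanged is an assumption of this sketch.
    """
    return 6.0 if board_setting == 2.0 else board_setting


def core_clock_mhz(bus_mhz: float, board_setting: float) -> float:
    """Core clock = bus clock x effective multiplier."""
    return bus_mhz * k6_2_multiplier(board_setting)


print(round(core_clock_mhz(BUS_MHZ, 2.0)), "MHz")
```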
Finally, in 1999, AMD released the third version of the K6, the K6-III. The main difference from the K6-2 version was an on-chip 256 KB cache. The K6-III was very fast, but also very costly to produce, and was quickly replaced by the Athlon (K7).
| Code name | K6, Little Foot (250 nm) | K6-3D, Chomper | Sharptooth |
| Date released | 1997/1998 | 1998 | 1999 |
| Architecture | 32-bits | 32-bits | 32-bits |
| Data bus | 64-bits | 64-bits | 64-bits |
| Address bus | 32-bits | 32-bits | 32-bits |
| Maximum memory | 4,096 MB | 4,096 MB | 4,096 MB |
| L1 cache | 32 KB + 32 KB | 32 + 32 KB | 32 + 32 KB |
| L2 cache | motherboard (FSB frequency) | motherboard (FSB frequency) | 256 KB (CPU frequency) |
| L3 cache | no | no | motherboard (FSB frequency) |
| Clock frequency | 166-300 MHz | 300-550 MHz | 400-450 MHz |
| FSB | 50-66 MHz | 66-100 MHz | 100 MHz |
| FPU | built-in | built-in | built-in |
| SIMD | MMX | MMX, 3DNow! | MMX, 3DNow! |
| Fabrication process | 350 – 250 nm | 250 nm | 250 nm |
| Number of transistors | 8.8 million | 9.3 million | 21.3 million |
| Power consumption | 12-28 W | 13-25 W | 10-17 W |
| Voltage | 2.2–2.9 V–3.2 V | 2.2–2.4 V | 2.2–2.4 V |
| Die surface area | 157-68 mm² | 81 mm² | 118 mm² |
| Connector | Socket 7 | Socket 7 / Super Socket 7 | Super Socket 7 |
AMD also marketed K6-2+ and K6-3+ processors, mainly for portable PCs. These used a 180 nm fab process and had an on-chip 128 KB (K6-2+) or 256 KB (K6-3+) L2 cache.
K7/Athlon: A Killer
In 1999, AMD released its seventh-generation processor, the K7, later renamed Athlon. This chip did away with the drawbacks of earlier models and finally had an FPU worthy of the name; in fact, it was even better than Intel’s. The Athlon was the fastest x86 processor and had many strong points, including a fast FSB (the EV6 bus, borrowed from DEC’s Alpha processors) and high performance numbers. The only real problem came not from the processor but from the chipsets: neither the AMD nor Via models could compete with Intel’s chipsets (like the famous 440BX). The K7 used Slot A (competing with Intel’s Slot 1) and had a Level 2 cache with a variable divider (1/2, 2/5 or 1/3).
| Code name | Argon (K7) | Pluto, Orion (K75) |
| Date released | 1999 | 1999 |
| Architecture | 32-bits | 32-bits |
| Data bus | 64-bits | 64-bits |
| Address bus | 32-bits | 32-bits |
| Maximum memory | 4,096 MB | 4,096 MB |
| L1 cache | 64 KB + 64 KB | 64 KB + 64 KB |
| L2 cache | Slot A (1/2 CPU) | Slot A (1/2, 2/5 or 1/3 CPU) |
| Clock frequency | 500-700 MHz | 550-1000 MHz |
| FSB | 100 MHz (DDR) | 100 MHz (DDR) |
| FPU | built-in | built-in |
| SIMD | MMX, Enhanced 3DNow! | MMX, Enhanced 3DNow! |
| Fabrication process | 250 nm | 180 nm |
| Number of transistors | 22 million | 22 million |
| Power consumption | 42-50 W | 31-65 W |
| Voltage | 1.6 V | 1.6–1.8 V |
| Die surface area | 184 mm² | 102 mm² |
| Connector | Slot A | Slot A |
Just as a side note, it was AMD who was the first to announce (and market) a 1 GHz processor with the Athlon (two days before Intel’s 1 GHz Pentium III).
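To see what the Slot A cache dividers mean in practice, the sketch below computes the L2 clock for each divider listed above. The pairing of CPU clocks with dividers is purely illustrative; the article does not say which divider shipped with which model, and `l2_clock_mhz` is a name invented for this example.

```python
from fractions import Fraction

# Dividers quoted in the article for the Slot A Athlon's L2 cache.
DIVIDERS = (Fraction(1, 2), Fraction(2, 5), Fraction(1, 3))


def l2_clock_mhz(cpu_mhz: int, divider: Fraction) -> float:
    """L2 cache clock for a given core clock and cache divider."""
    return float(cpu_mhz * divider)


# Illustrative core clocks taken from the frequency ranges in the table.
for cpu_mhz in (700, 1000):
    for divider in DIVIDERS:
        print(f"{cpu_mhz} MHz core, {divider} divider -> "
              f"{l2_clock_mhz(cpu_mhz, divider):.0f} MHz L2")
```

Using `Fraction` keeps ratios such as 2/5 and 1/3 exact until the final conversion to a float.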
AMD Improves the Athlon: Thunderbird, XP, and more.
AMD knew it had a winner with the K7 architecture and improved it little by little, increasing the frequency and using finer fab processes. The Thunderbird core employed a 180 nm process and had 256 KB of on-chip cache. The Palomino design introduced support for SSE. The Athlon XP changed the package and reinstated PR numbers. The Thoroughbred was an Athlon XP using a 130 nm fab process (with a 256 KB cache). Barton had a 512 KB cache and also used a 130 nm process. Athlon XP and subsequent models used the PR number instead of a clock frequency designation.
| Code name | Thunderbird | Palomino/XP | Thoroughbred | Barton |
| Date released | 2000 | 2001 | 2002 | 2003 |
| Architecture | 32-bits | 32-bits | 32-bits | 32-bits |
| Data bus | 64-bits | 64-bits | 64-bits | 64-bits |
| Address bus | 32-bits | 32-bits | 32-bits | 32-bits |
| Maximum memory | 4,096 MB | 4,096 MB | 4,096 MB | 4,096 MB |
| L1 cache | 64 KB + 64 KB | 64 KB + 64 KB | 64 KB + 64 KB | 64 KB + 64 KB |
| L2 cache | 256 KB (CPU frequency) | 256 KB (CPU frequency) | 256 KB (CPU frequency) | 512 KB (CPU frequency) |
| Clock frequency | 650-1,400 MHz | 1,000-1,733 MHz | 1,200-2,250 MHz | 1,400-2,200 MHz |
| FSB | 100/133 MHz (DDR) | 133 MHz (DDR) | 133/166 MHz (DDR) | 166/200 MHz (DDR) |
| FPU | built-in | built-in | built-in | built-in |
| SIMD | MMX, Enhanced 3DNow! | MMX, Enhanced 3DNow!, SSE | MMX, Enhanced 3DNow!, SSE | MMX, Enhanced 3DNow!, SSE |
| Fabrication process | 180 nm | 180 nm | 130 nm | 130 nm |
| Number of transistors | 37 million | 37.5 million | 37.2 million | 54.3 million |
| Power consumption | 38-72 W | 46-72 W | 49-68 W | 60-76 W |
| Voltage | 1.7-1.75 V | 1.75 V | 1.5-1.65 V | 1.65 V |
| Die surface area | 120 mm² | 129.26 mm² | 84.66 mm² | 100.99 mm² |
| Connector | Socket A | Socket A | Socket A | Socket A |
We should mention that AMD also produced versions for servers (Athlon MP) and for laptops (Athlon 4, Athlon XP Mobile), as well as the Geode NX (130 nm and a 256 KB cache). AMD marketed the Thorton (130 nm, 512 KB of cache, 256 KB of which was disabled) and planned on Trinidad, an Athlon using a 90 nm process. There were more PR oddities: the Athlon XP 2600+ was clocked at 1,900, 1,917, 2,000, 2,083, or 2,133 MHz depending on the version, for instance.
Intel’s 15 Most Unforgettable x86 CPUs-Part 1
8086: The First PC processor
The 8086 was the first x86 processor—Intel had already released the 4004, the 8008, the 8080 and the 8085. This 16-bit processor could manage 1 MB of memory using an external 20-bit address bus. The clock frequency chosen by IBM (4.77 MHz) was fairly low, though the processor was running at 10 MHz by the end of its career.
The first PCs used a derivative of this processor, the 8088, which had only an 8-bit (external) data bus. An interesting aside is that the control systems in the US space shuttles use 8086 processors and NASA was forced to buy some from eBay in 2002 since Intel could no longer supply them.
| Code name | N/A |
| Date released | 1978 |
| Architecture | 16 bits |
| Data bus | 16 bits |
| Address bus | 20 bits |
| Maximum memory | 1 MB |
| L1 cache | no |
| L2 cache | no |
| Clock frequency | 4.77-10 MHz |
| FSB | same as clock frequency |
| FPU | 8087 |
| SIMD | no |
| Fabrication process | 3,000 nm |
| Number of transistors | 29,000 |
| Power consumption | N/A |
| Voltage | 5 V |
| Die surface area | 16 mm² |
| Connector | 40-pin |
80286: 16 MB Of Memory, But Still 16 Bits
Released in 1982, the 80286 was 3.6 times faster than the 8086 at the same frequency. It could manage up to 16 MB of memory, but the 286 was still a 16-bit processor. It was the first x86 equipped with a memory management unit (MMU), allowing it to manage virtual memory. Like the 8086, it did not have a floating-point unit (FPU), but could use an x87 co-processor chip (80287). Intel offered these processors at a maximum frequency of 12.5 MHz, whereas its competitors reached 25 MHz.
| Code name | N/A |
| Date released | 1982 |
| Architecture | 16 bits |
| Data bus | 16 bits |
| Address bus | 24 bits |
| Maximum memory | 16 MB |
| L1 cache | No |
| L2 cache | No |
| Clock frequency | 6–12.5 MHz |
| FSB | same as clock frequency |
| FPU | 80287 |
| SIMD | No |
| Fabrication process | 1,500 nm |
| Number of transistors | 134,000 |
| Power consumption | N/A |
| Voltage | 5 V |
| Die surface area | 49 mm² |
| Connector | 68-pin |
386: 32-Bit and Cache Memory
Intel’s 80386 was the first x86 with a 32-bit architecture. Several versions of this processor were offered. The two best known are the 386 SX (Single-word eXternal), which had a 16-bit data bus, and the 386 DX (Double-word eXternal) with a 32-bit data bus. Two other versions are worth noting, though: the SL, which was the first x86 to offer management of an (external) cache, and the 386EX, used in the space program (the Hubble telescope uses this processor).
| Code name | P3 |
| Date released | 1985 |
| Architecture | 32 bits |
| Data bus | 32 bits |
| Address bus | 32 bits |
| Maximum memory | 4096 MB |
| L1 cache | 0 KB (controller sometimes present) |
| L2 cache | no |
| Clock frequency | 16-33 MHz |
| FSB | same as clock frequency |
| FPU | 80387 |
| SIMD | no |
| Fabrication process | 1,500-1,000 nm |
| Number of transistors | 275,000 |
| Power consumption | 2 W @ 33 MHz |
| Voltage | 5 V |
| Die surface area | 42 mm² @ 1µ |
| Connector | 132 pins |
The 486: An FPU And Multipliers Too
The 486 is emblematic of a certain generation who were first discovering computers. In fact, the very famous 486 DX2/66 was long considered the minimum configuration for gamers. This processor, released in 1989, ushered in several interesting new features, like an on-chip FPU, data cache, and the first clock multiplier. The FPU was an x87 coprocessor built into the 486 DX (not SX) series. An 8 KB Level 1 cache was built into the processor (write-through type, then write-back with slightly better performance). There was also the possibility of a Level 2 cache on the motherboard (at the bus frequency).
The second generation of 486s had a CPU multiplier, since the processor operated faster than the FSB, with DX2 (2x multiplier) and DX4 (3x multiplier) versions. Another anecdote: the “487SX” sold as an FPU for the 486SX was actually a full 486DX that disabled and took the place of the first processor.
| Code name | P4, P24, P24C |
| Date released | 1989 |
| Architecture | 32 bits |
| Data bus | 32 bits |
| Address bus | 32 bits |
| Maximum memory | 4096 MB |
| L1 cache | 8 KB |
| L2 cache | Motherboard (FSB frequency) |
| Clock frequency | 16-100 MHz |
| FSB | 16-50 MHz |
| FPU | On chip |
| SIMD | No |
| Fabrication process | 1,000–800 nm |
| Number of transistors | 1,185,000 |
| Power consumption | N/A |
| Voltage | 5 V–3.3 V |
| Die surface area | 81 – 67 mm² |
| Connector | 168 pins |
The DX4 had a 16 KB cache and a few more transistors: 1.6 million. This processor, using a 600 nm process and measuring 76 mm², consumed less power than the original 486 (at a voltage of 3.3 V).
Intel Pentium: A Bothersome Bug
The Pentium, introduced in 1993, was interesting for more than one reason. It was the first x86 to drop the traditional model number for a more attractive name, since Intel wasn’t allowed to trademark a name made up of numbers only. It’s also famous because of a bug it contained. On the first generations of Pentiums, certain division operations produced an incorrect result. Intel replaced the processors, but the damage was done. A very rare error gave rise to the first big IT media buzz.
The Pentium was sold in three different versions, the first without a CPU multiplier, the second with a multiplier (including the very familiar Pentium 166), and the last with MMX, the first SIMD instruction set for x86s. The Pentium MMX also increased the size of the Level 1 cache and brought in a few minor improvements. The Pentium was also the first Intel x86 capable of executing two instructions in parallel. The L2 cache was on the motherboard with these processors (running at the frequency of the FSB).
| Code name | P5, P54 | P55 (Pentium MMX) |
| Date released | 1993 | 1997 |
| Architecture | 32 bits | 32 bits |
| Data bus | 64 bits | 64 bits |
| Address bus | 32 bits | 32 bits |
| Maximum memory | 4096 MB | 4096 MB |
| L1 cache | 8 KB + 8 KB | 16 KB + 16 KB |
| L2 cache | Motherboard (FSB frequency) | Motherboard (FSB frequency) |
| Clock frequency | 60-200 MHz | 133-300 MHz |
| FSB | 50-66 MHz | 60-66 MHz |
| FPU | on chip | on chip |
| SIMD | no | MMX |
| Fabrication process | 800-600-350 nm | 350 nm |
| Number of transistors | 3.1-3.3 million | 4.5 million |
| Power consumption | 8-16 W | 4-17 W |
| Voltage | 5 V-3.3 V | 2.8 V |
| Die surface area | 294-163-90 mm² | 141 mm² |
| Connector | Socket 4, 5 or 7 | Socket 7 |
Here’s a little explanation of the Pentium bug: certain calculations using the FPU resulted in erroneous results. This was fairly rare—though sources disagree about exactly how rare—and Intel replaced the defective processors free of charge. Here’s an example of a Pentium error:
4195835.0/3145727.0 = 1.333 820 449 136 241 002 (correct result)
4195835.0/3145727.0 = 1.333 739 068 902 037 589 (incorrect result on a defective Pentium)
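For a sense of scale, the division can be repeated on any modern machine and compared with the flawed value quoted above. This is a small sketch rather than a reproduction of the bug itself: the flawed result is simply hard-coded from the article, and the variable names are chosen for the example.

```python
# Operands from the example above.
numerator = 4195835.0
denominator = 3145727.0

# Result on a correctly working FPU (double precision).
correct = numerator / denominator

# Result reported for a defective Pentium, copied from the article.
flawed = 1.333739068902037589

abs_error = abs(correct - flawed)
rel_error = abs_error / correct

print(f"correct        : {correct:.15f}")
print(f"flawed         : {flawed:.15f}")
print(f"absolute error : {abs_error:.3e}")
print(f"relative error : {rel_error:.3e}")
```

The two values agree only to the fourth significant digit, which is a large error by floating-point standards even though it takes an unlucky pair of operands to trigger it.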
Pentium Pro: The First To Handle Over 4 GB Of Memory
The Pentium Pro, released in 1995, was the first x86 CPU able to manage more than 4 GB of RAM using Physical Address Extension (PAE), 36-bit address size, and thus 64 GB. An interesting point is that this processor was also the first P6 (the architecture the Core 2 processors are loosely derived from) and also the first x86 to include a Level 2 cache on the processor instead of on the motherboard. In fact, between 256 KB and 1 MB of cache were placed next to the CPU, on the same socket, making the L2 cache on-package as opposed to on-chip, clocked at the same frequency as the CPU.
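The link between address width and maximum memory that runs through these tables is just a power of two. The sketch below recomputes the figures for the address widths quoted in the articles; the function names `max_memory_bytes` and `human` are invented for the example.

```python
def max_memory_bytes(address_bits: int) -> int:
    """Number of bytes addressable with the given number of address lines."""
    return 1 << address_bits


def human(n: int) -> str:
    """Format a byte count that is an exact power of 1024 multiple."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024 or unit == "TB":
            return f"{n} {unit}"
        n //= 1024


# Address widths as listed in the spec tables of these articles.
for name, bits in (("8086", 20), ("80286", 24), ("386", 32), ("Pentium Pro (PAE)", 36)):
    print(f"{name:18s} {bits} bits -> {human(max_memory_bytes(bits))}")
```

Each extra address bit doubles the reachable memory, so the four extra bits of PAE multiply the 4 GB limit by 16.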
This processor also had a bit of a performance issue. It ran great in 32-bit applications, but was much slower with software still written in 16 bits (like Windows 95). The cause was simple: access to 16-bit registers caused problems with management of the (32-bit) registers, which canceled out the advantages of the Pentium Pro’s out-of-order architecture.
| Code name | P6 |
| Date released | 1995 |
| Architecture | 32 bits |
| Data bus | 64 bits |
| Address bus | 36 bits |
| Maximum memory | 64 GB |
| L1 cache | 8 KB + 8 KB |
| L2 cache | external, 256-1024 KB (CPU frequency) |
| Clock frequency | 150-200 MHz |
| FSB | 60-66 MHz |
| FPU | built-in |
| SIMD | N/A |
| Fabrication process | 600-350 nm |
| Number of transistors | 5,500,000 + cache |
| Power consumption | 29-47 W |
| Voltage | 3.3 V |
| Die surface area | 306-196 mm² + cache |
| Connector | Socket 8 |
The cache measured 202 mm² (256 KB at 500 nm), 242 mm² (512 KB at 350 nm), or 484 mm² (1 MB at 350 nm). The number of transistors in the cache was 15.5 million (256 KB), 31 million (512 KB), or 62 million (1 MB).
Pentium II and III: Brothers
Released in 1997, the Pentium II was an adaptation of the Pentium Pro aimed at the general public. It was quite similar to the Pentium Pro, but the cache memory was different. Instead of using a cache at the same frequency as the processor (which is expensive), the 512 KB Level 2 cache operated at half-frequency. In addition, the Pentium II abandoned the classic socket for a cartridge containing the processor and the Level 2 cache, which was in the cartridge and not on the motherboard or in the processor itself.
New features compared to the Pentium Pro were essentially MMX (SIMD) support and a doubling of the Level 1 cache. The first Pentium III (Katmai) was very similar to the Pentium II. Released in 1999, its new feature was essentially support for SSE (SIMD instructions), but the rest was identical.
| Code name | Klamath (Pentium II 0.35µ), Deschutes (Pentium II 0.25µ), Katmai (Pentium III) |
| Date released | 1997, 1998, 1999 |
| Architecture | 32 bits |
| Data bus | 64 bits |
| Address bus | 36 bits (32 bits on the P III) |
| Maximum memory | 64 GB (4 GB on the P III) |
| L1 cache | 16 KB + 16 KB |
| L2 cache | external, 512 KB (1/2 CPU frequency) |
| Clock frequency | 233-300 MHz (Klamath), 300-450 MHz (Deschutes), 450-600 MHz (Katmai) |
| FSB | 66-100-133 MHz |
| FPU | built-in |
| SIMD | MMX (SSE) |
| Fabrication process | 350 nm (Klamath), 250 nm (Deschutes, Katmai) |
| Number of transistors | 7,500,000 + cache (Pentium II), 9,500,000 + cache (Pentium III) |
| Power consumption | 25-35 W |
| Voltage | 2.8 V (0.35µ), 2 V (0.25µ) |
| Die surface area | 204 mm² (0.35µ), 131 mm² (0.25µ), 128 mm² (PIII) + cache |
| Connector | Slot 1 |
The Pentium II and III had 512 KB of Level 2 cache (31 million transistors). One Pentium II actually had an on-chip 256 KB Level 2 cache: the Pentium II Mobile Dixon. Using a 180 nm fabrication process, this processor was significantly faster than the desktop versions.
Celeron and Xeon: Intel Aims At The High/Low End
At the end of the 1990s, Intel launched two of its best-known processor brands: Celeron and Xeon. The former was aimed at the budget market and the latter at servers, and sometimes workstations. The first Celeron (Covington) was a Pentium II without a Level 2 cache, and suffered extremely poor performance, whereas the Pentium II Xeon had a large cache. Even now, both brands still exist—Celeron for the entry-level market (generally with a reduced cache and a slower FSB) and Xeon for servers (with a fast FSB, sometimes more cache, and high clock speeds).
Intel quickly added a cache to the Celeron with the Mendocino model (128 KB). The Celeron 300A is famous for its overclocking capacities, able to go 50% or more above its rated clock speed much of the time.
| Code name | Covington, Mendocino | Drake |
| Date released | 1998 | 1998 |
| Architecture | 32 bits | 32 bits |
| Data bus | 64 bits | 64 bits |
| Address bus | 32 bits | 36 bits |
| Maximum memory | 4 GB | 64 GB |
| L1 cache | 16 KB + 16 KB | 16 KB + 16 KB |
| L2 cache | 0 KB/128 KB (internal, CPU frequency) | external, 512 KB-2,048 KB (CPU frequency) |
| Clock frequency | 266-300 MHz/300-533 MHz | 400-450 MHz |
| FSB | 66 MHz | 100 MHz |
| FPU | built in | built in |
| SIMD | MMX | MMX |
| Fabrication process | 250 nm | 250 nm |
| Number of transistors | 7,500,000/19,000,000 | 7,500,000 + cache |
| Power consumption | 16–28 W | 30-46 W |
| Voltage | 2 V | 2 V |
| Die surface area | 131 mm²/154 mm² | 131 mm² + cache |
| Connector | Slot1/Socket 370 PPGA | Slot 2 |
Like the Pentium II, Xeon had an external L2 cache inside the processor cartridge. Its capacity was between 512 KB and 2 MB, and the number of transistors between 31 million and 124 million.
The Pentium III Hits 1 GHz
The Pentium III Coppermine was the first commercial x86 processor from Intel to attain a clock speed of 1 GHz; a 1.13 GHz version was even released, but was quickly taken off the market because it was unstable. This new version of the Pentium III improved the Level 2 cache—now on-die. It was faster than the 512 KB external cache on the first model and was touted as a feature able to speed up the Internet experience. It was released in three versions: server (Xeon), entry-level (Celeron), and mobile (with the first version of SpeedStep).
| Code name | Coppermine |
| Date released | 1999 |
| Architecture | 32 bits |
| Data bus | 64 bits |
| Address bus | 32 bits |
| Maximum memory | 4 GB |
| L1 cache | 16 KB + 16 KB |
| L2 cache | internal, 256 KB (CPU frequency) |
| Clock frequency | 500–1,133 MHz |
| FSB | 100-133 MHz |
| FPU | built in |
| SIMD | MMX, SSE |
| Fabrication process | 180 nm |
| Number of transistors | 28.1 million |
| Power consumption | 25-35 W |
| Voltage | 1.6 V, 1.8 V |
| Die surface area | 106 mm² |
| Connector | Slot 1-Socket 370 FCPGA |
A slightly improved version (Tualatin), with more L2 cache (512 KB) and built on a 130 nm process, was released in 2002. Intended mainly for servers (PIII-S) and mobile devices, it was less common in consumer-level machines.
The Pentium 4: A Lot Of Noise Over Very Little
In November 2000, Intel announced its new processor, the Pentium 4. With a higher clock speed (at least 1,400 MHz), this processor had a major drawback in that its performance wasn’t as good as competing models on a per-clock basis. AMD’s Athlon (and even the Pentium III) performed better at the same frequency. Complicating matters, Intel tried to shift to Rambus’ RDRAM memory (the only memory at the time capable of meeting the requirements of the CPU’s FSB), but failed. Expensive and hot, the Pentium 4 nonetheless managed, with many modifications, to more or less stay in the competition for a few years (by adding L3 cache and technologies like Hyper-Threading).
| Code name | Willamette | Northwood | Prescott |
| Date released | 2000 | 2001 | 2004 |
| Architecture | 32 bits | 32 bits | 32 bits |
| Data bus | 64 bits | 64 bits | 64 bits |
| Address bus | 32 bits | 32 bits | 32 bits |
| Maximum memory | 4 GB | 4 GB | 4 GB |
| L1 cache | 8 KB + 12 Kµops | 8 KB + 12 Kµops | 16 KB + 12 Kµops |
| L2 cache | 256 KB | 512 KB | 1,024 KB |
| Clock frequency | 1.3-2 GHz | 1.8–3.4 GHz | 2.4–3.8 GHz |
| FSB | 400 MHz | 400, 533, 800 MHz | 533, 800 MHz |
| SIMD | MMX, SSE, SSE2 | MMX, SSE, SSE2 | MMX, SSE, SSE2, SSE3 |
| SMT/SMP | no | Hyper-Threading (certain versions) | Hyper-Threading |
| Fabrication process | 180 nm | 130 nm | 90 nm |
| Number of transistors | 42 million | 55 million | 125 million |
| Power consumption | 66-100 W | 54-137 W | 94-151 W |
| Voltage | 1.7 V | 1.55 V | 1.25–1.5 V |
| Die surface area | 217 mm² | 146 mm² | 112 mm² |
| Connector | Socket 423/Socket 478 | Socket 478 | Socket 478/LGA775 |
Mobile versions (with a variable multiplier), Celeron versions (with a smaller L2 cache), and Xeon versions (with an L3 cache) of the Pentium 4 were sold. Hyper-Threading and the L3 cache are two technologies that first appeared on servers and were then adapted to standard processors (though L3 cache was available only on the expensive EE models).
We should also mention the FSB, which was clocked at a fourth of the nominal clock frequency, using what is called Quad Data Rate (QDR) technology—a 400 MHz bus is actually 100 MHz QDR, 533 MHz is 133 MHz QDR, etc. Finally, 64-bit versions of the Pentium 4 appeared in 2005, which we’ll talk about later on.
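The QDR relationship described above is simple arithmetic. As a minimal sketch (the function name `qdr_base_clock_mhz` is purely illustrative and not taken from any Intel documentation), the physical bus clock is the marketed FSB figure divided by four:

```python
def qdr_base_clock_mhz(nominal_fsb_mhz: float) -> float:
    """Return the physical bus clock behind a quad-pumped (QDR) FSB rating."""
    transfers_per_clock = 4  # QDR: four data transfers per clock cycle
    return nominal_fsb_mhz / transfers_per_clock


# The three FSB ratings listed in the Pentium 4 table.
for nominal in (400, 533, 800):
    base = qdr_base_clock_mhz(nominal)
    print(f"{nominal} MHz FSB -> about {base:.0f} MHz bus clock")
```

Note that "533 MHz" is itself a rounded marketing figure, so dividing it by four gives 133.25 rather than exactly 133 MHz.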
AMD ATI Comparison Table
With more and more graphics chips being released every day, it has become very complicated for users who do not follow the video card market to know the differences among all the ATI graphics chips on the market today. To make these differences easier to understand, we have compiled the following table.
It is important to note that, starting in 2007, both ATI and nVidia began quoting the memory clock of their video cards as the real clock rate. In the past, manufacturers quoted memory clocks at double their real clock rate, because DDR and subsequent technologies (DDR2, GDDR3, etc.) allow the memory chip to transfer two pieces of data per clock cycle. So a video card with a memory chip running at 500 MHz would be described as having 1 GHz memory. To keep our table consistent, we still list memory clocks using the DDR naming convention (i.e., double the real clock rate) on cards with memory based on DDR or subsequent technologies.
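As a small illustration of this naming convention (the helper names below are hypothetical, chosen only for this sketch), converting between the real clock and the listed "DDR" clock is a factor of two in either direction:

```python
def listed_ddr_clock_mhz(real_clock_mhz: float) -> float:
    """Clock as listed in the table: double the real clock for DDR-type memory."""
    return real_clock_mhz * 2


def real_memory_clock_mhz(listed_clock_mhz: float) -> float:
    """Recover the real clock from a figure quoted in the DDR convention."""
    return listed_clock_mhz / 2


# The example from the text: a 500 MHz memory chip is listed as 1 GHz.
print(listed_ddr_clock_mhz(500))
print(real_memory_clock_mhz(1000))
```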
| Chip | Core Clock | Memory Clock | Memory Interface | Memory Transfer Rate | Pixels per clock | DirectX |
| Radeon 9200 | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 8.1 |
| Radeon 9200 Pro | 275 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 8.1 |
| Radeon 9200 SE | 200 MHz | 333 MHz | 64-bit | 2.6 GB/s | 4 | 8.1 |
| Radeon 9250 | 240 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 8.1 |
| Radeon 9250 SE | 240 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 8.1 |
| Radeon 9500 | 275 MHz | 540 MHz | 128-bit | 8.6 GB/s | 4 | 9.0 |
| Radeon 9550 | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0 |
| Radeon 9550 SE | 250 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0 |
| Radeon 9500 Pro | 275 MHz | 540 MHz | 128-bit | 8.6 GB/s | 8 | 9.0 |
| Radeon 9600 | 325 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0 |
| Radeon 9600 Pro | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0 |
| Radeon 9600 SE | 325 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0 |
| Radeon 9600 XT | 500 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0 |
| Radeon 9700 | 275 MHz | 540 MHz | 256-bit | 17.2 GB/s | 8 | 9.0 |
| Radeon 9700 Pro | 325 MHz | 620 MHz | 256-bit | 19.8 GB/s | 8 | 9.0 |
| Radeon 9800 | 325 MHz | 580 MHz | 256-bit | 18.56 GB/s | 8 | 9.0 |
| Radeon 9800 Pro | 380 MHz | 680 MHz | 256-bit | 21.7 GB/s | 8 | 9.0 |
| Radeon 9800 SE | 325 MHz | 500 MHz | 128-bit or 256-bit | 8 GB/s or 16 GB/s | 4 | 9.0 |
| Radeon 9800 XT | 412 MHz | 730 MHz | 256-bit | 23.3 GB/s | 8 | 9.0 |
| Radeon X300 SE | 325 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0 |
| Radeon X300 | 325 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0 |
| Radeon X550 | 400 MHz | 500 MHz | 128-bit or 64-bit | 8 GB/s or 4 GB/s | 4 | 9.0 |
| Radeon X600 Pro | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0 |
| Radeon X600 XT | 500 MHz | 730 MHz | 128-bit | 11.68 GB/s | 4 | 9.0 |
| Radeon X700 | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 8 | 9.0 |
| Radeon X700 Pro | 420 MHz | 864 MHz | 128-bit | 13.8 GB/s | 8 | 9.0 |
| Radeon X700 XT | 475 MHz | 1.05 GHz | 128-bit | 16.8 GB/s | 8 | 9.0 |
| Radeon X800 SE | * | * | * | * | 8 | 9.0 |
| Radeon X800 | 400 MHz | 700 MHz | 256-bit | 22.4 GB/s | 12 | 9.0 |
| Radeon X800 XL | 400 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0 |
| Radeon X800 GT | 475 MHz | ** | 128-bit or 256-bit | ** | 8 | 9.0 |
| Radeon X800 GTO | 400 MHz | 1 GHz *** | 256-bit | 32 GB/s | 12 | 9.0 |
| Radeon X800 Pro | 475 MHz | 950 MHz | 256-bit | 30.4 GB/s | 12 | 9.0 |
| Radeon X800 XT | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0 |
| Radeon X800 XT PE | 520 MHz | 1.12 GHz | 256-bit | 35.8 GB/s | 16 | 9.0 |
| Radeon X850 Pro | 520 MHz | 1.08 GHz | 256-bit | 34.56 GB/s | 12 | 9.0 |
| Radeon X850 XT | 520 MHz | 1.08 GHz | 256-bit | 34.56 GB/s | 16 | 9.0 |
| Radeon X850 PE | 540 MHz | 1.18 GHz | 256-bit | 37.76 GB/s | 16 | 9.0 |
| Radeon X1050 | **** | **** | **** | **** | 4 | 9.0c |
| Radeon X1300 HM | 450 MHz | 1 GHz | 128-bit or 64-bit or 32-bit | 16 GB/s or 8 GB/s or 4 GB/s | 4 | 9.0c |
| Radeon X1300 | 450 MHz | 500 MHz | 128-bit or 64-bit or 32-bit | 8 GB/s or 4 GB/s or 2 GB/s | 4 | 9.0c |
| Radeon X1300 Pro | 600 MHz | 800 MHz | 128-bit or 64-bit or 32-bit | 12.8 GB/s or 6.4 GB/s or 3.2 GB/s | 4 | 9.0c |
| Radeon X1300 XT | 500 MHz | 800 MHz (DDR2) or 1 GHz (GDDR3) | 128-bit | 12.8 GB/s or 16 GB/s | 12 | 9.0c |
| Radeon X1550 | 450 MHz or 550 MHz or 600 MHz | 800 MHz | 64-bit or 128-bit | 6.4 GB/s or 12.8 GB/s | 4 | 9.0c |
| Radeon X1600 Pro | 500 MHz or 575 MHz | 780 MHz | 128-bit | 12.48 GB/s | 12 | 9.0c |
| Radeon X1600 XT | 590 MHz | 1.38 GHz | 128-bit | 22.08 GB/s | 12 | 9.0c |
| Radeon X1650 Pro | 600 MHz | 1.40 GHz | 128-bit | 22.40 GB/s | 12 | 9.0c |
| Radeon X1650 XT | 575 MHz | 1.35 GHz | 128-bit | 21.60 GB/s | 24 | 9.0c |
| Radeon X1800 GTO | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c |
| Radeon X1800 XL | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0c |
| Radeon X1800 XT | 625 MHz | 1.5 GHz | 256-bit | 48 GB/s | 16 | 9.0c |
| Radeon X1900 GT | 575 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 36 | 9.0c |
| Radeon X1900 XT | 625 MHz | 1.45 GHz | 256-bit | 46.4 GB/s | 48 | 9.0c |
| Radeon X1900 XTX | 650 MHz | 1.55 GHz | 256-bit | 49.6 GB/s | 48 | 9.0c |
| Radeon X1950 GT | 500 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 36 | 9.0c |
| Radeon X1950 Pro | 575 MHz | 1.38 GHz | 256-bit | 44.16 GB/s | 36 | 9.0c |
| Radeon X1950 XT | 625 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 48 | 9.0c |
| Radeon X1950 XTX | 650 MHz | 2 GHz | 256-bit | 64 GB/s | 48 | 9.0c |
| Radeon HD 2400 Pro | 525 MHz | 800 MHz | 64-bit | 6.4 GB/s | 40 ***** | 10 |
| Radeon HD 2400 XT | 700 MHz | 1.6 GHz | 64-bit | 12.8 GB/s | 40 ***** | 10 |
| Radeon HD 2600 Pro | 600 MHz | 800 MHz | 128-bit | 12.8 GB/s | 120 ***** | 10 |
| Radeon HD 2600 XT | 800 MHz | 1.6 GHz (GDDR3) or 2.2 GHz (GDDR4) | 128-bit | 25.6 GB/s (GDDR3) or 35.2 GB/s (GDDR4) | 120 ***** | 10 |
| Radeon HD 2900 GT | 600 MHz | 1.6 GHz | 256-bit | 51.2 GB/s | 240 ***** | 10 |
| Radeon HD 2900 Pro | 600 MHz | 1.85 GHz | 512-bit | 118.4 GB/s | 320 ***** | 10 |
| Radeon HD 2900 XT | 740 MHz | 1.65 GHz (GDDR3) or 2 GHz (GDDR4) | 512-bit | 105.6 GB/s (GDDR3) or 128 GB/s (GDDR4) | 320 ***** | 10 |
| Radeon HD 3450 ^ | 600 MHz | 1 GHz | 64-bit | 8 GB/s | 40 ***** | 10.1 |
| Radeon HD 3470 ^ | 800 MHz | 1.90 GHz | 64-bit | 15.2 GB/s | 40 ***** | 10.1 |
| Radeon HD 3650 ^ | 725 MHz | 1 GHz (DDR2) or 1.6 GHz (GDDR3) | 128-bit | 16 GB/s (DDR2) or 25.6 GB/s (GDDR3) | 120 ***** | 10.1 |
| Radeon HD 3690 ^ | 668 MHz | 1,656 MHz | 128-bit | 26.5 GB/s | 120 ***** | 10.1 |
| Radeon HD 3850 ^ | 670 MHz | 1.66 GHz | 256-bit | 53.12 GB/s | 320 ***** | 10.1 |
| Radeon HD 3870 ^ | 775 MHz | 2.25 GHz | 256-bit | 72 GB/s | 320 ***** | 10.1 |
| Radeon HD 3870 X2 ^ + | 825 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 320 ***** | 10.1 |
| Radeon HD 4350 ^ | 600 MHz | 1 GHz | 64-bit | 8 GB/s | 80 ***** | 10.1 |
| Radeon HD 4550 ^ | 800 MHz | 1.6 GHz | 64-bit | 12.8 GB/s | 80 ***** | 10.1 |
| Radeon HD 4650 ^ | 600 MHz | 1 GHz or 1.4 GHz | 128-bit | 16 GB/s or 22.4 GB/s | 320 ***** | 10.1 |
| Radeon HD 4670 ^ | 750 MHz | 2 GHz (512 MB) or 1,746 MHz (1 GB) | 128-bit | 32 GB/s or 27.94 GB/s | 320 ***** | 10.1 |
| Radeon HD 4830 ^ | 575 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 640 ***** | 10.1 |
| Radeon HD 4850 ^ | 625 MHz | 2 GHz | 256-bit | 64 GB/s | 800 ***** | 10.1 |
| Radeon HD 4850 X2 ^ + | 625 MHz | 2 GHz | 256-bit | 64 GB/s | 800 ***** | 10.1 |
| Radeon HD 4870 ^ | 750 MHz | 3.6 GHz | 256-bit | 115.2 GB/s | 800 ***** | 10.1 |
| Radeon HD 4870 X2 ^ + | 750 MHz | 3.6 GHz | 256-bit | 115.2 GB/s | 800 ***** | 10.1 |
* ATI doesn’t set a default clock for the Radeon X800 SE chip. The specs depend on the video card manufacturer, so you have to take care when comparing video cards using this chip.
** Depends on the model. There are boards based on the Radeon X800 GT using DDR, DDR2 and GDDR3 memory running at different speeds. We’ve seen GDDR3 models running at 980 MHz and DDR models running at 700 MHz. You can calculate the memory transfer rate using the formula memory clock x number of bits / 8. A model with GDDR3 memory running at 980 MHz on a 256-bit interface has a transfer rate of 31.36 GB/s.
*** There are models using DDR memory and running at lower clock rates.
**** There are three video card versions using this chip with very different specs, depending on the memory chips used. If they are 128 MB DDR, then the graphics chip runs at 400 MHz, the memory runs at 500 MHz, a 128-bit memory interface is used and the memory has a maximum theoretical transfer rate of 8 GB/s. If the card has 128 MB DDR2, then the graphics chip runs at 325 MHz, the memory runs at 666 MHz, a 64-bit memory interface is used and the memory has a maximum theoretical transfer rate of 5.3 GB/s. And finally, if the card has 256 MB DDR2, then the graphics chip runs at 400 MHz, the memory runs at 666 MHz, a 128-bit memory interface is used and the memory has a maximum theoretical transfer rate of 10.6 GB/s.
***** The shader unit is unified, meaning that this chip doesn’t have separate pixel shader and vertex shader units. On video cards from the Radeon HD 2400 and HD 2600 series, the video card manufacturer can use a different clock for the memory (usually lower, thus achieving lower performance than the reference model); the clock rates published here are the official ones.
^ Based on PCI Express 2.0, which doubles the per-lane signaling rate from 2.5 GT/s to 5 GT/s if a PCI Express 2.0 motherboard is used.
+ Radeon HD 3870 X2, Radeon HD 4850 X2 and Radeon HD 4870 X2 use two Radeon chips working in parallel (CrossFire). The specs published are for only one of the chips.
When you compare chips, you have to be very careful. Judging from the table, a Radeon 9800 may seem slower than a Radeon 9600 Pro, since its clock is lower, and a Radeon X700 Pro may seem faster than a Radeon X800, since it uses a higher clock rate.
However, the Radeon 9800 accesses its memory through a 256-bit interface and processes eight pixels per clock pulse, while the Radeon 9600 Pro uses a 128-bit interface and processes four pixels per clock pulse. This means that the memory access and processing performance of the Radeon 9800 would be double that of the Radeon 9600 Pro if both ran at the same clocks. In other words, a Radeon 9600 Pro would have to run at 650 MHz and access its memory at 1,160 MHz to match the performance of the Radeon 9800 (325 MHz core, 580 MHz memory).
The same idea applies to the Radeon X700 Pro example: it accesses memory through a 128-bit interface and processes 8 pixels per clock tick, while the Radeon X800 uses a 256-bit interface and processes 12 pixels per clock tick.
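The two comparisons above can be checked with a few lines of arithmetic. This is only a sketch built on the table's figures; the function names are made up for illustration, and "pixel throughput" here means nothing more than core clock multiplied by pixels per clock:

```python
def pixel_throughput(core_clock_mhz: float, pixels_per_clock: int) -> float:
    """Millions of pixels processed per second: clock x pixels per clock."""
    return core_clock_mhz * pixels_per_clock


def clock_needed_to_match(reference_clock_mhz: float,
                          reference_width: int,
                          own_width: int) -> float:
    """Clock a narrower chip needs to equal a wider chip's throughput.

    'Width' is either pixels per clock (for the core) or
    memory interface bits (for the memory).
    """
    return reference_clock_mhz * reference_width / own_width


# Radeon 9800 (325 MHz, 8 pixels) vs. Radeon 9600 Pro (400 MHz, 4 pixels)
print(pixel_throughput(325, 8), pixel_throughput(400, 4))

# Clocks a Radeon 9600 Pro would need to match a Radeon 9800
print(clock_needed_to_match(325, 8, 4))      # core clock, MHz
print(clock_needed_to_match(580, 256, 128))  # memory clock, MHz

# Radeon X700 Pro (420 MHz, 8 pixels) vs. Radeon X800 (400 MHz, 12 pixels)
print(pixel_throughput(420, 8), pixel_throughput(400, 12))
```

With the table's values, the Radeon 9800 and Radeon X800 come out ahead despite their lower core clocks.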
Therefore, it is not correct to compare graphics chips by their clocks alone. For processing performance, you have to compare both the clocks and the number of pixels per clock. As for memory, the right way to compare performance among different chips is through their memory transfer rate, which is calculated using the formula (clock x bits per clock) / 8, where "bits per clock" is the width of the memory interface.
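The memory transfer rate formula can be written as a short function. This is a minimal sketch; the function name is an assumption, the clock is the one listed in the table (DDR naming convention), and the result is in GB/s with 1 GB taken as 1,000 MB, which is how the table's figures work out:

```python
def memory_transfer_rate_gbs(memory_clock_mhz: float, interface_bits: int) -> float:
    """Theoretical peak transfer rate: (clock x bits per clock) / 8, in GB/s."""
    bytes_per_transfer = interface_bits / 8          # bits -> bytes
    return memory_clock_mhz * bytes_per_transfer / 1000  # MB/s -> GB/s


# Radeon 9200: 400 MHz on a 128-bit interface
print(memory_transfer_rate_gbs(400, 128))
# Radeon X800 GT with GDDR3 at 980 MHz on a 256-bit interface
print(memory_transfer_rate_gbs(980, 256))
# Radeon HD 3870: 2.25 GHz on a 256-bit interface
print(memory_transfer_rate_gbs(2250, 256))
```

These are theoretical maximums; real-world throughput depends on many other factors.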
As you can see in the table, “SE” chips are the simplest and typically access memory only 64 bits at a time. Another detail is that ATI uses the letters “XT” to indicate the fastest chip in a series, while its competitor, nVidia, uses the same letters to indicate the simplest chip in a series.
“PE” stands for “Platinum Edition” and designates models even faster than the “XT” models, aimed at gamers with money.
As for the DirectX version, check the table below:
| DirectX | Shader Model |
| 7.0 | No |
| 8.1 | 1.4 |
| 9.0 | 2.0 |
| 9.0c | 3.0 |
| 10 | 4.0 |
| 10.1 | 4.1 |
Nvidia Chips Comparison
| Chip | Core Clock | Memory Clock | Memory Interface | Memory Transfer Rate | Pixels per clock | DirectX |
| GeForce 4 MX 440 AGP 8x | 275 MHz | 512 MHz | 128-bit | 8.1 GB/s | 2 | 7 |
| GeForce MX 4000 | 250 MHz | * | 32-bit or 64-bit or 128-bit | * | 2 | 7 |
| GeForce FX 5200 | 250 MHz | 400 MHz | 64-bit or 128-bit | 3.2 GB/s or 6.4 GB/s | 4 | 9.0 |
| GeForce FX 5200 Ultra | 350 MHz | 650 MHz | 128-bit | 10.4 GB/s | 4 | 9.0 |
| GeForce FX 5600 | 325 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 9.0 |
| GeForce FX 5500 | 270 MHz | 400 MHz | 64-bit or 128-bit | 3.2 GB/s or 6.4 GB/s | 4 | 9.0 |
| GeForce FX 5600 Ultra | 500 MHz | 800 MHz | 128-bit | 12.8 GB/s | 4 | 9.0 |
| GeForce FX 5700 LE | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0 |
| GeForce FX 5700 | 425 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0 |
| GeForce FX 5700 Ultra | 475 MHz | 900 MHz | 128-bit | 14.4 GB/s | 4 | 9.0 |
| GeForce FX 5800 | 400 MHz | 900 MHz | 128-bit | 14.4 GB/s | 8 | 9.0 |
| GeForce FX 5800 Ultra | 500 MHz | 1 GHz | 128-bit | 16 GB/s | 8 | 9.0 |
| GeForce FX 5900 XT | 390 MHz | 680 MHz | 256-bit | 21.7 GB/s | 8 | 9.0 |
| GeForce FX 5900 | 400 MHz | 850 MHz | 256-bit | 27.2 GB/s | 8 | 9.0 |
| GeForce FX 5900 Ultra | 450 MHz | 850 MHz | 256-bit | 27.2 GB/s | 8 | 9.0 |
| GeForce FX 5950 Ultra | 475 MHz | 950 MHz | 256-bit | 30.4 GB/s | 8 | 9.0 |
| GeForce PCX 5300 | 325 MHz | 650 MHz | 128-bit | 10.4 GB/s | 4 | 9.0 |
| GeForce PCX 5750 | 475 MHz | 900 MHz | 128-bit | 14.4 GB/s | 4 | 9.0 |
| GeForce PCX 5900 | 350 MHz | 550 MHz | 256-bit | 17.6 GB/s | 8 | 9.0 |
| GeForce PCX 5950 | 475 MHz | 950 MHz | 256-bit | 30.4 GB/s | 8 | 9.0 |
| GeForce 6200 | 300 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 9.0c |
| GeForce 6200 LE | 350 MHz | 550 MHz | 64-bit | 4.4 GB/s | 2 | 9.0c |
| GeForce 6200 (TC) | 350 MHz | 666 MHz * | 32-bit or 64-bit | 2.66 GB/s or 5.32 GB/s * | 4 | 9.0c |
| GeForce 6500 (TC) | 400 MHz | 666 MHz * | 32-bit or 64-bit | 2.66 GB/s or 5.32 GB/s * | 4 | 9.0c |
| GeForce 6600 | 300 MHz | 550 MHz * | 64-bit or 128-bit | 4.4 GB/s or 8.8 GB/s * | 8 | 9.0c |
| GeForce 6600 DDR2 | 350 MHz | 800 MHz * | 128-bit | 12.8 GB/s * | 8 | 9.0c |
| GeForce 6600 LE | 300 MHz | * | 64-bit or 128-bit | * | 4 | 9.0c |
| GeForce 6600 GT | 500 MHz | 1 GHz | 128-bit | 16 GB/s | 8 | 9.0c |
| GeForce 6600 GT AGP | 500 MHz | 900 MHz | 128-bit | 14.4 GB/s | 8 | 9.0c |
| GeForce 6800 LE | 300 MHz | 700 MHz | 256-bit | 22.4 GB/s | 8 | 9.0c |
| GeForce 6800 XT | 325 MHz | 600 MHz | 256-bit | 19.2 GB/s | 8 | 9.0c |
| GeForce 6800 XT AGP | 325 MHz | 700 MHz | 256-bit | 22.4 GB/s | 8 | 9.0c |
| GeForce 6800 | 325 MHz | 600 MHz | 256-bit | 19.2 GB/s | 12 | 9.0c |
| GeForce 6800 AGP | 325 MHz | 700 MHz | 256-bit | 22.4 GB/s | 12 | 9.0c |
| GeForce 6800 GS | 425 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c |
| GeForce 6800 GS AGP | 350 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c |
| GeForce 6800 GT | 350 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0c |
| GeForce 6800 Ultra | 400 MHz | 1.1 GHz | 256-bit | 35.2 GB/s | 16 | 9.0c |
| GeForce 6800 Ultra Extreme | 450 MHz | 1.1 GHz | 256-bit | 35.2 GB/s | 16 | 9.0c |
| GeForce 7100 GS (TC) | 350 MHz | 666 MHz * | 64-bit | 5.3 GB/s * | 4 | 9.0c |
| GeForce 7200 GS (TC) | 450 MHz | 800 MHz * | 64-bit | 6.4 GB/s * | 4 | 9.0c |
| GeForce 7300 SE (TC) | 225 MHz | * | 64-bit | * | 4 | 9.0c |
| GeForce 7300 LE (TC) | 450 MHz | 648 MHz * | 64-bit | 5.2 GB/s * | 4 | 9.0c |
| GeForce 7300 GS (TC) | 550 MHz | 810 MHz * | 64-bit | 6.5 GB/s * | 4 | 9.0c |
| GeForce 7300 GT (TC) | 350 MHz | 667 MHz | 128-bit | 10.6 GB/s | 8 | 9.0c |
| GeForce 7600 GS | 400 MHz | 800 MHz | 128-bit | 12.8 GB/s | 12 | 9.0c |
| GeForce 7600 GT | 560 MHz | 1.4 GHz | 128-bit | 22.4 GB/s | 12 | 9.0c |
| GeForce 7800 GS | 375 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 16 | 9.0c |
| GeForce 7800 GT | 400 MHz | 1 GHz | 256-bit | 32 GB/s | 20 | 9.0c |
| GeForce 7800 GTX | 430 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 24 | 9.0c |
| GeForce 7800 GTX 512 | 550 MHz | 1.7 GHz | 256-bit | 54.4 GB/s | 24 | 9.0c |
| GeForce 7900 GS | 450 MHz | 1.32 GHz | 256-bit | 42.2 GB/s | 20 | 9.0c |
| GeForce 7900 GT | 450 MHz | 1.32 GHz | 256-bit | 42.2 GB/s | 24 | 9.0c |
| GeForce 7900 GTX | 650 MHz | 1.6 GHz | 256-bit | 51.2 GB/s | 24 | 9.0c |
| GeForce 7950 GT | 550 MHz | 1.4 GHz | 256-bit | 44.8 GB/s | 24 | 9.0c |
| GeForce 7950 GX2 ** | 500 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 24 | 9.0c |
| GeForce 8400 GS *** | 450 MHz / 900 MHz | 800 MHz | 64-bit | 6.4 GB/s | 16 | 10 |
| GeForce 8500 GT *** | 450 MHz / 900 MHz | 666 MHz or 800 MHz | 128-bit | 10.6 GB/s or 12.8 GB/s | 16 | 10 |
| GeForce 8600 GT DDR2 *** | 540 MHz / 1.18 GHz | 666 MHz or 800 MHz | 128-bit | 10.6 GB/s or 12.8 GB/s | 32 | 10 |
| GeForce 8600 GT GDDR3 *** | 540 MHz / 1.18 GHz | 1.4 GHz | 128-bit | 22.4 GB/s | 32 | 10 |
| GeForce 8600 GTS *** | 675 MHz / 1.45 GHz | 2 GHz | 128-bit | 32 GB/s | 32 | 10 |
| GeForce 8800 GS *** ^ | 550 MHz / 1,375 MHz | 1.6 GHz | 192-bit | 38.4 GB/s | 96 | 10 |
| GeForce 8800 GT *** ^ | 600 MHz / 1.5 GHz | 1.8 GHz | 256-bit | 57.6 GB/s | 112 | 10 |
| GeForce 8800 GTS *** | 500 MHz / 1.2 GHz | 1.6 GHz | 320-bit | 64 GB/s | 96 | 10 |
| GeForce 8800 GTS 512 *** ^ | 650 MHz / 1,625 MHz | 1.94 GHz | 256-bit | 62.08 GB/s | 128 | 10 |
| GeForce 8800 GTX *** | 575 MHz / 1.35 GHz | 1.8 GHz | 384-bit | 86.4 GB/s | 128 | 10 |
| GeForce 8800 Ultra *** | 612 MHz / 1.5 GHz | 2.16 GHz | 384-bit | 103.6 GB/s | 128 | 10 |
| GeForce 9400 GT *** ^ | 550 MHz / 1.4 GHz | 800 MHz | 128-bit | 12.8 GB/s | 16 | 10 |
| GeForce 9500 GT *** ^ | 550 MHz / 1.4 GHz | 1 GHz (DDR2) or 1.6 GHz (GDDR3) | 128-bit | 16 GB/s (DDR2) or 25.6 GB/s (GDDR3) | 32 | 10 |
| GeForce 9600 GSO *** ^ | 550 MHz / 1.35 GHz | 1.6 GHz | 192-bit | 38.4 GB/s | 96 | 10 |
| GeForce 9600 GT *** ^ | 650 MHz / 1,625 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 64 | 10 |
| GeForce 9800 GT *** ^ | 600 MHz / 1.5 GHz | 1.8 GHz | 256-bit | 57.6 GB/s | 112 | 10 |
| GeForce 9800 GTX *** ^ | 675 MHz / 1,688 MHz | 2.2 GHz | 256-bit | 70.4 GB/s | 128 | 10 |
| GeForce 9800 GTX+ *** ^ | 738 MHz / 1,836 MHz | 2.2 GHz | 256-bit | 70.4 GB/s | 128 | 10 |
| GeForce 9800 GX2 ** *** ^ | 600 MHz / 1.5 GHz | 2 GHz | 256-bit | 64 GB/s | 128 | 10 |
| GeForce GTX 260 *** ^ | 576 MHz / 1,242 MHz | 2 GHz | 448-bit | 112 GB/s | 192 | 10 |
| GeForce GTX 280 *** ^ | 602 MHz / 1,296 MHz | 2.21 GHz | 512-bit | 141.7 GB/s | 240 | 10 |
* The manufacturer can set up a different memory clock rate or interface, so pay attention, because not all video cards based on this chip have this spec. The memory transfer rate will depend on the interface and clock rate used. See how to calculate it below.
** GeForce 7950 GX2 and GeForce 9800 GX2 use two graphics processors in parallel (SLI mode). The specs published are for just one of the chips.
*** The GeForce 8, 9 and 200 series use two clocks: the higher one is used by the shader unit and the lower one by the rest of the chip. The shader unit is unified, meaning that these chips don’t have separate pixel shader and vertex shader units.
^ Based on PCI Express 2.0, which doubles the per-lane signaling rate from 2.5 GT/s to 5 GT/s if a PCI Express 2.0 motherboard is used.
(TC) means TurboCache. TurboCache is a technology that allows the video card to simulate more video memory by using part of the main system RAM as video memory.
At first, nVidia’s profusion of letters may seem confusing. The GeForce FX 5700 Ultra chip runs at a higher clock than the GeForce FX 5900, GeForce FX 5900 Ultra and GeForce FX 5900 XT chips, which may make you think that the GeForce FX 5700 Ultra is faster than the chips in the 5900 series.
But that is not really so. Chips from the GeForce FX 5900 series access memory 256 bits at a time, while chips in the FX 5700 series access it 128 bits at a time. That makes the memory access of the 5900 series twice as fast at the same clock. For instance, the GeForce FX 5700 Ultra would have to access its memory at 1,700 MHz (double the 850 MHz memory clock of the GeForce FX 5900 Ultra) to reach the memory performance of the GeForce FX 5900 Ultra.
Another example: from the table, you may think the GeForce 6600 GT is faster than the GeForce 6800 because it has a higher clock rate (500 MHz against 325 MHz). But the GeForce 6800 accesses memory 256 bits at a time while the GeForce 6600 GT accesses memory 128 bits at a time, and the GeForce 6800 processes 12 pixels per clock tick, while the GeForce 6600 GT processes eight.
The right way to compare the memory performance of different chips is through their memory transfer rate, which is calculated using the formula (clock x bits per clock) / 8.
Another difference is the graphics processor of the FX 5900 series, which processes eight pixels per clock pulse, while the graphics chip in the other FX series processes only four. In other words, despite its higher clock, the graphics processing performance of the GeForce FX 5700 Ultra is inferior to that of the chips in the FX 5900 series, as they process twice as many pixels when working at the same clock (simply put, the GeForce FX 5700 Ultra would have to run at roughly twice its clock to match the performance of the GeForce FX 5900 Ultra).
Therefore, it is not correct to compare graphics chips by their clocks alone.
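To make the FX 5700 Ultra versus FX 5900 comparison concrete, here is a rough sketch that ranks the four chips by core clock and by the two derived figures discussed above. The `Chip` class and its property names are assumptions made for this illustration; the numbers are copied from the table:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    core_mhz: int
    memory_mhz: int        # as listed in the table (DDR convention)
    interface_bits: int
    pixels_per_clock: int

    @property
    def pixel_rate(self) -> int:
        """Millions of pixels per second: core clock x pixels per clock."""
        return self.core_mhz * self.pixels_per_clock

    @property
    def memory_rate_gbs(self) -> float:
        """Memory transfer rate in GB/s: (clock x bits per clock) / 8."""
        return self.memory_mhz * self.interface_bits / 8 / 1000


chips = [
    Chip("GeForce FX 5700 Ultra", 475, 900, 128, 4),
    Chip("GeForce FX 5900 XT", 390, 680, 256, 8),
    Chip("GeForce FX 5900", 400, 850, 256, 8),
    Chip("GeForce FX 5900 Ultra", 450, 850, 256, 8),
]

highest_clock = max(chips, key=lambda c: c.core_mhz)
lowest_pixel_rate = min(chips, key=lambda c: c.pixel_rate)
lowest_memory_rate = min(chips, key=lambda c: c.memory_rate_gbs)

print("Highest core clock:  ", highest_clock.name)
print("Lowest pixel rate:   ", lowest_pixel_rate.name)
print("Lowest memory rate:  ", lowest_memory_rate.name)
```

The same chip has the highest core clock and the lowest figures on both derived measures, which is the point of the comparison.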
We must be careful with the GeForce FX 5900 XT, too. While ATI uses the letters “XT” to indicate high-end chips (e.g., Radeon 9800 XT), nVidia uses the same letters to indicate the low-end chips of a series (see table).
You have to be very careful with low-end video cards using nVidia chips, because they can use clock rates and memory interfaces different from those in the table. For example, you can find the GeForce FX 5200, GeForce FX 5500 and GeForce 6600 with either a 64-bit or a 128-bit interface. We’ve even seen the GeForce FX 5200, GeForce FX 5500 and GeForce 6200 with a 32-bit interface on the market!


























