Archive for November 9, 2008

Imitation To Innovation: AMD’s Best CPUs – Part 2

Duron and Sempron: AMD’s Celerons

CPU makers seem to like names that end in “on.” To compete with the Celeron and back up its Athlon, AMD released the Duron, later replaced by the Sempron. These two budget processors were generally slower than the Athlon and had less cache memory. AMD’s exclusive cache design enabled CPUs with an L2 cache that was smaller than the L1, since the latter was not mirrored in the L2 (unlike the inclusive architecture used by Intel). The Sempron is simply a re-named Athlon XP, with certain versions equipped with less cache memory (256 of 512 KB are disabled in the Thorton).
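The difference between the two cache policies can be put into numbers with a rough sketch. The function name and the simplified capacity model below are illustrative assumptions, not anything taken from AMD or Intel documentation; the sizes come from the Duron figures in this article.

```python
def unique_capacity_kb(l1_kb: int, l2_kb: int, exclusive: bool) -> int:
    """Amount of distinct data (in KB) that L1 and L2 can hold together.

    Simplified model (an assumption for illustration only).
    """
    if exclusive:
        # Exclusive: a cache line sits in L1 or in L2, never in both,
        # so the two capacities simply add up.
        return l1_kb + l2_kb
    # Inclusive: everything in L1 is mirrored in L2, so L2 must be at
    # least as large as L1 and L1 adds no unique capacity.
    if l2_kb < l1_kb:
        raise ValueError("an inclusive L2 cannot be smaller than the L1 it mirrors")
    return l2_kb


duron_l1_kb = 64 + 64  # instruction cache + data cache
print(unique_capacity_kb(duron_l1_kb, 64, exclusive=True))
```

Under this model the Duron's 64 KB L2 still adds capacity on top of its 128 KB of L1, whereas the same sizes would make no sense with an inclusive design.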

AMD Duron and Sempron
Model | Duron | Sempron
Code name | Spitfire | Thorton
Date released | 2000 | 2004
Architecture | 32 bits | 32 bits
Data bus | 32 bits | 32 bits
Address bus | 32 bits | 32 bits
Maximum memory | 4,096 MB | 4,096 MB
L1 cache | 64 KB + 64 KB | 64 KB + 64 KB
L2 cache | 64 KB (CPU frequency) | 256 KB (CPU frequency)
Clock frequency | 600-950 MHz | 1,500-2,000 MHz
FSB | 100 MHz (DDR) | 166 MHz (DDR)
FPU | built-in | built-in
SIMD | MMX, Enhanced 3DNow! | MMX, Enhanced 3DNow!, SSE
Fabrication process | 180 nm | 130 nm
Number of transistors | 25 million | 54.3 million
Power consumption | 27-41 W | 62 W
Voltage | 1.5–1.6 V | 1.6 V
Die surface area | 100 mm² | 100.99 mm²
Connector | Socket A | Socket A

In addition to the Spitfire, AMD also released the Duron Morgan (based on the Athlon XP, with SSE support) and the Applebred (130 nm). The Sempron continued its career with the K8 Sempron 3400+, which is a 64-bit Sempron.

The K8: AMD Moves To 64 Bits

K8 was the first x86 processor compatible with 64-bit addressing. The architecture had other advantages such as an integrated memory controller. AMD has released a veritable army of K8-based processors since then, but we’ll concentrate on the models intended for the mainstream: the Athlon 64s. In practice, the Opteron (the server version), Athlon 64 FX (high-end) and Turion 64 (for mobile PCs) are very closely related. In general, they differ only in the management of the memory controller and cache memory, plus the type of memory used.

AMD Athlon 64
Code name | ClawHammer | Orleans
Date released | 2003 | 2006
Architecture | 64 bits | 64 bits
Data bus | 64 bits | 64 bits
Address bus | 64 bits | 64 bits
Maximum memory | 1 TB | 1 TB
L1 cache | 64 KB + 64 KB | 64 KB + 64 KB
L2 cache | 1,024 KB (CPU frequency) | 512 KB (CPU frequency)
Clock frequency | 1,800-2,400 MHz | 1,800-2,600 MHz
Memory controller | DDR-400, 1 channel | DDR2-667, 2 channels
FSB | 800 MHz (HTT) | 1,000 MHz (HTT)
FPU | built-in | built-in
SIMD | MMX, Enhanced 3DNow!, SSE, SSE2 | MMX, Enhanced 3DNow!, SSE, SSE2, SSE3
Fabrication process | 130 nm | 90 nm
Number of transistors | 105.9 million | 81.1 million
Power consumption | 89 W (TDP) | 62 W (TDP)
Voltage | 1.5 V | 1.25-1.4 V
Die surface area | 193 mm² | 103 mm²
Connector | Socket 754 | Socket AM2

Athlon 64 processors still use a PR number to indicate their ranking in the product range and there are many different versions, which generally differ in terms of cache memory and/or fabrication process. We highlighted only two models, but there are a dozen or so different K8 versions for the standard Athlon 64 alone.

Athlon 64 X2: AMD’s Dual-Core

In 2005, AMD changed its architecture to offer a dual-core version of the K8, and the Athlon 64 X2 was born. Though made up of two K8 cores, the architecture—using a HyperTransport interface—enabled good performance, unlike the solution used by Intel, with the FSB handling communication between the CPUs in its first dual-core processors. The Athlon 64 X2 exists in different sockets and is still on the market (as of August 2008) as an entry-level solution.

AMD Athlon 64 X2
Code name | Toledo | Brisbane
Date released | 2005 | 2006
Architecture | 64 bits | 64 bits
Data bus | 64 bits | 64 bits
Address bus | 64 bits | 64 bits
Maximum memory | 1 TB | 1 TB
L1 cache | 64 KB + 64 KB x 2 | 64 KB + 64 KB x 2
L2 cache | 1,024 KB x 2 (CPU frequency) | 512 KB x 2 (CPU frequency)
Clock frequency | 2,200-2,400 MHz | 1,900-3,100 MHz
Memory controller | DDR-400, 2 channels | DDR2-800, 2 channels
FSB | 1,000 MHz (HTT) | 1,000 MHz (HTT)
FPU | built-in | built-in
SIMD | MMX, Enhanced 3DNow!, SSE, SSE2, SSE3 | MMX, Enhanced 3DNow!, SSE, SSE2, SSE3
Fabrication process | 90 nm | 65 nm
Number of transistors | 233.2 million | 153 million
Power consumption | 89/110 W (TDP) | 65/89 W (TDP)
Voltage | 1.35–1.4 V | 1.25–1.35 V
Die surface area | 199 mm² | 126 mm²
Connector | Socket 939 | Socket AM2

As with the Athlon 64, we’re only showing two versions of the K8, though other versions exist. Obviously there are server versions (Opteron), high-end versions (Athlon 64 FX) and mobile versions (Turion 64 X2), and also entry-level versions in the form of the Sempron X2. One final anecdote: AMD got away with using the same code name for a processor as Intel had used: the Santa Rosa (a dual-core Opteron manufactured on a 90 nm process).

The Future Lies With Phenom

Now, in mid-2008, it’s no secret that AMD’s CPUs are struggling to keep up on the performance front. But there are a few promising prospects. Early tests of the 45 nm Phenom show interesting results and the Fusion, a cross between a GPU and a CPU, seems to be making progress as well.

Let’s hope that AMD’s financial problems are only temporary, and that they’ll be around for many more years to compete with Intel in the x86 processor arena.

November 9, 2008 at 12:25 pm

Intel’s 15 Most Unforgettable x86 CPUs-Part 2

Pentium M: Laptops Flex Their Muscles

In 2003 the market for portable PCs was booming and Intel had only two processors for them: the aging Pentium III Tualatin and the Pentium 4, whose high power consumption made it unsuitable. But a savior was to arrive from Israel: the Banias (alias Pentium M). This processor, based on the P6 architecture (the same as the Pentium Pro) had high performance and low power consumption. It even beat the Pentium 4, while consuming a lot less power. This was the processor used in the Centrino platform and it was quickly followed (in 2004) by the (faster) Dothan model. The Pentium M left its mark on the world of mobility, and the Stealey (A100) still uses the Dothan architecture (with lower frequencies and TDP).

Intel Pentium M
Code name | Banias | Dothan
Date released | 2003 | 2004
Architecture | 32 bits | 32 bits
Data bus | 64 bits | 64 bits
Address bus | 32 bits | 32 bits
Maximum memory | 4 GB | 4 GB
L1 cache | 32 KB + 32 KB | 32 KB + 32 KB
L2 cache | 1,024 KB | 2,048 KB
Clock frequency | 0.9–1.7 GHz | 1–2.13 GHz
FSB | 400 MHz | 400, 533 MHz
SIMD | MMX, SSE, SSE2 | MMX, SSE, SSE2
SMT/SMP | no | no
Fabrication process | 130 nm | 90 nm
Number of transistors | 77 million | 140 million
Power consumption | 9-30 W | 6-35 W
Voltage | 0.9–1.5 V | 0.9–1.4 V
Die surface area | 82 mm² | 87 mm²
Connector | Socket 479 | Socket 479

As with the Pentium 4, the FSB actually operated at a quarter of the nominal frequency (QDR). The connector used, the Socket 479, actually had 478 pins, but they were arranged differently from the Pentium 4 Socket 478 (though adapters were made).
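The relationship between a nominal FSB figure and the underlying bus clock is simple arithmetic, sketched below. The function names are made up for illustration; the only inputs are the FSB values listed in this article and the transfers-per-clock factors it mentions (four for QDR, two for DDR).

```python
def bus_clock_mhz(nominal_mhz: float, transfers_per_clock: int = 4) -> float:
    """Physical bus clock behind a marketed (nominal) FSB figure."""
    return nominal_mhz / transfers_per_clock


def nominal_rate_mhz(clock_mhz: float, transfers_per_clock: int) -> float:
    """Marketed transfer rate for a given physical bus clock."""
    return clock_mhz * transfers_per_clock


# QDR buses move four transfers per clock cycle.
print(bus_clock_mhz(400))  # Pentium M "400 MHz" FSB
print(bus_clock_mhz(800))  # Pentium 4 "800 MHz" FSB

# The AMD tables list the physical clock with a "(DDR)" note instead:
# a 100 MHz DDR bus performs two transfers per clock cycle.
print(nominal_rate_mhz(100, 2))
```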

Pentium 4 Gets 64-bit And Another Core

In 2005, Intel improved its Pentium 4 twice. First, with the Prescott-2M, and then with Smithfield. The former was a 64-bit processor, based on the Prescott design, and the latter was a dual-core processor. They are fairly similar and have the same problems as other Pentium 4s: low instructions per cycle (IPC) throughput and difficulty in increasing the clock frequency due to current losses. These two processors, intended to limit losses while awaiting the Core 2 Duo, are not among Intel’s most highly regarded. And while the Pentium D (the commercial name of the Smithfield) does have two cores, in reality it’s an assembly of two Prescott dies in the same package.

Intel Pentium 4
Code name | Prescott-2M | Smithfield
Date released | 2005 | 2005
Architecture | 64 bits | 64 bits
Data bus | 64 bits | 64 bits
Address bus | 64 (actual 36) bits | 64 (actual 36) bits
Maximum memory | 64 GB | 64 GB
L1 cache | 16 KB + 12 Kµops | 2 x 16 KB + 12 Kµops
L2 cache | 2,048 KB | 2 x 1,024 KB
Clock frequency | 3–3.6 GHz | 2.8–3.2 GHz
FSB | 800 MHz | 800 MHz
SIMD | MMX, SSE, SSE2, SSE3 | MMX, SSE, SSE2, SSE3
SMT/SMP | Hyper-Threading | dual cores (Hyper-Threading on certain models)
Fabrication process | 90 nm | 90 nm
Number of transistors | 169 million | 230 million
TDP | 84-115 W | 95-130 W
Voltage | 1.2 V | 1.2 V
Die surface area | 135 mm² | 206 mm²
Connector | LGA775 | LGA775

An interesting point is that whereas the Pentium 4 processors intended for the consumer market did not use the PAE technology (which enables 36-bit, as opposed to 32-bit memory management) and were therefore limited to 4 GB of RAM, these models can go beyond that limit. In practice, the address bus is still limited to 36 bits (40 bits on the Xeon), but PAE (management in 4 GB pages) is now ancient history—a 64-bit program is capable of making full use of the available memory.

Hyper-Threading, an Intel SMT technology, was available on certain models (Xeon and Extreme Edition). Finally, a 65 nm version (the 9×0 series) of the Pentium 4 was released later, but made no major improvements.

The First Mobile Dual-Core

In 2006, Intel announced the Core Duo. The first dual-core processor for portable PCs boasted excellent performance—much better than the Pentium 4. It was also one of the first x86 processors to be truly dual-core. The cache, for example, is shared (whereas the Pentium D was more like an assembly of two processors in the same package). This processor was part of the Centrino Duo platform and was a huge success. The only drawback was that it was still a 32-bit processor, unlike the Pentium 4.

Intel Core Duo
Code name Yonah
Date released 2006
Architecture 32 bits
Data bus 64 bits
Address bus 32 bits
Maximum memory 4 GB
L1 cache 32 KB + 32 KB
L2 cache 2,048 KB shared
Clock frequency 1.06–2.33 GHz
FSB 667 MHz
SIMD MMX, SSE, SSE2, SSE3
SMT/SMP Dual core
Fabrication process 65 nm
Number of transistors 151 million
TDP 9-31 W
Voltage 0.9–1.3 V
Die surface area 91 mm²
Connector Socket 479

A Core Solo version with one core was also made available, and the low-power-consumption versions used a 533 MHz bus (133 MHz QDR) instead of 667 MHz. This processor was used in servers (code name Sossaman), which was a first for a processor originally intended for the mobile world. Note that this processor didn’t officially use the Core architecture of the Core 2 Duo, and it was quickly replaced by the Core 2 Duo (Merom) in portable PCs. Also, the Yonah’s Socket 479 is different from the Socket 479 of other Pentium M processors.

Today’s Hotness: The Core 2 Duo

In 2006, Intel released a processor that quickly became a best-seller: the Core 2 Duo. Derived from work done on the Pentium M, this processor uses a new Core architecture. Before, Intel had two lines of processors—the Pentium 4 for desktops, Pentium M for mobiles, and both lines for servers. In contrast, Intel now has a single micro-architecture on which all of its product lines draw. The 64-bit Core 2 Duo is represented from the low end to the high end, for desktop computers, portables and servers.

There are many versions of the architecture, resulting in configurations with a different number of cores (one to four, yielding everything from Solos to Quads), cache memory (512 KB to 12 MB), and the FSB (between 400 and 1600 MHz). The model shown here is the original Core 2 Duo, but faster versions (at 45 nm) exist.

Intel Core 2 Duo
Code name Conroe
Date released 2006
Architecture 64 bits
Data bus 64 bits
Address bus 64 (actual 36) bits
Maximum memory 64 GB
L1 cache 32 KB + 32 KB
L2 cache 2,048 KB shared
Clock frequency 1.8-3 GHz
FSB 800-1066-1333 MHz
SIMD MMX, SSE, SSE2, SSE3, SSSE3
SMT/SMP Dual core
Fabrication process 65 nm
Number of transistors 291 million
TDP 65 W
Voltage 1.5 V
Die surface area 143 mm²
Connector LGA 775

The mobile versions (Merom) are basically identical (but not as fast, with a slower FSB) whereas the Extreme Edition versions are faster. The Core 2 Duo also exists in a four-core version, which was, in fact, two Conroes in the same package. The 45 nm version of the Core 2 Duo (Penryn) has a larger cache and generates less heat, but is still fundamentally similar to this model.

The Future: Nehalem, Atom, Etc.

Obviously, this is only the first part of a series of articles. The second part, on AMD processors, will follow (along with a piece on AMD’s ATI graphics cards). But the story of the Intel x86 processors doesn’t end with the Core 2 Duo, and obviously other models are planned for the future. Nehalem and Atom are also x86 processors. And a little bird tells us that Intel’s upcoming entry into the graphics market, Larrabee, is also based on a number of x86 cores.

November 9, 2008 at 11:38 am

Imitation To Innovation: AMD’s Best CPUs – Part 1

AMD Clones Intel

The year is 1981, and Intel (see history of Intel processors) has just been chosen by IBM to supply the processor for the first personal computer. IBM wanted at least two CPU suppliers for its PC, and forced Intel to license its technology. And so it was that AMD became one of the first companies to sell an 8086 clone. AMD’s first processor went on sale in 1982. Because it was a licensed processor, the AMD 8086 (and 8088) was identical to Intel’s model.

AMD 8086
Code name ?
Date released 1982
Architecture 16-bits
Data bus 16-bits
Address bus 20-bits
Maximum memory 1 MB
L1 cache no
L2 cache no
Clock frequency 5-10 MHz
FSB same as clock frequency
FPU 8087
SIMD no
Fabrication process 3,000 nm
Number of transistors 29,000
Power consumption ?
Voltage 5 V
Die surface area 16 mm²
Connector 40 pins

Note the “© Intel” on the processor, made by AMD.

Am286: Manufactured Under License, But Faster

AMD’s Am286, a clone of the Intel 80286 manufactured under license, was identical to the chip from Intel, but it had a big advantage: its higher clock speed. Whereas Intel’s 286s topped out at 12.5 MHz, AMD sold 20 MHz versions. Because the 286 was more economical than the 386, whose innovations weren’t fully exploited for several years, AMD was already the value choice more than 20 years ago.

Am286
Code name ?
Date released 1983
Architecture 16-bits
Data bus 16-bits
Address bus 24-bits
Maximum memory 16 MB
L1 cache no
L2 cache no
Clock frequency 8-20 MHz
FSB same as clock frequency
FPU 80287
SIMD no
Fabrication process 1,500 nm
Number of transistors 134,000
Power consumption ?
Voltage 5 V
Die surface area 49 mm²
Connector 68 pins

Am386: A 40-MHz 386

In 1991, AMD released its 386 processor. Like its predecessors, this model was identical to the Intel versions. AMD was licensed to produce clones of Intel products, right down to the microcode (the CPU’s firmware). This processor had two notable features. First, it was faster than the Intel model: 40 MHz compared to a top speed of 33 MHz at Intel. Second, it was the first to sport the Windows Compatible logo on the package.

Am386
Code name ?
Date released 1991
Architecture 32-bits
Data bus 32-bits
Address bus 32-bits
Maximum memory 4,096 MB
L1 cache no
L2 cache no
Clock frequency 12-40 MHz
FSB same as clock frequency
FPU 80387
SIMD no
Fabrication process 1,500 – 1,000 nm
Number of transistors 275,000
Power consumption 2 W (33 MHz)
Voltage 5 V
Die surface area 42 mm²
Connector 132 pins

Am486: The Last Clone

The 486 was the last clone of an Intel processor. AMD produced 486s in two different versions—one with microcode by Intel and another with microcode by AMD, because the company was having legal hassles with Intel by that point. In addition to processors sold under the 486 designation, AMD also marketed an AMD 5×86, which was a 486 with a 4x clock multiplier. Running at 133 MHz, this model was compatible with 486 motherboards, but had the performance of a Pentium 75. It was with the 5×86 that AMD began using the famous “Pentium Rating” (5×86 PR 75), which it would stay with up to and including the Athlon 64 X2.

Am486 / 5×86
Model | Am486 | 5×86
Code name | ? | X5
Date released | 1993 | 1995
Architecture | 32 bits | 32 bits
Data bus | 32 bits | 32 bits
Address bus | 32 bits | 32 bits
Maximum memory | 4,096 MB | 4,096 MB
L1 cache | 8 KB | 16 KB
L2 cache | motherboard (FSB frequency) | motherboard (FSB frequency)
Clock frequency | 16-120 MHz | 133 MHz
FSB | 16-50 MHz | 33 MHz
FPU | built-in | built-in
SIMD | no | no
Fabrication process | 1,000 – 800 nm | 350 nm
Number of transistors | 1,185,000 | ?
Power consumption | ? | ?
Voltage | 5 V–3.3 V | 3.45 V
Die surface area | 81 – 67 mm² | ?
Connector | 168 pins | 168 pins

The K5: AMD’s Very Own Processor

In 1996, AMD released its fifth-generation processor, the K5. Compared to Intel’s Pentium, the K5 was technically more advanced, though it did have some faults. It’s especially interesting because of its RISC-based internal architecture that decoded x86 instructions into micro-instructions before executing them. The K5 had difficulty reaching high clock speeds and its FPU was a little weak. Still, in normal use, the K5 was a better performer than the Pentium and its PR was not just hype—a K5 clocked at 100 MHz was sold as a PR133 chip, meaning that AMD considered it as being equivalent in performance to a 133 MHz Pentium.

AMD K5
Code name SSA/5, 5k86
Date released 1996
Architecture 32-bits
Data bus 64-bits
Address bus 32-bits
Maximum memory 4,096 MB
L1 cache 16 KB + 8 KB
L2 cache motherboard (FSB frequency)
Clock frequency 75-133 MHz (PR75 – PR200)
FSB 50-66 MHz
FPU built-in
SIMD no
Fabrication process 500 – 350 nm
Number of transistors 4.3 million
Power consumption 11-16 W
Voltage 3.52 V
Die surface area 251 – 181 mm²
Connector Socket 5 or 7

The use of the PR resulted in such oddities as a K5 PR90 and PR120 running at the same frequency (90 MHz) and a PR100 and PR133 both clocked at 100 MHz. Notice also that the CPU package informed buyers that a heat sink and fan were required—at that time, the use of such cooling devices was not yet common practice.

The K6: AMD Extends Its Range

In 1997, AMD released a new processor: the K6. Unlike the K5, which was created by AMD, the K6 was the result of the work done by NexGen on the Nx686. This processor was compatible with Socket 7 (Pentium) motherboards and offered very good performance compared to Intel’s Pentium II processors, at a much lower price. The K6’s FPU was still a little weak compared to Intel’s. A 250 nm version of the K6, called Little Foot, came out in 1998.

Also in 1998, AMD announced the K6-2, a processor that used a faster bus (100 MHz) and had improved SIMD performance. It also had one more MMX unit than the K6 and a new instruction set, 3DNow!, for floating-point calculations (MMX handled only integers). The K6-2 (400 and up) was a big success because it was a good upgrade solution for owners of Pentium MMX platforms—by using the 2X multiplier on a motherboard with a 66 MHz bus, the processor was in fact operating at 6X (400 MHz), which permitted a significant gain in speed at a lower upgrade cost.
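The upgrade trick described above comes down to the core clock being the bus clock times a multiplier, with the K6-2 reading the board's 2X setting as 6X. The sketch below only illustrates that arithmetic: the names are hypothetical, the remap table contains just the one pair mentioned in the text, and the "66 MHz" bus is assumed to be 66.67 MHz (200/3).

```python
# Only the remapping mentioned in the article; any other settings are
# assumed to be passed through unchanged.
K6_2_MULTIPLIER_REMAP = {2.0: 6.0}


def core_clock_mhz(bus_mhz: float, board_multiplier: float, remap=None) -> float:
    """Core clock = bus clock x the multiplier the CPU actually applies."""
    effective = (remap or {}).get(board_multiplier, board_multiplier)
    return bus_mhz * effective


bus = 200 / 3  # the nominal "66 MHz" Socket 7 bus (assumed 66.67 MHz)

# A CPU that takes the 2X jumper setting literally:
print(round(core_clock_mhz(bus, 2.0)))
# A K6-2 that interprets the same setting as 6X:
print(round(core_clock_mhz(bus, 2.0, K6_2_MULTIPLIER_REMAP)))
```

The same function covers ordinary cases too, such as a 33 MHz bus with a 4x multiplier on the 5×86.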

Finally, in 1999, AMD released the third version of the K6, the K6-III. The main difference from the K6-2 version was an on-chip 256 KB cache. The K6-III was very fast, but also very costly to produce, and was quickly replaced by the Athlon (K7).

AMD K6, K6-2, K6-III
Model | K6 | K6-2 | K6-III
Code name | K6, Little Foot (250 nm) | K6-3D, Chomper | Sharptooth
Date released | 1997/1998 | 1998 | 1999
Architecture | 32 bits | 32 bits | 32 bits
Data bus | 64 bits | 64 bits | 64 bits
Address bus | 32 bits | 32 bits | 32 bits
Maximum memory | 4,096 MB | 4,096 MB | 4,096 MB
L1 cache | 32 KB + 32 KB | 32 KB + 32 KB | 32 KB + 32 KB
L2 cache | motherboard (FSB frequency) | motherboard (FSB frequency) | 256 KB (CPU frequency)
L3 cache | no | no | motherboard (FSB frequency)
Clock frequency | 166-300 MHz | 300-550 MHz | 400-450 MHz
FSB | 50-66 MHz | 66-100 MHz | 100 MHz
FPU | built-in | built-in | built-in
SIMD | MMX | MMX, 3DNow! | MMX, 3DNow!
Fabrication process | 350 – 250 nm | 250 nm | 250 nm
Number of transistors | 8.8 million | 9.3 million | 21.3 million
Power consumption | 12-28 W | 13-25 W | 10-17 W
Voltage | 2.2–2.9 V–3.2 V | 2.2–2.4 V | 2.2–2.4 V
Die surface area | 157-68 mm² | 81 mm² | 118 mm²
Connector | Socket 7 | Socket 7 / Super Socket 7 | Super Socket 7

AMD also marketed K6-2+ and K6-3+ processors, mainly for portable PCs. These used a 180 nm fab process and had an on-chip 128 KB (K6-2+) or 256 KB (K6-3+) L2 cache.

K7/Athlon: A Killer

In 1999, AMD released its seventh-generation processor, the K7, later renamed Athlon. This chip did away with the drawbacks of earlier models and finally had an FPU worthy of the name; in fact, it was even better than Intel’s. The Athlon was the fastest x86 processor of its day and had many strong points, including a fast FSB (the EV6 bus, first used in Alpha processors) and high performance numbers. The only real problem came not from the processor but from the chipsets: neither the AMD nor the Via models could compete with Intel’s chipsets (like the famous 440BX). The K7 used Slot A (competing with Intel’s Slot 1) and had a Level 2 cache with a variable divider (1/2, 2/5 or 1/3).
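To see what the variable divider means in practice, here is a small sketch that computes the L2 cache clock for each of the three dividers. The article does not say which divider went with which CPU speed, so the 1,000 MHz figure is only an illustrative assumption (the top of the K75 range), and the function name is made up.

```python
from fractions import Fraction

# The three L2 dividers mentioned for the Slot A Athlon.
L2_DIVIDERS = (Fraction(1, 2), Fraction(2, 5), Fraction(1, 3))


def l2_clock_mhz(cpu_mhz: int, divider: Fraction) -> float:
    """L2 cache clock for a given core clock and cache divider."""
    return float(cpu_mhz * divider)


# Illustrative only: the same 1,000 MHz core with each divider.
for divider in L2_DIVIDERS:
    print(f"{divider} -> {l2_clock_mhz(1000, divider):.0f} MHz")
```

The slower dividers let the core clock rise while the external cache chips stayed within their own speed limits.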

AMD Athlon (K7, K75)
Code name | Argon (K7) | Pluto, Orion (K75)
Date released | 1999 | 1999
Architecture | 32 bits | 32 bits
Data bus | 64 bits | 64 bits
Address bus | 32 bits | 32 bits
Maximum memory | 4,096 MB | 4,096 MB
L1 cache | 64 KB + 64 KB | 64 KB + 64 KB
L2 cache | Slot A (1/2 CPU) | Slot A (1/2, 2/5 or 1/3 CPU)
Clock frequency | 500-700 MHz | 550-1000 MHz
FSB | 100 MHz (DDR) | 100 MHz (DDR)
FPU | built-in | built-in
SIMD | MMX, Enhanced 3DNow! | MMX, Enhanced 3DNow!
Fabrication process | 250 nm | 180 nm
Number of transistors | 22 million | 22 million
Power consumption | 42-50 W | 31-65 W
Voltage | 1.6 V | 1.6–1.8 V
Die surface area | 184 mm² | 102 mm²
Connector | Slot A | Slot A

Just as a side note, it was AMD who was the first to announce (and market) a 1 GHz processor with the Athlon (two days before Intel’s 1 GHz Pentium III).

AMD Improves the Athlon: Thunderbird, XP, and More

AMD knew it had a winner with the K7 architecture and improved it little by little, increasing the frequency and using finer fab processes. The Thunderbird core employed a 180 nm process and had 256 KB of on-chip cache. The Palomino design introduced support for SSE. The Athlon XP changed the package and reinstated PR numbers. The Thoroughbred was an Athlon XP using a 130 nm fab process (with a 256 KB cache). Barton had a 512 KB cache and also used a 130 nm process. Athlon XP and subsequent models used the PR number instead of a clock frequency designation.

AMD Athlon
Code name | Thunderbird | Palomino/XP | Thoroughbred | Barton
Date released | 2000 | 2001 | 2002 | 2003
Architecture | 32 bits | 32 bits | 32 bits | 32 bits
Data bus | 64 bits | 64 bits | 64 bits | 64 bits
Address bus | 32 bits | 32 bits | 32 bits | 32 bits
Maximum memory | 4,096 MB | 4,096 MB | 4,096 MB | 4,096 MB
L1 cache | 64 KB + 64 KB | 64 KB + 64 KB | 64 KB + 64 KB | 64 KB + 64 KB
L2 cache | 256 KB (CPU frequency) | 256 KB (CPU frequency) | 256 KB (CPU frequency) | 512 KB (CPU frequency)
Clock frequency | 650-1,400 MHz | 1,000-1,733 MHz | 1,200-2,250 MHz | 1,400-2,200 MHz
FSB | 100/133 MHz (DDR) | 133 MHz (DDR) | 133/166 MHz (DDR) | 166/200 MHz (DDR)
FPU | built-in | built-in | built-in | built-in
SIMD | MMX, Enhanced 3DNow! | MMX, Enhanced 3DNow!, SSE | MMX, Enhanced 3DNow!, SSE | MMX, Enhanced 3DNow!, SSE
Fabrication process | 180 nm | 180 nm | 130 nm | 130 nm
Number of transistors | 37 million | 37.5 million | 37.2 million | 54.3 million
Power consumption | 38-72 W | 46-72 W | 49-68 W | 60-76 W
Voltage | 1.7-1.75 V | 1.75 V | 1.5-1.65 V | 1.65 V
Die surface area | 120 mm² | 129.26 mm² | 84.66 mm² | 100.99 mm²
Connector | Socket A | Socket A | Socket A | Socket A

We should mention that AMD also produced versions for servers (Athlon MP) and for laptops (Athlon 4, Athlon XP Mobile), as well as the Geode NX (130 nm and a 256 KB cache). AMD marketed the Thorton (130 nm, 512 KB of cache, 256 KB of which was disabled) and planned on Trinidad, an Athlon using a 90 nm process. There were more PR oddities: the Athlon XP 2600+, for instance, was clocked at 1,900, 1,917, 2,000, 2,083, or 2,133 MHz depending on the version.

November 9, 2008 at 11:07 am

Intel’s 15 Most Unforgettable x86 CPUs-Part 1

8086: The First PC Processor

The 8086 was the first x86 processor, though not Intel’s first microprocessor: the company had already released the 4004, the 8008, the 8080 and the 8085. This 16-bit processor could manage 1 MB of memory using an external 20-bit address bus. The clock frequency chosen by IBM (4.77 MHz) was fairly low, though the processor was running at 10 MHz by the end of its career.
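The link between address bus width and maximum memory, which recurs in every table in this series, is just a power of two. The helper names below are illustrative; the bit widths are the ones quoted in the articles (20, 24, 32, 36 and, for the K8's 1 TB limit, an assumed 40 physical address bits).

```python
def max_addressable_bytes(address_bits: int) -> int:
    """Each extra address line doubles the number of addressable bytes."""
    return 2 ** address_bits


def human_size(num_bytes: int) -> str:
    """Format a power-of-two byte count using binary units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024:
            return f"{num_bytes} {unit}"
        num_bytes //= 1024
    return f"{num_bytes} PB"


for bits in (20, 24, 32, 36, 40):
    print(f"{bits}-bit address bus: {human_size(max_addressable_bytes(bits))}")
```

This reproduces the 1 MB of the 8086, the 16 MB of the 80286, the 4 GB (4,096 MB) of 32-bit designs and the 64 GB reachable with 36-bit addressing.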

The first PCs used a derivative of this processor, the 8088, which had only an 8-bit (external) data bus. An interesting aside is that the control systems in the US space shuttles use 8086 processors and NASA was forced to buy some from eBay in 2002 since Intel could no longer supply them.

Intel 8086
Code name N/A
Date released 1978
Architecture 16 bits
Data bus 16 bits
Address bus 20 bits
Maximum memory 1 MB
L1 cache no
L2 cache no
Clock frequency 4.77-10 MHz
FSB same as clock frequency
FPU 8087
SIMD no
Fabrication process 3,000 nm
Number of transistors 29,000
Power consumption N/A
Voltage 5 V
Die surface area 16 mm²
Connector 40-pin

80286: 16 MB Of Memory, But Still 16 Bits

Released in 1982, the 80286 was 3.6 times faster than the 8086 at the same frequency. It could manage up to 16 MB of memory, but the 286 was still a 16-bit processor. It was the first x86 equipped with a memory management unit (MMU), allowing it to manage virtual memory. Like the 8086, it did not have a floating-point unit (FPU), but could use an x87 co-processor chip (80287). Intel offered these processors at a maximum frequency of 12.5 MHz, whereas its competitors reached 25 MHz.

Intel 80286
Code name N/A
Date released 1982
Architecture 16 bits
Data bus 16 bits
Address bus 24 bits
Maximum memory 16 MB
L1 cache No
L2 cache No
Clock frequency 6–12.5 MHz
FSB same as clock frequency
FPU 80287
SIMD No
Fabrication process 1,500 nm
Number of transistors 134,000
Power consumption N/A
Voltage 5 V
Die surface area 49 mm²
Connector 68-pin

386: 32-Bit and Cache Memory

Intel’s 80386 was the first x86 with a 32-bit architecture. Several versions of this processor were offered. The two best known are the 386 SX (Single-word eXternal), which had a 16-bit data bus, and the 386 DX (Double-word eXternal) with a 32-bit data bus. Two other versions are worth noting, though: the SL, which was the first x86 to offer management of an (external) cache, and the 386EX, used in the space program (the Hubble telescope uses this processor).

Intel 80386 DX
Code name P3
Date released 1985
Architecture 32 bits
Data bus 32 bits
Address bus 32 bits
Maximum memory 4096 MB
L1 cache 0 KB (controller sometimes present)
L2 cache no
Clock frequency 16-33 MHz
FSB same as clock frequency
FPU 80387
SIMD no
Fabrication process 1,500-1,000 nm
Number of transistors 275,000
Power consumption 2 W @ 33 MHz
Voltage 5 V
Die surface area 42 mm² @ 1µ
Connector 132 pins

The 486: An FPU And Multipliers Too

The 486 is emblematic of a certain generation who were first discovering computers. In fact, the very famous 486 DX2/66 was long considered the minimum configuration for gamers. This processor, released in 1989, ushered in several interesting new features, like an on-chip FPU, data cache, and the first clock multiplier. The FPU was an x87 coprocessor built into the 486 DX (not SX) series. An 8 KB Level 1 cache was built into the processor (write-through type, then write-back with slightly better performance). There was also the possibility of a Level 2 cache on the motherboard (at the bus frequency).

The second generation of 486s had a CPU multiplier, since the processor operated faster than the FSB, with DX2 (2x multiplier) and DX4 (3x multiplier) versions. Another anecdote: the “487SX” sold as an FPU for the 486SX was actually a full 486DX that disabled and took the place of the first processor.

Intel 80486 DX
Code name P4, P24, P24C
Date released 1989
Architecture 32 bits
Data bus 32 bits
Address bus 32 bits
Maximum memory 4096 MB
L1 cache 8 KB
L2 cache Motherboard (FSB frequency)
Clock frequency 16-100 MHz
FSB 16-50 MHz
FPU On chip
SIMD No
Fabrication process 1,000–800 nm
Number of transistors 1,185,000
Power consumption N/A
Voltage 5 V–3.3 V
Die surface area 81 – 67 mm²
Connector 168 pins

The DX4 had a 16 KB cache and a few more transistors: 1.6 million. This processor, using a 600 nm process and measuring 76 mm², consumed less power than the original 486 (at a voltage of 3.3 V).

Intel Pentium: A Bothersome Bug

The Pentium, introduced in 1993, was interesting for more than one reason. It was the first x86 to drop the traditional model number for a more attractive name, since Intel wasn’t allowed to trademark a name made up of numbers only. It’s also famous because of a bug it contained. On the first generations of Pentiums, certain division operations produced an incorrect result. Intel replaced the processors, but the damage was done. A very rare error gave rise to the first big IT media buzz.

The Pentium was sold in three different versions, the first without a CPU multiplier, the second with a multiplier (including the very familiar Pentium 166), and the last with the SIMD instruction set for x86s, MMX. The Pentium MMX also increased the size of the Level 1 cache and brought in a few minor improvements. The Pentium was the first Intel x86 capable of executing two instructions in parallel. The L2 cache was on the motherboard with these processors (running at the frequency of the FSB).

Intel Pentium (MMX)
Model | Pentium | Pentium MMX
Code name | P5, P54 | P55 (Pentium MMX)
Date released | 1993 | 1997
Architecture | 32 bits | 32 bits
Data bus | 64 bits | 64 bits
Address bus | 32 bits | 32 bits
Maximum memory | 4096 MB | 4096 MB
L1 cache | 8 KB + 8 KB | 16 KB + 16 KB
L2 cache | Motherboard (FSB frequency) | Motherboard (FSB frequency)
Clock frequency | 60-200 MHz | 133-300 MHz
FSB | 50-66 MHz | 60-66 MHz
FPU | on chip | on chip
SIMD | no | MMX
Fabrication process | 800-600-350 nm | 350 nm
Number of transistors | 3.1-3.3 million | 4.5 million
Power consumption | 8-16 W | 4-17 W
Voltage | 5 V-3.3 V | 2.8 V
Die surface area | 294-163-90 mm² | 141 mm²
Connector | Socket 4, 5 or 7 | Socket 7

Here’s a little explanation of the Pentium bug: certain calculations using the FPU resulted in erroneous results. This was fairly rare—though sources disagree about exactly how rare—and Intel replaced the defective processors free of charge. Here’s an example of a Pentium error:

4195835.0/3145727.0 = 1.333 820 449 136 241 002 (correct result)
4195835.0/3145727.0 = 1.333 739 068 902 037 589 (incorrect result on a defective Pentium)
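The size of that error is easier to appreciate with a quick calculation. The sketch below runs on any modern machine: it does not reproduce the bug, it simply compares the correct quotient with the defective value quoted above. The variable names are illustrative.

```python
x, y = 4195835.0, 3145727.0

correct = x / y                  # computed by a working FPU
flawed = 1.333739068902037589    # defective result quoted in the article

# Relative error of the defective quotient.
relative_error = abs(correct - flawed) / correct

# Multiplying the quotient back by the divisor should return x.
# With the defective quotient, a sizeable residual is left over.
residual = x - flawed * y

print(f"relative error: {relative_error:.2e}")
print(f"residual:       {residual:.1f}")
```

The quotient is already wrong in the fifth significant digit, which is enormous for double-precision arithmetic, and the multiply-back residual comes out at roughly 256 instead of zero.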

Pentium Pro: The First To Handle Over 4 GB Of Memory

The Pentium Pro, released in 1995, was the first x86 CPU able to manage more than 4 GB of RAM: Physical Address Extension (PAE) widened addresses to 36 bits, raising the limit to 64 GB. An interesting point is that this processor was also the first P6 (the architecture the Core 2 processors are loosely derived from) and also the first x86 to include a Level 2 cache on the processor instead of on the motherboard. In fact, between 256 KB and 1 MB of cache was placed next to the CPU in the same package and clocked at the same frequency as the CPU, making the L2 cache on-package as opposed to on-chip.

This processor also had a bit of a performance issue. It ran great in 32-bit applications, but was much slower with software still written in 16 bits (like Windows 95). The cause was simple: access to 16-bit registers caused problems with management of the (32-bit) registers, which canceled out the advantages of the Pentium Pro’s out-of-order architecture.

Intel Pentium Pro
Code name P6
Date released 1995
Architecture 32 bits
Data bus 64 bits
Address bus 36 bits
Maximum memory 64 GB
L1 cache 8 KB + 8 KB
L2 cache external, 256-1024 KB (CPU frequency)
Clock frequency 150-200 MHz
FSB 60-66 MHz
FPU built-in
SIMD N/A
Fabrication process 600-350 nm
Number of transistors 5,500,000 + cache
Power consumption 29-47 W
Voltage 3.3 V
Die surface area 306-196 mm² + cache
Connector Socket 8

The cache measured 202 mm² (256 KB at 500 nm), 242 mm² (512 KB at 350 nm), or 484 mm² (1 MB at 350 nm). The number of transistors in the cache was 15.5 million (256 KB), 31 million (512 KB), or 62 million (1 MB).

Pentium II and III: Brothers

Released in 1997, the Pentium II was an adaptation of the Pentium Pro aimed at the general public. It was quite similar to the Pentium Pro, but the cache memory was different. Instead of using a cache at the same frequency as the processor (which is expensive), the 512 KB Level 2 cache operated at half-frequency. In addition, the Pentium II abandoned the classic socket for a cartridge containing the processor and the Level 2 cache, which was in the cartridge and not on the motherboard or in the processor itself.

New features compared to the Pentium Pro were essentially MMX (SIMD) support and a doubling of the Level 1 cache. The first Pentium III (Katmai) was very similar to the Pentium II. Released in 1999, its new feature was essentially support for SSE (SIMD instructions), but the rest was identical.

Intel Pentium II and III
Code name Klamath (Pentium II 0.35µ), Deschutes (Pentium II 0.25µ), Katmai (Pentium III)
Date released 1997, 1998, 1999
Architecture 32 bits
Data bus 64 bits
Address bus 36 bits (32 bits on the P III)
Maximum memory 64 GB (4 GB on the P III)
L1 cache 16 KB + 16 KB
L2 cache external, 512 KB (1/2 CPU frequency)
Clock frequency 233-300 MHz (Klamath), 300-450 MHz (Deschutes), 450-600 MHz (Katmai)
FSB 66-100-133 MHz
FPU built-in
SIMD MMX (SSE)
Fabrication process 350 nm (Klamath), 250 nm (Deschutes, Katmai)
Number of transistors 7,500,000 + cache (Pentium II), 9,500,000 + cache (Pentium III)
Power consumption 25-35 W
Voltage 2.8 V (0.35µ), 2 V (0.25µ)
Die surface area 204 mm² (0.35µ), 131 mm² (0.25µ), 128 mm² (PIII) + cache
Connector Slot 1

The Pentium II and III had 512 KB of Level 2 cache (31 million transistors). One Pentium II actually had an on-chip 256 KB Level 2 cache: the Pentium II Mobile Dixon. Using a 180 nm fabrication process, this processor was significantly faster than the desktop versions.

Celeron and Xeon: Intel Aims At The High/Low End

At the end of the 1990s, Intel launched two of its best-known processor brands: Celeron and Xeon. The former was aimed at the budget market and the latter at servers, and sometimes workstations. The first Celeron (Covington) was a Pentium II without a Level 2 cache, and suffered extremely poor performance, whereas the Pentium II Xeon had a large cache. Even now, both brands still exist—Celeron for the entry-level market (generally with a reduced cache and a slower FSB) and Xeon for servers (with a fast FSB, sometimes more cache, and high clock speeds).

Intel quickly added a cache to the Celeron with the Mendocino model (128 KB). The Celeron 300A is famous for its overclocking capacities, able to go 50% or more above its rated clock speed much of the time.

Intel Celeron and Intel Xeon
Code name Covington, Mendocino Drake
Date released 1998 1998
Architecture 32 bits 32 bits
Data bus 64 bits 64 bits
Address bus 32 bits 36 bits
Maximum memory 4 GB 64 GB
L1 cache 16 KB + 16 KB 16 KB + 16 KB
L2 cache 0 KB/128 KB (internal, CPU frequency) external, 512 KB-2,048 KB (CPU frequency)
Clock frequency 266-300 MHz/300-533 MHz 400-450 MHz
FSB 66 MHz 100 MHz
FPU built in built in
SIMD MMX MMX
Fabrication process 250 nm 250 nm
Number of transistors 7,500,000/19,000,000 7,500,000 + cache
Power consumption 16–28 W 30-46 W
Voltage 2 V 2 V
Die surface area 131 mm²/154 mm² 131 mm² + cache
Connector Slot1/Socket 370 PPGA Slot 2

Like the Pentium II, Xeon had an external L2 cache inside the processor cartridge. Its capacity was between 512 KB and 2 MB, and the number of transistors between 31 million and 124 million.

The Pentium III Hits 1 GHz

The Pentium III Coppermine was the first commercial x86 processor from Intel to attain a clock speed of 1 GHz; a 1.13 GHz version was even released, but was quickly taken off the market because it was unstable. This new version of the Pentium III improved the Level 2 cache—now on-die. It was faster than the 512 KB external cache on the first model and was touted as a feature able to speed up the Internet experience. It was released in three versions: server (Xeon), entry-level (Celeron), and mobile (with the first version of SpeedStep).

Intel Pentium III
Code name Coppermine
Date released 1999
Architecture 32 bits
Data bus 64 bits
Address bus 32 bits
Maximum memory 4 GB
L1 cache 16 KB + 16 KB
L2 cache internal, 256 KB (CPU frequency)
Clock frequency 500–1,133 MHz
FSB 100-133 MHz
FPU built in
SIMD MMX (SSE)
Fabrication process 180 nm
Number of transistors 28.1 million
Power consumption 25-35 W
Voltage 1.6 V, 1.8 V
Die surface area 106 mm²
Connector Slot 1-Socket 370 FCPGA

A slightly improved version (Tualatin), with more L2 cache (512 KB) and built on a 130 nm process, was released in 2002. Essentially intended for servers (PIII-S) and mobile devices, it was less common in consumer-level machines.

The Pentium 4: A Lot Of Noise Over Very Little

In November 2000, Intel announced its new processor, the Pentium 4. With a higher clock speed (at least 1,400 MHz), this processor had a major drawback in that its performance wasn’t as good as competing models on a per-clock basis. AMD’s Athlon (and even the Pentium III) performed better at the same frequency. Complicating matters, Intel tried to shift to Rambus’ RDRAM memory (the only memory at the time capable of meeting the requirements of the CPU’s FSB), but failed. Expensive and hot, the Pentium 4 nonetheless managed, with many modifications, to more or less stay in the competition for a few years (by adding L3 cache and technologies like Hyper-Threading).

Intel Pentium 4 32-bit
Code name Willamette Northwood Prescott
Date released 2000 2001 2004
Architecture 32 bits 32 bits 32 bits
Data bus 64 bits 64 bits 64 bits
Address bus 32 bits 32 bits 32 bits
Maximum memory 4 GB 4 GB 4 GB
L1 cache 8 KB + 12 Kµops 8 KB + 12 Kµops 16 KB + 12 Kµops
L2 cache 256 KB 512 KB 1,024 KB
Clock frequency 1.3-2 GHz 1.8–3.4 GHz 2.4–3.8 GHz
FSB 400 MHz 400, 533, 800 MHz 533, 800 MHz
SIMD MMX, SSE, SSE2 MMX, SSE, SSE2 MMX, SSE, SSE2, SSE3
SMT/SMP no Hyper-Threading (certain versions) Hyper-Threading
Fabrication process 180 nm 130 nm 90 nm
Number of transistors 42 million 55 million 125 million
Power consumption 66-100 W 54-137 W 94-151 W
Voltage 1.7 V 1.55 V 1.25–1.5 V
Die surface area 217 mm² 146 mm² 112 mm²
Connector Socket 423/Socket 478 Socket 478 Socket 478/LGA775

Mobile versions (with a variable multiplier), Celeron versions (with a smaller L2 cache), and Xeon versions (with an L3 cache) of the Pentium 4 were sold. Hyper-Threading and the L3 cache are two technologies that first appeared on servers and were then adapted to standard processors (though L3 cache was available only on the expensive EE models).

We should also mention the FSB, which was clocked at a fourth of the nominal clock frequency, using what is called Quad Data Rate (QDR) technology—a 400 MHz bus is actually 100 MHz QDR, 533 MHz is 133 MHz QDR, etc. Finally, 64-bit versions of the Pentium 4 appeared in 2005, which we’ll talk about later on.
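
To make the QDR arithmetic concrete, here is a minimal sketch. It assumes the 64-bit data bus listed in the table above; the function names are ours, and the bandwidth figure is a theoretical peak rather than a measured value.

```python
def base_clock_mhz(marketed_fsb_mhz: float, transfers_per_clock: int = 4) -> float:
    """Real bus clock behind a marketed QDR figure (four transfers per clock)."""
    return marketed_fsb_mhz / transfers_per_clock


def peak_bandwidth_gb_s(marketed_fsb_mhz: float, bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth: transfers per second x bytes per transfer."""
    return marketed_fsb_mhz * bus_width_bits / 8 / 1000


for fsb in (400, 533, 800):
    print(fsb, base_clock_mhz(fsb), peak_bandwidth_gb_s(fsb))
```

The 533 MHz figure is a rounded marketing number, so the computed base clock comes out slightly above the nominal 133 MHz.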

November 9, 2008 at 10:44 am

AMD ATI Comparison Table

With more and more graphics chips being released, it has become very difficult for users who do not follow the video card market to tell apart all the ATI graphics chips available today. To make the differences among the major ATI chips easier to understand, we have compiled the following table.

It is important to note that, starting in 2007, both ATI and nVidia began quoting the memory clock of their video cards as the real clock rate. In the past, manufacturers quoted memory clocks at double the real rate, because DDR and subsequent technologies (DDR2, GDDR3, etc.) allow the memory chip to transfer two pieces of data per clock cycle. So a video card with a memory chip running at 500 MHz would be described as having 1 GHz memory. To keep our table consistent, we still list memory clocks using the DDR naming convention (i.e., double the real clock rate) for cards with memory based on DDR or subsequent technologies.
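
The two naming conventions differ only by a factor of two, which a short sketch makes explicit. The function names are hypothetical and used here purely for illustration.

```python
def ddr_marketed_clock_mhz(real_clock_mhz: float) -> float:
    """Clock as listed in the table: double the real rate (two transfers per cycle)."""
    return real_clock_mhz * 2


def ddr_real_clock_mhz(marketed_clock_mhz: float) -> float:
    """Real clock rate behind a figure quoted with the DDR naming convention."""
    return marketed_clock_mhz / 2


print(ddr_marketed_clock_mhz(500))   # memory chip really running at 500 MHz
print(ddr_real_clock_mhz(1000))      # card advertised as 1 GHz
```

A chip running at a real 500 MHz is therefore listed as 1,000 MHz (1 GHz) in the table that follows.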

Chip | Core Clock | Memory Clock | Memory Interface | Memory Transfer Rate | Pixels per clock | DirectX
Radeon 9200 | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 8.1
Radeon 9200 Pro | 275 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 8.1
Radeon 9200 SE | 200 MHz | 333 MHz | 64-bit | 2.6 GB/s | 4 | 8.1
Radeon 9250 | 240 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 8.1
Radeon 9250 SE | 240 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 8.1
Radeon 9500 | 275 MHz | 540 MHz | 128-bit | 8.6 GB/s | 4 | 9.0
Radeon 9550 | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0
Radeon 9550 SE | 250 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0
Radeon 9500 Pro | 275 MHz | 540 MHz | 128-bit | 8.6 GB/s | 8 | 9.0
Radeon 9600 | 325 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0
Radeon 9600 Pro | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0
Radeon 9600 SE | 325 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0
Radeon 9600 XT | 500 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0
Radeon 9700 | 275 MHz | 540 MHz | 256-bit | 17.2 GB/s | 8 | 9.0
Radeon 9700 Pro | 325 MHz | 620 MHz | 256-bit | 19.8 GB/s | 8 | 9.0
Radeon 9800 | 325 MHz | 580 MHz | 256-bit | 18.56 GB/s | 8 | 9.0
Radeon 9800 Pro | 380 MHz | 680 MHz | 256-bit | 21.7 GB/s | 8 | 9.0
Radeon 9800 SE | 325 MHz | 500 MHz | 128-bit or 256-bit | 8 GB/s or 16 GB/s | 4 | 9.0
Radeon 9800 XT | 412 MHz | 730 MHz | 256-bit | 23.3 GB/s | 8 | 9.0
Radeon X300 SE | 325 MHz | 400 MHz | 64-bit | 3.2 GB/s | 4 | 9.0
Radeon X300 | 325 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0
Radeon X550 | 400 MHz | 500 MHz | 128-bit or 64-bit | 8 GB/s or 4 GB/s | 4 | 9.0
Radeon X600 Pro | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0
Radeon X600 XT | 500 MHz | 730 MHz | 128-bit | 11.68 GB/s | 4 | 9.0
Radeon X700 | 400 MHz | 600 MHz | 128-bit | 9.6 GB/s | 8 | 9.0
Radeon X700 Pro | 420 MHz | 864 MHz | 128-bit | 13.8 GB/s | 8 | 9.0
Radeon X700 XT | 475 MHz | 1.05 GHz | 128-bit | 16.8 GB/s | 8 | 9.0
Radeon X800 SE | * | * | * | * | 8 | 9.0
Radeon X800 | 400 MHz | 700 MHz | 256-bit | 22.4 GB/s | 12 | 9.0
Radeon X800 XL | 400 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0
Radeon X800 GT | 475 MHz | ** | 128-bit or 256-bit | ** | 8 | 9.0
Radeon X800 GTO | 400 MHz | 1 GHz *** | 256-bit | 32 GB/s | 12 | 9.0
Radeon X800 Pro | 475 MHz | 950 MHz | 256-bit | 30.4 GB/s | 12 | 9.0
Radeon X800 XT | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0
Radeon X800 XT PE | 520 MHz | 1.12 GHz | 256-bit | 35.8 GB/s | 16 | 9.0
Radeon X850 Pro | 520 MHz | 1.08 GHz | 256-bit | 34.56 GB/s | 12 | 9.0
Radeon X850 XT | 520 MHz | 1.08 GHz | 256-bit | 34.56 GB/s | 16 | 9.0
Radeon X850 PE | 540 MHz | 1.18 GHz | 256-bit | 37.76 GB/s | 16 | 9.0
Radeon X1050 | **** | **** | **** | **** | 4 | 9.0c
Radeon X1300 HM | 450 MHz | 1 GHz | 128-bit or 64-bit or 32-bit | 16 GB/s or 8 GB/s or 4 GB/s | 4 | 9.0c
Radeon X1300 | 450 MHz | 500 MHz | 128-bit or 64-bit or 32-bit | 8 GB/s or 4 GB/s or 2 GB/s | 4 | 9.0c
Radeon X1300 Pro | 600 MHz | 800 MHz | 128-bit or 64-bit or 32-bit | 12.8 GB/s or 6.4 GB/s or 3.2 GB/s | 4 | 9.0c
Radeon X1300 XT | 500 MHz | 800 MHz (DDR2) or 1 GHz (GDDR3) | 128-bit | 12.8 GB/s or 16 GB/s | 12 | 9.0c
Radeon X1550 | 450 MHz or 550 MHz or 600 MHz | 800 MHz | 64-bit or 128-bit | 6.4 GB/s or 12.8 GB/s | 4 | 9.0c
Radeon X1600 Pro | 500 MHz or 575 MHz | 780 MHz | 128-bit | 12.48 GB/s | 12 | 9.0c
Radeon X1600 XT | 590 MHz | 1.38 GHz | 128-bit | 22.08 GB/s | 12 | 9.0c
Radeon X1650 Pro | 600 MHz | 1.40 GHz | 128-bit | 22.40 GB/s | 12 | 9.0c
Radeon X1650 XT | 575 MHz | 1.35 GHz | 128-bit | 21.60 GB/s | 24 | 9.0c
Radeon X1800 GTO | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c
Radeon X1800 XL | 500 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0c
Radeon X1800 XT | 625 MHz | 1.5 GHz | 256-bit | 48 GB/s | 16 | 9.0c
Radeon X1900 GT | 575 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 36 | 9.0c
Radeon X1900 XT | 625 MHz | 1.45 GHz | 256-bit | 46.4 GB/s | 48 | 9.0c
Radeon X1900 XTX | 650 MHz | 1.55 GHz | 256-bit | 49.6 GB/s | 48 | 9.0c
Radeon X1950 GT | 500 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 36 | 9.0c
Radeon X1950 Pro | 575 MHz | 1.38 GHz | 256-bit | 44.16 GB/s | 36 | 9.0c
Radeon X1950 XT | 625 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 48 | 9.0c
Radeon X1950 XTX | 650 MHz | 2 GHz | 256-bit | 64 GB/s | 48 | 9.0c
Radeon HD 2400 Pro | 525 MHz | 800 MHz | 64-bit | 6.4 GB/s | 40 ***** | 10
Radeon HD 2400 XT | 700 MHz | 1.6 GHz | 64-bit | 12.8 GB/s | 40 ***** | 10
Radeon HD 2600 Pro | 600 MHz | 800 MHz | 128-bit | 12.8 GB/s | 120 ***** | 10
Radeon HD 2600 XT | 800 MHz | 1.6 GHz (GDDR3) or 2.2 GHz (GDDR4) | 128-bit | 25.6 GB/s (GDDR3) or 35.2 GB/s (GDDR4) | 120 ***** | 10
Radeon HD 2900 GT | 600 MHz | 1.6 GHz | 256-bit | 51.2 GB/s | 240 ***** | 10
Radeon HD 2900 Pro | 600 MHz | 1.85 GHz | 512-bit | 118.4 GB/s | 320 ***** | 10
Radeon HD 2900 XT | 740 MHz | 1.65 GHz (GDDR3) or 2 GHz (GDDR4) | 512-bit | 105.6 GB/s (GDDR3) or 128 GB/s (GDDR4) | 320 ***** | 10
Radeon HD 3450 ^ | 600 MHz | 1 GHz | 64-bit | 8 GB/s | 40 ***** | 10.1
Radeon HD 3470 ^ | 800 MHz | 1.90 GHz | 64-bit | 15.2 GB/s | 40 ***** | 10.1
Radeon HD 3650 ^ | 725 MHz | 1 GHz (DDR2) or 1.6 GHz (GDDR3) | 128-bit | 16 GB/s (DDR2) or 25.6 GB/s (GDDR3) | 120 ***** | 10.1
Radeon HD 3690 ^ | 668 MHz | 1,656 MHz | 128-bit | 26.5 GB/s | 120 ***** | 10.1
Radeon HD 3850 ^ | 670 MHz | 1.66 GHz | 256-bit | 53.12 GB/s | 320 ***** | 10.1
Radeon HD 3870 ^ | 775 MHz | 2.25 GHz | 256-bit | 72 GB/s | 320 ***** | 10.1
Radeon HD 3870 X2 ^ + | 825 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 320 ***** | 10.1
Radeon HD 4350 ^ | 600 MHz | 1 GHz | 64-bit | 8 GB/s | 80 ***** | 10.1
Radeon HD 4550 ^ | 800 MHz | 1.6 GHz | 64-bit | 12.8 GB/s | 80 ***** | 10.1
Radeon HD 4650 ^ | 600 MHz | 1 GHz or 1.4 GHz | 128-bit | 16 GB/s or 22.4 GB/s | 320 ***** | 10.1
Radeon HD 4670 ^ | 750 MHz | 2 GHz (512 MB) or 1,746 MHz (1 GB) | 128-bit | 32 GB/s or 27.94 GB/s | 320 ***** | 10.1
Radeon HD 4830 ^ | 575 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 640 ***** | 10.1
Radeon HD 4850 ^ | 625 MHz | 2 GHz | 256-bit | 64 GB/s | 800 ***** | 10.1
Radeon HD 4850 X2 ^ + | 625 MHz | 2 GHz | 256-bit | 64 GB/s | 800 ***** | 10.1
Radeon HD 4870 ^ | 750 MHz | 3.6 GHz | 256-bit | 115.2 GB/s | 800 ***** | 10.1
Radeon HD 4870 X2 ^ + | 750 MHz | 3.6 GHz | 256-bit | 115.2 GB/s | 800 ***** | 10.1

* ATI doesn’t set a default clock for the Radeon X800 SE chip. The specs depend on the video card manufacturer, so take care when comparing video cards that use this chip.

** Depends on the model. There are boards based on Radeon X800 GT using DDR, DDR2 and GDDR3 memories running at different speeds. We’ve seen GDDR3 models running at 980 MHz and DDR models running at 700 MHz. You can calculate the memory transfer rate using the formula memory clock x number of bits / 8. A model with GDDR3 memory running at 980 MHz and 256-bit interface has a transfer rate of 31.36 GB/s.

*** There are models using DDR memories and running at lower clock rates.

**** There are three video card versions using this chip, with very different specs depending on the memory chips used. With 128 MB of DDR, the graphics chip runs at 400 MHz, the memory runs at 500 MHz, a 128-bit memory interface is used and the memory has a maximum theoretical transfer rate of 8 GB/s. With 128 MB of DDR2, the graphics chip runs at 325 MHz, the memory runs at 666 MHz, a 64-bit memory interface is used and the memory has a maximum theoretical transfer rate of 5.3 GB/s. Finally, with 256 MB of DDR2, the graphics chip runs at 400 MHz, the memory runs at 666 MHz, a 128-bit memory interface is used and the memory has a maximum theoretical transfer rate of 10.6 GB/s.

***** The shader unit is unified, meaning that this chip doesn’t have separate pixel shader and vertex shader units. On video cards from the Radeon HD 2400 and HD 2600 series, the video card manufacturer can use a different memory clock (usually lower, thus achieving lower performance than the reference model); the clock rates published here are the official ones.

^ Based on PCI Express 2.0, which doubles the per-lane transfer rate from 2.5 GT/s to 5 GT/s if a PCI Express 2.0 motherboard is used.

+ Radeon HD 3870 X2, Radeon HD 4850 X2 and HD 4870 X2 use two Radeon chips working in parallel (CrossFire). The specs published are for only one of the chips.
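
The transfer-rate formula quoted in the footnotes (memory clock x number of bits / 8) can be checked with a few lines of code. This is a minimal sketch; the function name is ours, and the clocks are taken from the footnotes above using the DDR naming convention.

```python
def memory_transfer_rate_gb_s(memory_clock_mhz: float, interface_bits: int) -> float:
    """memory clock x number of bits / 8, converted from MB/s to GB/s."""
    return memory_clock_mhz * interface_bits / 8 / 1000


# (description, memory clock in MHz, interface width in bits)
examples = [
    ("Radeon X800 GT, GDDR3", 980, 256),
    ("Radeon X1050, 128 MB DDR", 500, 128),
    ("Radeon X1050, 128 MB DDR2", 666, 64),
    ("Radeon X1050, 256 MB DDR2", 666, 128),
]

for name, clock, bits in examples:
    print(f"{name}: {memory_transfer_rate_gb_s(clock, bits):.2f} GB/s")
```

The results are 31.36, 8.00, 5.33 and 10.66 GB/s; the footnotes shorten the last two to 5.3 and 10.6 GB/s.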

When you compare chips, you have to be very careful. Judging from the table, a Radeon 9800 may seem slower than a Radeon 9600 Pro, since its clock is lower, and a Radeon X700 Pro may seem faster than a Radeon X800, since it uses a higher clock rate.

However, the Radeon 9800 accesses its memory through a 256-bit interface and processes eight pixels per clock pulse, while the Radeon 9600 Pro accesses its memory through a 128-bit interface and processes four pixels per clock pulse. This means that the memory access and processing performance of the Radeon 9800 would be double that of the Radeon 9600 Pro if they were working at the same clock. In other words, a Radeon 9600 Pro would have to work at 650 MHz and access its memory at 1,160 MHz to match the performance of the Radeon 9800.

The same idea applies to the Radeon X700 Pro example: it accesses memory through a 128-bit interface and processes 8 pixels per clock tick, while the Radeon X800 accesses memory through a 256-bit interface and processes 12 pixels per clock tick.

Therefore, it is not correct to compare graphics chips only by their clocks. For processing performance, you have to compare both the clocks and the number of pixels per clock. As for memory, the right way to compare performance among different chips is by their memory transfer rate, which is calculated using the formula (clock x bits per clock) / 8.
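
For the processing side, a rough way to put clocks and pixels per clock on one scale is to multiply them. The sketch below does that for the four chips discussed above, using values from the table. The function name is ours, and the result is a theoretical peak only, since real performance depends on more than this one figure.

```python
def pixel_throughput_mpixels_s(core_clock_mhz: float, pixels_per_clock: int) -> float:
    """Theoretical peak: core clock (MHz) x pixels processed per clock."""
    return core_clock_mhz * pixels_per_clock


# chip name -> (core clock in MHz, pixels per clock), taken from the table
chips = {
    "Radeon 9600 Pro": (400, 4),
    "Radeon 9800": (325, 8),
    "Radeon X700 Pro": (420, 8),
    "Radeon X800": (400, 12),
}

for name, (clock, pixels) in chips.items():
    print(f"{name}: {pixel_throughput_mpixels_s(clock, pixels)} Mpixels/s")
```

Despite their lower clocks, the Radeon 9800 (2,600) and Radeon X800 (4,800) come out ahead of the Radeon 9600 Pro (1,600) and Radeon X700 Pro (3,360).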

As you can see in the table, “SE” chips are the simplest and access memory only 64 bits at a time. Another detail is that ATI uses the letters “XT” to indicate the fastest chip in a series, while its competitor, nVidia, uses the same letters to indicate the simplest chip in a series.

“PE” stands for “Platinum Edition” and marks models that are even faster than the “XT” models, aimed at gamers with money.

As for the DirectX version, check the table below:

DirectX | Shader Model
7.0 | No
8.1 | 1.4
9.0 | 2.0
9.0c | 3.0
10 | 4.0
10.1 | 4.1

November 9, 2008 at 9:24 am

Nvidia Chips Comparison


Chip | Core Clock | Memory Clock | Memory Interface | Memory Transfer Rate | Pixels per clock | DirectX
GeForce 4 MX 440 AGP 8x | 275 MHz | 512 MHz | 128-bit | 8.1 GB/s | 2 | 7
GeForce MX 4000 | 250 MHz | * | 32-bit or 64-bit or 128-bit | * | 2 | 7
GeForce FX 5200 | 250 MHz | 400 MHz | 64-bit or 128-bit | 3.2 GB/s or 6.4 GB/s | 4 | 9.0
GeForce FX 5200 Ultra | 350 MHz | 650 MHz | 128-bit | 10.4 GB/s | 4 | 9.0
GeForce FX 5600 | 325 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 9.0
GeForce FX 5500 | 270 MHz | 400 MHz | 64-bit or 128-bit | 3.2 GB/s or 6.4 GB/s | 4 | 9.0
GeForce FX 5600 Ultra | 500 MHz | 800 MHz | 128-bit | 12.8 GB/s | 4 | 9.0
GeForce FX 5700 LE | 250 MHz | 400 MHz | 128-bit | 6.4 GB/s | 4 | 9.0
GeForce FX 5700 | 425 MHz | 600 MHz | 128-bit | 9.6 GB/s | 4 | 9.0
GeForce FX 5700 Ultra | 475 MHz | 900 MHz | 128-bit | 14.4 GB/s | 4 | 9.0
GeForce FX 5800 | 400 MHz | 900 MHz | 128-bit | 14.4 GB/s | 8 | 9.0
GeForce FX 5800 Ultra | 500 MHz | 1 GHz | 128-bit | 16 GB/s | 8 | 9.0
GeForce FX 5900 XT | 390 MHz | 680 MHz | 256-bit | 21.7 GB/s | 8 | 9.0
GeForce FX 5900 | 400 MHz | 850 MHz | 256-bit | 27.2 GB/s | 8 | 9.0
GeForce FX 5900 Ultra | 450 MHz | 850 MHz | 256-bit | 27.2 GB/s | 8 | 9.0
GeForce FX 5950 Ultra | 475 MHz | 950 MHz | 256-bit | 30.4 GB/s | 8 | 9.0
GeForce PCX 5300 | 325 MHz | 650 MHz | 128-bit | 10.4 GB/s | 4 | 9.0
GeForce PCX 5750 | 475 MHz | 900 MHz | 128-bit | 14.4 GB/s | 4 | 9.0
GeForce PCX 5900 | 350 MHz | 500 MHz | 256-bit | 17.6 GB/s | 8 | 9.0
GeForce PCX 5950 | 475 MHz | 900 MHz | 256-bit | 30.4 GB/s | 8 | 9.0
GeForce 6200 | 300 MHz | 550 MHz | 128-bit | 8.8 GB/s | 4 | 9.0c
GeForce 6200 LE | 350 MHz | 550 MHz | 64-bit | 4.4 GB/s | 2 | 9.0c
GeForce 6200 (TC) | 350 MHz | 666 MHz * | 32-bit or 64-bit | 2.66 GB/s or 5.32 GB/s * | 4 | 9.0c
GeForce 6500 (TC) | 400 MHz | 666 MHz * | 32-bit or 64-bit | 2.66 GB/s or 5.32 GB/s * | 4 | 9.0c
GeForce 6600 | 300 MHz | 550 MHz * | 64-bit or 128-bit | 4.4 GB/s or 8.8 GB/s * | 8 | 9.0c
GeForce 6600 DDR2 | 350 MHz | 800 MHz * | 128-bit | 12.8 GB/s * | 8 | 9.0c
GeForce 6600 LE | 300 MHz | * | 64-bit or 128-bit | * | 4 | 9.0c
GeForce 6600 GT | 500 MHz | 1 GHz | 128-bit | 16 GB/s | 8 | 9.0c
GeForce 6600 GT AGP | 500 MHz | 900 MHz | 128-bit | 14.4 GB/s | 8 | 9.0c
GeForce 6800 LE | 300 MHz | 700 MHz | 256-bit | 22.4 GB/s | 8 | 9.0c
GeForce 6800 XT | 325 MHz | 600 MHz | 256-bit | 19.2 GB/s | 8 | 9.0c
GeForce 6800 XT AGP | 325 MHz | 700 MHz | 256-bit | 22.4 GB/s | 8 | 9.0c
GeForce 6800 | 325 MHz | 600 MHz | 256-bit | 19.2 GB/s | 12 | 9.0c
GeForce 6800 AGP | 325 MHz | 700 MHz | 256-bit | 22.4 GB/s | 12 | 9.0c
GeForce 6800 GS | 425 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c
GeForce 6800 GS AGP | 350 MHz | 1 GHz | 256-bit | 32 GB/s | 12 | 9.0c
GeForce 6800 GT | 350 MHz | 1 GHz | 256-bit | 32 GB/s | 16 | 9.0c
GeForce 6800 Ultra | 400 MHz | 1.1 GHz | 256-bit | 35.2 GB/s | 16 | 9.0c
GeForce 6800 Ultra Extreme | 450 MHz | 1.1 GHz | 256-bit | 35.2 GB/s | 16 | 9.0c
GeForce 7100 GS (TC) | 350 MHz | 666 MHz * | 64-bit | 5.3 GB/s * | 4 | 9.0c
GeForce 7200 GS (TC) | 450 MHz | 800 MHz * | 64-bit | 6.4 GB/s * | 4 | 9.0c
GeForce 7300 SE (TC) | 225 MHz | * | 64-bit | * | 4 | 9.0c
GeForce 7300 LE (TC) | 450 MHz | 648 MHz * | 64-bit | 5.2 GB/s * | 4 | 9.0c
GeForce 7300 GS (TC) | 550 MHz | 810 MHz * | 64-bit | 6.5 GB/s * | 4 | 9.0c
GeForce 7300 GT (TC) | 350 MHz | 667 MHz | 128-bit | 10.6 GB/s | 8 | 9.0c
GeForce 7600 GS | 400 MHz | 800 MHz | 128-bit | 12.8 GB/s | 12 | 9.0c
GeForce 7600 GT | 560 MHz | 1.4 GHz | 128-bit | 22.4 GB/s | 12 | 9.0c
GeForce 7800 GS | 375 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 16 | 9.0c
GeForce 7800 GT | 400 MHz | 1 GHz | 256-bit | 32 GB/s | 20 | 9.0c
GeForce 7800 GTX | 430 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 24 | 9.0c
GeForce 7800 GTX 512 | 550 MHz | 1.7 GHz | 256-bit | 54.4 GB/s | 24 | 9.0c
GeForce 7900 GS | 450 MHz | 1.32 GHz | 256-bit | 42.2 GB/s | 20 | 9.0c
GeForce 7900 GT | 450 MHz | 1.32 GHz | 256-bit | 42.2 GB/s | 24 | 9.0c
GeForce 7900 GTX | 650 MHz | 1.6 GHz | 256-bit | 51.2 GB/s | 24 | 9.0c
GeForce 7950 GT | 550 MHz | 1.4 GHz | 256-bit | 44.8 GB/s | 24 | 9.0c
GeForce 7950 GX2 ** | 500 MHz | 1.2 GHz | 256-bit | 38.4 GB/s | 24 | 9.0c
GeForce 8400 GS *** | 450 MHz / 900 MHz | 800 MHz | 64-bit | 6.4 GB/s | 16 | 10
GeForce 8500 GT *** | 450 MHz / 900 MHz | 666 MHz or 800 MHz | 128-bit | 10.6 GB/s or 12.8 GB/s | 16 | 10
GeForce 8600 GT DDR2 *** | 540 MHz / 1.18 GHz | 666 MHz or 800 MHz | 128-bit | 10.6 GB/s or 12.8 GB/s | 32 | 10
GeForce 8600 GT GDDR3 *** | 540 MHz / 1.18 GHz | 1.4 GHz | 128-bit | 22.4 GB/s | 32 | 10
GeForce 8600 GTS *** | 675 MHz / 1.45 GHz | 2 GHz | 128-bit | 32 GB/s | 32 | 10
GeForce 8800 GS *** ^ | 550 MHz / 1,375 MHz | 1.6 GHz | 192-bit | 38.4 GB/s | 96 | 10
GeForce 8800 GT *** ^ | 600 MHz / 1.5 GHz | 1.8 GHz | 256-bit | 57.6 GB/s | 112 | 10
GeForce 8800 GTS *** | 500 MHz / 1.2 GHz | 1.6 GHz | 320-bit | 64 GB/s | 96 | 10
GeForce 8800 GTS 512 *** ^ | 650 MHz / 1,625 MHz | 1.94 GHz | 256-bit | 62.08 GB/s | 128 | 10
GeForce 8800 GTX *** | 575 MHz / 1.35 GHz | 1.8 GHz | 384-bit | 86.4 GB/s | 128 | 10
GeForce 8800 Ultra *** | 612 MHz / 1.5 GHz | 2.16 GHz | 384-bit | 103.6 GB/s | 128 | 10
GeForce 9400 GT *** ^ | 550 MHz / 1.4 GHz | 800 MHz | 128-bit | 12.8 GB/s | 16 | 10
GeForce 9500 GT *** ^ | 550 MHz / 1.4 GHz | 1 GHz (DDR2) or 1.6 GHz (GDDR3) | 128-bit | 16 GB/s (DDR2) or 25.6 GB/s (GDDR3) | 32 | 10
GeForce 9600 GSO *** ^ | 550 MHz / 1.35 GHz | 1.6 GHz | 192-bit | 38.4 GB/s | 96 | 10
GeForce 9600 GT *** ^ | 650 MHz / 1,625 MHz | 1.8 GHz | 256-bit | 57.6 GB/s | 64 | 10
GeForce 9800 GT *** ^ | 600 MHz / 1.5 GHz | 1.8 GHz | 256-bit | 57.6 GB/s | 112 | 10
GeForce 9800 GTX *** ^ | 675 MHz / 1,688 MHz | 2.2 GHz | 256-bit | 70.4 GB/s | 128 | 10
GeForce 9800 GTX+ *** ^ | 738 MHz / 1,836 MHz | 2.2 GHz | 256-bit | 70.4 GB/s | 128 | 10
GeForce 9800 GX2 ** *** ^ | 600 MHz / 1.5 GHz | 2 GHz | 256-bit | 64 GB/s | 128 | 10
GeForce GTX 260 *** ^ | 576 MHz / 1,242 MHz | 2 GHz | 448-bit | 112 GB/s | 192 | 10
GeForce GTX 280 *** ^ | 602 MHz / 1,296 MHz | 2.21 GHz | 512-bit | 141.7 GB/s | 240 | 10

* The manufacturer can set up a different memory clock rate or interface, so pay attention: not all video cards based on this chip have these specs. The memory transfer rate will depend on the interface and clock rate used. See how to calculate it below.

** GeForce 7950 GX2 and GeForce 9800 GX2 use two graphics processors in parallel (SLI mode). The specs published are for just one of the chips.

*** The GeForce 8, 9 and 200 series use two clocks: the higher one is used by the shader unit and the lower one by the rest of the chip. The shader unit is unified, meaning that these chips don’t have separate pixel shader and vertex shader units.

^ Based on PCI Express 2.0, which doubles the per-lane transfer rate from 2.5 GT/s to 5 GT/s if a PCI Express 2.0 motherboard is used.

(TC) means TurboCache. TurboCache is a technology that allows the video card to simulate more video memory by using part of the main system RAM as video memory.

At first, nVidia’s profusion of letters may seem confusing. The GeForce FX 5700 Ultra chip works at a higher clock than the GeForce FX 5900, GeForce FX 5900 Ultra and GeForce FX 5900 XT chips, and this may make you think that the GeForce FX 5700 Ultra is faster than the chips in the 5900 series.

But that is not really so. Chips from the GeForce FX 5900 series access memory 256 bits at a time, while the FX 5700 series accesses it 128 bits at a time. At the same memory clock, that makes the memory access of the 5900 series twice as fast as that of the 5700 series. For instance, the GeForce FX 5700 Ultra would have to access its memory at 1,700 MHz (double the 850 MHz memory clock of the GeForce FX 5900 Ultra) to reach the memory performance of the GeForce FX 5900 Ultra.
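
The 1,700 MHz figure follows from holding the memory transfer rate constant while halving the interface width. A minimal sketch, with a function name chosen here only for illustration:

```python
def matching_memory_clock_mhz(target_clock_mhz: float, target_bits: int, own_bits: int) -> float:
    """Memory clock a chip needs on its own interface to equal the target's transfer rate."""
    return target_clock_mhz * target_bits / own_bits


# GeForce FX 5900 Ultra: 850 MHz on a 256-bit interface (from the table).
# GeForce FX 5700 Ultra: 128-bit interface.
print(matching_memory_clock_mhz(850, 256, 128))
```

This prints 1700.0, well above the 900 MHz memory clock the GeForce FX 5700 Ultra actually uses.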

Another example. From the table you may think GeForce 6600 GT is faster than a GeForce 6800 because it has a higher clock rate (500 MHz against 325 MHz). But GeForce 6800 accesses memory 256 bits at a time while GeForce 6600 GT accesses memory 128 bits at a time, and also GeForce 6800 processes 12 pixels per clock tick, while GeForce 6600 GT processes eight pixels per clock.

The right way to compare the memory performance of different chips is by their memory transfer rate, which is calculated using the formula (clock x bits per clock) / 8.

Another difference is the graphics processor of the FX 5900 series, which processes eight pixels per clock pulse, while the FX 5700 series processes only four. In other words, despite its higher clock, the graphics processing performance of the GeForce FX 5700 Ultra is lower than that of the FX 5900 series chips, which process twice as many pixels at the same clock (simply put, the GeForce FX 5700 Ultra would have to run at roughly twice its clock to match the GeForce FX 5900 Ultra).
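
The same reasoning can be applied to the core clock: keep the number of pixels processed per second constant and solve for the clock. This is a rough sketch using table values, with a hypothetical helper name, and it ignores every factor other than clock and pixels per clock.

```python
def matching_core_clock_mhz(target_core_mhz: float, target_pixels: int, own_pixels: int) -> float:
    """Core clock a chip needs to process as many pixels per second as the target."""
    return target_core_mhz * target_pixels / own_pixels


# GeForce FX 5900 Ultra: 450 MHz, 8 pixels per clock.
# GeForce FX 5700 Ultra: 475 MHz, 4 pixels per clock.
needed = matching_core_clock_mhz(450, 8, 4)
print(needed)        # clock the FX 5700 Ultra would need, in MHz
print(needed / 475)  # ratio to its actual core clock
```

The required clock is 900 MHz, about 1.9 times the 475 MHz the GeForce FX 5700 Ultra actually runs at.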

Therefore, it is not correct to compare graphic chips only through their clocks.

We must be careful with the GeForce FX 5900 XT, too. While ATI uses the letters “XT” to indicate high-end chips (for example, the Radeon 9800 XT), nVidia uses the same letters to indicate the low-end chips of a series (see table).

You have to be very careful with low-end video cards using nVidia chips, because their clock rates and memory interfaces can differ from those in the table. For example, you can find the GeForce FX 5200, GeForce FX 5500 and GeForce 6600 with a 64-bit or 128-bit interface. We’ve even seen the GeForce FX 5200, GeForce FX 5500 and GeForce 6200 with a 32-bit interface on the market!

November 9, 2008 at 8:25 am

