
TODAY Intel is celebrating the 40th birthday of the world’s first microprocessor, the Intel 4004.
The proliferation of microprocessors is due in large part to Intel’s relentless pursuit of Moore’s Law, a forecast for the pace of silicon technology development that states that roughly every two years semiconductor transistor density doubles, while increasing functionality and performance and decreasing costs.
Compared to the Intel 4004, today’s second-generation Intel Core processors have more than 350,000 times the performance and each transistor uses about 5,000 times less energy. In this same time period, the price of a transistor has dropped by a factor of about 50,000.
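As a rough illustration of the doubling rule described above, the sketch below projects how far a quantity grows if it doubles every two years. The function name `moores_law_factor` and the fixed two-year period are assumptions made for this example, not anything published by Intel, and real chips do not follow the idealized curve exactly.

```python
def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Idealized growth factor after `years`, assuming one doubling per period.

    Hypothetical helper for illustration only; the two-year default mirrors
    the "roughly every two years" wording in the text.
    """
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Print the idealized growth factor over a few time spans.
    for years in (2, 10, 20, 40):
        print(f"{years:>2} years: about {moores_law_factor(years):,.0f}x")
```

Over the 40 years since the 4004, that is 20 doublings, or an idealized factor of roughly one million.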

Intel 8008 and 8080
Introduced in 1972, the 8008 was basically just an 8-bit version of the 4004, and it was actually slower: 500kHz, as opposed to 740kHz. It was built using the same PMOS process and used in much the same applications, such as calculators and early microcomputers. The 8008 cost $120 in 1972, some $500 in today’s money.
Far more interesting than the 8008, however, was the 8080. Thanks to faster NMOS logic, the 8080 was clocked at 2MHz and was thus capable of performing hundreds of thousands of calculations per second. The 8080 formed the basis of the Altair 8800, one of the first build-it-yourself PCs, and was also used in early cruise missiles.

Intel 8086 and 8088
The 8086 was Intel’s first 16-bit processor, and from its name we get the x86 instruction set, one that modern CPUs are still compatible with more than 30 years later. Intel released the 8086 in 1978, hoping that it would draw attention away from the Zilog Z80, which had been designed by a breakaway group of ex-Intel radicals and was commercially very successful.
The 8086 itself was never a huge success, but its cheaper successor, the 8088, which had an 8-bit data bus instead of a 16-bit one, would find a home in the IBM PC, and then in hundreds of clones and build-it-yourself kits.

80286
You may have noticed that the history of computer processors generally jumps from 8086 to 80286 — but in actual fact, there was an 80186. Sadly, though, thanks to the addition of DMA and interrupt controllers, it just wasn’t compatible with the original IBM PCs and their clones. By the time the PC AT rolled around, however, the 80286 was available — and boy was the 286 fast. The 8086, at 10MHz, was capable of 0.75 MIPS — millions of instructions per second; the 80286 at 12.5MHz could churn through no less than 2.66 MIPS.
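To separate the effect of clock speed from the effect of the design itself, the figures above can be normalized to MIPS per MHz. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted in this article; the helper name `mips_per_mhz` is an assumption for the example, and MIPS is a crude measure that varies with the workload.

```python
def mips_per_mhz(mips: float, clock_mhz: float) -> float:
    """Work done per clock cycle, expressed as MIPS per MHz (hypothetical helper)."""
    return mips / clock_mhz


# Figures quoted in the article: 8086 at 10MHz, 80286 at 12.5MHz.
i8086 = mips_per_mhz(0.75, 10.0)
i80286 = mips_per_mhz(2.66, 12.5)
per_clock_gain = i80286 / i8086

if __name__ == "__main__":
    print(f"8086:  {i8086:.3f} MIPS per MHz")
    print(f"80286: {i80286:.3f} MIPS per MHz")
    print(f"Per-clock gain: about {per_clock_gain:.1f}x")
```

On these numbers the 286 does roughly 2.8 times as much work per clock cycle, so most of its roughly 3.5x overall advantage comes from the design rather than from the modest clock bump.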

80386 and 486, DX and SX
Originally launched in 1985, three years after the 286, the 386DX doubled the transistor count and quadrupled MIPS performance. Most importantly, it was Intel’s first commercially successful 32-bit processor. It was expensive, though, which led Intel to release the 386SX, with a half-width (16-bit) external data bus, in 1988, and this is the chip that ended up being very popular in home and office computers.
Linus Torvalds famously created Linux on an 80386 computer. He had ordered a copy of the 16-bit Minix operating system, but it didn’t play well with the 386’s 32-bit internals, and so, in true hacker fashion, Torvalds wrote his own Unix-like OS.
The 486DX, which debuted in 1989, was basically a supercharged 386 — a 25MHz 486 was around the same speed as a 50MHz 386 — but it was also the first Intel chip to feature an on-die (8KB!) cache. The first 486 was capable of 20 MIPS, but by the time the 100MHz 486-DX4 arrived in 1994, 70 MIPS had been broken. The 486DX, incidentally, was the first Intel chip to cross the 1,000,000 transistor threshold — it also contained a maths co-processor, or FPU as it’s now known; the SX, to reduce its price, lacked this feature.

Pentium, MMX, and the P5 microarchitecture
Building off the 80486, Intel produced its first ever superscalar (instruction-level parallelism) chip in 1993: the Pentium 60 and 66, based on the P5 microarchitecture. The first Pentiums were not significantly faster than the 486-DX4s that they replaced (100 MIPS vs. 70), but by 1994, with Pentiums arriving at 75 and 100MHz, the 486 began its slow march into basements everywhere. The original Pentium eventually topped out at 200MHz in 1996.
With a bump up to 32KB of L1 cache, the addition of 57 “multimedia instructions,” and another million transistors, Intel released Pentium MMX in 1997. It initially entered at the same speed as the last of the original Pentiums, 200MHz, but quickly scaled up to 300MHz. AMD copied and improved upon the MMX concept a couple of years later with 3DNow! on the wunderkind K6-2 processor, and then Intel one-upped that with SSE in the Pentium III. We’re now up to SSE4.2 in Core microarchitecture processors.

Pentium Pro, II, Celeron, and other P6 microarch chips
Intel followed up on P5 with P6, also known as i686, and created the Pentium Pro in 1995. The Pro was an odd beast that simply missed the mark — it was awesome at 32-bit computation, but was completely unchanged from the P5 when it came to 16-bit operations. At the time, almost every PC was running 16-bit MS-DOS and Windows, and so the Pro — which was very expensive — simply didn’t fly.
The Pro did introduce a (large, 1MB!) on-package L2 cache, however, sitting not on the die but right next door. This, amongst other features, made it a very popular chip for server workloads. Windows NT was already 32-bit by this point and so could take full advantage of the P6 architecture.

Pentium II
While the Pro went mostly unnoticed by consumers, the Pentium II, released in 1997 after the last Pentium MMX chip, was sensational. With 32KB of L1 cache, 512KB of L2, a funky slot (as opposed to socket) package, speeds of up to 450MHz, and the first use of a 100MHz front-side bus (FSB), it really did scream. It packed a total of 7.5 million transistors (0.25 micron, 250nm), and when combined with the 440LX Balboa chipset, the P2 was the first chip to support the new SDRAM standard and AGP.
Finally, in response to the growing pressure from AMD K6 and Cyrix 6×86 on the low-end of the market, Intel stripped all of the L2 cache from the P2 and released the Celeron. Celeron — or Celery as it was affectionately known, as it had all of the calories but none of the cache, or perhaps because it was sometimes so horrendously slow that users felt it must be made from watery vegetables — was a very fun, very cheap CPU.

Pentium III
Introduced in 1999, the Pentium III was effectively a faster P2, with the addition of SSE instructions and some tweaks that allowed for higher clock speeds. In just two years Intel managed to shrink its process twice, from 250nm (Katmai) to 180nm (Coppermine) and finally to 130nm (Tualatin), with the latter being by far the fastest and most overclockable chip of its time. Coppermine and Tualatin finally put L2 cache back on the die, too, a feature that had been missing since the P2. Coppermine was also the last Intel chip that was distributed without an integrated heat-spreader. How many CPUs did you crack while installing a heatsink?
The Tualatin core turned out to be so good that it would form the basis of the Pentium M (Banias) chip in 2003, despite the emergence of Pentium 4 in 2000 — but more on that in a moment.

Pentium 4
Released in 2000, the Pentium 4 NetBurst architecture sounded great on paper, with a front-side bus of 400MHz, 42 million transistors, and speeds of up to 1.5GHz, but in practice Intel’s first new microarch since the P6’s release in 1995 stank. NetBurst was designed first and foremost around attaining high clock speeds, but to do this it required some sacrifices, including a very long pipeline. A few revisions were meant to help matters, including the infamous 90nm Prescott core, but the chip simply got hotter. Ultimately, NetBurst just didn’t deliver enough processing performance for the power it consumed and the heat it produced.
NetBurst did provide one important feature that was kept by the Core microarchitecture, though: Hyper-Threading. HT first emerged in a 2002 Pentium 4 Xeon, and then the consumer-targeted 130nm Northwood P4 later that year, and now it can be found in all of Intel’s Core i3/5/7 processors. Prescott also ushered in the land grid array, and a bent pin hasn’t been seen since.

Core 2
After more than 11 years of dominance, the aging P6 finally made way for the Core architecture in June 2006 with the emergence of the 65nm Woodcrest Xeon and Conroe Core 2 Duo. The Core architecture is mainly significant for putting the ball back in Intel’s court after a few years of AMD Athlon 64 supremacy, and it did this through low power consumption and multiple cores on a single die. The Core architecture, in effect, reunified the mobile line of processors (stemming from P3) and the desktop and server parts (P4) — and yes, Core finally brought 64-bit computing to Intel’s consumer lineup.
Following its now-regular tick-tock process, Conroe was followed by a die shrink to 45nm (Penryn), and then a shift to the Nehalem microarch. Nehalem was die shrunk to 32nm with Westmere (Gulftown), and now we have Sandy Bridge (and Sandy Bridge-E!) — the architecture behind the latest Core i3, 5, and 7 processors. Depending on the amount of cache, we are now looking at CPUs with between 200 and 800 million transistors — a far cry from the 2,300 found in the 4004 CPU some 40 years ago.
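As a quick sanity check on those transistor counts, the sketch below works out the doubling period implied by going from 2,300 transistors to between 200 and 800 million over 40 years. The helper name `implied_doubling_period` is an assumption for this example, and it compares raw transistor counts rather than density, so it is only a loose comparison with Moore’s Law.

```python
import math


def implied_doubling_period(start_count: float, end_count: float, years: float) -> float:
    """Years per doubling implied by growth from start_count to end_count.

    Hypothetical helper for illustration; assumes smooth exponential growth.
    """
    doublings = math.log2(end_count / start_count)
    return years / doublings


if __name__ == "__main__":
    # 2,300 transistors in the 4004; 200 to 800 million in current chips.
    for end_count in (200e6, 800e6):
        period = implied_doubling_period(2_300, end_count, 40)
        print(f"{end_count:,.0f} transistors: one doubling every {period:.1f} years")
```

Both ends of the range land a little above two years per doubling, which is in line with the “roughly every two years” forecast.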
Source: http://www.extremetech.com