Core i7

Nehalem: An Overview
It’s hard to talk about an overview of an architecture like Nehalem, which is fundamentally designed to be modular. The Intel engineers wanted to design a set of building blocks that can be assembled like Legos to create the various versions of the architecture.


It is possible, though, to take a look at the flagship of the new architecture—the very high-end version that will be used in servers and high-performance workstations. At first glance, the specs will likely remind you of the Barcelona (K10) architecture from AMD. It is natively quad-core and has three levels of cache, a built-in memory controller, and a high-performance system of point-to-point interconnections for communicating with peripherals and other CPUs in multiprocessor configurations. This suggests that AMD's technological choices weren't bad in themselves; the problem lay in an implementation that hasn't scaled well.

But Intel has done more than just revise its architecture by taking inspiration from its competitor's innovations. With a budget of more than 700 million transistors (731 million, to be exact), the engineers were able to greatly improve certain characteristics of the execution core while adding new functionality. For example, simultaneous multi-threading (SMT), which had already appeared with the Pentium 4 "Northwood" under the name Hyper-Threading, has made its comeback. Associated with four physical cores, certain versions of Nehalem that incorporate two dies in a single package will be capable of executing up to 16 threads simultaneously. While this change appears simple at first glance, as we'll see later on, it has a wide impact at several levels of the pipeline; many buffers need to be resized so that this mode of operation doesn't impact performance. As has been the case with each new architecture for several years now, Intel has also added new SSE instructions to Nehalem. The architecture supports SSE 4.2, components of which appear to be borrowed from AMD's K10 micro-architecture.
Now that you know the broad outlines of the new architecture, it’s time to take a more detailed look, starting with the front end of the pipeline—the part that’s in charge of reading instructions in memory and preparing them for execution.


Reading And Decoding Instructions
Unlike the changes made in moving from Core to Core 2, Intel hasn’t done much to Nehalem’s front end. It has the same four decoders that made their appearance with the Conroe—three simple and one complex. It is still capable of macro-ops fusion and so offers a theoretical maximum throughput of 4+1 x86 instructions per cycle.

Though there are no revolutionary changes at first glance, the devil is in the details. As we noted in our article on the Barcelona architecture, increasing the number of processing units is an extremely inefficient way to boost performance. The cost is high and the gains shrink more and more with each addition, following the law of diminishing returns. So instead of adding a new decoder, the engineers concentrated on making the existing ones more efficient.
First, they added support for macro-ops fusion in 64-bit mode, which is justified for an architecture like Nehalem that makes no attempt to hide its ambitions in the server market segment. But the engineers didn’t stop there. Where the Conroe architecture could fuse only a limited number of instructions, the Nehalem architecture supports a greater number of variations, making it possible to use macro-ops fusion more frequently.
Another new feature introduced by the Conroe has also been improved: the Loop Stream Detector. Behind this name lies what is in fact a data buffer that holds a few instructions (18 x86 instructions on Core 2s). When the processor detects a loop, it disables certain parts of the pipeline. Since a loop consists of executing the same instructions a given number of times, it’s not necessary to perform branch prediction or to recover the instruction from the L1 cache at each iteration of the loop. So the Loop Stream Detector acts as a small cache memory that short-circuits the first stages of the pipeline in such situations. The gains made via this technique are twofold: it decreases power consumption by avoiding useless tasks and it improves performance by reducing the pressure on the L1 instruction cache.
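As a rough illustration of the idea (a toy Python sketch with invented names, not Intel's implementation), the Loop Stream Detector amounts to a small buffer that captures a loop body and serves it back, letting the fetch and branch-prediction stages idle:

```python
# Toy Loop Stream Detector: when a short backward-branching loop is
# detected, cache its body and serve it from the buffer instead of
# re-fetching and re-predicting on every iteration.

class LoopStreamDetector:
    def __init__(self, capacity=28):      # Nehalem holds 28 entries
        self.capacity = capacity
        self.buffer = []
        self.active = False

    def observe(self, loop_body):
        """Called when a backward branch re-enters the same body."""
        if len(loop_body) <= self.capacity:
            self.buffer = list(loop_body)
            self.active = True            # fetch/predict stages can idle
        else:
            self.active = False           # loop too big: normal pipeline

    def fetch(self):
        """Serve the loop body from the buffer while the loop runs."""
        return self.buffer if self.active else None

lsd = LoopStreamDetector()
lsd.observe(["mov", "add", "cmp", "jne"])     # 4-instruction loop fits
assert lsd.fetch() == ["mov", "add", "cmp", "jne"]
```

On Conroe the buffered entries are x86 instructions; on Nehalem, as described below, they are decoded µops, so an even larger slice of the pipeline can sleep.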

With the Nehalem architecture, Intel has improved the functionality of the Loop Stream Detector. First of all the buffer is larger—it can now store 28 instructions. But what’s more, its position in the pipeline has changed. In Conroe, it was located just after the instruction fetch phase. It’s now located after the decoders; this new position allows a larger part of the pipeline to be disabled. The Nehalem’s Loop Stream Detector no longer stores x86 instructions, but rather µops. In this sense, it’s similar to the Pentium 4’s trace cache concept. It’s no surprise to find certain innovations ushered in by that architecture in the Nehalem, given that the Hillsboro team now in charge of the Nehalem was responsible for the Pentium 4 project. However, where the Pentium 4 used the trace cache exclusively, since it could only count on one decoder in case of a data cache miss, Nehalem has the benefit of the power of its four decoders, while the Loop Stream Detector is only an additional optimization for certain situations. In a way, it’s the best of both worlds.
Branch Predictors
The last improvement to the front end has to do with the branch predictors. The efficiency of branch prediction algorithms becomes crucial in architectures that need high levels of instruction parallelism. A branch breaks the parallelism because it necessitates waiting for the result of a preceding instruction before execution of the flow of instructions can be continued. Branch prediction determines whether or not a branch will be taken, and if it is, quickly determines the target address for continuing execution. No complicated techniques are needed to do this; all that’s needed is an array of branches—the Branch Target Buffer (BTB)—that stores the results of the branches as execution progresses (Taken or Not Taken and target address) and an algorithm for determining the result of the next branch.
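To make the mechanism concrete, here is the textbook two-bit saturating-counter scheme, the classic baseline for the kind of history table a BTB entry feeds. Intel has not published Nehalem's actual algorithm; this only illustrates the principle:

```python
# Two-bit saturating-counter branch predictor: each branch address
# maps to a counter 0..3; values >= 2 predict "taken". One stray
# outcome doesn't flip a well-established prediction.

class TwoBitPredictor:
    def __init__(self):
        self.table = {}                       # branch address -> counter

    def predict(self, addr):
        return self.table.get(addr, 1) >= 2   # >= 2 means predict taken

    def update(self, addr, taken):
        c = self.table.get(addr, 1)
        self.table[addr] = min(c + 1, 3) if taken else max(c - 1, 0)

p = TwoBitPredictor()
for _ in range(4):                # a loop branch taken repeatedly...
    p.update(0x400, taken=True)
assert p.predict(0x400) is True   # ...is quickly learned
p.update(0x400, taken=False)      # one loop exit...
assert p.predict(0x400) is True   # ...doesn't unlearn the pattern
```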
Intel hasn’t provided details on the algorithm used for their new predictors, but it is known that they are now two-level predictors. The first level is unchanged from the Conroe architecture, but a new level with slower access that can store more branch history has been added. According to Intel, this configuration improves branch prediction for certain applications that use large volumes of code, such as databases—more evidence of Nehalem’s server orientation. Another improvement is to the Return Stack Buffer, which stores the return addresses of functions when they’re called. In certain cases this buffer can overflow, which can lead to faulty predictions. To limit that possibility, AMD increased its size to 24 entries, whereas with Nehalem Intel has introduced a renaming system for this buffer.

The Return Of Hyper-Threading
So, the front end hasn’t been profoundly overhauled; neither has the back end. It has exactly the same execution units as the most recent Core processors, but here again the engineers have worked on using them more efficiently.

With Nehalem, Hyper-Threading makes its great comeback. Introduced with the Northwood version of Intel’s NetBurst architecture, Hyper-Threading—also known outside the world of Intel as Simultaneous Multi-Threading (SMT)—is a means of exploiting thread parallelism to improve the use of a core’s execution units, making the core appear to be two cores at the application level.
In order to use parallel threads, certain resources—such as registers—must be duplicated. Other resources are shared by the two threads, and that includes all the out-of-order execution logic (the instruction reorder buffer, the execution units, and cache memory). A simple observation led to the introduction of SMT: the “wider” (meaning more execution units) and “deeper” (meaning more pipeline stages) processors become, the harder it is to extract enough parallelism to use all the execution units at each cycle. Where the Pentium 4 was very deep, with a pipeline having more than 20 stages, Nehalem is very wide. It has six execution units capable of executing three memory operations and three calculation operations. If the execution engine can’t find sufficient parallelism of instructions to take advantage of them all, “bubbles”—lost cycles—occur in the pipeline.

To remedy that situation, SMT looks for instruction parallelism in two threads instead of just one, with the goal of leaving as few units unused as possible. This approach can be extremely effective when the two threads are executing tasks that are highly separate. On the other hand, two threads involving intensive calculation, for example, will only increase the pressure on the same calculating units, putting them in competition with each other for access to the cache. It goes without saying that SMT is of no interest in this type of situation, and can even negatively impact performance.
SMT Implementation

Still, the impact of SMT on performance is positive most of the time and the cost in terms of resources is still very limited, which explains why the technology is making a comeback. But programmers will have to pay attention because with Nehalem, all threads are not created equal. To help solve this puzzle, Intel provides a way of precisely determining the exact topology of the processor (the number of physical and logical processors), and programmers can then use the operating system affinity mechanism to assign each thread to a processor. This kind of thing shouldn’t be a problem for game programmers, who are already in the habit of working that way because of the way the Xenon processor (the one used in the Xbox 360) works. But unlike consoles, where programmers have very low-level access, on a PC the operating system’s thread scheduler will always have the last word.
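The affinity mechanism mentioned above is exposed by the operating system rather than the CPU. As one example (Linux-specific, and assuming nothing about how your kernel numbers logical CPUs relative to physical cores), Python wraps it directly:

```python
import os

# On Linux, a process or thread can be pinned to specific logical CPUs.
# With SMT, two logical CPUs may share one physical core, so a
# topology-aware program would pin heavy threads to distinct cores.
cpus = os.sched_getaffinity(0)        # logical CPUs we may run on
print(f"{len(cpus)} logical CPUs available: {sorted(cpus)}")

first = min(cpus)
os.sched_setaffinity(0, {first})      # pin ourselves to one logical CPU
assert os.sched_getaffinity(0) == {first}

os.sched_setaffinity(0, cpus)         # restore the original mask
```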
Since SMT puts a heavier load on the out-of-order execution engine, Intel has increased the size of certain internal buffers to avoid turning them into bottlenecks. So the reorder buffer, which keeps track of all the instructions being executed in order to reorder them, has increased from 96 entries on the Core 2 to 128 entries on Nehalem. In practice, since this buffer is partitioned statically to keep any one thread from monopolizing all the resources, its size is reduced to 64 entries for each thread with SMT. Obviously, in cases where a single thread is executed, it has access to all the entries, which should mean that there won’t be any specific situations where Nehalem turns out to have worse performance than its predecessor.
The reservation station, which is the unit in charge of assigning instructions to the different execution units, has also increased in size: from 32 to 36 entries. But unlike the reorder buffer, here partitioning is dynamic, so that a thread can take up more or fewer entries as a function of its needs.
Two other buffers have also been resized: the load buffer and the store buffer. The former has 48 entries as opposed to 32 with Conroe, and the latter 32 instead of 20. Here too, partitioning between threads is static.
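The static split is simple arithmetic; a quick sketch using the entry counts quoted above (the 36-entry reservation station is the exception—it is shared dynamically, so no fixed split applies there):

```python
# Static partitioning of SMT-shared buffers: with two threads active,
# each gets a fixed half; running alone, a thread gets everything.

def static_share(total_entries, active_threads):
    return total_entries // active_threads

assert static_share(128, 1) == 128   # single thread: the whole ROB
assert static_share(128, 2) == 64    # SMT: 64 ROB entries per thread
assert static_share(48, 2) == 24     # load buffer entries per thread
assert static_share(32, 2) == 16     # store buffer entries per thread
```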
Another consequence of the return of SMT is that the performance of thread synchronization instructions has improved, according to Intel.
SSE 4.2 And Power Consumption
With the Nehalem architecture, Intel couldn’t resist adding some new items to the already long list of SSE instructions. Nehalem supports SSE 4.2, which has all the instructions supported by Penryn (SSE4.1) and adds seven more. Most of these new instructions are for manipulating character strings, one purpose of which, Intel says, is to speed up the processing of XML files.

The other two instructions are aimed at specific applications. One is POPCNT, which appeared with Barcelona and is used to count the number of non-zero bits in a register. According to Intel, this instruction is especially useful in voice recognition and DNA sequencing. The last instruction, CRC32, accelerates the calculation of error detection codes.
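What those two instructions compute can be shown in plain Python. One caveat: the hardware CRC32 instruction uses the CRC-32C (Castagnoli) polynomial, whereas Python's `zlib.crc32` uses the older IEEE polynomial, so the numeric results differ even though the operation is the same kind of checksum:

```python
import zlib

# POPCNT: count the set bits in a value.
x = 0b1011_0010
assert bin(x).count("1") == 4      # what POPCNT would return for x
# (Python 3.10+ also offers x.bit_count() for the same operation.)

# CRC32: a cyclic-redundancy checksum over a byte string.
assert zlib.crc32(b"") == 0        # empty input checksums to zero
print(hex(zlib.crc32(b"error detection")))
```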
Power Consumption Under Control
Over and over Intel says that for any potential innovation in one of its new architectures, the engineers weigh performance gains against the impact on power consumption—proof that they’ve learned well from the Pentium 4. With the Nehalem architecture, the engineers have gone even further with techniques for limiting consumption. There’s now a built-in microcontroller—the Power Control Unit—that constantly watches the temperature and power use of the cores and can disable them completely when they’re not being used. Thanks to this technology, the energy consumption of an unused core is next to zero, whereas before Nehalem there were still losses due to leakage currents.

Intel has implemented this in an original way with what it calls a Turbo mode. When the processor is operating below its standard TDP, for example, the Turbo mode increases the frequency of the cores being used, while still keeping within the TDP limit.

Note also that like the Atom processor, Nehalem’s L1 and L2 caches use eight transistors instead of the usual six, which reduces consumption at the cost of a slightly larger die surface.


QuickPath Interconnect
While the Core architecture was remarkably efficient, certain design details had begun to show their age, first among them the Front Side Bus (FSB) system. This bus connecting the processors to the northbridge was a real anachronism in an otherwise highly modern architecture. The major fault was most noticeable in multiprocessor configurations, where the architecture had a hard time keeping up with increasing loads. The processors had to share this bus not only for access to memory, but also for ensuring the coherence of the data contained in their respective cache memories.
In this kind of situation, the influx of transactions to the bus can lead to its saturation. For a long time, Intel simply worked around the problem by using a faster bus or larger cache memories, but it was high time for them to address the underlying cause by completely overhauling the way its processors communicate with memory and outside components.


The solution Intel chose—called QuickPath Interconnect (QPI)—is nothing new: an integrated memory controller paired with an extremely fast point-to-point serial bus. The technology was introduced five years ago in AMD processors, but in reality it’s even older than that. These concepts, showing up in AMD and now Intel products, are in fact the result of work done ten years ago by the engineers at DEC during the design of the Alpha 21364 (EV7). Since many former DEC engineers ended up in Santa Clara, it’s not surprising to see these concepts surfacing in the latest Intel architecture.
From a technical point of view, a QPI link is bidirectional and has two 20-bit links—one in each direction—of which 16 are reserved for data; the four others are used for error detection codes or protocol functions. This works out to a maximum of 6.4 GT/s (billion transfers per second), or a usable bandwidth of 12.8 GB/s in each direction. Just for comparison, the FSB on the most recent Intel processors operates at a maximum clock frequency of 400 MHz; address transfers need two clock cycles (200 MT/s), whereas data transfers operate in QDR mode, for a transfer rate of 1.6 GT/s. With its 64-bit width, the FSB also has a total bandwidth of 12.8 GB/s, but that bandwidth is shared between reads and writes.
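The bandwidth figures follow directly from the link parameters; the arithmetic checks out like this:

```python
# QPI: 6.4 GT/s over 16 data lanes per direction.
transfers_per_s = 6.4e9            # 6.4 billion transfers per second
data_bits = 16                     # 16 of the 20 lanes carry data
bandwidth = transfers_per_s * (data_bits / 8)
assert bandwidth == 12.8e9         # 12.8 GB/s per direction

# FSB for comparison: 64 bits wide at 1.6 GT/s (QDR on a 400 MHz clock),
# but shared between reads and writes.
fsb = 1.6e9 * (64 / 8)
assert fsb == 12.8e9               # same total, half per direction at best
```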
So a QPI link has a theoretical bandwidth that’s up to twice as high, provided reads and writes are well balanced. In a theoretical case consisting of reads only or writes only, the bandwidth would be identical to that of the FSB. However, you have to keep in mind that the FSB was used both for memory access and for all transfers of data to peripherals or between processors. With Nehalem, a QPI link will be exclusively dedicated to transfers of data to peripherals, with memory transfers handled by the integrated controller and inter-CPU communications in multi-socket configurations by another QPI link. Even in the worst cases, a QPI link should show significantly better performance than the FSB.

As we’ve seen, Nehalem was designed to be a flexible, scalable architecture, and so the number of QPI links available will vary with the market segment being aimed at—from one link to the chipset for single-socket configurations to as many as four in four-socket server configurations. This enables true fully-connected four-processor systems, meaning that each processor can access any position in memory with a maximum of a single QPI link hop since each CPU is connected directly to the three others.
Memory Subsystem
An Integrated Memory Controller
Intel has taken its time catching up to AMD on this point. But as is often the case, when the giant does something, he takes a giant step. Where Barcelona had two 64-bit memory controllers supporting DDR2, Intel’s top-of-the-line configuration will include three DDR3 memory controllers. Hooked up to DDR3-1333, which Nehalem will also support, that adds up to a bandwidth of 32 GB/s in certain configurations. But the advantage of an integrated memory controller isn’t just a matter of bandwidth. It also substantially lowers memory access latency, which is just as important, considering that each access costs several hundred cycles. Though the latency reduction achieved by an integrated memory controller will be appreciable in the context of desktop use, it is multi-socket server configurations that will get the full benefit of the more scalable architecture. Previously, bandwidth remained constant as CPUs were added; now each new CPU increases the total bandwidth, since each processor has its own local memory space.
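The 32 GB/s figure can be reconstructed from the channel parameters:

```python
# Three DDR3-1333 channels: where the ~32 GB/s figure comes from.
transfer_rate = 1333e6          # ~1333 MT/s per channel
channel_width = 8               # 64-bit channel = 8 bytes per transfer
channels = 3

bandwidth_gb_s = transfer_rate * channel_width * channels / 1e9
assert round(bandwidth_gb_s) == 32    # about 32 GB/s aggregate
```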

Obviously this is not a miracle solution. This is a Non Uniform Memory Access (NUMA) configuration, which means that memory accesses can be more or less costly, depending on where the data is in memory. An access to local memory obviously has the lowest latency and the highest bandwidth; conversely, an access to remote memory requires a transit via the QPI link, which reduces performance.

The impact on performance is difficult to predict, since it’ll be dependent on the application and the operating system. Intel says the performance hit for remote access is around 70% in terms of latency, and that bandwidth can be reduced by half compared to local access. According to Intel, even with remote access via the QPI link, latency will still be lower than on earlier processors where the memory controller was on the northbridge. However, those considerations only apply to server applications, and for a long time now they’ve already been designed with the specifics of NUMA configurations in mind.

A Three-Level Cache Hierarchy
The memory hierarchy of Conroe was extremely simple and Intel was able to concentrate on the performance of the shared L2 cache, which was the best solution for an architecture that was aimed mostly at dual-core implementations. But with Nehalem, the engineers started from scratch and came to the same conclusions as their competitors: a shared L2 cache was not suited to a native quad-core architecture. One core could too frequently evict data needed by another, and providing all four cores with sufficient bandwidth while keeping latency low would surely have raised too many problems with internal buses and arbitration. To solve the problem, the engineers provided each core with a Level 2 cache of its own. Since it’s dedicated to a single core and relatively small (256 KB), the engineers were able to endow it with very high performance; latency, in particular, has reportedly improved significantly over Penryn—from 15 cycles to approximately 10 cycles.

Then comes an enormous Level 3 cache memory (8 MB) for managing communications between cores. While at first glance Nehalem’s cache hierarchy reminds one of Barcelona, the operation of the Level 3 cache is very different from AMD’s—it’s inclusive of all lower levels of the cache hierarchy. That means that if a core tries to access a data item and it’s not present in the Level 3 cache, there’s no need to look in the other cores’ private caches—the data item won’t be there either. Conversely, if the data are present, four bits associated with each line of the cache memory (one bit per core) show whether or not the data are potentially present (potentially, but not with certainty) in the lower-level cache of another core, and which one.

This technique is effective for ensuring the coherency of the private caches because it limits the need for exchanges between cores. It has the disadvantage of wasting part of the cache memory with data that is already in other cache levels. That’s somewhat mitigated, however, by the fact that the L1 and L2 caches are relatively small compared to the L3 cache—all the data in the L1 and L2 caches takes up a maximum of 1.25 MB out of the 8 MB available. As on Barcelona, the Level 3 cache doesn’t operate at the same frequency as the rest of the chip. Consequently, latency of access to this level is variable, but it should be in the neighborhood of 40 cycles.
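The "core valid" bookkeeping described above can be sketched in a few lines (a simplified model, not Intel's actual protocol; in the real hardware a set bit only means the core *might* still hold the line):

```python
# Inclusive-L3 coherency shortcut: each L3 line carries one bit per
# core. An L3 miss proves no private cache holds the line; an L3 hit
# tells us exactly which private caches might need to be snooped.

class L3Line:
    def __init__(self):
        self.core_valid = [False] * 4        # one bit per core

l3 = {}                                       # address -> L3Line

def core_fill(addr, core):
    """Core pulls a line into its private L1/L2 (and, inclusively, L3)."""
    line = l3.setdefault(addr, L3Line())
    line.core_valid[core] = True

def snoop_targets(addr):
    """Which cores must be snooped on a coherency lookup?"""
    line = l3.get(addr)
    if line is None:
        return []                             # L3 miss: nobody has it
    return [c for c, v in enumerate(line.core_valid) if v]

core_fill(0x1000, core=2)
assert snoop_targets(0x1000) == [2]           # only core 2 might hold it
assert snoop_targets(0x2000) == []            # inclusive: no snoop needed
```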
The only real disappointment with Nehalem’s new cache hierarchy is its L1 cache. The bandwidth of the instruction cache hasn’t been increased—it’s still 16 bytes per cycle compared to 32 on Barcelona. This could create a bottleneck in a server-oriented architecture since 64-bit instructions are larger than 32-bit ones, especially since Nehalem has one more decoder than Barcelona, which puts that much more pressure on the cache. As for the data cache, its latency has increased to four cycles compared to three on the Conroe, facilitating higher clock frequencies. To end on a positive note, though, the engineers at Intel have increased the number of Level 1 data cache misses that the architecture can process in parallel.
TLB
For many years now, processors have been working not with physical memory addresses, but with virtual addresses. Among other advantages, this approach lets more memory be allocated to a program than the computer actually has, keeping only the data necessary at a given moment in actual physical memory with the rest remaining on the hard disk. This means that for each memory access a virtual address has to be translated into a physical address, and to do that an enormous table is put in charge of keeping track of the correspondences. The problem is that this table gets so large that it can’t be stored on-chip—it’s placed in main memory, and can even be paged (part of the table can be absent from memory and itself kept on the hard disk).
If this translation stage were necessary at each memory access, it would make access much too slow. So engineers fell back on the familiar caching principle, adding a small cache memory directly on the processor that stores the correspondences for a few recently accessed addresses. This cache memory is called a Translation Lookaside Buffer (TLB). Intel has completely revamped the operation of the TLB in the new architecture. Up until now, the Core 2 has used a level 1 TLB that is extremely small (16 entries) but also very fast, for loads only, and a larger level 2 TLB (256 entries) that handled loads missed in the level 1 TLB, as well as stores.
Nehalem now has a true two-level TLB: the first level of TLB is shared between data and instructions. The level 1 data TLB now stores 64 entries for small pages (4K) or 32 for large pages (2M/4M), while the level 1 instruction TLB stores 128 entries for small pages (the same as with Core 2) and seven for large pages. The second level is a unified cache that can store up to 512 entries and operates only with small pages. The purpose of this improvement is to increase the performance of applications that use large sets of data. As with the introduction of two-level branch predictors, this is further evidence of the architecture’s server orientation.
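A toy model makes the translate-and-cache flow clear (sizes and the flat page table here are illustrative only, far simpler than x86's multi-level tables):

```python
# Toy address translation with a TLB in front: a hit avoids the slow
# page walk entirely. Page size 4 KB, as for x86 "small pages".

PAGE = 4096
page_table = {0: 7, 1: 3}        # virtual page -> physical frame
tlb = {}                          # small cache of recent translations
walks = 0                         # count of slow page-table walks

def translate(vaddr):
    global walks
    vpage, offset = divmod(vaddr, PAGE)
    if vpage not in tlb:          # TLB miss: walk the page table
        walks += 1
        tlb[vpage] = page_table[vpage]
    return tlb[vpage] * PAGE + offset

a = translate(0x10)               # first access to page 0: a walk
b = translate(0x20)               # same page: served from the TLB
assert (a, b) == (7 * PAGE + 0x10, 7 * PAGE + 0x20)
assert walks == 1                 # only one walk for both accesses
```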
Let’s go back to SMT for a moment, since it also has an impact on the TLBs. The level 1 data TLB and the level 2 TLB are shared dynamically between the two threads. Conversely, the level 1 instruction TLB is statically shared for small pages, whereas the one dedicated to large pages is entirely replicated—this is understandable given its small size (seven entries per thread).
Memory Access And Prefetcher
Optimized Unaligned Memory Access
With the Core architecture, memory access was subject to several restrictions in terms of performance. The processor was optimized for access to memory addresses that were aligned on 64-byte boundaries—the size of one cache line. Not only was access slow for unaligned data, but execution of an unaligned load or store instruction was more costly than for aligned instructions, regardless of actual alignment of the data in memory. That’s because these instructions generated several µops for the decoders to handle, which reduced the throughput of this type of instruction. As a result, compilers avoided generating this type of instruction, by substituting sequences of instructions that were less costly.
Thus, memory reads that overlapped two cache lines took a performance hit of approximately 12 cycles, compared to 10 for writes. The Intel engineers have optimized these accesses to make them faster. First of all, there’s no performance penalty for using the unaligned versions of load/store instructions in cases where the data are aligned in memory. In other cases, Intel has optimized these accesses to reduce the performance hit compared to that of the Core architecture.
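Whether an access pays the split penalty is a pure alignment question, easy to state precisely:

```python
# An access is a "split" when it straddles a 64-byte cache-line
# boundary; on Core, such split loads cost roughly 12 extra cycles.

LINE = 64

def crosses_line(addr, size):
    return addr // LINE != (addr + size - 1) // LINE

assert not crosses_line(0x40, 8)    # 8-byte load at a line start: fine
assert not crosses_line(0x78, 8)    # 0x78..0x7F still fits in the line
assert crosses_line(0x7C, 8)        # 0x7C..0x83 straddles 0x80: split
```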
More Prefetchers Running More Efficiently
With the Conroe architecture, Intel was especially proud of its hardware prefetchers. As you know, a prefetcher is a mechanism that observes memory access patterns and tries to anticipate, several cycles in advance, which data will be needed. The point is to bring that data into the cache, where it will be more readily accessible to the processor, while making the best use of bandwidth by consuming it when the processor doesn’t need it.
This technique produced remarkable results with most desktop applications, but in the server world the result was often a loss of performance. There are many reasons for that inefficiency. First of all, memory accesses are often much less easy to predict with server applications. Database accesses, for example, aren’t linear—when an item of data is accessed in memory, the adjacent data won’t necessarily be called on next. That limits the prefetcher’s effectiveness. But the main problem was with memory bandwidth in multi-socket configurations. As we said earlier, there was already a bottleneck between processors, but in addition, the prefetchers added additional pressure at this level. When a microprocessor wasn’t accessing memory, the prefetchers kicked in to use bandwidth they assumed was available. They had no way of knowing at that precise point that the other processor might need the bandwidth. That meant the prefetchers could deprive a processor of bandwidth that was already at a premium in this kind of configuration. To solve the problem, Intel had no better solution to offer than to disable the prefetchers in these situations—hardly a satisfactory answer.
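The simplest hardware prefetch scheme, stride detection, shows both the strength and the weakness described above: it is excellent on linear access and useless on pointer-chasing database patterns. A minimal sketch (invented structure, not Intel's algorithm):

```python
# Minimal stride prefetcher: if successive addresses differ by a
# constant stride, guess the next one. A real prefetcher must also
# ask whether the bus is actually idle before issuing the fetch --
# exactly the check the pre-Nehalem prefetchers lacked.

class StridePrefetcher:
    def __init__(self):
        self.last = None
        self.stride = None

    def access(self, addr):
        """Observe an access; return a predicted next address or None."""
        guess = None
        if self.last is not None:
            stride = addr - self.last
            if stride == self.stride and stride != 0:
                guess = addr + stride          # pattern confirmed
            self.stride = stride
        self.last = addr
        return guess

p = StridePrefetcher()
assert p.access(100) is None       # nothing known yet
assert p.access(164) is None       # stride 64 seen once: not confirmed
assert p.access(228) == 292        # stride confirmed: prefetch ahead
assert p.access(500) is None       # random jump: pattern broken
```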
Intel says the problem is solved now, but provides no details on the operation of the new prefetch algorithms; all it says is that it won’t be necessary to disable them in server configurations. But even if Intel hasn’t changed anything else, the gains stemming from the new memory organization and the resulting wider bandwidth should limit any negative impact of the prefetchers.
Conclusion
Conroe laid strong foundations and Nehalem builds on them. It has the same efficient architecture, but it’s now much more modular and scalable, which should guarantee success in the different market segments. We’re not saying that Nehalem revolutionizes the Core architecture, but it does revolutionize the Intel platform, which has once again become a match for AMD’s in terms of design, and surpasses it in terms of implementation.

With all the improvements made at this level (integrated memory controller, QPI), it’s not really surprising that the changes to the execution core are purely incremental. But the return of Hyper-Threading is significant and there are several little optimizations that should guarantee a notable gain in performance compared to Penryn at equal frequencies.
Clearly, the greatest gains will be in situations where memory was the main bottleneck. In reading this article, you’ve probably noticed that this is the area the engineers focused their attention on. In addition to the integrated memory controller, which will undoubtedly produce the biggest gains where memory access is concerned, there’s a raft of other improvements, both big and small: the new cache and TLB hierarchies, unaligned memory access, and prefetchers.
After all these theoretical considerations, the next step will be to see if the improvements in actual applications will live up to the expectations that the new architecture has aroused. We’ll look into that in a series of upcoming articles, so stay tuned!

Mobile Phone Secrets & Tricks



.:: NOKIA ::.

Nokia Universal Codes

Code Description :
These Nokia codes will work on most Nokia Mobile Phones

(1) *3370# Activate Enhanced Full Rate Codec (EFR) - Your phone uses the best sound quality but talk time is reduced by approx. 5%

(2) #3370# Deactivate Enhanced Full Rate Codec (EFR). ( Favourite )

(3) *#4720# Activate Half Rate Codec - Your phone uses a lower quality sound but you should gain approx 30% more Talk Time.

(4) *#4720# Deactivate Half Rate Codec.

(5) *#0000# Displays your phone's software version. 1st Line : Software Version, 2nd Line : Software Release Date, 3rd Line : Compression Type. ( Favourite )

(6) *#9999# Phone's software version if *#0000# does not work.

(7) *#06# For checking the International Mobile Equipment Identity (IMEI Number). ( Favourite )

(8) #pw+1234567890+1# Provider Lock Status. (use the "*" button to obtain the "p,w"
and "+" symbols).

(9) #pw+1234567890+2# Network Lock Status. (use the "*" button to obtain the "p,w"
and "+" symbols).

(10) #pw+1234567890+3# Country Lock Status. (use the "*" button to obtain the "p,w"
and "+" symbols).

(11) #pw+1234567890+4# SIM Card Lock Status. (use the "*" button to obtain the "p,w"
and "+" symbols).

(12) *#147# (Vodafone) Lets you know who called you last.

(13) *#1471# Last call (Vodafone only).

(14) *#21# Allows you to check the number that "All Calls" are diverted to.

(15) *#2640# Displays security code in use.

(16) *#30# Lets you see the private number.

(17) *#43# Allows you to check the "Call Waiting" status of your phone.

(18) *#61# Allows you to check the number that "On No Reply" calls are diverted to.

(19) *#62# Allows you to check the number that "Divert If Unreachable (no service)" calls
are diverted to.

(20) *#67# Allows you to check the number that "On Busy Calls" are diverted to.

(21) *#67705646# Removes operator logo on 3310 & 3330.

(22) *#73# Reset phone timers and game scores.

(23) *#746025625# Displays the SIM Clock status, if your phone supports this power saving feature "SIM Clock Stop Allowed", it means you will get the best standby time possible.

(24) *#7760# Manufacturer's code.

(25) *#7780# Restore factory settings.

(26) *#8110# Software version for the Nokia 8110.

(27) *#92702689# Displays - 1.Serial Number, 2.Date Made, 3.Purchase Date, 4.Date of last repair (0000 for no repairs), 5.Transfer User Data. To exit this mode you need to switch your phone off then on again. ( Favourite )

(28) *#94870345123456789# Deactivate the PWM-Mem.

(29) **21*number# Turn on "All Calls" diverting to the phone number entered.

(30) **61*number# Turn on "No Reply" diverting to the phone number entered.

(31) **67*number# Turn on "On Busy" diverting to the phone number entered.

(32) 12345 This is the default security code.

Press and hold #: Lets you switch between lines.

NOKIA 5110/5120/5130/5190

IMEI number: * # 0 6 #
Software version: * # 0 0 0 0 #
Simlock info: * # 9 2 7 0 2 6 8 9 #
Enhanced Full Rate: * 3 3 7 0 # [ # 3 3 7 0 # off]
Half Rate: * 4 7 2 0 #
Provider lock status: #pw+1234567890+1#
Network lock status: #pw+1234567890+2#
Country lock status: #pw+1234567890+3#
SimCard lock status: #pw+1234567890+4#

NOKIA 6110/6120/6130/6150/6190

IMEI number: * # 0 6 #
Software version: * # 0 0 0 0 #
Simlock info: * # 9 2 7 0 2 6 8 9 #
Enhanced Full Rate: * 3 3 7 0 # [ # 3 3 7 0 # off]
Half Rate: * 4 7 2 0 #

NOKIA 3110

IMEI number: * # 0 6 #
Software version: * # 0 0 0 0 # or * # 9 9 9 9 # or * # 3 1 1 0 #
Simlock info: * # 9 2 7 0 2 6 8 9 #

NOKIA 3330

*#06#
Displays the IMEI number
*#92702689#
This will show your warranty details
*3370#
Basically increases the quality of call sound, but decreases battery life.
#3370#
Deactivates the above
*#0000#
Shows your software version
*#746025625#
This shows if your phone will allow SIM clock stoppage
*4370#
Half Rate Codec activation. It will automatically restart
#4370#
Half Rate Codec deactivation. It will automatically restart
Restore Factory Settings
To do this simply use this code *#7780#
Manufacturer Info
Date of Manufacturing *#3283#
*3001#12345# (TDMA phones only)

1) Enter the code above. This will put your phone into programming mode, and you'll be presented with the programming menu.
2) Select "NAM1"
3) Select "PSID/RSID"
4) Select "P/RSID 1"
Note: Any of the P/RSIDs will work
5) Select "System Type" and set it to Private
6) Select "PSID/RSID" and set it to 1
7) Select "Connected System ID"
Note: Enter your System ID for Cantel, which is 16401 or 16423. If you don't know yours,
ask your local dealer for it.
8) Select "Alpha Tag"
9) Enter a new tag, then press OK
10) Select "Operator Code (SOC)" and set it to 2050
11) Select "Country Code" and set it to 302 for Canada, and 310 for the US.
12) Power down the phone and power it back on again
ISDN Code
To check the ISDN number on your Nokia use this code *#92772689#

.:: Ericsson ::.

Ericsson T65

*#05# Fake "Insert PUK" screen. Press NO to exit.

Ericsson T20

MENU Technical Info
[type] >*<<*<*

Displays:
1] Info service
   1] Info SW
   2] Info hardware
   3] SIMlock
   4] Setup

2] Service setup
   1] Contrast

3] Service Test
   1] Display
   2] LEDs
   3] Keyboard
   4] Ringer
   5] Vibration
   6] Headset
   7] Microphone
   8] Clock

4] Names List

MENU info
[Type] >*<<**<
Network and Subnetwork : NCK and NSCK

Ericsson T28

>*<<*<* menu Technical Info
SW vers. and name list
>*<<**< menu Personal Info
Network and Subnetwork : NCK and NSCK
< and > are the left and right menu keys

Ericsson T18s/T10/A1018s

>*<<*<* software
CXC125065 Internal product code
PRG
970715 1515 Software version and SW rev.
<* CLR <><**
>*<<*<*> Displays texts and messages in the phone
"TEXT" will be displayed, then push YES
< and > are the left and right menu keys
!!!><**
Control /Enable SIM Lock!!!

Ericsson R310

Technical Info : >*<<*<*
Options :
1) Service Info
   Info Software
   Simlock
   Configuration

2) Service Test
   Display
   Led/Illumination
   Keyboard
   Ringer
   Vibration
   Headset
   Microphone
   Clock

3) Text's Name List
Info personal : >*<<**<

SIM Locking (8-digit code) ( it could harm your phone )
1) Network
2) Subnetwork
3) SP
4) Corporate

.:: Siemens ::.

Siemens C25

IMEI number: * # 0 6 #
Software version: remove the SIM card, enter * # 0 6 # and press LONG KEY
Bonus screen: in the phone book: + 1 2 0 2 2 2 4 3 1 2 1

.:: Bosch ::.

IMEI Number: * # 0 6 #
Default Language: * # 0 0 0 0 #
Net Monitor: * # 3 2 6 2 2 5 5 * 8 3 7 8 #

.::Alcatel ::.

IMEI number: * # 0 6 #
Software version: * # 0 6 #
Net Monitor: 0 0 0 0 0 0 *

.:: Samsung ::.

Samsung SGH600/2100DB

IMEI number: * # 0 6 #
Software version: * # 9 9 9 9 # or * # 0 8 3 7 #
Net Monitor: * # 0 3 2 4 #
Changing LCD contrast: * # 0 5 2 3 #
Memory info: * # 0 3 7 7 # or * # 0 2 4 6 #
Memory reset (removes the SIMLOCK!!!): *2767*3855#
CUSTOM memory reset: *2767*2878#
Battery state: * # 9 9 9 8 * 2 2 8 #
Alarm beeper: *#9998*289#
Vibra test: *#9998*842#

.:: Dancall ::.

IMEI number: * # 0 6 #
Software version: * # 9 9 9 9 #

.:: Philips ::.

*#3333*# Displays the blocking list.
*#7489*# Displays the security code.
*#06# Displays the IMEI number.
*#8377*# Displays the SW info.

.:: Panasonic ::.

Panasonic gd90 gd93

*#9999# SW - Type the code at switch-on, during network search
- SW version and production code
To enable ringing and vibration simultaneously:
Enable vibration with #, then increase the volume with the "tone menu"

Panasonic gd70

*#9999# SW - Type the code at switch-on, during network search
- SW version and production code
To enable ringing and vibration simultaneously:
Enable vibration with #, then increase the volume with the "tone menu"

.:: Acer ::.

Acer V 750

*#400# Display Adc/ Set Cal-Value -
*#402# Set LCD Contrast
*#403# Display Errors Info
*#300# Display Info Hw & Sw
*#301# Menu Test
*#302# Menu Acoustics
*#303# (Settings saved) Set English language?
*#307# Menu Engineering
*#311# Reset Phone Code - [ Also reset Security Codes ! ]
*#330# (Execute not success) [ unknown ]
*#331# (Service deactivated) [ unknown ]
*#332# (Service unavailable)[ unknown ]
*#333# (Execute not success)[ unknown ]
*#351# (Service unavailable) [ unknown ]
*#360# (Invalid input)[ unknown ]
*#361# (Invalid input) [ unknown ]
*#362# (Invalid input) [ unknown ]
*#363# (Invalid input) [ unknown ]

.:: Genie ::.

Genie DB

*#06# IMEI.
*#2254*# Near Cell Mode.

For every received BTS, the current channel and 2 channel levels will be displayed.

*#06# IMEI
*#2558# time of network connection ( D/H/M )
*#2562# Forces reconnection to the network
!!!*#7489# Displays and modifies the phone's security code!!!
!!!*#3377# SIM lock information !!!
*#7378# SIM card information: supported phase, name and type
*#7693# Enable/disable "Sleep Mode"
*#8463# State of "Sleep Mode"
*#2255# Debug Call Mode enable/disable
*#3333*# Displays the blocking list.
*#7489*# Displays the security code
*#06# Displays the IMEI number
*#8377*# Displays the SW info.

.:: NEC ::.

NEC db2000

*#2820# software vers.
IMEI *#06#
Reset *73738# (send?)
SP Lock info:
* # 3210 # (send?)
Network barring info : *#8140# (send?)
( it could harm your phone )
SIM lock ( it could harm your phone )
*#4960 # (send?) -Inquiry * 4960 * password * password # (send?) lock
#4960* password # (send?) unlock
[password] [8 digits]
Net Lock
*#7320# (send?) -Inquiry * 7320 * password * password # (send?) lock
#7320* password # (send?) unlock
[password] [ 8 digits]
Net Lock 2:
*#2220# (send?) - Inquiry * 2220 * password * password # (send?) lock
#2220* password # (send?) unlock
[password] [8 digits]
Unlock subnetwork
*#1110# (send?) - inquiry * 1110 * password * password # (send?) lock
#1110* password # (send?) unlock
[password] [8 digits]
( it could harm your phone )

.:: Trium ::.

Trium Geo/Geo @ - Astral - Cosmo -Aria

Enter the menu and type *
A new menu will be displayed :
Application : SW version and battery's voltage


Trium Galaxy
Push * and type 5806: Production date and SW version

.:: Telit ::.

Telit GM 810

MONITOR - technical menu: type ++++ and push OK.
Adjacent cells list: # and *
Now if you push OK the phone displays the battery's voltage and temperature

.:: Sagem ::.

Sagem MC959/940

Select the commands menu and push *
Displays a new menu:
Appli : software vers. and battery's voltage
Eprom
Sim Lock
Test LCD: display test , green/red and vibration

Sagem MC920

Select the commands menu and push *
Displays 5 new menus:

1 APPLI
VERSION ( SW)
BATTERY (voltage )
2 PROM (IMEI)
3!!! SIM LOCK (10 digits code requested ) !!!
4- NETWORK (returns : OPTION NOT AVAILABLE)
5- TEST LCD
SYMBOL 1 (LCD)
SYMBOL 2 (test2 LCD)
BLACK (all icons and characters displayed )
FOR PHOTO (welcome message and time )
VIBRATOR (vibration test )

.:: Sony ::.

Sony CMD Z5/J5

Vers. SW :
Without SIM, switch on the phone and type *#7353273#

Eprom


!!! Sim Lock [10 digits code ] ( it could harm your phone )
NETWORK : OPTION NOT AVAILABLE
Test LCD: display test of the green/red leds and vibration
Push * and type 4329 : enables/disables network monitor 1 (the same as the MT35)
Push * and type 621342 :enables/disables network monitor 2
Push * and type 5807 : Serial Number Software Vers.
Push * and type 936505: IMEI -- Software Vers.
Push * and type 547 : Test serial Data Cable. DISPLAYS : "Testmode"
Push * and type 362628: ISMI BLOCK (UNKNOWN)
Push * and type 476989: NS BLOCK (UNKNOWN)
Push * and type 482896:CP BLOCK (UNKNOWN)
Push * and type 787090: ? BLOCK (UNKNOWN)
Push * and type 787292 : block current network
!!! Push * and type 967678: SP LOCK!!!
Push * and type 850696:Warm Start ( ENABLE/DISABLE)
Push * and type 3926 : Switch off the phone
Push * and type 5806: Production date and SW version

.:: Motorola ::.

Motorola V3688

IMEI *#06#
Enhanced Full Rate Codec (EFR)
Enable EFR : [][][] 119 [] 1 [] OK
Disable EFR : [][][] 119 [] 0 [] OK

.:: Tips and Tricks ::.

Send an E-mail from your GSM

From your telephone you can send an e-mail to any e-mail address on the Internet.

The e-mail will be delivered within an hour of reception at most.

The message sent will show the sender's telephone number.

To send an e-mail, send an SMS with this syntax (always separated by spaces):

EMA name@domain text-of-your-email

Example: to send an e-mail to john@doe.com, do the following:

EMA john@doe.com text-of-your-email

If your phone can't print @, replace it with a !

EMA john!doe.com text-of-your-email

And then send this message to the following number: +39 338 8641732
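The syntax above is easy to get wrong by hand. As a sketch (the helper name is made up; the `EMA` keyword, the `!`-for-`@` substitution, and the single-space separators come from the text above):

```python
def build_ema_sms(address: str, body: str, supports_at: bool = True) -> str:
    """Build the SMS payload for the e-mail gateway described above."""
    if not supports_at:
        # Handsets that cannot type '@' use '!' instead
        address = address.replace("@", "!")
    # Fields are always separated by single spaces: EMA <address> <text>
    return " ".join(["EMA", address, body])

print(build_ema_sms("john@doe.com", "hello"))         # EMA john@doe.com hello
print(build_ema_sms("john@doe.com", "hello", False))  # EMA john!doe.com hello
```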

Free SMS Center numbers


From your telephone you can send SMS messages of 160 characters max. to another GSM phone.

Your message will be sent through an SMS Center (usually the one assigned by your provider).

You pay a small fee depending on your provider, BUT YOU WILL HAVE TO PAY something.

To send SMS without paying anything, you have to change your SMS Center number to one of these:

+491722270300 or +358405202999 or +352021100003

Codes (that they dont tell you in the manual)

To check the IMEI (International Mobile Equipment Identity) type: *#06#

Information you get from the IMEI:
XXXXXX  XX   XXXXXX  X
TAC     FAC  SNR     SP

TAC = Type Approval Code (first 2 digits = country code).

FAC = Final Assembly Code (for Nokia phones, FAC=10).

SNR = Serial Number.

SP = Spare (always SP=0).
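Following the field layout above (6-digit TAC, 2-digit FAC, 6-digit SNR, 1 spare digit), a minimal Python sketch — the function name and sample number are made up for illustration — could split an IMEI like this:

```python
def parse_imei(imei: str) -> dict:
    """Split a 15-digit IMEI into the TAC/FAC/SNR/SP fields described above."""
    digits = "".join(ch for ch in imei if ch.isdigit())
    if len(digits) != 15:
        raise ValueError("expected 15 digits")
    return {
        "TAC": digits[0:6],   # Type Approval Code (first 2 digits = country code)
        "FAC": digits[6:8],   # Final Assembly Code
        "SNR": digits[8:14],  # Serial Number
        "SP":  digits[14],    # Spare
    }

print(parse_imei("490154 10 203237 0"))
# → {'TAC': '490154', 'FAC': '10', 'SNR': '203237', 'SP': '0'}
```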

To check the phone's software (firmware revision information) type: *#0000# (or for some phones other than Nokia 61XX you can try *#model number#, e.g. for the 8110: *#8110#)

Information you can get from the phone's software version:
V 3.14
28-11-97
NSE-3

1st line: Software version.
2nd line: The date of the software release.
3rd line: Phone type.

Some versions and dates:

V 3.14 28/11/97

V4.33 11/03/98

V 4.73 22/04/98

V 5.24 14/9/98

Pin-Out Diagram for the 6110

1 - VIN CHARGER INPUT VOLTAGE 8.4V 0.8A
2 - CHRG CTRL CHARGER CONTROL PWM 32Khz
3 - XMIC MIC INPUT 60mV - 1V
4 - SGND SIGNAL GROUND
5 - XEAR EAR OUTPUT 80mV - 1V
6 - MBUS 9600 B/S
7 - FBUS_RX 9.6 - 230.4 KB/S
8 - FBUS_TX 9.6 - 230.4 KB/S
9 - L_GND CHARGER / LOGIC GND

Revealing Headphone and Car-Kit Menus

Think about this: if you do these tricks, the new menus cannot be erased after the procedure. But it's not dangerous or harmful for your phone.

To enable the headset function, you have to short-circuit pins "3" and "4". After a short time, "Headset" appears on the display, and menu 3-6 is enabled!

To enable the car-kit function, you have to short-circuit pins "4" and "5". After a short time, "Car" is shown on the display and menu 3-7 is enabled!!

This trick is for those who want to hear more than they're supposed to!

If you short-circuit the left and the right contact with the middle contact ("3", "6" and "9"), the Nokia software hangs! The profile "Headset" will be activated. Before you do this, activate the "auto call receive" function in the headset profile and set the ringing volume to "mute". Now you can use your phone to check out what people are talking about in a room. Place the phone somewhere strategic and call it! The phone receives the call without ringing and you can listen to what people are talking about! .....gr8...

Serial numbers on your 6110

For more info type: *#92702689#
The first screen gives you the serial and IMEI number.
Then there is the Date of Manufacture: ex. Made 1297
Then there is the Purchasing Date: ex. Purchasing Date 0298
Then there is the last Repair Date: ex. Repaired: 0000

Note: you must turn off the phone to exit after this test, because of the last function, "transfer user data", which doesn't work as standard....You can use this mode only to transfer all Calendar, Profile and Caller Group information to another phone (e.g. if you are replacing a phone, configuring phones for use within your company, or when a particular phone doesn't work correctly).

Activating and deactivating EFR and HFR, on your 6110

*3370# to activate Enhanced Full Rate - Makes calls sound better, but decreases the battery life by about 5%.

#3370# to deactivate Enhanced Full Rate

*4720# to activate Half Rate Mode - Drops call quality, but increases battery life by about 30%.

#4720# to deactivate Half Rate Mode

who says you need four cores????

Confessions Of A Guilty Editor

I have a confession to make. Although I work in a lab filled with the latest hardware, my workstation still consists of a 2.8 GHz Pentium 4 http://en.wikipedia.org/wiki/Pentium_4 processor. Sure, it features Hyper-Threading, but the chip is six years old. Why am I still using it? Because it runs on a platform that has given me nary a problem. It’s stable, fast enough in Vista to drive four monitors with 15 different windows open at any given time, and remarkably quiet in its little Shuttle box.


But everyone has their limits, and after six years, it’s definitely time to move on. As I considered the pieces for my next PC, I first contemplated an affordable quad-core box based on Intel http://en.wikipedia.org/wiki/Intel_Corporation ’s Q9300 or AMD’s Phenom X4 9850 BE. No—I’d rather see how low I could go on the system’s thermal footprint while still achieving reasonable performance.

And then it hit me. Do I really need a quad-core processor? Is the software I run really benefiting from the extra complexity? Could I not get more horsepower from a dual-core chip? After all, it’d still be a tremendous upgrade from that 2.8 GHz Pentium 4.

In the interest of full disclosure, my workstation doesn’t touch gaming—nor could it, sporting Nvidia http://en.wikipedia.org/wiki/Nvidia ’s Quadro NVS 440. For entertainment I turn to a dual-processor Xeon-based machine that can’t be run after 10:00PM for fear of a noise complaint from the neighbors. Nevertheless, I have to imagine that there are plenty of enthusiasts out there with a solid little desktop sputtering along doing menial productivity-type tasks. So let’s take a look at what it means to upgrade to an affordable configuration with modern components as I build my next six-year system.


The Contenders

There are actually several strong platforms available right now that’d be perfect foundations for a basic workstation upgrade. From AMD http://en.wikipedia.org/wiki/Advanced_Micro_Devices , the 780G and 790GX both help save money through capable integrated graphics cores. Nvidia’s GeForce http://en.wikipedia.org/wiki/GeForce 8200 does the same thing. Intel’s P45 chipset looks to be a strong contender if you don’t mind spending some money on an add-in video card.

However, we’re focusing on two specific chipsets here: AMD’s 740G and Intel’s G45. They aren’t meant to go head-to-head—if they were, we could tell you the performance outcome right now. Rather, the former gives us an ultra-affordable entry point for exploring platform performance, while the latter kicks things up a notch with more current technology. Is the 740G enough? Where are our bottlenecks when we make price the priority? Is Intel’s G45 finally up to snuff with regard to graphics? Most important, do you really need a quad-core CPU or will one of today’s dual-core chips serve up enough muscle to stand the test of time?

Great moments in microprocessor history

The history of the micro from the vacuum tube to today's dual-core multithreaded madness


The evolution of the modern microprocessor is one of many surprising twists and turns. Who invented the first micro? Who had the first 32-bit single-chip design? You might be surprised at the answers. This article shows the defining decisions that brought the contemporary microprocessor to its present-day configuration.
At the dawn of the 19th century, Benjamin Franklin's discoveries of the principles of electricity were still fairly new, and practical applications of his discoveries were few -- the most notable exception being the lightning rod, which was invented independently by two different people in two different places. Independent contemporaneous (and not so contemporaneous) discovery would remain a recurring theme in electronics.
So it was with the invention of the vacuum tube -- invented by Fleming, who was investigating the Effect named for and discovered by Edison; it was refined four years later by de Forest (but is now rumored to have been invented 20 years prior by Tesla). So it was with the transistor: Shockley, Brattain and Bardeen were awarded the Nobel Prize for turning de Forest's triode into a solid state device -- but they were not awarded a patent, because of 20-year-prior art by Lilienfeld. So it was with the integrated circuit (or IC) for which Jack Kilby was awarded a Nobel Prize, but which was contemporaneously developed by Robert Noyce of Fairchild Semiconductor (who got the patent). And so it was, indeed, with the microprocessor.
Before the flood: The 1960s
Just a scant few years after the first laboratory integrated circuits, Fairchild Semiconductor introduced the first commercially available integrated circuit (although at almost the same time as one from Texas Instruments).
Already at the start of the decade, the process that would last until the present day was available: commercial ICs made in the planar process were available from both Fairchild Semiconductor and Texas Instruments by 1961, and TTL (transistor-transistor logic) circuits appeared commercially in 1962. By 1968, CMOS (complementary metal oxide semiconductor) hit the market. There is no doubt that technology, design, and process were rapidly evolving.
Observing this trend, Fairchild Semiconductor's director of Research & Development Gordon Moore observed in 1965 that the density of elements in ICs was doubling annually, and predicted that the trend would continue for the next ten years. With certain amendments, this came to be known as Moore's Law.
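The doubling Moore described compounds quickly. As a quick illustration (the starting count is illustrative, not historical data):

```python
# Moore's 1965 observation: the density of elements in ICs doubles annually.
# Project an illustrative starting count over the ten years he predicted.
density = 64            # illustrative elements per IC in 1965
for year in range(1965, 1976):
    print(year, density)
    density *= 2        # one doubling per year
```

Ten doublings turn 64 elements into more than 65,000 — which is why production capacity reached thousands of transistors per wafer by the decade's end.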
The first ICs contained just a few transistors per wafer; by the dawn of the 1970s, production techniques allowed for thousands of transistors per wafer. It was only a matter of time before someone would use this capacity to put an entire computer on a chip, and several someones, indeed, did just that.

Development explosion: The 1970s
The idea of a computer on a single chip had been described in the literature as far back as 1952 (see Resources), and more articles like this began to appear as the 1970s dawned. Finally, process had caught up to thinking, and the computer on a chip was made possible. The air was electric with the possibility.
Once the feat had been established, the rest of the decade saw a proliferation of companies old and new getting into the semiconductor business, as well as the first personal computers, the first arcade games, and even the first home video game systems -- thus spreading consumer contact with electronics, and paving the way for continued rapid growth in the 1980s.
At the beginning of the 1970s, microprocessors had not yet been introduced. By the end of the decade, a saturated market led to price wars, and many processors were already 16-bit.
The first three
At the time of this writing, three groups lay claim to having been the first to put a computer on a chip: the Central Air Data Computer (CADC), the Intel® 4004, and the Texas Instruments TMS 1000.
The CADC system was completed for the Navy's "Tomcat" fighter jets in 1970. It is often discounted because it was a chip set and not a CPU. The TI TMS 1000 was first to market in calculator form, but not in stand-alone form -- that distinction goes to the Intel 4004, which is just one of the reasons it is often cited as the first (incidentally, it too was just one in a chipset of four).
In truth, it does not matter who was first. As with the lightning rod, the light bulb, radio -- and so many other innovations before and after -- it suffices to say it was in the aether, it was inevitable, its time was come.
Where are they now?
CADC spent 20 years in top-secret, cold-war-era mothballs until finally being declassified in 1998. Thus, even if it was the first, it has remained under most people's radar even today, and did not have a chance to influence other early microprocessor design.
The Intel 4004 had a short and mostly uneventful history, to be superseded by the 8008 and other early Intel chips (see below).
In 1973, Texas Instruments' Gary Boone was awarded U.S. Patent No. 3,757,306 for the single-chip microprocessor architecture. The chip was finally marketed in stand-alone form in 1974, for the low, low (bulk) price of US$2 apiece. In 1978, a special version of the TI TMS 1000 was the brains of the educational "Speak and Spell" toy which E.T. jerry-rigged to phone home.
Early Intel: 4004, 8008, and 8080
Intel released its single 4-bit all-purpose chip, the Intel 4004, in November 1971. It had a clock speed of 108KHz and 2,300 transistors with ports for ROM, RAM, and I/O. Originally designed for use in a calculator, Intel had to renegotiate its contract to be able to market it as a stand-alone processor. Its ISA had been inspired by the DEC PDP-8.
The Intel 8008 was introduced in April 1972, and didn't make much of a splash, being more or less an 8-bit 4004. Its primary claim to fame is that its ISA -- provided by Computer Terminal Corporation (CTC), who had commissioned the chip -- was to form the basis for the 8080, as well as for the later 8086 (and hence the x86) architecture. Lesser-known Intels from this time include the nearly forgotten 4040, which added logical and compare instructions to the 4004, and the ill-fated 32-bit Intel 432.
Intel put itself back on the map with the 8080, which used the same instruction set as the earlier 8008 and is generally considered to be the first truly usable microprocessor. The 8080 had a 16-bit address bus and an 8-bit data bus, a 16-bit stack pointer to memory which replaced the 8-level internal stack of the 8008, and a 16-bit program counter. It also contained 256 I/O ports, so I/O devices could be connected without taking away or interfering with the addressing space. It also possessed a signal pin that allowed the stack to occupy a separate bank of memory. These features are what made this a truly modern microprocessor. It was used in the Altair 8800, one of the first renowned personal computers (other claimants to that title include the 1963 MIT Lincoln Labs' 12-bit LINC/Laboratory Instruments Computer built with DEC components and DEC's own 1965 PDP-8).
Although the 4004 had been the company's first, it was really the 8080 that clinched its future -- this was immediately apparent, and in fact in 1974 the company changed its phone number so that the last four digits would be 8080.
Where is Intel now?
Last time we checked, Intel was still around.
RCA 1802
In 1974, RCA released the 1802 8-bit processor with a different architecture from other 8-bit processors. It had a register file of 16 registers of 16 bits each, and using the SEP instruction, you could select any of the registers to be the program counter; using the SEX instruction, you could choose any of the registers to be the index register. It did not have standard subroutine CALL immediate and RET instructions, though they could be emulated.
A few commonly used subroutines could be called quickly by keeping their address in one of the 16 registers. Before a subroutine returned, it jumped to the location immediately preceding its entry point so that after the RET instruction returned control to the caller, the register would be pointing to the right value for next time. An interesting variation was to have two or more subroutines in a ring so that they were called in round-robin order.
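The ring of subroutines described above can be sketched in Python (a toy model, not an 1802 emulator): each generator stands in for a register that remembers its own resume address, and advancing the ring plays the role of SEP handing over the program counter.

```python
def routine(name, log):
    # Body of one "subroutine"; each yield models a SEP that hands
    # the program counter to the next register in the ring.
    while True:
        log.append(name)
        yield

log = []
ring = [routine("A", log), routine("B", log), routine("C", log)]
for i in range(6):              # six switches around the ring
    next(ring[i % len(ring)])   # each routine resumes where it left off

print(log)  # round-robin order: A, B, C, A, B, C
```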
The RCA 1802 is considered one of the first RISC chips, although others (notably Seymour Cray -- see the sidebar, The evolution of RISC) had used the concepts already.
Where is it now?
Sadly, the RCA chip was a spectacular market failure due to its slow clock cycle speed. But it could be fabricated to be radiation resistant, so it was used on the Voyager 1, Viking, and Galileo space probes (where rapidly executed commands aren't a necessity).
IBM 801
In 1975, IBM® produced some of the earliest efforts to build a microprocessor based on RISC design principles (although it wasn't called RISC yet -- see the sidebar, The evolution of RISC). Initially a research effort led by John Cocke (the father of RISC), many say that the IBM 801 was named after the address of the building where the chip was designed -- but we suspect that the IBM systems already numbered 601 and 701 had at least something to do with it also.
Where is the 801 now?
The 801 chip family never saw mainstream use, and was primarily used in other IBM hardware. Even though the 801 never went far, it did inspire further work which would converge, fifteen years later, to produce the Power Architecture™ family.
Moto 6800
In 1975, Motorola introduced the 6800, a chip with 78 instructions and probably the first microprocessor with an index register.
Two things are of significance here. One is the use of the index register which is a processor register (a small amount of fast computer memory that's used to speed the execution of programs by providing quick access to commonly used values). The index register can modify operand addresses during the run of a program, typically while doing vector/array operations. Before the invention of index registers and without indirect addressing, array operations had to be performed either by linearly repeating program code for each array element or by using self-modifying code techniques. Both of these methods harbor significant disadvantages when it comes to program flexibility and maintenance and more importantly, they are wasteful when it comes to using up scarce computer memory.
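The difference described above can be sketched in plain Python standing in for 6800 code (the variable `x` plays the index register; the numbers are illustrative):

```python
data = [3, 1, 4, 1, 5]

# Without an index register: one explicit access per element
# (the "linearly repeated program code" the text describes).
total_unrolled = data[0] + data[1] + data[2] + data[3] + data[4]

# With an index register: the register (here `x`) modifies the
# operand address on each pass through a single copy of the loop.
total_indexed = 0
x = 0                          # index register
while x < len(data):
    total_indexed += data[x]   # effective operand address = base + x
    x += 1

print(total_unrolled, total_indexed)  # both 14
```

One copy of the loop handles an array of any length, which is exactly what the repeated-code and self-modifying-code approaches could not do without wasting scarce memory.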
Where is the 6800 now?
Many Motorola stand-alone processors and microcontrollers trace their lineage to the 6800, including the popular and powerful 6809 of 1979.
MOS 6502
Soon after Motorola released the 6800, the company's design team quit en masse and formed their own company, MOS Technology. They quickly developed the MOS 6501, a completely new design that was nevertheless pin-compatible with the 6800. Motorola sued, and MOS agreed to halt production. The company then released the MOS 6502, which differed from the 6501 only in the pin-out arrangement.
The MOS 6502 was released in September 1975, and it sold for US$25 per unit. At the time, the Intel 8080 and the Motorola 6800 were selling for US$179. Many people thought this must be some sort of scam. Eventually, Intel and Motorola dropped their prices to US$79. This had the effect of legitimizing the MOS 6502, and they began selling by the hundreds. The 6502 was a staple in the Apple® II and various Commodore and Atari computers.
Where is the MOS 6502 now?
Many of the original MOS 6502 still have loving homes today, in the hands of collectors (or even the original owners) of machines like the Atari 2600 video game console, Apple II family of computers, the first Nintendo Entertainment System, the Commodore 64 -- all of which used the 6502. MOS 6502 processors are still being manufactured today for use in embedded systems.
AMD clones the 8080
Advanced Micro Devices (AMD) was founded in 1969 by Jerry Sanders. Like so many of the people who were influential in the early days of the microprocessor (including the founders of Intel), Sanders came from Fairchild Semiconductor. AMD's business was not the creation of new products; it concentrated on making higher quality versions of existing products under license. For example, all of its products met MILSPEC requirements no matter what the end market was. In 1975, it began selling reverse-engineered clones of the Intel 8080 processor.
Where is AMD now?
In the 1980s, first licensing agreements -- and then legal disputes -- with Intel, eventually led to court validation of clean-room reverse engineering and opened the 1990s floodgates to many clone corps.
Fairchild F8
The 8-bit Fairchild F8 (also known as the 3850) microcontroller was Fairchild's first processor. It had no stack pointer, no program counter, no address bus. It did have 64 registers (the first 8 of which could be accessed directly) and 64 bytes of "scratchpad" RAM. The first F8s were multichip designs (usually 2-chip, with the second being ROM). The F8 was released in a single-chip implementation (the Mostek 3870) in 1977.
Where is it now?
The F8 was used in the company's Channel F Fairchild Video Entertainment System in 1976. By the end of the decade, Fairchild played mostly in niche markets, including the "hardened" IC market for military and space applications, and in Cray supercomputers. Fairchild was acquired by National Semiconductor in the 1980s, and spun off again as an independent company in 1997.
16 bits, two contenders
The first multi-chip 16-bit microprocessor was introduced by either Digital Equipment Corporation in its LSI-11 OEM board set and its packaged PDP 11/03 minicomputer, or by Fairchild Semiconductor with its MicroFlame 9440, both released in 1975. The first single-chip 16-bit microprocessor was the 1976 TI TMS 9900, which was also compatible with the TI 990 line of minicomputers and was used in the TM 990 line of OEM microcomputer boards.

Where are they now?

The DEC chipset later gave way to the 32-bit DEC VAX product line, which was replaced by the Alpha family, which was discontinued in 2004.
The aptly named Fairchild MicroFlame ran hot and was never chosen by a major computer manufacturer, so it faded out of existence.
The TI TMS 9900 had a strong beginning, but was packaged in a large (for the time) ceramic 64-pin package which pushed the cost out of range compared with the much cheaper 8-bit Intel 8080 and 8085. In March 1982, TI decided to start ramping down TMS 9900 production, and go into the DSP business instead. TI is still in the chip business today, and in 2004 it came out with a nifty TV tuner chip for cell phones.
Zilog Z-80
Probably the most popular microprocessor of all time, the Zilog Z-80 was designed by Federico Faggin after he left Intel, and it was released in July 1976. Faggin had designed or led the design teams for all of Intel's early processors: the 4004, the 8008, and particularly, the revolutionary 8080.


This 8-bit microprocessor was binary compatible with the 8080 and surprisingly, is still in widespread use today in many embedded applications. Faggin intended it to be an improved version of the 8080 and according to popular opinion, it was. It could execute all of the 8080 operating codes as well as 80 more instructions (including 1-, 4-, 8-, and 16-bit operations, block I/O, block move, and so on). Because it contained two sets of switchable data registers, it supported fast operating system or interrupt context switches.
The thing that really made it popular, though, was its memory interface. Because the CPU generated its own RAM refresh signals, it lowered system costs and made systems easier to design. Coupled with its 8080 compatibility and its support for CP/M, the first standardized microprocessor operating system, these qualities made it the chip of choice for many designers (including Tandy; it was the brains of the TRS-80 Model 1).
The Z-80 featured many undocumented instructions that were in some cases a by-product of early designs (which did not trap invalid op codes, but tried to interpret them as best they could); in other cases the chip area near the edge was used for added instructions, but fabrication methods of the day made the failure rate high. Instructions that often failed were just not documented, so the chip yield could be increased. Later fabrication made these more reliable.
Where are they now?
In 1979, Zilog announced the 16-bit Z8000. Sporting another great design with a stack pointer and both a user and a supervisor mode, this chip never really took off. The main reason: Zilog was a small company that struggled with support and never earned enough to outlast the competition.
However, Zilog is not only still making microcontrollers, it is still making Z-80 microcontrollers. In all, more than one billion Z-80s have been made over the years -- a proud testament to Faggin's superb design.
Faggin is currently Chairman of the Board & Co-Founder of Synaptics, a "user interface solutions" company in the Silicon Valley.
Intel 8085 and 8086
In 1976, Intel updated the 8080 design with the 8085 by adding two instructions to enable/disable three added interrupt pins (and the serial I/O pins). They also simplified hardware so that it used only +5V power, and added clock-generator and bus-controller circuits on the chip. It was binary compatible with the 8080, but required less supporting hardware, allowing simpler and less expensive microcomputer systems to be built. These were the first Intel chips to be produced without input from Faggin.
In 1978, Intel introduced the 8086, a 16-bit processor which gave rise to the x86 architecture. It did not contain floating-point instructions. In 1980 the company released the 8087, the first math co-processor they'd developed.
Next came the 8088, the processor for the first IBM PC. Even though IBM engineers at the time wanted to use the Motorola 68000 in the PC, the company already had the rights to produce the 8086 line (acquired by trading Intel rights to its bubble-memory technology), it could reuse modified 8085-type support components, and 68000-style components were much scarcer.
Moto 68000
In 1979, Motorola introduced the 68000. It had 32-bit internal registers and a 32-bit address space, but its external data bus was still 16 bits wide to keep hardware costs down. Originally designed for embedded applications, its DEC PDP-11- and VAX-inspired design eventually found its way into the Apple Macintosh, Amiga, Atari, and even the original Sun Microsystems® and Silicon Graphics computers.
Where is the 68000 now?
As the 68000 was reaching the end of its life, Motorola entered into the Apple-IBM-Motorola "AIM" alliance which would eventually produce the first PowerPC® chips. Motorola ceased production of the 68000 in 2000.

The dawning of the age of RISC: The 1980s
Advances in process ushered in the "more is more" era of VLSI, leading to true 32-bit architectures. At the same time, the "less is more" RISC philosophy allowed for greater performance. When combined, VLSI and RISC produced chips with awesome capabilities, giving rise to the UNIX® workstation market.
The decade opened with intriguing contemporaneous independent projects at Berkeley and Stanford -- RISC and MIPS. Even with the new RISC families, an industry shakeout commonly referred to as "the microprocessor wars" would mean that we left the 1980s with fewer major micro manufacturers than we had coming in.
By the end of the decade, prices had dropped substantially, so that record numbers of households and schools had access to more computers than ever before.
RISC and MIPS and POWER
RISC, too, started in many places at once, and was antedated by some of the examples already cited (see the sidebar, The evolution of RISC).
Berkeley RISC
In 1980, the University of California at Berkeley started something it called the RISC Project (in fact, the professors leading the project, David Patterson and Carlo H. Sequin, are credited with coining the term "RISC").
The project emphasized pipelining and the use of register windows: by 1982, they had delivered their first processor, called the RISC-I. With only about 44,000 transistors (compared with roughly 100,000 in most contemporary processors) and only 32 instructions, it outperformed any other single-chip design in existence.
MIPS
Meanwhile, in 1981, just across the San Francisco Bay from Berkeley, John Hennessy and a team at Stanford University started building what would become the first MIPS processor. They wanted to use deep instruction pipelines -- a difficult-to-implement practice -- to increase performance. A major obstacle was that pipelining required interlocks: hard-to-implement circuitry that stalls the pipeline so a multiple-clock-cycle instruction can finish producing its result before a dependent instruction tries to use it. The MIPS design settled on a relatively simple demand to eliminate interlocking -- all instructions must take only one clock cycle. This was a potentially useful alteration in the RISC philosophy.
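The stall that an interlock inserts can be illustrated with a toy simulator. The sketch below is purely illustrative -- the two-operand instruction format and the one-cycle-late LOAD are invented for this example and do not model any real MIPS hardware.

```python
# Toy illustration of a pipeline interlock. A LOAD delivers its
# result one cycle late; an instruction that reads that result too
# soon must either be stalled by hardware (an interlock) or be
# forbidden by the ISA, which is roughly the route MIPS chose.

def run(program, interlocked=True):
    """Execute (op, dest, src) triples and count clock cycles."""
    regs = {}
    pending = None            # (register, value) arriving next cycle
    cycles = 0
    for op, dest, src in program:
        cycles += 1
        if pending is not None:
            if src == pending[0] and not interlocked:
                raise RuntimeError("hazard: %s is not ready yet" % src)
            if src == pending[0]:
                cycles += 1   # the interlock inserts a one-cycle stall
            regs[pending[0]] = pending[1]
            pending = None
        if op == "LOAD":
            pending = (dest, src)          # value arrives next cycle
        elif op == "ADD":
            regs[dest] = regs.get(dest, 0) + regs[src]
    if pending is not None:                # drain a trailing LOAD
        regs[pending[0]] = pending[1]
    return regs, cycles

# LOAD r1 <- 5, then immediately ADD r2 <- r2 + r1
regs, cycles = run([("LOAD", "r1", 5), ("ADD", "r2", "r1")])
print(regs["r2"], cycles)   # → 5 3  (2 instructions + 1 stall cycle)
```

Running the same program with interlocked=False raises an error at the hazard, which is why a design without interlocks must guarantee, in the instruction set itself, that the situation never arises.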
POWER
Also contemporaneously and independently, IBM continued to work on RISC. The 801 project of 1974 turned into Project America and Project Cheetah. Project Cheetah produced the first workstation to use a RISC chip in 1986: the PC/RT, which used the 801-inspired ROMP chip.
Where are they now?
By 1983, the RISC Project at Berkeley had produced the RISC-II, which contained 39 instructions and ran more than three times as fast as the RISC-I. Sun Microsystems' SPARC (Scalable Processor ARChitecture) chip design is heavily influenced by the minimalist RISC Project designs of the RISC-I and -II.
Professors Patterson and Sequin are both still at Berkeley.
MIPS was used in Silicon Graphics workstations for years. Although SGI's newest offerings now use Intel processors, MIPS is very popular in embedded applications.
Professor Hennessy left Stanford in 1984 to form MIPS Computers. The company's commercial 32-bit designs implemented the interlocks in hardware. MIPS was purchased by Silicon Graphics, Inc. in 1992, and was spun off as MIPS Technologies, Inc. in 1998. John Hennessy is currently Stanford University's tenth President.
IBM's Cheetah project, which developed into the PC/RT's ROMP, was a bit of a flop, but Project America was in prototype by 1985 and would, in 1990, become the RISC System/6000. Its processor would later be renamed the POWER1.
RISC was quickly adopted in the industry, and today remains the most popular architecture for processors. During the 1980s, several additional RISC families were launched. Aside from those already mentioned above were:
• CRISP (C Reduced Instruction Set Processor) from AT&T Bell Labs.
• The Motorola 88000 family.
• Digital Equipment Corporation's Alpha (the world's first single-chip 64-bit microprocessor).
• HP Precision Architecture (HP PA-RISC).
32-bitness
The early 1980s also saw the first 32-bit chips arrive in droves.
BELLMAC-32A
AT&T's Computer Systems division opened its doors in 1980, and by 1981 it had introduced the world's first single-chip 32-bit microprocessor, Bell Labs' BELLMAC-32A (renamed the WE 32000 after the break-up in 1984). There were two subsequent generations, the WE 32100 and WE 32200, which were used in:
• the 3B5 and 3B15 minicomputers
• the 3B2, the world's first desktop supermicrocomputer
• the "Companion", the world's first 32-bit laptop computer
• "Alexander", the world's first book-sized supermicrocomputer
All ran the original Bell Labs UNIX.
Motorola 68010 (and friends)
Motorola had already introduced the MC 68000, which had a 32-bit architecture internally but a 16-bit external pinout. By 1985 or thereabouts, it had introduced the MC 68010 and 68012, along with its first fully 32-bit microprocessor, the MC 68020, and had begun work on a 32-bit family of RISC processors named the 88000.
NS 32032
In 1983, National Semiconductor introduced the NS 16032, a microprocessor with a 32-bit internal architecture and a 16-bit pinout, followed by the fully 32-bit NS 32032 and a line of 32-bit industrial OEM microcomputers. Sequent also introduced the first symmetric multiprocessing (SMP) server-class computer using the NS 32032.
Intel entered the 32-bit world in 1981, the same year as the AT&T BELLMAC chips, with the ill-fated iAPX 432. It was a three-chip design rather than a single-chip implementation, and it didn't go anywhere. The 32-bit i386, introduced in 1985, became Intel's first single-chip 32-bit offering, closely followed by the 486 in 1989.
Where are they now?
AT&T closed its Computer Systems division in December 1995. The company shifted to MIPS and Intel chips.
Sequent's SMP machine faded away, and that company also switched to Intel microprocessors.
The Motorola 88000 design wasn't commercially available until 1990, and was cancelled soon after in favor of Motorola's deal with IBM and Apple to create the first PowerPC.
ARM is born
In 1983, Acorn Computers Ltd. was looking for a processor. Some say that Acorn was refused access to Intel's upcoming 80286 chip; others say that Acorn rejected both the Intel 286 and the Motorola MC 68000 as not powerful enough. In any case, the company decided to develop its own processor, called the Acorn RISC Machine, or ARM. The company had development samples, known as the ARM I, by 1985; production models (ARM II) were ready by the following year. The original ARM chip contained only 30,000 transistors.
Where are they now?
Acorn Computers was taken over by Olivetti in 1985, and after a few more shakeups, was purchased by Broadcom in 2000.
However, the company's ARM architecture today accounts for approximately 75% of all 32-bit embedded processors. The most successful implementation has been the ARM7TDMI with hundreds of millions sold in cellular phones. The Digital/ARM combo StrongARM is the basis for the Intel XScale processor.

A new hope: The 1990s
The 1990s dawned just a few months after most of the Communist governments of Eastern and Central Europe had rolled over and played dead; by 1991, the Cold War was officially at an end. Those high-end UNIX workstation vendors who were left standing after the "microprocessor wars" scrambled to find new, non-military markets for their wares. Luckily, the commercialization and broad adoption of the Internet in the 1990s neatly stepped in to fill the gap. For at the beginning of that decade, you couldn't run an Internet server or even properly connect to the Internet on anything but UNIX. A side effect of this was that a large number of new people were introduced to the open-standards Free Software that ran the Internet.
The popularization of the Internet led to higher desktop sales as well, fueling growth in that sector. Throughout the 1990s, desktop chipmakers participated in a mad speed race to keep up with "Moore's Law" -- often neglecting other areas of their chips' architecture to pursue elusive clock rate milestones.
32-bitness, so coveted in the 1980s, gave way to 64-bitness. The first high-end UNIX processors would blaze the 64-bit trail at the very start of the 1990s, and as of this writing, most desktop systems have joined them. The POWER™ and PowerPC family, introduced in 1990, had a 64-bit ISA from the beginning.

Power Architecture
IBM introduced the POWER architecture -- a multichip RISC design -- in early 1990. By the next year, the first single-chip PowerPC derivatives (the product of the Apple-IBM-Motorola AIM alliance) were available as a high-volume alternative to the predominating CISC desktop structure.
Where is Power Architecture technology now?
Power Architecture technology is popular in all markets, from the high-end UNIX eServer™ to embedded systems. On the desktop, it is best known as the processor of the Apple G5. The cooperative climate of the original AIM alliance has been expanded into an organization by the name of Power.org.

DEC Alpha
In 1992, DEC introduced the Alpha 21064 at a speed of 200MHz. The superscalar, superpipelined 64-bit processor design was pure RISC, but it outperformed the other chips and was referred to by DEC as the world's fastest processor. (When the Pentium was launched the next spring, it only ran at 66MHz.) The Alpha too was intended to be used in both UNIX server/workstations as well as desktop variants.
The primary contribution of the Alpha design to microprocessor history was not its architecture -- that was pure RISC. The Alpha's performance was due to excellent implementation. Microchip design had come to be dominated by automated logic-synthesis flows, but to cope with the extremely complex VAX architecture, Digital's designers had learned to apply individually crafted, human attention to circuit design. Applied to a simple, clean architecture like the RISC-based Alpha, that hand-tuning yielded the highest possible performance.
Where is Alpha now?
Sadly, the very thing that gave Alpha its edge -- hand-tuned circuits -- would also prove to be its undoing. As DEC was going out of business, its chip division, Digital Semiconductor, was sold to Intel as part of a legal settlement. Intel used the StrongARM (a joint project of DEC and ARM) to replace its i860 and i960 lines of RISC processors.
The Clone Wars begin
In March 1991, Advanced Micro Devices (AMD) introduced its clone of Intel's i386DX, running at clock speeds of up to 40MHz. This set a precedent for AMD -- its goal was not just cheaper chips that would run code intended for Intel-based systems, but chips that would also outperform the competition. AMD's chips are RISC designs internally; they convert x86 instructions into simpler internal operations before execution.
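That conversion step can be sketched as a small translation function: a single x86-style instruction with a memory operand is broken into simple load/compute/store micro-operations, each of which is easy to pipeline. This is a minimal illustration only; the tuple format and micro-op names are invented here and do not reflect AMD's actual internal encoding.

```python
# Illustrative sketch of CISC-to-micro-op translation: one x86-style
# instruction with a memory operand is split into simple load/compute/
# store internal operations, each easy to pipeline. The tuple format
# and micro-op names are invented; they do not reflect AMD's real
# internal encoding.

def decode(instr):
    """Translate an (op, dest, src) instruction into micro-ops."""
    op, dest, src = instr
    if dest.startswith("["):               # memory destination: split it
        addr = dest.strip("[]")
        return [
            ("LOAD",  "tmp", addr),        # fetch the memory operand
            (op,      "tmp", src),         # do the arithmetic internally
            ("STORE", addr,  "tmp"),       # write the result back
        ]
    return [instr]                         # register-only: one micro-op

print(decode(("ADD", "[0x1000]", "eax")))
# → [('LOAD', 'tmp', '0x1000'), ('ADD', 'tmp', 'eax'), ('STORE', '0x1000', 'tmp')]
print(decode(("ADD", "ebx", "eax")))
# → [('ADD', 'ebx', 'eax')]
```

The register-only case passes through untouched, which is why simple instructions run at full speed while complex memory-operand forms quietly cost several internal operations.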
Also in 1991, litigation between AMD and Intel was finally settled in favor of AMD, leading to a flood of clonemakers -- among them, Cyrix, NexGen, and others -- few of which would survive into the next decade.
In the desktop space, Moore's Law turned into a Sisyphean treadmill as makers chased elusive clock speed milestones.
Where are they now?
Well, of course, AMD is still standing. In fact, its latest designs are being cloned by Intel!
Cyrix was acquired by National Semiconductor in 1997, and sold to VIA in 1999. The acquisition turned VIA into a processor player, where it had mainly offered core logic chipsets before. The company today specializes in high-performance, low-power chips for the mobile market.

Where are we now? The 2000s
The 2000s have come along and it's too early yet to say what will have happened by decade's end. As Federico Faggin said, the exponential progression of Moore's law cannot continue forever. As the day nears when process will be measured in Angstroms instead of nanometers, researchers are furiously experimenting with layout, materials, concepts, and process. After all, today's microprocessors are based on the same architecture and processes that were first invented 30 years ago -- something has definitely got to give.
We are not at the end of the decade yet, but from where we sit at its mid-way point, the major players are few, and can easily be arranged on a pretty small scorecard:
In high-end UNIX, DEC has phased out Alpha, SGI uses Intel, and Sun is planning to outsource production of SPARC to Fujitsu (IBM continues to make its own chips). RISC is still king, but its MIPS and ARM variants are found mostly in embedded systems.
In 64-bit desktop computing, the DEC Alpha is being phased out, and HP just ended its Itanium alliance with Intel. The AMD 64 (and its clones) and the IBM PowerPC are the major players, while in the desktop arena as a whole, Intel, AMD, and VIA make x86-compatible processors along RISC lines.
As for 2005 and beyond, the second half of the decade is sure to bring as many surprises as the first. Maybe you have ideas as to what they might be! Take this month's chips challenge, and let us know your predictions for chips in 2005.
The history of microprocessors is a robust topic -- this article hasn't covered everything, and we apologize for any omissions. Please e-mail the Power Architecture editors with any corrections or additions to the information provided here.

All CPU Charts 2008


CPU Charts offer the most comprehensive x86 processor comparison on the Internet. The 2008 edition includes as many as 91 AMD and Intel desktop processors, which we tested across 35 individual benchmarks under Windows Vista. The benchmarks include industry standard software as well as popular workloads across multiple application categories, allowing you to find the CPU that meets your requirements (see benchmark details below the charts). The Charts are also a valuable tool if you want to check how your computer stacks up against current hardware. The 2008 charts include three different processor generations, starting with AMD's Socket 939 Athlons and Semprons, and including all Pentium 4 and Pentium D processors that were released for Socket 775.


Intel Core 2 Quad
TYPE                              CLOCK     CACHE   FSB       PRICE
Core 2 Quad Q6600 (with fan)      2.40GHz   8MB     1066MHz   $209.99
Core 2 Quad Q6700 (with fan)      2.66GHz   8MB     1066MHz   $289.99
Core 2 Quad Q6600                 2.40GHz   8MB     1066MHz   $189.99
Core 2 Quad Q6700                 2.66GHz   8MB     1066MHz   $259.99
Core 2 Quad Q8200                 2.33GHz   4MB     1333MHz   $229.99
Core 2 Quad Q6700 with Corsair Dual Channel TWINX 2048MB (2 x 1024MB) PC6400 DDR2 800MHz E.P.P. memory        $279.99
Core 2 Quad Q9300 (with fan)      2.50GHz   6MB     1333MHz   $299.99
Core 2 Quad Q9550 (with fan)      2.83GHz   12MB    1333MHz   $349.99
Core 2 Quad Q9400 (with fan)      2.66GHz   6MB     1333MHz   $299.99
Core 2 Quad Q9650 (with fan)      3.0GHz    12MB    1333MHz   $574.99
Core 2 Quad Q8200                 2.33GHz   4MB     1333MHz   $239.99


Intel Core 2 Extreme
TYPE                                     CLOCK     CACHE   PRICE
Core 2 Extreme QX9770 (BX80569QX9770)    3.20GHz   12MB    $1499.99


Intel Core 2 Duo
TYPE                              CLOCK     CACHE   FSB       PRICE
Core 2 Duo E6550                  2.33GHz   4MB     1333MHz   $174.97
Core 2 Duo E4600 (with fan)       2.40GHz   2MB     800MHz    $129.99
Core 2 Duo E6750 (with fan)       2.66GHz   4MB     1333MHz   $199.99
Core 2 Duo E8500 (with fan)       3.16GHz   6MB     1333MHz   $289.99
Core 2 Duo E7200 (with fan)       2.53GHz   3MB     1066MHz   $129.99
Core 2 Duo E6300                  1.86GHz   2MB     1066MHz   $174.97
Core 2 Duo E4700                  2.60GHz   2MB     800MHz    $129.99
Core 2 Duo E8400 (with fan)       3.0GHz    6MB     1333MHz   $179.99
Core 2 Duo E7300 (with fan)       2.66GHz   3MB     1066MHz   $149.99
Core 2 Duo E8600 (with fan)       3.33GHz   6MB     1333MHz   $289.99


Intel Xeon
TYPE                              CLOCK     CACHE   FSB       PRICE
Xeon 3065                         2.33GHz   4MB     1333MHz   $189.99
Xeon 3075                         2.66GHz   4MB     1333MHz   $219.99
Xeon X3350 (with fan)             2.66GHz   12MB    1333MHz   $349.99
Xeon 5150                         2.66GHz   4MB     1333MHz   $779.99
Xeon E5405                        2.0GHz    12MB    1333MHz   $239.99
Xeon E5410                        2.33GHz   12MB    1333MHz   $299.99
Xeon E5420                        2.50GHz   12MB    1333MHz   $369.99
Xeon 5130                         2.0GHz    4MB     1333MHz   $354.99
Xeon E5335                        2.0GHz    8MB     1333MHz   $359.99
Xeon E5310                        1.60GHz   8MB     1066MHz   $239.99
Xeon E5345                        2.33GHz   8MB     1333MHz   $499.99
Xeon 5160                         3.0GHz    4MB     1333MHz   $949.99
Xeon E5320                        1.86GHz   8MB     1066MHz   $299.99


Intel Pentium Dual Core
TYPE                                   CLOCK     CACHE   FSB      PRICE
Pentium Dual Core E2180 (with fan)     2.0GHz    1MB     800MHz   $74.99
Pentium Dual Core E2180                2.0GHz    1MB     800MHz   $59.99
Pentium Dual Core E2200                2.20GHz   1MB     800MHz   $69.99
Pentium Dual Core E5200 (with fan)     2.50GHz   2MB     800MHz   $89.99


Intel Celeron Dual Core
TYPE                                   CLOCK     CACHE   FSB      PRICE
Celeron Dual Core E1200                1.60GHz   512kB   800MHz   $39.99
Celeron Dual Core E1400 (with fan)     2.0GHz    512kB   800MHz   $59.99