System Architecture


This is dedicated to Adey

Introduction

The computer system has evolved over the years, moving from large specialized machines to the everyday personal computer (PC), which has become popular because of its simple usage and low cost. Various types of computers exist, from mainframes to minicomputers, the more popular microcomputers, and the smaller laptops and palmtops that add mobility to the computing world. All these forms of computers have basically the same architecture: an input device, which could be a keyboard or other pointing device; an output device, which is generally the display, though it might also be a printer, or the output might go to another computer system entirely, serving as input to that system, or to other devices such as ATMs, automatic doors, or even microwave ovens; and a processing device, which is the brain of the system.

Some systems are client systems with limited processing capacity and resources, which often request resources from counterparts known as servers. Servers generally have high processing capacity and a lot of memory and are used to host and share resources. For example, there are application servers, which host applications; database servers, which host and share data; web servers, which host and share web pages; DHCP servers, which hand out IP addresses; proxy servers, which receive and process requests on behalf of other systems (most often requests for web pages or applications); authentication servers; DNS servers; and many others. Even though all types of computers have a similar architecture, we will focus more on server-side system architecture.

Until client computers had real processing power and capability, the early client/server paradigm operated without the conventional client and server machines common in most current business environments. Rather, the early client/server model established an ad hoc definition of client/server (a type of network in which every computer is either a server with a defined role of sharing resources with clients or a client that can access the resources on the server) based on which unit issued a request for information or services (thereby becoming the client) and which one responded to such requests (thereby becoming the server). This kind of capability continues to be used to this day, particularly in peer-to-peer services. Primitive implementations of this technology are also forever preserved in old reliable Internet applications, including File Transfer Protocol (FTP) and Telnet (networked virtual terminal).

Initial client/server computing centered around extremely large, expensive devices called mainframes (a somewhat vague distinction that identifies any large computer system normally capable of supporting many users and programs simultaneously). Mainframes could serve multiple simultaneous users by running multiple-user operating systems and were first available to big business and academia. Access to these large-scale computers for system operators and end users alike came via a terminal console (an input device and output display, mostly known as a dumb terminal). Mainframe designs and concepts of yesteryear are vastly different from contemporary mainframes.

Functionality for the early mainframes consisted of business and academic applications that required no programming skill to use or maintain. This permitted business managers to create spreadsheets for ad hoc business modeling and reporting and to keep database entries for internal records, for example. The legacy applications they continue to run (and serve up) provide mainframes with their incredible staying power, often because business managers are wary of abandoning older, working systems for newer technology. Transitions from oldies but goodies to more modern equivalents can be time-consuming and expensive, and this continues to provide a powerful argument for maintaining the mainframe as a form of status quo. When cutover from old systems to new ones is unavoidable, parallel operation is nearly always practiced during a cutover phase, so that the old system runs alongside the new one, often in complete lock-step. This is especially likely whenever the costs or risks of downtime or new system failure are unacceptably high; the old system is kept up and running as a kind of "hot standby" in case anything affects the operation of the new one.

Dedicated server computing found its initial justification in the consolidation and centralization of common network resources and devices. Owing to the prohibitive cost of then-nascent technologies such as early printers, tape backups, and large storage repositories, and the operational overhead involved in maintaining and operating a mainframe, business managers opted to centralize the most commonly used resources and attach them to the mainframe (sometimes directly, sometimes through a variety of peripheral processing units). This eventually spurred the development of more cost-effective "minicomputer" designs, which replicated most mainframe functionality at a fraction of the cost. This affected business operations by providing higher availability and yielding higher productivity.

Content at a Glance

Introduction
Session I: The Motherboard
Session II: The Processor
Session III: System Memory
Session IV: The Disk Subsystem

Session I: The Motherboard

The motherboard (also called the system board or planar board) is a circuit board that provides the circuitry connecting the other components of the computer system. Among the components it carries are:

• Microprocessors
• Expansion buses
• Chipsets
• Memory sockets and RAM modules
• Integrated Drive Electronics (IDE), Enhanced IDE, or Small Computer System Interface (SCSI) controllers
• Parallel and serial ports
• Video Graphics Adaptors (VGA)

And the list can go on. Motherboards generally come as two different types: integrated motherboards and non-integrated motherboards (often known as open boards). Integrated motherboards contain quite a lot of additional devices fused right into the board. Their manufacturers try to make larger sales by including devices such as network cards, modems, video cards, and sound cards, along with a host of other components, on the board; these components are more often referred to as on-board components. Unlike integrated boards, open boards have their major components on expansion cards, which are bought separately and added to the board. The advantage of this type of board is the choice the buyer has over which components to include. Open boards are mostly server-side boards, where administrators can choose high-end components to put on the board. Nowadays, however, many boards fall in between integrated and non-integrated: some basic components are included, while others have to be added via expansion cards. Figure 1.1 shows a typical motherboard, with its expansion buses and RAM slots.

Figure 1.1: A Bare Motherboard

A motherboard contains a number of special sockets that accept various PC components. Motherboards provide sockets for the microprocessor (Figure 1.2); sockets for RAM (Figure 1.3); sockets to provide power (Figure 1.4); connectors for floppy drives and hard drives (Figure 1.5); and connectors for external devices, such as mice, printers, joysticks, and keyboards (Figure 1.6).

Figure 1.2: Sockets for CPU

Figure 1.3: Sockets for RAM

Figure 1.4: Socket for Power Plug

Figure 1.5: Sockets for Hard disk and Floppy

Figure 1.6: Connectors for external devices

Motherboards use tiny wires, called "traces," to link the various components of the PC together electrically.

Form Factors

Motherboard form factors are motherboard design specifications; motherboard manufacturers must meet the specification of a particular form factor. Two devices with the same form factor are physically interchangeable. The IBM PC, XT, and XT Model 286, for example, all use power supplies that are internally different but have exactly the same form factor. The form factor refers to the physical dimensions (size and shape) as well as certain connector, screw hole, and other positions that dictate into which type of case the board will fit. Some are true standards (meaning that all boards with that form factor are interchangeable), whereas others are not standardized enough to allow for interchangeability. Unfortunately, non-standard form factors preclude any easy upgrade or inexpensive replacement, which generally means they should be avoided. Some of the most popular motherboard form factors include the AT, ATX, BTX, and variants of these. There are also other form factors that are peculiar to server systems.

Small form factor:


• SSI TEB (Thin Electronics Bay)
• microATX (AT Extended)
• microBTX (Balanced Technology Extended)

Tower and pedestal:

• microATX
• microBTX
• ATX
• Extended ATX
• BTX
• SSI CEB (Compact Electronics Bay)
• SSI EEB (Entry-Level Electronics Bay)
• SSI MEB (Midrange Electronics Bay)
• Proprietary designs (large x86-based, Itanium, and RISC-based designs)

Rack-mounted:

• PICMG (PCI Industrial Computer Manufacturers Group)
• SSI CEB
• SSI TEB
• Blade servers

Note that some form factors fall into more than one category. Most server system boards have form factors that can accommodate multiple processors on the same board, sharing the same system bus, expansion slots, and RAM modules; such systems are more often termed Symmetric Multiprocessing (SMP) systems.
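In practice, the operating system on an SMP board simply sees more processors. As a quick illustration, here is a small Python sketch, assuming a Linux system where the kernel exposes one record per logical CPU in /proc/cpuinfo; the field names are Linux conventions rather than any cross-platform standard.

    # Count logical CPUs and physical sockets on a Linux SMP server by
    # parsing /proc/cpuinfo (Linux-specific; fields vary by architecture).
    import os

    def count_cpus(cpuinfo_path="/proc/cpuinfo"):
        sockets = set()
        logical = 0
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("processor"):
                    logical += 1                             # one per logical CPU
                elif line.startswith("physical id"):
                    sockets.add(line.split(":")[1].strip())  # one id per socket
        return logical, len(sockets) or 1    # fall back to 1 if ids are absent

    logical, sockets = count_cpus()
    print(f"{logical} logical CPUs across {sockets} socket(s)")
    print(f"os.cpu_count() reports {os.cpu_count()}")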

1 – PCI Expansion Slot
3 – PCI-X Slot
4 – Dual-Core Processor Sockets
5 – Memory Sockets
7 – Power Connector

Figure 1.7: Symmetric Multiprocessing Board

Session II: The Processor

Server processors can be divided into three major categories:

• CISC-based processors: CISC (complex instruction set computer) is the processor architecture used by x86 processors such as the Xeon, Opteron, and other server and desktop processors manufactured by Intel, AMD, and other vendors. It uses variable-length instructions.

• RISC-based processors: RISC (reduced instruction set computer) uses a smaller number of instructions than CISC to improve processor efficiency. Server processors such as the PowerPC, Power Architecture, PA-RISC family, MIPS, Alpha, and Sun SPARC are RISC based.

• EPIC-based processors: EPIC (explicitly parallel instruction computer) groups instructions together into very long instruction words (VLIW), preloads the instructions most likely to be used next, and uses a technique called predication to run all possible code branches in parallel and discard those that are not needed, as opposed to the branch prediction used in CISC processors. Intel's Itanium server processors are EPIC based.

If you are planning to purchase a preconfigured server, you can choose from products in any of the listed categories. A bit less than half of the servers on the market fall into the "Wintel" (Intel/Intel-compatible hardware running Windows) category. Many users prefer other server platforms. Which category should you choose? Here are some considerations:






• Lowest initial cost: x86-based servers, because they closely resemble x86 desktop computers in general design (and often use the same components), are the least expensive to purchase and to customize.

• Scalability: If you need one to eight processors in a single server, any processor category can fit the bill. However, if you need a larger number of processors, you should consider a server based on RISC or EPIC technologies.

• Operating system support: The server operating system you prefer will have a major influence on your choice of server processor. If you prefer Linux, you can choose a platform based on virtually any current or recent server processor. However, if you prefer a different server operating system, your choices are more limited. With x86 processors, you can choose from various versions of Windows 2000 Server or Windows Server 2003, popular Linux server and enterprise-level distributions, and Sun Solaris. Itanium processors can run 64-bit versions of Windows Server 2003 and Linux. Sun SPARC processors can run Linux or Solaris. PowerPC servers from Apple can run Mac OS X or Linux, while PowerPC and Power architecture servers from IBM can run Linux or AIX 5L, a proprietary version of UNIX. Hewlett-Packard's PA-RISC-based HP 9000 series servers use HP-UX, a proprietary version of UNIX. Hewlett-Packard AlphaServers run OpenVMS, Tru64 UNIX, or Linux.

Processor Specifications

Server processors, like all other processors, can be identified by two main parameters: how wide their data path is and how fast they are. The speed of a processor is a fairly simple concept: speed is counted in megahertz (MHz) and gigahertz (GHz), which mean millions and billions of cycles per second, respectively, and faster is better. The width of a processor is a little more complicated to discuss because three main specifications in a processor are expressed in width:
• Data I/O bus
• Address bus
• Internal registers

Note that the processor side bus (PSB) is also called the front-side bus (FSB) or CPU bus. These terms all refer to the bus that is between the CPU and the main chipset component (North Bridge or memory controller hub [MCH]). Intel uses the FSB or PSB terminology, whereas AMD uses only FSB. CPU bus is the least confusing of the terms, and it is also completely accurate. Generally speaking, the speed of a processor can be determined by two factors:
• The internal clock speed of the processor
• The speed of the CPU bus

In terms of clock speed, the fastest server processors from each manufacturer include the following in order, from highest to lowest clock speed:
• Intel Pentium 4 3.8GHz (single-core)
• Intel Pentium D, Pentium Extreme Edition 3.2GHz (dual-core)
• AMD Opteron 854/254/154 2.8GHz (single-core)
• PowerPC 970MP/G5 2.5GHz (dual-core)
• AMD Opteron 880/280/180 2.4GHz (dual-core)
• PA-8900 2.0GHz (dual-core)
• IBM Power5+ 1.9GHz (dual-core)
• Sun UltraSPARC III 1.593GHz
• Sun UltraSPARC IV 1.35GHz (dual-core)
• Alpha 21364 1.3GHz
• MIPS R16000 700MHz

Clock speed figures by themselves can be very misleading. Other factors, including chip architecture, the size of the address bus, single or dual-core design, the presence and size of L2 (and L3) memory cache, the speed of the processor bus, the number of processors installed, and whether the system is using SMP or NUMA multiprocessing also affect server performance. Dual-core designs, which have become very common in the past few years, provide a huge benefit to servers because they provide virtually every benefit of multiple processors, even if the server has room for only one processor. They enable servers to handle more programs and execution threads without slowing down. For example, a single-processor server using a dual-core processor can handle multitasking almost as well as a dual-processor server. A two-way (dual-processor) server becomes the virtual equivalent of a four-way server if dual-core processors are used, and so forth.

Although most dual-core processors are slightly lower in clock speed than single-core processors from the same family because of thermal issues, the increased workload, especially in server-oriented tasks, makes a dual-core processor worthwhile. Other portions of the server design, such as system memory speeds and sizes, the speed and interfaces used for network adapters and hard disks, and the operating system used, also affect the actual throughput of a particular server. Because a server provides services to client devices, a server's performance is measured by metrics such as the number of simultaneous clients that can be serviced and the speed at which each client receives information.

The Data I/O Bus
Perhaps the most important features of a processor are the speed and width of its external data bus. This defines the rate at which data can be moved into or out of the processor, also called the throughput. The processor bus discussed most often is the external data bus: the bundle of wires (or pins) used to send and receive data. The more signals that can be sent at the same time, the more data that can be transmitted in a specified interval and, therefore, the faster (and wider) the bus. Having a wider data bus is like having a highway with more lanes, which enables greater throughput. Servers that work with large databases benefit the most from wide data buses.

Data in a computer is sent as digital information consisting of a time interval in which a single wire carries a specified voltage to signal a 1 data bit or 0V to signal a 0 data bit. The more wires you have, the more individual bits you can send in the same time interval. A good way to understand this flow of information is to consider a highway and the traffic it carries. If a highway has only one lane for each direction of travel, only one car at a time can move in a certain direction. To increase traffic flow, you can add another lane in each direction so that twice as many cars pass in a specified time. You can think of an 8-bit chip as a single-lane highway, because 1 byte flows through at a time (1 byte equals 8 individual bits). The 16-bit chip, with 2 bytes flowing at a time, resembles a two-lane highway. You might have four lanes in each direction to move a large number of automobiles; this structure corresponds to a 32-bit data bus, which can move 4 bytes of information at a time. Taking this further, a 64-bit data bus is like an 8-lane highway moving data in and out of the chip. All current server processors feature 64-bit (8-byte-wide) data buses; therefore, they can transfer 64 bits of data at a time to and from the motherboard chipset or system memory.
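To put numbers on the highway analogy, peak theoretical bus throughput is simply the width in bytes multiplied by how many transfers happen per second. The figures in the Python sketch below are illustrative, not the specifications of any particular processor:

    # Peak theoretical throughput of a processor's external data bus:
    # (bus width in bits / 8) x bus clock x transfers per clock.
    def bus_throughput(width_bits, clock_hz, transfers_per_clock=1):
        return (width_bits / 8) * clock_hz * transfers_per_clock  # bytes/second

    # A 64-bit bus clocked at 200MHz with 4 transfers per clock (an 800MHz
    # effective bus) moves 6.4GB per second.
    print(bus_throughput(64, 200e6, 4) / 1e9, "GB/s")   # -> 6.4
    # The same clock on a 32-bit bus yields half the throughput.
    print(bus_throughput(32, 200e6, 4) / 1e9, "GB/s")   # -> 3.2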

The Address Bus
The address bus is the set of wires that carries the addressing information used to describe the memory location to which the data is being sent or from which the data is being retrieved. As with the data bus, each wire in an address bus carries a single bit of information, and this single bit is a single digit in the address. The more wires (digits) used in calculating these addresses, the greater the total number of address locations. The size (or width) of the address bus indicates the maximum amount of RAM a chip can address. Figure 2.1 shows how the data, address, and control buses relate to each other.

Figure 2.1: How data, address, and control buses connect the processor, memory, and I/O components of a server.

The width of the data bus and the size of the address bus are not dependent on each other, and chip designers can use whatever size they want for each. Usually, however, chips with larger data buses have larger address buses. The sizes of the buses can provide important information about a chip's relative power, measured in two ways: the size of the data bus is an indication of the chip's information-moving capability, and the size of the address bus tells how much memory the chip can handle.
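The relationship between address bus width and maximum memory is a simple power of two: N address lines can select 2^N distinct byte locations. A small Python sketch makes the usual cases concrete:

    # Maximum directly addressable memory is 2^N bytes for an N-bit address bus.
    def addressable_bytes(address_bus_width):
        return 2 ** address_bus_width

    for width in (32, 36, 64):
        gb = addressable_bytes(width) / 2**30
        print(f"{width}-bit address bus -> {gb:,.0f}GB addressable")
    # 32 bits -> 4GB; 36 bits -> 64GB (the PAE case discussed below);
    # 64 bits -> far more than any current system installs.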

Internal Registers
The size of the internal registers indicates how much information the processor can operate on at one time and how it moves data around internally within the chip. This is sometimes also referred to as the internal data bus, to distinguish it from the external data bus, which connects the processor to memory. A register is a holding cell within the processor; for example, the processor can add numbers in two different registers, storing the result in a third register. The register size determines the size of data on which the processor can operate. The register size also describes the type of software or commands and instructions a chip can run. As described in the following sections, server processors fall into two categories: those with 32-bit registers and those with 64-bit registers.

32-Bit Processors
Internal registers are often twice the size of the external data bus, which means the chip requires two cycles to fill a register before the register can be operated on. Server processors with a 32-bit register size and a 64-bit data bus, such as the Pentium Pro and its many descendants (up through most versions of the Pentium 4 and the AMD Athlon MP), use this design. To process data efficiently, most 32-bit processors use multiple 32-bit pipelines for processing information. (A pipeline is the section of a processor that performs calculations on data.) As a result, recent processors can perform six or more operations per clock cycle. Although 32-bit processors have reached very high clock speeds and process information efficiently, their biggest drawback is the 4GB limit on directly addressable memory.

Note: Many Intel processors use a 36-bit address bus, which translates into 64GB of addressable memory. However, most processors with a 36-bit address bus have 32-bit register sizes, which limits addressable memory to 4GB. How can a 32-bit server support more than 4GB of RAM? Windows 2000 Advanced Server and Windows Server 2003 Enterprise Edition include a translation feature called Physical Address Extension (PAE), which enables the memory above 4GB to be accessed in 4GB blocks, up to 16GB. To access memory above 16GB, some database applications such as SQL Server and Oracle use special APIs called Address Windowing Extensions. These workarounds are not necessary with processors that have 64-bit registers.

64-Bit Processors
Processors with a 64-bit register size can work with programs and data that exceed the 4GB limit imposed by 32-bit architecture. Thus, these processors can be more suitable than 32-bit processors for server use in applications that require manipulation of extremely large amounts of data. However, there are profound differences in how different 64-bit processor architectures work with existing 32-bit operating systems and programs. If you plan to use a 64-bit server with existing software, you need to carefully consider these differences before you choose a particular processor platform. Current 64-bit server processors include the following:

x86-based 64-bit processors (can also run x86 32-bit software at full speed):
• AMD Opteron
• Intel Pentium D
• Intel Pentium Extreme Edition
• Intel Pentium 4 (selected models)
• Intel Xeon (selected models)

EPIC-based 64-bit processors (can also run x86 32-bit software, but not at top speed):
• Intel Itanium
• Intel Itanium 2

RISC-based 64-bit processors (compatibility with 32-bit RISC-based software varies):
• Power3 and higher series
• Alpha (all models)
• PowerPC G5/970 series
• PA-RISC 8xxx series
• MIPS R4xxx and higher series
• UltraSPARC (all models)

As you can see from this list, although x86-compatible server processors with 64-bit capabilities are relatively new, 64-bit capabilities have been available since the mid-1990s in RISC-based processors. Keep in mind that a 64-bit processor can take full advantage of its architecture only when running a 64-bit operating system and 64-bit applications.

Multiple CPUs

Most of the processors discussed support multiple-processor operation. Servers that contain multiple processors have the following major advantages:




• Superior performance when running multithreaded tasks. In this scenario, different processors can run different program threads being used by a single program. Relatively few multithreaded applications are currently available (mostly CAD, high-end graphics, and 3D rendering programs).

• Superior performance when running multiple single-threaded applications. In an environment where applications are not multithreaded, a multiprocessor system can still run individual applications on separate processors.

SMP (Symmetric Multiprocessing) is the multiple-CPU design used by most multiprocessor-equipped servers, up to four-way designs. In a server that uses SMP, all processors use the same operating system and memory space. The operating system kernel subdivides multiple threads of a single multithreaded task, or separate single-threaded tasks, among the processors.

Cache Memory

Processor core speeds have increased dramatically in the past decade. Although memory speeds have increased as well during the same time period, memory speeds have not kept up with processor performance. How could you run a processor faster than the memory from which you feed it without having performance suffer terribly? The answer was cache. In its simplest terms, cache memory is a high-speed memory buffer that temporarily stores data the processor needs, allowing the processor to retrieve that data faster than if it came from main memory. But there is one additional feature of a cache over a simple buffer: intelligence. A cache is a buffer with a brain. A buffer holds random data, usually on a first-in, first-out (FIFO) or first-in, last-out (FILO) basis. A cache, on the other hand, holds the data the processor is most likely to need in advance of it actually being needed. This enables the processor to continue working at full speed, or close to it, without having to wait for data to be retrieved from slower main memory. Cache memory is usually made up of static RAM (SRAM) integrated into the processor die, although older systems with cache also used chips installed on the motherboard. Processors use various types of cache algorithms to determine what data to store in cache and what data to remove from cache to make room for new data. Common methods include the following:
• Least recently used (LRU): This method discards the oldest data from the cache.
• Least frequently used (LFU): This method discards from the cache the data that is used less often than other data.
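As an illustration of the first policy, here is a minimal LRU cache sketched in Python with OrderedDict. Real processor caches implement the policy in hardware, per cache set; this toy only demonstrates the discard rule:

    # A minimal LRU replacement policy, sketched with Python's OrderedDict.
    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.data = OrderedDict()          # keeps insertion/access order

        def get(self, address):
            if address not in self.data:
                return None                    # cache miss
            self.data.move_to_end(address)     # mark as most recently used
            return self.data[address]

        def put(self, address, value):
            if address in self.data:
                self.data.move_to_end(address)
            self.data[address] = value
            if len(self.data) > self.capacity:
                self.data.popitem(last=False)  # evict least recently used

    cache = LRUCache(2)
    cache.put(0x1000, "a"); cache.put(0x2000, "b")
    cache.get(0x1000)                          # 0x1000 is now most recent
    cache.put(0x3000, "c")                     # evicts 0x2000, not 0x1000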

At least two levels of processor/memory cache are used in a modern server: Level 1 (L1) and Level 2 (L2). Some server processors, such as the Itanium series and the latest Xeon models from Intel, as well as some RISC-based processors, also use Level 3 (L3) cache.

Internal Level 1 Cache

All server processors include an integrated L1 cache and controller. This feature was first introduced to x86-compatible processors with the 486 family. Historically, server processors have featured L1 cache sizes from as little as 8KB to as much as 3MB, and most recent server processors have L1 cache sizes that fall between these extremes. If the processor had to get information from main memory every time it needed to process information, the processor would spend a lot of time waiting for memory. Because L1 cache is always built in to the processor die, it runs at the full core speed of the processor internally. Full-core speed means that the cache runs at the same speed as the internal processor core rather than the slower external motherboard speed. This cache is basically an area of very fast memory built in to the processor and used to hold some of the current working set of code and data. Cache memory can be accessed with no wait states because it is running at the same speed as the processor core.

Using cache memory reduces traditional system bottlenecks because system RAM is almost always much slower than the CPU; the performance difference between memory and CPU speed has become especially large in recent systems. Using cache memory prevents the processor from having to wait for code and data from much slower main memory, therefore improving performance. Without the L1 cache, a processor would frequently be forced to wait until system memory caught up. If the data the processor wants is already in the internal cache, the CPU does not have to wait. If the data is not in the cache, the CPU must fetch it from the Level 2 cache or (in less sophisticated system designs) from the system bus, meaning directly from main memory.

To mitigate the dramatic slowdown every time an L1 cache miss occurs, a secondary cache, L2, is employed. At one time, L2 cache was located outside the processor die. Depending on the processor, it might have been located on the motherboard or in a bulky processor cartridge (as with the Pentium II, Pentium II Xeon, and some versions of the Pentium III and Pentium III Xeon processors). In such cases, there was a slowdown when an L1 cache miss took place but the system had the desired information in L2 cache.

L3 cache is the third level of cache, and it is present in only a few very-high-performance server and high-performance workstation processors at this time. These include the Intel Pentium 4 Extreme Edition, Xeon MP, the Intel Itanium family, and a few RISC-based processors. L3 cache is checked after the processor checks L1 and then L2 cache for the necessary information. As with L2 cache, a large L3 cache improves performance by storing a larger amount of the contents of main memory for quick access by the processor. The location of the L3 cache affects its speed: if the L3 cache is off-die, it runs at a slower speed than the processor core. Consequently, the system's performance is reduced when L1 and L2 cache misses take place but L3 cache contains the desired information. However, even in such cases, accessing L3 cache is generally faster than accessing main memory. Processors with on-die L3 cache access it at the same speed as L1 and L2 cache; in such cases, there is little practical distinction between the operations of L2 and L3 cache.

Session III: System Memory

Memory is the workspace for a system's processor. It is a temporary storage area where the programs and data being operated on by the processor must reside. Memory storage is considered temporary because the data and programs remain there only as long as the server has electrical power and is not reset. Before the server is shut down or reset, any data that has been changed should be saved to a more permanent storage device (usually a hard disk) so it can be reloaded into memory in the future.

Memory is often called RAM, for random access memory. RAM's contents are volatile, requiring frequent power refreshes to remain valid. Main memory is called RAM because you can randomly (as opposed to sequentially) access any location in memory. This designation is somewhat misleading and often misinterpreted. Read-only memory (ROM), for example, is also randomly accessible, yet it is usually differentiated from system RAM because ROM, unlike RAM, maintains its contents without power and can't normally be written to. Disk memory is also randomly accessible, but we don't consider that RAM, either. Over the years, the definition of RAM has changed from being a simple acronym to referring to the primary memory workspace the processor uses to run programs, which is usually constructed of a type of chip called dynamic RAM (DRAM). One of the characteristics of DRAM chips (and, therefore, RAM in general) is that they store data dynamically, which really has two meanings. One meaning is that the information can be written to RAM repeatedly at any time. The other has to do with the fact that DRAM requires the data to be refreshed (essentially rewritten) every 15ms (milliseconds) or so. A type of RAM called static RAM (SRAM) does not require this periodic refreshing. An important characteristic of RAM in general is that data is stored only as long as the memory has electrical power.

When we talk about a computer's memory, we usually mean the RAM or physical memory in the system (the memory modules used to temporarily store currently active programs and data). It's important not to confuse memory with storage, which refers to things such as disk and tape drives (although they can be used as a substitute for RAM called virtual memory). RAM can refer to both the physical chips/modules that make up memory in a system and the logical mapping and layout of that memory. Logical mapping and layout refer to how the memory addresses are mapped to actual chips and what address locations contain which types of system information.

Memory temporarily stores programs when they are running, along with the data being used by those programs. RAM chips are sometimes termed volatile storage because when you turn off a computer or an electrical outage occurs, whatever is stored in RAM is lost unless it has been saved to a hard disk or other storage device. Because of the volatile nature of RAM, many computer users make it a habit to save their work frequently. (Some software applications can do timed backups automatically.) Launching a computer program from either local or network storage brings files into RAM, and as long as they are running, computer programs reside in RAM. The CPU executes programmed instructions in RAM and also stores results in RAM. The server transmits data to onboard storage or to connected workstations that request the information.

Physically, the main memory in a system is a collection of chips, or modules containing chips, that usually plug in to the motherboard. These chips or modules vary in their electrical and physical designs and must be compatible with the system into which they are being installed in order to function properly. This session discusses the various types of chips and modules that can be installed in different systems.

Next to the processor and motherboard, memory can be one of the most expensive components in a modern PC, although the total amount of money spent on memory for a typical system has declined over the past few years. Server memory is somewhat more expensive than desktop memory because it supports data-protection and reliability features such as parity/error correcting code (ECC) and a signal buffer. (Memory that uses a signal buffer chip is known as registered memory.) If you build a new server, you can't expect to be able to use just any existing server memory in your inventory. Similarly, if you upgrade the motherboard in an existing server, it's likely that the new motherboard will not support the old motherboard's memory. Therefore, you need to understand the various types of memory on the market today so you can determine which types are required by which systems and thus more easily plan for future upgrades and repairs. To better understand physical memory in a system, you should see where and how it fits into the system. Three main types of physical memory are used in modern systems:
• Read-only memory (ROM)
• Dynamic random access memory (DRAM)
• Static random access memory (SRAM)

ROM

ROM is a type of memory that can permanently or semipermanently store data. It is called read-only because it is either impossible or difficult to write to. ROM is also often referred to as nonvolatile memory because any data stored in ROM remains there even if the power is turned off. Therefore, ROM is an ideal place to put a server's startup instructions (that is, the software that boots the system). Note that ROM and RAM are not opposites, as some people seem to believe; they are both simply types of memory. In fact, ROM could technically be classified as a subset of the system's RAM. In other words, a portion of the system's RAM address space is mapped into one or more ROM chips. This is necessary to contain the software that enables the PC to boot up; otherwise, the processor would have no program in memory to execute when it was powered on.

The main ROM BIOS is contained in a ROM chip on the motherboard, but there are also adapter cards with ROM on them. ROM on adapter cards contains auxiliary BIOS routines and drivers needed by the particular card, especially for cards that must be active early in the boot process, such as video cards. Cards that don't need drivers active at boot time typically don't have ROM, because those drivers can be loaded from the hard disk later in the boot process. Most systems today use a type of ROM called electrically erasable programmable ROM (EEPROM), which is a form of flash memory. Flash is a truly nonvolatile memory that is rewritable, enabling users to easily update the ROM or firmware in their motherboards or any other components (video cards, SCSI cards, peripherals, and so on).

DRAM

DRAM is the type of memory chip used for most of the main memory in a modern PC. The main advantages of DRAM are that it is very dense, meaning you can pack a lot of bits into a very small chip, and it is inexpensive, which makes purchasing large amounts of it affordable. The memory cells in a DRAM chip are tiny capacitors that retain a charge to indicate a bit. The problem with DRAM is that it is dynamic: because of its design, DRAM must be constantly refreshed; otherwise, the electrical charges in the individual memory capacitors drain, and the data is lost. A refresh occurs when the system memory controller takes a tiny break and accesses all the rows of data in the memory chips. Most systems have a memory controller (built in to the North Bridge portion of the motherboard chipset on most servers, or found in the processor itself on the AMD Opteron), which is set for an industry-standard refresh rate of 15ms. This means that every 15ms, all the rows in the memory chip are automatically read to refresh the data.
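The cost of refresh can be estimated with back-of-the-envelope arithmetic: the fraction of time the memory is busy refreshing is the number of rows multiplied by the per-row refresh time, divided by the refresh interval. The numbers in the Python sketch below are assumptions chosen for illustration, not chip specifications:

    # Back-of-the-envelope estimate of bandwidth lost to DRAM refresh.
    rows_per_chip = 1024     # rows to refresh each interval (assumed)
    refresh_row_s = 100e-9   # assumed time to refresh one row: 100ns
    interval_s    = 15e-3    # the 15ms industry-standard interval noted above

    busy_s   = rows_per_chip * refresh_row_s  # refresh time per interval
    overhead = busy_s / interval_s
    print(f"refresh overhead ~ {overhead:.1%}")
    # ~0.7% with these assumptions, consistent with the sub-1% figure
    # cited in the text below.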

Unfortunately, refreshing memory takes processor time away from other tasks because each refresh cycle takes several CPU cycles to complete. A few systems allow you to alter the refresh timing parameters via the CMOS Setup, but be aware that increasing the time between refresh cycles to speed up your system can allow some of the memory cells to begin draining, which can cause random soft memory errors to appear. (A soft error is a data error that is not caused by a defective chip.) On a server using ECC memory, the server will automatically correct a single-bit soft error without any user intervention; a soft error involving two or more memory bits would trigger an error message. Servers using advanced ECC (chipkill) can correct up to four bit errors in the same memory module. However, a few low-end servers don't use ECC memory. On those servers, any type of memory error would cause the system to lock up and require a restart, and unsaved data would be lost. It is usually safer to stick with the recommended or default refresh timing. Because refreshing consumes less than 1% of a modern system's overall bandwidth, altering the refresh rate has little effect on performance.

It is almost always best to use default or automatic settings for any memory timings in the BIOS Setup. Most servers don't allow changes to memory timings and are permanently set to automatic settings. On an automatic setting, the motherboard reads the timing parameters out of the serial presence detect (SPD) ROM found on the memory module and sets the cycling speeds to match. Even if you're accustomed to altering memory settings on desktop PCs to boost performance, such changes often require a lot of experimentation to find a balance between performance and stability. With a server, you should always opt for stability over a relatively minute gain in performance.

The transistor for each DRAM bit cell reads the charge state of the adjacent capacitor: if the capacitor is charged, the cell is read to contain a 1; no charge indicates a 0. The charge in the tiny capacitors is constantly draining, which is why the memory must be refreshed constantly. Even a momentary power interruption, or anything that interferes with the refresh cycles, can cause a DRAM memory cell to lose its charge and therefore the data. If this happens in a running system, it can lead to blue screens, global protection faults, corrupted files, and any number of system crashes. Therefore, you should use battery backup systems (UPS devices) and high-quality surge suppressors on your servers. DRAM is used in desktop and server systems because it is inexpensive, and the chips can be densely packed, so a lot of memory capacity can fit in a small space. Unfortunately, DRAM is also slow, typically much slower than the processor. For this reason, many types of DRAM architectures have been developed to improve performance.

SRAM: Cache Memory

Another distinctly different type of memory exists that is significantly faster than most types of DRAM. SRAM is so named because it does not need the periodic refreshes that DRAM needs. Because of how SRAM is designed, not only are refreshes unnecessary, but SRAM is much faster than DRAM and much more capable of keeping pace with modern processors. SRAM is available with access times of 2ns (nanoseconds) or less, so it can keep pace with processors running 500MHz or faster. This is because of the SRAM design, which calls for a cluster of six transistors for each bit of storage. The use of transistors but no capacitors means that refresh cycles are not necessary because there are no capacitors to lose their charges over time. As long as there is power, SRAM remembers what is stored. Unfortunately, SRAM is too expensive and too large to use as main memory. However, SRAM is a perfect choice for memory caching. Cache memory runs at speeds close to or equal to the processor speed and is the memory the processor usually directly reads from and writes to. During read operations, the data in the high-speed cache memory is resupplied from the lower-speed main memory or DRAM in advance. Cache memory is built in to all modern server processors, starting with the Pentium and Pentium Pro.

Cache effectiveness is expressed as a hit ratio, which is the ratio of cache hits to total memory accesses. A hit occurs when the data the processor needs has been preloaded into the cache from the main memory, meaning that the processor can read it from the cache. A cache miss occurs when the cache controller does not anticipate the need for a specific address and the desired data is not preloaded into the cache. In the case of a miss, the processor must retrieve the data from the slower main memory instead of the faster cache. Anytime the processor reads data from main memory, it must wait longer because the main memory cycles at a much slower rate than the processor. If a processor with integral on-die cache is running at 3400MHz (3.4GHz), both the processor and the integral cache would be cycling at 0.29ns, while the main memory would most likely be cycling 8.5 times more slowly, at 2.5ns (200MHz DDR); the memory would be running at only a 400MHz equivalent rate. So, every time the 3.4GHz processor read from main memory, it would effectively slow down 8.5-fold to only 400MHz! The slowdown is handled by having the processor execute wait states, which are cycles in which nothing is done; the processor essentially cools its heels while waiting for the slower main memory to return the desired data. Obviously, you don't want your processors slowing down, so cache function and design become more important as system speeds increase.

To minimize the processor being forced to read data from the slow main memory, two or three stages of cache usually exist in a modern system, called Level 1 (L1), Level 2 (L2), and Level 3 (L3). The L1 cache is also called the integral, or internal, cache because it has always been built directly in to the processor as part of the processor die (the raw chip). Therefore, the L1 cache always runs at the full speed of the processor core and is the fastest cache in any system. L1 cache has been a part of all processors since the 486.

To improve performance, later processor designs from Intel (starting with the Pentium Pro of 1995) and AMD (starting with the K6-III of 1999) included the L2 cache as part of the processor die (earlier systems placed the L2 cache on the motherboard). Although the Pentium Pro included L2 cache in the processor core, running at full speed, it was a very expensive processor to build, and Intel switched over to slot-based designs for the Pentium II, Pentium II Xeon, and early versions of the Pentium III and Pentium III Xeon, a design also used by the original AMD Athlon. These processors placed L2 cache in separate chips from the processor core, and the L2 cache ran at half the speed (or sometimes a bit less) of the processor core. However, by late 1999, with the introduction of the Pentium III Coppermine and Pentium III Xeon (Advanced Transfer Cache) processors, all of Intel's subsequent processors for servers as well as desktops have placed full-speed L2 cache in the processor core. Likewise, AMD's Socket A Athlon (first introduced in 2000) and Athlon MP server processors led AMD's return to on-die full-speed L2 cache. Today, all server (as well as desktop) processors use on-die L2 cache. In chips with on-die L2, the cache runs at the full core speed of the processor and is much more efficient than older designs that placed L2 cache outside the processor core.

On-die L3 cache has been present in high-end workstation and server processors such as the Xeon and Itanium families since 2001. Having more levels of cache helps mitigate the speed differential between the fast processor core and the relatively slow motherboard and main memory. L2 and L3 cache is faster and is accessed much more quickly than main memory.
Thus, virtually all motherboards designed for processors with built-in cache don't have any cache on the board; the entire cache is contained in the processor or processor module instead. The key to understanding both cache and main memory is to see where they fit in the overall system architecture.
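The payoff from caching can be summarized in one formula: average access time = hit ratio x cache time + miss ratio x memory time. The Python sketch below reuses the cycle times worked out earlier (0.29ns for a 3.4GHz core and its on-die cache, 2.5ns for 200MHz DDR memory); the hit ratios are illustrative assumptions:

    # Average memory access time as a function of cache hit ratio.
    def effective_access_ns(hit_ratio, cache_ns=0.29, memory_ns=2.5):
        return hit_ratio * cache_ns + (1 - hit_ratio) * memory_ns

    for hit in (0.80, 0.95, 0.99):
        print(f"hit ratio {hit:.0%}: {effective_access_ns(hit):.2f}ns average")
    # Even a 95% hit ratio keeps the average near cache speed (~0.40ns),
    # which is why a small amount of fast SRAM hides most of DRAM's latency.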

Session IV: The Disk Subsystem

Primary storage is the PC's main memory, or RAM, and serves as the PC's active storage, temporarily holding data and instructions while they're in use by the system. The hard disk (along with CD-ROM and floppy drives) is a secondary storage device that provides permanent storage for the user's data, programs, and other objects, even after the power goes off.

Hard disks (and floppy disks) organize their media into logical divisions: cylinders, tracks, sectors, and clusters. This organization, along with the servo system on the disk, is the foundation of the addressing system used to locate, store, and retrieve data on the disk. The basic organizational elements on hard and floppy disks are:

• Tracks: A floppy disk has around 80 tracks, and a hard disk can have 1,000 tracks or more. Figure 3.1 illustrates how disk tracks are concentric bands that complete one circumference of the disk. The first track on a disk, typically track 0, is on the outside edge of the disk.

• Sectors: Disks are divided into cross-sections that intersect all tracks, as illustrated in Figure 3.1. The result is that each track is broken into a number of addressable pieces, called sectors. A sector is 512 bytes in length; a hard disk has from 100 to 300 sectors per track, and a floppy disk from 9 to 18 sectors per track. Sectoring creates addressable elements on a track, including its starting point.

• Cylinders: All the tracks with the same number on all the platters (the flat round metallic disks located inside the hard disk) of a hard disk drive create a logical entity called a cylinder. The read/write heads of a disk move in unison and are all over the same track number on each disk platter. A hard disk with three platters, as illustrated in Figure 3.2, has six disk surfaces and six track 52s, which logically create cylinder 52. Cylinders are not used on floppy disks.

Figure 3.1: Track and Sectors on Disk
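The cylinder/head/sector scheme maps directly onto a linear block number (the logical block address, or LBA, used by modern addressing). The Python sketch below assumes a hypothetical geometry consistent with the figures above (three platters, six heads, 200 sectors per track):

    # Converting a cylinder/head/sector (CHS) address to a linear block
    # number (LBA). Geometry values here are hypothetical.
    HEADS = 6                 # 3 platters x 2 surfaces
    SECTORS_PER_TRACK = 200   # within the 100-300 range cited above

    def chs_to_lba(cylinder, head, sector):
        # Sectors number from 1 on each track; cylinders and heads from 0.
        return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

    print(chs_to_lba(0, 0, 1))    # first sector on the disk -> LBA 0
    print(chs_to_lba(52, 3, 10))  # a sector inside "cylinder 52" from the text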

Figure 3.2: Disk Cylinder

• Clusters: Clusters are logical groupings of disk sectors used by operating systems to track and transfer data to and from the disk. Typically, a cluster comprises around 64 sectors, but the total capacity of the disk drive and the operating system determine the number of sectors in a cluster on any particular PC. Operating systems that use clusters as the basic transfer unit operate in block mode.

Disk Drive Capacity

Disk drive capacities are stated in megabytes (millions of bytes; MB) and gigabytes (billions of bytes; GB), but drives with terabyte (trillions of bytes; TB) capacity are beginning to appear. The following table lists the common data capacity measurements used with disk drives.

Measurement   Abbreviation   Capacity
Kilobyte      KB             One thousand bytes
Megabyte      MB             One million bytes
Gigabyte      GB             One billion bytes
Terabyte      TB             One trillion bytes
Petabyte      PB             One quadrillion bytes
Exabyte       EB             One quintillion bytes

Note: Most hard disk drives available today are in the 1 to 40GB range and come in many different types and styles. However, they use the same basic components, are constructed essentially the same way, and operate the same. Where they differ is in their storage capacities and speeds, how they encode the data, and the interface used to communicate with the PC.

Hard Disk Partitions

Partitioning is the process of subdividing a storage area into smaller regions for proper management of the space. When disks are partitioned, they can achieve the following purposes:

• Divide the disk into logical sub-drives that are assigned different drive letters, such as C:, D:, and E:, and can be separately addressed
• Load multiple operating systems on the same disk, such as Windows 98 and Linux, with each operating system in its own partition
• Support multiple file systems, such as NT File System (NTFS) and FAT32, on the same disk drive
• Separate data files from application files on different partitions to speed up data backups

Partitioning a hard disk can improve the disk's efficiency and overcome an operating system's sizing issues. For example, Windows sizes disk clusters proportionately to the size of the partition: a bigger partition can result in bigger clusters, which translates to numerous small unused spaces on the disk. Strategically reducing partition sizes, or creating many smaller partitions, reduces cluster sizes to better match the data.

A disk can have more than one partition, but some operating systems limit each partition's size. Thus, on some systems, larger disks must be divided into smaller partitions. DOS, Windows 3.x, and early releases of Windows 95 don't support partition sizes larger than 2GB. This means that to use the entire disk drive, a disk larger than 2GB must be divided into two or more partitions. Later versions of Windows (98, NT, 2000, and XP) allow you to create partitions up to 4TB (2GB being the norm), depending on the Windows version and the file system in use. A hard disk can be divided into two types of partitions:

• Primary partitions: A primary partition is created to hold an operating system and is typically the partition used to boot the PC. A hard disk can be divided into as many as four primary partitions, but only one primary partition can be active (set as the system partition) at a time. Another type of primary partition is the boot partition, which stores the operating system's files, such as the Windows folders.



• Extended partitions: An extended partition can be divided into as many as 23 logical sub-partitions. Each logical partition can be assigned its own drive identity, such as D:, E:, or F:, and used for any purpose other than as the active partition.
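The cluster-sizing point made above is easy to quantify: every file occupies a whole number of clusters, so the larger the cluster, the more space is wasted as slack. The file and cluster sizes in this Python sketch are illustrative:

    # Slack space: each file occupies a whole number of clusters, so larger
    # clusters waste more space on small files. Sizes are illustrative.
    import math

    def allocated_bytes(file_size, cluster_size):
        return math.ceil(file_size / cluster_size) * cluster_size

    files = [700, 3_500, 12_000, 40_000]          # file sizes in bytes
    for cluster in (4_096, 32_768):               # 4KB vs 32KB clusters
        used = sum(allocated_bytes(f, cluster) for f in files)
        print(f"{cluster // 1024}KB clusters: {used - sum(files)} bytes of slack")
    # The bigger the clusters, the more numerous small unused spaces pile up.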

Redundancy – The RAID Systems

As mentioned earlier, servers may be rated based on their availability, or the amount of uptime they have. The more reliable a server has to be, the more "insurance" you have to buy for contingencies. Lots of things can go wrong on a server: everything from driver corruption to power supply problems or the failure of an add-in card. To create servers that are truly fault tolerant, vendors go to great lengths to create redundancy as well as seamless failover. In fault-tolerant systems, essentially every component in the server is duplicated; when a component of the server fails, the entire server fails over quickly and seamlessly.

Storage systems, disks, and disk controllers are the components that fail most often in servers. A disk is a mechanical device, and it runs hot and at high speed. You can make these components very reliable, but you can't completely eliminate failure. When you multiply the number of disks to expand your storage capacity, the potential for problems also goes up linearly. Even in servers that don't need to be highly available and can suffer some downtime, protecting your storage assets (that is, data) is paramount. The popularization of inexpensive disk drive technology provided the impetus to create disk structures that could survive different types of failure, including data corruption, drive failures, host bus failures, and array failures. You do this by creating redundancy, which you can achieve in several different ways:
• If a drive fails, you can switch to another drive with exactly the same data, called a mirror.
• If your data spans more than one drive and a drive fails, you can reconstruct your data from additional parity data written to the volume that "fills in" the missing data, provided that the data is spread out evenly (that is, striped).
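The second approach works because parity is computed as the XOR of the data blocks, so any single missing block can be rebuilt from the surviving blocks plus the parity. Here is a simplified, RAID 4/5-style sketch in Python (the byte values are arbitrary):

    # How striped parity "fills in" missing data: parity is the XOR of the
    # data blocks, so any one lost block can be rebuilt from the survivors.
    d1, d2, d3 = 0b10110100, 0b01101001, 0b11000011   # one byte per drive

    parity = d1 ^ d2 ^ d3          # written alongside the data stripes

    # Drive 2 fails; XOR the remaining blocks with parity to reconstruct it.
    rebuilt_d2 = d1 ^ d3 ^ parity
    assert rebuilt_d2 == d2
    print(f"reconstructed block: {rebuilt_d2:08b}")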

These are two very different approaches to two very different problems, yet they are grouped together under a single concept called RAID (Redundant Array of Independent Disks). The redundancy discussed in the first case involves both hardware (a disk) and data (a data set). In the second case, the data set still exists, but hardware is added (a spare disk), and the data set is re-established in its original form. RAID can be relatively inexpensive (for example, replacing a drive) or very expensive (for example, running duplicate or triplicate data systems for redundancy). RAID, as an underpinning storage virtualization tool, gives a modern server powerful methods for managing data and can improve operation, availability, and performance immeasurably.

Types of RAID Systems

- JBOD or Spanning

Just a Bunch of Disks, more popularly known in Microsoft terms as a spanned disk, is a method for obtaining a larger volume size out of a group of smaller-sized drives. This scheme concatenates the disks so that the addressing scheme extends, or spans, over the drives in the array, creating a virtual volume, as shown in Figure 4.1.

Figure 4.1: JBOD

JBOD/spanning isn't RAID, but it does require the use of a controller. Given that almost any controller offers RAID of some kind, most people opt for a RAID solution instead of relying on a JBOD methodology. In Windows, for example, you could use the Disk Management snap-in of the MMC (formerly called the Disk Administrator) to create a JBOD. Creating a JBOD in Disk Management is just like creating individual partitions on multiple drives that your system can access. JBOD doesn't get recommended very often, but the fact that it is both cheap and easy to implement makes it popular in certain environments. With JBOD, when a disk fails, you can still recover the files on your working disks. However, to re-establish the volume, you need to restore from backup.

- RAID 0: Striping, No Parity

RAID has come to be codified into several different levels, which are different configurations or techniques for creating data structures. RAID 0 is the favorite disk structure for PC gamers who want to inexpensively achieve higher disk performance.

Figure 4.2: Striping - RAID Level 0

RAID 0 takes any number of disks (let's call that number Dn) and combines those disks into a single container or logical structure. The container is then formatted so that the data is "striped" across all the disks sequentially, as shown in Figure 4.2. As data is written, the disk head writes one set of blocks on one drive, followed by the head of the second disk writing the same set of blocks on the second, and so forth. At the end of the last drive in the array, the data writing continues on the first drive, picking up with the next set of blocks following the previous set. RAID 0 is sometimes referred to as a striped set.

Note: Because RAID 0 must stripe across similar-sized areas of a disk, most RAID 0 implementations require that you use the same disk drive size to create the array. This isn't always the case, though; some HBAs or disk managers allow you to create the same-sized partition on each drive and stripe across that. Whenever possible, you should opt for the solution that gives you the maximum amount of flexibility. Because the hard drive of today is the dinosaur of tomorrow, you want to be able to move up in capacity without having to swap out all your drives at once, so it pays to look carefully for the more forgiving RAID implementations.

For a small file, you might find that the data is on one single drive, but in most cases, the data is actually spread out across two or more drives, maybe even all the drives. For multifile operations, particularly in a multitasked environment, the data is spread out randomly among all the drives, and some operations can approach Dn times the performance of a single drive. Sometimes people use RAID 0 to split a collection of disks into a larger set of virtual disks. RAID 0 imposes a slight performance penalty for write operations because the controller has to manage the move from one disk to another. When it comes to read operations, however, RAID 0 gives you a considerable performance enhancement because you can have multiple heads reading data on multiple disks at the same time, with each of them concurrently sending that data to the host controller. The smaller and more numerous the reads, the more performance is boosted. For large data files that are written sequentially across all the disks, RAID 0 doesn't offer much of a performance boost, even for read operations, because the file still has to be read sequentially, which negates the advantage of having more drive heads accessing the data. RAID 0 is helpful when you have a large NFS server with multiple disks or where the operating system limits you to a smaller number of drive letters, such as the 24-letter limit in Windows. RAID 0 is the lowest form of RAID, and some people argue that it isn't RAID at all: there's no redundant data being written, and there's absolutely no data protection. You can't pull a disk out of RAID 0 because all disks are part of the data structure, although some RAID systems let you add a disk to RAID 0.

- RAID 1: Mirroring

RAID 1 is the simplest form of data redundancy you can have, and it is often used for small, entry-level RAID systems. In a RAID 1 configuration, your data is written in two places at essentially the same time, which is where the term mirror is derived. When you have a problem with one disk, you can break the mirror and switch over to the second copy on the other disk, which is presumably still functioning normally.
- RAID 1: Mirroring

RAID 1 is the simplest form of data redundancy you can have, and it is often used for small, entry-level RAID systems. In a RAID 1 configuration, your data is written to two places at essentially the same time, which is where the term mirror comes from. When you have a problem with one disk, you can break the mirror and switch over to the second copy on the other disk, which is presumably still functioning normally.
When your hardware problem is fixed, usually because you've replaced the malfunctioning drive, you can add the first volume back and rebuild the mirror.

Figure 4.3: Mirroring - RAID Level 1
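As a rough illustration of this write-both, read-either, rebuild-after-replacement behavior (a toy model only; real mirrors operate on raw blocks at the HBA or operating system level, and the class and method names here are invented):

```python
# A toy RAID 1 mirror: every write lands on both disks, reads can be
# served from either copy, and a replaced disk is rebuilt by block-copying
# the survivor. Dictionaries stand in for block devices.
class Mirror:
    def __init__(self):
        self.disks = [{}, {}]               # two block maps, one per drive

    def write(self, lba, data):
        for disk in self.disks:             # the HBA writes both copies
            disk[lba] = data

    def read(self, lba, from_disk=0):
        return self.disks[from_disk][lba]   # either copy satisfies a read

    def rebuild(self, failed_disk):
        # After the bad drive is swapped, copy the survivor block for block.
        self.disks[failed_disk] = dict(self.disks[1 - failed_disk])

m = Mirror()
m.write(0, b"ledger")
m.rebuild(failed_disk=0)                    # disk 0 replaced and re-mirrored
assert m.read(0, from_disk=0) == b"ledger"
```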

A mirror can be made of part of a disk (perhaps a volume), or a mirror can be the simple duplication of one whole drive to another. In the former case, the mirror is almost always created and managed by either the operating system or a program running under the operating system. In the latter case, block copying is faster when done in hardware at the HBA.

RAID 1 provides no performance enhancement with either read or write operations. Depending on how mirroring is implemented and the robustness of your controller, most often mirroring doesn't affect your server's performance at all. Most mirroring implementations done in hardware are fast enough that they don't have to buffer or cache content written to the second disk. Because you want your mirrored disk to be valid, there is some reading of the second disk to determine whether it has the same data at the same place as the first disk. Usually this data validation can be done in the background.

- RAID 10 (or 1+0): Mirroring with Striping

RAID 10, also called RAID 1+0, combines the RAID techniques you've just learned about in the two previous sections in a combination that some people describe as either nested or stacked RAID levels. RAID 10 gives you the redundancy of a mirrored volume or drive along with the performance benefit of a striped array. Because RAID 1+0 is nothing more than RAID 1 and RAID 0 combined, you will find that all RAID HBAs, from the cheapest ones you can find to the most expensive, offer RAID 1+0. Many people refer to this RAID level as RAID 10 or mirroring with striping to reflect the order in which the RAID levels are applied. Because RAID 10 is both high performance and fully redundant, this particular configuration is recommended for high-transaction-volume applications where you need performance and fast failover to a redundant data set should something fail on the first array. RAID 10 is one of the most popular RAID levels implemented. Large database servers, messaging servers, and web servers often implement RAID 10 for their disk arrays.

You can think of RAID 1+0 as being formed by applying a first layer of RAID (mirroring, in this case) to your physical disks. Then you overlay the second RAID level, which stripes data across the mirrored array.

- RAID 0+1: Striping with Mirroring

You might ask what the difference is between RAID 1+0 (or RAID 10) and the level of RAID you would create if you striped first and then applied a mirror, or whether, indeed, there is any difference at all. There is a difference, and this other RAID level is referred to as RAID 0+1, or striping with mirroring. (You usually don't see RAID 0+1 abbreviated as RAID 01 because there is concern that people will get this abbreviation mixed up with RAID 1.) Figures 4.4 and 4.5 illustrate the difference between RAID 10 and RAID 0+1. In this example, you have a number of disks (Dn) arranged in two equal-sized mirrored volumes.

Consider what happens when a drive fails. In RAID 10, you would break the mirror, replace the disk, and rebuild the mirror. A RAID 10 array can survive both the loss of one mirror (any number of disks) and disk failures in both arrays, as long as the failed disks in the two arrays are not corresponding members of the sets. That is, in a four-disk array where A1B1 is mirrored to A2B2, and the As and Bs have the same data, you could fail A1 and B2 or B1 and A2, but not A1 and A2 or B1 and B2.

Now let's consider RAID 0+1, where first you stripe and then you mirror. When you lose a disk in this configuration, one of your RAID 0 sets has been lost. When you break the mirror and add your new disk, the remaining disks in the stripe no longer correspond to the disks in the other striped mirror. The result is that you actually need to start from scratch and either rebuild the complete stripe from your working drives in the damaged mirror (plus the new one) or, as is often the case, start with a complete set of new drives that matches the damaged set being replaced. In this circumstance, RAID 0+1 has to write more data than RAID 1+0. Writing data is strictly a mechanical process, and except with very large volumes, you may not care if you have to rebuild the entire stripe. However, while RAID 10 survives the loss of disks in both arrays, RAID 0+1 does not. When you lose drives in both mirrors in RAID 0+1, you can no longer rely on RAID to get you going again, and you have to re-establish your array from backups (and invariably some new data is lost). Nothing is perfect in this world, but RAID 10 is a little more perfect than RAID 0+1.
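You can check this survivability argument mechanically. The following sketch enumerates every two-disk failure in the four-disk example from the text (A1B1 mirrored to A2B2); the set layouts and helper names are illustrative assumptions, not any vendor's tooling:

```python
from itertools import combinations

# Four disks matching the text's example: A1,B1 hold the same data as A2,B2.
# RAID 10 pairs each disk with its mirror twin; RAID 0+1 keeps two whole
# striped sets and mirrors one set against the other.
DISKS = ["A1", "B1", "A2", "B2"]
MIRROR_PAIRS = [{"A1", "A2"}, {"B1", "B2"}]   # RAID 10 structure
STRIPE_SETS = [{"A1", "B1"}, {"A2", "B2"}]    # RAID 0+1 structure

def raid10_survives(failed):
    # Survives as long as no mirrored pair loses both of its members.
    return all(not pair <= failed for pair in MIRROR_PAIRS)

def raid01_survives(failed):
    # Survives only while at least one striped set is completely intact.
    return any(not (s & failed) for s in STRIPE_SETS)

for pair in combinations(DISKS, 2):
    failed = set(pair)
    print(sorted(failed), "RAID 10:", raid10_survives(failed),
          " RAID 0+1:", raid01_survives(failed))
# RAID 10 survives 4 of the 6 two-disk failures; RAID 0+1 survives only 2.
```

Running the loop confirms the text: only the "same member" failures (A1 with A2, or B1 with B2) take down RAID 10, whereas RAID 0+1 dies whenever the failures touch both stripe sets.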

Figure 4.4: Mirroring with Striping - RAID Level 10. Needs four disks or more; high performance, but less reliable than a simple mirror.

Figure 4.5: Striping with Mirroring - RAID 0+1. Needs four disks or more; highly redundant and reliable. (The figure shows the HBA writing the same data simultaneously to both striped sets.)
- RAID 5: Striping with Parity

RAID 5 is the third most popular RAID level used in the industry. RAID 5 performs block-level striping across the disk set of a volume and writes some redundant information, called parity data, that lets you reconstruct missing data if a disk fails. The parity data is written across all the disks so that the array can survive a single disk failure and still be rebuilt. RAID 5 doesn't offer quite the performance of RAID 0 because of the overhead of reading and writing parity data, but it provides redundancy that you don't find in RAID 0, without having to duplicate the volume as part of a mirror. Thus, if you like, you can think of RAID 5 as a poor man's RAID 10. Let's look a little more closely at how RAID 5 works and what parity is all about. You need three or more disks to create a RAID 5 volume, as shown in Figure 4.6.


Figure 4.6: Striping with Parity - RAID Level 5. Needs three or more disks; fast reads, slow writes and rebuilds.

A block of data is subdivided by partitioning software into sectors, and the number of blocks is determined by the capacity of the disk. The number of sectors is a variable that you can define, usually 256 or fewer sectors. As each block of data is written to disk, the RAID 5 algorithm calculates a parity block that corresponds to the data and then writes the parity block on the same stripe, but not on the same disk. If stripe n in a three-disk RAID 5 volume has the parity block on disk 1, then stripe n+1 would have the parity block on disk 2, and stripe n+2 would place the parity block on disk 3, so that by stripe n+3, the parity block would return to disk 1 to start the cycle over, creating a distributed parity block arrangement.

RAID 5 imposes some overhead on write operations because in an n-disk system, 1/n of the data that is written isn't going to be anything you can use, except in the hopefully rare case in which you need to rebuild a RAID 5 array. RAID 5 requires at least three disks because if one disk of a two-disk RAID 5 array failed, you would lose half of your data set. The third disk in the RAID 5 set provides the extra redundancy you need to make the system work. To really make RAID 5 perform, you want to have more than three disks.

Although there is a write operation penalty, RAID 5 doesn't impose a read operation penalty on your disk I/O because the parity information is ignored on reads. You still have the same benefit of multiple heads on multiple disks reading data at the same time. Upon a read operation, RAID 5 performs a cyclic redundancy check (CRC) calculation to see whether the data is valid. CRC is an algorithm that reads the data and computes a checksum based on the data it contains. When an error is detected, RAID 5 can read the parity block in that stripe, locate the sector with the error, and use the parity data to reconstruct the sector with the incorrect checksum. This process usually occurs on the fly, without your noticing it (though it will post an appropriate error message). If a whole disk fails, RAID 5 can rebuild the missing disk from the data contained in the remaining disks, and it can even do so automatically to a hot spare. In most implementations, RAID 5 arrays can continue to collect data even while rebuilding a failed disk, although at a considerably slower rate because the disk heads are busy with the rebuild. RAID 5 can sustain the loss of a single drive without data loss.

RAID 5 can also be intelligent. Some RAID 5 systems have predictive functions that can determine whether a disk is likely to fail and initiate a rebuild based on that information. Data such as inaccessible clusters, hot spots on the disk, and other factors can be an indication of an impending disk failure. That predictive capability can be put to good use. In some instances, RAID 5 systems can analyze disk activity, figure out where your hot spots are, and move data around to lessen the stress on that area of the disk as well as to optimize data locations to improve disk head access. It's no wonder that RAID 5 is so popular: it really has a lot of things going for it.
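To see why a single XOR parity block is enough to rebuild a lost disk, consider this minimal sketch of one RAID 5 stripe on a three-disk array. The helper names are hypothetical and the work is done at the block level in hardware on a real controller, but the arithmetic is the same:

```python
from functools import reduce

NUM_DISKS = 3  # the RAID 5 minimum

def xor_blocks(blocks):
    """XOR equal-sized blocks together byte by byte."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def write_stripe(stripe, data_blocks):
    """Place the data blocks plus their XOR parity block across the disks;
    the parity block rotates to a different disk on each stripe."""
    layout = list(data_blocks)
    layout.insert(stripe % NUM_DISKS, xor_blocks(data_blocks))
    return layout                       # layout[d] is what disk d stores

def rebuild(layout, failed_disk):
    """XOR the surviving blocks to reconstruct the missing one."""
    return xor_blocks([blk for d, blk in enumerate(layout) if d != failed_disk])

stripe = write_stripe(0, [b"\x0f\x0f", b"\xf0\x01"])
assert rebuild(stripe, failed_disk=1) == stripe[1]   # lost block recovered
```

Because XOR is its own inverse, the same routine serves both to compute parity at write time and to regenerate a failed disk's contents, which is exactly the rebuild behavior described above, whether for one bad sector or an entire replaced drive.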

Glossary

This glossary contains computer and electronics terms that are applicable to the subject matter in this book.

access time
The time that elapses from the instant information is requested to the point at which delivery is completed. It's usually described in nanoseconds (ns) for memory chips and in milliseconds (ms) for disk drives. Most manufacturers rate average access time on a hard disk as the time required for a seek across one third of the total number of cylinders plus one half the time for a single revolution of the disk platters (latency). Also known as disk access time.

active partition
Any partition marked as bootable in the partition table.

adapter
(1) A device that serves as an interface between a system unit and the devices attached to it. It's often synonymous with circuit board, circuit card, or card. (2) A connector or cable adapter that changes one type of connector to another.

address
(1) An identifier that refers to where a particular piece of data or other information is found in the computer. (2) An identifier that refers to the location of a set of instructions.

address bus
One or more electrical conductors used to carry a binary-coded address from the microprocessor throughout the rest of the system.

ANSI (American National Standards Institute)
A nongovernmental organization founded in 1918 to propose, modify, approve, and publish data processing standards for voluntary use in the United States. It's also the U.S. representative to the ISO in Paris and the IEC.

API (application programming interface)
A system call (routine) that gives programmers access to the services provided by the operating system. In IBM-compatible systems, the ROM BIOS and DOS together present an API that a programmer can use to control the system hardware.

application
End-user-oriented software, such as a word processor, spreadsheet, database, graphics editor, game, or web browser.

ASCII (American Standard Code for Information Interchange)
A standard 7-bit code created in 1965 by Robert W. Bemer to achieve compatibility among various types of data processing equipment. The standard ASCII character set consists of 128 decimal numbers ranging from 0 to 127, which are assigned to letters, numbers, punctuation marks, and the most common special characters. In 1981, IBM introduced the extended ASCII character set with the IBM PC, extending the code to 8 bits and adding characters from 128 to 255 to represent additional special mathematic, graphics, and foreign characters.

ATA (AT Attachment)
An IDE disk interface standard introduced in 1989 that defines a compatible register set, a 40-pin connector, and its associated signals.

ATX
A motherboard and power supply form factor standard designed by Intel and introduced in 1995. It is characterized by a double row of rear external I/O connectors on the motherboard, a single keyed power supply connector, memory and processor locations that are designed not to interfere with the installation of adapter cards, and an improved cooling flow. The current specification, ATX 2.0, was introduced in December 1996.

backup disk
A disk that contains information copied from another disk. It is used to ensure that original information is not destroyed or altered.

bandwidth
(1) Generally, the measure of the range of frequencies within a radiation band required to transmit a particular signal; the difference between the lowest and highest signal frequencies. The bandwidth of a computer monitor is a measure of the rate at which a monitor can handle information from the display adapter. The wider the bandwidth, the more information the monitor can carry and the greater the resolution. (2) A measure of the data-carrying capacity of a given communications circuit or pathway. The bandwidth of a circuit is a measure of the rate at which information can be passed.

base 2
The computer numbering system that consists of two numerals: 0 and 1.

BIOS (basic input/output system)
The part of an operating system that handles the communications between the computer and its peripherals. Often burned into ROM chips or rewritable flash (EEPROM) memory chips found on motherboards and expansion cards, such as video cards and SCSI and ATA/IDE host adapters.

bit (binary digit)
Represented logically by 0 or 1 and electrically by 0 volts and (typically) 5 volts. Other methods are used to represent bits physically (tones, different voltages, lights, and so on), but the logic is always the same.

blade server
A thin circuit board that contains processors, memory, and, often, storage, which plugs into a special rack-mounted chassis. Multiple blade servers can occupy a single chassis.

boot
To load a program into a computer. The term comes from the phrase "pulling a boot on by the bootstrap."

boot manager
A program that enables you to select which active partition to boot from. Often supplied with aftermarket disk-partitioning programs, such as PartitionMagic, or installed by default when you install a Windows upgrade into a separate disk partition instead of replacing your old version.

BTX (Balanced Technology Extended)
A PC and server architecture introduced by Intel in 2003 that is designed to improve internal cooling by placing memory and processors in line with cooling fans.

buffer
A block of memory that is used as a holding tank to store data temporarily. A buffer is often positioned between a slower peripheral device and the faster computer. All data moving between the peripheral and the computer passes through the buffer. A buffer enables the data to be read from or written to the peripheral in larger chunks, which improves performance. A buffer that is x bytes in size usually holds the last x bytes of data that moved between the peripheral and CPU. This method contrasts with a cache, which adds intelligence to the buffer so that the most often accessed data, rather than the last accessed data, remains in the buffer (cache). A cache can improve performance greatly over a plain buffer.

bus
A linear electrical signal pathway over which power, data, and other signals travel. A bus is capable of connecting to three or more attachments. A bus is generally considered to be distinct from radial or point-to-point signal connections. The term comes from the Latin omnibus, meaning "for all." When used to describe a topology, bus always implies a linear structure.

byte
A collection of bits that makes up a character or other designation. Generally, a byte is 8 data bits.

cache
An intelligent buffer. By using an intelligent algorithm, a cache contains the data accessed most often between a slower peripheral device and the faster CPU.

Celeron
A family of processors that are low-cost versions of the Pentium II, Pentium III, and Pentium 4 processors. The major differences include a smaller amount of L2 cache and lower clock speeds.

client/server
A type of network in which every computer is either a server with a defined role of sharing resources with clients or a client that can access the resources on the server.

CPU (central processing unit)
A computer's microprocessor chip; the brains of the outfit. Typically, it is an IC using VLSI technology to pack several functions into a tiny area. The most common electronic device in the CPU is the transistor, of which several thousand to several million or more are found.

CRC (cyclic redundancy check)
An error-detection technique that consists of a cyclic algorithm performed on each block or frame of data by both sending and receiving modems. The sending modem inserts the results of its computation in each data block in the form of a CRC code. The receiving modem compares its results with the received CRC code and responds with either a positive or negative acknowledgment.

data
A group of facts processed into information. A graphic or textual representation of facts, concepts, numbers, letters, symbols, or instructions used for communication or processing.

data bus
A connection that transmits data between the processor and the rest of the system. The width of the data bus defines the number of data bits that can be moved into or out of the processor in one cycle.

default
(1) Any setting that is assumed at startup or reset by the computer's software and attached devices and that is operational until changed by the user. It is an assumption the computer makes when no other parameters are specified. When you type DIR without specifying the drive to search, for example, the computer assumes you want it to search the default drive. (2) In software, any action the computer or program takes on its own with embedded values.

DHCP (Dynamic Host Configuration Protocol)
A protocol for assigning dynamic IP addresses to devices on a network. With dynamic addressing, a device can have a different IP address every time it connects to the network. Routers, gateways, and broadband modems can function as DHCP hosts to provide IP addresses to other computers and devices on the network.

driver
A program designed to interface a particular piece of hardware to an operating system or other standard software.

dual-core processor
A processor that contains two distinct physical processor cores in a single package. This type of processor provides most of the benefits of dual-processor designs at lower cost.

dumb terminal
A screen and keyboard device with no inherent processing power, connected to a computer that is usually remotely located.

EBCDIC (Extended Binary Coded Decimal Interchange Code)
An IBM-developed 8-bit code for the representation of characters. It allows 256 possible character combinations within a single byte. EBCDIC is the standard code on IBM minicomputers and mainframes, but not on the IBM microcomputers, where ASCII is used instead.

ECC (error correcting code)
A type of system memory or cache that is capable of detecting and correcting some types of memory errors without interrupting processing.

EDO (extended data out) RAM
A type of RAM chip that enables a timing overlap between successive accesses, thus improving memory cycle time.

EEPROM (electrically erasable programmable read-only memory)
A type of nonvolatile memory chip used to store semipermanent information in a computer, such as the BIOS. An EEPROM can be erased and reprogrammed directly in the host system without special equipment. This is used so manufacturers can upgrade the ROM code in a system by supplying a special program that erases and reprograms the EEPROM chip with the new code. Also called flash ROM.

eight-way server
A server that contains eight processors.

EPIC (Explicitly Parallel Instruction Computing)
The RISC-based 64-bit processor architecture used by the Intel Itanium and Itanium 2 processors. EPIC is not the same architecture as AMD64 or EM64T.

expansion card
An IC card that plugs in to an expansion slot on a motherboard to provide access to additional peripherals or features not built in to the motherboard. Also referred to as an add-in board.

expansion slot
A slot on a motherboard that physically and electrically connects an expansion card to the motherboard and the system buses.

extended memory
Direct processor-addressable memory addressed by an Intel (or compatible) 286 or more advanced processor in the region beyond the first megabyte. It is addressable only in the processor's protected mode of operation.

extended partition
A nonbootable DOS partition (also supported by Windows) that contains DOS volumes. Starting with DOS v3.3, the FDISK program can create two partitions that serve DOS: an ordinary, bootable partition (called the primary partition) and an extended partition, which can contain as many as 23 volumes, from D: to Z:.

external device
A peripheral installed outside a system case.

FAT (file allocation table)
A table held near the outer edge of a disk that tells which sectors are allocated to each file and in what order.

FAT32
A disk file allocation system from Microsoft that uses 32-bit values for FAT entries instead of the 16-bit values used by the original FAT system, enabling partition sizes up to 2TB (terabytes). Although the entries are 32 bits, 4 bits are reserved, and only 28 bits are used. FAT32 first appeared in Windows 95B and is also supported by Windows 98, Windows Me, Windows 2000, and Windows XP.

FIFO (first-in, first-out)
A method of storing and retrieving items from a list, table, or stack so that the first element stored is the first one retrieved.

form factor
The physical dimensions of a device. Two devices with the same form factor are physically interchangeable. The IBM PC, XT, and XT Model 286, for example, all use power supplies that are internally different but have exactly the same form factor.

hit ratio
In describing the efficiency of a disk or memory cache, the ratio of the number of times the data is found in the cache to the total number of data requests. 1:1 is a perfect hit ratio, meaning that every data request was found in the cache. The closer to 1:1 the ratio is, the more efficient the cache.

host
The main device when two or more devices are connected. When two or more systems are connected, the system that contains the data is typically called the host, and the other is called the guest or user.

hot-swapping
The removal and replacement of equipment without shutting down the server.

hub
A common connection point for multiple devices in a network. A hub contains a number of ports to connect several segments of a LAN together. When a packet arrives at one of the ports on the hub, it is copied to all the other ports so all the segments of the LAN can see all the packets. A hub can be passive, intelligent (allowing remote management, including traffic monitoring and port configuration), or switching. A switching hub is also called a switch.

iSCSI (Internet SCSI)
An implementation of SCSI that uses Ethernet networks running TCP/IP to transfer data in both directions between a server and a SCSI drive or drive array.

LAN (local area network)
The connection of two or more computers, usually via a network adapter card or NIC. A LAN is a network contained within a building. Both home and office networks are considered LANs. Ethernet, Fast Ethernet, Gigabit Ethernet, and Wireless Ethernet are used in office LANs, whereas home LANs might use Ethernet, Fast Ethernet, HomePNA, HomeRF, or Wi-Fi Wireless Ethernet.

MBR (master boot record)
On hard disks, a one-sector-long record that contains the master boot program as well as the master partition table containing up to four partition entries. The master boot program reads the master partition table to determine which of the four entries is active (bootable) and then loads the first sector of that partition, called the volume boot record. The master boot program tests the volume boot record for a 55AAh signature at offset 510; if it's present, program execution is transferred to the volume boot sector, which typically contains a program designed to load the operating system files. The MBR is always the first physical sector of the disk, at Cylinder 0, Head 0, Sector 1.

memory caching
A service provided by extremely fast memory chips that keeps copies of the most recent memory accesses. When the CPU makes a subsequent access, the value is supplied by the fast memory rather than by the relatively slow system memory. L1 and L2 caches are memory caches found on most recent processors.

parity
A method of error checking in which an extra bit is sent to the receiving device to indicate whether an even or odd number of binary 1 bits was transmitted. The receiving unit compares the received information with this bit and can obtain a reasonable judgment about the validity of the character. The same type of parity (even or odd) must be used by two communicating computers, or both may omit parity. When parity is used, a parity bit is added to each transmitted character. The bit's value is 0 or 1, to make the total number of 1s in the character even or odd, depending on which type of parity is used. Parity checking isn't widely supported on recent systems, but memory with parity bits can be used as ECC memory on systems with ECC-compatible chipsets.

partition
A section of a hard disk devoted to a particular operating system. Most hard disks have only one partition, devoted to DOS. A hard disk can have as many as four partitions, each occupied by a different operating system. DOS v3.3 or later can occupy two of these four partitions. A boot manager enables you to select the partition occupied by the operating system you want to start if you have multiple operating systems installed in different partitions.

rack
An open framework that enables servers and other types of network equipment to be stacked for maximum efficiency. Many racks are designed to permit fast insertion and removal of rack-mounted equipment. Rack vertical dimensions are measured in rack units (RU). Racks are 20.125 inches wide, to support standard 19-inch rack-mounted hardware.

rack-mounted server
A server that fits into a rack. Typical sizes range from 1U to 5U.

RAID (redundant array of independent [or inexpensive] disks)
A storage unit that employs two or more drives in combination for fault tolerance and greater performance, used mostly in file server applications. Originally used only with SCSI drives and host adapters, but many motherboards now feature ATA/IDE or SATA RAID implementations.

redundancy
The inclusion of server or network equipment that will automatically take over if the primary equipment fails. On a typical server, redundant hardware might include hard disks in a RAID array, memory, or a power supply. Some servers are built to provide redundancy for all components. This type of server is known as a fault-tolerant server.

RISC (reduced instruction set computer)
A computer whose processor has a simple instruction set that requires only one or a few execution cycles. These simple instructions can be used more effectively than those of CISC systems with appropriately designed software, resulting in faster operations.

ROM (read-only memory)
A type of memory that has values permanently or semipermanently burned in. These locations are used to hold important programs or data that must be available to the computer when the power initially is turned on.

SAN (storage area network)
A high-speed local or remote network that connects servers with storage devices. SANs often use fiber-optic connections.

scalable
Capable of being scaled so that a person can add capacity to a system or network as needed while maintaining performance.

server
A computer in a network that enables resources such as files and printers to be shared by multiple users.

system files
Files with the system attribute. They are usually the hidden files that are used to boot the operating system. The MS-DOS and Windows 9x system files include IO.SYS and MSDOS.SYS; the IBM DOS system files are IBMBIO.COM and IBMDOS.COM.

WAN (wide area network)
A network that extends beyond the boundaries of a single building, linking LANs across separate locations.
