
Boot from SAN in Windows Server 2003

and Windows 2000 Server

Microsoft Corporation
December 2003

Abstract

Booting from a storage area network (SAN), rather than from local disks on individual servers, can enable
organizations to maximize consolidation of their IT resources, minimize their equipment costs, and realize the
considerable management benefits of centralizing the boot process.

This white paper describes boot from SAN technology in a Windows environment, the advantages and complexities of the technology, and a number of key SAN boot deployment scenarios. Boot from SAN technologies are supported on the Microsoft Windows Server 2003 and Microsoft Windows 2000 Server platforms.
Contents

Introduction .................................................................... 2
The Boot Process: An Overview ................................................... 3
    Local Boot .................................................................. 3
    Remote Boot ................................................................. 3
    Boot from SAN ............................................................... 4
Boot from SAN: Pros and Cons .................................................... 5
    Advantages .................................................................. 5
    Disadvantages ............................................................... 6
Key Solution Components ......................................................... 7
    Hardware Requirements ....................................................... 7
    Boot Specific Requirements .................................................. 8
SAN Boot Scenarios .............................................................. 10
    Basic SAN ................................................................... 10
    Multipath Configurations .................................................... 12
    Clustered Servers ........................................................... 13
    Directly Attached Paging Disk ............................................... 17
    iSCSI boot from SAN ......................................................... 17
Troubleshooting Boot from SAN ................................................... 18
    General Boot Problems ....................................................... 18
    Potential Difficulties with SAN Boot ........................................ 19
    Current Limitations to Windows Boot from SAN ................................ 20
Summary ......................................................................... 21
Additional Resources ............................................................ 22
Appendix ........................................................................ 23
    The Boot Process: Details ................................................... 23
        Pre-boot ................................................................ 23
        Boot Sequence ........................................................... 23
        Intel IA-64 Architecture Differences .................................... 25



Introduction
One of the ways in which organizations with large scale server deployments (consisting of
hundreds to tens of thousands of servers, such as those found in enterprise data centers) are
dramatically cutting costs is to replace large servers with highly compact rack-mountable forms.
These smaller form factors dispense with costly individually-attached hardware (such as power
supplies and device interfaces) in favor of shared resources among the servers in a rack.
The densest of these forms is the blade server, which began shipping in 2002. In addition to
reducing hardware, electrical, and floor-space costs, blade servers are more manageable than
traditional servers: they are hot pluggable, they simplify cable configurations (incorrect
configurations can be a major source of downtime), and a single server can be used to manage
all other servers in the same shared chassis. While some blade servers have internal disks, those
disks tend to have lower performance and capacity than SCSI disks, a fact that is helping to drive
adoption of diskless blade servers used in combination with storage networks (NAS and SAN).
The development of diskless blade servers introduced a challenge for versions of the Windows
operating system prior to Windows Server 2003, since Windows boot
procedures were originally developed with the requirements that the boot disk be directly
attached to the server, and that the operating system have access to the boot volume at all times.
With the release of Windows Server 2003 (and updates to Windows 2000), the Windows platform
supports boot from SAN capabilities, without requiring that the dedicated boot disk be local to the
server. These capabilities, and the necessary steps for successful deployment in a variety of
environments, including clustering, are explained in the sections that follow.
Because boot from SAN configurations are fundamentally dependent on hardware configurations
and capabilities, it is important to note that support for boot from SAN in the Windows
environment comes from the hardware vendors, not from Microsoft.



The Boot Process: An Overview
The boot process, variously known as booting or bootstrapping, is the staged process of loading
the installed operating system code from the storage device into computer memory when the
computer is powered on. Because the BIOS (Basic Input/Output System) is the most basic code, it is
loaded first. It serves to initialize the computer hardware and read in the code (from a storage
device or network) necessary to begin the next stage of booting. This code loads the operating
system, completes hardware setup and produces a fully functional operating system residing in
memory. For a more detailed description of the boot process, see the Appendix of this white
paper.
The boot process can occur from a direct attached disk, over a local area network, or from a
storage area network. In all cases, the critical step to a successful boot is locating the boot disk.
The device controlling that process varies, depending on the boot type.

Local Boot
The most common booting approach is to boot from a direct-attached disk. The server BIOS
locates the SCSI adapter BIOS, which contains the instructions allowing the server to determine
which of the internal disks is the boot disk necessary to load the operating system.

Remote Boot
Remote booting (also called network boot) is the ability of a computer system to boot over a local
area network (LAN) from a remote boot server. Critical to remote boot is the network adapter
card, which contains the instructions necessary for booting. Remote boot is not a new concept;
UNIX systems have had remote boot capabilities for about 30 years.
Remote booting confers a number of advantages, including remote administration of client
workstations, greater control over distribution of software, and cost reduction by eliminating the
need for local hard drives. However, the downside of remote boot is that booting over the LAN is
a considerable security risk.
While Microsoft Windows NT Server 4.0 enables remote boot,[1] Microsoft has not made this
capability widely available in its other operating system products because of the inherent security
risks. Windows does, however, support a much more secure remote boot technology: boot from
SAN.

[1] Windows NT Server 4.0 uses the Remoteboot Service, which requires a special chip on each client network interface
card to enable remote booting of MS-DOS, Windows 3.1, Windows 95, and Windows 98. This boot programmable read-
only memory (PROM) chip redirects the standard startup process from the client to the network adapter card and
establishes the network connection to the server. The client can then obtain a boot image (the minimum necessary
information for startup and configuration) directly from the server. The Remoteboot Service requires installation of the
NetBEUI and DLC protocols on the server.



Boot from SAN
Boot from SAN is a remote boot technology; however, in this case, the source of the boot disk is
on the storage area network (SAN), not on the LAN. The server communicates with the SAN
through the host bus adapter (HBA), and it is the HBA BIOS that contains the instructions that
enable the server to find the boot disk on the SAN.
Boot from SAN, like LAN-based remote boot, offers the advantages of reduced equipment costs.
It also offers a number of other advantages, including reduced server maintenance, improved
security, and better performance. These factors are addressed in greater detail in the next
section.



Boot from SAN: Pros and Cons
Booting from a SAN can offer organizations a number of storage management advantages.
However, while boot from SAN is conceptually straightforward (and is the same process whether
the boot is local to the server or from the SAN), configuration of the various hardware
components to guarantee a successful SAN boot is both difficult and inadequately documented.
Given this complexity, any organization interested in boot from SAN capabilities should weigh the
increased complexity against the advantages boot from SAN can confer.

Advantages
Boot from SAN technologies help businesses continue the trend toward consolidated and
effective management of storage resources, decoupled from the server.
Server Consolidation. Boot from SAN removes the need for each server to boot from
its own direct-attached disk, since each server can now boot from an image of the operating
system on the SAN. Thin diskless servers take up less facility space, require less power to
operate, and, because they have fewer hardware components, are generally less expensive.
They also eliminate internal disks, whose failure is very common in large datacenters.
Centralized Management. Since operating system images can be stored to disks on the
SAN, all upgrades and fixes can be managed at a centralized location. This eliminates the
need to manually install upgrades on each system. Changes made to the disks in the storage
array are readily accessible by each server.
Simplified Recovery from Server Failures. Recovery from server failures is simplified in a
SAN environment. Rather than a lengthy process of re-installing the operating system and a
backup copy of the data from tape to a spare server, the spare can simply be booted from the
SAN and then access the data stored on the SAN, returning to production with maximum
efficiency.
Rapid Disaster Recovery. All the boot information and production data stored on a local
SAN can be replicated to a SAN at a remote disaster recovery site. If a disaster destroys
functionality of the servers at the primary site, the remote site can take over with minimal
downtime.
Rapid Redeployment for Temporary Server Loads. Businesses that experience temporary
but high production workloads can take advantage of SAN technologies to clone the boot
image, and distribute the image to multiple servers for rapid deployment. Such servers may
only need to be in production for hours or days, and can be readily removed when the
production need has been met; the highly efficient deployment of the boot image makes such
temporary deployment a cost effective endeavor.



Disadvantages
For all its advantages, booting from a SAN is not a technology for the storage administrator who
is unfamiliar with the complexity of deploying a SAN.
Hardware Deployment is Complex. Solution components, including HBAs and logical unit
number (LUN)[2] management, must all be configured correctly for a server to successfully boot
from the SAN. These challenges increase in a multivendor hardware environment.
Boot Process is Complex. The details of the operating system boot process, and the
dependencies of the process on operating system functionality are conceptually challenging
(see Appendix of this white paper), and need to be generally understood to make
troubleshooting more effective.

[2] A logical disk. A LUN may map onto a single physical disk or multiple physical disks, and may constitute the whole or only part of any given disk or disks.



Key Solution Components
Boot from SAN, while conceptually straightforward, can be problematic to deploy correctly. Key to
effective deployment, whether in the most basic topology or in a complex enterprise configuration, is
ensuring that both software and hardware components are installed according to specified vendor
requirements. This section outlines the key solution components. The actual sequence of steps
depends on the boot from SAN scenario deployed.

Hardware Requirements
The sections that follow outline the basic hardware components necessary for correctly deploying
a boot from SAN solution. It is recommended that key components (HBAs, switches, and so on) be
duplicated for redundancy in the event of hardware failure.

Servers
Each new server designated to be connected to the SAN storage array should be installed as
per vendor instructions.
If the server has already been in production, ensure that all disks are backed up before
connecting it to the SAN.

Ensure that the operating system supports boot from SAN. Windows Server 2003, Windows
Storage Server 2003, Windows 2000 Server, and Windows NT 4.0 are capable of booting
from the SAN.

Host Bus Adapters


For each server to be connected to the SAN, record the world wide name (WWN) of each
HBA prior to installation, or obtain this information from the setup utility resident on the HBA.
The WWN is a unique address assigned by the manufacturer, and it is used during the
configuration process. It may be necessary to obtain both the world wide port name and the
world wide node name. (A small sketch for normalizing recorded WWNs follows this list.)
Install the HBA according to vendor instructions. Ensure that the HBA supports booting
(some do not).
Ensure that the HBA BIOS has the correct version of the firmware installed.
Obtain the correct HBA driver. It is this driver that allows the server to communicate with the
disks in the SAN as if they were local SCSI attached disks. The driver also provides the
bootstrap program. In certain configurations, the Microsoft Storport driver is the
recommended driver for boot from SAN.
Ensure that the HBA settings are configured to match all components of the particular
solution deployed, including the server, operating system version, the SAN fabric, and the
storage array. Vendors will include instructions on any necessary changes to the default
configuration.
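
The WWNs recorded above are 64-bit (8-byte) identifiers, usually written as 16 hexadecimal digits with or without separators. The following Python sketch normalizes recorded values into one canonical colon-separated form; the helper name and sample values are illustrative assumptions, not part of any vendor utility.

    import re

    def normalize_wwn(raw: str) -> str:
        """Normalize a recorded WWN/WWPN/WWNN string to xx:xx:...:xx form."""
        digits = re.sub(r"[^0-9a-fA-F]", "", raw).lower()
        if len(digits) != 16:                      # a WWN is 8 bytes = 16 hex digits
            raise ValueError(f"not a 64-bit WWN: {raw!r}")
        return ":".join(digits[i:i + 2] for i in range(0, 16, 2))

    # Example (hypothetical values read from an HBA label or its BIOS setup utility):
    print(normalize_wwn("10000000C9A1B2C3"))        # -> 10:00:00:00:c9:a1:b2:c3
    print(normalize_wwn("21-00-00-e0-8b-01-02-03"))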



SAN Fabric
The SAN fabric consists of the switches and cabling that connect the servers to the storage
array. The HBA for each server is connected to a switch, and from there through to the port
on the storage array.
Assign the new HBA devices to zones (groups) on the SAN fabric. Communication is
restricted to members of the same zone. WWNs or the physical switch ports are used to
identify members of the zone.

Storage Array
Storage controllers control communication between the disks on the array and the ports to
which the servers connect. Storage arrays should have at least two controllers for
redundancy.
Create the RAID sets and LUNs for each server on the storage array. The logical units are
either numbered automatically by the storage array, or they can be assigned by the user.
Many storage arrays have the capability of managing disk security so that servers can access
only those LUNs that belong to them. Disks and LUNs are assigned to ports; hence a
single port connects to multiple disks or LUNs. Because these storage resources must be shared
among multiple servers, LUN management through masking is critical to prevent multiple
hosts from having access to the same LUN at the same time. Microsoft supports boot
from SAN only when it is used with LUN masking. (A minimal model of LUN masking is sketched below.)
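
The masking rule just described can be pictured as a table that maps each LUN to the set of initiators (HBA WWPNs) allowed to see it. The Python sketch below is only a conceptual model under that assumption, not an array vendor's API; the WWPN values and LUN numbers are hypothetical.

    # Masking table: LUN number -> set of initiator WWPNs that may access it.
    masking = {
        1: {"10:00:00:00:c9:aa:aa:01"},             # boot LUN for server 1's HBA
        2: {"10:00:00:00:c9:bb:bb:02"},             # boot LUN for server 2's HBA
        3: set(),                                   # data LUN, masked from everyone for now
    }

    def visible_luns(initiator_wwpn: str) -> list[int]:
        """Return the LUNs the array presents to a given initiator."""
        return sorted(lun for lun, allowed in masking.items() if initiator_wwpn in allowed)

    print(visible_luns("10:00:00:00:c9:aa:aa:01"))  # server 1 sees only [1]
    print(visible_luns("10:00:00:00:c9:bb:bb:02"))  # server 2 sees only [2]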

Boot Specific Requirements


A number of boot specific factors must be considered in order to correctly deploy a boot from
SAN solution.

Boot BIOS
Ensure that the correct boot BIOS is on the HBA; without this the HBA may not detect any
disks on the SAN.
The default setting for the HBA boot BIOS is typically disabled; this must be enabled on only
one adapter per server in order to boot from SAN.

HBA Driver
Ensure that the HBA driver is appropriate for the boot configuration design. The SCSIport
driver, which was designed for parallel SCSI solutions, is not the appropriate driver for high
performance SAN configurations; instead use a Storport miniport driver with Windows
Server 2003, as it is specifically designed for such solutions.
The Storport driver features improved reset handling, which makes it possible to boot from
SAN and have clusters running on the same adapters. Storport also allows for queue
management, which is critical in a SAN fabric where fabric events such as adding or
removing devices are common. Further details on the Storport driver can be found in the
white paper, Storport in Windows Server 2003: Improving Manageability and Performance in
Hardware RAID and Storage Area Networks.



Designate Boot Disk
Each server must have access to its own boot drive. Designate as many boot disks (or LUNs)
in the storage array as there are servers accessing the storage in that array.
For each server to access the correct boot disk, a setting on the HBA boot BIOS must be
changed to reflect the address of the disk or LUN on the SAN.



SAN Boot Scenarios
SAN deployment configurations can be quite simple or can grow to enormous complexity. This
section guides the reader from the most basic configuration through some of the more complex
configurations common to enterprise storage environments.

Basic SAN
The simplest SAN configuration, shown in Figure 1, is to deploy two diskless servers, each with a
single HBA, connected to Fibre Channel storage.[3] (For simplicity's sake, this configuration does
not employ redundant components, even though deploying them is recommended.) For
Windows boot from SAN solutions to work correctly, each server must have access to its own
dedicated boot device.

Figure 1. Basic boot from SAN Configuration

[3] One of the simplest SAN configurations is a Fibre Channel arbitrated loop (FC-AL) configuration, in which up to 126 devices are
connected. However, this configuration is not supported in Windows boot from SAN, because the addition or removal of devices from an FC-AL
configuration may result in all the devices acquiring a new network address. Moreover, the interruptions that occur during loop events
can cause I/O to fail, which can cause the whole system to stop booting or to crash.



Follow these steps to set up the system so that the BIOS for each server correctly locates its boot
LUN:
1. Configure the storage array with LUNs. (These LUNs are empty until after the operating
system is loaded and the file structure is added to allow population with data.) The storage
array returns the LUN numbers automatically, or the administrator can set them. (LUN
numbers either remain as assigned, or are remapped by the HBA, as discussed later in these
steps.) In this case, the intention will be to install boot files on LUN 1 and LUN 2. LUN 3 can
be used for data.
The array is assumed to support a single LUN 0 instance, which is not a disk device. This
logical unit is used to obtain discovery information from the array through the
SCSI-3 REPORT LUNS command. Devices that comply only with earlier specifications
are not recommended for use with Fibre Channel, with clustering, or when booting from SAN.
(The array must also return the HiSup bit set in the LUN 0 INQUIRY data unless a
Storport miniport is available.) A sketch of parsing REPORT LUNS data appears after these steps.
2. Determine the manufacturer-set world wide node name (WWNN) for each HBA adapter. (This
number can be read from the HBA label prior to installation, or it may be displayed using the
BIOS setup program.)
3. Determine the port name (WWPN) of the controller on the storage array.
4. Ensure that each server has access only to the LUNs allocated to it. This is done through the
process of unmasking[4] the appropriate LUN in the storage array to the appropriate server, so
that a path is traced from the LUN to the controller, across the fabric, and into the HBA. The LUN
number, node, and port addresses are all required. In this example configuration,
LUN 1 is unmasked to server 1 and LUN 2 to server 2. It is advisable to keep LUN 3
masked at this point so that it does not appear as a choice during installation of the operating
system.
5. Begin installation of the operating system on server 1 from a bootable CD. If steps 1-4 have
been correctly carried out, installation should proceed without error.
6. When prompted during setup for third-party storage drivers, press the F6 key and make sure
that the miniport driver for the HBA is available (typically on a floppy disk or CD). While the
inbox Fibre Channel drivers can be used with Windows Server 2003, check for the correct driver
required by the storage vendor. For Windows 2000 or earlier, ensure that the appropriate
driver is available.

[4] Depending on the vendor, the default state of the LUNs is either masked or unmasked to the server. Thus, whether the administrator
unmasks or masks depends on the default state of the LUNs on the storage array.



7. Setup searches for and lists all available LUNs. Since only LUN 1 has been unmasked to the
server, the LUN on which to install will be clear. Installation proceeds in two stages:
Text mode: The target LUN is formatted and partitioned, and the operating system files
are copied to the boot LUN. Once complete, the system automatically restarts and then
begins the second phase of setup.
Graphical user interface mode: Setup discovers and enumerates server hardware,
installs drivers, and finishes installation of the operating system files. The system restarts,
and the user can now log on to server 1, which is running Windows from the SAN.
8. Repeat steps 5 through 7 for server 2, using LUN 2. Again, if steps 1-4 have been correctly carried out,
installation of the operating system will be successful. Subsequent boots from the SAN
should proceed without problem.
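
The note in step 1 mentions that discovery information is obtained from LUN 0 with the SCSI-3 REPORT LUNS command. The sketch below decodes the parameter data such a command returns (a 4-byte big-endian list length, 4 reserved bytes, then one 8-byte entry per LUN). It handles only simple single-level LUN numbers and fabricates its own sample buffer, so treat it as an illustration rather than a driver fragment.

    import struct

    def parse_report_luns(data: bytes) -> list[int]:
        """Decode REPORT LUNS parameter data into a list of simple LUN numbers."""
        (list_length,) = struct.unpack_from(">I", data, 0)   # bytes 0-3: LUN list length
        luns = []
        for off in range(8, 8 + list_length, 8):             # 8-byte entries start at byte 8
            entry = data[off:off + 8]
            # Single-level (peripheral) addressing keeps the LUN in the low bits
            # of the first two bytes; multi-level addressing is ignored here.
            luns.append(((entry[0] & 0x3F) << 8) | entry[1])
        return luns

    # Hypothetical response advertising LUN 1 and LUN 3 to one server's HBA:
    sample = struct.pack(">I", 16) + b"\x00" * 4 \
        + bytes([0, 1] + [0] * 6) + bytes([0, 3] + [0] * 6)
    print(parse_report_luns(sample))                          # -> [1, 3]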

Adding redundancy to this basic configuration introduces a further layer of complexity. This is
discussed in the next section, Multipath Configurations.

Multipath Configurations
Using the same two-server configuration introduced in the previous section, the administrator can
add redundant HBAs, cabling, switches, controllers, and ports on each controller. This
configuration, which confers high availability and high performance to the deployment, is shown in
Figure 2.

Figure 2. A Fully Redundant Boot from SAN Configuration



In order for multipath solutions to work correctly, path management software that works with
Windows is necessary.[5] Follow steps 1-3 as listed above for the basic SAN configuration,
obtaining the WWNN and WWPN for each HBA and controller.
1. Configure storage, creating LUNs 1-3.
2. On each server, unmask the appropriate boot LUN to both of its HBAs through both controllers
(a sketch enumerating the resulting paths follows these steps):
Server 1, both HBAs: controller 1 LUN 1, controller 2 LUN 1
Server 2, both HBAs: controller 1 LUN 2, controller 2 LUN 2
3. Make sure that only one HBA has its boot BIOS enabled. Only one LUN can be the boot LUN for
each server.

Continue with all installation activities, as outlined in the basic SAN configuration. Note that, because
only one boot device can be exposed to the BIOS as the boot LUN, the BIOS requires a manual
reset to boot from the second HBA if the HBA with the enabled boot BIOS fails.
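
With the fully redundant layout above, each boot LUN is reachable over four distinct paths (two HBAs times two controllers), which is what the path management software must collapse into a single logical disk. The short Python sketch below enumerates those paths; the HBA and controller names are hypothetical labels for server 1's hardware.

    from itertools import product

    # Hypothetical fully redundant configuration for server 1.
    hbas = ["HBA A", "HBA B"]                   # two adapters in server 1
    controllers = ["controller 1", "controller 2"]
    boot_lun = 1                                # LUN 1 unmasked through both controllers

    paths = [(hba, ctrl, boot_lun) for hba, ctrl in product(hbas, controllers)]
    for hba, ctrl, lun in paths:
        print(f"{hba} -> {ctrl} -> LUN {lun}")
    print(f"{len(paths)} physical paths to one boot LUN")  # 4 paths; MPIO presents one disk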

Crash Dump File Creation


In the event of a system or kernel software component failure, a crash dump file is created and
used to aid with diagnosis of the failure. To be created, the file must be written to the system drive
(C:). The crash dump stack (created at boot and the precursor to creation of the crash dump file)
is specific to the HBA path from which the system is booted.
This creates a difficulty in multipathing solutions, because the crash dump stack does not have
multipath drivers available. Using the example given in Figure 2, if the boot path is through HBA A
and that adapter fails, the system is no longer able to write the crash dump file, because the
alternate path is not recognized by the crash dump driver. If the failure is transient, the system
administrator might not even be aware of the problem.

Clustered Servers
When implementing a Microsoft Cluster Service (MSCS) solution in a SAN boot environment, the
MSCS servers must keep the boot LUNs separate from the shared cluster LUNs. Whether
dedicated HBAs must be used to accomplish this separation depends on whether the loaded
Windows driver is SCSIport or Storport.

[5] Vendors can use the Microsoft MPIO driver package to develop effective path management solutions that work with Windows. See the
white paper, Highly Available Storage: Multipathing and the Microsoft MPIO Driver Architecture, available at the storage website.



SCSIport Driver
If SCSIport is installed as the HBA driver for the servers, each server requires two HBAs: one
to expose the boot disks and the other to expose the shared cluster disks. This is because
clustering requires that shared cluster resources be connected to a separate bus, which is necessary
because bus-level resets are used within the cluster software. A reset on the port attached to the boot LUN
can disrupt paging I/O, and the resulting timeout can result in a system crash.
Functionally, this does not prevent using boot from SAN capabilities in a clustered environment,
although it does require careful deployment.
Note that this basic cluster deployment, shown in Figure 3, does not provide HBA redundancy.
HBA A within server 1 accesses one of the boot LUNs and HBA B accesses the shared cluster
LUNs. Server 2 accesses the other boot LUN through HBA C. HBA D also accesses the shared
cluster LUNs. The shared cluster design allows for high application availability and service-level
failover in case a hardware component fails.
To set up the cluster servers so that each server can correctly locate its boot LUN and the shared
cluster LUNs, use either a combination of zoning and masking, or masking alone. Once zoning
and/or masking are complete, install or configure the clustering software.
Zoning + Masking. This is a two-step process. First, apply zoning to the ports. Where
the LUNs are presented on different storage ports, this step separates the shared
cluster LUNs from the boot LUNs. Both HBA A and HBA C are zoned to share controller 1
and access boot LUNs only; HBA B and HBA D are zoned to share controller 2 and access the
shared cluster LUNs.
The second step is to use masking to ensure that each server has access only to the
appropriate boot LUN. Because both servers share the cluster LUNs, those LUNs must be
unmasked to both nodes of the cluster. If the storage array contains additional clusters,
they will require zoning and masking to ensure that only the appropriate servers
access the appropriate cluster resources. (The combination of zoning and masking is sketched after this list.)
Masking Only. This method does not employ zoning techniques. While it can be successfully
adopted for clustering deployments, correct deployment is difficult unless very high quality
masking implementations are used.
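
Under the Zoning + Masking approach above, a LUN is usable by a server only if the server's HBA shares a zone with the storage controller presenting the LUN and the LUN is unmasked to that HBA. The Python sketch below models that two-stage filter for the Figure 3 scenario; the zone and masking contents are illustrative assumptions, not output from any switch or array tool.

    # Stage 1: fabric zoning -- which HBAs share a zone with which storage controller.
    zones = {
        "boot_zone":    {"HBA A", "HBA C", "controller 1"},
        "cluster_zone": {"HBA B", "HBA D", "controller 2"},
    }

    # Stage 2: array masking -- which HBAs each LUN is unmasked to.
    masking = {
        ("controller 1", "boot LUN 1"): {"HBA A"},            # server 1 boot disk
        ("controller 1", "boot LUN 2"): {"HBA C"},            # server 2 boot disk
        ("controller 2", "shared LUN"): {"HBA B", "HBA D"},   # shared cluster disk
    }

    def can_access(hba: str, controller: str, lun: str) -> bool:
        zoned = any({hba, controller} <= members for members in zones.values())
        unmasked = hba in masking.get((controller, lun), set())
        return zoned and unmasked

    print(can_access("HBA A", "controller 1", "boot LUN 1"))  # True  (its own boot LUN)
    print(can_access("HBA C", "controller 1", "boot LUN 1"))  # False (zoned, but masked)
    print(can_access("HBA B", "controller 2", "shared LUN"))  # True  (shared cluster LUN)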



Figure 3. Boot from SAN and Clustering, Using the SCSIport Driver

Storport Driver
The most significant limitation to deploying a clustering solution in a SAN boot environment using
the SCSIport driver is the HBA slot limit. Since separate HBAs must be used to access boot
LUNs and shared cluster LUNs, to implement a fully redundant solution, 4 HBAs (or two dual
channel HBAs) are necessary in each server. If the server cannot accommodate 4 HBA cards, or
if the cost of obtaining those cards is too great, a high availability multipathing solution is not
possible.
The Storport driver overcomes this limitation. With Storport, given its hierarchical reset
capabilities,[6] bus-level resets are rare events, eliminating the necessity for multiple HBAs to
separate boot and cluster LUNs. The basic configuration (without multipathing redundancy) is
shown in Figure 4.

[6] See the Storport white paper, Storport in Windows Server 2003, for further details.



This solution is much less expensive and simpler to configure. Since only a single controller and
port is used, all LUNs are visible on the port and zoning cannot be used to completely isolate
cluster LUNs from boot LUNs. Masking must be used to ensure that each server has access to
the correct boot LUN (and no access to the boot LUN of another server). Both servers will share
access to the cluster LUNs. One final step is required to enable this configuration for clustering.
(See the Microsoft Knowledge Base article 304415 for details.)

Figure 4. Boot from SAN and Clustering, Using the Storport Driver
Storport also allows miniport-controllable queue management, which allows HBA vendors to build
drivers that can survive SAN transients (such as Fibre Channel fabric events) without crashing the system. This
is of considerable importance in cluster configurations.



Directly Attached Paging Disk
A pagefile is a reserved portion of the hard disk that is used to expand the amount of virtual
memory available to applications. Paging is the process of temporarily swapping the inactive
contents of system physical memory out to hard disk until those contents are needed again. Because
the operating system must have unrestricted access to the pagefile, the pagefile is commonly
placed on the same drive as the system files. Thus, the C: drive normally includes boot, system, and
paging files.[7]
While there is negligible contention between the boot reads and paging writes, there can be
considerable resource contention between systems on the SAN when they are all trying to do
paging I/O, or when many systems attempt to boot simultaneously from the same storage port.
One way to lessen this problem is to separate non-data I/O (such as paging, registry updates, and
other boot-related information) from data I/O (created by such sources as SQL Server or Exchange). The
different ways to store the files are shown in Table 1.

Table 1. Various Possible Locations of the Pagefile

Case 1: SAN (C:) holds the boot, system, and paging files.
Case 2: SAN (C:) holds the boot and system files; a local disk (for example, D:) holds the pagefile.
Case 3: SAN (for example, D:) holds the boot files; a local disk (C:) holds the system files and the pagefile.

iSCSI boot from SAN


Thus far, this paper has discussed booting from SAN only in a Fibre Channel interconnect
environment. Windows also supports boot from SAN using iSCSI interconnects to the SAN,
provided that iSCSI HBAs are used to enable the boot process. As in Fibre Channel environments,
the HBAs must support the INT13 BIOS extensions[8] that enable the boot process. Boot from SAN is
not supported using the Microsoft iSCSI software initiator. See the paper, Microsoft Support for
iSCSI, for further details.

[7] The boot files are the files required to run the Windows operating system. The system files are the files required to boot Windows; they
include boot.ini, Ntldr, and Ntdetect.com. The paging file is typically called pagefile.sys.
[8] INT13 refers to device service routines (DSRs) that communicate with hard drives (or diskettes) before other system drivers are loaded. The
INT13 extensions enable systems to see partitions up to 2 TB, well beyond the 7.8 GB limitation of the original INT13 functionality. (The
arithmetic behind these limits is sketched below.)
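
The capacity figures quoted in footnote 8 follow directly from the addressing schemes: the original INT13 CHS interface addresses at most 1024 cylinders x 255 heads x 63 sectors of 512 bytes, while the INT13 extensions use a 32-bit LBA. The short Python calculation below is included only as a worked check of those numbers.

    SECTOR = 512                                   # bytes per sector

    chs_limit = 1024 * 255 * 63 * SECTOR           # original INT13 CHS addressing
    lba_limit = (2 ** 32) * SECTOR                 # extended INT13, 32-bit LBA

    print(f"CHS limit: {chs_limit / 2**30:.1f} GiB")              # ~7.8 GiB
    print(f"Extended INT13 limit: {lba_limit / 2**40:.0f} TiB")   # 2 TiB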



Troubleshooting Boot from SAN
A number of problems can arise during configuration that can result in a failure to load the
operating system. It is important to distinguish between those problems that are shared by all
types of boot, and those that are specific to boot from SAN environments.
Because correct deployment of boot from SAN depends on the user undertaking the exact vendor
steps for HBA and SAN configuration, hardware vendors must be the primary point of contact for
issues related to booting.

General Boot Problems


The most common cause of boot problems is a failure to locate the boot partition and boot files.
This can happen for a multitude of reasons, ranging from boot sector viruses to hardware failure
to configuration problems. While failure to locate the boot device can occur in any boot
environment (not simply boot from SAN), this issue is more problematic in complex storage
configurations where new devices are added and removed frequently.

Variable Device Enumeration Order


As new storage targets (such as disks or LUNs within the storage array) become available on the
SAN, the HBA controller assigns each a target ID. Although each device already has a unique
WWN assigned by the manufacturer, the Windows operating system requires that devices are
numbered, using target IDs, according to the SCSI device convention. The target IDs are
assigned to storage devices as they appear in the fabric. When a LUN is created within a target,
the fabric does not register its presence as an event; instead the HBA is responsible for notifying
Plug and Play (PnP) of its presence, or a manual rescan of disks is required (using either
diskpart or the Disk Management snap-in).
Some Fibre Channel arbitrated loop (FC-AL) configurations do not work well with boot from SAN.
A single server on one loop, accessing a disk on a SAN with a single port, works well. This is
because, with no other servers or port targets, the target ID is 0, the desired state. If a second
device (such as another disk) is added, it is given target ID 1, which also works effectively.
However, after a power off/on sequence (or after reinitialization of the loop following a fabric
event), the devices might not be enumerated in the same order. One unintended consequence of
such a change is that the boot device may not be addressed as expected, and the operating
system cannot load.
Although this problem can be circumvented by using HBA persistent binding (which prevents the
SCSI target ID from changing even after a system reboot), this FC-AL solution is not supported
by Microsoft.
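
Persistent binding, mentioned above, pins each target's WWPN to a fixed SCSI target ID so that a change in loop enumeration order cannot move the boot device. The Python sketch below models the difference; the WWPN values and the binding table are hypothetical, and, as noted, Microsoft does not support relying on this in FC-AL boot configurations.

    boot_target_wwpn = "50:06:01:60:aa:bb:cc:01"        # storage port holding the boot LUN
    other_wwpn       = "50:06:01:61:aa:bb:cc:02"

    def assign_ids_by_discovery_order(order):
        """Without persistent binding, target IDs simply follow enumeration order."""
        return {wwpn: tid for tid, wwpn in enumerate(order)}

    persistent_binding = {boot_target_wwpn: 0, other_wwpn: 1}   # fixed WWPN -> target ID

    before = assign_ids_by_discovery_order([boot_target_wwpn, other_wwpn])
    after = assign_ids_by_discovery_order([other_wwpn, boot_target_wwpn])   # loop re-init

    print(before[boot_target_wwpn], after[boot_target_wwpn])    # 0 then 1: boot device moved
    print(persistent_binding[boot_target_wwpn])                 # always 0 with binding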

Multiple Adapter Complexity with PnP


Although adapter devices can be successfully hot plugged and manually enumerated so that the
attached systems behave as expected, when the power cycles off and on, the devices can be re-
enumerated, possibly causing all the HBA port addresses to change. The fact that different
system vendors assign their PCI slots differently can introduce configuration problems.



Potential Difficulties with SAN Boot
Boot from SAN introduces a number of specific challenges that the administrator must be aware
of to ensure that the solution works as intended.

Lack of Standardized Assignment of LUN 0 to Controller


Some vendors' storage arrays automatically assign logical unit numbers (LUNs). Others
require that the storage administrator explicitly define the numbers. With parallel SCSI, the boot
LUN is LUN 0 by default.
Fibre Channel configurations must adhere to SCSI-3 storage standards. In correctly configured
arrays, LUN 0 is assigned to the controller (not to a disk device), and is accessible to all servers.
This LUN 0 assignment is part of the SCSI-3 standard, since many operating systems do not boot
unless the controller is correctly assigned as LUN 0. Correctly assigning LUN 0 to the controller
allows it to assume the critical role in discovering and reporting a list of all other LUNs available
through that adapter. In Windows, these LUNs are reported back to the kernel in response to the
SCSI REPORT LUNS command.
Unfortunately, not all vendor storage arrays comply with the standard of assigning LUN 0 to the
controller. Failure to comply with that standard means the boot process may not proceed
correctly. In some cases, even with LUN 0 correctly assigned, the boot LUN cannot be found, and
the operating system fails to load. In the following cases (without HBA LUN remapping), the
kernel finds LUN 0, but may not be successful in enumerating the LUNs correctly.

Without HBA LUN Remapping


1. The kernel finds LUN 0 (the controller) and sends it a Report LUNs command.
2. The storage array controller:
a) Correctly interprets Report LUNs, and returns a LUN list for each HBA (HBA A: LUN 1,
LUN 3; HBA B: LUN 2)
Each server can boot from its assigned boot LUN
b) Does NOT correctly interpret this command (most likely because SCSI-3 standards were
not followed). A LUN list is not produced for each HBA.
The Windows kernel attempts further LUN discovery using a sequential discovery
algorithm, which starts searching at LUN 0 and increments sequentially (that is, the
next LUN is LUN 1).
LUN 1 is not found (because it is masked).
No further discovery attempts are made, because the algorithm concludes that there are no
more LUNs.
LUNs 2 and 3 are NOT found; neither server 1 nor server 2 can boot. (This failure mode is sketched after this list.)
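
The failure described in case 2b can be made concrete with a short simulation. The sketch below contrasts discovery driven by a REPORT LUNS list with a sequential fallback that stops at the first missing LUN; it is a simplified model of the behavior described above, not the actual kernel algorithm, and the LUN assignments are the ones used in the example.

    def discover_with_report_luns(reported, visible):
        """Array answers REPORT LUNS correctly: every reported, unmasked LUN is found."""
        return [lun for lun in reported if lun in visible]

    def discover_sequentially(visible, start=0, limit=8):
        """Fallback: probe LUN 0, 1, 2, ... and stop at the first LUN that is not there."""
        found = []
        for lun in range(start, limit):
            if lun not in visible:
                break                      # a masked LUN in the sequence halts discovery
            found.append(lun)
        return found

    visible_to_hba_a = {0, 1, 3}           # LUN 0 (controller), boot LUN 1, data LUN 3
    visible_to_hba_b = {0, 2}              # LUN 0 (controller), boot LUN 2

    print(discover_with_report_luns([1, 3], visible_to_hba_a))   # [1, 3] -- boot LUN found
    print(discover_sequentially(visible_to_hba_a))               # [0, 1]: LUN 3 never reached
    print(discover_sequentially(visible_to_hba_b))               # [0] only: boot LUN 2 never found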

These problems can be solved by implementing HBA-based LUN mapping, described in the next
topic. HBA mapping must be available on the HBA controller.



With HBA LUN Mapping
HBA LUN mapping can correct the problems that arise when LUN 0 is not assigned to the
storage controller. Suppose the boot disk for server 1 is LUN 1. The HBA can remap this disk,
changing its identity from LUN 1 to LUN 0, which is a bootable disk.
Hence, in the prior example, HBA A remaps LUN 1 to LUN 0, and HBA B remaps LUN 2 to LUN 0.
(Because each LUN is masked from the other server, the fact that the numbers are the same
does not matter.)
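
A per-HBA remapping table, as described above, can be modeled in a few lines. This is a conceptual sketch of the idea, not a vendor's BIOS interface; the mapping contents mirror the example just given.

    # Per-HBA remap table: LUN number presented by the array -> LUN number shown to Windows.
    remap = {
        "HBA A": {1: 0},     # server 1's boot disk (array LUN 1) appears as LUN 0
        "HBA B": {2: 0},     # server 2's boot disk (array LUN 2) appears as LUN 0
    }

    def os_visible_lun(hba: str, array_lun: int) -> int:
        """Apply the HBA's remapping; unmapped LUNs keep their array-assigned number."""
        return remap.get(hba, {}).get(array_lun, array_lun)

    print(os_visible_lun("HBA A", 1))   # 0 -- bootable as LUN 0
    print(os_visible_lun("HBA B", 2))   # 0 -- bootable as LUN 0
    print(os_visible_lun("HBA A", 3))   # 3 -- the data LUN is left alone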

HBA Configuration is Complex


The setup routine of each HBA boot ROM must be individually configured for boot from SAN
solutions to work. For each adapter, the correct disks must be allocated, and the boot disk or LUN
must be correctly identified.

Too Many Systems Booting From the Array


The number of servers that can reliably boot from a single fabric connection is limited. If too many
servers send I/Os at the same time, the link can become saturated, delaying the boot for the
server that is attempting to boot. If this condition persists for too long, the requests will time out
and the system can crash.
The actual number of systems that can boot from a single fabric connection is vendor specific.[9]

Current Limitations to Windows Boot from SAN


There are a number of advanced scenarios that are not currently possible in Windows boot from
SAN environments.

No Shared Boot Images


Windows servers cannot currently share a boot image. Each server requires its own dedicated
LUN to boot.

Mass Deployment of Boot Images Requires ADS


Windows does not currently support en masse distribution of boot images. While cloning of boot
images could help here, Windows does not have general-purpose tools for distributing these images. In
enterprise configurations, however, Microsoft Automated Deployment Services (ADS) can help.

[9] Switches and storage controllers both have limitations.



Summary
This paper introduces the boot from SAN technology in the Windows environment. Boot from
SAN simplifies the adoption of diskless server technologies, and simplifies storage management
by facilitating a centralized approach to operating system installation and booting processes. This
paper describes a number of boot from SAN scenarios supported in the Windows environment,
including multipathing and clustering, and offers critical troubleshooting information to help ensure
successful deployment. An appendix details the boot process to aid understanding of SAN boot.



Additional Resources
The following white papers are all available through the Microsoft
Windows storage portal (http://go.microsoft.com/fwlink/?LinkId=18974):
Microsoft Support for iSCSI, August 2003.
Highly Available Storage: Multipathing and the Microsoft MPIO Driver Architecture, October
2003.
Storport in Windows Server 2003: Improving Manageability and Performance in Hardware
RAID and Storage Area Networks, December 2003.

See also the Knowledge Base Articles:


Support for Booting from a Storage Area Network (SAN)
(http://go.microsoft.com/fwlink/?LinkId=22265)
Support for Multiple Clusters Attached to the same SAN Device,
(http://go.microsoft.com/fwlink/?LinkId=22266)



Appendix

The Boot Process: Details


The boot (or bootstrapping) process starts with the execution of the shortest and simplest code
necessary to begin the boot process, and successively accesses and executes more complex
code. The following sections outline the high level details of the boot process for 32-bit (x86)
architecture (differences with the Intel IA-64 architecture are discussed in the section following).
The boot process is the same whether the boot occurs from a direct-attached disk or from
a disk on a SAN.

Pre-boot
The pre-boot process is common to all operating systems. During this stage of the process, the
following steps occur:
POST. The system BIOS (stored on read-only memory chips) performs a power-on self test
to ensure that there are no problems with the hardware, such as voltage irregularities or hard
disk failure. If the hardware is working correctly, the CPU can begin operations.
The BIOS locates and initializes all bootable devices. The BIOS first locates all add-in
devices (such as network interface cards and host bus adapters), as well as the local system
hard and floppy drives, then it determines which devices are bootable.
The BIOS sets the boot device. Although multiple devices are potentially able to supply the
boot files (including multiple hard drives if the BIOS provides multi-boot support), the boot
device actually used is either the first bootable device found (the default), or is set by the user
in the case of multi-boot capable systems. The BIOS gives this device the address drive=80,
which is the boot drive. (Note that, in configurations with multiple adapters, the order in which
the devices are enumerated becomes critical to this determination because the BIOS
assigns, by default, the first bootable drive it finds as the boot device.)
Load boot sector. Having assigned the device from which the system will boot, the BIOS
then searches that device for the boot sector. All x86 systems require that the first sector of
the primary disk contain the Master Boot Record (MBR). The MBR contains the partition table,
which identifies the system partition holding the code and configuration files (Ntldr, boot.ini,
and Ntdetect.com) necessary to boot Windows. This partition must be set as active (bootable)
in order to proceed. Once the boot sector is loaded into memory, it can execute the next steps
of the boot process. (A minimal check of the MBR boot signature is sketched below.)
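
The boot-sector requirement above can be checked programmatically: an x86 MBR is the first 512-byte sector, its last two bytes must be 0x55 0xAA, and its partition table (four 16-byte entries starting at offset 446) marks the active partition with the 0x80 flag. The Python sketch below reads a saved sector image from a file you supply (the file name is hypothetical) rather than a live disk.

    def inspect_mbr(sector: bytes) -> None:
        """Report whether a 512-byte sector looks like a valid MBR with an active partition."""
        if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
            print("Missing 0x55AA boot signature: not a valid MBR")
            return
        for i in range(4):                                   # four partition-table entries
            entry = sector[446 + 16 * i: 446 + 16 * (i + 1)]
            flag, ptype = entry[0], entry[4]
            status = "active (bootable)" if flag == 0x80 else "inactive"
            print(f"partition {i + 1}: type 0x{ptype:02x}, {status}")

    # Usage (hypothetical file containing a saved copy of sector 0):
    with open("sector0.bin", "rb") as f:
        inspect_mbr(f.read(512))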

Boot Sequence
The boot sequence described in this section is Windows specific. The file Ntldr controls much of
this process. Once control is passed to Ntoskrnl.exe, the boot process is nearly complete.
1. Initial boot loader phase. The boot sector loads the Ntldr file, which begins loading the
operating system in a series of phases, the first of which is the initial boot loader phase.
During this phase, the Ntldr code enables the system to access all physical memory
(protected-mode). Prior to this, only the first 1 MB of system memory was available (real-
mode). At this point Ntldr also enables paging, which is the normal mode of Windows
operation.



2. Selection of the operating system. In the next phase, Ntldr loads the boot.ini file, which tells
Ntldr where the operating system kernel, registry, and device drivers reside. The boot.ini file
locates the boot files using either an ARC (Advanced RISC Computing) path or a disk
signature. (A parsing sketch for both forms follows this list.)
ARC Path. The ARC path, used to locate system files, may need to be modified if the system
is shut down and more hardware is added. The format of the ARC path is either:
multi(n)disk(n)rdisk(n)partition(n)\systemroot
or scsi(n)disk(n)rdisk(n)partition(n)\systemroot
where
multi(n): indicates a multifunction adapter (such as IDE), or a SCSI adapter (HBA) with
an onboard BIOS
scsi(n): indicates a legacy SCSI adapter with no onboard BIOS
disk(n): for the scsi() syntax, indicates the SCSI target address of the disk on the
controller (0-7); for the multi() syntax, this value is 0
rdisk(n): for the multi() syntax, indicates which disk on the adapter (for IDE, this
distinguishes master and slave disks; the boot disk is normally rdisk(0)); for the scsi()
syntax, indicates the logical unit number (LUN)
partition(n): indicates the partition upon which the boot information resides.
By default, the system root in Windows XP and Windows Server 2003 is \Windows; in
Windows 2000 and Windows NT 4.0, it is \WINNT.
3. Disk signature. Rather than using the ARC path to locate the disk on which the boot files
reside, the disk signature, a unique 32-bit number, can be used to identify each disk. The
format for signatures is: signature(abcdefg).
4. Hardware detection. Ntldr loads Ntdetect.com, which uses the system BIOS to query the
computer for additional information, such as the machine ID, bus/adapter type, the number
and sizes of disk drives, and ports. This information will later be recorded in the registry.
5. Kernel initialization. Ntldr loads the files from the boot partition necessary for kernel
initialization. Included among these are the kernel (typically ntoskrnl.exe), the Hardware
Abstraction Layer (typically HAL.DLL), file system drivers and any device drivers necessary
to boot the system. Control then passes to Ntoskrnl.exe, which must also successfully locate
the boot disk to update the registry with driver changes.
6. User mode. Once these phases are complete and the remaining system driver files are
loaded, the Session Manager Subsystem (SMSS) is loaded, and in turn, loads files
necessary to create the user mode interface. If the boot has been successful, the user can
log in.
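
As a companion to steps 2 and 3, the sketch below parses the two boot.ini device syntaxes described above, the ARC path and the signature() form, into their components. It is a simple illustration using regular expressions, not the parser Ntldr actually uses, and the sample values are illustrative rather than taken from a real boot.ini.

    import re

    ARC = re.compile(
        r"(?P<adapter>multi|scsi|signature)\((?P<id>[0-9a-fA-F]+)\)"
        r"disk\((?P<disk>\d+)\)rdisk\((?P<rdisk>\d+)\)partition\((?P<part>\d+)\)"
        r"(?P<root>\\.*)"
    )

    def parse_boot_device(arc_path: str) -> dict:
        m = ARC.match(arc_path)
        if not m:
            raise ValueError(f"unrecognized boot device syntax: {arc_path}")
        return m.groupdict()

    # Typical entries (hypothetical):
    print(parse_boot_device(r"multi(0)disk(0)rdisk(0)partition(1)\WINDOWS"))
    print(parse_boot_device(r"signature(8b467c12)disk(0)rdisk(0)partition(1)\WINDOWS"))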



Intel IA-64 Architecture Differences
Windows Server 2003 on IA-64, using EFI (Extensible Firmware Interface) firmware in place of the
traditional BIOS, supports 64-bit addressing, enabling hardware with 64-bit capabilities to realize
full performance improvements. Bootable HBAs must be tested with IA-64 hardware. In certain
configurations (see the scenario section), the Storport driver is the recommended HBA driver. In
contrast to the IA-32 design:
The EFI firmware can boot from any device. The drive address does not use the INT13
mechanisms previously described.
Drives are partitioned with the GUID Partition Table (GPT).
The EFI loader, rather than Ntldr, is used to load the information gathered in steps 4-5.
Hardware detection is accomplished by the firmware, rather than by the software-based
Ntdetect. Hardware detection is actually easier, because a device path (WWN and real
eight-byte LUN numbers) and disk signatures are used, ensuring that the solution works
correctly.

Windows Server System is the comprehensive, integrated server software that simplifies the
development, deployment, and operation of agile business solutions.
www.microsoft.com/windowsserversystem

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of
publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of
Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS
DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this
document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic,
mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this
document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you
any license to these patents, trademarks, copyrights, or other intellectual property.
© 2001 Microsoft Corporation. All rights reserved.
Microsoft, Windows 2000 Server, and Windows Server 2003 are either registered trademarks or trademarks of Microsoft Corporation in the
United States and/or other countries.

