■ Normalization can be summed up as the three "Rules of One": one group of items = one table, one item = one row, one fact = one column.
■ Generalization is the buffer against normalization over-complexity.
With smart database design and normalization as a foundation, the next few chapters move into
installing SQL Server, connecting clients, and using the tools.
Installing SQL Server 2008

IN THIS CHAPTER
Server-hardware recommendations
Planning an installation
Installing multiple instances of SQL Server
Upgrading from previous versions of SQL Server
Migrating to SQL Server
The actual process of installing SQL Server is relatively easy; the trick is planning and configuring the server to meet the current and future needs of a production environment. The hardware, the operating system, the collation, and several other decisions should be settled prior to the SQL Server installation.
Not every SQL Server 2008 server will be a fresh installation. SQL Server 2000 and SQL Server 2005 servers can be upgraded to SQL Server 2008. Additionally, the data might reside in a foreign database, such as Microsoft Access, MySQL, or Oracle, and the project might involve porting the database to SQL Server.
Not every SQL Server 2008 server will run production databases — there
are developer sandbox servers, quality test servers, integration test servers,
performance test servers, and the list goes on. This chapter discusses all these
situations to help you avoid surprises.
Selecting Server Hardware
The value per dollar for hardware has improved significantly and continues to do
so. Nevertheless, large datacenters can still cost hundreds of thousands of dollars.
This section provides some design guidelines for planning a server.
CPU planning
SQL Server needs plenty of raw CPU horsepower. Fortunately, the newer crop of CPUs performs very well, and today's servers use multi-core CPUs. Microsoft licenses SQL Server by the CPU socket, not by the number of cores. Given the dropping price of multi-core CPUs and the per-socket license cost of SQL Server, it makes sense to buy the most cores possible per socket. When planning your server, note the following:
■ As a beginning point for planning, I recommend one CPU core per 500 transactions per
second. Of course, you should test your application to determine the number of transactions a
core can provide.
■ A well-planned server will have CPUs running at 30%–50% utilization, as reflected by
Performance Monitor.
■ I also strongly recommend using 64-bit CPUs for their large memory addressing. If the server
will see high transactions (>10K per second), then choose Itanium 64-bit CPUs because they
have better throughput than x64 CPUs.
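Once the hardware is in place, it's easy to confirm how many logical CPUs SQL Server actually sees. The following query is a minimal sketch against the sys.dm_os_sys_info dynamic management view; it assumes you can already connect to a SQL Server 2008 instance on the box:

-- Logical CPUs and schedulers visible to this instance
SELECT cpu_count,                               -- logical CPUs visible to SQL Server
       hyperthread_ratio,                       -- logical CPUs per physical socket
       cpu_count / hyperthread_ratio AS sockets_estimate,
       scheduler_count                          -- schedulers created for user queries
FROM sys.dm_os_sys_info;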
What’s New with SQL Server Setup?
If you've been installing SQL Server servers for a while, the first thing you'll see is that the setup is brand-new. It's been completely rewritten from the ground up.
If running the Surface Area Configuration tool was part of your SQL Server 2005 installation process, you’ll
notice that it’s gone, replaced with Policy-Based Management policies.
Copious memory
Memory is a magic elixir for SQL Server. Any time the data is already in cache, it's a big win for performance. Balance the performance of the CPUs, memory, and disk subsystems, but focus on memory. More memory will reduce the I/O requirement and thus also reduce the CPU requirement.
When planning server memory, I recommend the following:
■ The easy answer is to buy as much memory as you can afford. SQL Server consumes memory for cached query execution plans and cached data pages, so the amount of memory needed isn't based on the size of the database but on the number of queries. I recommend using this formula as a baseline for required memory: 2 GB for the OS and SQL Server, plus 1 GB per 1,000 queries per second. Of course, this greatly depends on the complexity of the queries and the type of index access. An efficiently designed database (great schema, queries, and indexing) can support more queries per gigabyte than a poorly designed database.
■ If the amount of memory will eventually exceed 4 GB, I also strongly recommend using 64-bit versions of the operating system and SQL Server because the memory addressing is so much smoother than the 32-bit AWE solution.
To enable AWE, SQL Server 2008 must run under an account that has the Lock Pages in Memory option turned on and the AWE Enabled option set to 1 using sp_configure.
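The AWE setting itself is a standard sp_configure option. The following is a minimal sketch, run from a sysadmin connection; the option matters only on 32-bit instances, requires a service restart to take effect, and the Lock Pages in Memory right is granted to the service account through the Windows Local Security Policy rather than through T-SQL:

EXEC sp_configure 'show advanced options', 1;  -- awe enabled is an advanced option
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;            -- ignored on 64-bit instances
RECONFIGURE;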
Disk-drive subsystems
The disk subsystem is critical for both performance and availability. Here are some guidelines for
planning the disk subsystem:
■ The scalability bottleneck is typically the disk subsystem throughput. If you can use a storage
area network (SAN) for your disk subsystem, do so. A properly configured SAN will scale
further than local disk subsystems. SANs offer four significant benefits: they spread the files across several disk spindles; they use a high-speed fiber-optic connection; they typically include a very large RAM buffer to absorb bursts of traffic; and they can usually perform a hardware-level snapshot backup and restore.
The cons are that SANs cost 40–50 times as much as local disk space and they are very
difficult to configure and tune, so encourage the SAN administrator to focus on the database
requirements and carefully configure the database LUNs (Logical Unit Number — similar to
a virtual drive) so the database isn’t lost in the organization’s common file traffic. This can
be very difficult to do, especially when file server and database traffic are combined on the
same SAN.
■ Never, never, ever try to use iSCSI devices that connect the server and the disk subsystem using Ethernet. The Ethernet simply won't keep up, and the TCP/IP stack processing will consume CPU cycles. It's a sure way to waste a lot of time (trust me).
■ Watch the prices for the new solid-state drives (SSD) and move to them as soon as it's affordable. SSD drives will dramatically improve both database performance and availability. Even if you have a SAN, I'd use a local SSD drive for the database transaction log, tempdb, and its transaction log.
If you aren’t using a SAN, here are my recommendations for configuring local direct attached storage
(DAS). Each DAS disk subsystem has its own disk controller:
■ Using one large RAID 5 disk array and placing all the files on the array may be easy to configure, but it will cost performance. The goal of the disk subsystem is more than redundancy. You want to separate different files onto dedicated disks for specific purposes.
■ SATA drives don't wait for a write to complete before telling Windows they're finished with the task. While this might be great for a PC, it shortcuts the SQL Server write-ahead transaction log verification and compromises data durability. Don't use SATA drives for production; use SCSI drives.
■ The goal for database disk subsystems is not to use the largest disk available, but to use more spindles. Using four 36GB drives is far better than a single 146GB drive. More spindles are always better than fewer spindles. If a byte is striped across 8 drives, then the controller can read the entire byte in one-eighth of the time it would take if the byte were on a single drive. Use RAID striping, multiple filegroups, or a SAN to spread the load across multiple spindles.
■ When choosing drives, choose the highest spindle speed and throughput you can afford.
■ SQL Server is optimized to read and write sequentially from the disk subsystem for both data files and transaction logs, so use RAID 1 (mirrored) or RAID 10 (mirrored and striped), which is also optimized for sequential operations, rather than RAID 5, which is better for random access.
■ While software options are available to provide behavior similar to RAID, they are not as
efficient as RAID-specific hardware solutions. The software solutions tie up CPU cycles to
perform the RAID activities that could be used for server processing. Don’t use software RAID
for a production SQL Server.
■ The transaction log for any database that sees a significant volume of writes should be on a dedicated DAS so that the heads can stay near the end of the transaction log without moving to other files. In addition, be sure to put enough memory into the disk controller to buffer a burst of transactions, but ensure that the disk controller buffer has an on-board battery.
■ SQL Server adds additional threads to handle additional data files, so it’s far better to use three
data files on three DAS subsystems than a single larger file. Using multiple files to spread the
load across multiple drives is better than manually using multiple filegroups to separate tables.
■ SQL Server's Query Processor makes heavy use of tempdb. The best disk optimization you can do is to dedicate a DAS to tempdb and, of course, another disk to tempdb's transaction log. Placing tempdb on multiple files across multiple DAS disk subsystems is another good idea (a T-SQL sketch follows Table 4-1).
■ Windows wants to have a quick swap file. Regardless of how much physical memory is in the
server, configure a large swap file and place it on a dedicated disk subsystem.
■ To recap scaling out a non-SAN disk subsystem, Table 4-1 lists one possible configuration of disk subsystems. Each drive letter might actually be configured for multiple striped drives. They're listed by priority — for example, if you only have four drive subsystems, then break out the transaction log, first data file, and tempdb transaction log.
TABLE 4-1
Scaling Non-SAN Disk Subsystems
Logical Drive Purpose
C: Windows system and SQL Server executables
D: Transaction log
E: First data file
F: tempdb transaction log
G: tempdb data file
H: Windows swap file
I: Additional data files
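Once the drives exist, the tempdb layout described above can be scripted. The following is only a sketch: the drive letters follow Table 4-1, the folder names and file sizes are placeholders, and tempdev and templog are the default tempdb logical file names. File moves take effect the next time the SQL Server service restarts.

-- Move the primary tempdb data file and the tempdb log to dedicated drives
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'G:\tempdb\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'F:\tempdb\templog.ldf');

-- Add a second data file on another spindle to spread tempdb I/O
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'I:\tempdb\tempdb2.ndf',
              SIZE = 1GB, FILEGROWTH = 512MB);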
RAID Illustrated
RAID stands for Redundant Array of Independent/Inexpensive Disks. It is a category of disk drives that utilizes two or more drives in combination for increased performance and fault tolerance. RAID applications are typically found in high-performance disk systems utilized by servers to improve the persistence and availability of data. Table 4-2 describes the various levels of RAID.
Network performance
Typical motherboards today include built-in network interface cards (NICs) capable of auto-switching between 10/100/1,000 Mbps. As with most built-in devices, these tend to utilize the CPU for required processing, which affects performance. A variety of manufacturers today offer NIC cards that include onboard TCP/IP stack processing, freeing up those tasks from the CPU. This improves overall network performance while reducing the CPU load, and I highly recommend them.
TABLE 4-2
RAID Levels

JBOD (0% redundancy): Just a Bunch of Disks. Each extra disk extends the storage as if the disk were replaced with a larger one.

RAID 0 (0% redundancy): Data striping. Data is spread out across multiple drives, speeding up data writes and reads. No parity, redundancy, or fault tolerance is available.

RAID 1 (50% redundancy): Data mirroring. Data is written to two drives and read from either drive, providing better fault tolerance.

RAID 5 (redundancy of one drive's capacity, or 1/n; a five-drive array, for example, dedicates 20% of its capacity to redundancy): Data striping with parity. Because of the parity data, any single drive can fail and the disk subsystem can still function. When the failed disk drive is replaced, the disk subsystem can re-create the data the failed drive contained.

RAID 6 (redundancy of two drives' capacity, or 2/n; a ten-drive array, for example, dedicates 20% of its capacity to redundancy): Similar to RAID 5, except that the parity is written twice, so any two drives can fail simultaneously and the drive subsystem can continue to function.

RAID 10 (50% redundancy): Mirrored striped drives. These offer the speed of data striping and the protection of data mirroring.
Preparing the Server
With the server in place, it’s time to configure the operating system and set up the service accounts.
Dedicated server
I strongly recommend running any production SQL Server on a dedicated server — don't use the server for file or print services, Exchange, or IIS — for two reasons:
■ Availability: Multiple services running on the same server increase how often the server requires service pack installations, adjustments, and possibly rebooting. A dedicated SQL Server installation increases database availability.
■ Economics: SQL Server is both resource intensive and expensive. SQL Server eats memory
for lunch. It doesn’t make economic sense to share the server resources between the expensive
SQL Server license and other less expensive software.
When to Consolidate and Virtualize?
Server consolidation and virtual servers are a hot topic in the IT world. Running several logical servers on one large box can be cost effective and attractive in a spreadsheet. The question is, when is it a good idea for SQL Server?
■ CTP testing: Personally, I’m a big fan of Microsoft Virtual PC and I often build up a
SQL Server instance using VPC. Until the SQL Server 2008 RTM version is available,
I only run Katmai CTP builds in a VPC. (I wrote this entire book with Katmai CTPs
running on a VPC full-screen on one monitor and Word running on the host operating
system on a second monitor.) Virtual PC is so easy to set up, and so safe, it doesn’t
make any sense to run a pre-RTM version of any software on the host operating
system.
■ Developer sandboxes: Developer Edition is so inexpensive that I recommend all
developers have a license for Developer Edition on their own machine, rather than use
a virtualized server.
■ QA, test, integration testing, and pre-prod servers: Every organization has its own way of pushing code from the developer to production, or from the vendor to production. Because these applications demand strict application compatibility, but not performance compatibility, VPCs are perfect for this type of testing.
■ Service pack testing: A virtual server is a great way to build up test servers for these
configurations, or for testing Microsoft service packs.
■ Performance testing: Depending on the scope of performance testing, a virtual server may be able to show performance gains for a new rev of the database code; but if the performance test is to prove the database can handle n tps, then the performance test environment needs to be identical to production, and that counts out virtualization.
■ Production: With Windows Server 2008 and Hyper-V, running multiple SQL Server servers in a virtualized environment for production is a very reasonable alternative.
The other obvious option of SQL Server consolidation is running multiple SQL Server instances on the same
physical server. This still enables each instance to have its own server settings and service pack level. My
primary concern with multiple instances is that I want to avoid any resource contention.
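One practical way to limit that contention is to cap the memory each instance may take. This is only a sketch; run it in each instance and pick numbers that fit the physical RAM (the 4096 here is purely illustrative):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Cap this instance's buffer pool so stacked instances don't starve each other
EXEC sp_configure 'max server memory (MB)', 4096;
RECONFIGURE;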
Operating system
SQL Server 2008 installs and runs on various operating systems — from Windows XP to Windows Server 2003 Enterprise Edition, with the more feature-rich versions running on the higher-end operating systems. Appendix A, "SQL Server 2008 Specs," includes a table listing the supported operating systems by edition.
Service accounts
The SQL Server services require Windows login accounts to run and access the file system. It's possible to allow SQL Server to run using the local service account, but creating a specific Windows user account for SQL Server services provides better security and reliability. You can configure these accounts with the required minimum set of permissions (user, not administrator) and access to the data files.
The accounts can be specified independently during installation by selecting the "Customize for each service account" option. By default, SQL Server, SQL Server Agent, Analysis Services, and SQL Browser share the same login account. Ensure that the assigned Windows login account for each service has the appropriate file and resource permissions. Each login account and service relationship is listed in Table 4-3.
If the installation will include servers that will communicate and perform distributed queries or replication, then the login account must be a domain-level account.
Server instances
SQL Server 2008 Enterprise Edition supports up to 50 instances of SQL Server running on the same
physical server, including instances of different editions (Enterprise, Standard, or Developer). The advan-
tage of multiple instances is that each instance can have its own server-level configuration and service
pack level.
Using multiple instances of SQL Server to provide multiple databases on the same server
negatively affects performance. Each instance requires its own resources as well as CPU
cycles to handle requests. While using a multi-core processor could mitigate the performance issues to
an extent, using a large virtualized server or a single SQL Server to handle multiple databases is the best
solution.
TABLE 4-3
Startup Accounts for SQL Server Services

SQL Server
Default account: SQL Express on Windows 2000 – local system; SQL Express on all other operating systems – network service; all other editions on all operating systems – domain user
Optional accounts: domain user, local system, or network service (all editions)

SQL Server Agent
Default account: domain user
Optional accounts: domain user, local system, or network service

Analysis Services
Default account: domain user
Optional accounts: domain user, local system, network service, or local service

Reporting Services
Default account: domain user
Optional accounts: domain user, local system, network service, or local service

Integration Services
Default account: Windows 2000 – local system; all other operating systems – network service
Optional accounts: domain user, local system, network service, or local service

Full-Text Search
Default account: same as SQL Server
Optional accounts: domain user, local system, network service, or local service

SQL Server Browser
Default account: SQL Express on Windows 2000 – local system; SQL Express on all other operating systems – local service; all other editions on all operating systems – domain user
Optional accounts: domain user, local system, network service, or local service

SQL Server Active Directory Helper
Default account: network service
Optional accounts: local system or network service

SQL Writer
Default account: local system
Optional accounts: local system
The default location for SQL Server and associated files will be similar to the following:
C:\Program Files\Microsoft SQL Server\MSSQL.#
An instance can be installed as the default instance (with the same name as the server) or a named instance (with the name as servername\instancename). I recommend installing only named instances on a production server as a security measure: depending on the attack, the attacker needs to know the instance name as well as the server name.
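To confirm which instance a given connection actually reached, a quick query such as the following can help (SERVERPROPERTY('InstanceName') returns NULL for a default instance):

SELECT SERVERPROPERTY('MachineName')  AS MachineName,
       SERVERPROPERTY('InstanceName') AS InstanceName,   -- NULL on a default instance
       @@SERVICENAME                  AS ServiceName;    -- MSSQLSERVER on a default instance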
Not all installed services are shared among the multiple instances. Table 4-4 shows a list of shared versus instance-specific services. Instance-specific services will have their own installed components.
TABLE 4-4
Shared SQL Server Services

Shared services: SQL Browser, SQL Server Active Directory Helper, SQL Writer
Instance-specific services: SQL Server, SQL Server Agent, Analysis Services, Reporting Services, Full-Text Search
Performing the Installation
Once the installation plan has been created and the server is set up to meet the SQL Server
requirements, it is time to install the software.
Setup.exe opens the SQL Server Installation Center, shown in Figure 4-1, which brings together into
one UI a broad collection of installation utilities and resources — from the planning stage to advanced
options.
On the Planning page there's an option to install the Upgrade Advisor. If you're upgrading an existing database to SQL Server 2008, I recommend running this utility on the existing database to check for any potential issues.
By default, the Installation Center will choose the highest CPU level available on the server (x86, x64,
IA-64). If you want to install a different CPU version of SQL Server, the Installation Center’s Options
page is the only location for this option.
Attended installations
A new installation is initiated from the Installation page ➪ "New SQL Server stand-alone installation or add features to an existing installation." The installation process moves through several pages according to the type of installation and components selected.
Errors, or missing items, appear at the bottom of the page with red Xs. You cannot progress past any
page with an error.