Tải bản đầy đủ (.pdf) (80 trang)

Thinking in C# phần 5 pps

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (1002.86 KB, 80 trang )


342 Thinking in C# www.ThinkingIn.NET
.NET to program Windows Forms, it will place all the code relating to
constructing the user-interface into a method called InitializeComponent( );
this method may be hundreds of lines long, but it contains no control-flow
operators, so it’s length is irrelevant. On the other hand, the 15 lines of this leap
year calculation are about as complex as is acceptable:
//:c09:LeapYearCalc.cs
using System;

class LeapYearCalc {
static bool LeapYear(int year){
if (year % 4 != 0) {
return false;
} else {
if (year % 400 == 0) {
return true;
} else {
if (year % 100 == 0) {
return false;
} else {
return true;
}
}
}
}

public static void Test(int year, bool val){
if (val == LeapYear(year)) {
Console.WriteLine(
"{0} correctly calced as {1}", year, val);


return;
}
throw new TestFailedException(
String.Format("{0} not calc'ed as {1}", year, val) );
}

public static void Main(){
Test(1999, false);
Test(2000, true);
Test(1900, false);
}

Chapter 9: Coupling and Cohesion 343
}

class TestFailedException : ApplicationException{
public TestFailedException(String s): base(s){ }
}///:~

Some simple testing code is shown because, less than a month before this book
went to press, we found a bug in the LeapYearCalc( ) function had! So maybe
the 15 lines in that function are a little more complex than allowable…
Make stuff as private as possible
Now that we’ve introduced the concept of coupling and cohesion, the use of the
visibility modifiers in C# should be more compelling. The more visible a piece of
data, the more available it is to be used for common coupling or communicational
and worse forms of cohesion.
The very real advantages that come from object-orientation, C#, and the .NET
Framework do not derive from the noun.Verb( ) form of method calls or from
using brackets to specify scope. The success of the object-oriented paradigm

stems from encapsulation, the logical organization of data and behavior with
restricted access. Coupling and cohesion are more precise terms to discuss the
benefits of encapsulation, but class interfaces, inheritance, the visibility
modifiers, and Properties – the purpose of all of these things is to hide a large
number of implementation details while simultaneously providing functionality
and extensibility.
Why do details need to be hidden? For the original programmer, details that are
out of sight are out of mind, and the programmer frees some amount of his or her
finite mental resources for work on the next issue. More importantly than this,
though, details need to be hidden so software can be tested, modified, and
extended. Programming is a task that is characterized by continuously
overcoming failure: a missed semicolon at the end of a line, a typo in a name, a
method that fails a unit test, a clumsy design, a customer who says “this isn’t
what I wanted.” So as a programmer you are always revisiting existing work,
whether it’s three minutes, three weeks, or three years old. Your productivity as a
professional programmer is not governed by how fast you can create, it is
governed by how fast you can fix. And the speed with which you can fix things is
influenced by the number of details that must be characterized as relevant or
irrelevant. Objects localize and isolate details.

344 Thinking in C# www.MindView.net
Coupling, cohesion,
and design trends
Coupling and cohesion, popularized by Ed Yourdon and Larry Constantine way
back in the 1970s, are still the best touchstones for determining whether a
method or type is built well or poorly. The most important software engineering
book of the 1990s was Design Patterns: Elements of Reusable Object-Oriented
Software (Addison-Wesley, 1995) by Erich Gamma, Richard Helm, Ralph
Johnson, and John Vlissides (the “Gang of Four”). What really set Design
Patterns apart is that it was based on an archaeological approach to design;

instead of putting their no-doubt-clever heads together and saying “Here’s a new
way to solve this problem,” the book documents common structures and
interactions (design patterns) that they found in proven software systems. When
compared to other object-oriented design books, what leaps out about Design
Patterns is the complete lack of references to objects that correspond to physical
items in the real world and the recurring emphasis of techniques to decrease
coupling and increase cohesion.
An interesting question is whether low coupling and high cohesion are a cause of
good design or a consequence of it. The traditional view has been that they are a
consequence of design: you go into your cubicle, fire up your CASE tool, think
deep thoughts, and emerge with a set of diagrams that will wow the crowds at the
design review. This view is challenged by one of the better books of the past few
years: Martin Fowler’s Refactoring: Improving the Design of Existing Code
(Addison-Wesley, 1999). This book makes the fairly radical claim that taking
“simple, even simplistic” steps on existing code, no matter how chaotic, leads to
good design. Fowler goes even further and points out that without refactoring,
the design of a system decays over time as the system is maintained; this is one of
those obvious-in-retrospect observations that invalidates an entire worldview, in
this case, the worldview that design is done with a diagramming tool and a blank
piece of paper.
Refactoring is changing the internal structure of your code without changing its
internal behavior; Fowler presents a suite of refactorings and “code smells” to
indicate when refactoring is needed. The book doesn’t explicitly address issues of

Chapter 9: Coupling and Cohesion 345
coupling and cohesion
5
, but when viewed through the lens of structured design,
refactoring is clearly driven by these concerns.
Summary

Any software project of more than a few hundred lines of code should be
organized by a principle. This principle is called the software’s architecture. The
word architecture is used in many ways in computing; software architecture is a
characteristic of code structure and data flows between those structures. There
are many proven software architectures; object-orientation was originally
developed to aid in simulation architectures but the benefits of objects are by no
means limited to simulations.
Many modern-day projects are complex enough that it is appropriate to
distinguish between the architecture of the overall systems and the architecture
of different subsystems. The most prevalent examples of this are Web-based
systems with rich clients, where the system as a whole is often an n-tier
architecture, but each tier is a significant project in itself with its own organizing
principle.
Where the aims of architecture are strategic and organizational, the aims of
software design are tactical and pragmatic. The purpose of software design is to
iteratively deliver client value as inexpensively as possible. The most important
word in that previous sentence is “iteratively.” You may fool yourself into
believing that design, tests, and refactoring are wastes of time on the current
iteration, but you can’t pretend that they are a waste of time if you accept that
whatever you’re working on is likely to be revisited every three months, especially
if you realize that if you don’t make things clear, they’re going to be going to be
calling you at 3 o’clock in the morning when the Hong Kong office says the
system has frozen
6
.
Software design decisions, which run the gamut from the parameters of a method
to the structure of a namespace, are best made by consideration of the principles
of coupling and cohesion. Coupling is the degree to which two software elements
are interdependent; cohesion is a reflection of a software element’s internal



5
Like Extreme Programming, another excellent recent book, Refactoring promotes
homespun phrases like “code smells” and “the rule of three” that are no more or less
exclusionary than the software engineering jargon they pointedly avoid.
6
Actually, they’ll call the IT guys first. That’s why it’s important to cultivate the perception
that you know absolutely nothing about system administration and hardware.

346 Thinking in C# www.ThinkingIn.NET
dependencies. Good software designs are characterized by loose coupling and
high cohesion. With the rise of object orientation, the word “encapsulation” has
come to be used to characterize all of the benefits of detail hiding, high cohesion,
and loose coupling.
At this halfway point in the book, we have covered C# as a language and the
concepts of object-orientation. However, we’ve hardly scratched the surface of
the .NET Framework SDK, hundreds of classes and namespaces that provide an
object-oriented view of everything from data structures to user-interfaces to the
World Wide Web. From hereon out, the concerns of the book are generally less
specific to the C# language per se and more generally applicable to the
capabilities that the .NET Framework would make available to any language. This
does not mean that we’ve exhausted our discussion of the C# language, however.
Some of the most interesting aspects of the C# language are yet to be introduced.
Exercises
1. Try pair programming on one of the problems in the party domain. Try to
reserve judgment until you've paired with programmers who are more,
less, and similarly experienced.
2. Read Appendix C, “Test-First Programming with NUnit” and tackle a
simple task in the party domain via test-first programming.
3. Write a one-page essay evaluating your personal experience with pair

and test-first programming.
4. Fill in the following Venn diagram comparing aspects of software
development with physical architecture.

Chapter 9: Coupling and Cohesion 347
Software
Development
Architecture
Shared

5. Write a one-page essay defending or refuting the statement “Software is
architecture.”
6. The hardware manufacturers are thrilled with your work with the robotic
party servant and want you to lead the development of all the robot's
behavioral software. What kind of architecture will you adopt? Why?
7. Evaluate your party servant system. Use everything that you have learned
to improve your design and implementation.


349
10: Collecting
Your Objects
It’s a fairly simple program that has only a fixed quantity of
objects with known lifetimes.
In general, your programs will always be creating new objects based on some
criteria that will be known only at the time the program is running. You won’t know
until run-time the quantity or even the exact type of the objects you need. To solve
the general programming problem, you need to be able to create any number of
objects, anytime, anywhere. So you can’t rely on creating a named reference to hold
each one of your objects:

MyType myObject;

since you’ll never know how many of these you’ll actually need.
To solve this rather essential problem, C# has several ways to hold objects (or
rather, references to objects). The built-in type is the array, which has been
discussed before. Also, the C# System.Collections namespace has a reasonably
complete set of container classes (also known as collection classes). Containers
provide sophisticated ways to hold and manipulate your objects.
Containers open the door to the world of computing with data structures, where
amazing results can be achieved by manipulating the abstract geometry of trees,
vector spaces, and hyperplanes. While data structure programming lies outside of
the workaday world of most programmers, it is very important in scientific,
graphic, and game programming.
Arrays
Most of the necessary introduction to arrays was covered in Chapter 5, which
showed how you define and initialize an array. Holding objects is the focus of this
chapter, and an array is just one way to hold objects. But there is a number of other
ways to hold objects, so what makes an array special?
There are two issues that distinguish arrays from other types of containers:
efficiency and type. The array is the most efficient way that C# provides to store

350 Thinking in C# www.ThinkingIn.NET
and randomly access a sequence of objects (actually, object references). The array is
a simple linear sequence, which makes element access fast, but you pay for this
speed: when you create an array object, its size is fixed and cannot be changed for
the lifetime of that array object. You might suggest creating an array of a particular
size and then, if you run out of space, creating a new one and moving all the
references from the old one to the new one. This is the behavior of the ArrayList
class, which will be studied later in this chapter. However, because of the overhead
of this size flexibility, an ArrayList is measurably less efficient than an array.

The vector container class in C++ does know the type of objects it holds, but it has
a different drawback when compared with arrays in C#: the C++ vector’s
operator[] doesn’t do bounds checking, so you can run past the end
1
. In C#, you
get bounds checking regardless of whether you’re using an array or a container—
you’ll get an IndexOutOfRangeException if you exceed the bounds. As you’ll
learn in Chapter 11, this type of exception indicates a programmer error, and thus
you don’t need to check for it in your code. As an aside, the reason the C++ vector
doesn’t check bounds with every access is speed—in C# you have the performance
overhead of bounds checking all the time for both arrays and containers.
The other generic container classes that will be studied in this chapter,
ICollection, IList and IDictionary, all deal with objects as if they had no
specific type. That is, they treat them as type object, the root class of all classes in
C#. This works fine from one standpoint: you need to build only one container, and
any C# object will go into that container. This is the second place where an array is
superior to the generic containers: when you create an array, you create it to hold a
specific type. This means that you get compile-time type checking to prevent you
from putting the wrong type in, or mistaking the type that you’re extracting. Of
course, C# will prevent you from sending an inappropriate message to an object,
either at compile-time or at run-time. So it’s not much riskier one way or the other;
it’s just nicer if the compiler points it out to you, faster at run-time, and there’s less
likelihood that the end user will get surprised by an exception.
Typed generic classes (sometimes called “parameterized types” and sometimes just
“generics”) are not part of the initial .NET framework but will be. Unlike C++’s
templates or Java’s proposed extensions, Microsoft wishes to implement support
for “parametric polymorphism” within the Common Language Runtime itself. Don
Syme and Andrew Kennedy of Microsoft’s Cambridge (England) Research Lab



1
It’s possible, however, to ask how big the vector is, and the at( ) method does perform
bounds checking.

Chapter 10: Collecting Your Objects 351
published papers in Spring 2001 on a proposed strategy and Anders Hjelsberg
hinted at C#’s Spring 2002 launch that implementation was well under way.
For the moment, though, efficiency and type checking suggest using an array if you
can. However, when you’re trying to solve a more general problem arrays can be too
restrictive. After looking at arrays, the rest of this chapter will be devoted to the
container classes provided by C#.
Arrays are first-class objects
Regardless of what type of array you’re working with, the array identifier is actually
a reference to a true object that’s created on the heap. This is the object that holds
the references to the other objects, and it can be created either implicitly, as part of
the array initialization syntax, or explicitly with a new expression. Part of the array
object is the read-only Length property that tells you how many elements can be
stored in that array object. For rectangular arrays, the Length property tells you
the total size of the array, the Rank property tells you the number of dimensions in
the array, and the GetLength(int) method will tell you how many elements are in
the given rank.
The following example shows the various ways that an array can be initialized, and
how the array references can be assigned to different array objects. It also shows
that arrays of objects and arrays of primitives are almost identical in their use. The
only difference is that arrays of objects hold references, while arrays of primitives
hold the primitive values directly.
//:c10:ArraySize.cs
// Initialization & re-assignment of arrays.
using System;


class Weeble {
} // A small mythical creature

public class ArraySize {
public static void Main() {
// Arrays of objects:
Weeble[] a; // Null reference
Weeble[] b = new Weeble[5]; // Null references
Weeble[,] c = new Weeble[2, 3]; //Rectangular array
Weeble[] d = new Weeble[4];
for (int index = 0; index < d.Length; index++)
d[index] = new Weeble();
// Aggregate initialization:

352 Thinking in C# www.MindView.net
Weeble[] e = {
new Weeble(), new Weeble(), new Weeble()
};
// Dynamic aggregate initialization:
a = new Weeble[] {
new Weeble(), new Weeble()
};
// Square dynamic aggregate initialization:
c = new Weeble[,] {
{ new Weeble(), new Weeble(), new Weeble()},
{ new Weeble(), new Weeble(), new Weeble()}
};

Console.WriteLine("a.Length=" + a.Length);
Console.WriteLine("b.Length = " + b.Length);

Console.WriteLine("c.Length = " + c.Length);
for (int rank = 0; rank < c.Rank; rank++) {
Console.WriteLine(
"c.Length[{0}] = {1}", rank, c.GetLength(rank));
}
// The references inside the array are
// automatically initialized to null:
for (int index = 0; index < b.Length; index++)
Console.WriteLine("b[" + index + "]=" + b[index]);
Console.WriteLine("d.Length = " + d.Length);
Console.WriteLine("d.Length = " + d.Length);
a = d;
Console.WriteLine("a.Length = " + a.Length);

// Arrays of primitives:
int[] f; // Null reference
int[] g = new int[5];
int[] h = new int[4];
for (int index = 0; index < h.Length; index++)
h[index] = index*index;
int[] i = { 11, 47, 93};
// Compile error: Use of unassigned local variable 'f'
//! Console.WriteLine("f.Length=" + f.Length);
Console.WriteLine("g.Length = " + g.Length);
// The primitives inside the array are
// automatically initialized to zero:

Chapter 10: Collecting Your Objects 353
for (int index = 0; index < g.Length; index++)
Console.WriteLine("g[" + index + "]=" + g[index]);

Console.WriteLine("h.Length = " + h.Length);
Console.WriteLine("i.Length = " + i.Length);
f = i;
Console.WriteLine("f.Length = " + f.Length);
f = new int[] { 1, 2};
Console.WriteLine("f.Length = " + f.Length);
}
} ///:~

Here’s the output from the program:
a.Length=2
b.Length = 5
c.Length = 6
c.Length[0] = 2
c.Length[1] = 3
b[0]=
b[1]=
b[2]=
b[3]=
b[4]=
d.Length = 4
d.Length = 4
a.Length = 4
g.Length = 5
g[0]=0
g[1]=0
g[2]=0
g[3]=0
g[4]=0
h.Length = 4

i.Length = 3
f.Length = 3
f.Length = 2

The array a is initially just a null reference, and the compiler prevents you from
doing anything with this reference until you’ve properly initialized it. The array b is
initialized to point to an array of Weeble references, but no actual Weeble objects
are ever placed in that array. However, you can still ask what the size of the array is,
since b is pointing to a legitimate object. This brings up a slight drawback: you can’t
find out how many elements are actually in the array, since Length tells you only

354 Thinking in C# www.ThinkingIn.NET
how many elements can be placed in the array; that is, the size of the array object,
not the number of elements it actually holds. However, when an array object is
created its references are automatically initialized to null, so you can see whether a
particular array slot has an object in it by checking to see whether it’s null.
Similarly, an array of primitives is automatically initialized to zero for numeric
types, (char)0 for char, and false for bool.
Array c shows the creation of the array object followed by the assignment of
Weeble objects to all the slots in the array. Array d shows the “aggregate
initialization” syntax that causes the array object to be created (implicitly with new
on the heap, just like for array c) and initialized with Weeble objects, all in one
statement.
The next array initialization could be thought of as a “dynamic aggregate
initialization.” The aggregate initialization used by d must be used at the point of
d’s definition, but with the second syntax you can create and initialize an array
object anywhere. For example, suppose Hide( ) is a method that takes an array of
Weeble objects. You could call it by saying:
Hide(d);


but you can also dynamically create the array you want to pass as the argument:
Hide(new Weeble[] { new Weeble(), new Weeble() });

In some situations this new syntax provides a more convenient way to write code.
Rectangular arrays are initialized using nested arrays. Although a rectangular array
is contiguous in memory, C#’s compiler will not allow you to ignore the
dimensions; you cannot cast a flat array into a rectangular array or initialize a
rectangular array in a “flat” manner.
The expression:
a = d;

shows how you can take a reference that’s attached to one array object and assign it
to another array object, just as you can do with any other type of object reference.
Now both a and d are pointing to the same array object on the heap.
The second part of ArraySize.cs shows that primitive arrays work just like object
arrays except that primitive arrays hold the primitive values directly.

Chapter 10: Collecting Your Objects 355
The Array class
In System.Collections, you’ll find the Array class, which has a variety of
interesting properties and methods. Array is defined as implementing
ICloneable, IList, ICollection, and IEnumerable. This is actually a pretty
sloppy declaration, as IList is declared as extending ICollection and
IEnumerable, while ICollection is itself declared as extending IEnumerable
(Figure 10-1)!
ICollection
IEnumerable
ICloneable
IList
Array


Figure 10-1: The Array class has a complex set of base types
The Array class has some properties inherited from ICollection that are the same
for all instances: IsFixedSize is always true, IsReadOnly and IsSynchronized
are always false.
Array’s static methods
The Array class has several useful static methods, which are illustrated in this
program:
//:c10:ArrayStatics.cs
using System;
using System.Collections;

356 Thinking in C# www.MindView.net

class Weeble {
string name;
internal string Name{
get { return name;}
set { name = value;}
}
internal Weeble(string name) {
this.Name = name;
}
}
class ArrayStatics {
static string[] dayList = new string[]{
"sunday", "monday", "tuesday", "wednesday",
"thursday", "friday", "saturday"
};


static string[,] famousCouples = new string[,]{
{ "George", "Martha"}, { "Napolean", "Josephine"},
{ "Westley","Buttercup"}
};

static Weeble[] weebleList = new Weeble[]{
new Weeble("Pilot"), new Weeble("Firefighter")
};

public static void Main() {
//Copying arrays
Weeble[] newList = new Weeble[weebleList.Length];
Array.Copy(weebleList, newList, weebleList.Length);
newList[0] = new Weeble("Nurse");
bool newReferences = newList[0] != weebleList[0];
Console.WriteLine("New references == "
+ newReferences);
//Copying a rectangular array works
string[,] newSquareArray =
new string[famousCouples.GetLength(0),
famousCouples.GetLength(1)];
Array.Copy(famousCouples, newSquareArray,
famousCouples.Length);


Chapter 10: Collecting Your Objects 357
//In-place sorting
string[] sortedDays = new string[dayList.Length];
Array.Copy(dayList, sortedDays, dayList.Length);
Array.Sort(sortedDays);

for (int i = 0; i < sortedDays.Length; i++) {
Console.WriteLine("sortedDays[{0}] = {1}",
i, sortedDays[i]);
}
//Binary search of sorted 1-D Array
int tuesdayIndex =
Array.BinarySearch(sortedDays, "tuesday");
Console.WriteLine(
"dayList[{0}] == \"tuesday\"", tuesdayIndex);
//! int georgeIndex =
//! Array.BinarySearch(famousCouples, "George"};
// Causes compile error

//Reverse
Array.Reverse(sortedDays);
for (int i = 0; i < sortedDays.Length; i++) {
Console.WriteLine(
"Reversed sortedDays[{0}] = {1}",
i, sortedDays[i]);
}

//Quickly erasing an array section,
//even if multidimensional
Array.Clear(famousCouples, 2, 3);
for(int x = 0; x < famousCouples.GetLength(0); x++)
for(int y = 0; y < famousCouples.GetLength(1);
y++)
Console.WriteLine(
"FamousCouples[{0},{1}] = {2}",
x, y, famousCouples[x,y]);

}
}///:~

After declaring a Weeble class (this time with a Name property to make them easier
to distinguish), the ArrayStatics class declares several static arrays – dayList
and weebleList, which are both one-dimensional, and the square
famousCouples array.

358 Thinking in C# www.ThinkingIn.NET
Array.Copy( ) provides a fast way to copy an array (or a portion of it). The new
array contains all new references, so changing a value in your new list will not
change the value in your original, as would be the case if you did:
Weeble[] newList = weebleList;
newList[0] = new Weeble("Nurse");

Array.Copy( ) works with multidimensional arrays, too. The program uses the
GetLength(int) method to allocate sufficient storage for the new SquareArray,
but then uses the famousCouples.Length property to specify the size of the
copy. Although Copy( ) seems to “flatten” multidimensional arrays, using arrays of
different rank will throw a runtime RankException.
The static method Array.Sort( ) does an in-place sort of the array’s contents and
BinarySearch( ) provides an efficient search on a sorted array.
Array.Reverse( ) is self-explanatory, but Array.Clear( ) has the perhaps
surprising behavior of slicing across multidimensional arrays. In the program,
Array.Clear(famousCouples, 2, 3) treats the multidimensional
famousCouples array as a flat array, setting to null the values of indices [1,0],
[1,1], and [2,0].
Array element comparisons
How does Array.Sort( ) work? A problem with writing generic sorting code is that
sorting must perform comparisons based on the actual type of the object. Of course,

one approach is to write a different sorting method for every different type, but you
should be able to recognize that this does not produce code that is easily reused for
new types.
A primary goal of programming design is to “separate things that change from
things that stay the same,” and here, the code that stays the same is the general sort
algorithm, but the thing that changes from one use to the next is the way objects are
compared. So instead of hard-wiring the comparison code into many different sort
routines, the Strategy Pattern is used. In the Strategy Pattern, the part of the code
that varies from case to case is encapsulated inside its own class, and the part of the
code that’s always the same makes a call to the part of the code that changes. That
way you can make different objects to express different strategies of comparison
and feed them to the same sorting code.
In C#, comparisons are done by calling back to the CompareTo( ) method of the
IComparable interface. This method takes another object as an argument, and
produces a negative value if the current object is less than the argument, zero if the

Chapter 10: Collecting Your Objects 359
argument is equal, and a positive value if the current object is greater than the
argument.
Here’s a class that implements IComparable and demonstrates the comparability
by using Array.Sort( ):
//:c10:CompType.cs
// Implementing IComparable in a class.
using System;

public class CompType: IComparable {
int i;
int j;
public CompType(int n1, int n2) {
i = n1;

j = n2;
}
public override string ToString() {
return "[i = " + i + ", j = " + j + "]";
}

public int CompareTo(Object rv) {
int rvi = ((CompType)rv).i;
if (i > rvi)
return 1;
else if (i == rvi)
return 0;
else
return -1;
(i < rvi ? -1 : (i == rvi ? 0 : 1));
}

private static Random r = new Random();

private static void ArrayPrint(String s, Array a){
Console.Write(s);
foreach(Object o in a){
Console.Write(o + ",");
}
Console.WriteLine();
}


360 Thinking in C# www.MindView.net
public static void Main() {

CompType[] a = new CompType[10];
for (int i = 0; i < 10; i++) {
a[i] = new CompType(r.Next(100), r.Next(100));
}
ArrayPrint("Before sorting, a = ", a);
Array.Sort(a);
ArrayPrint("After sorting, a = ", a);
}
} ///:~

When you define the comparison function, you are responsible for deciding what it
means to compare one of your objects to another. Here, only the i values are used
in the comparison, and the j values are ignored.
The Main( ) method creates a bunch of CompType objects that are initialized
with random values and then sorted. If Comparable hadn’t been implemented,
then you’d get an InvalidOperationException thrown at runtime when you tried to
call Array.Sort( ).
What? No bubbles?
In the not-so-distant past, the sort and search methods used in a program were a
matter of constant debate and anguish. In the good old days, even the most trivial
datasets had a good chance of being larger than RAM (or “core” as we used to say)
and required intermediate reads and writes to storage devices that could take, yes,
seconds to access (or, if the tapes needed to be swapped, minutes). So there was an
enormous amount of energy put into worrying about internal (in-memory) versus
external sorts, the stability of sorts, the importance of maintaining the input tape
until the output tape was verified, the “operator dismount time,” and so forth.
Nowadays, 99% of the time you can ignore the particulars of sorting and searching.
In order to get a decent idea of sorting speed, this program requires an array of
1,000,000 elements, and still it executes in a matter of seconds:
//:c10:FastSort.cs

using System;

class Sortable : IComparable {
int i;
internal Sortable(int i) {
this.i = i;
}

Chapter 10: Collecting Your Objects 361

public int CompareTo(Object o) {
try {
Sortable s = (Sortable) o;
return i = s.i;
} catch (InvalidCastException) {
throw new ArgumentException();
}
}
}

class SortingTester {
static TimeSpan TimedSort(IComparable[] s){
DateTime start = DateTime.Now;
Array.Sort(s);
TimeSpan duration = DateTime.Now - start;
return duration;
}
public static void Main() {
for (int times = 0; times < 10; times++) {
Sortable[] s = new Sortable[1000000];

for (int i = 0; i < s.Length; i++) {
s[i] = new Sortable(i);
}
Console.WriteLine("Time to sort already sorted"
+ " array: " + TimedSort(s));
Random rand = new Random();
for (int i = 0; i < s.Length; i++) {
s[i] = new Sortable(rand.Next());
}
Console.WriteLine("Time to sort mixed up array: "
+ TimedSort(s));
}
}
}///:~

The results show that Sort( ) works faster on an already sorted array, which
indicates that behind the scenes, it’s probably using a merge sort instead of
QuickSort. But the sorting algorithm is certainly less important than the fact that a
computer that costs less than a thousand dollars can perform an in-memory sort of
a million-item array! Moore’s Law has made anachronistic an entire field of

362 Thinking in C# www.ThinkingIn.NET
knowledge and debate that seemed, not that long ago, fundamental to computer
programming.
This is an important lesson for those who wish to have long careers in
programming: never confuse the mastery of today’s facts with preparation for
tomorrow’s changes. Within a decade, we will have multi-terabyte storage on the
desktop, trivial access to distributed teraflop processing, and probably specialized
access to quantum computers of significant capability. Eventually, although
probably not within a decade, there will be breakthroughs in user interfaces and

we’ll abandon the keyboard and the monitor for voice and gesture input and
“augmented reality” glasses. Almost all the programming facts that hold today will
be as useless as the knowledge of how to do an oscillating sort with criss-cross
distribution. A programmer must never stand still.
Unsafe arrays
Despite the preceding discussion of the steady march of technical obsolescence, the
facts on the ground often agitate towards throwing away the benefits of safety and
abstraction and getting closer to the hardware in order to boost performance.
Often, the correct solution in this case will be to move out of C# altogether and into
C++, a language which will continue for some time to be the best for the creation of
device drivers and other close-to-the-metal components.
However, manipulating arrays can sometimes introduce bottlenecks in higher-level
applications, such as multimedia applications. In such situations, unsafe code may
be worthwhile. The basic impetus for using unsafe arrays is that you wish to
manipulate the array as a contiguous block of memory, foregoing bounds checking.
As a testbed for exploring performance with unsafe arrays, we’ll use a
transformation that actually has tremendous practical applications. Wavelet
transforms are fascinating and their utility has hardly been scratched. The simplest
transform is probably the two-dimensional Haar transform on a matrix of doubles.
The Haar transform converts a list of values into the list’s average and differences,
so the list {2, 4} is transformed into {3, 1} == {(2 + 4) / 2, ((2 + 4) / 2) – 2}. A two-
dimensional transform just transforms the rows and then the columns, so {{2,
4},{5,6}} becomes {{4.25, .75},{1.25, -0.25}}:

Chapter 10: Collecting Your Objects 363
24
56
31
5.5 0.5
4.25

.75
1.25 25
Horizontal transform
Vertical
transform

Figure 10-2: The Haar transform is a horizontal followed by vertical transform
Wavelets have many interesting characteristics, including being the basis for some
excellent compression routines, but are expensive to compute for arrays that are
typical of multimedia applications, especially because to be useful they are usually
computed log
2
(MIN(dimension size)) times per array!
The following program does such a transform in two different ways, one a safe
method that uses typical C# code and the other using unsafe code.
//:c10:FastBitmapper1.cs
using System;
using System.IO;

namespace FastBitmapper{
public interface Transform{
void HorizontalTransform(double[,] matrix);
void VerticalTransform(double[,] matrix);
}

public class Wavelet {
public void Transform2D(double[,] matrix,
Transform t) {
int minDimension = matrix.GetLength(0);
if (matrix.GetLength(1) < minDimension)

minDimension = matrix.GetLength(1);
int levels =
(int) Math.Floor(Math.Log(minDimension, 2));
Transform2D(matrix, levels, t);
}

public void Transform2D(double[,] matrix,

364 Thinking in C# www.MindView.net
int steps, Transform tStrategy) {
for (int i = 0; i < steps; i++) {
tStrategy.HorizontalTransform(matrix);
tStrategy.VerticalTransform(matrix);
}
}

public void TestSpeed(Transform t) {
Random rand = new Random();
double[,] matrix = new double[2000,2000];
for (int i = 0; i < matrix.GetLength(0); i++)
for (int j = 0; j < matrix.GetLength(1); j++) {
matrix[i,j] = rand.NextDouble();
}
DateTime start = DateTime.Now;
this.Transform2D(matrix, t);
TimeSpan dur = DateTime.Now - start;
Console.WriteLine(
"Transformation with {0} took {1} ",
t.GetType().Name, dur);
}


public static void Main() {
Wavelet w = new Wavelet();
for (int i = 0; i < 10; i++) {
//Get things right first
w.TestSpeed(new SafeTransform());
//Have not defined UnsafeTransform yet
//! w.TestSpeed(new UnsafeTransform());
}
}
}

internal class SafeTransform : Transform {
private void Transform(double[] array) {
int halfLength = array.Length >> 1;
double[] avg = new double[halfLength];
double[] diff = new double[halfLength];
for (int pair = 0; pair < halfLength; pair++) {
double first = array[pair * 2];
double next = array[pair * 2 + 1];

Chapter 10: Collecting Your Objects 365
avg[pair] = (first + next) / 2;
diff[pair] = avg[pair] - first;
}
for (int pair = 0; pair < halfLength; pair++) {
array[pair] = avg[pair];
array[pair + halfLength] = diff[pair];
}
}


public void HorizontalTransform(double[,] matrix) {
int height = matrix.GetLength(0);
int width = matrix.GetLength(1);
double[] row = new double[width];
for (int i = 0; i < height; i++) {
for (int j = 0; j < width; j++) {
row[j] = matrix[i, j];
}
Transform(row);
for (int j = 0; j < width; j++) {
matrix[i,j] = row[j];
}
}
}

public void VerticalTransform(double[,] matrix) {
int height = matrix.GetLength(0);
int length = matrix.GetLength(1);
double[] colData = new double[height];
for (int col = 0; col < length; col++) {
for (int row = 0; row < height; row++) {
colData[row] = matrix[row, col];
}
Transform(colData);
for (int row = 0; row < height; row++) {
matrix[row, col] = colData[row];
}
}
}

}
}///:~


366 Thinking in C# www.ThinkingIn.NET
Get things right…
The cardinal rule of performance programming is to first get the system operating
properly and then worry about performance. The second rule is to always use a
profiler to measure where your problems are, never go with a guess. In an object-
oriented design, after discovering a hotspot, you should always break the problem
out into an abstract data type (an interface) if it is not already. This will allow you to
switch between different implementations over time, confirming that your
performance work is accomplishing something and that it is not diverging from
your correct “safe” work.
In this case, the Wavelet class uses an interface called Transform to perform the
actual work:
Wavelet
Transform
void HorizontalTransform(double[, ] matrix)
void VerticalTransform(double[, ] matrix)

Figure 10-3: The Wavelet class relies on the Transform interface
The Transform interface contains two methods, each of which takes a rectangular
array as a parameter and performs an in-place transformation;
HorizontalTransform( ) converts a row of values into a row containing the
averages and differences of the row, and VerticalTransform( ) performs a
similar transformation on the columns of the array.
The Wavelet class contains two Transform2D( ) methods, the first of which
takes a rectangular array and a Transform. The number of steps required to
perform a full wavelet transform is calculated by first determining the minimum

dimension of the passed-in matrix and then using the Math.Log( ) function to
determine the base-2 magnitude of that dimension. Math.Floor( ) rounds that
magnitude down and the result is cast to the integer number of steps that will be
applied to the matrix. (Thus, an array with a minimum dimension of 4 would have
2 steps; an array with 1024 would have 9.)
The constructor then calls the second constructor, which takes the same
parameters as the first plus the number of times to apply the wavelet (this is a
separate constructor because during debugging a single wavelet step is much easier
to comprehend than a fully processed one, as Figure 10-4 illustrates)

Tài liệu bạn tìm kiếm đã sẵn sàng tải về

Tải bản đầy đủ ngay
×