Tải bản đầy đủ (.pdf) (22 trang)

Tài liệu Grid Computing P22 doc

Bạn đang xem bản rút gọn của tài liệu. Xem và tải ngay bản đầy đủ của tài liệu tại đây (515.53 KB, 22 trang )

22
NaradaBrokering: an event-based
infrastructure for building scalable
durable peer-to-peer Grids
Geoffrey Fox and Shrideep Pallickara
Indiana University, Bloomington, Indiana, United States
22.1 INTRODUCTION
The peer-to-peer (P2P) style interaction [1] model facilitates sophisticated resource shar-
ing environments between ‘consenting’ peers over the ‘edges’ of the Internet; the ‘disrup-
tive’ [2] impact of which has resulted in a slew of powerful applications built around
this model. Resources shared could be anything – from CPU cycles, exemplified by
SETI@home (extraterrestrial life) [3] and Folding@home (protein folding) [4] to files
(Napster and Gnutella [5]). Resources in the form of direct human presence include col-
laborative systems (Groove [6]) and Instant Messengers (Jabber [7]). Peer ‘interactions’
involve advertising resources, search and subsequent discovery of resources, request for
access to these resources, responses to these requests and exchange of messages between
peers. An overview of P2P systems and their deployments in distributed computing and
Grid Computing – Making the Global Infrastructure a Reality. Edited by F. Berman, A. Hey and G. Fox

2003 John Wiley & Sons, Ltd ISBN: 0-470-85319-0
580
GEOFFREY FOX AND SHRIDEEP PALLICKARA
collaboration can be found in Reference [8]. Systems tuned towards large-scale P2P sys-
tems include Pastry [9] from Microsoft, which provides an efficient location and routing
substrate for wide-area P2P applications. Pastry provides a self-stabilizing infrastructure
that adapts to the arrival, departure and failure of nodes. FLAPPS [10], a Forwarding
Layer for Application-level Peer-to-Peer Services, is based on the general ‘peer inter-
networking’ model in which routing protocols propagate availability of shared resources
exposed by remote peers. File replications and hoarding services are examples in which
FLAPPS could be used to relay a source peer’s request to the closest replica of the shared
resource. The JXTA [11] (from juxtaposition) project at Sun Microsystems is another


research effort that seeks to provide such large-scale P2P infrastructures. Discussions per-
taining to the adoption of event services as a key building block supporting P2P systems
can be found in References [8, 12].
We propose an architecture for building a scalable, durable P2P Grid comprising
resources such as relatively static clients, high-end resources and a dynamic collection of
multiple P2P subsystems. Such an infrastructure should draw upon the evolving ideas
of computational Grids, distributed objects, Web services, peer-to-peer networks and
message-oriented middleware while seamlessly integrating users to themselves and to
resources, which are also linked to each other. We can abstract such environments as a
distributed system of ‘clients’, which consist either of ‘users’ or ‘resources’ or proxies
thereto. These clients must be linked together in a flexible, fault-tolerant, efficient, high-
performance fashion. We investigate the architecture, comprising a distributed brokering
system that will support such a hybrid environment. In this chapter, we study the event bro-
kering system – NaradaBrokering – that is appropriate to link the clients (both users and
resources of course) together. For our purposes (registering, transporting and discovering
information), events are just messages – typically with time stamps. The event brokering
system NaradaBrokering must scale over a wide variety of devices – from handheld com-
puters at one end to high-performance computers and sensors at the other extreme. We
have analyzed the requirements of several Grid services that could be built with this model,
including computing and education and incorporated constraints of collaboration with
a shared event model. We suggest that generalizing the well-known publish–subscribe
model is an attractive approach and this is the model that is used in NaradaBroker-
ing. Services can be hosted on such a P2P Grid with peer groups managed locally and
arranged into a global system supported by core servers. Access to services can then
be mediated either by the ‘broker middleware’ or alternatively by direct P2P interactions
between machines ‘on the edge’. The relative performance of each approach (which could
reflect computer/network cycles as well as the existence of firewalls) would be used in
deciding on the implementation to use. P2P approaches best support local dynamic inter-
actions; the distributed broker approach scales best globally but cannot easily manage
the rich structure of transient services, which would characterize complex tasks. We use

our research system NaradaBrokering as the distributed brokering core to support such a
hybrid environment. NaradaBrokering is designed to encompass both P2P and the tradi-
tional centralized middle-tier style of interactions. This is needed for robustness (since P2P
interactions are unreliable and there are no guarantees associated with them) and dynamic
resources (middle-tier style interactions are not natural for very dynamic clients and
resources). This chapter describes the support for these interactions in NaradaBrokering.
NARADABROKERING
581
There are several attractive features in the P2P model, which motivate the development
of such hybrid systems. Deployment of P2P systems is entirely user-driven obviating the
need for any dedicated management of these systems. Peers expose the resources that they
are willing to share and can also specify the security strategy to do so. Driven entirely on
demand a resource may be replicated several times; a process that is decentralized and
one over which the original peer that advertised the resource has sometimes little control.
Peers can form groups with the fluid group memberships. In addition, P2P systems tend
to be very dynamic with peers maintaining an intermittent digital presence. P2P systems
incorporate schemes for searching and subsequent discovery of resources. Communication
between a requesting peer and responding peers is facilitated by peers en route to these
destinations. These intermediate peers are thus made aware of capabilities that exist at
other peers constituting dynamic real-time knowledge propagation. Furthermore, since
peer interactions, in most P2P systems, are XML-based, peers can be written in any
language and can be compiled for any platform. There are also some issues that need to
be addressed while incorporating support for P2P interactions. P2P interactions are self-
attenuating with interactions dying out after a certain number of hops. These attenuations
in tandem with traces of the peers, which the interactions have passed through, eliminate
the continuous echoing problem that results from loops in peer connectivity. However,
attenuation of interactions sometimes prevents peers from discovering certain services that
are being offered. This results in P2P interactions being very localized. These attenuations
thus mean that the P2P world is inevitably fragmented into many small subnets that are
not connected. Peers in P2P systems interact directly with each other and sometimes use

other peers as intermediaries in interactions. Specialized peers are sometimes deployed to
enhance routing characteristics. Nevertheless, sophisticated routing schemes are seldom
in place and interactions are primarily through simple forwarding of requests with the
propagation range being determined by the attenuation indicated in the message.
NaradaBrokering must support many different frameworks including P2P and cen-
tralized models. Though native NaradaBrokering supports this flexibility we must also
expect that realistic scenarios will require the integration of multiple brokering schemes.
NaradaBrokering supports this hybrid case through gateways to the other event worlds.
In this chapter we look at the NaradaBrokering system and its standards-based exten-
sions to support the middle-tier style and P2P style interactions. This chapter is organized
as follows; in Section 22.2 we provide an overview of the NaradaBrokering system.
In Section 22.3 we outline NaradaBrokering’s support for the Java Message Service
(JMS) specification. This section also outlines NaradaBrokering’s strategy for replac-
ing single-server JMS systems with a distributed broker network. In Section 22.4 we
discuss NaradaBrokering’s support for P2P interactions, and in Section 22.5 we discuss
NaradaBrokering’s integration with JXTA.
22.2 NARADABROKERING
NaradaBrokering [13–18] is an event brokering system designed to run a large
network of cooperating broker nodes while incorporating capabilities of content-
based routing and publish/subscribe messaging. NaradaBrokering incorporates protocols
582
GEOFFREY FOX AND SHRIDEEP PALLICKARA
for organizing broker nodes into a cluster-based topology. The topology is then
used for incorporating efficient calculation of destinations, efficient routing even in
the presence of failures, provisioning of resources to clients, supporting application
defined communications scope and incorporating fault-tolerance strategies. Strategies for
adaptive communication scheduling based on QoS requirements, content type, networking
constraints (such as presence of firewalls, MBONE [19] support or the lack thereof)
and client-processing capabilities (from desktop clients to Personal Digital Assistant
(PDA) devices) are currently being incorporated into the system core. Communication

within NaradaBrokering is asynchronous, and the system can be used to support
different interactions by encapsulating them in specialized events. Events are central in
NaradaBrokering and encapsulate information at various levels as depicted in Figure 22.1.
Clients can create and publish events, specify interests in certain types of events and
receive events that conform to specified templates. Client interests are managed and used
by the system to compute destinations associated with published events. Clients, once they
specify their interests, can disconnect and the system guarantees the delivery of matched
events during subsequent reconnects. Clients reconnecting after prolonged disconnects,
connect to the local broker instead of the remote broker that it was last attached to.
This eliminates bandwidth degradations caused by heavy concentration of clients from
disparate geographic locations accessing a certain known remote broker over and over
again. The delivery guarantees associated with individual events and clients are met even
in the presence of failures. The approach adopted by the Object Management Group
(OMG) is one of establishing event channels and registering suppliers and consumers to
those channels. The channel approach in the CORBA Event Service [20] could however
entail clients (consumers) to be aware of a large number of event channels.
22.2.1 Broker organization and small worlds behavior
Uncontrolled broker and connection additions result in a broker network that is suscepti-
ble to network partitions and that is devoid of any logical structure making the creation of
Source
Destinations
Event descriptors
Content descriptors
Content payload
Event distribution traces /
Time To Live (TTL)
Event origins
Explicit
destinations
Used to compute

destinations
Used for eliminating
continuous echoing/
attenuation of event.
Used to handle
content
Figure 22.1 Event in NaradaBrokering.
NARADABROKERING
583
efficient broker network maps (BNM) an arduous if not impossible task. The lack of this
knowledge hampers development of efficient routing strategies, which exploits the broker
topology. Such systems then resort to ‘flooding’ the entire broker network, forcing clients
to discard events they are not interested in. To circumvent this, NaradaBrokering incorpo-
rates a broker organization protocol, which manages the addition of new brokers and also
oversees the initiation of connections between these brokers. The node organization pro-
tocol incorporates Internet protocol (IP) discriminators, geographical location, cluster size
and concurrent connection thresholds at individual brokers in its decision-making process.
In NaradaBrokering, we impose a hierarchical structure on the broker network, in
which a broker is part of a cluster that is part of a super-cluster, which in turn is part
of a super-super-cluster and so on. Clusters comprise strongly connected brokers with
multiple links to brokers in other clusters, ensuring alternate communication routes during
failures. This organization scheme results in ‘small world networks’ [21, 22] in which
the average communication ‘pathlengths’ between brokers increase logarithmically with
geometric increases in network size, as opposed to exponential increases in uncontrolled
settings. This distributed cluster architecture allows NaradaBrokering to support large
heterogeneous client configurations that scale to arbitrary size. Creation of BNMs and the
detection of network partitions are easily achieved in this topology. We augment the BNM
hosted at individual brokers to reflect the cost associated with traversal over connections,
for example, intracluster communications are faster than intercluster communications.
The BNM can now not only be used to compute valid paths but also to compute shortest

paths. Changes to the network fabric are propagated only to those brokers that have
their broker network view altered. Not all changes alter the BNM at a broker and those
that do result in updates to the routing caches, containing shortest paths, maintained at
individual brokers.
22.2.2 Dissemination of events
Every event has an implicit or explicit destination list, comprising clients, associated with
it. The brokering system as a whole is responsible for computing broker destinations
(targets) and ensuring efficient delivery to these targeted brokers en route to the intended
client(s). Events as they pass through the broker network are to be updated to snapshot its
dissemination within the network. The event dissemination traces eliminate continuous
echoing and in tandem with the BNM – used for computing shortest paths – at each
broker, is used to deploy a near optimal routing solution. The routing is near optimal
since for every event the associated targeted set of brokers are usually the only ones
involved in disseminations. Furthermore, every broker, either targeted or en route to one,
computes the shortest path to reach target destinations while employing only those links
and brokers that have not failed or have not been failure-suspected. In the coming years,
increases in communication bandwidths will not be matched by commensurately reduced
communication latencies [23]. Topology-aware routing and communication algorithms
are needed for efficient solutions. Furthermore, certain communication services [24] are
feasible only when built on top of a topology-aware solution. NaradaBrokering’s routing
solution thus provides a good base for developing efficient solutions.
584
GEOFFREY FOX AND SHRIDEEP PALLICKARA
22.2.3 Failures and recovery
In NaradaBrokering, stable storages existing in parts of the system are responsible for
introducing state into the events. The arrival of events at clients advances the state asso-
ciated with the corresponding clients. Brokers do not keep track of this state and are
responsible for ensuring the most efficient routing. Since the brokers are stateless, they
can fail and remain failed forever. The guaranteed delivery scheme within NaradaBroker-
ing does not require every broker to have access to a stable store or database management

system (DBMS). The replication scheme is flexible and easily extensible. Stable storages
can be added/removed and the replication scheme can be updated. Stable stores can fail
but they do need to recover within a finite amount of time. During these failures, the
clients that are affected are those that were being serviced by the failed storage.
22.2.4 Support for dynamic topologies
Support for local broker accesses, client roams and stateless brokers provide an envi-
ronment extremely conducive to dynamic topologies. Brokers and connections could be
instantiated dynamically to ensure efficient bandwidth utilizations. These brokers and con-
nections are added to the network fabric in accordance with rules that are dictated by the
agents responsible for broker organization. Brokers and connections between brokers can
be dynamically instantiated on the basis of the concentration of clients at a geographic
i
4
5
6
l
13 14
15
j
7
8
9
h
1
2
3
k
10
11
12

m
16
17
18
n
20
21
19
22
Measuring
subscriber
Publisher
Figure 22.2 Test topology.
NARADABROKERING
585
location and also on the basis of the content that these clients are interested in. Similarly,
average pathlengths for communication could be reduced by instantiating connections to
optimize clustering coefficients within the broker network. Brokers can be continuously
added or can fail and the broker network can undulate with these additions and failures of
brokers. Clients could then be induced to roam to such dynamically created brokers for
optimizing bandwidth utilization. A strategy for incorporation of dynamic self-organizing
overlays similar to MBONE [19] and X-Bone [25] is an area for future research.
22.2.5 Results from the prototype
Figure 22.3 illustrates some results [14, 17] from our initial research in which we studied
the message delivery time as a function of load. The results are from a system compris-
ing 22 broker processes and 102 clients in the topology outlined in Figure 22.2. Each
broker node process is hosted on one physical Sun SPARC Ultra-5 machine (128 MB
RAM, 333 MHz), with no SPARC Ultra-5 machine hosting two or more broker node pro-
cesses. The publisher and the measuring subscriber reside on the same SPARC Ultra-5
machine. In addition, there are 100 subscribing client processes with 5 client processes

attached to every other broker node (broker nodes 22 and 21 do not have any other clients
besides the publisher and the measuring subscriber, respectively) within the system. The
100 client node processes all reside on a SPARC Ultra-60 (512 MB RAM, 360 MHz)
Transit delay under different matching rates: 22 brokers 102 clients
Match rate = 100%
Match rate = 50%
Match rate = 10%
0
100
200
300
400
500
600
700
800
900
1000
Publish rate
(Events/second)
0
50
100
150
200
250
300
350
400
450

500
Event size
(bytes)
0
50
100
150
200
250
300
350
400
450
Mean
transit delay
(ms)
s
Figure 22.3 Transit delays for different matching rates.
586
GEOFFREY FOX AND SHRIDEEP PALLICKARA
machine. The run-time environment for all the broker node and client processes is Solaris
JVM (JDK 1.2.1, native threads, JIT). The three matching values correspond to the
percentages of messages that are delivered to any given subscriber. The 100% case corre-
sponds to systems that would flood the broker network. The system performance improves
significantly with increasing selectivity from subscribers. We found that the distributed
network scaled well with adequate latency (2 ms per broker hop) unless the system became
saturated at very high publish rates. We do understand how a production version of the
NaradaBrokering system could give significantly higher performance – about a factor of
3 lower in latency than the prototype. By improving the thread scheduling algorithms and
incorporating flow control (needed at high publish rates), significant gains in performance

can be achieved. Currently, we do not intend to incorporate any non-Java modules.
22.3 JMS COMPLIANCE IN NARADABROKERING
Industrial strength solutions in the publish/subscribe domain include products like
TIB/Rendezvous [26] from TIBCO and SmartSockets [27] from Talarian. Other related
efforts in the research community include Gryphon [28], Elvin [29] and Sienna [30].
The push by Java to include publish–subscribe features into its messaging middleware
include efforts such as Jini and JMS. One of the goals of JMS is to offer a unified
Application Programming Interface (API) across publish–subscribe implementations. The
JMS specification [31] results in JMS clients being vendor agnostic and interoperating
with any service provider; a process that requires clients to incorporate a few vendor
specific initialization sequences. JMS does not provide for interoperability between
JMS providers, though interactions between clients of different providers can be
achieved through a client that is connected to the different JMS providers. Various
JMS implementations include solutions such as SonicMQ [32] from Progress, JMQ from
iPlanet and FioranoMQ from Fiorano. Clients need to be able to invoke operations as
specified in the specification; expect and partake from the logic and the guarantees that
go along with these invocations. These guarantees range from receiving only those events
that match the specified subscription to receiving events that were published to a given
topic irrespective of the failures that took place or the duration of client disconnect.
Clients are built around these calls and the guarantees (implicit and explicit) that are
associated with them. Failure to conform to the specification would result in clients
expecting certain sequences/types of events and not receiving those sequences, which in
turn lead to deviations that could result in run-time exceptions.
22.3.1 Rationale for JMS compliance in NaradaBrokering
There are two objectives that we meet while providing JMS compliance within NaradaBro-
kering:
Providing support for JMS clients within the system: This objective provides for JMS-
based systems to be replaced transparently by NaradaBrokering and also for NaradaBro-
kering clients (including those from other frameworks supported by NaradaBrokering

×