Nielsen c33.tex V4 - 07/21/2009 2:06pm Page 762
Part V Data Connectivity
Framework is that each provider can be different, in that one provider does not really care about the
other types of data stores or providers. This is all handled by the sync session. When a synchronization
session takes place, data between a source data store and a destination data store are exchanged by
connecting a source provider with a destination provider.
As synchronization takes place, the destination provider supplies its current state to the sync session.
At that point, it accepts a list of changes from the source, detects any conflicts between the data it just
received and its own data, and then applies any changes to its own data store.
Sync knowledge
A term to remember when working with the Sync Framework is sync knowledge. Knowledge is metadata:
the information that the Sync Framework algorithms use to perform change enumeration and conflict
detection. The metadata describes each change that has been or will be applied to a data store,
whether directly or through synchronization. Thus, knowledge assists in the following:
■ Change enumeration: The process of determining which items have been changed on the
source data store but not yet applied to the destination data store (that is, the destination
data store does not have knowledge of the source data store changes).
■ Conflict detection: The process of identifying synchronization conflicts. A synchronization
conflict happens when a change was made on one data store and knowledge of that change
was not transferred to the other data store.
Now, with that in mind, it is important to understand that neither an application nor a synchronization
provider uses sync knowledge directly. Nor should it. That is the job of the Sync Framework. The Sync
Framework will call the necessary methods to initiate operations for them.
However, it is also important to understand how sync knowledge works, so this section briefly describes
sync knowledge operations and how they are used to enumerate and send the changes, as well as detect
conflicts.
There are four synchronization knowledge operations:
■ Contains: Used in both change enumeration and conflict detection, this operation checks
whether the data store that owns the specified data has applied the changes. This is
determined by looking at the specified item version of a knowledge object.
■ Project: Creates a new knowledge object containing the same changes as the original knowl-
edge object. This new object is based on one or more item IDs or a change unit ID. No other
items are included.
■ Exclude: This is the exact opposite of the Project operation. The new knowledge object
includes knowledge about everything except the specified item.
■ Union: Creates a new knowledge object that includes every change contained in either or
both of the original knowledge objects.
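The four operations can be illustrated with a small Python sketch. This is a toy model, not the Sync Framework API: the `Knowledge` class, its dictionary layout (item ID mapped to the highest tick known per replica), and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    """Toy sync knowledge: item ID -> {replica ID: highest tick known}."""

    items: dict = field(default_factory=dict)

    def contains(self, item_id, replica_id, tick):
        """True if this knowledge already covers the given item version."""
        return self.items.get(item_id, {}).get(replica_id, 0) >= tick

    def project(self, item_ids):
        """New knowledge restricted to the listed items; nothing else is kept."""
        return Knowledge({i: dict(v) for i, v in self.items.items() if i in item_ids})

    def exclude(self, item_id):
        """New knowledge about everything except the given item."""
        return Knowledge({i: dict(v) for i, v in self.items.items() if i != item_id})

    def union(self, other):
        """New knowledge covering every change known to either operand."""
        merged = {i: dict(v) for i, v in self.items.items()}
        for item_id, versions in other.items.items():
            target = merged.setdefault(item_id, {})
            for replica_id, tick in versions.items():
                target[replica_id] = max(target.get(replica_id, 0), tick)
        return Knowledge(merged)


# Hypothetical sample data: two replicas, "A" and "B", have touched two items.
office = Knowledge({"item1": {"A": 3}, "item2": {"A": 1}})
laptop = Knowledge({"item1": {"B": 2}, "item2": {"A": 4}})
merged = office.union(laptop)
```

Each operation returns a new object and leaves its inputs untouched, which mirrors the wording of the list above ("creates a new knowledge object").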
These operations are critical in determining synchronization state. For example, what happens when a
synchronization is interrupted or there is a failure in applying changes? What about when a synchro-
nization knowledge object needs to be filtered or restricted?
Sync Framework 33
That is where these operations come in. The Sync Framework uses these operations to provide itself
with the appropriate knowledge necessary to conduct change enumeration and apply the changes.
Change enumeration takes place when the destination provider sends current knowledge of the
destination data store to the source provider. The source provider then goes through all of the items in
the source data store and determines whether the destination knowledge has the same version of the
item contained in the source store. If it does not, then the items are batched and sent to the destination
provider.
Changes are sent from the source provider to the destination provider in batches. Each batch contains
metadata that describes each change in the batch (added by the provider) and current knowledge of the
source data store, which will then be used to detect conflicts.
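Change enumeration as just described can be sketched in a few lines of Python. The function name, the tuple layout of `source_items`, and the batch dictionary keys are hypothetical; the real providers exchange richer metadata.

```python
def enumerate_changes(source_items, destination_knowledge, batch_size=2):
    """Yield batches of source changes that the destination does not know about.

    source_items: {item_id: (replica_id, tick, data)}   (assumed layout)
    destination_knowledge: {item_id: {replica_id: highest_tick_known}}
    """
    # Current knowledge of the source store; it travels with every batch.
    source_knowledge = {}
    for item_id, (replica_id, tick, _data) in source_items.items():
        versions = source_knowledge.setdefault(item_id, {})
        versions[replica_id] = max(versions.get(replica_id, 0), tick)

    # Keep only the items whose version the destination has not seen.
    pending = []
    for item_id, (replica_id, tick, data) in source_items.items():
        known_tick = destination_knowledge.get(item_id, {}).get(replica_id, 0)
        if known_tick < tick:
            pending.append(
                {"item_id": item_id, "version": (replica_id, tick), "data": data}
            )

    # Send the changes in batches, each carrying the source knowledge.
    for start in range(0, len(pending), batch_size):
        yield {
            "changes": pending[start:start + batch_size],
            "made_with_knowledge": source_knowledge,
        }
```

Calling `list(enumerate_changes(items, knowledge))` on a source whose items are all known to the destination yields an empty list, so nothing is sent.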
Think of this as two types of knowledge: made-with knowledge and learned knowledge. When the batch
is created, it is created with current knowledge of the source data store. When those changes are applied
at the destination, the conflicts are logged and tracked by the current knowledge. Therefore, made-with
knowledge can determine what was known when the changes were made, and learned knowledge is
what is learned after the changes are applied.
These two types of knowledge are extremely useful when detecting conflicts. When conflicts arise, the
Sync Framework looks at two things. First, is the change a conflict with the current version of the item
stored in the destination data store? Second, is the destination version of the item superseded by the
current version, thus making the change obsolete?
When the version of the item stored in the destination data store is not contained in the knowledge of
the source data store, then there is a change conflict. If the version carried by the change is already
contained in the knowledge of the destination data store, then the change is obsolete. Simple as that.
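The two checks can be written as one small function. This is a sketch under assumptions: knowledge is modeled as item ID mapped to the highest tick known per replica, and "obsolete" is read as "the version carried by the change is already contained in the destination knowledge". All names are hypothetical.

```python
def contains(knowledge, item_id, version):
    """True if the knowledge covers the given (replica_id, tick) version."""
    replica_id, tick = version
    return knowledge.get(item_id, {}).get(replica_id, 0) >= tick


def classify_change(change, destination_version, destination_knowledge):
    """Return 'obsolete', 'conflict', or 'apply' for one incoming change.

    change: {"item_id": ..., "version": (replica_id, tick), "made_with": knowledge}
    destination_version: (replica_id, tick), or None if the item is new there.
    """
    item_id = change["item_id"]
    # The destination already knows this version, so there is nothing to apply.
    if contains(destination_knowledge, item_id, change["version"]):
        return "obsolete"
    # The source made its change without knowing the destination's version.
    if destination_version is not None and not contains(
        change["made_with"], item_id, destination_version
    ):
        return "conflict"
    return "apply"
```

Checking for obsolescence first is a design choice of this sketch; it keeps a change that the destination has already absorbed from being reported as a conflict.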
Once the changes have been received by the destination and the conflicts detected, the changes are
applied to the destination data store. This is accomplished by modifying the learned knowledge of the
change batch with things that happen at the destination. The knowledge of the destination data store is
then replaced with an updated version that is calculated by the Sync Framework.
Note two things here: The Sync Framework will use the Exclude operation to remove any conflicts that
are not detected or resolved. Also, a recoverable error for each change is set by the Sync Framework if
there is an interruption or cancellation of the synchronization, or if there are digital rights management
(DRM) issues such as locked objects.
Sync fundamentals
One of the key terms when dealing with synchronization is replica. A replica is, in essence, a full or
partial copy of the data source. The key to understanding replication is that two basic but essential
components are used. These components have been discussed previously:
■ The synchronization session
■ The sync providers
In simple terms, synchronization takes place when an application creates a synchronization session, pass-
ing to it both a source provider and a destination provider. The session then manages the exchange of
data between the source data store and destination data store by determining changes at both the source
and the destination.
In terms of the provider, it has already been stated that the provider is the mechanism that contains
metadata and knowledge for the replica, as well as for each item being synchronized. It is the provider
that actually performs the transfer of the data in and out of the replica. The provider has two other
primary responsibilities as well: It itemizes all the changes when the provider is acting as a source, and
it detects conflicts when it is acting as a destination. These two responsibilities are important to ensure
that the correct data is transferred to and from the replica.
Synchronization takes place by using the following algorithm (uni-directional):
1. The application creates a synchronization session.
2. The sync session obtains current destination knowledge and sends it to the source provider.
3. The source provider itemizes changes not present in destination knowledge.
4. The source provider sends changes to the sync session.
5. The sync session detects conflicts and applies changes to the destination via the destination
provider.
Two-way synchronization is accomplished by simply executing two one-way synchronizations.
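The numbered steps, and the way two-way synchronization is built from them, can be sketched as follows. This is a simplified illustration, not the SyncOrchestrator or ISyncSession API: the `Replica` class plays the role of provider and data store at once, knowledge is reduced to one tick counter per replica, and conflicts are resolved with an assumed "source wins" policy.

```python
class Replica:
    """Toy provider plus data store; every name here is illustrative only."""

    def __init__(self, name):
        self.name = name
        self.tick = 0
        self.items = {}       # item_id -> (origin replica, tick, data)
        self.knowledge = {}   # origin replica -> highest tick known

    def local_write(self, item_id, data):
        self.tick += 1
        self.items[item_id] = (self.name, self.tick, data)
        self.knowledge[self.name] = self.tick

    def get_knowledge(self):
        return dict(self.knowledge)

    def enumerate_changes(self, destination_knowledge):
        # Step 3: itemize changes not present in the destination knowledge.
        return {
            item_id: version
            for item_id, version in self.items.items()
            if destination_knowledge.get(version[0], 0) < version[1]
        }

    def apply_changes(self, changes, source_knowledge):
        # Step 5: detect conflicts, then apply the changes.
        conflicts = []
        for item_id, (origin, tick, data) in changes.items():
            current = self.items.get(item_id)
            if current and source_knowledge.get(current[0], 0) < current[1]:
                conflicts.append(item_id)  # the source never saw our version
            self.items[item_id] = (origin, tick, data)  # assumed policy: source wins
        for origin, tick in source_knowledge.items():
            self.knowledge[origin] = max(self.knowledge.get(origin, 0), tick)
        return conflicts


def sync_one_way(source, destination):
    """Steps 1 through 5 for a single direction; returns conflicting item IDs."""
    destination_knowledge = destination.get_knowledge()               # step 2
    changes = source.enumerate_changes(destination_knowledge)         # steps 3 and 4
    return destination.apply_changes(changes, source.get_knowledge()) # step 5


def sync_two_way(first, second):
    """Two-way synchronization is two one-way synchronizations."""
    return sync_one_way(first, second) + sync_one_way(second, first)
```

After `sync_two_way(office, laptop)` both replicas hold the same items. A second call transfers nothing, because each side's knowledge now covers the other's changes.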
The Sync Framework can use managed code and unmanaged code to perform synchronizations.
However, the way in which sessions are managed differs between the two. Managed code uses the
SyncOrchestrator class, whose sole responsibility is to initiate and control synchronization sessions.
When using unmanaged code, the application must use the ISyncSession interface. This interface
controls a synchronization session between providers.
This is important because developers can create their own providers, which necessitates a fair amount of
work. Thus, the approach of using managed code or unmanaged code when developing a provider must
be taken into consideration.
Sync Services for ADO.NET 2.0
With an understanding of the Sync Framework now under your belt, it is time to discuss Sync Services
for ADO.NET 2.0. Earlier in the chapter it was mentioned that Sync Services provides a very powerful
yet flexible API that enables developers to build applications for offline and collaboration scenarios.
The need to support remote and mobile users grows increasingly every day and is becoming more
important for many organizations. As this need grows, so does the need to provide these users with the
same data they have access to when they are working in the office. This presents a problem because
in many cases these users are using a laptop, PDA, or smartphone, and remote access usually means
connecting via a VPN or some other method.
There are several serious downsides to this approach:
■ Data access speeds: When users are in the office, they have direct access to the company’s
high-speed and reliable network. Remote users do not have this luxury. They use either a wireless
or a broadband connection, which can be slow and unreliable depending on the connection
strength.
■ Scalability: More remote users means fewer available resources on existing servers (and
more cost to purchase additional hardware), and more data being transferred (thus, a slower
connection).
■ More points of failure: In a remote scenario, users are relying on several components to
access information. Not just the SQL Server, but their VPN solution, as well as how they are
connecting (wireless, etc.). If any one of these items fails, then users cannot gain access to their
information.
■ Network requirements: Imagine an individual who is constantly on the go as part of his
or her job. Reliable access to data can be difficult or nearly impossible due to dropped
connections and bad service areas.
■ Data persistence: Every piece of data that the client wants to access must be downloaded to
the client; and because there is no way to cache the data, the same data could potentially be
downloaded multiple times.
Necessity facilitates invention. Sync Services is not new, but Sync Services for ADO.NET 2.0 provides
new capabilities that overcome these downsides and give remote workers access to their data
through occasionally connected applications. Occasionally connected applications enable remote users to
access their data at all times and in all places. Continuous access to data, how sweet is that? The user
has real-time access to the data because the data is local, not remote. This is where data synchronization
comes in.
Sync Services enables synchronization of data between two distinct sources of data such as databases.
Data synchronization is the ability to transfer information from one data store to another data store on
a periodic basis, such as from a client database (e.g., a SQL Server Compact 3.5 database) to a server
database (Microsoft SQL Server). There are several advantages to data synchronization:
■ It removes the need for constant network connection.
■ Access to the data is limited only by the speed of the client device.
Because the data is stored locally, the user has continuous access to all of the information stored on the
local device and can access the data as quickly as the device can operate. No lag time, performance, or
reliability issues trying to access data remotely. Thus, access to the data is much faster and more reliable.
Sync Services is part of the Microsoft Sync Framework. As stated earlier in the chapter, Sync Services
provides an easy-to-use API that enables developers to create and distribute applications that require
offline data access and collaboration. The API is extremely flexible; developers can use as much or as
little of the API components as needed to meet their application requirements. Because Sync Services
is part of the Sync Framework, any database that uses Sync Services is able to share and exchange
information with other data sources that are also using Sync Services.
What’s new in Sync Services 2.0
The following list describes the new features and capabilities in Microsoft Sync Services 2.0 and
Synchronization Services for ADO.NET 1.1 (SP1) and 2.0:
■ Peer-to-peer synchronization: Part of the API, this enables applications to engage in
collaboration.
■ Sync Services inclusion in the Microsoft Sync Framework: Sync Services now requires
Microsoft.Synchronization.dll.
■ Device Synchronization: Provides synchronization capabilities between a server database and
SQL Server Compact 3.5 databases on devices
■ SQL Server change tracking: Available in SQL Server 2008, this provides a way to track
changes, either via manual synchronization commands or using the synchronization adapter
builder.
■ SQL Server 2008 data types: New data types in SQL Server 2008 are supported.
■ Synchronization process tracing: Provides the capability to trace and troubleshoot issues
that can be hard to identify.
The following two sections briefly describe offline and collaboration scenarios in a synchronization
environment.
Offline scenarios
More and more applications being developed today use a two-tier, n-tier, or service-based
architecture, and Sync Services for ADO.NET 2.0 fits well into these environments. This match is due
to a very flexible Sync Services API for client and server synchronization, which provides a powerful
set of components for synchronizing data between data services and a local data store. Previously,
the solution involved replicating the database and its schema.
The technology world is also seeing an increase in applications that run on portable or mobile
devices. As stated earlier, the downside to mobile devices is that they (and the applications
that run on them) do not have a reliable connection to the source data, so the need to access the
data locally is becoming increasingly important. It would also be nice to be able to do some form of
synchronization if and when a stable and reliable connection to the central server and data source were
available.
This is where Sync Services comes in. The Sync Services API provides a powerful yet friendly synchro-
nization platform, simply because it is modeled after the ADO.NET data access APIs, which enables the
building of occasionally connected applications as an extension of building always-connected applica-
tions.
It might be helpful at this point to highlight the differences between Sync Services and other technolo-
gies that are designed to be incorporated into the occasionally connected environment. While there are a
few, the two most common are as follows:
■ RDA (remote data access): Used to synchronize a SQL Server Compact 3.5 database with
other editions of SQL Server
■ Merge replication: Used to synchronize different editions of SQL Server, including SQL
Server Compact 3.5
While these technologies are extremely useful and fulfill many needs, they are primarily focused
on Microsoft database technologies. Table 33-1 briefly compares these options in order to help you
determine which technology is appropriate for the type of application you want to build and the
environment in which the application will be used.
A few words about Table 33-1. First, RDA supports incremental uploads. Downloads are done via a
snapshot that updates the client data. Second, if you have done any merge replication, then you know
that it has a built-in conflict-resolution solution. Not so with Sync Services. It provides a mechanism for
building a custom solution for resolving conflicts, meaning the capability is there but you need to build
it yourself.
TABLE 33-1
Comparing RDA, Merge Replication, and Sync Services

Feature                                    RDA    Merge Replication    Sync Services
Sync using services                        No     No                   Yes
Heterogeneous database support             No     No                   Yes
Incremental change tracking                No     Yes                  Yes
Detect and resolve conflicts               No     Yes                  Yes
Create data views on the client            No     No                   Yes
Initialize schema and data automatically   Yes    Yes                  Yes
Large dataset support                      Yes    Yes                  Yes
Transmit schema changes automatically      No     Yes                  No
Automatically repartition data             No     Yes                  No
Use on devices                             Yes    Yes                  Yes

Also keep in mind that the Sync Services architecture for client and server synchronization is
asymmetric, meaning that change tracking is built into the client database. Incremental changes on the
server can be downloaded, but they must be tracked by the programmer.
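One common way for a programmer to track server changes is to stamp each row with an increasing version and remember the highest version already downloaded. The sketch below assumes that approach; the `row_version` column name and the "anchor" terminology are illustrative choices, not something this chapter prescribes.

```python
def changes_since(server_rows, last_anchor):
    """Return the rows changed after last_anchor, plus the new anchor value.

    server_rows: list of dicts, each with a numeric "row_version" key (assumed).
    last_anchor: the highest row_version the client has already downloaded.
    """
    changed = [row for row in server_rows if row["row_version"] > last_anchor]
    # The anchor never moves backward, even if no rows qualify.
    new_anchor = max([last_anchor] + [row["row_version"] for row in server_rows])
    return changed, new_anchor
```

The client would store `new_anchor` after a successful download and pass it back as `last_anchor` on the next synchronization, so each row is downloaded only once per change.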
In determining which path to take when looking at a data synchronization solution, it helps to consider
some of the concepts behind these technologies. For example, not many developers really work with
SQL Server replication technology. That is usually a DBA function and is primarily designed to
keep SQL Server databases in sync with each other.
RDA itself is a great technology, but it is somewhat inferior to Sync Services. A SQL Server Compact 3.5
technology, it is simply intended to provide applications with the capability to access data from a remote
SQL Server database and store it locally. RDA uses a push/pull method to propagate data changes to and
from the SQL Server Compact database.
However, the Microsoft MSDN Books Online document clearly states the following:
Because of design limitations, remote data access (RDA) will be removed in a future release. If
you are currently using RDA, you should consider transitioning to Microsoft Synchronization
Services for ADO.NET. If you were planning to use RDA in a new application, you should
instead consider merge replication or Synchronization Services. Note that Synchronization
Services is available for both desktop and mobile devices.
Sync Services is a richer programming model that includes many features also found in merge repli-
cation technology. It is superior to RDA and, unlike replication, it is targeted toward developers who
want the power and flexibility to access the client data that is based on a server database or another data
source.
And unlike RDA, Sync Services is not limited to just SQL Server databases, as it includes support for
heterogeneous databases. Sync Services also allows for synchronization over services such as WCF (Win-
dows Communication Foundation).
So, it boils down to the following. Use Sync Services if the following apply:
■ The application needs to synchronize with non-SQL Server databases.
■ The application needs separate components to enable synchronization over different transports
or services.
■ The need to replicate a schema and its data from one database to another database is not a
requirement.
■ You want to replicate data without the administrative overhead of merge replication, but with
the core merge engine functionality.
■ The capability to architect and develop a really cool multi-tier or service-based
synchronization application is extremely appealing.
Okay, so I threw in that last one just for the fun of it, but it is really cool.
This section wraps up with a discussion of the architecture and classes needed for offline
synchronization scenarios such as two-tier, n-tier, and service-based architectures.
Sync Services is flexible enough to provide multiple synchronization types, including the following:
■ Snapshot and download-only: Used to store and update reference data. Changes made to
data on the server are downloaded to the client during synchronization. Snapshot synchroniza-
tion does a complete refresh of the data every time the client is synchronized. Download-only
synchronization downloads only the changes that have occurred since the last synchronization.
■ Upload only: Used to insert data on a client. Changes made to the client (such as inserts) are
uploaded to the server during synchronization.
■ Bi-directional: Used for data that can be updated on both the client and the server.
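The difference between these synchronization types comes down to which rows travel in which direction. The following sketch makes that explicit; the enum and the `plan_transfer` function are hypothetical names written for this illustration, not the Sync Services classes themselves.

```python
from enum import Enum


class SyncDirection(Enum):
    SNAPSHOT = "snapshot"
    DOWNLOAD_ONLY = "download-only"
    UPLOAD_ONLY = "upload-only"
    BIDIRECTIONAL = "bi-directional"


def plan_transfer(direction, server_rows, server_changes, client_changes):
    """Return (rows_to_download, rows_to_upload) for one synchronization."""
    if direction is SyncDirection.SNAPSHOT:
        # Complete refresh: every server row is downloaded every time.
        return list(server_rows), []
    if direction is SyncDirection.DOWNLOAD_ONLY:
        # Only the server changes since the last synchronization come down.
        return list(server_changes), []
    if direction is SyncDirection.UPLOAD_ONLY:
        # Only client changes (such as inserts) go up.
        return [], list(client_changes)
    # Bi-directional: changes flow both ways.
    return list(server_changes), list(client_changes)
```

For reference data that rarely changes, the snapshot branch shows why download-only is usually cheaper: it moves `server_changes` rather than all of `server_rows`.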
When synchronizing, Sync Services uses the following assemblies:
■ Microsoft.Synchronization.Data.dll
■ Microsoft.Synchronization.Data.SqlServerCe.dll
■ Microsoft.Synchronization.Data.Server.dll
The Synchronization Agent, Synchronization tables, and Synchronization Groups are found in
Microsoft.Synchronization.Data.dll.
The Client Synchronization Provider is found in Microsoft.Synchronization.Data.SqlServerCe.dll.
The Server Synchronization Provider and Synchronization Adapters are in
Microsoft.Synchronization.Data.Server.dll.
When working with two-tier applications, all of the Sync Services DLLs are located on the client. For
n-tier applications, Microsoft.Synchronization.Data.dll and Microsoft.Synchronization.Data.Server.dll
are located on a separate computer that provides a synchronization service.
The following three illustrations show how the components map to a set of Sync Services classes. Note
the existence of a client database, a server database, and Sync Services classes.
The client database in a Sync Services application is a SQL Server Compact 3.5 database. The server
database can be any database that is supported by an available ADO.NET adapter. Out of the box, Sync
Services provides the capability to track changes in the client database. This infrastructure is enabled the
first time any table is synchronized using any method other than snapshot synchronization.
Figure 33-9 shows a standard two-tier architecture. All of the items shown in the figure correspond to a
Sync Services class, except for the two databases. In a two-tier architecture, a direct connection between
the server and the client is required.
FIGURE 33-9
A standard two-tier architecture
[Figure components: Client DB, Client Sync Provider, Synchronization Tables (Group), Synchronization
Agent, Server Sync Provider, Sync Adapters, Server DB]
An n-tier architecture (see Figure 33-10) is similar to a two-tier architecture with the additional
requirement of a proxy and service, as well as a transport mechanism whose responsibility it is to
facilitate communication between the client and server databases.
An n-tier architecture is becoming more commonplace simply because it does not require the constant
connection between server and client, and it provides a more flexible architecture.
Figure 33-11 illustrates a service-based architecture. This differs from the previous two examples in that
it does not include a server database and corresponding synchronization providers and adapters. Rather,
in a service-based architecture, the application needs to communicate with the Synchronization Agent
through a custom proxy and custom service, such as WCF (Windows Communication Foundation).
The thing to keep in mind is that the custom proxy and custom service must provide the same function-
ality that the synchronization provider and adapter would provide, such as enumerating synchronization
changes.
FIGURE 33-10
An n-tier architecture
[Figure components: Client DB, Client Sync Provider, Synchronization Tables (Group), Synchronization
Agent, Proxy, Transport, Service, Server Sync Provider, Sync Adapters, Server DB]
FIGURE 33-11
A simple service-based architecture
[Figure components: Client DB, Client Sync Provider, Synchronization Tables (Group), Synchronization
Agent, Proxy, Transport, Service, Any Data Source]
The common theme in each of these examples is the existence of the Sync Services classes: SyncAgent,
DbServerSyncProvider, SqlCeClientSyncProvider, SyncAdapter, SyncTable, and SyncGroup. Here is a
quick review of the roles and responsibilities of each of these classes. The Synchronization Agent has the
following responsibilities:
■ Enumerate through each table being synchronized
■ Call the client synchronization provider, retrieving and applying changes at the client database
■ Call the server synchronization provider, retrieving and applying changes at the server database
The Client Synchronization Provider is responsible for the following:
■ Store information on the client about tables that are enabled for synchronization
■ Obtain changes that occurred on the client since the last synchronization
■ Detect conflicting changes
■ Apply changes to the client
The Server Synchronization Provider has similar responsibilities to the Client Synchronization Provider,
but obviously performs all of its necessary actions on the server.
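The division of labor between the agent and the two providers can be sketched as a simple loop. This is a minimal model with hypothetical names (`InMemoryProvider`, `synchronize`); it omits conflict detection, and it is not the SyncAgent, SqlCeClientSyncProvider, or DbServerSyncProvider API.

```python
class InMemoryProvider:
    """Toy provider: stores rows per table and remembers unsynchronized changes."""

    def __init__(self, table_names):
        self.tables = {name: {} for name in table_names}
        self.pending = {name: {} for name in table_names}

    def write(self, table, row_id, row):
        self.tables[table][row_id] = row
        self.pending[table][row_id] = row

    def get_changes(self, table):
        """Return the changes made since the last synchronization and reset them."""
        changes = self.pending[table]
        self.pending[table] = {}
        return changes

    def apply_changes(self, table, changes):
        self.tables[table].update(changes)


def synchronize(table_names, client_provider, server_provider):
    """Play the agent's role: loop over the tables and drive both providers."""
    stats = {"uploaded": 0, "downloaded": 0}
    for table in table_names:
        upload = client_provider.get_changes(table)
        download = server_provider.get_changes(table)
        server_provider.apply_changes(table, upload)    # apply at the server
        client_provider.apply_changes(table, download)  # apply at the client
        stats["uploaded"] += len(upload)
        stats["downloaded"] += len(download)
    return stats
```

Because the agent only talks to the two provider objects, swapping the server provider for a proxy that forwards the same two calls over a transport would not change the loop, which is the idea behind the n-tier and service-based layouts.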
Several other items shown in the preceding figures need to be discussed here. First, a synchronization
table is defined for each table that is included for synchronization. The responsibility of a
synchronization table is to store settings about the table, such as the synchronization direction.
Second, once a synchronization table is defined, it is added to a synchronization group. The purpose of
the synchronization group is to ensure that changes to all of the synchronization tables in the group
are applied reliably. Thus, all the changes are applied in a transaction and are synchronized as a
unit, or a whole, not individually. If a change from one of the tables in the group fails,
then all of the changes are rolled back, and the group is applied again during the next synchronization.
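The all-or-nothing behavior of a synchronization group can be illustrated with a short sketch. It stands in for a database transaction by working on a copy of the data; the function name and the dictionary-based store are assumptions made for this example.

```python
import copy


def apply_group(store, group_changes):
    """Apply the changes for every table in a group, or none of them.

    store: {table_name: {row_id: row}}
    group_changes: {table_name: {row_id: row}}
    Returns True on success. On failure the store is left untouched.
    """
    working = copy.deepcopy(store)  # stands in for an open transaction
    try:
        for table, rows in group_changes.items():
            if table not in working:
                raise KeyError(f"unknown table: {table}")
            working[table].update(rows)
    except KeyError:
        return False  # roll back: discard the working copy
    store.clear()
    store.update(working)  # commit: publish all tables together
    return True
```

When `apply_group` returns False, the caller can retry the same group during the next synchronization, as described above.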
The final section in this chapter discusses collaboration scenarios in a peer-to-peer synchronization
architecture.
Collaboration scenarios
Sync Services for ADO.NET 2.0 provides the capability to perform peer-to-peer synchronization, in which
each peer can synchronize with any other peer. All of this can take place without the need to go
through a central repository or data store, which is what makes this architecture so good for
collaboration scenarios. Peer-to-peer synchronization can also be used when applications do not have a
reliable network connection for offline synchronization.
This section first provides an overview of peer-to-peer synchronization, and then discusses its
architecture.
Like the offline synchronization scenarios, a comparison between Sync Services and other technologies
might be useful here. Table 33-2 highlights the main features of Sync Services and peer-to-peer
transactional replication that will be helpful in determining when to use which technology for building
applications.
TABLE 33-2
Comparing Sync Services to Transactional Replication

Feature                                       Peer-to-Peer Transactional Replication    Sync Services
Synchronize using services                    No                                        Yes
Synchronize with other types of data stores   No                                        Yes
Incremental change tracking                   Yes                                       Yes
Conflict detection and resolution             Yes                                       Yes
Support large datasets                        Yes                                       Yes
Automatically initialize schema and data      Yes                                       No
Automatically propagate schema changes        Yes                                       No