
Scalable VoIP Mobility: Integration and Deployment - P35


340 Chapter 8
www.newnespress.com
parties, although everyone must support 32 and 64 is recommended. The idea of the
window is that the receiver keeps the highest sequence number that it has seen from a
packet that has been successfully authenticated. (Forgeries may try to push the window
around, and so must be ignored for setting the window.) Any packet received with a
sequence number older than the current receive one, minus the window size, is dropped
right away. That leaves the packets in the middle of the window. For those packets, a list of
sequence numbers already seen is kept. If the packet with the same sequence number comes
in twice, the second one is dropped. Otherwise, the packet is allowed in and its sequence
number recorded.
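The window logic just described can be sketched in a few lines of Python. This is a minimal illustration, not an IPsec implementation: the class name, the method name, and the choice to treat the window as covering the `size` most recent sequence numbers are all assumptions made for the example.

```python
class ReplayWindow:
    """Toy anti-replay check; names and window size are illustrative."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0    # highest authenticated sequence number seen
        self.seen = set()   # sequence numbers accepted inside the window

    def accept(self, seq, authenticated=True):
        # Forgeries must never move the window, so reject them first.
        if not authenticated:
            return False
        # Too old: the number has fallen off the trailing edge of the window.
        if seq <= self.highest - self.size:
            return False
        # Duplicate of a packet already accepted inside the window.
        if seq in self.seen:
            return False
        self.seen.add(seq)
        if seq > self.highest:
            self.highest = seq
            # Forget numbers that are no longer inside the window.
            self.seen = {s for s in self.seen if s > self.highest - self.size}
        return True
```

A receiver would call `accept` only after the packet's authentication check has run, passing the result in as `authenticated`, so that an unauthenticated packet can neither advance the window nor mark a sequence number as used.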
IPsec is flexible enough to allow for a number of different encryption and authentication
protocols to be negotiated. Common encryption protocols are 3DES-CBC and AES-CBC. A
common authentication protocol is HMAC-SHA1. Recall that an HMAC is a special type
of signature that requires a shared secret key to validate. When a message is received, the key and
the packet data together produce the signature, which is then compared to the one on the
packet. If they match, the sender has the right key. So the possession of the key by the
sender is proof of the authenticity of the packet.
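As a minimal sketch of that check, the following uses Python's standard `hmac` and `hashlib` modules. The function names `sign` and `verify` are chosen for the example, and the sketch compares the full 20-byte SHA-1 output rather than modeling the exact packet format IPsec uses.

```python
import hashlib
import hmac


def sign(key: bytes, packet: bytes) -> bytes:
    # The key and the packet data together produce the signature.
    return hmac.new(key, packet, hashlib.sha1).digest()


def verify(key: bytes, packet: bytes, tag: bytes) -> bool:
    # Recompute the signature and compare it with the one on the packet.
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(sign(key, packet), tag)
```

A tag produced with one key fails verification under any other key, which is what makes possession of the key serve as proof of authenticity.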
8.3.1.1 IPsec Key Negotiation
Because IPsec is only a transport, there must be a protocol to set up the tunnels. The
simplest protocol allowed is to use none, and IPsec connections are allowed to be set up on
both sides manually. However, it is usually far simpler for management of the connections
to use some sort of user authentication and negotiation protocol.
The Internet Security Association and Key Management Protocol (ISAKMP) is used
between devices to negotiate the type of IPsec connection and to establish the security
association. The two endpoints decide on the type of tunnel, the type of encryption or
authentication algorithm to use, and other parameters using this protocol. ISAKMP is
defined in RFC 2408 and uses UDP port 500 for its communication. Related to ISAKMP is
the key exchange protocol itself, the Internet Key Exchange protocol (IKE). IKE takes care
of the key exchange portion of the setup, and thus piggybacks with ISAKMP as a part of
the setup.


ISAKMP has similar message exchanges as the other security negotiation protocols do,
including certificate requests and responses, nonce exchanges, and capabilities exchange.
ISAKMP is a complex protocol; it would not be useful to go into the same level of detail
for ISAKMP here as it was for TLS. However, let us take a look at the basic exchange.
The first phase for ISAKMP is for one endpoint—usually a VPN client—to reach out to the
authenticating server. The first message sent contains a significant amount of information.
When using Aggressive mode, which is common because it reduces the number of messages
that have to be exchanged, the first message contains nearly everything needed in one shot
to set up the connection. The first major piece of information in this message is the proposal
list for the algorithms to use for authentication or encryption, for the ISAKMP/IKE
exchange itself. This list is an ordered set of the combinations of encryption and
authentication algorithms that the client wishes to request, as well as the
authentication methods and the expected key lifetimes. The next important set of
information kicks off the key exchange. This key exchange starts off the key negotiation,
using Diffie-Hellman keys to create a session key. A nonce is included, followed by the
identification of the endpoint, and a number of options.
The next phase, in Aggressive mode, comes back from the server. The response selects the IPsec
encryption and authentication that will be used, and how the user is to authenticate.
Following this is a nonce, then the server’s identity. Options conclude the packet.
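The proposal-and-selection step can be pictured with a small sketch. It assumes, purely for illustration, that each proposal is just a pair of algorithm names and that the server picks the first client proposal it also supports; real ISAKMP proposals carry more attributes (such as lifetimes), and the selection policy belongs to the responder.

```python
def select_proposal(client_proposals, server_supported):
    """Return the first client proposal the server supports, or None.

    client_proposals: list of (encryption, authentication) pairs, in the
    client's order of preference.
    server_supported: set of pairs the server is willing to use.
    """
    for proposal in client_proposals:
        if proposal in server_supported:
            return proposal
    return None


# Hypothetical example values using the algorithm names from the text.
client = [("AES-CBC", "HMAC-SHA1"), ("3DES-CBC", "HMAC-SHA1")]
server = {("3DES-CBC", "HMAC-SHA1")}
chosen = select_proposal(client, server)
```

Here the client prefers AES-CBC, but the server supports only the second combination, so that one is selected. If nothing matches, no security association can be set up.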
At this point, the ISAKMP/IKE session can be encrypted. The third phase involves the two
endpoints establishing the IPsec security association proper. The two endpoints select what
IPsec authentication and encryption mechanism they will use.
After this has completed, the information is pushed down to set up the IPsec connections
themselves.
User (not packet) authentication can occur in one of a couple of ways. Each side can have a
preshared key, which is then used in the validation of the ISAKMP session. Or, each side
can use certificates. The certificate-based scheme is undoubtedly more secure, but is harder
to manage. This is precisely the same tradeoff that is seen in link protection, such as
using 802.1X versus preshared keys for Wi-Fi.
8.3.2 Application-Specific Encryption: SIPS and SRTP
The difficulty of using end-to-end encryption is that all of the endpoints must support it,
which may not be the case. Instead, protocol-based security can be used.
For SIP, one option is to use SIP over TCP, protected by a TLS session. This is identical to
the approach HTTP uses for protection, by using TCP and requiring a TLS negotiation first.
The advantage to doing this is simplicity, as TLS is a well-understood technology, and
vendors do not have a difficult time implementing it. Furthermore, using TLS allows the
voice mobility administrator to enable the built-in SIP authentication system, based on
HTTP digest authentication, without fear of eavesdropping. Using SIP authentication greatly decreases
the complexity of an authentication-based network, because SIP clients are far more likely
to support it out of the box. The major issue with SIPS, or SIP with TLS, is that the
processing requirements on the PBXs go up significantly, which may affect the scale that
the PBX can operate at.
Protecting SIP does nothing to protect the payload. To protect the bearer channel, SRTP is
an option. SRTP uses AES (and only AES) encryption to encrypt each RTP packet. The
AES encryption is used in a stream setting, by running in counter mode, which ensures that
AES can be restarted if intervening packets are lost. SRTP is effective, in that it protects the
packets from eavesdropping and modification.
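The reason counter mode tolerates packet loss can be shown with a toy model. The sketch below is not SRTP and is not secure: to stay within the Python standard library it substitutes HMAC-SHA256 for the AES block cipher, and the way the counter is built from a packet index is an assumption made for the example. What it does show is the structural point: the keystream for a packet depends only on the key and that packet's own index, never on earlier packets.

```python
import hashlib
import hmac


def keystream(key: bytes, packet_index: int, length: int) -> bytes:
    """Keystream derived only from the key and the packet index.

    HMAC-SHA256 stands in for AES here (an assumption for illustration).
    """
    out = b""
    block = 0
    while len(out) < length:
        counter = packet_index.to_bytes(8, "big") + block.to_bytes(4, "big")
        out += hmac.new(key, counter, hashlib.sha256).digest()
        block += 1
    return out[:length]


def crypt(key: bytes, packet_index: int, data: bytes) -> bytes:
    # XOR with the keystream; the same call encrypts and decrypts.
    ks = keystream(key, packet_index, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))
```

Because nothing is carried over from one packet to the next, a receiver that never saw packets 0 through 2 can still decrypt packet 3 on its own.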
8.3.3 Consequences of End-to-End Security
There is a major consequence of using end-to-end security. Devices that may not have
been built for fast cryptographic operation, such as PBXs and media gateways, will be
forced to use computationally expensive protocols on the real-time voice path to ensure
privacy. The protocols themselves are not ubiquitously supported. For example, IPsec is
common for VPNs, and can be used from router to router or even laptop to laptop, but
phones are unlikely to have a VPN client at all, let alone one that is fast enough to be
appropriate for real-time voice traffic. PBXs are less likely to support IPsec. Using
application-specific security makes more sense in this case, but even then, the protocols are
relatively new, and are not commonly deployed. For this reason, it is usually far easier to
dispense with end-to-end security, and instead to focus on protecting the mobile, exposed
portion of the network.
8.4 Protecting the Pipe
The pipe, in voice mobility networks, can be a number of things. When the mobility
network is heavily wireline, the problem becomes authenticating over Ethernet. (Encryption
for wireline networks is considered less necessary.) When voice mobility uses Wi-Fi, the
problem transforms into finding the right WPA2 settings for both authentication and
encryption. When traffic is coming in from the outside world, using fixed-mobile
convergence solutions or remote access clients, the pipe that needs protecting crosses
the Internet.
The advantage of protecting the pipe, and not the entire path, is that the part of the path
most vulnerable can be addressed using specific, dedicated security infrastructure, whereas
the less vulnerable parts can be placed in physically or logically secure networks. This
allows “legacy” voice mobility equipment to have a high chance of operating with strong
security.
For wireline networks, especially dedicated wireline voice networks that have exposed
jacks, one of the major concerns is that someone might plug into the voice network rather
than the data network, either by mistake or to cause mischief. To preserve the sanctity of the
voice wireline network at the edge, one solution available is to use 802.1X on the wireline
ports. 802.1X works on wireline in almost the same way as it does for Wi-Fi. (See Chapter
5 for details.) The major difference is that the end of the 802.1X EAP exchange does not
lead to a continuation into any sort of key exchange or encrypted session. Rather, the edge
switch, acting as authenticator, unlocks the port for use for more than just authentication.
The issue with using 802.1X authentication for wireline networks is that the desktop phones
may not support it. In that case, a practical, though not terribly secure, alternative is to
implement MAC address filtering on the switch. This can be done per port, or better,
switch-wide. The goal is to let only phones onto the network, and to ensure that any traffic
that ends up on the network that comes from an accidentally connected device is dropped
before it starts consuming resources. One of the biggest concerns on that front is that a
client may come in and exhaust the phones’ DHCP address space. This can happen when
the accidentally connected laptop is looking for an IP address in a specific range, and gets a
completely different one from the DHCP server. If the client ends up rejecting that address
for being in, say, a private address range that it has been configured not to use, there is a
chance that the device will try again. If this happens enough times, the DHCP server will
lose all of its addresses, and any phones that get plugged in or introduced to the network
wirelessly will not be able to gain basic connectivity.
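The exhaustion scenario can be modeled with a toy address pool. This is a deliberately simplified sketch: the class and function names are made up for the example, the addresses are arbitrary, and it assumes the server sets aside every address a client rejects. How long a real DHCP server holds such addresses depends on the server and its configuration.

```python
class ToyDhcpServer:
    """Toy address pool; not a model of any particular DHCP server."""

    def __init__(self, pool_size):
        self.free = [f"10.0.0.{i}" for i in range(1, pool_size + 1)]
        self.unavailable = []

    def offer(self):
        # Hand out the next free address, or None when the pool is empty.
        return self.free.pop(0) if self.free else None

    def decline(self, address):
        # Assumption: a rejected address is set aside rather than reused.
        self.unavailable.append(address)


def misconfigured_laptop(server, max_tries=100):
    """Keep asking and rejecting, as a device unhappy with its range might."""
    tries = 0
    while tries < max_tries:
        address = server.offer()
        if address is None:
            break
        server.decline(address)
        tries += 1
    return tries
```

After the laptop has rejected every address in a five-address pool, the next `offer()` call, which stands in for a phone asking for a lease, returns `None`.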
Wi-Fi security is a must. Chapter 5 went into significant detail on how preshared keys work,
compared to usernames or certificates. The advantage of using Wi-Fi’s own security, rather
than an end-to-end piece, is that the phone is likely to have a high-performance security
function built into the Wi-Fi chip, just for WPA2. This is because Wi-Fi certification
requires that every device support WPA2, and every Wi-Fi chipset manufacturer embeds
just that process into the chips that they make. Phone manufacturers need only turn on those
features; there is no heavy lifting that needs to be done. Compare this to SRTP, for example,
which requires that the voice coder engine, which is usually an optimized engine for
producing real-time payloads, must also either know how to encrypt the traffic by itself or
must pass it along to a slower software process to encrypt. This can cause significant battery
drain on the phone, if such a configuration is even supported.
FMC solutions call for the use of a remote security product. Again, the physical and resource
capabilities of the phone come into play here. Some phones do, in fact, have VPN clients,
which can be used for access into the enterprise. These VPNs terminate long before
enterprise server infrastructure is reached. Running voice protocols, using FMC soft clients,
over the VPN can make sense, although the common mode of operation is for voice to
remain on the mobile operator’s network and for data to go through the VPN. One
interesting twist, however, is that voice devices that have VPN capabilities must have the
VPN logged into when the user is on the road. They can sometimes be configured to log in
by themselves, but more often than not, they must be enabled manually. This is especially
important for converged, dual-mode phones that also operate within the enterprise. When in
the enterprise, associated to the corporate Wi-Fi network, there is no need or benefit for
enabling the VPN link. In fact, the VPN server may not even be accessible from within the
network, as its major interface is meant to point outside, to the Internet. Because the VPN
should be on for some uses of the Wi-Fi network and not others, it can stand in the way of
convincing users to access the enterprise network on the road, reducing the productivity
gains for which the FMC solution was considered in the first place. One option that many Wi-Fi
infrastructure vendors offer, which takes advantage of the concept of protecting the pipe and
not the end-to-end application, is for remote users to be offered remote access points. These
access points are similar to the campus access point, and yet are designed for operation
when on the road. The remote access point is essentially a VPN client and a normal access
point combined. The access point’s VPN client tunnels through the Internet to the corporate
network, where it terminates at the wireless controller. Once connected to the controller, the
access point pulls the same enterprise configuration down as it would if it were in the office,
and provides it to the remote user. This way, the remote user can use the same cellphone as
on campus, with the same WPA2 security policies, without having to be bothered with the
VPN. This does provide a measure of privacy from the phone to the physically protected
office.
8.5 Physically Securing the Handset
The handset itself is still a weak link. Handsets are designed to be portable, so in that
sense, they are also designed to be transported away from their rightful owners. Many
converged handsets are quite impressive in their capabilities, and so they make for an
interesting target for thieves. Furthermore, busy, high-productivity enterprise users are
unlikely to set up a complicated, strong phone lock password. It is simple to imagine a
hospital on-call attending physician not wanting to be bothered with typing an
eight-character password of letters, numbers, and punctuation marks on a phone that itself
has no more than 16 buttons.
The problem becomes, then, how to prevent phones from being stolen in the first place, and
how to take care of the problem once they have left the building.

8.5.1 Preventing Theft
There are many practical considerations for preventing the theft of a voice mobility handset.
Unfortunately, limiting the network access is not one of them. Take Wi-Fi-only handsets as
an example. It is pretty clear to the voice mobility administrators that a Wi-Fi-only handset
will not be of much use at a person’s home, or a hotspot, or even another office. Most
phones provide tight administrative privilege requirements to change the Wi-Fi network and
SSID that the phone uses. Even if someone can accomplish the feat of penetrating those
restrictions, the phone will work only with the PBX it was configured for, unless someone
changes it. They do not perform well over the Internet, and are not going to be useful as a
personal wireless voice phone for someone’s house. However, the devices do look like
cellphones, and the people who might take one such phone are not likely to know or care
about the difference until they have the phone securely in their possession.
Some common-sense approaches can work. Sticking labels onto phones that state just the
fact that they will not work outside can have a slight deterrent effect, much the same as that
which restaurant pagers’ warnings have on their patrons. A more workable approach is to
use telephones that do not look like telephones. In the limited environments where voice
mobility can be conducted without outdoor support, such as warehouses or hospitals,
specialized devices like two-way communicator badges or ruggedized handsets can
discourage some amount of theft. For environments where devices have specialized
chargers, a daily checkout policy and a central charging station can at least keep track of the
phones that do exist and strongly discourage users from taking devices with them to places
where they might get stolen. A good example of this is with nursing—the nurses’ station
makes an acceptable location to place the chargers.
But, ultimately, if someone wants to take the phone, they can. A better way to protect
against theft is to detect theft. Theft detection can be performed somewhat readily on
converged, dual-mode phones or Wi-Fi-only phones using location tracking. Location
tracking is a feature of Wi-Fi networks or of overlay systems that use Wi-Fi for monitoring,
where the rough positions of each device are recorded within the system. Location tracking
systems can be built with automated policies, such as email alerts that are sent when
devices enter or exit certain positions. There are a number of networks that use the location
tracking system to monitor the exits to the buildings, to send alerts if a device passes
through there when it should not. This can send an email out to the person who owns the
phone, informing them of its activity, and gently reminding them that the phone is not to
leave the building. Such a system may seem to do nothing for people who intend to
steal the device and already have it within their possession, as removing the battery or
powering down the device will disable the location tracking. To cover that case, the
system can also send a message if the phone is taken off the air. In many of these
environments, phones are generally kept on 24 hours a day, and so it is possible to come up
with the right set of rules to make a deterrent useful.
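The two rules described here, an alert when a device is seen at an exit and an alert when a device goes silent, can be sketched as a small policy function. Everything in it is hypothetical: the `Sighting` record, the zone names, and the silence threshold are invented for the example and do not correspond to any particular location tracking product.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    device: str
    zone: str          # rough position reported by the tracking system
    timestamp: float   # seconds


def theft_alerts(sightings, now, exit_zones, max_silence):
    """Return (device, reason) pairs for the two rules in the text."""
    alerts = []
    last_seen = {}
    for s in sorted(sightings, key=lambda s: s.timestamp):
        # Rule 1: the device passed through a monitored exit.
        if s.zone in exit_zones:
            alerts.append((s.device, f"seen at {s.zone}"))
        last_seen[s.device] = s.timestamp
    for device, t in sorted(last_seen.items()):
        # Rule 2: a phone that should be on 24 hours a day went off the air.
        if now - t > max_silence:
            alerts.append((device, "off the air"))
    return alerts
```

In a deployment, each alert would feed whatever notification the system supports, such as the email to the phone's owner mentioned above. The silence threshold has to be tuned to the environment so that ordinary coverage gaps do not trigger it.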
Unfortunately, there is no foolproof way to prevent theft of voice mobility devices. Making
sure that the devices do not look like high-end cellphones, unless the users need those
high-end features, and a stronger educational campaign about locking phones and keeping
track of them, are likely to be the most effective.
The second half of the discussion is what to do when the device is stolen. This question is
either more simple or less simple than it looks, depending on whether the device has a
cellular radio. If the device does have a cellular radio, then the options are wider.
Cellular phones, even dual-mode phones, connect to the mobile network. Once connected,
a phone that was reported stolen can be disabled remotely by the mobile operator or
administrator, and even potentially tracked, if the phone is being used in the commission of
a serious crime. On the other hand, if the phone does not have a cellular radio, then there is
a good chance that the thief will not be able to use the phone for much of anything, so it
may already be in a state that is close enough to disabled for the comfort of the
administrator.
This is a reminder for voice mobility administrators to explore the encryption options for
smartphones that may be used with the enterprise network. With a phone set up for encryption,
the information locked up in the stolen phone may be lost, but that is better than
the information being exposed.
8.6 Physically Protecting the Network
Physically protecting a voice mobility network is a very strong way of preventing
unauthorized access. People may be tempted to leave voice mobility networks more
exposed than their data counterparts, because the only traffic flowing across them is real-time
call data, and not, say, highly sensitive corporate strategies. Nonetheless, it is important
that voice mobility networks be given the necessary physical security to prevent problems
with direct intrusions.
The wireline portion of the voice mobility network should be treated with the same level of
concern as the service network for data usage would be. Most IT organizations are good at
ensuring that email servers are locked up, if not for security, then at least to prevent
accidental disruption. The same goes for IP PBXs and gateways. However, there has been a
historically different way of thinking about voice networks compared to data. Data networks
invariably terminate in a switch that is placed in a locked switching closet. Voice lines,
however, used to terminate in a series of punch blocks, which had been placed in locations
convenient for the wire-pullers. If a network is subsequently upgraded to voice over IP,
those same locations may be used for placing the Ethernet switches that concentrate the
desk ports and send them to the voice network, towards the PBX. Clearly, those areas
should be kept locked with the same scrutiny.
Another area of concern is with the accidental confusion of voice and data network services.
Voice networks should always be kept physically separate and distinctly marked from data
networks. Before wireline voice over IP, the phone port was never electrically like a data
port. A user making an error in reconnecting the devices on his desk would have found that
the phone and computer would both not work if the machines were plugged in the wrong
way. But, unfortunately, with voice-over-IP services, it is possible for the user to get it
wrong and still have the appearance that everything is working correctly—that is, at least,
until he tries to place a call. This is a different aspect of physically protecting the network.
Previously, we saw an example of how a misconnected device can exhaust network
resources, such as with accidental DHCP exhaustion. Physically protecting the port of the
network would also solve that problem. The simplest way to do that is not necessarily to
secure the jack but to place a wall plate over it that makes it difficult to unplug the phone.
This serves as a potential deterrent to accidental swapping.
On the wireless side, however, real physical security is a necessity. Again focusing on
Wi-Fi deployments, the fact that the signals do pass through walls and outside the building
requires that the network be well planned for security.
Depending on the nature of the environment, even strictly followed WPA2-Enterprise
security with certificate exchanges using TLS can reveal information about the caller that
should not be exposed. Given that a reasonable number of voice mobility devices use
preshared keys, and not WPA2-Enterprise, a physical security approach can help
provide an additional layer of protection. Furthermore, there are a few environments where
it is important that the very fact that a user places a phone call should be hidden. Voice
traffic over Wi-Fi is distinctive by design, with a regular pattern of fixed-rate,
fixed-length frames coming on high-priority services, tipping any observer off that a call has been
initiated.
Here, the concern is the exposure to the outside areas of the building of the in-building
voice mobility network. Physical layer Wi-Fi firewalling solutions have recently been
introduced that use RF activity to mask the presence of the network and its traffic to
different physical regions. By deploying these physical layer blocking systems on the
outside walls of the building, the systems can provide a curtain that separates the inside,
where the network is accessible, from the outside, where the network is not even recordable.
This is an inherently different solution from attempting to use specialized antennas and
beam technologies to concentrate the Wi-Fi network towards the inside of the building.
Doing the latter does not prevent an attacker from recording the voice traffic. It requires
only that the attacker get a slightly bigger antenna, with a dB gain improvement for every
dB of isolation that the network installer was able to provide. Using RF firewalling instead
blocks the signal from being intelligible past the curtain of coverage, preventing
eavesdroppers from having useful access to the leaked signals from within the building. Be
aware that some products may be labeled “RF firewalls” even though they simply use the location of
the device to influence the firewall policies of the network. These are not true RF firewalls,
since they do not provide any security at the RF layer itself, and thus are completely
vulnerable to the passive leakage attacks mentioned here.
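The point about antennas is simple decibel arithmetic, which a short sketch makes concrete. All of the numbers below are hypothetical, and the function is only a bare link-budget sum. It shows why directional isolation alone fails: every dB of isolation can be cancelled by a dB of attacker antenna gain.

```python
def received_margin_db(leak_dbm, isolation_db, attacker_gain_dbi, sensitivity_dbm):
    """Margin (in dB) by which the leaked signal exceeds receiver sensitivity.

    A positive margin means the eavesdropper can still record the traffic.
    """
    return leak_dbm - isolation_db + attacker_gain_dbi - sensitivity_dbm


# Hypothetical figures: signal leaking outside at -70 dBm, receiver
# sensitivity of -90 dBm.
baseline = received_margin_db(-70, 0, 0, -90)
with_isolation = received_margin_db(-70, 15, 0, -90)
with_bigger_antenna = received_margin_db(-70, 15, 15, -90)
```

With these figures, 15 dB of isolation shrinks the margin from 20 dB to 5 dB, and a 15 dB gain antenna restores it to 20 dB.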
CHAPTER 9
The Future: Video Mobility and Beyond
9.0 Introduction
This entire book has so far been looking at the present day, with voice mobility taking
center stage. But mobile devices have begun the dramatic transition from voice-only phones
to multipurpose, “converged” systems that fit in the pocket but perform the work that a
laptop computer would have just a couple of years ago.
The main difference is that the type of applications that these devices run has expanded.
From a quality-of-service perspective, voice is no longer alone as the main application.
Video is here, in the form of webcasts, corporate events, and videoconferences.
In this chapter, we will look at video mobility, building upon the concepts already covered.
Then we will look towards the future, and try to see where mobility may be going, in the
enterprise.
9.1 Packetized Video
Video is an interesting thing. Besides containing voice as a proper subset, video is also
responsible for carrying quite a bit more information. Whereas voice recording and
playback technology was perfected in miniature form decades ago, video is always a work
in progress, needing bigger and bigger screens with higher resolutions and more sharpness.
In some senses, it’s hard to imagine how video and mobility go hand in hand, or simply in
your hand, as video requires watching on a device that is not constantly shaking and
jiggling, and requires nearly constant attention, whereas voice can be used on the run.
However, video has the ability to connect with the user in a way voice never can. If a
picture is worth a thousand words, a moving picture should be worth at least a thousand
times more again. More practically speaking, videoconferencing and webcasting have become
increasingly attractive, as travel budgets constrain companies from hosting large fly-in
gatherings, and video technology has improved in concert. In the end, voice and data have
already been able to make the transition from wires, but as video becomes more prevalent
as a part of networking in general, the video must naturally follow the user, and thus
become wireless as well.
©2010 Elsevier Inc. All rights reserved.
doi:10.1016/B978-1-85617-508-1.00001-3.
Let us look into some of the fundamental differences between voice and video. The first
difference is the most obvious: video requires significantly more throughput than voice.
Video has to carry the moving picture along with the same voice stream as before, and so
the overall content must be quite a bit larger. In voice mobility, the throughput of voice is
usually not the constraint except in very large voice aggregation centers. Instead, voice
makes its presence known by its increase in the number of packets over the network. Video,
on the other hand, is bandwidth constrained from the outset. The second difference is that
video requires synchronizing multiple media streams. This impact is felt especially when
loss rates rise or bandwidth constraints are hit, and some part of the video must be
sacrificed. In the ideal case, the software on the endpoints keeps everything in
synchronization, but this can often slip when poor network quality or lack of capacity
begins to challenge the ability of the video client to find its way without all of the
appropriate information. The third difference has more to do with how video is used today.
For the most part, video is one-way. The user watches a video that is being streamed.
Because of this, video can build up reasonable latency, and so video is not as latency-
constrained as voice is. This may change, as mobile devices are given more sophisticated
cameras and videoconferencing on a mobile device becomes possible. But, as of today,
video is still fundamentally a broadcast mechanism. And for this reason, video also differs
from voice at a fundamental network level, because it can be effectively multicast over the
network. The final difference is that video can be significantly more sensitive to loss than
voice. Without the two-way conversation, lost information may not be covered up by asking
the other side to repeat, and the user may end up with a poorer opinion of the quality of the
network.

9.1.1 Video Encoding Concepts
As mentioned earlier, video has many more dimensions (quite literally) of information to it
than voice does. Video is nothing more than a series of still pictures, whereas voice is a
series of nothing more than still point-in-time readings of sound pressure: just one small
number at a time. Where voice has just this one sample at any given time, video has the
entire picture at that time. This picture, itself, is made of two dimensions of pixels, or small
areas that possess the same color.
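A rough calculation shows how much the extra dimensions add. The figures below are assumptions chosen for illustration: 8,000 eight-bit voice samples per second (the rate used by G.711), and an uncompressed 640×480 picture at 24 bits per pixel and 30 frames per second. Real video codecs compress heavily, so these are raw, pre-compression numbers only.

```python
def raw_voice_bps(samples_per_second, bits_per_sample):
    # One small number at a time: samples times bits per sample.
    return samples_per_second * bits_per_sample


def raw_video_bps(width, height, bits_per_pixel, frames_per_second):
    # An entire two-dimensional picture for every frame.
    return width * height * bits_per_pixel * frames_per_second


voice = raw_voice_bps(8000, 8)
video = raw_video_bps(640, 480, 24, 30)
ratio = video // voice
```

Under these assumptions the raw voice stream is 64,000 bits per second and the raw video stream is 221,184,000 bits per second, a factor of 3,456.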
Let’s dig into this a bit more deeply. A picture, or still image, has hundreds of thousands of
pixels, or picture elements, arranged in a rectangular grid. Each pixel has some fixed
dimension, often measured in millimeters, with the most important aspect being the ratio of
the width to the height of the pixel. The entire screen is made of hundreds or thousands of
pixels in each of the horizontal and vertical directions, resulting in images with a given
resolution. Common resolutions are the small 640×480, meaning 640 pixels wide by 480
pixels high; 1280×720, used in 720p high-definition; and 1920×1080, used in 1080p and
1080i high-definition video. Resolution can also be a function of the actual number of
