Squid: The Definitive Guide
By
Duane Wessels
Publisher: O'Reilly
Pub Date: January 2004
ISBN: 0-596-00162-2
Pages: 496
Squid is the most popular Web caching software in use today, and it works on a variety of
platforms including Linux, FreeBSD, and Windows. Written by Duane Wessels, the creator of
Squid, Squid: The Definitive Guide will help you configure and tune Squid for your particular
situation. Newcomers to Squid will learn how to download, compile, and install code. Seasoned
users of Squid will be interested in the later chapters, which tackle advanced topics such as
high-performance storage options, rewriting requests, HTTP server acceleration, monitoring,
debugging, and troubleshooting Squid.
Copyright
Dedication
Preface
About This Book
Recommended Reading
Conventions Used in This Book
Comments and Questions
Acknowledgments
Chapter 1. Introduction
Section 1.1. Web Caching
Section 1.2. A Brief History of Squid
Section 1.3. Hardware and Operating System Requirements
Section 1.4. Squid Is Open Source
Section 1.5. Squid's Home on the Web
Section 1.6. Getting Help
Section 1.7. Getting Started with Squid
Section 1.8. Exercises
Chapter 2. Getting Squid
Section 2.1. Versions and Releases
Section 2.2. Use the Source, Luke
Section 2.3. Precompiled Binaries
Section 2.4. Anonymous CVS
Section 2.5. devel.squid-cache.org
Section 2.6. Exercises
Chapter 3. Compiling and Installing
Section 3.1. Before You Start
Section 3.2. Unpacking the Source
Section 3.3. Pretuning Your Kernel
Section 3.4. The configure Script
Section 3.5. make
Section 3.6. make install
Section 3.7. Applying a Patch
Section 3.8. Running configure Later
Section 3.9. Exercises
Chapter 4. Configuration Guide for the Eager
Section 4.1. The squid.conf Syntax
Section 4.2. User IDs
Section 4.3. Port Numbers
Section 4.4. Log File Pathnames
Section 4.5. Access Controls
Section 4.6. Visible Hostname
Section 4.7. Administrative Contact Information
Section 4.8. Next Steps
Section 4.9. Exercises
Chapter 5. Running Squid
Section 5.1. Squid Command-Line Options
Section 5.2. Check Your Configuration File for Errors
Section 5.3. Initializing Cache Directories
Section 5.4. Testing Squid in a Terminal Window
Section 5.5. Running Squid as a Daemon Process
Section 5.6. Boot Scripts
Section 5.7. A chroot Environment
Section 5.8. Stopping Squid
Section 5.9. Reconfiguring a Running Squid Process
Section 5.10. Rotating the Log Files
Section 5.11. Exercises
Chapter 6. All About Access Controls
Section 6.1. Access Control Elements
Section 6.2. Access Control Rules
Section 6.3. Common Scenarios
Section 6.4. Testing Access Controls
Section 6.5. Exercises
Chapter 7. Disk Cache Basics
Section 7.1. The cache_dir Directive
Section 7.2. Disk Space Watermarks
Section 7.3. Object Size Limits
Section 7.4. Allocating Objects to Cache Directories
Section 7.5. Replacement Policies
Section 7.6. Removing Cached Objects
Section 7.7. refresh_pattern
Section 7.8. Exercises
Chapter 8. Advanced Disk Cache Topics
Section 8.1. Do I Have a Disk I/O Bottleneck?
Section 8.2. Filesystem Tuning Options
Section 8.3. Alternative Filesystems
Section 8.4. The aufs Storage Scheme
Section 8.5. The diskd Storage Scheme
Section 8.6. The coss Storage Scheme
Section 8.7. The null Storage Scheme
Section 8.8. Which Is Best for Me?
Section 8.9. Exercises
Chapter 9. Interception Caching
Section 9.1. How It Works
Section 9.2. Why (Not) Intercept?
Section 9.3. The Network Device
Section 9.4. Operating System Tweaks
Section 9.5. Configure Squid
Section 9.6. Debugging Problems
Section 9.7. Exercises
Chapter 10. Talking to Other Squids
Section 10.1. Some Terminology
Section 10.2. Why (Not) Use a Hierarchy?
Section 10.3. Telling Squid About Your Neighbors
Section 10.4. Restricting Requests to Neighbors
Section 10.5. The Network Measurement Database
Section 10.6. Internet Cache Protocol
Section 10.7. Cache Digests
Section 10.8. Hypertext Caching Protocol
Section 10.9. Cache Array Routing Protocol
Section 10.10. Putting It All Together
Section 10.11. How Do I
Section 10.12. Exercises
Chapter 11. Redirectors
Section 11.1. The Redirector Interface
Section 11.2. Some Sample Redirectors
Section 11.3. The Redirector Pool
Section 11.4. Configuring Squid
Section 11.5. Popular Redirectors
Section 11.6. Exercises
Chapter 12. Authentication Helpers
Section 12.1. Configuring Squid
Section 12.2. HTTP Basic Authentication
Section 12.3. HTTP Digest Authentication
Section 12.4. Microsoft NTLM Authentication
Section 12.5. External ACLs
Section 12.6. Exercises
Chapter 13. Log Files
Section 13.1. cache.log
Section 13.2. access.log
Section 13.3. store.log
Section 13.4. referer.log
Section 13.5. useragent.log
Section 13.6. swap.state
Section 13.7. Rotating the Log Files
Section 13.8. Privacy and Security
Section 13.9. Exercises
Chapter 14. Monitoring Squid
Section 14.1. cache.log Warnings
Section 14.2. The Cache Manager
Section 14.3. Using SNMP
Section 14.4. Exercises
Chapter 15. Server Accelerator Mode
Section 15.1. Overview
Section 15.2. Configuring Squid
Section 15.3. Gee, That Was Confusing!
Section 15.4. Access Controls
Section 15.5. Content Negotiation
Section 15.6. Gotchas
Section 15.7. Exercises
Chapter 16. Debugging and Troubleshooting
Section 16.1. Some Common Problems
Section 16.2. Debugging via cache.log
Section 16.3. Core Dumps, Assertions, and Stack Traces
Section 16.4. Replicating Problems
Section 16.5. Reporting a Bug
Section 16.6. Exercises
Appendix A. Config File Reference
http_port
https_port
ssl_unclean_shutdown
icp_port
htcp_port
mcast_groups
udp_incoming_address
udp_outgoing_address
cache_peer
cache_peer_domain
neighbor_type_domain
icp_query_timeout
maximum_icp_query_timeout
mcast_icp_query_timeout
dead_peer_timeout
hierarchy_stoplist
no_cache
cache_access_log
cache_log
cache_store_log
cache_swap_log
emulate_httpd_log
log_ip_on_direct
cache_dir
cache_mem
cache_swap_low
cache_swap_high
maximum_object_size
minimum_object_size
maximum_object_size_in_memory
cache_replacement_policy
memory_replacement_policy
store_dir_select_algorithm
mime_table
ipcache_size
ipcache_low
ipcache_high
fqdncache_size
log_mime_hdrs
useragent_log
referer_log
pid_filename
debug_options
log_fqdn
client_netmask
ftp_user
ftp_list_width
ftp_passive
ftp_sanitycheck
cache_dns_program
dns_children
dns_retransmit_interval
dns_timeout
dns_defnames
dns_nameservers
hosts_file
diskd_program
unlinkd_program
pinger_program
redirect_program
redirect_children
redirect_rewrites_host_header
redirector_access
redirector_bypass
auth_param
authenticate_ttl
authenticate_cache_garbage_interval
authenticate_ip_ttl
external_acl_type
wais_relay_host
wais_relay_port
request_header_max_size
request_body_max_size
refresh_pattern
quick_abort_min
quick_abort_max
quick_abort_pct
negative_ttl
positive_dns_ttl
negative_dns_ttl
range_offset_limit
connect_timeout
peer_connect_timeout
read_timeout
request_timeout
persistent_request_timeout
client_lifetime
half_closed_clients
pconn_timeout
ident_timeout
shutdown_lifetime
acl
http_access
http_reply_access
icp_access
miss_access
cache_peer_access
ident_lookup_access
tcp_outgoing_tos
tcp_outgoing_address
reply_body_max_size
cache_mgr
cache_effective_user
cache_effective_group
visible_hostname
unique_hostname
hostname_aliases
announce_period
announce_host
announce_file
announce_port
httpd_accel_host
httpd_accel_port
httpd_accel_single_host
httpd_accel_with_proxy
httpd_accel_uses_host_header
dns_testnames
logfile_rotate
append_domain
tcp_recv_bufsize
err_html_text
deny_info
memory_pools
memory_pools_limit
forwarded_for
log_icp_queries
icp_hit_stale
minimum_direct_hops
minimum_direct_rtt
cachemgr_passwd
store_avg_object_size
store_objects_per_bucket
client_db
netdb_low
netdb_high
netdb_ping_period
query_icmp
test_reachability
buffered_logs
reload_into_ims
always_direct
never_direct
header_access
header_replace
icon_directory
error_directory
maximum_single_addr_tries
snmp_port
snmp_access
snmp_incoming_address
snmp_outgoing_address
as_whois_server
wccp_router
wccp_version
wccp_incoming_address
wccp_outgoing_address
delay_pools
delay_class
delay_access
delay_parameters
delay_initial_bucket_level
incoming_icp_average
incoming_http_average
incoming_dns_average
min_icp_poll_cnt
min_dns_poll_cnt
min_http_poll_cnt
max_open_disk_fds
offline_mode
uri_whitespace
broken_posts
mcast_miss_addr
mcast_miss_ttl
mcast_miss_port
mcast_miss_encode_key
nonhierarchical_direct
prefer_direct
strip_query_terms
coredump_dir
ignore_unknown_nameservers
digest_generation
digest_bits_per_entry
digest_rebuild_period
digest_rewrite_period
digest_swapout_chunk_size
digest_rebuild_chunk_percentage
chroot
client_persistent_connections
server_persistent_connections
pipeline_prefetch
extension_methods
request_entities
high_response_time_warning
high_page_fault_warning
high_memory_warning
ie_refresh
vary_ignore_expire
sleep_after_fork
Appendix B. The Memory Cache
Appendix C. Delay Pools
Section C.1. Overview
Section C.2. Configuring Squid
Section C.3. Examples
Section C.4. Issues
Section C.5. Monitoring Delay Pools
Appendix D. Filesystem Performance Benchmarks
Section D.1. The Benchmark Environment
Section D.2. General Comments
Section D.3. Linux
Section D.4. FreeBSD
Section D.5. OpenBSD
Section D.6. NetBSD
Section D.7. Solaris
Section D.8. Number of Disk Spindles
Appendix E. Squid on Windows
Section E.1. Cygwin
Section E.2. SquidNT
Appendix F. Configuring Squid Clients
Section F.1. Manually
Section F.2. Proxy Auto-Configuration
Section F.3. WPAD
Section F.4. Summary
Colophon
Index
Copyright © 2004 O'Reilly Media, Inc.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly & Associates books may be purchased for educational, business, or sales promotional
use. Online editions are also available for most titles. For more
information, contact our corporate/institutional sales department: (800) 998-9938.
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered
trademarks of O'Reilly Media, Inc. Squid: The Definitive Guide, the image of a giant squid and
related trade dress are trademarks of O'Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in this book, and O'Reilly & Associates
was aware of a trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and
authors assume no responsibility for errors or omissions, or for damages resulting from the use
of the information contained herein.
Dedication
To my darling Anne. You have no idea.
Preface
About This Book
Recommended Reading
Conventions Used in This Book
Comments and Questions
Acknowledgments
About This Book
I started the Squid project eight years ago while working at the National Laboratory for Applied
Network Research and the University of California. Back then I certainly enjoyed writing code
and fixing bugs but always felt bad about the lack of decent documentation. This book is my
attempt to rectify that situation. It's been a long time coming and almost didn't happen. Like
they say, "better late than never!"
This book is written for those who are tasked with setting up and maintaining one or more
Squid caches. If you're new to Squid, I'll show you how to download, compile, and install the
code. Those of you who have been using Squid for a while will be more interested in the later
chapters, where I talk about disk cache performance, modifying requests, surrogate mode,
caching hierarchies, monitoring Squid, and more.
In order to use this book, you should have a basic knowledge of Unix systems. Many of the
book's examples are based on free operating systems, such as Linux, FreeBSD, NetBSD, and
OpenBSD. I also have some tips for Solaris users. If you're more comfortable with Windows
systems, you can use Squid under a Unix emulator or give the native NT port a try.
Here's an overview of the book's contents:
Chapter 1, Introduction
This chapter introduces you to Squid and web caching. I give a brief history of the
project, and a few notes on our future work. I explain how you can find additional
support and information, including a FAQ, on the Squid web site.
Chapter 2, Getting Squid
In this chapter, I explain how and why you should download Squid's source code. You
may prefer to install a precompiled binary or use a preconfigured package. I also talk
about staying up to date with Squid using the anonymous CVS server.
Chapter 3, Compiling and Installing
Assuming you've downloaded the source code, this chapter explains how to configure
and compile Squid. In some cases you may need to tune your system before compiling
Squid. For example, your kernel may have relatively low file-descriptor limits that affect
Squid's performance.
Chapter 4, Configuration Guide for the Eager
Here, I give a brief introduction to Squid's configuration file. If you are the impatient
type and can't wait to start using Squid, this chapter will leave you with a minimal
configuration file you can start playing with.
Chapter 5, Running Squid
In this chapter, I explain how to run Squid for the first time and how to test Squid in a
terminal window. Following that, I suggest a number of ways to configure your system
so that Squid starts each time it boots. I also explain how to reconfigure Squid while it is
running and how to safely shut it down.
Chapter 6, All About Access Controls
I talk extensively about access controls in this chapter. Squid has a powerful collection
of access control features and a number of different rule sets that determine how
requests and responses are treated. This is an important chapter because a mistake in
your access controls may leave your cache, or even internal systems, vulnerable to
abuse from outsiders.
Chapter 7, Disk Cache Basics
This chapter is about Squid's primary function: storing cached responses on disk. I
explain how to configure the disk cache, including replacement policies and freshness
controls. I also show you how to manually remove unwanted objects from the cache.
Chapter 8, Advanced Disk Cache Topics
In this chapter, I explain how to improve the performance of Squid's disk cache. I'll talk
about Squid's different storage schemes and a number of filesystem tuning options that
may help. If your Squid cache handles a relatively light load, you probably don't need to
worry about disk performance.
Chapter 9, Interception Caching
Here, I explain how to configure Squid for HTTP interception, sometimes also called
transparent caching. Actually, configuring Squid is the easy part. The difficulty comes
from setting up a router or switch on your network and the host from which Squid is
running. I explain how to configure networking equipment from Cisco, Alteon, Foundry,
and Extreme. I'll also show you how to configure your operating system (Linux,
FreeBSD, NetBSD, OpenBSD, and Solaris) for HTTP interception. Finally, I talk about
WCCP.
Chapter 10, Talking to Other Squids
In this chapter, I cover the ins and outs of cache cooperation, including meshes, arrays,
and hierarchies. You may also find it useful if you simply need to forward requests from
Squid to another proxy or intermediary. I'll talk about the various intercache protocols
supported by Squid (ICP, HTCP, Cache Digests, and CARP) and how Squid chooses the
next-hop location for a given cache miss.
Chapter 11, Redirectors
Redirectors are the best way to make Squid rewrite HTTP requests before forwarding
them. I describe the interface between Squid and a redirector program so that you can
write your own. I also present a few of the more popular third-party redirectors
available.
Chapter 12, Authentication Helpers
In this chapter, I explain how Squid interfaces with external authentication databases
such as LDAP, NT domain controllers, and password files. Squid comes with a number of
authentication helpers and understands Basic, Digest, and NTLM authentication
credentials. I also document the API for each, in case you want to develop your own
helper.
Chapter 13, Log Files
I cover Squid's various log files in this chapter, including access.log, store.log,
cache.log, and others. I explain what each log file contains and how you should periodically
maintain them.
Chapter 14, Monitoring Squid
This chapter provides a lot of information on monitoring Squid's operation. I cover both
SNMP and Squid's own cache manager interface. You'll find it useful for both long-term
monitoring and short-term problem diagnosis.
Chapter 15, Server Accelerator Mode
Squid's server accelerator mode is useful in a number of situations. You can use it to
boost your origin server's poor performance, as a firewall to protect the server, or even
to build your own content delivery network. I show how to set up Squid and make sure
that outsiders can't abuse your service.
Chapter 16, Debugging and Troubleshooting
The book's final chapter explains how to debug and troubleshoot problems with Squid.
You may find that some sites, or some user agents, don't work properly with Squid. I
show how to isolate and reproduce the problem and how to present the information to
Squid developers for assistance.
Appendix A, Config File Reference
This appendix is a reference guide for each of Squid's 200 configuration file directives.
Each has a description, syntax, defaults, and examples.
Appendix B, The Memory Cache
This brief appendix explains a little about Squid's memory cache.
Appendix C, Delay Pools
You can use Squid's delay pools feature to limit bandwidth consumed by web surfers. I
explain how the delay pools work and provide a number of example configurations.
Appendix D, Filesystem Performance Benchmarks
In this appendix, I present the results of numerous filesystem benchmarks. These may
help you make informed decisions regarding particular operating systems, filesystem
features, and Squid's storage techniques.
Appendix E, Squid on Windows
Have a look at this appendix if you'd like to run Squid on your Windows box. I talk about
using Cygwin and about a native port of Squid, called SquidNT.
Appendix F, Configuring Squid Clients
This appendix contains information on how to configure various user agents to use
Squid. I talk about manual configuration, environment variables, Proxy
Auto-Configuration functions, and the Web Proxy Auto Discovery protocol.
As I'm finishing up this book, the latest stable version is Squid-2.5.STABLE4, and the
development version is Squid-3.0. Perhaps the most important difference between the two is
that Squid-3 is being rewritten in C++. You should find that most things are
backward-compatible, although a few new configuration directives have been created. Please read the
release notes carefully if you use Squid-3.0 or later.
I have created a web site for the book. There, you will find
errata, supplemental information, and links to online resources.
Topics Not Covered
Due to a lack of time and space, there are some topics I was unable to cover in this book; they
include:
Non-HTTP protocols
You'll find that I mostly talk about HTTP, even though Squid also supports FTP, Gopher,
and some other relatively obscure protocols.
Customizing error messages
Squid's error messages can be customized and the source distribution includes versions
of the error messages in a number of different languages. You can probably figure out
how to customize the error messages by modifying the default pages or by reading
Squid's source code.
Load balancing Squids
Load balancing is a popular way to increase the capacity of a caching service. Refer to
one of the load balancing books mentioned in the following section if necessary.
What is cachable
HTTP has a number of somewhat complicated rules for determining what may or may
not be cached, and for how long. Refer to Web Caching or HTTP: The Definitive Guide
(for more information, see the next section).
Copyright
A number of nontechnical issues surround web caching. These include copyrights and
privacy.
Modifying the source
I don't go into detail about Squid's source code in this book. The Squid project hosts a
programmers' guide, which is generally incomplete and out of date. If you have
questions about the source code, please join the squid-dev mailing list.
SOCKS
Squid doesn't support the SOCKS protocol at this time.
Recommended Reading
While reading this book, you may want to consult some of these other resources for more
information (I'll refer to them throughout this book):
● The Design and Implementation of the 4.4BSD Operating System by Marshall Kirk
McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman (Addison-Wesley
Longman)
● DNS and BIND by Paul Albitz and Cricket Liu (O'Reilly & Associates)
● HTTP: The Definitive Guide by David Gourley and Brian Totty (O'Reilly)
● Load Balancing Servers, Firewalls, and Caches by Chandra Kopparapu (John Wiley &
Sons)
● Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly)
● Server Load Balancing by Tony Bourke (O'Reilly)
● Unix System Administration Handbook and Linux System Administration Handbook by
Evi Nemeth, Garth Snyder, Scott Seebass, and Trent R. Hein (Prentice Hall)
● My book, Web Caching (O'Reilly)
● RFC 1413: Identification Protocol
● RFC 1738: Uniform Resource Locators (URL)
● RFC 2186: Internet Cache Protocol (ICP), Version 2
● RFC 2187: Application of Internet Cache Protocol (ICP), Version 2
● RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax
● RFC 2616: Hypertext Transfer Protocol—HTTP/1.1
● RFC 2617: HTTP Authentication: Basic and Digest Access Authentication
● RFC 2756: Hypertext Caching Protocol
● RFC 2817: Upgrading to TLS Within HTTP/1.1
● RFC 3040: Internet Web Replication and Caching Taxonomy
● RFC 3143: Known HTTP Proxy/Caching Problems
● Caching-related web sites, such as and -
cache.com/
Conventions Used in This Book
I use the following typesetting conventions in this book:
Italic
Used for new terms where they are defined, buttons, pages, configuration file directives,
filenames, modules, ACLs, directories, and URI/URLs
Constant width
Used for configuration file examples, program output, HTTP header names and
directives, scripts, options, environment variables, functions, methods, rules, keywords,
libraries, and command names
Constant width italic
Used for replaceable text within examples and code pieces
Constant width bold
Used to indicate commands to be typed verbatim
When displaying a Unix command, I'll include a shell prompt, like this:
% ls -l
If the command is specific to the Bourne shell (sh) or C shell (csh), the prompt will indicate
which you should use:
sh$ ulimit -a
csh% limits
If the command requires super-user privileges, the shell prompt is a hash mark:
# make install
Occasionally, I provide configuration file examples with long lines. If the line is too wide to fit
on the page, it's wrapped around and indented. Squid doesn't accept this sort of syntax, so you
must make sure to place everything on one line.
This icon signifies a tip, suggestion, or general note.
This icon indicates a warning or caution.
Comments and Questions
Please address comments and questions concerning this book to the publisher:
O'Reilly & Associates, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
(800) 998-9938 (in the United States or Canada)
(707) 829-0515 (international or local)
(707) 829-0104 (fax)
There is a web page for this book, which lists errata, examples, and any additional information.
You can access this page at:
To comment or ask technical questions about this book, send email to:
For more information about books, conferences, Resource Centers, and the O'Reilly Network,
check the O'Reilly web site at:
You can contact the author at
Acknowledgments
Looking back at the events and people that allowed me to write this book makes me feel
extremely humble and grateful. I'm so happy to have been a part of the Harvest project with
Mike Schwartz, Peter Danzig, and the others. That led directly to my work with kc claffy and
Hans-Werner Braun at NLANR/UCSD. The Squid project would never have existed without
their support and the grant from the National Science Foundation.
I'm also very thankful for all the hard work put in by the small crew of Squid developers:
Henrik Nordström, Robert Collins, Adrian Chadd, and everyone else who has contributed time
and code to the project. And I'm sorry that you ever had to read and/or fix any ugly code I
wrote.
To all the reviewers who read the drafts—Joe Cooper, Scott Pepple, Robert Collins, and Adrian
Chadd—thanks for finding my mistakes and suggesting ways to make the book better. I also
owe so much to the people at O'Reilly for making the book possible, and for making it all come
together. Thanks to my editors, Tatiana Diaz and Nat Torkington; the production editor, Mary Anne
Mayo; the graphic designer, Melanie Wang; the illustrator, Rob Romano; the XML mungers, Andrew
Savikas and Joe Wizda; and the countless other folks working behind the scenes for me.
To my good friend, and business partner, Alex Rousskov: thanks for giving me the time and
freedom to see this little project through. Finally, to the members of my new family, Annie and
Blooey, thanks for putting up with the late nights. Can I make it up to you with extra back
scratches?
Chapter 1. Introduction
This long-overdue book is about Squid: a popular open source caching proxy for the Web. With
Squid you can:
● Use less bandwidth on your Internet connection when surfing the Web
● Reduce the amount of time web pages take to load
● Protect the hosts on your internal network by proxying their web traffic
● Collect statistics about web traffic on your network
● Prevent users from visiting inappropriate web sites at work or school
● Ensure that only authorized users can surf the Internet
● Enhance your users' privacy by filtering sensitive information from web requests
● Reduce the load on your own web server(s)
● Convert encrypted (HTTPS) requests on one side to unencrypted (HTTP) requests on
the other
Squid's job is to be both a proxy and a cache. As a proxy, Squid is an intermediary in a web
transaction. It accepts a request from a client, processes that request, and then forwards the
request to the origin server. The request may be logged, rejected, and even modified before
forwarding. As a cache, Squid stores recently retrieved web content for possible reuse later.
Subsequent requests for the same content may be served from the cache, rather than
contacting the origin server again. You can disable the caching part of Squid if you like, but the
proxying part is essential.
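To make the two roles concrete, here is a minimal Python sketch of a caching proxy. It is a toy model only, not how Squid is implemented: the names CachingProxy and fetch_from_origin, the in-memory dictionary, and the "only http:// URLs are allowed" rule are all assumptions made for illustration. It also ignores freshness and expiration, which a real cache must handle.

```python
from typing import Callable, Dict, List, Optional


class CachingProxy:
    """Toy model of a caching proxy; illustrative only, not Squid's design."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 caching_enabled: bool = True) -> None:
        # fetch_from_origin is a hypothetical stand-in for the origin server.
        self.fetch_from_origin = fetch_from_origin
        self.caching_enabled = caching_enabled
        self.cache: Dict[str, bytes] = {}   # URL -> stored response body
        self.log: List[str] = []            # one entry per handled request

    def handle(self, url: str) -> Optional[bytes]:
        # Proxy role: a request may be rejected before it is forwarded.
        # The rule used here is an arbitrary example policy.
        if not url.startswith("http://"):
            self.log.append(f"DENIED {url}")
            return None
        # Cache role: reuse a previously stored response if there is one.
        if self.caching_enabled and url in self.cache:
            self.log.append(f"HIT {url}")
            return self.cache[url]
        # Otherwise forward the request to the origin server.
        body = self.fetch_from_origin(url)
        self.log.append(f"MISS {url}")
        if self.caching_enabled:
            self.cache[url] = body
        return body


if __name__ == "__main__":
    def fake_origin(url: str) -> bytes:
        # Pretend origin server used only for this demonstration.
        return b"response for " + url.encode()

    proxy = CachingProxy(fake_origin)
    proxy.handle("http://example.com/")   # first request goes to the origin
    proxy.handle("http://example.com/")   # second request is served from cache
    print(proxy.log)
```

Constructing the object with caching_enabled=False leaves the proxying path intact while every request goes to the origin, which mirrors the point that caching is optional but proxying is not.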
Figure 1-1. Squid sits between clients and servers
As Figure 1-1 shows, Squid accepts HTTP (and HTTPS) requests from clients, and speaks a
number of protocols to servers. In particular, Squid knows how to talk to HTTP, FTP, and
Gopher servers.[1] Conceptually, Squid has two "sides." The client-side talks to web clients
(e.g., browsers and user-agents); the server-side talks to HTTP, FTP, and Gopher servers. These
are called origin servers, because they are the origin location for the data they serve.