James Turnbull
Pro Nagios 2.0
6099_FM_final.qxd 3/16/06 10:38 PM Page i
Pro Nagios 2.0
Copyright © 2006 by James Turnbull
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or by any information storage or retrieval
system, without the prior written permission of the copyright owner and the publisher.
ISBN-13: 978-1-59059-609-8
ISBN-10: 1-59059-609-9
Printed and bound in the United States of America 9 8 7 6 5 4 3 2 1
Trademarked names may appear in this book. Rather than use a trademark symbol with every occurrence
of a trademarked name, we use the names only in an editorial fashion and to the benefit of the trademark
owner, with no intention of infringement of the trademark.
Lead Editor: Jim Sumser
Technical Reviewer: Justin Kulikowski
Editorial Board: Steve Anglin, Dan Appleman, Ewan Buckingham, Gary Cornell, Jason Gilmore,
Jonathan Hassell, James Huddleston, Chris Mills, Matthew Moodie, Dominic Shakeshaft,
Jim Sumser, Matt Wade
Project Manager: Elizabeth Seymour
Copy Edit Manager: Nicole LeClerc
Copy Editor: Liz Welch
Assistant Production Director: Kari Brooks-Copony
Production Editor: Kelly Winquist
Compositor: Linda Weidemann, Wolf Creek Press
Proofreader: Nancy Sixsmith
Indexer: John Collin
Artist: Kinetic Publishing Services, LLC
Cover Designer: Kurt Krames
Manufacturing Director: Tom Debolski
Distributed to the book trade worldwide by Springer-Verlag New York, Inc., 233 Spring Street, 6th Floor,
New York, NY 10013. Phone 1-800-SPRINGER, fax 201-348-4505, e-mail , or visit .
For information on translations, please contact Apress directly at 2560 Ninth Street, Suite 219, Berkeley,
CA 94710. Phone 510-549-5930, fax 510-549-5939, e-mail , or visit .
The information in this book is distributed on an “as is” basis, without warranty. Although every precaution
has been taken in the preparation of this work, neither the author(s) nor Apress shall have any liability to
any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly
by the information contained in this work.
The source code for this book is available to readers at
in the Source Code section.
To my parents, whose love of books and writing inspired me to write
Contents at a Glance
About the Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
About the Technical Reviewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
■CHAPTER 1 Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
■CHAPTER 2 Basic Object Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
■CHAPTER 3 Security and Administration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
■CHAPTER 4 Using the Web Console . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
■CHAPTER 5 Monitoring Hosts and Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
■CHAPTER 6 Advanced Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
■CHAPTER 7 Advanced Object Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
■CHAPTER 8 Distributed Monitoring, Redundancy, and Failover . . . . . . . . . . . . . 269
■CHAPTER 9 Integrating Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
■CHAPTER 10 Developing Plug-ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
■INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Contents
About the Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
About the Technical Reviewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
■CHAPTER 1 Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Positioning the Server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Choosing Software and Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Capacity Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Redundancy and Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Installing the Nagios Software. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Prerequisites for Software Installation . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Installing the Nagios Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Installing the Nagios Plug-ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Configuring Your Web Server for Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Basic Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Virtual Server Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Configuring Your Web Server with an RPM Installation. . . . . . . . . . . 25
Restarting Apache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Testing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Mailing Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
■CHAPTER 2 Basic Object Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
How Does Nagios Work? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
How Is Nagios Configured? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Getting Started with Your Configuration . . . . . . . . . . . . . . . . . . . . . . . 32
Specifying Your Configuration Files . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Defining Nagios Configuration Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Defining Your First Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Defining the Hostname and Address . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Parents, Host Groups, and Contact Groups. . . . . . . . . . . . . . . . . . . . . 38
Checking the Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Notifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Flapping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Event Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Retention of Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
State Stalking, Obsession, and Performance Data . . . . . . . . . . . . . . 52
Defining Services. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Basic Service Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Service Checking. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Service Status and Notifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Service Flapping and Event Handling. . . . . . . . . . . . . . . . . . . . . . . . . . 65
Service Stalking and Obsession . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Other Directives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Using Templates for Objects Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Contact Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Grouping Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Host Group Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Service Group Objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Contact Group Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Defining Time Periods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Defining Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Check Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Event Handler Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Notification Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
■CHAPTER 3 Security and Administration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
General Security Guidelines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Do Not Run Nagios As the root User. . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Securing and Administering for External Commands . . . . . . . . . . . . 88
Securing the Web Console . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Web Console Authentication with Apache. . . . . . . . . . . . . . . . . . . . . . 91
Nagios Authentication and Authorization . . . . . . . . . . . . . . . . . . . . . . . 97
Nagios Administration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Starting and Stopping the Nagios Server. . . . . . . . . . . . . . . . . . . . . . 101
Nagios init Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Checkpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
■CHAPTER 4 Using the Web Console . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
General . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Monitoring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Tactical Monitoring Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Service Detail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Host Detail. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Host and Service Group Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Process Information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Scheduling Queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Other Items in the Monitoring Menu. . . . . . . . . . . . . . . . . . . . . . . . . . 133
Reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
The Availability Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
The Event Log Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Checkpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
■CHAPTER 5 Monitoring Hosts and Services . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Introduction to Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Monitoring Hosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Monitoring Services. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Local Unix Monitoring. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Monitoring Network-Based Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Remote Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Monitoring via NRPE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
Monitoring via SSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Monitoring via SNMP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Monitoring Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
NSClient++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Books. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
■CHAPTER 6 Advanced Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Macros. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
On-Demand Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Macros As Environmental Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Event Handlers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Notifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Sending Notifications via Instant Messenger . . . . . . . . . . . . . . . . . . 217
Notification Aggregation and Suppression . . . . . . . . . . . . . . . . . . . . 220
External Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Processing Checks Results with External Commands. . . . . . . . . . . 224
External Commands for Adaptive Monitoring . . . . . . . . . . . . . . . . . . 225
Performance Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Processing Performance Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Using Performance Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Checkpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
■CHAPTER 7 Advanced Object Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . 249
Host and Service Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Service Dependencies. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Service Dependency Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Host Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
Notification Escalations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Service Notification Escalations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Service Escalation Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Host Escalations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Extended Host and Service Information Definitions. . . . . . . . . . . . . . . . . . 265
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
■CHAPTER 8 Distributed Monitoring, Redundancy, and Failover . . . . . . . . . . . . . 269
Distributed Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Distributed Server Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Central Server Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
Redundancy and Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Configuring the Master Server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Configuring the Slave Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
Failover Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
■CHAPTER 9 Integrating Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
syslog-NG and Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Installing the Remote Host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Configuring the Nagios Server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Wrapping Up. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Snort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
Configuring Snort for Integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Configuring syslog-NG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
Configuring Nagios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Wrapping Up. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Integrating with MRTG, Cacti, and Related Tools. . . . . . . . . . . . . . . . . . . . 317
Querying MRTG Log Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
Querying RRD Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
SNMP Traps and Nagios. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Receiving SNMP Traps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
Sending SNMP Traps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Books. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
MIB Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
■CHAPTER 10 Developing Plug-ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Writing Your First Plug-in . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Writing Perl Plug-ins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
Other Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Specifying Threshold Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Specifying Performance Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Commands and Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Plug-in Timeouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Command-Line Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Other Guidelines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Nagios Event Broker . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Helloworld . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
NDO Utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Other Sources of Information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Checkpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Resources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
■INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
About the Author
■JAMES TURNBULL is a senior consultant with pure-play security consultancy B-Sec in
Melbourne, Australia. He was previously an IT&T security manager at the Commonwealth Bank
of Australia. James is an experienced infrastructure architect with a background in Linux/Unix,
AS/400, Windows, and storage systems. He has been involved in security consulting,
infrastructure security design, SLA and support services design, and business application
support. He has a strong interest in security metrics and measurement.
About the Technical Reviewer
■JUSTIN KULIKOWSKI is a student at Pennsylvania State University pursuing his BS in
information sciences and technology. He takes a particular interest in backend administration,
database-driven applications, security, and automation. Justin is active in the open source
community, and fulfills various freelance requests ranging from long-term server
administration to short-term installation, configuration, and troubleshooting.
When not computing, Justin can be found performing in the Penn State Blues Band,
where he plays mellophone. In his free time, Justin remains active on campus by developing
applications that benefit students. Examples of this include a website built to notify students
of class openings, and an e-commerce website built to provide a means for students to order
products from local grocery food chains. To learn more, visit his website at
www.jpk236.com.
Acknowledgments
Lucinda Mora—for her understanding and patience
Ruth Brown—for her friendship and support
Jim Sumser—for letting me do it all again
Dennis Matotek, Feodor Frukhtman, and Mark Chandler—for their comments and support
Mark Ferlatte—for his notification throttling script
Seva Gluschenko—for his check_rrd plug-in
Introduction
You are an IT manager in charge of numerous systems spread across multiple countries.
It’s 4:00 a.m. and you are in bed asleep. Your cell phone rings. It’s the Help Desk calling to tell
you that users in Indonesia can’t access their email. You get up, dial into work, and start to
diagnose the problem. After an hour of work you identify that the issue is disk space on the
mail server in the Indonesian office. You clear some free space, confirm the users can access
their mail, and go back to sleep.
This is a common scenario in the IT industry. IT staff are geographically and temporally
separated from the systems and applications they manage. All troubleshooting, monitoring,
and management of systems and applications occurs remotely. The systems and applications
being managed are complex and hugely configurable. They are also made up of multiple
components: hardware, software, networking devices, networks, and supporting infrastructure
like environmental and electrical systems. In the event of a problem, many of these
components need to be checked in order to eliminate them as a cause.
All this has resulted in monitoring, management, and troubleshooting becoming increasingly
complicated and time consuming. No longer do IT professionals have time to individually
review every log, every setting, and every variable on all the systems and applications they are
responsible for. They need tools to automatically monitor the characteristics of the assets they
are responsible for managing . . . tools that will detect anomalies, failures, or performance
issues and alert IT staff via email, a pager, or an SMS message . . . tools that can automatically
perform actions, such as restarting a service, in response to events they have detected. These
types of tools perform functions generally known as enterprise management.
So what advantages does an enterprise management solution offer? Well, let’s say that as
an IT manager you have deployed an enterprise management solution. With this solution in
place, let’s revisit our troubleshooting scenario. Instead of a call from the Help Desk, this
time your cell phone beeps to indicate it has received an SMS message. You read the message:
4/21/05 04:54AM C:\ drive on server INDOEXCH01 has 1% of free space remaining.
You get up, dial into work, connect to the INDOEXCH01 server, clear some free space,
confirm the server is functional, and then go back to sleep. Total elapsed time? Ten minutes.
Now instead of your having to diagnose the problem, eliminate all the possible variables,
review log files, and test multiple components, the actual root cause of the issue is presented
directly to you.
In this book I am going to introduce the popular open source enterprise management
tool Nagios. At the time of this writing, version 2.0 has just been released as the stable
production version. This book takes advantage of this release to provide an introduction to Nagios
and how you can use it to streamline, manage, and monitor your IT assets.
What Is Nagios?
So what is Nagios?¹ It is an open source, Unix-based enterprise monitoring package with
a web-based front-end or console. Nagios can monitor assets like servers, network devices,
and applications, essentially any device or service that has an address and can be contacted
via TCP/IP. It can monitor hosts running Microsoft Windows, Unix/Linux, Novell NetWare,
and other operating systems. It can be configured to work through firewalls, VPN tunnels,
across SSH tunnels, and via the Internet.
Nagios can monitor a variety of attributes on your assets. These can range from operating
system attributes such as CPU, disk, and memory usage to the status of applications, files, and
databases. You can use a variety of network protocols, including HTTP, SNMP, and SSH, to
conduct this monitoring. Nagios can also receive SNMP traps, and you can build and easily
integrate your own custom monitoring checks using a variety of languages, including C, Perl,
and shell scripts.
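
As a rough illustration of what such a custom check can look like, here is a minimal sketch in Python of a free-disk-space check. It assumes only the widely used plug-in convention of printing one status line and returning an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The function names, the message wording, and the 20 percent and 10 percent thresholds are hypothetical choices made for this example, not values taken from Nagios or from this book.

```python
import shutil
import sys

# Exit codes conventionally returned by Nagios plug-ins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_free_space(percent_free, warn=20.0, crit=10.0):
    """Map a free-space percentage to a (status code, label) pair.

    The warn/crit thresholds are illustrative values, not Nagios defaults.
    """
    if percent_free < crit:
        return CRITICAL, "CRITICAL"
    if percent_free < warn:
        return WARNING, "WARNING"
    return OK, "OK"


def check_disk(path="/", warn=20.0, crit=10.0):
    """Return (status code, one-line message) for the filesystem holding path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        # Some pseudo-filesystems report no capacity at all.
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    percent_free = 100.0 * usage.free / usage.total
    code, label = classify_free_space(percent_free, warn, crit)
    return code, f"DISK {label} - {percent_free:.1f}% free on {path}"


def main(argv):
    """Print the status line and return the exit code for the given path."""
    path = argv[1] if len(argv) > 1 else "/"
    code, message = check_disk(path)
    print(message)
    return code
```

To run this as a stand-alone check, the script would end with `sys.exit(main(sys.argv))` so that the status is passed back as the process exit code. Keeping the threshold logic in its own function makes it easy to test without touching a real filesystem.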
The Nagios tool is capable of being deployed in a distributed model with multiple servers
collecting data about your assets and reporting it to a central server (which is ideal for
organizations with disparate geographical locations that are controlled from a central site or
network operations center). Nagios can also be configured as a robust redundant monitoring
infrastructure that is capable of disaster recovery and failover modes of operation.
Nagios is developed by a single developer, Ethan Galstad, as an open source software
project. This means that time between releases can be extensive but the overall product
tends to be carefully and extensively tested. Additionally, many of the newer features tend
to be released and stay in CVS versions of the package for long periods of time. These
releases, while usually fairly stable, are generally not recommended for production use.
Who Should Read This Book?
This book is an introductory guide to Nagios 2.0. It presumes no prior knowledge of Nagios
and indeed focuses entirely (with minor digressions) on the current version of Nagios. It is
designed for system administrators, operations managers, IT managers, and support staff
who need to deploy a tool to monitor and report on IT assets and applications.
The book starts from scratch and introduces how to install Nagios, build your basic
monitoring configuration, and use the Nagios web console. It then covers some advanced
topics such as escalations, dependencies, distributed monitoring, how to integrate Nagios
with other tools, and how to develop your own monitoring checks. At the end of this book,
you should have the ability to deploy and monitor using Nagios, to implement redundancy
or failover capabilities with Nagios, to integrate Nagios into other tools such as MRTG and
syslog-NG—and you should have an idea of where to look for additional knowledge and
resources that might answer more advanced questions and issues.
The book presumes some experience with Unix and Windows platforms, as you will need
to install and configure Nagios on a Unix-based host, and you will need to configure monitor-
ing for your remote hosts and devices. This assumed prior knowledge includes the following:
1. Nagios is a recursive acronym for “Nagios Ain’t Gonna Insist On Sainthood.” It was previously known
as “NetSaint.”
• Knowledge of TCP/IP networking
• Some knowledge of firewalls, including iptables
• Some exposure to the Apache web server
• Ability to install and run software on Unix and Windows hosts
• The ability to use editors and command-line tools on Unix and Windows hosts
■Note If you wish to develop your own monitoring checks, you will need some knowledge of program-
ming. Some commonly used languages for this purpose are C, Perl, Python, and shell script. See Chapter 10
for further details.
What’s in This Book?
• Chapter 1, “Installation,” deals with installing the Nagios server and its associated pre-
requisites, including a web server for the console.
• Chapter 2, “Basic Object Configuration,” covers basic configuration of your monitoring
environment and explains how the Nagios object model and object template system
function.
• Chapter 3, “Security and Administration,” focuses on how to administer and secure
Nagios servers and includes information on securing the web console, understanding
general Nagios security, starting and stopping the Nagios daemon, and handling logs.
• Chapter 4, “Using the Web Console,” deals with the Nagios web console.
• Chapter 5, “Monitoring Hosts and Services,” addresses how to use Nagios to monitor
your hosts and services. This includes monitoring by SSH, using SNMP and a variety
of other methods. It includes details on how to monitor Unix and Windows hosts.
• Chapter 6, “Advanced Commands,” covers advanced use of Nagios command objects,
including macros, event handlers, advanced notifications, external commands, and
performance data.
• Chapter 7, “Advanced Object Configuration,” focuses on some of the objects that
weren’t covered in Chapter 2. These include notification escalations, host and service
dependencies, and extended host and service information.
• Chapter 8, “Distributed Monitoring, Redundancy, and Failover,” demonstrates how to
configure a distributed monitoring model to allow you to distribute your monitoring
load and to monitor hosts you may not be able to directly connect to because of
network structure or segmentation. It also shows how to use Nagios in redundant and
failover modes to enhance the resiliency and availability of your Nagios solution.
• Chapter 9, “Integrating Nagios,” looks at how you can integrate Nagios with a number
of other tools, including syslog-NG, SNMP, and MRTG.
• Chapter 10, “Developing Plug-ins,” examines plug-in development and includes
details on developing plug-ins in shell script and Perl. The chapter also covers the
Nagios Event Broker, an integration engine and interface that allows integration with
tools such as databases.
Installation
The first stage in deploying Nagios is installing the software and any required infrastructure.
The Nagios installation process can be complicated, and you must follow a number of steps to
ensure all the correct components are installed. This chapter takes you through those steps and
explains several of the possible installation options and models from which you can choose.
By the end of this chapter you should have a good understanding of where to position
your Nagios server or servers in your environment to best monitor all the required assets. You
will also have some guidelines for selecting and sizing your Nagios hardware and choosing
your operating system software. I will also cover the steps you need to take to install the Nagios
server and the plug-ins. Finally, I’ll demonstrate how to set up and configure a web server to
run the Nagios web console.
Positioning the Server
Before you actually install the software itself, let’s briefly look at where to locate your Nagios
servers. Where you deploy your Nagios server(s) is an important part of your Nagios imple-
mentation. I’ll briefly cover the broad issues involved in server deployment here to make you
aware of them. I’ll also go into these issues in more detail in later chapters when I look at
monitoring hosts through firewalls and when I examine how to deploy Nagios in distributed
and failover configurations.
1
First, Nagios uses Transmission Control Protocol/Internet Protocol (TCP/IP) to monitor
hosts and devices. Thus, you need to deploy your Nagios server or servers where they have
network visibility of the hosts and devices that you need to monitor. If you have firewalls,
network links, network segregation, or filtering devices between your Nagios server(s)
and the hosts to be monitored, then you may not have the visibility of the hosts required to
monitor them. For example, if you rely on Internet Control Message Protocol (ICMP) pings
to monitor the presence of a host, the intervening network devices must allow ICMP traffic to
traverse them.
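Before committing to a server location, it can help to test this visibility from the candidate host. The following is a minimal sketch in Python, not part of Nagios itself. It probes TCP ports rather than sending ICMP pings, because ICMP usually needs raw sockets and elevated privileges. The function names `is_reachable` and `visibility_report` are hypothetical names chosen for this illustration.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def visibility_report(targets, probe=is_reachable):
    """Map each (host, port) target to True or False using the given probe."""
    return {target: probe(*target) for target in targets}
```

Run from the proposed Nagios server, a call such as `visibility_report([("web01.example.com", 80), ("db01.example.com", 22)])` (the host names are placeholders) returns a dictionary showing which targets that location can see. A `False` result only tells you the TCP port was not reachable; it does not say whether a firewall, a routing problem, or a stopped service is responsible.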
If this network visibility is not available, you may need to deploy an additional server or
multiple additional servers to monitor those hosts. The best deployment model is to place the
additional server(s) in a distributed configuration where remote Nagios servers send the results
of checks back to a central server. This means you only need to monitor one web console and
have only one set of notification infrastructure to maintain. This configuration does require
the ability to manipulate the firewall or other network device between the central server and
1. See Chapters 5 and 8.
the distributed Nagios servers to allow the check and results traffic to traverse the network.
You can see this configuration in Figure 1-1.
2
If you cannot manipulate the intervening firewall or network device to allow this traffic
through, then these servers may need to be configured as independent Nagios servers that
monitor any hosts that are not visible, collect any events detected on them, and send the
related notifications.
This does complicate your management regime, as each independently configured Nagios
server would have its own web console that would need to be monitored and potentially its
own notification infrastructure. I have demonstrated this configuration in Figure 1-2.
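The choice between these deployment models reduces to two questions: can a central server see the hosts, and if not, can the intervening firewall be changed? The following Python sketch restates that reasoning as code. It is an illustration only, and the names `Deployment` and `choose_deployment` are hypothetical, not part of Nagios.

```python
from enum import Enum


class Deployment(Enum):
    SINGLE = "single central server"
    DISTRIBUTED = "remote servers reporting to a central server"
    INDEPENDENT = "independent servers with their own consoles"


def choose_deployment(central_can_see_hosts: bool, can_open_firewall: bool) -> Deployment:
    """Pick a deployment model following the reasoning in this section."""
    if central_can_see_hosts:
        # The central server has network visibility, so no extra servers are needed.
        return Deployment.SINGLE
    if can_open_firewall:
        # Remote servers can send check results back through the firewall.
        return Deployment.DISTRIBUTED
    # No path back to the central server: each server stands alone.
    return Deployment.INDEPENDENT
```

For example, `choose_deployment(False, True)` returns `Deployment.DISTRIBUTED`, the model shown in Figure 1-1, while `choose_deployment(False, False)` returns `Deployment.INDEPENDENT`, the model shown in Figure 1-2.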
2. I discuss distributed configurations in Chapter 8.
Figure 1-1. Nagios servers in a distributed configuration
Figure 1-2. Independent Nagios servers