Opened 15 years ago
Closed 13 years ago
#142 closed enhancement (fixed)
Add monitoring for failure of the backend network
| Reported by: | mitchb | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | internals | Keywords: | sipb-noc |
| Cc: | | | |
Description
We don't presently have a Nagios test that will alert us if there's a failure of the backend network switch, or the backend interface on an individual server. All the probes for sql.mit.edu will still pass because they run over the public network.
We should use some plugin to run a 'select 1;' or something similarly trivial on each scripts server.
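A minimal sketch of what such a check could look like, written as the core of a Nagios-style plugin. The function name `check_trivial_query` and the `run_query` callable are hypothetical; the callable stands in for whatever database client would actually be used, and is assumed to return the single value produced by the query. Only the exit codes (0 = OK, 2 = CRITICAL) follow the standard Nagios plugin convention.

```python
from typing import Callable, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_trivial_query(run_query: Callable[[str], object]) -> Tuple[int, str]:
    """Run a trivial query and map the outcome to a Nagios status.

    `run_query` is a hypothetical stand-in for a real database client call.
    It is assumed to take a SQL string and return the single resulting value.
    """
    try:
        result = run_query("select 1;")
    except Exception as exc:
        # Any connection or query error is treated as an outage.
        return CRITICAL, f"SQL CRITICAL - query failed: {exc}"
    if result == 1:
        return OK, "SQL OK - select 1 returned 1"
    return CRITICAL, f"SQL CRITICAL - unexpected result: {result!r}"
```

A real plugin would print the returned message and exit with the returned code. Note that this only proves the database is reachable by some route; it does not by itself prove the query travelled over the backend network.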
Change History (4)
comment:1 Changed 14 years ago by adehnert
- Keywords sipb-noc added
comment:2 Changed 13 years ago by adehnert
- Resolution set to fixed
- Status changed from new to closed
comment:3 Changed 13 years ago by quentin
- Resolution fixed deleted
- Status changed from closed to reopened
This isn't good enough; if the routes over the backend interface disappear, we will happily talk to sql over the frontend network and not notice the outage.
Unfortunately, it doesn't look like check_ping supports specifying an interface to check from. I guess we could pretend and ping the backend IP of sql instead.
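As a rough illustration of the ping idea, here is a sketch of a custom check that wraps the system `ping` rather than check_ping. It assumes a Linux iputils `ping`, whose `-I` option selects the source interface and which exits 0 when a reply is received and 1 when none is. The address `192.0.2.10` and the interface name `eth1` used in the usage note are placeholders, not the real backend values, and all function names are hypothetical.

```python
import subprocess
from typing import List, Optional, Tuple

# Standard Nagios plugin exit codes used here.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def build_ping_argv(address: str, interface: Optional[str] = None,
                    count: int = 1, timeout_s: int = 2) -> List[str]:
    """Build the ping command line (assumes iputils ping on Linux)."""
    argv = ["ping", "-c", str(count), "-W", str(timeout_s)]
    if interface is not None:
        # -I forces the probe out of the given interface.
        argv += ["-I", interface]
    argv.append(address)
    return argv


def status_from_ping_exit(returncode: int) -> Tuple[int, str]:
    """Map ping's exit status to a Nagios status and message."""
    if returncode == 0:
        return OK, "PING OK - backend address replied"
    if returncode == 1:
        return CRITICAL, "PING CRITICAL - no reply from backend address"
    return UNKNOWN, f"PING UNKNOWN - ping exited with status {returncode}"


def check_backend_ping(address: str,
                       interface: Optional[str] = None) -> Tuple[int, str]:
    """Ping the given address and return a Nagios status and message."""
    proc = subprocess.run(build_ping_argv(address, interface),
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return status_from_ping_exit(proc.returncode)
```

For example, `check_backend_ping("192.0.2.10", interface="eth1")` would probe the placeholder address through the placeholder interface. Calling it without `interface` corresponds to the "pretend" approach of simply pinging the backend IP.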
comment:4 Changed 13 years ago by adehnert
- Resolution set to fixed
- Status changed from reopened to closed
Fixed in r2192.


Fixed (see sipb-nagios commit 7d9206eae4e48824e0203d1ce19c4563f9bb664b and scripts r2190).