Issue5108

Title: NTP errors
Priority: normal
Status: chatting
Superseder:
Nosy List: ajit, dan, dasu, rader, radtke, wcmaier
Assigned To:
Topic:
Group: IT

Created 2008.02.13 07:53 by wcmaier.
Last changed 2008.02.15 09:33 by wcmaier.

Messages
msg13658 From: rader To: ajit, dan, dasu, help, hostmaster, rader, radtke, wcmaier Date: 2008.02.14 13:45
Bill et al,

Ntp.hep.wisc.edu just jumped 28801 seconds into the past.  I stopped
ntpd, and ntpdate tried to adjust its clock 29297 seconds into the
future and failed.  And now I'm watching ntpd report wildly differing
offsets.  So clearly the system is sick and this is our problem.

Sorry for the bother.

steve
--
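
For scale, the two figures in the message above can be converted to hours, minutes and seconds. The sketch below is only illustrative arithmetic; the helper name hms is made up for this note. The roughly eight-minute gap between the two figures is in the same range as the 481 to 489 second offsets reported earlier in this issue, though nothing in the thread establishes a connection.

```python
def hms(seconds: int) -> str:
    """Format a whole number of seconds as H:MM:SS (sign is dropped)."""
    hours, rem = divmod(abs(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

backward_jump = 28801  # seconds the clock jumped into the past
ntpdate_step = 29297   # seconds ntpdate tried to step into the future

print(hms(backward_jump))                 # 8:00:01
print(hms(ntpdate_step))                  # 8:08:17
print(hms(ntpdate_step - backward_jump))  # 0:08:16
```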

 > ---- Original Message ----
 > From: Hostmaster <hostmaster@doit.wisc.edu>
 > Steve,
 > 
 >   Is ntp.hep.wisc.edu still experiencing syncing problems?  Due to a 
 > logging issue we don't have historical data that we can look at right 
 > now.  We are working with the POST sysadmins to correct this.  That 
 > being said we don't see any problems at the host level and it seems to 
 > be in sync with outside sources as shown below:
 > 
 > ntpdate -q 128.104.30.17
 >  
 > server 128.104.30.17, stratum 2, offset -0.004851, delay 0.02597
 > 14 Feb 11:48:09 ntpdate[11550]: adjust time server 128.104.30.17 offset -0.004851 sec
 > 
 > 
 > [from 128.104.30.17]
 > ntpq> peer
 >      remote           refid      st t when poll reach   delay   offset  jitter
 > ==============================================================================
 > -starfish.doit.w clock.xmission.  2 u   32  128   36    0.790  -21.757   0.326
 >  tang.doit.wisc. clock.via.net    2 u   12  128    3    0.749    5.539   0.372
 > -ntp.okstate.edu .USNO.           1 u   22  128  377   49.716  -10.423   0.373
 > *time-B.timefreq .ACTS.           1 u   90  128  377   77.397   -2.261   0.171
 > +time.nist.gov   .ACTS.           1 u   22  128  377   76.858   -2.488   1.334
 > +rrcs-64-183-55- .GPS.            1 u   18  128  377   89.275   -6.993   1.328
 > 
 > We will continue to look into the situation, if there is anything else 
 > we can do to assist you please let us know. 
 > 
 > Thanks,
 > DoIT Hostmasters (Bill Foster)
msg13655 From: rader To: ajit, dan, dasu, noc, rader, radtke, wcmaier Date: 2008.02.13 16:03
Dear DoIT NOC,

One of our NTP servers (ntp.hep.wisc.edu aka oregano) is having 
problems sync'ing to 128.104.30.17 (aka ntp1.doit.wisc.edu).
It seems to be ~8 seconds out of whack.  Our network monitoring 
indicates the problem first happened at 2249 Monday.

Our other NTP server (ntp2.hep.wisc.edu) doesn't have this problem.

Please investigate and report back.  Thanks.

steve
--

bash$ grep ntp /var/log/messages
Feb 12 11:46:03 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 0.394867 s
Feb 12 21:07:49 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.921594 s
Feb 12 23:37:08 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.245922 s
Feb 13 00:57:19 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.137410 s
Feb 13 11:07:09 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.004741 s

bash$ date; ntpdc -c peer oregano
Wed Feb 13 15:50:15 CST 2008
***Warning changing the request packet size from 160 to 48
     remote           local      st poll reach  delay   offset    disp
=======================================================================
=caesar.cs.wisc. 128.104.28.199   2   64    3 0.00063  9.070315 7.89149
=taylor.cs.wisc. 128.104.28.199   2   64    7 0.03136  9.070904 3.89621
=dr-zaius.cs.wis 128.104.28.199   2   64    7 0.00056  1.070421 3.90781
=dogie.macc.wisc 128.104.28.199   2   64    7 0.00053 481.07828 3.90765
=dns.doit.wisc.e 128.104.28.199   2   64    7 0.00046  9.092417 3.90775
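
The step events in the syslog excerpt above can be pulled out mechanically. The following is a minimal sketch, assuming the xntpd line format is exactly as pasted; the names STEP_RE, parse_steps and the 1-second cutoff are made up for this note and are not part of any NTP tool.

```python
import re

# Log lines copied from the grep output above.
LOG = """\
Feb 12 11:46:03 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 0.394867 s
Feb 12 21:07:49 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.921594 s
Feb 12 23:37:08 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.245922 s
Feb 13 00:57:19 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.137410 s
Feb 13 11:07:09 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.004741 s
"""

# Timestamp, host, daemon tag, then the step size in seconds.
STEP_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ xntpd\[\d+\]: "
    r".*time reset \(step\) (-?\d+\.\d+) s$"
)

def parse_steps(text):
    """Return (timestamp, step_seconds) for every 'time reset (step)' line."""
    steps = []
    for line in text.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append((match.group(1), float(match.group(2))))
    return steps

steps = parse_steps(LOG)
# Assumed cutoff: treat any step larger than 1 second as suspicious.
large = [(ts, size) for ts, size in steps if abs(size) > 1.0]
for ts, size in large:
    print(ts, size)
```

With the lines above this reports the four steps of roughly 8 to 9 seconds, starting at Feb 12 21:07:49.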
msg13654 From: rader To: ajit, dan, dasu, rader, radtke, wcmaier Date: 2008.02.13 15:53
 > Nagios has been reporting NTP problems for the past 24 hours, though
 > I didn't see anything obvious in /var/log/*.

Really?

Looks like something's wrong with dogie.macc.wisc.edu.  (Note its
offset: 481 s / 60 = ~8 minutes.)

We're not seeing "that problem" with rosemary (aka basil) so something
odd's going on.

I'll email the noc anyway.

steve
--

oregano(rader): grep ntp /var/log/messages
Feb 12 11:46:03 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 0.394867 s
Feb 12 21:07:49 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.921594 s
Feb 12 23:37:08 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 8.245922 s
Feb 13 00:57:19 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.137410 s
Feb 13 11:07:09 oregano xntpd[284]: [ID 774427 daemon.notice] time reset (step) 9.004741 s

ginseng(rader): date; ntpdc -c peer oregano
Wed Feb 13 15:50:15 CST 2008
***Warning changing the request packet size from 160 to 48
     remote           local      st poll reach  delay   offset    disp
=======================================================================
=caesar.cs.wisc. 128.104.28.199   2   64    3 0.00063  9.070315 7.89149
=taylor.cs.wisc. 128.104.28.199   2   64    7 0.03136  9.070904 3.89621
=dr-zaius.cs.wis 128.104.28.199   2   64    7 0.00056  1.070421 3.90781
=dogie.macc.wisc 128.104.28.199   2   64    7 0.00053 481.07828 3.90765
=dns.doit.wisc.e 128.104.28.199   2   64    7 0.00046  9.092417 3.90775
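
One way to spot the odd peer in the ntpdc output above is to compare each offset with the median across peers. This is a rough sketch only: parse_peers and TOLERANCE_S are hypothetical names, the 60-second tolerance is an arbitrary assumption, and the column positions are taken from the table as pasted.

```python
from statistics import median

# ntpdc peer rows copied from the output above.
PEERS = """\
=caesar.cs.wisc. 128.104.28.199   2   64    3 0.00063  9.070315 7.89149
=taylor.cs.wisc. 128.104.28.199   2   64    7 0.03136  9.070904 3.89621
=dr-zaius.cs.wis 128.104.28.199   2   64    7 0.00056  1.070421 3.90781
=dogie.macc.wisc 128.104.28.199   2   64    7 0.00053 481.07828 3.90765
=dns.doit.wisc.e 128.104.28.199   2   64    7 0.00046  9.092417 3.90775
"""

TOLERANCE_S = 60.0  # assumed: flag peers more than a minute from the median

def parse_peers(text):
    """Map each peer name to its offset in seconds (seventh column)."""
    offsets = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skip headers, separators and blank lines
        offsets[fields[0].lstrip("=")] = float(fields[6])
    return offsets

offsets = parse_peers(PEERS)
typical = median(offsets.values())
outliers = [name for name, off in offsets.items()
            if abs(off - typical) > TOLERANCE_S]
print(outliers)  # ['dogie.macc.wisc']
```

A tighter tolerance would also flag dr-zaius, whose offset sits about 8 seconds away from the other three peers.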

  
 > ----- Forwarded message from Steve Rader <rader@noc.hep.wisc.edu> -----
 > 
 > From: Steve Rader <rader@noc.hep.wisc.edu>
 > Date: Wed, 13 Feb 2008 04:29:50 -0600
 > To: wcmaier@hep.wisc.edu
 > Subject: PROBLEM: oregano's NTP
 > X-Spambayes-Classification: ham; 0.00
 > 
 > oregano's NTP service is CRITICAL as of Wed Feb 13 04:29.
 > 
 > The plugin reported:
 > 
 >   NTP CRITICAL: Server Error and offset -489.354884 sec  +/- 120 sec
 > 
 > ----- End forwarded message -----
 > 
 > -- 
 > 
 > o--------------------------{ Will Maier }--------------------------o
 > | jabber:...wcmaier@xmpp.lfod.us | email:..will.maier@hep.wisc.edu |
 > | office:...........608.263.9692 | cell:..............608.438.6162 |
 > *--------------------[ UW High Energy Physics ]--------------------*
 > 
 > ----------
 > group: IT
 > messages: 13644
 > nosy: ajit, dan, dasu, rader, radtke, wcmaier
 > priority: triage
 > status: unread
 > title: NTP errors
 > 
 > ______________________________________
 > UW-HEP Help System <help@hep.wisc.edu>
 > <https://help.hep.wisc.edu/issue5108>
 > ______________________________________
msg13644 From: wcmaier To: ajit, dan, dasu, rader, radtke, wcmaier Date: 2008.02.13 07:53
Nagios has been reporting NTP problems for the past 24 hours, though
I didn't see anything obvious in /var/log/*.

----- Forwarded message from Steve Rader <rader@noc.hep.wisc.edu> -----

From: Steve Rader <rader@noc.hep.wisc.edu>
Date: Wed, 13 Feb 2008 04:29:50 -0600
To: wcmaier@hep.wisc.edu
Subject: PROBLEM: oregano's NTP
X-Spambayes-Classification: ham; 0.00

oregano's NTP service is CRITICAL as of Wed Feb 13 04:29.

The plugin reported:

  NTP CRITICAL: Server Error and offset -489.354884 sec  +/- 120 sec

----- End forwarded message -----

-- 

o--------------------------{ Will Maier }--------------------------o
| jabber:...wcmaier@xmpp.lfod.us | email:..will.maier@hep.wisc.edu |
| office:...........608.263.9692 | cell:..............608.438.6162 |
*--------------------[ UW High Energy Physics ]--------------------*
History
Date                 User     Action  Args
2008-02-15 09:33:35  wcmaier  set     priority: triage -> normal; assignedto: rader
2008-02-14 13:45:54  rader    set     messages: + msg13658
2008-02-13 16:03:01  rader    set     messages: + msg13655
2008-02-13 15:53:35  rader    set     status: unread -> chatting; messages: + msg13654
2008-02-13 07:53:37  wcmaier  create