
Re: [xmlblaster] socket reconnect flood

To: xmlblaster at server.xmlBlaster.org
Subject: Re: [xmlblaster] socket reconnect flood
From: "Póka Balázs" <p.balazs at gmail.com>
Date: Mon, 6 Nov 2006 21:05:26 +0100
In-reply-to: <d425539c0610300800i80dddyf1eab6c77c8928a3 at mail.gmail.com>
References: <d425539c0610281605r14497de5oc6ed5feb53bc4f1f at mail.gmail.com> <4544DFC1.9010006 at marcelruff.info> <454522DD.5010201 at marcelruff.info> <d425539c0610300800i80dddyf1eab6c77c8928a3 at mail.gmail.com>
Reply-to: xmlblaster at server.xmlBlaster.org
Sender: owner-xmlblaster at server.xmlBlaster.org

Hi Marcel!

My reconnect flooding problem is getting weirder: it also occurs when there is no apparent problem with the truststore file.
Here is part of a log file (when the bug appears, about 200 MB of this is generated in 5 hours):
(Unfortunately, I forgot to switch on FINER logging for the java.util.logging framework...)

2006-11-06 20:20:07,847 INFO [XmlBlaster.PingTimer] (SocketConnection.java:180) - SSL client socket enabled for socket://******:7608, keyStore=/home/disp/disp/conf/nova_disp/truststore
2006-11-06 20:20:07,846 INFO [XmlBlaster.PingTimer] (SocketConnection.java:180) - SSL client socket enabled for socket://******:7608, keyStore=/home/disp/disp/conf/nova_disp/truststore
2006-11-06 20:20:07,849 WARN [XmlBlaster.PingTimer] (Timeout.java:189) - No connection established, socket://******:7608 still seems to be down after 5239 connection retries.
2006-11-06 20:20:07,858 INFO [XmlBlaster.PingTimer] (SocketConnection.java:180) - SSL client socket enabled for socket://******:7608, keyStore=/home/disp/disp/conf/nova_disp/truststore
2006-11-06 20:20:08,042 INFO [XmlBlaster.PingTimer] (SocketConnection.java:180) - SSL client socket enabled for socket://******:7608, keyStore=/home/disp/disp/conf/nova_disp/truststore
2006-11-06 20:20:08,043 WARN [XmlBlaster.PingTimer] (Timeout.java:189) - No connection established, socket://******:7608 still seems to be down after 2965 connection retries.

The really interesting things to note are:
- The first two messages are in reverse time order. This can only happen if "XmlBlaster.PingTimer" is in fact not ONE thread, but several threads with the same name, all trying to connect at once!
- There is no mention of the truststore not being found. This is still the original, unpatched code running.
- Look at the messages from Timeout.java:189. First there were 5239 connection retries, then 2965 connection retries. This pattern is consistent throughout all 200 MB of logs. Either someone's randomizing numbers :), or there really are separate objects (and threads) trying to connect, each keeping its own count.
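
One way to check the "several threads with the same name" theory directly, rather than inferring it from log order, might be to list the live threads in the client JVM and count those named "XmlBlaster.PingTimer". The following is only a minimal sketch using the standard Thread API; the class name ThreadNameCheck and the method liveThreadsNamed are made up for illustration and are not part of xmlBlaster, and the check would have to run inside the affected client process.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of xmlBlaster.
public class ThreadNameCheck {

    /** Returns all live threads in this JVM whose name equals the given name. */
    public static List<Thread> liveThreadsNamed(String name) {
        List<Thread> matches = new ArrayList<>();
        // getAllStackTraces() returns a map keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (name.equals(t.getName())) {
                matches.add(t);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Thread name taken from the log excerpt above.
        List<Thread> timers = liveThreadsNamed("XmlBlaster.PingTimer");
        System.out.println("live threads with that name: " + timers.size());
        for (Thread t : timers) {
            // The numeric id distinguishes threads that share a name.
            System.out.println(t.getName() + " id=" + t.getId());
        }
    }
}
```

If this reports more than one thread, the differing retry counters would be explained by each thread counting on its own.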

It seems that somewhere in the code there may be a race condition or similar which, once triggered by something, allows a ravaging horde of connect threads to spawn and kill everything in their way. :)
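
To illustrate the kind of race suspected here: if the code that starts a reconnect thread first checks "is one already running?" and then starts one in a separate step, several callers can pass the check at the same time and each spawn a thread. The sketch below is NOT the xmlBlaster code and makes no claim about where the bug actually is; SingleReconnector, startIfIdle and the thread name "Hypothetical.PingTimer" are invented for illustration. It shows one common guard, an atomic compare-and-set, that lets only one caller start the retry loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a guarded reconnect starter, not xmlBlaster code.
public class SingleReconnector {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger started = new AtomicInteger();

    /** Starts the retry loop only if none is active; returns true if this call started it. */
    public boolean startIfIdle(Runnable retryLoop) {
        // compareAndSet is atomic: among concurrent callers, only one flips false -> true.
        if (!running.compareAndSet(false, true)) {
            return false; // a retry loop is already running, do not spawn another
        }
        started.incrementAndGet();
        Thread t = new Thread(() -> {
            try {
                retryLoop.run();
            } finally {
                running.set(false); // allow a later reconnect once this loop has ended
            }
        }, "Hypothetical.PingTimer");
        t.setDaemon(true);
        t.start();
        return true;
    }

    /** Number of retry loops started so far. */
    public int startedCount() {
        return started.get();
    }

    public static void main(String[] args) {
        SingleReconnector r = new SingleReconnector();
        CountDownLatch release = new CountDownLatch(1);
        // Stand-in for a retry loop: it just blocks until released.
        Runnable loop = () -> {
            try { release.await(); } catch (InterruptedException e) { }
        };
        System.out.println(r.startIfIdle(loop)); // first caller starts the loop
        System.out.println(r.startIfIdle(loop)); // second caller is refused while it runs
        release.countDown();
    }
}
```

With an unguarded check-then-start in place of compareAndSet, the second call could also start a thread, and each such thread would keep its own retry counter.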

thanks for help,
Balázs Póka
