[Linux-HA] Problem with function send_ordered_nodemsg

Audet, Jean-Michel Jean-Michel.Audet at ca.Kontron.com
Wed Jun 18 09:04:47 MDT 2008


Hi,
	I sent this message before and never got any feedback.  Here is my problem.

I have heartbeat 2.1.3 (the same problem occurs with 2.1.2).
I am using a Master/Slave model.

I am using the heartbeat communication link to transfer data between 2 nodes.  The payload is state information and data.  Since message size is limited over Ethernet, I transfer the payload as multiple 8 KB chunks, up to 1 MB in total (approx. 120 * 8 KB).
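
To make the transfer scheme concrete, here is a minimal sketch of the chunking described above. It is written in Python only for brevity (heartbeat's client API is C) and it does not call send_ordered_nodemsg or any other heartbeat function. The names CHUNK_SIZE, split_into_chunks and reassemble are hypothetical, and tagging each chunk with a sequence number and a total count is an assumption, not something stated in this post.

```python
# Hypothetical sketch: split a payload into 8 KB chunks and rebuild it.
# Nothing here is heartbeat API; the chunk header (seq, total) is an assumption.

CHUNK_SIZE = 8 * 1024  # 8 KB per message, the chunk size mentioned in the post


def split_into_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (seq, total, chunk) tuples covering the whole payload in order."""
    total = (len(payload) + chunk_size - 1) // chunk_size  # ceiling division
    for seq in range(total):
        start = seq * chunk_size
        yield seq, total, payload[start:start + chunk_size]


def reassemble(chunks) -> bytes:
    """Rebuild the payload from (seq, total, chunk) tuples received in any order."""
    parts = sorted(chunks, key=lambda c: c[0])
    if not parts:
        return b""
    total = parts[0][1]
    # Every sequence number from 0 to total - 1 must be present exactly once.
    if [p[0] for p in parts] != list(range(total)):
        raise ValueError("missing or duplicate chunk")
    return b"".join(p[2] for p in parts)
```

With this chunk size, a full 1 MB (1024 KB) payload splits into 128 chunks, which is in line with the approximate figure given above. On the sending side, each tuple would be handed to whatever send call is in use, one message per chunk.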

The problem is that after a number of data sets (maybe 300 or 400, sometimes more, sometimes less, but always eventually), the function send_ordered_nodemsg hangs and I am no longer able to transfer data.  From the debug information, it looks like it hangs in the function socket_resume_io_read.

I have tried both unicast and broadcast.

According to Dejan, it may be that I am pushing the heartbeat communication layer to its limit.  I am a little surprised that 1 MB of data can be a problem.

I am stuck now and I need a solution, because my application is not usable and I may have to look at another HA package (I really don't want to).

Any input or suggestions will be greatly appreciated.
Would it be better to consider creating a separate communication link (client/server)?

Jean-Michel Audet
Kontron Canada 


-----Original Message-----
From: Audet, Jean-Michel
Sent: Thursday, June 05, 2008 11:13 AM
To: 'General Linux-HA mailing list'
Subject: Problem with function send_ordered_nodemsg


Hi, 
	I currently have a problem with my software: it hangs when I call the function send_ordered_nodemsg (sendnodemsg exhibits the same problem).  I am able to send many messages (many dozens), and then it hangs.  With extra debugging, I found that it hangs somewhere in the function socket_resume_io_read.  I based my code on the CIB implementation.


I am asking for any help in finding the problem.  I know that the CIB uses this function, so I think the problem is on my side, or I don't know exactly how to use it, but I have been trying to find this problem for many days now.

Maybe somebody has some experience with this function and has hit the same problem before.

Any help will be more than appreciated.

Jean-Michel Audet

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cib.xml
Type: text/xml
Size: 2733 bytes
Desc: cib.xml
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20080618/1d11cf29/cib.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ha.cf
Type: application/octet-stream
Size: 243 bytes
Desc: ha.cf
Url : http://lists.community.tummy.com/pipermail/linux-ha/attachments/20080618/1d11cf29/ha.obj

