DRBD performance question?

Ravi Wijayaratne ravi_wija at yahoo.com
Wed Mar 6 15:17:40 MST 2002


Bruce,
If protocol A is fully asynchronous, then it should not depend on
the network bandwidth or on the remote server (node 2) I/O
bandwidth at all. Ideally, disk writes at the primary (node 1)
should proceed at a rate somewhat close to node 1's I/O bandwidth.
However, this does not seem to happen with DRBD: as you can see in
all the results shown on the DRBD webpage, throughput is limited
by the network or by the remote server's I/O bandwidth. I got
similar results in my experiments.
If the kernel is not throttling I/Os, then something else must be,
like the barriers for write ordering. What do you think?
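
To make that expectation concrete, here is a rough model (in
Python) of a fully asynchronous mirror. The disk and network
figures are taken from the benchmark output below; the
write-behind buffer size is purely an assumption, not a measured
value:

disk_bw   = 55.0   # MB/s, local "Drbd unconnected" write speed
net_bw    = 3.1    # MB/s, measured network bandwidth
buffer_mb = 64.0   # assumed write-behind buffer; not measured

def writer_throughput(setsize_mb):
    """Effective MB/s the writer sees for a test of setsize_mb."""
    # While the backlog fits in memory the writer runs at disk
    # speed; the backlog grows at (disk_bw - net_bw), so this much
    # data gets written before the buffer fills:
    at_disk_speed = disk_bw * buffer_mb / (disk_bw - net_bw)
    if setsize_mb <= at_disk_speed:
        return disk_bw
    fill_time = buffer_mb / (disk_bw - net_bw)
    drain_time = (setsize_mb - at_disk_speed) / net_bw
    return setsize_mb / (fill_time + drain_time)

for size in (50, 100, 200):
    print(f"{size:4d} MB test -> {writer_throughput(size):5.1f} MB/s")

With the assumed 64MB buffer this gives roughly 8.6 MB/s for the
100MB test, close to the 9.31 MB/s the benchmark reports, and it
predicts better numbers for smaller tests and worse for larger
ones.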

I will run some other benchmarks and let you all know the results.
Meanwhile, if you have any insights into this bottleneck, please
email me.

Ravi 

--- Bruce Walker <bruce at kahuna.cag.cpqcorp.net> wrote:
> Ravi,
>   Not sure if you have the specific reason why the kernel is
> throttling or not, but at some point the kernel must, and after
> that you won't get more than the network b/w in throughput (my
> guess would be VM throttling rather than network data
> structures, but it is only a guess at this point).  To test this
> theory, you could try sending more than the 100MB in the test
> (say 200MB).  I would guess that the throughput would decrease
> below the 9MB/s you saw.  Similarly, if you decreased it to say
> 50MB, you might see a dramatic hike in the throughput (because
> most or all of the data to be sent to the replica could be in
> memory).
> 
> Assuming DRBD started sending data near the beginning of the
> test, you could expect about 30MB to have arrived by the end of
> the roughly 10-second test, leaving the other 70MB to be sent
> asynchronously (note that you have to be careful not to start
> another test while the asynchronous transfer is still ongoing).
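
Spelling out that estimate with the numbers from the benchmark
below (this assumes the link ran at its full measured bandwidth
for the whole test):

test_seconds = 10.74   # protocol A run time for the 100MB test
net_bw = 3.1           # MB/s, measured network bandwidth
sent = net_bw * test_seconds   # ~33 MB actually transferred
backlog = 100.0 - sent         # ~67 MB still queued
print(f"sent ~{sent:.0f} MB, ~{backlog:.0f} MB still queued")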
> 
> Why is network bandwidth only 3MB/s?  3MB/s isn't terribly
> surprising, since that is about 25 megabits, roughly a quarter
> of the link's capacity, which is not atypical utilization for
> ethernet.  To get 8MB/s on a single connection (as Alan
> indicated) would be very impressive.
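
For the record, the unit conversion behind that utilization
figure:

mb_per_s = 3.11              # measured network bandwidth
mbit_per_s = mb_per_s * 8    # bytes -> bits
print(f"{mbit_per_s:.0f} Mbit/s, about a quarter of a 100 Mbit link")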
> 
> Your last test (writes on node 2) is clearly not being
> replicated (you got disk I/O b/w results).
> 
> bruce
> 
> > Alan,
> > 
> > If protocol A is asynchronous, how does network performance so
> > drastically affect the performance of protocol A?
> > 
> > I am seeing similar results in my testing. I have attached the
> > output from the benchmark. I have two identical 1.6GHz
> > machines with 512MB of memory and 100Mb/s NICs. The test was
> > conducted over a point-to-point link. I have a 22GB raid0
> > software RAID set up on two disks, /dev/hda and /dev/hdb. I am
> > running Linux 2.4.17.
> > 
> > So it seems that the benchmark reads from /dev/zero and writes
> > to /dev/null for the network B/W test. The network bandwidth
> > is around 3MB/s. When the actual I/O is done, I seem to be
> > getting about 10MB/s for protocol A on the primary node. If
> > protocol A were asynchronous I should be seeing something
> > close to 50MB/s. The figure I get is not even close.
> > 
> > I have an idea. Because the number of pages allocated for
> > disk I/O is growing, the kernel must be reclaiming pages from
> > the slab caches, limiting the number of buffer_head, skbuff,
> > and request structures in the system. Since DRBD uses all of
> > these structures, we see the kernel throttling the data and
> > limiting I/O performance. This is especially apparent in the
> > network bandwidth test: since these structures and data are
> > consumed faster than disk I/O, the available memory gets
> > depleted sooner and we see an even lower bandwidth for the
> > network.
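
One way to test this theory would be to watch the relevant slab
caches while the benchmark runs. A minimal sketch, assuming the
2.4-style /proc/slabinfo column layout and these two cache names
(both may differ on other kernels):

import time

WATCH = ("buffer_head", "skbuff_head_cache")

def sample():
    """Return {cache: (active_objs, total_objs)} for watched caches."""
    counts = {}
    with open("/proc/slabinfo") as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] in WATCH:
                counts[fields[0]] = (int(fields[1]), int(fields[2]))
    return counts

# Sample once a second; if the theory is right, the totals should
# shrink under memory pressure while the benchmark is writing.
while True:
    for name, (active, total) in sorted(sample().items()):
        print(f"{name:20s} {active:8d} / {total:<8d} objects")
    time.sleep(1)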
> > 
> > Any thoughts?
> > 
> > Ravi        
> > 
> > ------------------------ ==oo== -----------------------
> >  DRBD Benchmark
> >  Version: 0.6.1-pre9 (api:58)
> >  SETSIZE = 100M
> > 
> > Node1:
> >  Linux 2.4.17 i686
> >  bogomips	: 3217.81
> >  Disk write: 54.45 MB/sec (104857600 B / 00:01.836696)
> >  Drbd unconnected: 56.90 MB/sec (104857600 B / 00:01.757340)
> > 
> > Node2:
> >  Linux 2.4.17-xfs i686
> >  bogomips	: 3217.81
> >  Disk write: 51.55 MB/sec (104857600 B / 00:01.939839)
> >  Drbd unconnected: 56.10 MB/sec (104857600 B / 00:01.782465)
> > 
> > Network: 
> >  Bandwidth: 3.11 MB/sec (104857600 B / 00:32.152080)
> >  Latency: round-trip min/avg/max/mdev = 0.072/0.074/0.140/0.012 ms
> > 
> > Drbd connected (writing on node1):
> >  Protocol A: 9.31 MB/sec (104857600 B / 00:10.744729)
> >  Protocol B: 9.38 MB/sec (104857600 B / 00:10.662692)
> >  Protocol C: 7.36 MB/sec (104857600 B / 00:13.582487)
> > 
> > Drbd connected (writing on node2):
> >  Protocol A: 57.68 MB/sec (104857600 B / 00:01.733805)
> >  Protocol B: 58.28 MB/sec (104857600 B / 00:01.715931)
> >  Protocol C: 55.53 MB/sec (104857600 B / 00:01.800703)
> > 
> > 
> > --- Alan Robertson <alanr at unix.sh> wrote:
> > > Ravi Wijayaratne wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I was looking at DRBD performance numbers from
> > > > http://www.complang.tuwien.ac.at/reisner/drbd/performance.html
> > > > 
> > > > From Phillip Reisner's paper I gather that protocol A is
> > > > an asynchronous protocol. Therefore protocol A should not
> > > > severely impact write performance at the primary server.
> > > > Is the above assertion correct?
> > > > 
> > > > If so, how is it that the performance of protocol A on the
> > > > primary side seems to be limited by the network B/W? If
> > > > protocol A is asynchronous we should see a significant
> > > > difference in throughput between protocol A and C.
> > > > However, they seem to be quite close. Is this discrepancy
> > > > caused by write ordering, or is there a hidden bottleneck
> > > > in the protocol?
> > > 
> > > 
> > > First of all, I'm sure that these performance numbers (from
> > > a year ago) were based on 2.2 kernels.  In 2.2, DRBD had to
> > > suffer the disk I/O scheduling twice: once at the DRBD
> > > level, and once at the real disk layer.  So, with 2.4 (where
> > > this is avoided), the numbers look a lot different.
> > > 
> > > As an aside, disks don't write that much compared to their
> > > total bandwidth, and the smart protocol only rarely has to
> > > synchronize.
> > > 
> > > A dedicated 100mbit connection (which is what he tested
> > > with) provides about 8 megabytes/second of writes.  If you
> > > say each write is
> 
=== message truncated ===


=====
------------------------------
Ravi Wijayaratne



