DRBD performance question ?

Ravi Wijayaratne ravi_wija@yahoo.com
Tue, 5 Mar 2002 15:39:53 -0800 (PST)


Alan,

If protocol A is asynchronous, how can network
performance affect the throughput of protocol A
so drastically ?

I am seeing similar results in my testing. I have
attached the output from the benchmark. I have two
identical 1.6GHz machines with 512MB memory and
100Mb/s NICs. The test was conducted over a
point-to-point link. I have a 22GB software RAID0 set
up on two disks, /dev/hda and /dev/hdb. I am running
Linux 2.4.17.

So it seems that the benchmark reads from /dev/zero
and writes to /dev/null for the network B/W test. The
network bandwidth is around 3MB/s. When the actual I/O
is done I seem to be getting about 10MB/s for protocol
A on the primary node. If protocol A was asynchronous
I should be seeing something close to 50MB/s. The
figure I get is not even close.
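
To put those figures side by side, here is a rough sketch in Python. It assumes the benchmark's "MB" means 2^20 bytes and ignores all Ethernet/IP/TCP overhead, so the link ceiling it prints is an upper bound, not a measurement. The function name is only for illustration.

```python
# Raw line rate of the NIC versus the throughput figures quoted in this message.
MIB = 1024 * 1024  # assumption: the benchmark reports MB as 2**20 bytes


def link_ceiling_mib_per_s(link_mbit_per_s: float) -> float:
    """Raw line rate in MiB/s, ignoring all protocol overhead."""
    return link_mbit_per_s * 1_000_000 / 8 / MIB


ceiling = link_ceiling_mib_per_s(100)  # 100Mb/s NIC
figures = {
    "network B/W test": 3.11,          # MB/sec, from the benchmark output
    "protocol A (node1)": 9.31,
    "local disk write (node1)": 54.45,
}

print(f"link ceiling: {ceiling:.2f} MB/sec")
for label, rate in figures.items():
    print(f"{label}: {rate:.2f} MB/sec = {rate / ceiling:.0%} of the ceiling")
```

By this arithmetic protocol A on node1 sits below the raw link rate but well above the measured network bandwidth, while the local disk alone is several times faster than the link could ever carry.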

I have an idea. Because the number of pages allocated
for disk I/O is growing, the kernel must be reclaiming
pages from the slab caches, limiting the number of
buffer_head, skbuff and request structures in the
system. Since DRBD uses all of these structures, the
kernel ends up throttling the data and limiting
I/O performance. This is especially apparent in the
test for network bandwidth. Since in that test these
structures and data are consumed faster than with disk
I/O, the available memory gets depleted sooner and we
see an even lower bandwidth for the network.
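
One way to check this guess would be to sample the slab caches while the benchmark runs. Below is a minimal sketch in Python, assuming /proc/slabinfo is readable and that each data line starts with the cache name followed by active objects, total objects and object size. The cache names buffer_head and skbuff_head_cache are assumptions and may differ between kernel versions; both function names are hypothetical.

```python
def parse_slabinfo(text, names=("buffer_head", "skbuff_head_cache")):
    """Return {cache_name: (active_objs, num_objs, objsize)} for the chosen caches.

    Assumes the first three numeric columns of a data line are
    active objects, total objects and object size.
    """
    wanted = set(names)
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in wanted:
            continue  # header line or a cache we are not interested in
        try:
            active, total, size = (int(f) for f in fields[1:4])
        except ValueError:
            continue  # unexpected layout, skip rather than guess
        result[fields[0]] = (active, total, size)
    return result


def read_slabinfo(path="/proc/slabinfo"):
    """Read and parse the slab statistics (may require root on some systems)."""
    with open(path) as f:
        return parse_slabinfo(f.read())
```

Calling read_slabinfo() every second or so during a protocol A run and watching whether the object counts shrink as the page cache grows would support or rule out the idea.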

Any thoughts ?

Ravi        

------------------------ ==oo== -----------------------
 DRBD Benchmark
 Version: 0.6.1-pre9 (api:58)
 SETSIZE = 100M

Node1:
 Linux 2.4.17 i686
 bogomips	: 3217.81
 Disk write: 54.45 MB/sec (104857600 B / 00:01.836696)
 Drbd unconnected: 56.90 MB/sec (104857600 B / 00:01.757340)

Node2:
 Linux 2.4.17-xfs i686
 bogomips	: 3217.81
 Disk write: 51.55 MB/sec (104857600 B / 00:01.939839)
 Drbd unconnected: 56.10 MB/sec (104857600 B / 00:01.782465)

Network: 
 Bandwidth: 3.11 MB/sec (104857600 B / 00:32.152080)
 Latency: round-trip min/avg/max/mdev = 0.072/0.074/0.140/0.012 ms

Drbd connected (writing on node1):
 Protocol A: 9.31 MB/sec (104857600 B / 00:10.744729)
 Protocol B: 9.38 MB/sec (104857600 B / 00:10.662692)
 Protocol C: 7.36 MB/sec (104857600 B / 00:13.582487)

Drbd connected (writing on node2):
 Protocol A: 57.68 MB/sec (104857600 B / 00:01.733805)
 Protocol B: 58.28 MB/sec (104857600 B / 00:01.715931)
 Protocol C: 55.53 MB/sec (104857600 B / 00:01.800703)
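
The MB/sec figures in this output can be recomputed from the byte count and elapsed time on each line. A small sketch in Python, assuming the elapsed field is minutes:seconds and that MB means 2^20 bytes (both inferred from the numbers, not from the benchmark source). The helper names are only for illustration.

```python
MIB = 1024 * 1024  # assumption: MB in the output means 2**20 bytes


def parse_elapsed(mmss: str) -> float:
    """Convert an 'MM:SS.ffffff' field to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + float(seconds)


def mib_per_s(nbytes: int, elapsed: str) -> float:
    """Throughput in MiB/s for nbytes transferred in the given elapsed time."""
    return nbytes / parse_elapsed(elapsed) / MIB


SETSIZE = 104857600  # bytes, as printed on every line of the output
runs = {
    "Network bandwidth": "00:32.152080",
    "Protocol A (node1)": "00:10.744729",
    "Protocol C (node1)": "00:13.582487",
    "Protocol A (node2)": "00:01.733805",
}
for label, elapsed in runs.items():
    print(f"{label}: {mib_per_s(SETSIZE, elapsed):.2f} MB/sec")
```

Under those assumptions the recomputed values match the printed ones to two decimals, so the low figures are not a reporting or rounding problem.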


--- Alan Robertson <alanr@unix.sh> wrote:
> Ravi Wijayaratne wrote:
> 
> > Hi,
> >
> > I was looking at DRBD performance numbers from
> > http://www.complang.tuwien.ac.at/reisner/drbd/performance.html
> >
> > From Philipp Reisner's paper I gather that protocol A
> > is an asynchronous protocol. Therefore protocol A
> > should not severely impact the write performance at
> > the primary server. Is the above assertion correct ?
> >
> > If so, how is it that the performance of protocol
> > A on the primary side seems to be limited by the
> > network B/W ? If protocol A is asynchronous we should
> > see a significant difference in throughput between
> > protocol A and C. However they seem to be quite close.
> > Is this discrepancy caused by write ordering or is
> > there a hidden bottleneck in the protocol ?
> 
> 
> First of all I'm sure that these performance numbers
> (from a year ago) were based on 2.2 kernels.  In 2.2,
> DRBD had to suffer the disk I/O scheduling twice: once
> at the DRBD level, and once at the real disk layer.  So,
> with 2.4 (where this is avoided), the numbers look a
> lot different.
>
> As an aside, disks don't write that much compared to
> their total bandwidth, and the smart protocol only
> rarely has to synchronize.
>
> A dedicated 100mbit connection (which is what he
> tested with) provides about 8 megabytes/second writes.
> If you say each write is a 2kbyte block, then that's
> nearly 4k block writes per second.  Commonly read rates
> significantly outweigh write rates.  I have measured 10
> to 1 ratios on general purpose development systems.
> That would mean that it would have to be doing 44k
> block I/Os per second, which is a VERY amazing speed
> for most PC-based I/O subsystems.
>
> If you have a very fast disk device like a ramdisk,
> or an expensive RAID controller with battery-backed up
> write cache, I would expect to see DRBD impact
> performance some - even in 2.4 kernels.
> 
> 	-- Alan Robertson
> 	   alanr@unix.sh
> 
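
The block-rate arithmetic in the quoted reply can be reproduced as follows. This is only a sketch in Python: it assumes binary units throughout (1 kbyte = 1024 bytes, 1 megabyte = 1024 kbytes, and "k" = 1024 in the I/O counts), which is my reading of the figures and not something the reply states.

```python
KIB = 1024
MIB = 1024 * KIB

link_write_rate = 8 * MIB    # "about 8 megabytes/second" over the 100mbit link
block_size = 2 * KIB         # "each write is a 2kbyte block"
reads_per_write = 10         # the "10 to 1" read/write ratio

writes_per_s = link_write_rate // block_size   # block writes per second
reads_per_s = reads_per_write * writes_per_s   # block reads per second
total_ios_per_s = writes_per_s + reads_per_s   # combined block I/O rate

print(f"writes/s: {writes_per_s} ({writes_per_s / KIB:.0f}k)")
print(f"total block I/Os per second: {total_ios_per_s} ({total_ios_per_s / KIB:.0f}k)")
```

With these assumptions the link sustains 4k block writes per second, and a workload with ten reads per write would need 44k block I/Os per second before the link became the limit.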


=====
------------------------------
Ravi Wijayaratne
