DRBD Build Steps

Overview

From http://www.drbd.org/:

"DRBD® refers to block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be understood as network based raid-1."

DRBD takes two similar block storage devices and joins them in a network RAID-1 for HA redundancy. The block devices can be CBS (Cloud Block Storage), VMware vDisks, local in-chassis RAID arrays and so forth. The only requirements are that each node has its own block device of similar size and that a (preferably private) network connects the nodes for replication. As with a traditional RAID-1 array, only one of the block devices is usable (live) at a time - DRBD performs block-level replication between the two devices in one direction only. However, DRBD allows either device to be made Primary (live), and the roles can be shifted back and forth dynamically with a few commands.

Conventions

This wiki article will use 2 groups of cloud servers as examples:

[root@drbd1 ~]# ip a | grep "inet " | grep 192
    inet 192.168.5.4/24 brd 192.168.5.255 scope global eth2
[root@drbd2 ~]# ip a | grep "inet " | grep 192
    inet 192.168.5.2/24 brd 192.168.5.255 scope global eth2

root@drbd3:~# ip a | grep "inet " | grep 192
    inet 192.168.5.1/24 brd 192.168.5.255 scope global eth2
root@drbd4:~# ip a | grep "inet " | grep 192
    inet 192.168.5.3/24 brd 192.168.5.255 scope global eth2

Group 1 - CentOS

  • drbd1 and drbd2
  • CentOS 6.5
  • 20G /dev/xvde block devices

Group 2 - Debian

  • drbd3 and drbd4
  • Debian 7 Stable
  • 20G /dev/xvde block devices


Node Prep

A working DRBD setup requires at a minimum:

  • 2x servers with similar block devices
  • DRBD kernel module and userspace utilities
  • Private network between the servers
  • iptables TCP port 7788 open between the servers on the Private network
  • /etc/hosts configured
  • NTP synchronized

For future growth, LVM should be used underneath the DRBD implementation; the underlying PV/VG/LV can then be grown and the DRBD device ("resource") resized online with the drbdadm resize command.
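As a sketch of that growth path - assuming the vgdata00/drbd00 layout built later in this article, a resource named cent00, and new space already added to the volume group - the online grow would look roughly like this (run the lvextend on both nodes first, the resize on one node, and the filesystem grow on the Primary only):

lvextend -L +10G /dev/vgdata00/drbd00
drbdadm resize cent00
resize2fs /dev/drbd0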

Timing is critical to proper operation - ensure NTP is configured properly on both nodes.
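A quick sanity check on each node is to confirm the daemon has selected a peer (marked with an asterisk in the output):

ntpq -p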

Hosts file

We need to ensure that the two servers can find each other on the private network, as is typical with any type of cluster build. When initializing the resource below, the drbdadm tool matches the node's hostname against what's in the resource configuration file, so it's important they align. In our examples the servers are in the .local domain; their FQDN hostnames are properly configured as drbdX.local, as expected, and a quick check is shown after the Debian hosts file below.

CentOS

/etc/hosts
192.168.5.4     drbd1.local
192.168.5.2     drbd2.local

Debian

/etc/hosts
192.168.5.1     drbd3.local
192.168.5.3     drbd4.local
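Since drbdadm matches the node's own hostname against the on <hostname> stanzas in the resource files, a quick sanity check on each node (drbd3 shown as an assumed example) is to confirm the hostname and its resolution line up:

uname -n
getent hosts drbd3.local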

IPTables

We'll add a basic rule to allow all communication on the private 192.168.5.0/24 subnet between the nodes. This can be tuned to be more granular as required; a tighter per-port example is shown after the Debian commands below.

CentOS

# vi /etc/sysconfig/iptables

...
-A INPUT -s 192.168.5.0/24 -j ACCEPT
...

# service iptables restart

Debian

# apt-get update; apt-get install iptables-persistent
# vi /etc/iptables/rules.v4

...
-A INPUT -s 192.168.5.0/24 -j ACCEPT
...

# service iptables-persistent restart
# insserv iptables-persistent
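If a tighter rule is preferred instead of the subnet-wide ACCEPT, restricting traffic to the DRBD port from the partner node only would look something like the following (drbd2's address shown as an example - adjust the source address per node and distribution):

-A INPUT -s 192.168.5.2/32 -p tcp -m tcp --dport 7788 -j ACCEPT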

Software Installation

CentOS

CentOS requires the http://www.elrepo.org/ RPM packages, which provide the DKMS-based kernel module and the userspace toolset.

rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
yum repolist
yum install drbd83-utils kmod-drbd83 dkms lvm2 ntp ntpdate
service ntpd restart && chkconfig ntpd on
reboot

Note that this installation will pull in kernel-devel and gcc (for DKMS) and a few device-mapper packages (for LVM2).

Debian

The Debian 7 kernel includes the drbd.ko module as a stock item; all that's needed is to install the userspace toolset on all nodes.

apt-get update
apt-get install --no-install-recommends drbd8-utils lvm2 ntp ntpdate
service ntp restart && insserv -v ntp
reboot

Note that without --no-install-recommends apt will install perl and other tools.
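On either distribution, after the reboot it's worth confirming the kernel module loads and reports the expected 8.3.x version before continuing:

modprobe drbd
cat /proc/drbd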


Storage Prep

Create a single volume group and logical volume from the storage on each node in the cluster, but do not create a filesystem - that comes later.

parted -s -- /dev/xvde mktable gpt
parted -s -- /dev/xvde mkpart primary ext3 2048s 100%
parted -s -- /dev/xvde set 1 lvm on

pvcreate /dev/xvde1
vgcreate vgdata00 /dev/xvde1
lvcreate -l 100%VG -n drbd00 vgdata00
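A quick check that the layout came out as intended on each node (sizes will reflect the underlying block device):

pvs
vgs
lvs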


DRBD Resource Prep

Common Settings

The file /etc/drbd.d/global_common.conf exists on both nodes. Because the default content varies from release to release, it's best to edit the file provided rather than create a new one over top of it. In general you will most likely want to disable usage-count (to stop reporting to the online usage counter) and set the syncer rate; any changes to this file and its default options should be researched to provide optimum settings for the platform DRBD is being deployed on.

TODO: provide common default configurations of the global settings for various scenarios

Cloud Example

/etc/drbd.d/global_common.conf
global { usage-count no; }
common {
  syncer { rate 10M; }
}

Resource Settings

We create configuration files on both nodes that tie the two servers together with their new storage - as a best practice, the name of the file should match the name of the resource. As we're building two different clusters on the same IP subnet, we'll be careful to name the resources uniquely to prevent any chance of collision at runtime. A shared secret was generated using pwgen as shown below.
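The shared secret used in the examples below was generated with something along the lines of the following (any sufficiently random string works):

pwgen -s 12 1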

CentOS

/etc/drbd.d/cent00.res
resource cent00 {
  protocol C;
  startup { wfc-timeout 0; degr-wfc-timeout 120; }
  disk { on-io-error detach; }
  net { cram-hmac-alg "sha1"; shared-secret "m9bTmbsK4quE"; }
  on drbd1.local {
    device /dev/drbd0;
    disk /dev/vgdata00/drbd00;
    meta-disk internal;
    address 192.168.5.4:7788;
  }
  on drbd2.local {
    device /dev/drbd0;
    disk /dev/vgdata00/drbd00;
    meta-disk internal;
    address 192.168.5.2:7788;
  }
}

Debian

/etc/drbd.d/deb00.res
resource deb00 {
  protocol C;
  startup { wfc-timeout 0; degr-wfc-timeout 120; }
  disk { on-io-error detach; }
  net { cram-hmac-alg "sha1"; shared-secret "m9bTmbsK4quE"; }
  on drbd3.local {
    device /dev/drbd0;
    disk /dev/vgdata00/drbd00;
    meta-disk internal;
    address 192.168.5.1:7788;
  }
  on drbd4.local {
    device /dev/drbd0;
    disk /dev/vgdata00/drbd00;
    meta-disk internal;
    address 192.168.5.3:7788;
  }
}
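Before initializing anything, the configuration on each node can be sanity-checked by having drbdadm parse it and dump it back out (deb00 shown here; use cent00 on the CentOS pair) - syntax errors will be reported instead of the dump:

drbdadm dump deb00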


DRBD Resource Init

On both nodes, the drbdadm tool is used to initialize the resource metadata. After the initialization and service start, the synchronization is kicked off on one node only. We then track the progress of the sync - in our example, we'll use drbd1 as the CentOS primary and drbd4 as the Debian primary to show how it works from either node.

CentOS

Create the resource, start the service and start the sync:

[root@drbd1 ~]# drbdadm create-md cent00
[root@drbd2 ~]# drbdadm create-md cent00

[root@drbd1 ~]# service drbd start; chkconfig drbd on
[root@drbd2 ~]# service drbd start; chkconfig drbd on

[root@drbd1 ~]# drbdadm -- --overwrite-data-of-peer primary cent00

Check progress:

[root@drbd1 ~]# cat /proc/drbd 
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:1124352 nr:0 dw:0 dr:1125016 al:0 bm:68 lo:0 pe:1 ua:0 ap:0 ep:1 wo:f oos:19842524
    [>...................] sync'ed:  5.4% (19376/20472)M
    finish: 0:31:21 speed: 10,536 (10,312) K/sec

[root@drbd1 ~]# drbdadm -- status cent00
<drbd-status version="8.3.16" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="cent00" cs="SyncSource" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="Inconsistent" resynced_percent="14.6" />
</resources>
</drbd-status>

Debian

Create the resource, start the service and start the sync:

root@drbd3:~# drbdadm create-md deb00
root@drbd4:~# drbdadm create-md deb00

root@drbd3:~# service drbd start; insserv drbd
root@drbd4:~# service drbd start; insserv drbd

root@drbd4:~# drbdadm -- --overwrite-data-of-peer primary deb00

Check progress:

root@drbd4:~# cat /proc/drbd 
version: 8.3.11 (api:88/proto:86-96)
srcversion: F937DCB2E5D83C6CCE4A6C9 
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:808960 nr:0 dw:0 dr:809624 al:0 bm:49 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:20157788
    [>....................] sync'ed:  3.9% (19684/20472)M
    finish: 0:32:40 speed: 10,264 (10,112) K/sec

root@drbd4:~# drbdadm -- status deb00
<drbd-status version="8.3.13" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="deb00" cs="SyncSource" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="Inconsistent" resynced_percent="15.0" />
</resources>
</drbd-status>

Filesystem Build

Remember this is not a shared filesystem - you can only format, mount, etc. on the Primary node.

It's best to wait until the initial synchronization is complete. Use one of the methods below to watch for 100% completion; the sync can take a while depending on the size of the block storage and the speed of the network between the nodes.

cat /proc/drbd
drbdadm -- status <resource>  ('cent00' or 'deb00' in these examples)
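For a continuous view, the proc file can simply be wrapped in watch:

watch -n5 cat /proc/drbd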

A properly completed sync shows both disk states (ds1, ds2) as UpToDate, like so:

[root@drbd1 ~]# drbdadm -- status cent00
<drbd-status version="8.3.16" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="cent00" cs="Connected" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="UpToDate" />
</resources>
</drbd-status>

root@drbd4:~# drbdadm -- status deb00
<drbd-status version="8.3.13" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="deb00" cs="Connected" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="UpToDate" />
</resources>
</drbd-status>

After that's complete, the normal methodology is used to format and mount the device node defined in the resource config. Typically - as listed in the config - the first resource is /dev/drbd0, the second /dev/drbd1, and so forth. If you've lost track, the devfs tree can help you with a simple ls:

[root@drbd1 ~]# ls -l /dev/drbd/by-res/
total 0
lrwxrwxrwx 1 root root 11 Jul 18 19:03 cent00 -> ../../drbd0

root@drbd4:~# ls -l /dev/drbd/by-res/
total 0
lrwxrwxrwx 1 root root 11 Jul 18 19:03 deb00 -> ../../drbd0

Double-check which node is the Primary - this is the ro1 and ro2 information shown with the drbdadm status command as above. Notice in this example that drbd1 shows ro1=Primary and drbd2 shows ro1=Secondary - we know we should be formatting, mounting, etc. on drbd1 once synchronization is complete. On the Debian nodes, we see that drbd4 shows ro1=Primary as expected.

[root@drbd1 ~]# drbdadm -- status cent00
<drbd-status version="8.3.16" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="cent00" cs="Connected" ro1="Primary" ro2="Secondary" ds1="UpToDate" ds2="UpToDate" />
</resources>
</drbd-status>

[root@drbd2 ~]# drbdadm -- status cent00
<drbd-status version="8.3.16" api="88">
<resources config_file="/etc/drbd.conf">
<resource minor="0" name="cent00" cs="Connected" ro1="Secondary" ro2="Primary" ds1="UpToDate" ds2="UpToDate" />
</resources>
</drbd-status>

We'll use a standard ext4 filesystem for this build on both CentOS and Debian:

mkfs.ext4 -v -m0 /dev/drbd0
mkdir /data
mount /dev/drbd0 /data
df -h

They look like what you'd expect:

[root@drbd1 ~]# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/drbd0       20G  172M   20G   1% /data

root@drbd4:~# df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/drbd0       20G  172M   20G   1% /data
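Note that /dev/drbd0 should not be auto-mounted at boot via fstab in the usual way - only the current Primary can open the device, so an automatic mount will fail on the Secondary. If an fstab entry is wanted for convenience, a noauto entry along these lines (mount point /data as used above) is the safer sketch:

/etc/fstab
/dev/drbd0    /data    ext4    defaults,noauto    0 0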


Testing

Test that the resource can be made active and mounted on the Secondary node. We'll write a test file, unmount, demote the Primary to Secondary, mount on the partner node and check the test file.

CentOS

[root@drbd1 ~]# touch /data/test.file
[root@drbd1 ~]# umount /data
[root@drbd1 ~]# drbdadm secondary cent00

[root@drbd2 ~]# drbdadm primary cent00
[root@drbd2 ~]# mkdir /data
[root@drbd2 ~]# mount /dev/drbd0 /data
[root@drbd2 ~]# ls -l /data/
total 16
drwx------ 2 root root 16384 Jul 18 19:41 lost+found
-rw-r--r-- 1 root root     0 Jul 18 19:47 test.file

Debian

root@drbd4:~# touch /data/test.file
root@drbd4:~# umount /data
root@drbd4:~# drbdadm secondary deb00

root@drbd3:~# drbdadm primary deb00
root@drbd3:~# mkdir /data
root@drbd3:~# mount /dev/drbd0 /data
root@drbd3:~# ls -l /data/
total 16
drwx------ 2 root root 16384 Jul 18 19:41 lost+found
-rw-r--r-- 1 root root     0 Jul 18 19:47 test.file


HA Failover

TODO: Build this section


References