{"id":205,"date":"2017-03-21T12:36:16","date_gmt":"2017-03-21T11:36:16","guid":{"rendered":"http:\/\/owncloud.gonscak.sk\/?p=205"},"modified":"2017-03-21T12:36:16","modified_gmt":"2017-03-21T11:36:16","slug":"how-to-set-up-drbd-primary-primary-mode-on-proxmox-4-x","status":"publish","type":"post","link":"https:\/\/www.gonscak.sk\/?p=205","title":{"rendered":"how to set up drbd primary-primary mode on proxmox 4.x"},"content":{"rendered":"<p>Today, I met an interesting problem: I tried to create a primary-primary (dual-primary) DRBD cluster on Proxmox.<br \/>\nFirst, we need a fully configured Proxmox two-node cluster, as described here:<br \/>\nhttps:\/\/pve.proxmox.com\/wiki\/Proxmox_VE_4.x_Cluster<br \/>\nWe also need a correct <em>\/etc\/hosts<\/em> so that the node names resolve to IPs:<\/p>\n<pre>root@cl3-amd-node1:\/etc\/drbd.d# cat \/etc\/hosts\n127.0.0.1 localhost.localdomain localhost\n192.168.1.104 cl3-amd-node1 pvelocalhost\n192.168.1.108 cl3-amd-node2<\/pre>\n<pre>root@cl3-amd-node2:\/etc\/drbd.d# cat \/etc\/hosts\n127.0.0.1 localhost.localdomain localhost\n192.168.1.104 cl3-amd-node1\n192.168.1.108 cl3-amd-node2 pvelocalhost<\/pre>\n<p>One server was built on a hardware RAID controller (PCI-E LSI 9240-4i, \/dev\/sdb) and the second on software RAID via mdadm (\/dev\/md1), both on Debian Jessie with the Proxmox packages installed. 
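A quick way to sanity-check that the hosts files shown above agree on the name-to-IP mapping is a small lookup helper; this is a minimal sketch where a here-doc stands in for the real \/etc\/hosts:

```shell
# Parse hosts-file formatted input (a stand-in here for /etc/hosts) and
# print the IP for a given hostname or alias.
lookup() {
    awk -v host="$1" '$0 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == host) print $1 }' <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.1.104 cl3-amd-node1 pvelocalhost
192.168.1.108 cl3-amd-node2
EOF
}

lookup cl3-amd-node1   # 192.168.1.104
lookup cl3-amd-node2   # 192.168.1.108
```

On the real nodes you would simply feed \/etc\/hosts to the same awk, or use getent hosts, and compare the results between the two servers.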
So the backend for the DRBD device was hardware RAID on one side and software RAID on the other. We must create two partitions with exactly the same size (in sectors):<\/p>\n<pre>root@cl3-amd-node1:\nfdisk -l \/dev\/sdb\nDisk \/dev\/sdb: 1.8 TiB, 1998998994944 bytes, 3904294912 sectors\nUnits: sectors of 1 * 512 = 512 bytes\nSector size (logical\/physical): 512 bytes \/ 4096 bytes\nDevice\u00a0\u00a0\u00a0\u00a0 Boot Start\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 End\u00a0\u00a0\u00a0 Sectors\u00a0\u00a0 Size Id Type\n\/dev\/sdb1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 2048 1953260927 1953258880 931.4G 83 Linux<\/pre>\n<pre>root@cl3-amd-node2:\nfdisk -l \/dev\/md1\nDisk \/dev\/md1: 931.4 GiB, 1000069595136 bytes, 1953260928 sectors\nUnits: sectors of 1 * 512 = 512 bytes\nSector size (logical\/physical): 512 bytes \/ 512 bytes\nDevice\u00a0\u00a0\u00a0\u00a0 Boot Start\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 End\u00a0\u00a0\u00a0 Sectors\u00a0\u00a0 Size Id Type\n\/dev\/md1p1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 2048 1953260927 1953258880 931.4G 83 Linux<\/pre>\n<p>Now we need a dedicated direct network between the servers for the DRBD replication traffic, which will be very heavy. 
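Before creating the DRBD resource, it is worth double-checking that the two backing partitions from the fdisk listings above really are identical in sectors. A minimal sketch with the values hard-coded (on a live node you could fill them in with, e.g., $(blockdev --getsz \/dev\/sdb1)):

```shell
# Sector counts taken from the fdisk output above; on real nodes these
# would come from blockdev --getsz on each backing partition.
node1_sectors=1953258880   # /dev/sdb1  on cl3-amd-node1
node2_sectors=1953258880   # /dev/md1p1 on cl3-amd-node2

if [ "$node1_sectors" -eq "$node2_sectors" ]; then
    echo "sizes match: $((node1_sectors * 512)) bytes"
else
    echo "size mismatch - fix the partitions before continuing" >&2
fi
```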
I use a bond of two gigabit network cards:<\/p>\n<pre>#cl3-amd-node1:\ncat \/etc\/network\/interfaces\nauto bond0\niface bond0 inet static\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 address\u00a0 192.168.5.104\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 netmask\u00a0 255.255.255.0\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 slaves eth2 eth1\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 bond_miimon 100\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 bond_mode balance-rr<\/pre>\n<pre>#cl3-amd-node2:\ncat \/etc\/network\/interfaces\nauto bond0\niface bond0 inet static\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 address\u00a0 192.168.5.108\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 netmask\u00a0 255.255.255.0\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 slaves eth1 eth2\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 bond_miimon 100\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 bond_mode balance-rr<\/pre>\n<p>We can test the speed of this link with the iperf package:<\/p>\n<pre>apt-get install iperf<\/pre>\n<p>Start an iperf server on one node:<\/p>\n<pre>#cl3-amd-node2\niperf -s -p 888<\/pre>\n<p>And from the other node, connect to it for 20 seconds:<\/p>\n<pre>#cl3-amd-node1\niperf -c 192.168.5.108 -p 888 -t 20\n#and the result\n------------------------------------------------------------\nClient connecting to 192.168.5.108, TCP port 888\nTCP window size: 85.0 KByte (default)\n------------------------------------------------------------\n[\u00a0 3] local 192.168.5.104 port 49536 connected with 192.168.5.108 port 888\n[ ID] Interval\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Transfer\u00a0\u00a0\u00a0\u00a0 Bandwidth\n[\u00a0 3]\u00a0 0.0-20.0 sec\u00a0 4.39 GBytes\u00a0 1.88 Gbits\/sec<\/pre>\n<p>As we can see, the bonded network built from two gigabit cards delivers almost 2 Gbps.<br \/>\nNow we can continue with installing and setting up the DRBD resource.<\/p>\n<pre>apt-get install drbd-utils 
drbdmanage<\/pre>\n<p><em>All aspects of DRBD are controlled in its configuration file, \/etc\/drbd.conf. Normally, this configuration file is just a skeleton with the following contents:<\/em><br \/>\n<em>include &#8220;\/etc\/drbd.d\/global_common.conf&#8221;;<\/em><br \/>\n<em>include &#8220;\/etc\/drbd.d\/*.res&#8221;;<\/em><br \/>\nThe simplest configuration is:<\/p>\n<pre>cat \/etc\/drbd.d\/global_common.conf\nglobal {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 usage-count yes;\n}\ncommon {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 net {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 protocol C;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }\n}<\/pre>\n<p>And the configuration of the resource itself, which must be identical on both nodes:<\/p>\n<pre>root@cl3-amd-node1:\/etc\/drbd.d# cat \/etc\/drbd.d\/r0.res\nresource r0 {\ndisk {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 c-plan-ahead 15;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 c-fill-target 24M;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 c-min-rate 90M;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 c-max-rate 150M;\n}\nnet {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 protocol C;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 allow-two-primaries yes;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 data-integrity-alg md5;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 verify-alg md5;\n}\non cl3-amd-node1 {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 device \/dev\/drbd0;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 disk \/dev\/sdb1;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 address 192.168.5.104:7789;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 meta-disk internal;\n}\non cl3-amd-node2 {\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 device \/dev\/drbd0;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 disk \/dev\/md1p1;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 address 192.168.5.108:7789;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 meta-disk internal;\n}\n}<\/pre>\n<p>The file \/etc\/drbd.d\/r0.res on cl3-amd-node2 is a verbatim copy of the one above.<\/p>\n<p>Now we must create and initialize the DRBD metadata on the backing devices, on both nodes:<\/p>\n<pre>drbdadm create-md r0\n#answer yes to destroy possible data on devices<\/pre>\n<p>Now we can start the drbd service on both nodes:<\/p>\n<pre>root@cl3-amd-node2:\/etc\/drbd.d# \/etc\/init.d\/drbd start\n[ ok ] Starting drbd (via systemctl): drbd.service.\nroot@cl3-amd-node1:\/etc\/drbd.d# \/etc\/init.d\/drbd start\n[ ok ] Starting drbd (via systemctl): drbd.service.\n<\/pre>\n<p>Or we can bring the resource up directly, on both nodes:<\/p>\n<pre>drbdadm up r0<\/pre>\n<p>Both devices show up as Inconsistent, and both nodes are Secondary:<\/p>\n<pre>root@cl3-amd-node1:~# drbdadm status\nr0 role:Secondary\n\u00a0 disk:Inconsistent\n\u00a0 cl3-amd-node2 role:Secondary\n\u00a0\u00a0\u00a0 peer-disk:Inconsistent<\/pre>\n<p><em>Start the initial full synchronization. 
This step must be performed on only one node, only on initial resource configuration, and only on the node you selected as the synchronization source. To perform this step, issue this command:<\/em><\/p>\n<pre>root@cl3-amd-node1:# drbdadm primary --force r0<\/pre>\n<p>And we can watch the synchronization status of our DRBD storage:<\/p>\n<pre>root@cl3-amd-node2:~# drbdadm status\nr0 role:Secondary\n\u00a0 disk:Inconsistent\n\u00a0 cl3-amd-node1 role:Primary\n\u00a0\u00a0\u00a0 replication:SyncTarget peer-disk:UpToDate done:3.10<\/pre>\n<p>After the synchronization finishes successfully, both disks are UpToDate:<\/p>\n<pre>root@cl3-amd-node2:~# drbdadm status\nr0 role:Secondary\n\u00a0 disk:UpToDate\n\u00a0 cl3-amd-node1 role:Primary\n\u00a0\u00a0\u00a0 peer-disk:UpToDate<\/pre>\n<p>Now we promote the second server to primary as well:<\/p>\n<pre>root@cl3-amd-node2:~# drbdadm primary r0<\/pre>\n<p>And we can see the status of this dual-primary (primary-primary) DRBD resource:<\/p>\n<pre>root@cl3-amd-node2:~# drbdadm status\nr0 role:Primary\n\u00a0 disk:UpToDate\n\u00a0 cl3-amd-node1 role:Primary\n\u00a0\u00a0\u00a0 peer-disk:UpToDate<\/pre>\n<p>Now we have a new block device on both servers:<\/p>\n<pre>root@cl3-amd-node2:~# fdisk -l \/dev\/drbd0\nDisk \/dev\/drbd0: 931.4 GiB, 1000037986304 bytes, 1953199192 sectors\nUnits: sectors of 1 * 512 = 512 bytes\nSector size (logical\/physical): 512 bytes \/ 512 bytes\nI\/O size (minimum\/optimal): 512 bytes \/ 512 bytes<\/pre>\n<p>We can configure this DRBD block device as a physical volume for LVM; the LVM sits on top of DRBD, so we can continue as if it were a physical disk. Do this on one server only. 
The change is reflected on the second server, because the DRBD disk is primary on both sides:<\/p>\n<pre>pvcreate \/dev\/drbd0\n\u00a0 Physical volume \"\/dev\/drbd0\" successfully created<\/pre>\n<p>As we can see, we must adapt <em>\/etc\/lvm\/lvm.conf<\/em> to our needs, because LVM scans all block devices and finds duplicate PVs:<\/p>\n<pre>root@cl3-amd-node2:~# pvs\n\u00a0 Found duplicate PV WXwDGteoexfmLxN6GQvt6Nd3jJxgvT2z: using \/dev\/drbd0 not \/dev\/md1p1\n\u00a0 Found duplicate PV WXwDGteoexfmLxN6GQvt6Nd3jJxgvT2z: using \/dev\/md1p1 not \/dev\/drbd0\n\u00a0 Found duplicate PV WXwDGteoexfmLxN6GQvt6Nd3jJxgvT2z: using \/dev\/drbd0 not \/dev\/md1p1\n\u00a0 PV\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 VG\u00a0\u00a0 Fmt\u00a0 Attr PSize\u00a0\u00a0 PFree\n\u00a0 \/dev\/drbd0\u00a0\u00a0\u00a0\u00a0\u00a0 lvm2 ---\u00a0 931.36g 931.36g\n\u00a0 \/dev\/md0\u00a0\u00a0 pve\u00a0 lvm2 a--\u00a0 931.38g\u00a0\u00a0\u00a0\u00a0\u00a0 0<\/pre>\n<p>So we must edit the filter option in this configuration. Look at our resource configuration r0.res: we must exclude the backing devices (<em>\/dev\/sdb1<\/em> on one server and <em>\/dev\/md1p1<\/em> on the other), or we can reject all devices and allow only specific ones. I prefer rejecting everything and allowing only what we want, so edit the filter variable:<\/p>\n<pre>root@cl3-amd-node1:~# cat \/etc\/lvm\/lvm.conf | grep drbd\n\u00a0\u00a0\u00a0\u00a0 filter = [ \"a|\/dev\/drbd0|\", \"a|\/dev\/sda3|\", \"r|.*|\" ]<\/pre>\n<pre>root@cl3-amd-node2:~# cat \/etc\/lvm\/lvm.conf | grep drbd\n\u00a0\u00a0\u00a0 filter = [ \"a|\/dev\/drbd0|\", \"a|\/dev\/md0|\", \"r|.*|\" ]<\/pre>\n<p>Now we no longer see duplicates, and we can create a volume group. 
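The filter patterns above are evaluated in order and the first match wins: "a|re|" accepts any device whose path matches the regex, and the final "r|.*|" rejects everything else. A hypothetical sketch of that first-match-wins logic, mirroring the node1 filter (the match function below is purely illustrative, not part of LVM):

```shell
# Illustrative first-match-wins evaluation of the node1 filter:
#   [ "a|/dev/drbd0|", "a|/dev/sda3|", "r|.*|" ]
match() {
    case "$1" in
        /dev/drbd0)  echo accept ;;   # a|/dev/drbd0|
        /dev/sda3*)  echo accept ;;   # a|/dev/sda3|
        *)           echo reject ;;   # r|.*|
    esac
}

match /dev/drbd0   # accept - the DRBD device is visible to LVM
match /dev/sdb1    # reject - the backing device stays hidden
```

This is why the duplicate-PV warnings disappear: LVM still sees the PV signature, but only through \/dev\/drbd0, never through the backing partition.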
On one server only:<\/p>\n<pre>root@cl3-amd-node2:~# vgcreate drbd0-vg \/dev\/drbd0\n\u00a0 Volume group \"drbd0-vg\" successfully created\n...\nroot@cl3-amd-node2:~# pvs\n\u00a0 PV\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 VG\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Fmt\u00a0 Attr PSize\u00a0\u00a0 PFree\n\u00a0 \/dev\/drbd0 drbd0-vg lvm2 a--\u00a0 931.36g 931.36g\n\u00a0 \/dev\/md0\u00a0\u00a0 pve\u00a0\u00a0\u00a0\u00a0\u00a0 lvm2 a--\u00a0 931.38g\u00a0\u00a0\u00a0\u00a0\u00a0 0\n<\/pre>\n<p>Finally, we add the LVM volume group to Proxmox. This can be done via the web interface: go to Datacenter, click Storage and add an LVM storage.<br \/>\nChoose an ID (this is the name of your storage and cannot be changed later; for example: drbd0-vg), then select the previously created volume group drbd0-vg and enable sharing by ticking the &#8216;shared&#8217; box.<br \/>\nNow we can create a virtual machine on this LVM storage and, thanks to DRBD, migrate it from one server to the other without downtime. Because the storage is shared, a live migration only has to start the machine on the other server and copy the contents of its RAM over an SSH tunnel; after a few seconds it is running there.<br \/>\nSometimes, after a network disconnect and reconnect, a split-brain is detected. If this happens, don&#8217;t panic. Both servers are marked as &#8220;StandAlone&#8221; and the DRBD storage starts to diverge: from that moment, different writes land on each side. We must mark one of the servers as the victim, because one of them has the &#8220;right&#8221; data and the other the &#8220;wrong&#8221; data. The only way out is to back up the running virtual machines on the &#8220;victim&#8221;, then discard its data on the DRBD storage and resynchronize it from the other server, which has the &#8220;right&#8221; data. 
When this happens, you will find it in the logs:<\/p>\n<pre>root@cl3-amd-node1:~# dmesg | grep -i brain\n[499210.096185] drbd r0\/0 drbd0 cl3-amd-node1: helper command: \/sbin\/drbdadm initial-split-brain\n[499210.097306] drbd r0\/0 drbd0 cl3-amd-node1: helper command: \/sbin\/drbdadm initial-split-brain exit code 0 (0x0)\n[499210.097313] drbd r0\/0 drbd0: Split-Brain detected but unresolved, dropping connection!\n<\/pre>\n<p>We must solve this problem manually. I chose cl3-amd-node1 as the victim. First, demote this node to secondary:<\/p>\n<pre>drbdadm secondary r0<\/pre>\n<p>Now we must disconnect it and connect it back, marking its data to be discarded (if the surviving node is also StandAlone, run a plain <em>drbdadm connect r0<\/em> there as well):<\/p>\n<pre>root@cl3-amd-node1:~# drbdadm connect --discard-my-data r0<\/pre>\n<p>And after the resynchronization, promote it back to primary:<\/p>\n<pre>root@cl3-amd-node1:~# drbdadm primary r0<\/pre>\n<p>And in the log we can see:<\/p>\n<pre>cl3-amd-node1 kernel: [246882.068518] drbd r0\/0 drbd0: Split-Brain detected, manually solved. Sync from peer node<\/pre>\n<p>Have fun.<br \/>\n&nbsp;<\/p>\n ","protected":false},"excerpt":{"rendered":"<p>Today, I met an interesting problem. I tried to create a primary-primary (dual primary) DRBD cluster on Proxmox. First, we need a fully configured Proxmox two-node cluster. 
Like this: https:\/\/pve.proxmox.com\/wiki\/Proxmox_VE_4.x_Cluster We must have a good configuration of \/etc\/hosts to resolve names into IP: root@cl3-amd-node1:\/etc\/drbd.d# cat \/etc\/hosts cat \/etc\/hosts 127.0.0.1 localhost.localdomain localhost 192.168.1.104 cl3-amd-node1 &hellip; <a href=\"https:\/\/www.gonscak.sk\/?p=205\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">how to set up drbd primary-primary mode on proxmox 4.x<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[39],"tags":[49,50,51,52,53,54,55,56],"class_list":["post-205","post","type-post","status-publish","format-standard","hentry","category-debian-jessie","tag-drbd","tag-found-duplicate-pv","tag-lvm","tag-primary","tag-proxmox","tag-raid","tag-split-brain","tag-unresolved"],"_links":{"self":[{"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=\/wp\/v2\/posts\/205","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=205"}],"version-history":[{"count":0,"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=\/wp\/v2\/posts\/205\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gonscak.sk\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=205"}],"curies":[{"name":"wp","href":"https:
\/\/api.w.org\/{rel}","templated":true}]}}