HowTo Single Disk to RAID
We found ourselves needing to switch from a single-disk root to a root backed by a RAID; here's how we did it.
Context
We set up a new/old server, BitArno: it has a nice /home of ~6TB backed by a RAID and a separate disk for the / partition. As the drive for / had been tested before going into production with no errors, we thought we didn't need a RAID for it. We installed Debian stable 8.1, set up every service and went to bed, only to wake up to a hung server refusing to boot.
An on-site check was required: we got there and found out that the / drive was failing to reallocate sectors; we fsck'ed it and it booted. "Great!" That same evening it was failing again, so we decided to switch from a single-disk root to a RAID1-backed root, especially because we use commodity hardware taken from old computers.
The general idea
We wanted a way to easily switch from a single drive to the RAID without having to back everything up and copy it over again, with as little downtime as possible, and so we decided to:
- Poweroff the server and attach a new disk of equal size
- Power on the server
- Create a RAID1 array composed of 2 drives but with one drive marked as missing (the new drive is the one present; the old one will be added to the array later)
- Set LVM up on top of it, as the old drive had LVM (it's a 120GB drive just for the `/` partition; we wanted some space available should we ever need it)
- Boot a live distro and copy /
- Install the bootloader on the new drive adding the necessary modules for RAID and LVM
- Reboot the server from the RAID-member disk
- Check if everything is OK
- Add the old disk to the RAID and let mdadm resync the contents.
Step by step guide
Here I'll try to cover, step by step, the commands we used; it has been a while since we did it, so I'm not 100% sure I'll remember everything.
First, the partition layout (lsblk output) when the server was single-disk:
sda                       8:0    0 114.5G  0 disk
├─sda1                    8:1    0   243M  0 part  /boot
├─sda2                    8:2    0     1K  0 part
└─sda5                    8:5    0 114.3G  0 part
  ├─bitarno--system-root  253:0  0    30G  0 lvm   /
  └─bitarno--system-swap  253:3  0     4G  0 lvm   [SWAP]
And after the switch (the newly created LVM has a different volume group name but the same size and structure):
sda                    8:0    0 114.5G  0 disk
├─sda1                 8:1    0   243M  0 part
│ └─md1                9:1    0 242.8M  0 raid1 /boot
├─sda2                 8:2    0     1K  0 part
└─sda5                 8:5    0 114.3G  0 part
  └─md2                9:2    0 114.2G  0 raid1
    ├─ba--system-root  253:0  0    30G  0 lvm   /
    └─ba--system-swap  253:3  0     4G  0 lvm   [SWAP]
sde                    8:64   0 114.5G  0 disk
├─sde1                 8:65   0   243M  0 part
│ └─md1                9:1    0 242.8M  0 raid1 /boot
├─sde2                 8:66   0     1K  0 part
└─sde5                 8:69   0 114.3G  0 part
  └─md2                9:2    0 114.2G  0 raid1
    ├─ba--system-root  253:0  0    30G  0 lvm   /
    └─ba--system-swap  253:3  0     4G  0 lvm   [SWAP]
The sdX2 partition is an extended partition containing sdX5, the RAID member.
In this way we have two arrays: one for /boot and one for the LVM, which contains the root and the swap.
First of all we need to wipe every possible header from the new disk, as it will become a RAID member:
dd if=/dev/zero of=/dev/sde bs=100M count=1
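As an aside (we stuck with dd, so treat this only as a hedged alternative): wipefs can remove the known filesystem and RAID signatures instead of zeroing a fixed amount of data.
# remove all known signatures from the new disk (destructive, double-check the device name)
wipefs -a /dev/sde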
Then we created the partitions with sfdisk, roughly the same size as on the first disk. They don't need to be identical, which is just as well because it wasn't possible to simply copy the partition table (the disks' capacities differ slightly); a sketch of one way to do this follows the list below.
We need:
- one partition for /boot, marked as Linux raid autodetect
- another for the LVM, also marked as Linux raid autodetect, taking the remaining free space
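A hedged sketch of the kind of sfdisk input that produces such a layout; the sector numbers are illustrative and must be adapted to the actual disk (type fd is "Linux raid autodetect", type 5 is the extended container):
# small primary partition for /boot, an extended partition for the rest,
# and a logical partition inside it for the RAID member
sfdisk /dev/sde <<'EOF'
label: dos
unit: sectors
/dev/sde1 : start=2048,   size=497664,    type=fd
/dev/sde2 : start=499712, size=239624192, type=5
/dev/sde5 : start=501760, size=239622144, type=fd
EOF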
We then proceeded to create the two RAID arrays; /dev/sde is the new disk:
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sde1 missing
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sde5 missing
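At this point the two degraded arrays should already be visible; a quick check (the missing member shows up as an underscore in the status brackets, e.g. [U_]):
cat /proc/mdstat            # both arrays listed, each with one member missing
mdadm --detail /dev/md2     # verbose view of a single array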
We made the filesystem on the newly created /dev/md1
mkfs.ext2 /dev/md1
And copied the files
mount /dev/md1 /mnt
rsync -avPh /boot/ /mnt
sync && umount /mnt
We were done with /boot, and we needed to create the LVM volumes:
First we marked /dev/md2 as an LVM member with fdisk, then told LVM that /dev/md2 was a device it could use:
pvcreate /dev/md2
We then created the volume group that was going to host the logical volumes:
vgcreate ba-system /dev/md2
And created the logical volumes over it
lvcreate -L30G -n root ba-system
lvcreate -L4G -n swap ba-system
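A quick sanity check of the new LVM stack (optional, but cheap):
pvs    # /dev/md2 should appear as a physical volume
vgs    # the ba-system volume group, roughly 114G in total
lvs    # the root and swap logical volumes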
We created the filesystem for root
mkfs.ext4 /dev/mapper/ba--system-root
And for swap
mkswap /dev/mapper/ba--system-swap
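If you prefer UUID-based references in fstab or grub later on, this is a convenient moment to note them down (a hedged suggestion, we didn't strictly need it):
blkid /dev/md1 /dev/mapper/ba--system-root /dev/mapper/ba--system-swap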
Then it was time to poweroff the server and boot from a live system; we chose Arch Linux as it's lightweight and provides all the packages we need without forcing an installation procedure on us.
We mounted the root partitions
mkdir /mnt/old /mnt/new
mount /dev/mapper/ba--system-root /mnt/new
mount /dev/mapper/bitarno--system-root /mnt/old
And copied the data over:
rsync -avPh /mnt/old/ /mnt/new
We then proceeded to mount the boot partition
mount /dev/md1 /mnt/new/boot
And arch-chroot'ed into it; arch-chroot is a wonderful piece of software that saves you from bind-mounting /dev, /proc and /sys into the chroot yourself:
arch-chroot /mnt/new
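One thing worth double-checking inside the chroot, since the volume group name changed and /boot sits on a freshly created filesystem: /etc/fstab in the copied root has to reference the new devices. A hedged example of what to look at (the exact entries depend on your setup):
# inspect what the copied fstab still points at
grep -E 'mapper|boot' /etc/fstab
# the entries should end up referencing the new names, along these lines (illustrative):
# /dev/mapper/ba--system-root  /      ext4  errors=remount-ro  0  1
# /dev/mapper/ba--system-swap  none   swap  sw                 0  0
# /dev/md1                     /boot  ext2  defaults           0  2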
It was time to install grub on the new disk and regenerate the grub configuration and the initramfs image so that they contain the LVM and RAID modules; this was the hardest part, as we wrongly assumed all the modules were already present.
We added "domdadm" to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub; the line should look something like:
cat /etc/default/grub | grep -i cmdli
GRUB_CMDLINE_LINUX_DEFAULT="quiet domdadm"
And we added the needed modules to GRUB_PRELOAD_MODULES in the same file; if the line doesn't exist, create it:
GRUB_PRELOAD_MODULES="lvm diskfilter mdraid1x"
Then it was time to install grub in the MBR of both HDDs; note that we didn't need to specify the boot partition as we were in a chroot:
grub-install /dev/sda
grub-install /dev/sde
And just to be sure
update-grub
We also had to update the initramfs to actually contain the raid module, and so we edited
/etc/initramfs-tools/modules
appending
raid1
to the list and executed
update-initramfs -u
to regenerate the initramfs.
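On Debian the mdadm initramfs hook also reads /etc/mdadm/mdadm.conf, so if the new arrays are not listed there it is worth appending their definitions before regenerating the image; a hedged sketch (review the file afterwards for duplicate ARRAY lines):
# append the definitions of the currently assembled arrays, then rebuild the initramfs
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
update-initramfs -u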
We exited from the chroot, unmounted everything, rebooted and kept our fingers crossed till the password prompt of the root-on-RAID booted system.
I've simplified this last step: we made several errors and had to boot into the live system several times, as we kept forgetting to add the needed modules and command-line arguments.
To be sure you boot from the RAID root you can edit the grub entry before booting; make sure it looks like the following:
insmod mdraid1x
has to be present inside the menuentry section of your OS, and the linux line should look similar to:
linux /vmlinuz-3.16.0-4-amd64 root=/dev/mapper/ba--system-root ro quiet domdadm
As we failed to achieve everything on the first try, we edited the grub command line at boot and then performed the grub install again from the booted system.
Once we managed to boot we had to finalize the process and add the old drive to the RAID array
We backed up the partition table of the new disk
sfdisk -d /dev/sde > /tmp/savetablesde
We unmounted the automatically mounted /dev/sda[1,2,5], deactivating the swap and the LVM:
swapoff /dev/mapper/bitarno--system-swap
vgchange -an bitarno-system
We wiped the first 100M of the old drive:
dd if=/dev/zero of=/dev/sda bs=100M count=1
sync
And in the end we copied the partition table onto the old disk:
sfdisk /dev/sda < /tmp/savetablesde
We added the partitions to the arrays:
mdadm /dev/md1 --add /dev/sda1
mdadm /dev/md2 --add /dev/sda5
DONE!
We then checked the status of the mdadm resync with:
cat /proc/mdstat
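To follow the rebuild continuously rather than re-running the command by hand, something like this does the job:
# refresh the resync progress every 5 seconds
watch -n 5 cat /proc/mdstat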
Issues
We (of course) had some issues in the process:
- Though the drives were the same model, the capacities were slightly different, so we couldn't copy the partition scheme verbatim as we thought we could.
- It's very important to generate the grub image including the modules and the boot parameters needed for LVM and RAID.
- The initramfs must be rebuilt with the right raid module present in order for the kernel to find the array.
- It's not possible to simply back up and restore the LVM configuration either, because for a while both LVM disks were attached at the same time and the names would have conflicted.