These are very destructive procedures. I bear no responsibility for any damage done to your system.
Windows and Linux knowledge is required. This guide is tailored for my system. Your mileage may vary.
After wiping my Fedora disk and re-installing Fedora from scratch I no longer had my UEFI Windows boot entry available in the BIOS.
This means that I previously had only one EFI partition, on that Linux storage device, and that Windows piggybacked onto it.
Now that Fedora is reinstalled on the freshly wiped disk, I have lost the Windows boot entry.
My system looked like this (lsblk output with comments).
Notice the lack of EFI partition anywhere else except for the Linux storage device.
NAME              MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda                 8:0    1 447.1G  0 disk
└─sda1              8:1    1 447.1G  0 part
sdb                 8:16   1 223.6G  0 disk           <-- Linux ssd
├─sdb1              8:17   1   600M  0 part  /boot/efi
├─sdb2              8:18   1     1G  0 part  /boot
└─sdb3              8:19   1   222G  0 part
  └─luks-REDACTED 253:0    0   222G  0 crypt /home
                                             /
zram0             252:0    0     8G  0 disk  [SWAP]
nvme0n1           259:0    0 931.5G  0 disk
├─nvme0n1p1       259:1    0    16M  0 part
└─nvme0n1p2       259:2    0 931.5G  0 part
nvme2n1           259:3    0 465.8G  0 disk           <-- Windows nvme
├─nvme2n1p1       259:4    0    16M  0 part
└─nvme2n1p2       259:5    0 465.8G  0 part
nvme1n1           259:7    0   3.6T  0 disk
├─nvme1n1p1       259:8    0    16M  0 part
└─nvme1n1p2       259:9    0   3.6T  0 part
I could go back and learn how to put the Windows EFI boot option onto the existing Linux SSD EFI partition, but seeing that the Windows installer did this before, and it caused grief, I opted for the following:
Resize the Windows partition and create an EFI partition on the Windows disk for redundancy
First, write the Windows installer ISO to a USB stick (/dev/sdc here) and boot from it:

dd if=/home/username/Downloads/Win11_23H2_English_x64v2.iso of=/dev/sdc bs=4M
Once the installer is up, press Shift+F10 to get the command prompt and run diskpart. In the diskpart command prompt, free the C: drive letter from whatever volume currently holds it (select that volume first, then):

remove letter=C

Run list disk to identify the Windows disk and partition. In my case it was Disk 2 and Partition 2:

select disk 2
select part 2
assign letter=C

Leave diskpart by running exit and check whether the C: drive contains the correct partition. Then run diskpart again, select disk 2 and select part 2 once more, and carve out the new EFI partition:

shrink desired=500 minimum=500
create partition efi
select part 3
format fs=fat32 quick
assign letter=y
exit

Finally, write the Windows boot files onto the new EFI partition:

bcdboot C:\windows /s Y:
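After rebooting into Linux, you can sanity-check that the firmware picked up the new entry (assuming efibootmgr is installed):

efibootmgr | grep -i windows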
I want my Linux home server to have an encrypted ZFS root filesystem with raidz1. In my quest to realize this use-case, I found ZFSBootMenu, which has guides for major Linux distributions, including Fedora Workstation.
I decided to go with the ZFSBootMenu Fedora Workstation guide rather than try to hack something on my own.
The ZFSBootMenu Fedora Workstation guide covers the use-case with one block storage device and without raidz1.
Luckily, most of it is still valid. These are the changes I had to make for my system.
NOTE: The guide still needs to be followed; just substitute the relevant sections with the commands below.
export BOOT_DISK="/dev/nvme0n1"
export BOOT_PART="1"
export BOOT_DEVICE="${BOOT_DISK}p${BOOT_PART}"
export POOL_PART="2"
zpool labelclear -f /dev/nvme0n1p2
zpool labelclear -f /dev/nvme1n1p2
zpool labelclear -f /dev/nvme2n1p2
wipefs -a "$BOOT_DISK"
wipefs -a /dev/nvme0n1
wipefs -a /dev/nvme1n1
wipefs -a /dev/nvme2n1
sgdisk --zap-all /dev/nvme0n1
sgdisk --zap-all /dev/nvme1n1
sgdisk --zap-all /dev/nvme2n1
sgdisk --zap-all "$BOOT_DISK"
sgdisk -n "${BOOT_PART}:1m:+512m" -t "${BOOT_PART}:ef00" "/dev/nvme0n1"
sgdisk -n "${BOOT_PART}:1m:+512m" -t "${BOOT_PART}:ef00" "/dev/nvme1n1"
sgdisk -n "${BOOT_PART}:1m:+512m" -t "${BOOT_PART}:ef00" "/dev/nvme2n1"
sgdisk -n "${POOL_PART}:0:-10m" -t "2:bf00" "/dev/nvme0n1"
sgdisk -n "${POOL_PART}:0:-10m" -t "2:bf00" "/dev/nvme1n1"
sgdisk -n "${POOL_PART}:0:-10m" -t "2:bf00" "/dev/nvme2n1"
zpool create -f -o ashift=9 \
-O compression=lz4 \
-O acltype=posixacl \
-O xattr=sa \
-O relatime=on \
-O encryption=aes-256-gcm \
-O keylocation=file:///etc/zfs/zroot.key \
-O keyformat=passphrase \
-o autotrim=on \
-m none \
zroot raidz1 /dev/nvme1n1p2 /dev/nvme0n1p2 /dev/nvme2n1p2
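Once the pool is created, it's worth sanity-checking that the raidz1 vdev really contains all three partitions:

zpool status zroot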
I was testing the Fedora cloud image on KVM while using NoCloud cloud-init. I like to have my “trusted” home devices (and virts) publish their hostname via their DHCP client request towards my router.
My router has a script that generates static DNS entries based on the hostname value of a client’s DHCP request.
I’ve noticed that a few “cloud” images that I’ve been trying out don’t usually propagate the “hostname” value from the meta-data of NoCloud to the DHCP settings.
So far: Alpine, which uses dhclient via OpenRC, and Fedora cloud, which uses systemd + NetworkManager for its DHCP client.
I guess this is the standard. It’s stealthier this way, but I don’t need the stealth.
Notice the runcmd. This is the current naming scheme for Fedora cloud as of 2024/01:
#cloud-config
packages:
  - sudo
users:
  - name: myuser
    primary_group: myuser
    ssh_authorized_keys:
      - ssh-rsa YOUR_PUBKEY comment
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    groups: wheel
    shell: /bin/bash
runcmd:
  - 'nmcli con modify "cloud-init eth0" ipv4.dhcp-hostname myhostname'
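For completeness, a minimal NoCloud meta-data file that pairs with the above might look like this (values are illustrative; local-hostname is where the hostname comes from in the first place):

instance-id: fedora-test-01
local-hostname: myhostname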
I was playing around with Alpine Linux cloud images. Apparently, cloud-init in Alpine Linux creates locked user accounts by default.
To get around this, I am using the * password hash (not sure if needed), which should not match any password.
In addition to this, I also have to unlock the account with runcmd, which happens after the user is created. This is different from bootcmd, which happens earlier.
It is worth noting that neither of these two hacks are needed in Fedora cloud images.
#cloud-config
packages:
  - sudo
users:
  - name: myuser
    passwd: "*"
    primary_group: myuser
    ssh_authorized_keys:
      - ssh-rsa YOUR_SSH_PUBLIC_KEY keycomment
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    groups: wheel
    shell: /bin/ash
runcmd:
  - passwd -u myuser
Not a networking expert. A better explanation of this can be found in Mikrotik’s NAT Documentation
Assumption: Typical home LAN with a router that provides access to the internet.
I would like to access my home server by using the Public IP that’s assigned to the router’s “WAN” port.
I thought: simple! Just use a port forwarding rule (DST-NAT):
Chain: dstnat
Input interfaces: LAN (Mikrotik specific, interface lists)
DST port: 443
Protocol: TCP
DST address: <router-pub-ip>
Action: dst-nat
To-Address: <home-server-private-ip>
Unfortunately, when used from within the home network I get a timeout while trying to connect to the home server via the public IP.
What goes on “under the hood” is this: the router DST-NATs the packet to the home server, but the server sees the client’s LAN address as the source and replies to it directly. The client expects a reply from the public IP, so it drops the response and the connection times out.
To make this type of connectivity work, we also need to set up a SNAT (source NAT, masquerade… use your preferred term). This Source NAT is not the one we already have for LAN->WAN connectivity. This one is specific to LAN->Public-IP->Home-Server traversal, so, depending on our use case, we need to have something like this:
Chain: srcnat
Source Address: 192.168.0.0/16
Dst Address: <home-server-private-ip>
DST port: 443
Protocol: TCP
Action: masquerade
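For reference, the same two rules expressed as RouterOS CLI commands (a sketch; substitute the placeholders with your own addresses):

/ip firewall nat add chain=dstnat dst-address=<router-pub-ip> protocol=tcp dst-port=443 action=dst-nat to-addresses=<home-server-private-ip>
/ip firewall nat add chain=srcnat src-address=192.168.0.0/16 dst-address=<home-server-private-ip> protocol=tcp dst-port=443 action=masquerade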
Of course, if you have a wide range of ports you’d like to loop back from LAN to your home server (via the public IP), then this rule should be changed accordingly. It is even simpler if you want your home server to be a “catch all” for any LAN->Public-IP communication.
If your setup is slightly more robust, say, the home server(s) are on a separate subnet, then you don’t need the source NAT rule.
I’m not actually sure EXACTLY why passing back through the router is “magical”, but the masquerade rule is what forces the server’s replies back through the router instead of straight to the client, and that makes the loop work.
I’m by no means an expert on the topics of VFIO/IOMMU/PCI passthrough in Linux. I found fenguoerbian’s blog post halfway through documenting my steps, and I’d encourage you to go read it because it covers more use cases and is just more comprehensive.
Other good reads:
Who is this guide for? Me, really. Writing this mostly as a document of what I did to get where I am. Having this written down in Ansible or similar IaC won’t retell the whole story of the tools and articles used to figure out what needed to be done.
Have Alpine Linux be the libvirt host on bare-metal x86_64. Have some PCI (USB) devices available to be passed through to guests by using early VFIO binding.
The guide assumes that your (my) target system has VT-d or AMD’s equivalent supported and enabled in the BIOS. An Intel-based system is used here, so your kernel parameters and module options will differ for AMD.
Options during setup:
I run a DHCP server where I manage any static entries, so aside from choosing br0 at install time, only the router config needs to be modified.
As for libvirt requirements, ensure the tun driver loads on boot:
cat /etc/modules | grep tun || echo tun >> /etc/modules
The comfort of running virt-manager remotely costs us having to install dbus, polkit and some other dependencies, so I opted for a leaner system that I’ll manage with virsh when I SSH in.
The only thing that needs explanation is the libvirt-guests package, which makes the host gracefully shut down guests before shutting itself off.
apk add libvirt-daemon qemu-img qemu-system-x86_64 qemu-modules openrc vim pciutils usbutils wget
rc-update add libvirtd
rc-update add libvirt-guests
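To start the daemon right away instead of rebooting (standard OpenRC):

rc-service libvirtd start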
For this section, it’s best to first read up on the Arch Linux Wiki on how PCI devices relate to IOMMU groups.
First, let’s ensure IOMMU is enabled at boot. In /etc/update-extlinux.conf, add intel_iommu=on and iommu=pt to default_kernel_opts. For example:
cat /etc/update-extlinux.conf
default_kernel_opts="quiet rootfstype=ext4 intel_iommu=on iommu=pt"
Run:

update-extlinux

REBOOT! After the reboot, let’s figure out what PCI devices we want to pass through. In my case, I needed to pass through a few USB devices from the host.
This one is fairly easy. Just use lspci and make a note of the device(s) you are interested in.
It is often the case that we have to isolate more than just the device we want because they share an IOMMU group. More on that later.
This is where it gets tricky. USB devices (to my knowledge) are always a “child” of a PCI device. We need to figure out the PCI->USB relation before we proceed.
First step:
dmesg | grep 'usb [0-9]-[0-9]' | grep Product:
Find entries of devices you want to isolate from the host. In my case:
[ 2.399790] usb 3-1: Product: C-Media USB Headphone Set
[ 3.435788] usb 3-2: Product: Sonoff Zigbee 3.0 USB Dongle Plus
[ 4.353124] usb 4-2: Product: USB Audio CODEC
Now use fenguoerbian’s fantastic script:
for usb_ctrl in $(find /sys/bus/usb/devices/usb* -maxdepth 0 -type l); do
    pci_path="$(dirname "$(realpath "${usb_ctrl}")")"
    echo "Bus $(cat "${usb_ctrl}/busnum") --> $(basename $pci_path) (IOMMU group $(basename $(realpath $pci_path/iommu_group)))"
    lsusb -s "$(cat "${usb_ctrl}/busnum"):"
    echo
done
…which will print out this convenient list of IOMMU groups in relation to USB devices. Conveniently, all three of my USB devices are in the same IOMMU group. To be more precise, USB bus 3 and USB bus 4 are in the same IOMMU group:
Bus 3 --> 0000:00:1a.0 (IOMMU group 4)
...
Bus 003 Device 001: ID 1d6b:0001
...
Bus 004 Device 001: ID 1d6b:0001
...
Bus 003 Device 002: ID 0d8c:000c
...
Bus 4 --> 0000:00:1a.1 (IOMMU group 4)
...
Bus 003 Device 001: ID 1d6b:0001
...
Bus 004 Device 001: ID 1d6b:0001
...
Bus 003 Device 002: ID 0d8c:000c
...
But, if we want to pass through the relevant controllers, we have to isolate all the devices in IOMMU group 4. Let’s check out what we have in that whole group:
shopt -s nullglob
for g in $(find /sys/kernel/iommu_groups/* -maxdepth 0 -type d | sort -V); do
echo "IOMMU Group ${g##*/}:"
for d in $g/devices/*; do
echo -e "\t$(lspci -nns ${d##*/})"
done;
done;
Output:
...
IOMMU Group 4:
00:1a.0 USB controller [0c03]: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #4 [8086:3a67] (rev 02)
00:1a.1 USB controller [0c03]: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #5 [8086:3a68] (rev 02)
00:1a.2 USB controller [0c03]: Intel Corporation 82801JD/DO (ICH10 Family) USB UHCI Controller #6 [8086:3a69] (rev 02)
00:1a.7 USB controller [0c03]: Intel Corporation 82801JD/DO (ICH10 Family) USB2 EHCI Controller #2 [8086:3a6c] (rev 02)
...
I do have a whole other set of USB devices I can use on the host, so no problem there.
This concludes the USB detective work. Next up, we’ll isolate IOMMU group 4 from the OS so we can pass those devices through to the guest.
Ensure VFIO kernel drivers are loaded into the initramfs:
cat <<EOT > /etc/mkinitfs/features.d/vfio.modules
kernel/drivers/vfio/vfio.ko.*
kernel/drivers/vfio/vfio_virqfd.ko.*
kernel/drivers/vfio/vfio_iommu_type1.ko.*
kernel/drivers/vfio/pci/vfio-pci.ko.*
EOT
And add all devices from your IOMMU group to the ids= parameter of vfio-pci:
cat <<EOT > /etc/modprobe.d/vfio.conf
options vfio-pci ids=8086:3a67,8086:3a68,8086:3a69,8086:3a6c
options vfio_iommu_type1 allow_unsafe_interrupts=1
softdep igb pre: vfio-pci
EOT
Don’t forget to run:

mkinitfs

And verify that the drivers are in the initramfs by running:
mkinitfs -l | grep vfio
Having the modules ready is one thing, but we also need to invoke them early. In /etc/update-extlinux.conf, update the default_kernel_opts and modules sections to something like this:
grep '^default_kernel_opts\|^modules' /etc/update-extlinux.conf
Output:
(you don’t necessarily have the crypto stuff, just focus on iommu and vfio-pci here)
default_kernel_opts="cryptroot=UUID=eec5190e-eebd-4985-9abc-36a61341e038 cryptdm=root quiet rootfstype=ext4 intel_iommu=on iommu=pt"
modules=sd-mod,usb-storage,ext4,vfio,vfio-pci,vfio_iommu_type1,vfio_virqfd
Don’t forget to run:
update-extlinux
Reboot. Let’s check if we’re in business:
dmesg | grep vfio
And you should see something like:
[ 1.163469] vfio_pci: add [8086:3a67[ffffffff:ffffffff]] class 0x000000/00000000
[ 1.163515] vfio_pci: add [8086:3a68[ffffffff:ffffffff]] class 0x000000/00000000
[ 1.163543] vfio_pci: add [8086:3a69[ffffffff:ffffffff]] class 0x000000/00000000
[ 1.179966] vfio_pci: add [8086:3a6c[ffffffff:ffffffff]] class 0x000000/00000000
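You can also spot-check a single controller with lspci; the address below is from my IOMMU group, so adjust it to yours. The output should include a “Kernel driver in use: vfio-pci” line:

lspci -nnk -s 00:1a.0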
Now fetch a guest image. In my case:
wget https://github.com/home-assistant/operating-system/releases/download/7.5/haos_ova-7.5.qcow2.xz -O - | xzcat > home-assistant.qcow2
Maybe don’t use the name “default” as it might already exist, but:
virsh pool-define-as default dir - - - - "/var/lib/libvirt/images"
mv home-assistant.qcow2 /var/lib/libvirt/images/
…and refresh the storage pool:
virsh pool-refresh default
Earlier, when we discovered the whole IOMMU group, the lines began with:

00:1a.0 USB controller [0c03]:.....

Translate the PCI addresses you need into the format expected by --hostdev (00:1a.0 becomes pci_0000_00_1a_0):
virt-install \
--name "homeassistant" \
--vcpus 2 \
--cpu host \
--memory 4096 \
--sysinfo host \
--import \
--boot uefi \
--os-variant=alpinelinux3.14 \
--disk vol=default/home-assistant.qcow2,bus=virtio \
--network bridge=br0,mac=52:54:00:C4:2D:4A \
--graphics none \
--video none \
--sound none \
--input none \
--memballoon none \
--hostdev pci_0000_00_1a_0 \
--hostdev pci_0000_00_1a_1
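Assuming the import succeeded, the new guest should now show up and be manageable over SSH:

virsh list --all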
Et voilà!
Run:
xhost +local:root
sudo docker run --rm -ti --net=host --env="DISPLAY=$DISPLAY" ubuntu:16.04
Then, in the container:
export DEBIAN_FRONTEND=noninteractive
apt-get update
apt-get install net-tools wget python python-gtk2 python-gnome2 -y
wget https://download.foldingathome.org/releases/public/release/fahcontrol/debian-stable-64bit/v7.6/fahcontrol_7.6.13-1_all.deb
dpkg -i fahcontrol_7.6.13-1_all.deb
# Start it up:
FAHControl
FAHControl will try to connect to localhost:36330 by default. If you have the FAH client running on a remote host, you can port forward:
ssh <RemoteFoldingHost> -L 36330:localhost:36330
I was revisiting some of my old scripts, and found this messy piece of code that attempts to grab the latest “minimal” AWS AMI - HVM that’s EBS backed:
aws --query 'Images[*].[Name,ImageId]' \
--output text \
ec2 describe-images \
--owners amazon \
--filters \
"Name=root-device-type,Values=ebs" \
"Name=architecture,Values=x86_64" \
"Name=virtualization-type,Values=hvm" \
"Name=image-type,Values=machine" \
"Name=is-public,Values=true" | grep minimal
| sort | tail -n1 | awk '{print $2}'
I knew that there is a better way of doing this, but first I wanted to flex my JQ muscles before googling.
One thing I forgot is to sanitize the “query” portion of the previously used command.
Bad command, notice the --query:
aws --query 'Images[*].[Name,ImageId]' \
--output json \
ec2 describe-images \
--owners amazon \
--filters \
"Name=root-device-type,Values=ebs" \
"Name=architecture,Values=x86_64" \
"Name=virtualization-type,Values=hvm" \
"Name=image-type,Values=machine" \
"Name=is-public,Values=true"
Output I had to deal with:
[
[
"Windows_Server-2008-R2_SP1-English-64Bit-SQL_2012_RTM_SP2_Enterprise-2018.07.11",
"ami-ffe1e514"
],
[
"amzn-ami-hvm-2016.03.2.x86_64-ebs",
"ami-fff61890"
]
]
An increase in difficulty, for sure, but not impossible to filter the way I want. To reiterate: I want a single, latest “minimal” Amazon Linux AMI.
This is what I came up with:
aws --query 'Images[*].[Name,ImageId]' \
--output json \
ec2 describe-images \
--owners amazon \
--filters \
"Name=root-device-type,Values=ebs" \
"Name=architecture,Values=x86_64" \
"Name=virtualization-type,Values=hvm" \
"Name=image-type,Values=machine" \
"Name=is-public,Values=true" |
jq -r '
[
.[] | select(.[0] | test("^amzn-ami-minimal-hvm")) |
{
ami: .[1],
name: .[0],
day: (.[0] | match("\\d{8}") | .string)
}
] | sort_by(.day)[-1].ami
'
Translation:

- select(.[0] | test("^amzn-ami-minimal-hvm")) keeps only the “minimal” HVM image entries.
- day: (.[0] | match("\\d{8}") | .string) extracts the date stamp from the image name. The value of the “day” property will have the YYYYMMDD format.
- sort_by(.day)[-1].ami sorts by that date and takes the AMI ID of the latest entry.
This solves my task, but before I got to optimizing the JQ, I noticed the malicious --query. I was wondering why the pre-JQ AWS CLI output was so sparse! Removing the “malicious” --query option shows us that the output we have to deal with is substantial:
{
    "Images": [
        {
            "Architecture": "x86_64",
            "CreationDate": "2016-06-03T23:22:31.000Z",
            "ImageId": "ami-fff61890",
            "ImageLocation": "amazon/amzn-ami-hvm-2016.03.2.x86_64-ebs",
            "ImageType": "machine",
            "Public": true,
            "OwnerId": "137112412989",
            "State": "available",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda",
                    "Ebs": {
                        "DeleteOnTermination": true,
                        "SnapshotId": "snap-c70259f1",
                        "VolumeSize": 8,
                        "VolumeType": "standard",
                        "Encrypted": false
                    }
                }
            ],
            "Description": "Amazon Linux AMI 2016.03.2 x86_64 HVM EBS",
            "Hypervisor": "xen",
            "ImageOwnerAlias": "amazon",
            "Name": "amzn-ami-hvm-2016.03.2.x86_64-ebs",
            "RootDeviceName": "/dev/xvda",
            "RootDeviceType": "ebs",
            "SriovNetSupport": "simple",
            "VirtualizationType": "hvm"
        },
        ...
    ]
}
This allowed me to eliminate JQ. The only reason I previously used JQ was its ability to extract matches with the match() function, and I am not aware whether JMESPath can do this. Let’s use a proper --query this time!
aws --output text \
ec2 describe-images \
--owners amazon \
--filters \
"Name=root-device-type,Values=ebs" \
"Name=architecture,Values=x86_64" \
"Name=virtualization-type,Values=hvm" \
"Name=image-type,Values=machine" \
"Name=is-public,Values=true" \
--query '
Images[?starts_with(ImageLocation,`amazon/amzn-ami-minimal-hvm-`) == `true`] |
sort_by(@, &CreationDate)[-1:].ImageId
'
Much better, but the execution time is still 5-ish seconds, since most of the filtering happens client-side after AWS returns a lot of results.
The last thing to do was to check whether someone has done it better. The first result that came up was from an AWS blog post on how to do exactly what I was trying to do.
The first thing I was not aware of is that you can use wildcards in --filters, so I improved my last iteration:
aws --output text \
ec2 describe-images \
--owners amazon \
--filters \
"Name=root-device-type,Values=ebs" \
"Name=architecture,Values=x86_64" \
"Name=virtualization-type,Values=hvm" \
"Name=image-type,Values=machine" \
"Name=is-public,Values=true" \
"Name=name,Values=amzn-ami-minimal-hvm-*" \
--query '
sort_by(Images, &CreationDate)[-1:].ImageId
'
Execution time is now sub-second, and the code is much cleaner. This is thanks to this line: "Name=name,Values=amzn-ami-minimal-hvm-*"
However, and this is the second thing I was not aware of, the blog post shows the most deterministic way to get the latest image as per my spec using SSM. That’s how I found out about the AWS SSM Parameter Store. Quite handy!
So, the command I ended up going with is this:
aws ssm get-parameter \
--name /aws/service/ami-amazon-linux-latest/amzn-ami-minimal-hvm-x86_64-ebs \
--query 'Parameter.Value' \
--output text
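Since this prints a plain AMI ID, it nests nicely into other commands. For example (launching directly and the instance type are purely illustrative):

aws ec2 run-instances \
    --instance-type t2.micro \
    --image-id "$(aws ssm get-parameter \
        --name /aws/service/ami-amazon-linux-latest/amzn-ami-minimal-hvm-x86_64-ebs \
        --query 'Parameter.Value' \
        --output text)"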
Google is your friend, but you end up learning a lot by trying things out for yourself :)
I wanted to test out my Ansible setup that provisions some of the hosts I own, including my workstation. I’m currently switching from Fedora 28 to 29. Even though upgrades have been going without a hitch for me since Fedora 25, I want to do a hard reset and test my Ansible setup against a fresh Fedora 29.
I have a UEFI system, a GPT partition table on my SSD, and an encrypted XFS root partition that makes up the bulk of the drive. All of this is easily emulated with KVM/libvirt.
Because my linux workstation has 32GB of ram, I wanted to see what my options are when dealing with in-memory storage. To my knowledge, I have two options when it comes to libvirt and in-memory storage:
tmpfs is the more modern approach to in-memory storage on Linux, but it doesn’t use a RAM-backed block device that can be used outside of tmpfs (to my knowledge).
The brd kernel module accepts parameters to control how many /dev/ram* devices we get, and what their sizes are.
Set up a ~17.5GB ramdisk at /dev/ram0:
sudo modprobe brd rd_size=18432000 max_part=1 rd_nr=1
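A quick check that the device showed up with the expected size:

lsblk /dev/ram0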
Define a “ramblock” storage pool for libvirt:
sudo virsh pool-define-as --name ramblock --type disk --source-dev /dev/ram0 --target /dev
Build the ramblock storage pool:
sudo virsh pool-build ramblock
Start the storage pool:
sudo virsh pool-start ramblock
Create the volume. The volume name must be ram0p1:
sudo virsh vol-create-as ramblock ram0p1 18350316k
Create your VM and specify that you wish to use ram0p1 for your storage device (under the ramblock pool). I used the virt-manager GUI for this.
(Optional) To delete the volume with virsh, you need to do:
sudo virsh vol-delete ram0p1 --pool ramblock
…unfortunately, this is buggy. If that fails, do:
sudo parted /dev/ram0 rm 1
If you needed to use parted, give it a few minutes until the volume disappears from the list:
sudo virsh vol-list --pool ramblock
Destroy (stop) the volume-pool:
sudo virsh pool-destroy ramblock
Unload the brd kernel module (or suffer memory exhaustion!):
sudo rmmod brd
(Optional) Undefine the volume pool. It’s fine to leave it as it won’t auto-start unless you made it so:
sudo virsh pool-undefine ramblock
Create a mount point and mount a tmpfs of the desired size:

sudo mkdir -p /var/lib/libvirt/ramdisk-storage-pool
sudo mount -t tmpfs -o size=18000M tmpfs /var/lib/libvirt/ramdisk-storage-pool
Define the volume pool:
sudo virsh pool-define-as --name ramdisk --type dir --target /var/lib/libvirt/ramdisk-storage-pool
Start the storage pool:
sudo virsh pool-start ramdisk
Create the volume (naming is up to you):
sudo virsh vol-create-as ramdisk fedora29 18350316k
Create your VM and specify that you wish to use fedora29 for your storage device (under the ramdisk pool). I used the virt-manager GUI for this.
(Optional) To delete the volume with virsh, you need to do:
sudo virsh vol-delete fedora29 --pool ramdisk
Destroy (stop) the volume-pool:
sudo virsh pool-destroy ramdisk
Unmount tmpfs (or suffer memory exhaustion!):
sudo umount /var/lib/libvirt/ramdisk-storage-pool
(Optional) Undefine the volume pool. It’s fine to leave it as it won’t auto-start unless you made it so:
sudo virsh pool-undefine ramdisk
I have performed three tests:
|                   | brd-nocache | brd  | tmpfs | my SSD |
|-------------------|-------------|------|-------|--------|
| Latency (msec)    | 0.05        | 0.04 | 0.06  | 0.23   |
| Throughput (GB/s) | 2.9         | 3.1  | 3.5   | 0.416  |
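The exact benchmark invocations aren’t reproduced here, but a fio run along these lines (parameters are illustrative, not the ones behind the table) measures this kind of sequential-read throughput and latency:

# Point --filename at the device or file under test (/dev/ram0, the
# tmpfs-backed volume, or the SSD). direct=1 bypasses the page cache
# so the device itself is measured.
fio --name=seqread --filename=/dev/ram0 --rw=read --bs=1M \
    --ioengine=libaio --direct=1 --size=4G --group_reporting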
To my surprise, tmpfs performed best in terms of throughput, while brd with “hypervisor default” for cache had the best latency results.
All in-memory based tests had more consistent read speeds compared to my SSD that would have a much higher variation.
The latency benefit is obvious compared to the SSD.
Cached brd might be the best solution latency-wise for this particular use-case, but I consider tmpfs to be easier to set up.
Check whether a ramdisk-backed FS performs any better than tmpfs. I doubt it, because it uses brd as a backing store, but it’s worth checking.
This is very useful if you are on a fast machine. I use the following on my Core-i7 4770:
PACKER_KEY_INTERVAL=10ms packer <rest-of-packer-params>
Assuming you are running on a host that has KVM installed and working, you can do:
vagrant plugin install vagrant-libvirt
Because most vagrant images come with VirtualBox support, do check out Chef’s Bento project!
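After that, bringing up a box under KVM is the usual flow (the box name is just an example of a libvirt-capable box):

vagrant init generic/fedora29
vagrant up --provider=libvirt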