Shrinking the root disk of a Proxmox VE virtual machine

Today I tried to shrink the root disk of one of my KVM virtual machines running in Proxmox VE, from 200 GB to 100 GB. Below are the process and steps that I followed.

Obviously, this is an extremely dangerous operation: your virtual disk could be damaged and you could permanently lose your data. Please experiment on test VMs and make backups before operating on important ones. I am not responsible for any damage caused by following the steps below.

Preparation: GParted Live ISO

GParted (GNOME Partition Editor) will be used to edit partitions in the following stages. The Live ISO allows you to boot into a live graphical environment (independent of your original operating system) to use the software.

Find the latest GParted Live ISO download link on the official GParted website (gparted.org).
Using the Proxmox VE web interface, navigate to “local” > “ISO Images” > “Download from URL”, and download the ISO file to the host’s storage.

Stage 1: Shrink the file system partitions inside the disk image

The first stage is to shrink the existing file system, so that all the existing partitions and data are moved to the first half of the disk image.

Shut down the virtual machine and wait for it to fully stop.
In the VM’s Hardware tab, add a “CD/DVD Drive” and choose the ISO file downloaded earlier. It should be added as “ide0” (or higher if ide0 is already occupied).
Switch to the Options tab, edit the “Boot Order” option to check the newly added “ide0”, and move it to the top of the list.
Start the VM to boot into the GParted Live environment.
Switch to the VM’s Console tab to enter the VNC graphical interface.
You should see GParted Live being booted.
- When prompted by the boot menu (to choose keyboard mapping and others), use your best judgement.
- Eventually, you will enter a lightweight Linux desktop environment with GParted automatically launched.
Shrink and/or move the partitions so that everything they occupy, measured from the start of the disk, stays BELOW the desired new size, for example within 80% of it (80 GB for a 100 GB target).
- The buffer ensures that no data is accidentally cut off during the disk image resizing step because of unit differences (GB vs. GiB) or for any other reason.
Click the green checkmark button to apply your changes. Moving data blocks and shrinking partitions will take a while.
When all operations are completed, close GParted and shut down the VM.

Example of partition layout in GParted after shrinking
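
The buffer arithmetic from this stage can be sketched in a few lines. This is only an illustration: the helper name `partition_limit_mib` and the default of 80 percent are assumptions taken from the example above, and the script simply compares the decimal (GB) and binary (GiB) readings of the same target size.

```python
GB = 1000 ** 3   # decimal gigabyte
GIB = 1024 ** 3  # binary gibibyte
MIB = 1024 ** 2


def partition_limit_mib(target_size, unit=GIB, percent=80):
    """Largest partition end (in whole MiB) that stays within `percent`
    of the target disk size. Hypothetical helper, integer math only."""
    target_bytes = target_size * unit
    return target_bytes * percent // 100 // MIB


if __name__ == "__main__":
    # The same "100" is roughly 7% smaller when read as GB instead of GiB.
    print("100 GiB target, 80%:", partition_limit_mib(100, GIB), "MiB")
    print("100 GB target, 80%: ", partition_limit_mib(100, GB), "MiB")
```

Keeping the partitions below the smaller of the two results leaves room for either interpretation of the size used later on the host.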

Stage 2: Shrink the disk image in Proxmox VE host

After you have successfully shrunk the partitions, it’s time to reduce the disk image size on the PVE host to reclaim the unused space at the end of the virtual “hard drive”.

Connect to the PVE host’s terminal. You can use either SSH or the web interface’s “Shell” tab at the host level.
Depending on the type of your disk image, execute the appropriate commands.
- For my ZFS volume setup, I first ran sudo zfs list to check the image’s volume name (which was “rpool/data/vm-<vmid>-disk-0”), and then used sudo zfs set volsize=100G rpool/data/vm-<vmid>-disk-0 to set its size.
- For other setups such as LVM or plain “.qcow2” files, refer to this thread or other online resources to determine how to shrink them.
Once the disk image’s size is successfully changed, run sudo qm rescan -vmid <vmid> to update the size tracked by Proxmox VE.
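
For the ZFS case, the host-side commands can be assembled and reviewed before anything is run. The sketch below only builds and prints the command lines; it does not execute them. The VM ID 100, the helper name `shrink_commands`, and the `rpool/data` prefix (copied from my setup) are assumptions, so check the real volume name with zfs list first.

```python
import shlex


def shrink_commands(vmid, new_size, dataset_prefix="rpool/data", disk_index=0):
    """Build (but do not run) the host-side commands for a ZFS-backed disk.

    Hypothetical helper: the volume naming follows the pattern seen on
    my host and may differ on yours.
    """
    volume = f"{dataset_prefix}/vm-{vmid}-disk-{disk_index}"
    return [
        ["zfs", "list"],                                # confirm the volume name
        ["zfs", "set", f"volsize={new_size}", volume],  # shrink the zvol
        ["qm", "rescan", "--vmid", str(vmid)],          # refresh the size in PVE
    ]


if __name__ == "__main__":
    # Dry run: print the commands so they can be checked by eye.
    for argv in shrink_commands(100, "100G"):
        print(shlex.join(argv))
```

Printing first and pasting the lines by hand is deliberate here, since a wrong volume name in the second command is not something you can undo.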

Stage 3: Repair the partition table and fill the unused buffer space

In the previous stage, you cut off the last part of the virtual “hard drive”. If the disk uses a GPT partition table, this means that the backup table at the end of the drive is lost, and the primary table at the beginning still records the old disk size.

We also left a 20% buffer inside the new disk in Stage 1, which can now be filled by growing one or more of the partitions.

Start the VM and boot into the GParted Live environment again.
When GParted starts automatically, you might see two errors: “Invalid argument during seek…” and “The backup GPT table is corrupt…”. For now, click “Ignore” and “OK” respectively to dismiss them. Do NOT make any changes in GParted yet!
On the desktop of the live environment, open Terminal.
Use gdisk to repair the GPT partition table.
- First, run sudo gdisk /dev/sdX (replace X with your actual drive letter).
- You should see some errors about the GPT partition table being corrupt. Execute the v command to diagnose them.
- Understand and carefully follow the recommended steps to fix the issues. For me, the fix was x (expert mode) followed by e (relocate backup structures to the end of the disk).
- Execute p to check the partition table, and w to write the changes and exit.

Example process of repairing the GPT partition table
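
To see why gdisk complains, it helps to work out where the GPT backup structures are expected. The sketch below assumes 512-byte logical sectors and the common layout of 128 partition entries of 128 bytes each; `gpt_layout` is a hypothetical helper, and the 200 GiB and 100 GiB sizes mirror the example in this post.

```python
SECTOR = 512  # bytes per logical sector (assumed; 4096 is also possible)
GIB = 1024 ** 3


def gpt_layout(disk_bytes, sector=SECTOR, entries=128, entry_size=128):
    """Return the LBAs of the GPT backup structures for a disk of this size."""
    total = disk_bytes // sector
    table_sectors = (entries * entry_size + sector - 1) // sector
    return {
        "backup_header_lba": total - 1,                    # very last sector
        "last_usable_lba": total - 1 - table_sectors - 1,  # just before the backup table
    }


if __name__ == "__main__":
    old = gpt_layout(200 * GIB)
    new = gpt_layout(100 * GIB)
    # The primary header still points at the old backup location, which no
    # longer exists after the shrink; relocating the backup fixes that.
    print("backup header expected before shrink:", old["backup_header_lba"])
    print("backup header expected after shrink: ", new["backup_header_lba"])
    print("last usable sector after shrink:     ", new["last_usable_lba"])
```

This is also why the partitions must end well before the new disk size: the backup table and header need the final sectors of the disk.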

Switch back to GParted and choose “GParted > Refresh devices” to reload the repaired partition table.
Fill the unused space as desired, apply the changes, and shut down the VM once the changes are complete.
Remove the CD/DVD Drive from the VM’s Hardware tab, and revert the Boot Order settings.

Now you have completed all the steps to shrink both the virtual disk image and the partitions it contains. You can try to boot into the original VM operating system to confirm that everything is working.

References: Shrink disk size of VM : r/Proxmox

Thanks to Bing AI’s DALL·E 3 for generating the featured image of this article.

Mini post: A list of URLs for checking network connectivity

This is a list of URLs commonly used for checking network connectivity, and for detecting the existence of “Captive Portals”.

Provider	URL	China?	HTTP	HTTPS	IPv6
Google	`www.gstatic.com/generate_204`	⚠	204	204	✔
Google	`www.google-analytics.com/generate_204`	⚠	204	204	✔
Google	`www.google.com/generate_204`	❌	204	204	✔
Google	`connectivitycheck.gstatic.com/generate_204`	⚠	204	204	✔
Apple	`captive.apple.com`	✔	200	200	✔
Apple	`www.apple.com/library/test/success.html`	✔	200	200	⚠*
Microsoft	`www.msftconnecttest.com/connecttest.txt`	✔	200	❌	❌
Microsoft	`edge.microsoft.com/captiveportal/generate_204`	✔	204	400	✔
Cloudflare	`cp.cloudflare.com`	⚠	204	204	✔
Firefox	`detectportal.firefox.com/success.txt`	⚠	200	200	✔
Qualcomm (China)	`www.qualcomm.cn/generate_204`	✔	204	301	✔
Xiaomi	`connect.rom.miui.com/generate_204`	✔	204	204	❌
Huawei	`connectivitycheck.platform.hicloud.com/generate_204`	✔	204	204	❌
Vivo	`wifi.vivo.com.cn/generate_204`	✔	204	204	❌
USTC	`204.ustclug.org`	✔	204	204	✔

Table of URLs and their properties

About the “China?” column:

✔ – Chinese company (entity) or company with Chinese presence. Should be stable.
⚠ – Currently accessible (or partially accessible) from China, but could be blocked at any moment.
❌ – Currently blocked in China.

* Special note for www.apple.com: As of now, it doesn’t resolve to an IPv6 address when queried from China.
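
As a rough illustration of how these URLs are used, the sketch below fetches one of them over plain HTTP without following redirects and compares the response with the status from the table. The function names `looks_open` and `probe` are made up for this example, and the heuristic is deliberately simple; real captive portal detection in operating systems is more involved.

```python
import http.client


def looks_open(status, body, expected_status=204):
    """True if the response matches what an unintercepted network returns."""
    if status != expected_status:
        return False  # e.g. a 302 redirect to a login page
    # A genuine 204 has no body; a portal may inject an HTML page instead.
    return not (expected_status == 204 and body)


def probe(host, path="/generate_204", timeout=5):
    """Fetch http://host/path once; http.client does not follow redirects."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


# Example (needs network access):
#   status, body = probe("www.gstatic.com")
#   print("open" if looks_open(status, body) else "captive portal or offline")
```

For the entries that return 200, such as captive.apple.com, pass expected_status=200 and consider comparing the body with the known success page as well.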


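Even veterans get caught sometimes: rescuing an Edge homepage hijacked by Lenovo PC Manager

Yes, someone who has been surfing the web for more than ten years got their browser homepage hijacked by rogue software.

It started a while ago, when I ran a “portable” build of Lenovo PC Manager (联想电脑管家) on my laptop. It could not be uninstalled cleanly, so I had to remove the leftovers by hand. Today I noticed that launching Edge automatically opened https[://]discovery.lenovo.com.cn/home/baidu/v7/c4, which finally redirected to Baidu at https[://]www.baidu.com/?tn=15007414_12_dg. The same thing happened whether I launched Edge from the desktop icon, ran msedge.exe directly, or opened it from a Start menu search.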

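Analysis

First I ruled out the browser settings (setting startup to open the new tab page or a specific page made no difference) and a modified desktop shortcut.
Opening a new tab with Ctrl-T inside the browser worked fine.
Searching the registry for the keywords “lenovo.com.cn” or “baidu” returned nothing relevant.
Using Process Monitor (an official Microsoft utility, and the Windows stand-in for strace), I found that Explorer.exe was already launching msedge.exe with the suffix appended to the command line.
Chrome is unaffected by default, but if the Chrome executable “chrome.exe” is renamed to “msedge.exe”, it opens the hijacked page too.

At this point the picture was clear: something had hooked the Win32 API that launches processes (I do not know its exact name; think of it as the counterpart of the exec system call on Linux), and whenever it detected “msedge.exe” it rewrote the command line to append the suffix.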

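Removal

The method recommended online is to open Lenovo PC Manager and turn off the “browser homepage lock”. But I had only just managed to get rid of this rogue software, so installing it again was out of the question.

From past experience, hooks like this are usually implemented with a Windows driver. I could not find a handy tool that lists the third-party drivers running on the system, or that lists who has hooked a Win32 API. So I took the crude route: I opened %windir%\system32\drivers, sorted by date, and spotted IndexProtect.sys at a glance.

The fix from there was simple and blunt: first delete the service with sc delete IndexProtect (as administrator), then reboot and delete the driver file.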
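
The "sort the drivers folder by date" trick can also be scripted. The sketch below is only an illustration of that manual step: `newest_first` and `list_recent_drivers` are hypothetical helpers, the limit of 10 is arbitrary, and a recent modification time is merely a hint, not proof that a driver is malicious.

```python
import os
from pathlib import Path


def newest_first(entries, limit=10):
    """Sort (name, mtime) pairs so the most recently modified come first."""
    return sorted(entries, key=lambda e: e[1], reverse=True)[:limit]


def list_recent_drivers(directory, limit=10):
    """Return the newest *.sys files in `directory` as (name, mtime) pairs."""
    folder = Path(directory)
    if not folder.is_dir():
        return []
    entries = [
        (p.name, p.stat().st_mtime)
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".sys"
    ]
    return newest_first(entries, limit)


if __name__ == "__main__":
    windir = os.environ.get("windir", r"C:\Windows")
    for name, _mtime in list_recent_drivers(Path(windir) / "System32" / "drivers"):
        print(name)
```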

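Afterword

It came up in conversation recently that “PC manager” software seems to be a uniquely Chinese phenomenon. People elsewhere just install antivirus software and have no need for the 360 suite with its one-click system speedup and junk cleaning.

A fond wish: a world with no rogue software and no rogue search engines.

Thanks to Bing AI’s DALL·E 3 for generating the featured image of this article.

PS: The saddest part is that this nonsense cost me two hours that I had planned to spend writing a small tool. Damn it.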

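Kernel panic and boot failure on Ubuntu caused by a corrupted initrd.img after a kernel upgrade

I have been tinkering with my servers these past few days. Yesterday I ran a system upgrade on an Ubuntu 20.04 virtual machine in my Kubernetes cluster, and after the reboot the system would not come up, which earned me 20+ hours of site-wide downtime. I finally solved the problem today and am writing it down here.

Symptoms

After a kernel update on Ubuntu 20.04 LTS, the system fails to boot after a restart.
Seen from the host, the VM’s CPU and memory usage are low and it does not respond to an ACPI reboot; only a hard Reset works.
Watching the boot log over VNC / Terminal shows the following messages:
- /init: conf/conf.d/zz-resume-auto: line 1: syntax error: unexpected "("
- Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
After booting from a LiveCD, running lsinitramfs /boot/initrd.img shows the offending conf/conf.d/zz-resume-auto file, as in the figure below.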

Output of `lsinitramfs` with `conf/conf.d/zz-resume-auto` line

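Cause

The direct cause is that the Ubuntu system update installed a new kernel version, and a bad file was pulled in when initrd.img was generated. The root cause is still hard to pin down; similar bugs on Launchpad have been open for years without a fix.

Fix

Entering a recovery environment

Since the system cannot boot, the first step is to get into a recovery environment. The simplest way is actually to boot the previous kernel version, start the system normally, and repair it from there. But my VM uses a cloud-init image, and none of the methods I found online could make the GRUB menu appear. The following describes how to enter a recovery environment by mounting the system from a Live CD.

Download the Ubuntu Live Server ISO and boot from it. This is fairly simple in my virtual machine environment; for other VPS providers or physical machines, look up how to boot from an ISO yourself, or use another live recovery environment.
Once the installer screen appears, do not continue; press Ctrl+Alt+F2 to switch to a TTY command line.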

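Regenerating `initrd.img` in the recovery environment

Once in the recovery environment, follow the steps below to mount the original system and generate a new initrd.img file.

Mount the original system’s root filesystem, then switch into the original system’s environment:
sudo mount /dev/sda1 /mnt    # replace /dev/sda1 with the original system's root partition; lsblk can help find it
sudo mount --bind /dev /mnt/dev
sudo mount --bind /proc /mnt/proc
sudo mount --bind /sys /mnt/sys
sudo chroot /mnt
Run the command that regenerates a correct initrd.img:
update-initramfs -cu -k 5.4.0-136-generic
(Replace the last argument with the full version string of the newest kernel; ls /lib/modules shows the correct version.)
Check the newly generated initrd.img and make sure it no longer contains the zz-resume-auto file mentioned above:
lsinitramfs /boot/initrd.img | grep conf
(The output should list the normal files from the figure above, without zz-resume-auto.)
Press Ctrl-D to leave the original system’s environment, then run sudo umount /mnt/dev /mnt/proc /mnt/sys /mnt to unmount everything.
Reboot and try booting from the hard disk to see whether the problem is fixed.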
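
Picking the newest kernel from ls /lib/modules by eye is easy to get wrong, because a plain alphabetical sort puts 5.4.0-99 after 5.4.0-136. The sketch below compares the numeric parts instead. The helper names are made up, and the version strings other than 5.4.0-136-generic are illustrative only.

```python
import os
import re


def version_key(name):
    """Split '5.4.0-136-generic' into [5, 4, 0, 136] for numeric comparison."""
    return [int(n) for n in re.findall(r"\d+", name)]


def latest_kernel(names):
    """Return the entry with the highest numeric version."""
    return max(names, key=version_key)


if __name__ == "__main__":
    modules_dir = "/lib/modules"
    if os.path.isdir(modules_dir) and os.listdir(modules_dir):
        print(latest_kernel(os.listdir(modules_dir)))
    else:
        # Illustrative input when /lib/modules is not available.
        print(latest_kernel(["5.4.0-99-generic", "5.4.0-136-generic"]))
```

The printed string is what goes after -k in the update-initramfs command, assuming the newest installed kernel is the one that fails to boot.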

Reference: https://forum.level1techs.com/t/solved-kernel-panic/120146

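Tip: Blocking visitors from specific countries with CloudFlare (also available on the free plan)

CloudFlare provides a firewall feature that can allow or deny access based on conditions such as the visitor’s country/region, continent, AS number, or IP address.

Log in to the CloudFlare dashboard, go to “Firewall” → “Firewall Rules”, and click the “Create a Firewall rule” button.
Give the rule any name, then in the rule builder below choose “Country” as the “Field” and pick the English name of the country or region to block from the “Value” dropdown.
To block several countries, click the “Or” button after the first condition and set another country in the newly added row in the same way.
Under “Choose an action” at the bottom, select “Block”, then save.

Free plan users can create at most 5 rules, but a single rule can combine multiple conditions.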
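
The rule builder also has an expression editor, where several countries joined with “Or” end up as one text expression. The sketch below generates such an expression. It assumes the field name `ip.geoip.country` from CloudFlare's rule language (newer dashboards may show a different name for the same field), and "XX" and "YY" are placeholder country codes, not real ones.

```python
def country_block_expression(codes):
    """Join two-letter country codes into a single 'or' expression.

    Hypothetical helper; paste the result into the expression editor
    and verify it there before saving the rule.
    """
    if not codes:
        raise ValueError("at least one country code is required")
    return " or ".join(f'(ip.geoip.country eq "{c.upper()}")' for c in codes)


if __name__ == "__main__":
    print(country_block_expression(["XX", "YY"]))
```

Because every country fits into one expression, this uses only one of the five rules available on the free plan.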