Ubuntu fails to boot with a kernel panic after a kernel upgrade corrupts initrd.img

I have been tinkering with my servers these past few days. Yesterday I ran a system upgrade on an Ubuntu 20.04 VM in my Kubernetes cluster, and after rebooting it refused to start, earning me 20+ hours of site-wide downtime. I finally solved the problem today, so here are my notes.

Symptoms

  • After a kernel update, Ubuntu 20.04 LTS can no longer boot.
  • From the host, the VM shows low CPU and memory usage, does not respond to ACPI reboot requests, and can only be restarted with a hard reset.
  • Watching the boot log over VNC / terminal shows the following messages:
    • /init: conf/conf.d/zz-resume-auto: line 1: syntax error: unexpected "("
    • Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000200
  • After booting from a LiveCD, running lsinitramfs /boot/initrd.img shows a bogus conf/conf.d/zz-resume-auto file, similar to the image below.
Output of `lsinitramfs` with `conf/conf.d/zz-resume-auto` line

Cause

The direct cause is that the system update installed a new kernel version, and the regenerated initrd.img picked up a broken file. The root cause is harder to pin down; a similar bug has been sitting on Launchpad for years without a fix.
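
If you are curious what actually ends up inside the broken image, it can be unpacked from the recovery environment and the offending file printed. A hedged sketch (it assumes the original root is mounted at /mnt; depending on whether the image bundles early microcode, the file may sit under a main/ subdirectory or at the top level):

    mkdir /tmp/initrd && cd /tmp/initrd
    unmkinitramfs /mnt/boot/initrd.img .      # unpack the initramfs (unmkinitramfs ships with initramfs-tools)
    cat main/conf/conf.d/zz-resume-auto       # the script that triggers the "syntax error: unexpected (" message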

Fix

Entering a recovery environment

Since the system no longer boots, the first step is to get into a recovery environment. The easiest way would actually be to boot the previous kernel version and fix things from a normally running system, but my VM uses a cloud-init image and none of the tricks I found online would make the Grub menu show up. Below is how to get into a recovery environment by booting a Live CD and mounting the original system.

  1. Download the Ubuntu Live Server ISO and boot from it. This is easy in my VM environment; if you are on another VPS provider or a physical machine, look up how to boot from an ISO there, or use any other live recovery environment.
  2. Once the installer UI appears, do not continue; press Ctrl+Alt+F2 to switch to a TTY shell.

Regenerating initrd.img from the recovery environment

Once inside the recovery environment, follow these steps to mount the original system and generate a new initrd.img.

  1. Mount the original system's root filesystem, then chroot into it:
    sudo mount /dev/sda1 /mnt # replace /dev/sda1 with the original root partition; use lsblk to find it
    sudo mount --bind /dev /mnt/dev
    sudo mount --bind /proc /mnt/proc
    sudo mount --bind /sys /mnt/sys
    sudo chroot /mnt
  2. Regenerate a correct initrd.img:
    update-initramfs -cu -k 5.4.0-136-generic
    (Replace the last argument with the full version of the newest kernel; ls /lib/modules shows the available version numbers.)
  3. Inspect the newly generated initrd.img and make sure the bogus zz-resume-auto file is gone:
    lsinitramfs /boot/initrd.img | grep conf
    (The output should contain the normal files from the image above, minus zz-resume-auto.)
  4. Press Ctrl-D to leave the chroot, then run sudo umount /mnt/dev /mnt/proc /mnt/sys /mnt to unmount everything.
  5. Reboot and boot from disk to check whether the problem is fixed.

Reference: https://forum.level1techs.com/t/solved-kernel-panic/120146

Quick tip: blocking visitors from specific countries with CloudFlare (works on the free plan)

CloudFlare's firewall feature lets you allow or block requests based on the visitor's country/region, continent, AS number, IP address and other conditions.

  1. Log into the CloudFlare dashboard, go to the "Firewall" → "Firewall Rules" page, and click the "Create a Firewall Rule" button.
  2. Give the rule any name, then in the rule builder below set the "Field" to "Country" and pick the country or region to block from the "Value" drop-down.
  3. To block more than one country, click the "Or" button after the first condition and set another country in the newly added row.
  4. At the bottom, set "Choose an action" to "Block", then save.

Example of firewall rule

Free-plan users can create at most 5 rules, but a single rule can in fact contain multiple conditions.
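
If you prefer the expression editor to the visual builder, the same multi-country rule can be written as a single expression. A sketch (the two-letter country codes "XX" and "YY" are placeholders for the ISO codes you want to block):

    (ip.geoip.country eq "XX") or (ip.geoip.country eq "YY")

with the action still set to "Block".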

How the “Scan API” added in xiaodu-jsdelivr 1.3 works

Months ago I released the WordPress plugin “xiaodu-jsdelivr”, which scans static resources and replaces references to them with jsDelivr CDN links. Details on how the plugin works can be found in the previous blog post. Yesterday, I released a new version 1.3, which contains a new feature called “Scan API”.

API Manager: https://xiaodu-jsdelivr-api.du9l.com/

What it is and why it is needed

“Scan API” is a hosted service which provides plugin users with pre-calculated scan results. Previous versions of the plugin used a more direct approach: calculate the local file hash, fetch the CDN file, and match their hashes. This obviously works, but it works slowly, because downloading remote files is time-consuming, which is why initial scans after installing the plugin are always slow. It usually took dozens of 30-second scanning sessions to complete an initial scan covering base WordPress, several plugins, and themes. When you consider that every user's website has to go through the same process (there is no import/export feature yet), it makes even less sense.

This is where the hosted API service becomes helpful. By pre-fetching and storing the hashes of WordPress and official plugins and themes (with all versions), and serving them to a client plugin when needed, the repetitive fetching and calculating can be avoided. That means the scanning process can be greatly accelerated, as long as the resources scanned are present in the API storage.
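
To give a feel for what the direct approach costs per file, it boils down to something like the following sketch (the jsDelivr URL and the hash algorithm are only illustrative, not necessarily what the plugin actually builds):

    # hash the local copy
    sha256sum wp-includes/js/jquery/jquery.min.js
    # download the CDN copy and hash it - the network round trip is the slow part
    curl -s https://cdn.jsdelivr.net/gh/WordPress/WordPress@5.8.1/wp-includes/js/jquery/jquery.min.js | sha256sum

The Scan API replaces the second step with a lookup of a pre-calculated hash.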

Flow chart of old and new processes

Current state and future development

The ultimate goal for the service is to provide scanning support for base WordPress versions, plugins and themes. As of now, both the service and the plugin have only implemented the first part, which is to provide hashes for all published (on GitHub) versions of WordPress. That means the client only uploads the WordPress version, not plugin or theme versions, and the service only scans the WordPress repository.

The remaining parts will be added incrementally in the future, and I will have to think them through carefully. There are over 100,000 themes and plugins (combined) in the official SVN repositories, and it is unrealistic to scan and store them all. So the first development goal may be to support the most popular ones in each category.

Also, as of right now the service is completely free. In the near future I don’t have a viable plan to charge users for the service, because with payments come payment gateway integrations and support requests. But I cannot guarantee that it will stay free forever – it may go paid or it may go away.

Technical details

Plugin users can stop reading here, because the following part is about technical details on how the service is built.

The API service is essentially a Python website built with the Flask framework, with these main components:

  1. Authentication: This is provided by Auth0 (free plan), chosen for being relatively easy to use and for the wide range of login channels it supports. It provides templates for a variety of languages, frameworks and applications, which can be downloaded and modified to fit basic authentication needs. In my case (Python + Flask), Authlib is used in the template to provide the OAuth 2 functionality.
  2. Web UI: Written with React + React Router. It is not strictly necessary to use React, or even to build a single-page application, but I chose this path to bring my skills up to date. For example, create-react-app makes it quite easy to create a TypeScript-based project, whereas in the old days one had to configure TypeScript and Babel by hand. Also, React hooks are an interesting recent addition.
  3. API: There are two parts: the actual “Scan API” that the client plugin queries for stored data, and the backend API that the Web UI uses to manage API keys. MongoDB is used as the permanent store for both scanning data and user API keys.
  4. Scanner: Workers that regularly download and calculate remote hashes. A task queue (Celery) is used to manage scanning tasks, and the downloading part is handled by GitPython (later maybe PySVN will also be used).

The whole project is deployed in my private Kubernetes cluster, using Jenkins to build the frontend and backend and push the built images to an in-cluster docker registry. In the process of building the service, I am constantly amazed by how web development has evolved nowadays, with a lot of great tools and libraries available.

Setting up a “slim edge” with kernel packet forwarding and encryption

I have some VPS servers with good connections but rather limited hardware resources, which cannot act as edge nodes handling Kubernetes traffic by themselves. For instance, a VPS server with a CN2-GIA connection is perfect for Mainland Chinese visitors, but it comes with a shy 512 MB of memory, which is a little tight for running Docker, Kubelet and Nginx Ingress Controller.

So I am trying to set up a forwarding mechanism to proxy traffic between visitors and another, more powerful Kubernetes node. A simple solution might be to run a TCP port proxy, such as nginx, haproxy, socat, or even the Linux kernel's iptables MASQUERADE feature (a minimal socat sketch follows the list below). The problems with this method are:

  • The original visitor IP will be lost and replaced with the edge’s IP.
    • Sure, you can use the PROXY Protocol to wrap and forward the client IPs, but that requires additional setup. For instance, Nginx Ingress Controller can accept either only traffic with the PROXY Protocol or only traffic without it, but not both.
  • With the exception of kernel-level forwarding (iptables), traffic must be handled in user space, which can increase the load on an underpowered node.
  • Traffic is forwarded between nodes as-is, with no additional encryption.
    • This will be problematic if the two nodes are not connected with a private network, and the traffic forwarded is in a plaintext protocol.
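
For comparison, the simplest user-space variant of that approach is just a TCP relay on the edge node; a minimal socat sketch (port and target address are placeholders) would be:

    socat TCP-LISTEN:80,fork,reuseaddr TCP:[POWERFUL.NODE.PRIVATE.IP]:80

It works, but every connection is shuffled through a user-space process and arrives at the backend with the edge's IP as its source.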

I experimented with a new solution, which overcomes all the problems mentioned above. Here is an overall illustration of the setup:

Traffic flow between clients and servers

The following are the incoming and outgoing traffic flows, along with the commands used to set up forwarding:

  1. Incoming traffic
    1. When a client from [Client.IP] hits [Slim.Edge.Public.IP]:80, the traffic will be received by the edge server’s public network interface.
    2. Edge server changes the destination address to [Powerful.Node.Private.IP]:80 using the kernel’s DNAT mechanism. This happens in the pre-routing stage.
      iptables -t nat -A PREROUTING -p tcp -m multiport --dports 80,443 -d [SLIM.EDGE.PUBLIC.IP] -j DNAT --to-destination [POWERFUL.NODE.PRIVATE.IP]

      • Note that unlike other proxy setups, there is no masquerading (iptables’ MASQUERADE action) in place, because we want to preserve the original source IP.
    3. The kernel will pass the incoming traffic on to the edge server's WireGuard interface. By default, forwarding traffic is not allowed, so a forwarding rule has to be added explicitly.
      iptables -t filter -A FORWARD -o [SLIM.EDGE.WIREGUARD.INTERFACE] -j ACCEPT

      • Here [SLIM.EDGE.WIREGUARD.INTERFACE] is the slim node’s WireGuard interface, such as wg0.
      • Also remember to enable the net.ipv4.ip_forward toggle. This is pretty routine, so I won't go into details.
    4. The powerful node will receive the traffic on its WireGuard interface, and since the ingress controller (or HTTP server, or whatever daemon you are forwarding the traffic to) is listening on the host interface with a wildcard address (0.0.0.0), the forwarded packets will be handled just like direct ones, with the client's source IP addresses intact.
  2. Outgoing traffic
    1. The powerful node will probably reply with a packet from [Powerful.Node.Private.IP]:80 to [Client.IP]:(source-port).
    2. Use an IP rule to make sure that traffic coming out of the private interface, regardless of the destination address, will be sent to the WireGuard interface.
      ip rule add from [POWERFUL.NODE.PRIVATE.IP] table 1234 prio 5678

      • Note that the routing table ID (1234) should be set in the WireGuard configuration (Table = 1234) in order for the wg-quick script to create and fill this routing table. The priority number (5678) can be anywhere between 1 and the main rule (usually 32766).
      • When setting up the WireGuard tunnel, make sure to put AllowedIPs = 0.0.0.0/0 (or the IPv6 counterpart ::/0) in the edge node's peer definition (see the configuration sketch after this list).
      • You can examine the generated routing table by running: ip route show table 1234, and the output should be similar to: default dev [POWERFUL.NODE.WIREGUARD.INTERFACE] scope link
    3. The traffic will be received by the edge node, which needs to change the source IP from [Powerful.Node.Private.IP] back to its own public IP, using the kernel's SNAT mechanism.
      iptables -t nat -A POSTROUTING -p tcp -s [POWERFUL.NODE.PRIVATE.IP] -j SNAT --to-source [SLIM.EDGE.PUBLIC.IP]
    4. From the edge node’s perspective, this packet is also a forwarded one, so another forwarding rule should be added.
      iptables -t filter -A FORWARD -i [SLIM.EDGE.WIREGUARD.INTERFACE] -j ACCEPT
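
For reference, a hedged sketch of what the powerful node's wg-quick configuration might look like, with the routing table and the edge peer's AllowedIPs set as described above (keys, ports and the tunnel address are placeholders, and your interface name may differ):

    cat > /etc/wireguard/wg0.conf <<'EOF'
    [Interface]
    PrivateKey = <powerful-node-private-key>
    Address = [POWERFUL.NODE.PRIVATE.IP]/24
    ListenPort = 51820
    Table = 1234            # wg-quick will create and fill routing table 1234

    [Peer]
    # the slim edge node
    PublicKey = <edge-node-public-key>
    Endpoint = [SLIM.EDGE.PUBLIC.IP]:51820
    AllowedIPs = 0.0.0.0/0  # route everything, including replies to clients, through the tunnel
    EOF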

With this setup, I can achieve the goal of receiving traffic on the slim edge node while avoiding the issues mentioned above: the client IP is preserved throughout, traffic forwarding happens entirely in kernel space, and traffic between the servers is encrypted.

However, the encryption is totally optional, so the WireGuard tunnel between servers can be easily replaced with other layer-3 tunnels like IPIP or GRE.
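
For quick reference, the edge-node side of the setup collects into a handful of commands. This is a sketch assembled from the steps above (interface names and addresses are placeholders; how the rules are ordered relative to any existing firewall rules is up to you):

    # enable kernel packet forwarding
    sysctl -w net.ipv4.ip_forward=1

    # incoming: rewrite the destination to the powerful node, allow forwarding out of the tunnel
    iptables -t nat -A PREROUTING -p tcp -m multiport --dports 80,443 -d [SLIM.EDGE.PUBLIC.IP] -j DNAT --to-destination [POWERFUL.NODE.PRIVATE.IP]
    iptables -t filter -A FORWARD -o [SLIM.EDGE.WIREGUARD.INTERFACE] -j ACCEPT

    # outgoing: rewrite the source back to the edge's public IP, allow forwarding in from the tunnel
    iptables -t nat -A POSTROUTING -p tcp -s [POWERFUL.NODE.PRIVATE.IP] -j SNAT --to-source [SLIM.EDGE.PUBLIC.IP]
    iptables -t filter -A FORWARD -i [SLIM.EDGE.WIREGUARD.INTERFACE] -j ACCEPT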

Quick tip: Alt+F7 shortcut not working on Windows?

I recently noticed that on my Windows machine, the Find Usages shortcut Alt+F7 stopped responding in the whole JetBrains family of IDEs (IDEA, PyCharm, PhpStorm and so on). For users of IDEA's default keymap, Find Usages gets used quite frequently.

I was going to track it down with some hotkey-conflict detection tools, but a quick Google search showed that many people have run into the same problem.

The culprit is NVIDIA GeForce Experience, which ships with the NVIDIA graphics drivers. Its "In-Game Overlay" feature is mainly triggered with Alt+Z, but it also registers a whole series of other shortcuts: Alt+F1, Alt+F2, Alt+F3, Alt+F5, Alt+F6, Alt+F7, Alt+F8, Alt+F9, Alt+F10 and Alt+F11. (Exercise for the reader: why not Alt+F4?)

The fix is to launch GeForce Experience from the Start menu, click the gear icon in the top-right corner to open Settings, and simply disable the "In-Game Overlay" feature. If you actually use the overlay, you can instead press Alt+Z to open it and change or remove the conflicting shortcuts under "Keyboard shortcuts" in its settings.