Setting up your own IPv6 Tunnel

I have some servers that don’t come with native IPv6 connectivity, which means that in order to use the next generation protocol, they need to be tunneled by other IPv6-capable nodes over IPv4.

In the past I have exclusively gone for the Tunnel Broker service provided by Hurricane Electric. I loved their service not only because it is free and easy to set up, but also for the reasonably good quality of their tunnel, since HE is a well-known transit provider. But recently, one of my servers which I use as an Internet exit has been suffering when it tries to make connections to IPv6-enabled websites. The symptom is simple – I can ping6 some addresses but not others, and the frequency is getting higher. So, I decided to set up a private tunnel endpoint using one of my own IPv6-enabled servers.


I want to mimic the Tunnel Broker service as much as possible, because it is known to work. The service currently provides tunnel users with the following:

  • “Server IPv4 Address”: The remote IPv4 tunnel endpoint, like 66.220.*.*.
  • “Client IPv6 Address”: An IPv6 address representing the host connecting to the tunnel, like 2001:470:c:*::2.
  • “Server IPv6 Address”: An IPv6 address representing the tunnel server, also used as the IPv6 gateway of the client host, like 2001:470:c:*::1.
  • “Routed IPv6 Prefixes”: /64 or /48 subnets given to the tunnel operator to provide IPv6 connectivity to other internal networks through the tunnel.

I made one of my IPv6-connected servers the designated tunnel server. To serve in that role, it provides the following:

  • A public IPv4 address: I will use this as the “Server IPv4 Address”, which means my client host will connect to this endpoint over IPv4.
  • Three routable IPv6 addresses: The specs actually say I have 10, and I believe they would route the whole /64 to me if I set it up right. But for this particular use case, 3 is enough: *::1 is the address of the tunnel server, *::2 and *::3 as Server and Client IPv6 Address respectively.

Since I don’t have any other subnets to make routable, I don’t need to provide another routable IPv6 prefix.

Connecting tunnel client and server

We first need to make the client and server hosts able to reach each other using their IPv6 addresses. The protocol used by Tunnel Broker, and thus my new tunnel, is Simple Internet Transition (SIT). It is supported natively by the Linux kernel and is quite easy to set up. In fact, the Tunnel Broker service provides users with sample client configurations depending on their preferred network management tools. Here is an example using iproute2:

modprobe ipv6
ip tunnel add sit-ipv6 mode sit remote [SERVER-IPV4] local [CLIENT-IPV4] ttl 255
ip link set sit-ipv6 up
ip addr add [CLIENT-IPV6]/127 dev sit-ipv6
ip route add ::/0 dev sit-ipv6
ip -f inet6 addr

For my configuration, the client IPv6 address is *::3, and the netmask is set to /127 to include both ends’ addresses. If one wants to persist the configuration, they can use the method provided by their operating systems. Here is the example client configuration using Netplan (used at least by Ubuntu 18.04):

network:
  version: 2
  tunnels:
    sit-ipv6:
      mode: sit
      remote: [SERVER-IPV4]
      local: [CLIENT-IPV4]
      addresses:
        - "[CLIENT-IPV6]/127"
      gateway6: "[SERVER-IPV6]"
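As a quick sanity check on the /127 choice, the following Python sketch confirms that addresses ending in ::2 and ::3 fall into the same /127, while ::1 does not. The 2001:db8:0:1:: prefix is a documentation placeholder and an assumption, not the prefix used on my servers.

```python
import ipaddress

# Placeholder documentation prefix (an assumption); substitute your own /64.
PREFIX = "2001:db8:0:1::"

server = ipaddress.IPv6Address(PREFIX + "2")        # tunnel server end
client = ipaddress.IPv6Interface(PREFIX + "3/127")  # tunnel client end

# A /127 holds exactly two addresses, which suits a point-to-point
# link such as a SIT tunnel.
link = client.network
print(link)                # the network that contains both tunnel ends
print(link.num_addresses)  # how many addresses the /127 holds
print(server in link)      # whether the server end is on the same link
```

If the two tunnel addresses did not share the last-bit boundary (for example ::1 and ::2), a /127 would not cover both and a shorter prefix would be needed.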

The thing about SIT tunnels is that they are symmetrical, so in order to set up the server end, one needs to make the following changes:

  • Switch the server and client IPv4 addresses, so that the one after “local” is the IPv4 address of the configured machine.
  • Replace CLIENT-IPV6 with SERVER-IPV6 as the interface’s IPv6 endpoint.
  • Remove the route / gateway definition, since the server already has an external IPv6 gateway.
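The symmetry can be sketched as a small Python helper that prints the iproute2 commands for either end. The function name, the interface name default, and all addresses are illustrative assumptions (the addresses come from documentation ranges); the commands themselves mirror the iproute2 example above.

```python
def sit_commands(local_v4, remote_v4, local_v6, is_client, dev="sit-ipv6"):
    """Return the iproute2 commands for one end of the SIT tunnel."""
    cmds = [
        f"ip tunnel add {dev} mode sit remote {remote_v4} local {local_v4} ttl 255",
        f"ip link set {dev} up",
        f"ip addr add {local_v6}/127 dev {dev}",
    ]
    if is_client:
        # Only the client sends all IPv6 traffic through the tunnel;
        # the server keeps its existing external IPv6 gateway.
        cmds.append(f"ip route add ::/0 dev {dev}")
    return cmds


# Placeholder addresses from documentation ranges (assumptions).
CLIENT_V4, SERVER_V4 = "192.0.2.10", "198.51.100.20"
CLIENT_V6, SERVER_V6 = "2001:db8:0:1::3", "2001:db8:0:1::2"

client_cmds = sit_commands(CLIENT_V4, SERVER_V4, CLIENT_V6, is_client=True)
server_cmds = sit_commands(SERVER_V4, CLIENT_V4, SERVER_V6, is_client=False)
print("\n".join(server_cmds))
```

The only differences between the two outputs are the swapped IPv4 endpoints, the interface address, and the missing default route on the server.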

By now, both the tunnel server and client hosts should be able to reach each other with their brand new IPv6 addresses. This can be verified by running ping6 [SERVER-IPV6] on the client side, and vice versa.

Forwarding Tunneled Traffic

In order for the tunneled host to actually reach the global Internet, the tunnel server has to route IPv6 traffic from and to the host.

Forwarding Outgoing Traffic

Since [SERVER-IPV6] is configured to be the IPv6 gateway on the client host, all its traffic with a remote IPv6 destination address will be sent over the tunnel to the server side. By default, a server will not take the role of routing that traffic – it will only receive traffic destined to itself. To make it also forward traffic to the next hop, we need to enable packet forwarding in the kernel parameters. This can be done by running the following as root:

echo 1 > /proc/sys/net/ipv6/conf/[SERVER-TUNNEL-INTERFACE]/forwarding

This can be persisted across reboots by appending net.ipv6.conf.[SERVER-TUNNEL-INTERFACE].forwarding=1 to /etc/sysctl.conf. Note that if you run a firewall such as ip6tables, you may need to configure its forwarding rules, or change the default forwarding policy to ACCEPT.

Accepting Incoming Traffic

When traffic arrives for the client host’s address, the tunnel server’s upstream gateway sends a “Neighbor Solicitation Message” to find out which host on the link owns that address. But the client host’s IPv6 address is not assigned to any interface on the server host, so the server does not reply to the message, causing the incoming traffic to be dropped.

In order for the tunnel server to respond to the solicitation message with a “Neighbor Advertisement Message”, we need to configure an NDP proxy for the server’s external interface. The first step is to enable NDP proxy in the Linux kernel:

echo 1 > /proc/sys/net/ipv6/conf/[SERVER-EXTERNAL-INTERFACE]/proxy_ndp

This parameter can be persisted in the same way as shown in the last section. Then we have to explicitly enable NDP proxy for the client IPv6 address. Using iproute2 this can be done as:

ip -6 neigh add proxy [CLIENT-IPV6] dev [SERVER-EXTERNAL-INTERFACE]

This line means that when the external router asks who owns the client IPv6 address on that interface, the server answers on the client’s behalf, advertising its own link-layer address. Then, when the traffic destined for the client host arrives, the server will forward it to the tunnel interface, since we configured a /127 subnet above to include the IPv6 addresses of both ends. This can be seen in the routing table by running ip -6 route on the server.
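The forwarding decision described here is a longest-prefix match. A minimal Python sketch, assuming a simplified three-entry routing table with placeholder interface names and a documentation prefix (none of these values come from my actual server):

```python
import ipaddress

# Simplified server routing table: (prefix, outgoing interface).
# Interface names and the 2001:db8:0:1:: prefix are placeholders (assumptions).
ROUTES = [
    (ipaddress.ip_network("2001:db8:0:1::2/127"), "sit-ipv6"),  # tunnel link
    (ipaddress.ip_network("2001:db8:0:1::/64"), "eth0"),        # on-link /64
    (ipaddress.ip_network("::/0"), "eth0"),                     # default route
]


def lookup(destination: str) -> str:
    """Return the interface of the longest prefix that contains the address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, dev) for net, dev in ROUTES if addr in net]
    return max(matches, key=lambda match: match[0])[1]


print(lookup("2001:db8:0:1::3"))       # client address
print(lookup("2001:4860:4860::8888"))  # external address
```

Because the /127 is more specific than the /64 and the default route, packets for the client address leave through the tunnel interface even though the address also belongs to the server's external prefix.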

The command also needs to be persisted, so that client hosts will not lose connectivity after the server reboots. How to persist it depends on the network management tool used by the server. For ifupdown, the command can be written in /etc/network/interfaces; if the server is using Netplan, the command should probably go in /etc/networkd-dispatcher/routable.d, since Netplan doesn’t come with native hook support.


I would like to revisit the route an outgoing packet will go through. Let’s say a process on the client host wants to access 2001:4860:4860::8888:

  • According to the routing table on the client end, the traffic should be forwarded to the gateway, SERVER-IPV6.
  • Then it will notice that the SERVER-IPV6 address belongs to the /127 subnet on sit-ipv6 interface.
  • When the packet is forwarded to the sit-ipv6 tunnel interface, it will be encapsulated with an IPv4 header, and sent to the SERVER-IPV4 address. This could be across the IPv4 Internet, or a private connection if there is one.
  • The encapsulated packet will be received by the sit-ipv6 interface on the server’s end, and unpacked to its original IPv6 form.
  • Since the IPv6 destination is an external one, and we have enabled forwarding on the server, it will be routed to the external gateway according to the routing table.

When the remote server replies, the packet goes the exact opposite way back to the client host.
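To make the encapsulation step concrete, here is a rough Python sketch of what the tunnel interface does to a packet: it prepends a 20-byte IPv4 header whose protocol field is 41 (IPv6-in-IPv4), and the far end strips it again. This is an illustration under simplifying assumptions (no fragmentation handling, no IPv4 options on the sending side, placeholder addresses), not the kernel's actual implementation.

```python
import ipaddress
import socket
import struct

IPPROTO_IPV6 = 41  # IPv4 protocol number for an encapsulated IPv6 packet


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over an even-length IPv4 header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def sit_encapsulate(ipv6_packet: bytes, local_v4: str, remote_v4: str, ttl: int = 255) -> bytes:
    """Wrap an IPv6 packet in a minimal IPv4 header (version 4, IHL 5)."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(ipv6_packet),  # version/IHL, TOS, total length
        0, 0,                            # identification, flags/fragment offset
        ttl, IPPROTO_IPV6, 0,            # TTL, protocol, checksum placeholder
        socket.inet_aton(local_v4), socket.inet_aton(remote_v4),
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:] + ipv6_packet


def sit_decapsulate(packet: bytes) -> bytes:
    """Strip the outer IPv4 header and return the original IPv6 packet."""
    if packet[9] != IPPROTO_IPV6:
        raise ValueError("outer header does not carry IPv6")
    return packet[(packet[0] & 0x0F) * 4:]


# A bare 40-byte IPv6 header (next header 59 = no payload); addresses are placeholders.
inner = struct.pack(
    "!IHBB16s16s", 6 << 28, 0, 59, 64,
    ipaddress.IPv6Address("2001:db8:0:1::3").packed,
    ipaddress.IPv6Address("2001:4860:4860::8888").packed,
)
outer = sit_encapsulate(inner, "192.0.2.10", "198.51.100.20")
print(len(outer), outer[9], sit_decapsulate(outer) == inner)
```

The 20 extra bytes per packet are also why tunnel interfaces typically end up with a smaller MTU than the underlying IPv4 link.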

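List of Western Digital drive models using SMR and PMR, by series

Some time ago, the news that WD's Red series uses SMR (shingled magnetic recording) caused quite a stir. The Red series is marketed for NAS storage; although it is not “high-end” (it spins at the same low speed as the Blue and Green drives), it costs a good deal more than the Blue drives, so WD took heavy criticism for it.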


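Before this, WD did not disclose the recording technology inside each product. I checked WD's official website today, and it now states exactly which models in each series use SMR and which use PMR (perpendicular magnetic recording; WD calls it CMR, conventional magnetic recording). Earlier reports said that the 2 to 6 TB Red drives use SMR, but from what I can see that applies only to some models, and the Red drive I bought from a certain online retailer is among the CMR models. So don't rush to return your drive: check your model against the lists below to see whether it is affected, then decide.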


Models using SMR:

  • WD Red™ 3.5”: WD20EFAX (2TB), WD30EFAX (3TB), WD40EFAX (4TB), WD60EFAX (6TB)
  • WD Blue™ 3.5”: WD20EZAZ (2TB), WD60EZAZ (6TB)
  • WD Blue™ 2.5”: WD10SPZX (1TB), WD20SPZX (2TB)
  • WD Black™ 2.5”: WD10SPSX (1TB)

Models using PMR (CMR):

  • WD Red™ (2.5” / 3.5”): WD10JFCX (1TB), WD10EFRX (1TB), WD20EFRX (2TB), WD30EFRX (3TB), WD40EFRX (4TB), WD60EFRX (6TB), WD80EFAX (8TB), WD100EFAX (10TB), WD101EFAX (10TB), WD120EFAX (12TB), WD140EFAX (14TB)
  • WD Red Pro (3.5”): WD2002FFSX (2TB), WD4002FFWX (4TB), WD4003FFBX (4TB), WD6002FFWX (6TB), WD6003FFBX (6TB), WD8003FFBX (8TB), WD102KFBX (10TB), WD121KFBX (12TB), WD141KFGX (14TB)
  • WD Black™ 3.5”: WD5003AZEX (500GB), WD1003FZEX (1TB), WD2003FZEX (2TB), WD4005FZBX (4TB), WD6003FZBX (6TB)
  • WD Black™ 2.5”: WD2500LPLX (250GB), WD3200LPLX (320GB), WD5000LPLX (500GB)
  • WD Blue™ 3.5”: WD5000AZLX (500GB), WD5000AZRZ (500GB), WD10EZRZ (1TB), WD10EZEX (1TB), WD20EZRZ (2TB), WD30EZRZ (3TB), WD40EZRZ (4TB), WD60EZRZ (6TB)
  • WD Blue™ 2.5”: WD3200LPCX (320GB), WD5000LPVX (500GB), WD5000LPCX (500GB), WD5000LQVX (EA) (500GB)
  • WD Purple (3.5”): WD10EJRX (1TB), WD20EJRX (2TB), WD30EJRX (3TB), WD40EJRX (4TB), WD60EJRX (6TB), WD80EJRX (8TB), WD81PURZ (8TB), WD82PURZ (8TB), WD100EJRX (10TB), WD101EJRX (10TB), WD121EJRX (12TB), WD140EJRX (14TB)
  • WD Gold (3.5”): WD1005VBYZ (1TB), WD2005VBYZ (2TB), WD4003VRYZ (4TB), WD6003VRYZ (6TB), WD8004VRYZ (8TB), WD102VRYZ (10TB), WD121VRYZ (12TB), WD141VRYZ (14TB)
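Checking a drive against these lists can be scripted. The sketch below is a minimal Python lookup: the SMR set is copied from the list above, only the WD Red CMR models are included to keep it short, and the handling of a suffix after a dash on the drive label is an assumption about how labels are printed.

```python
# SMR models copied from the list above.
SMR_MODELS = {
    "WD20EFAX", "WD30EFAX", "WD40EFAX", "WD60EFAX",  # WD Red 3.5"
    "WD20EZAZ", "WD60EZAZ",                          # WD Blue 3.5"
    "WD10SPZX", "WD20SPZX",                          # WD Blue 2.5"
    "WD10SPSX",                                      # WD Black 2.5"
}

# Only the WD Red CMR models are listed here to keep the sketch short.
CMR_RED_MODELS = {
    "WD10JFCX", "WD10EFRX", "WD20EFRX", "WD30EFRX", "WD40EFRX", "WD60EFRX",
    "WD80EFAX", "WD100EFAX", "WD101EFAX", "WD120EFAX", "WD140EFAX",
}


def recording_technology(model: str) -> str:
    """Classify a model number as 'SMR', 'CMR' or 'unknown'."""
    # Assumption: anything after a dash is a revision suffix, not part of
    # the base model number that the lists use.
    base = model.strip().upper().split("-")[0]
    if base in SMR_MODELS:
        return "SMR"
    if base in CMR_RED_MODELS:
        return "CMR"
    return "unknown"


print(recording_technology("WD40EFAX"))
print(recording_technology("WD40EFRX"))
```

An "unknown" result only means the model is missing from this shortened table, so it should be checked against the full lists by hand.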


  1. On WD Red NAS Drives – Western Digital Blog
  2. Western Digital Online


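April 22, 2020

The past week has been eventful, to say the least. As a team, it was important that we listened carefully and understood your feedback about our WD Red NAS drives, specifically how we communicated which recording technologies are used. Your concerns were heard loud and clear. Here is that list of our client internal HDDs available through the channel.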





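April 20, 2020

Recently, there has been some discussion regarding the recording technology used in some of our WD Red hard disk drives (HDDs). We regret any misunderstanding and want to take a few minutes to discuss the drives and provide some additional information.

WD Red HDDs are ideal for home and small businesses using NAS systems. They are great for sharing and backing up files using one to eight drive bays and for a workload rate of 180 TB a year. We have rigorously tested this type of use, and it has been validated by the major NAS providers.

We typically specify the designed-for use cases and performance parameters and don't necessarily talk about what's under the hood. One of those innovations is Shingled Magnetic Recording (SMR) technology.

SMR is a tested and proven technology that enables us to keep up with the growing volume of data for personal and business use. We are continuously innovating to advance it. SMR technology is implemented in different ways: drive-managed SMR (DMSMR), implemented on the device itself, as in our lower-capacity (2TB to 6TB) WD Red HDDs, and host-managed SMR, which is used in high-capacity data center applications. Each implementation serves a different use case, ranging from personal computing to some of the largest data centers in the world.


WD Red HDDs have for many years reliably powered home and small business NAS systems around the world and have been consistently validated by major NAS manufacturers. Having built this reputation, we understand that, at times, our drives may be used in system workloads far exceeding their intended uses. Additionally, some of you have recently shared that in certain data-intensive, continuous read/write use cases, WD Red HDD-powered NAS systems are not performing as you would expect.

If you are encountering performance that is not what you expected, please consider our products designed for intensive workloads. These may include our WD Red Pro or WD Gold drives, or perhaps an Ultrastar drive. Our customer care team is ready to help and can also determine which product might be best for you.

We know you entrust your data to our products, and we don't take that lightly. If you have purchased a WD Red drive and are experiencing performance or any other technical issues, please call our customer care. We will have options for you. We are here to help.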

Kubernetes + Flannel: UDP packets dropped for wrong checksum – Workaround

Update on July 22

Stable versions have been released in all supported branches (v1.16.13, v1.17.9 and v1.18.6) that include the fix needed. According to the change log:

Fixes a problem with 63-second or 1-second connection delays with some VXLAN-based network plugins which was first widely noticed in 1.16 (though some users saw it earlier than that, possibly only with specific network plugins). If you were previously using ethtool to disable checksum offload on your primary network interface, you should now be able to stop doing that. [ref]

Update on June 17

After 3 months, the problem has been located and solved on Kubernetes’ end. Long story short, they decided that neither the Linux kernel nor Flannel required any change. Instead, a mark added by kube-proxy caused the kernel to double-NAT the packet, which in turn was sent with a wrong checksum. A pull request has been merged to master and will soon be backported to release branches.

To see the whole story, check out the following links:

Recently I noticed that some DNS queries in my Kubernetes cluster timed out, causing apps to crash. I looked into the issue, reported it to the kernel network team and applied a workaround.


My Kubernetes cluster is built on the Flannel overlay network with the vxlan backend. The idea is that each node (machine) gets a private IP subnet to further allocate to pods. When a cross-node packet is to be sent, it is sent to the vxlan virtual interface, encapsulated in a UDP packet (regardless of the original protocol) and routed to the other node, where it is received by another vxlan interface and extracted.*

Kubernetes clusters provide the Service resource. One of the many types of Service allows you to use a single virtual IP to represent multiple pods, sometimes across nodes. This is implemented by the kube-proxy component, which utilizes the IPVS feature in the Linux kernel.

Now, when I make a DNS query on the host (it should be the same from inside containers, but with more hops) using dig against CoreDNS’ service IP, it always times out. It works fine if I query one of the backend pod’s IP instead.


I used tcpdump to capture the packet, and noticed that the encapsulated UDP packet had a bad UDP checksum.

06:22:23.699846 IP (tos 0x0, ttl 64, id 7598, offset 0, flags [none], proto UDP (17), length 133) > [bad udp cksum 0xd2ae -> 0x245b!] OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 63, id 33703, offset 0, flags [none], proto UDP (17), length 83) > [udp sum ok] 41922+ [1au] A? ar: . OPT UDPsize=4096 (55)

Further testing on the receiving end shows that the packet is transferred, but dropped on the target node. That makes it certain that the checksum is what caused the DNS query to time out with no response.

A little more Googling shows that this could be caused by “checksum offloading”. That means that when the kernel sends a packet out on a physical Ethernet card, it can leave the checksum calculation to the card hardware. In this case, if you capture the packet in the kernel, it will show a wrong checksum, since it has yet to be calculated; the same packet captured on the receiving end, however, will have a different and correct checksum, calculated by the sender’s network card hardware.
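For reference, this is roughly the calculation that is being offloaded. The Python sketch below computes a UDP checksum over the IPv4 pseudo-header and the UDP segment, then shows that a receiver's verification only passes once the field has been filled in. Addresses, ports and payload are placeholders, and the code illustrates the standard algorithm rather than what any particular NIC or driver does.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by IPv4, UDP and TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src: str, dst: str, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus UDP header and payload."""
    pseudo = (socket.inet_aton(src) + socket.inet_aton(dst)
              + struct.pack("!BBH", 0, 17, len(segment)))  # 17 = UDP
    return internet_checksum(pseudo + segment)


# Placeholder addresses, ports and payload (assumptions for illustration).
SRC, DST = "192.0.2.1", "192.0.2.2"
payload = b"example query"

# UDP header with the checksum field still zero, standing in for a packet
# whose checksum has been left for the hardware to finish.
header = struct.pack("!HHHH", 40000, 53, 8 + len(payload), 0)
checksum = udp_checksum(SRC, DST, header + payload)

# What should go on the wire once the checksum has been filled in.
filled = header[:6] + struct.pack("!H", checksum) + payload

# A receiver recomputes over the whole segment; a result of 0 means "valid".
print(udp_checksum(SRC, DST, filled))
```

A packet that leaves the host with the field unfinished fails this verification on the receiving node, which matches the silent drops seen above.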


I tried to use ethtool to disable TX (outgoing) checksum offloading on flannel.1 (vxlan virtual interface), and the query works again. So my guess is the kernel driver miscalculated / forgot to calculate the checksum with offloading turned on; when it’s off, it used kernel code to calculate the correct checksum before sending it to the actual outgoing network card.

To temporarily turn off checksum offloading:

sudo ethtool -K flannel.1 tx-checksum-ip-generic off

I used a systemd service to do this automatically after the interface appears. Note that the interface is created by flannel after kubelet starts, so you can’t simply run the command at boot time, e.g. in /etc/rc.local.

You can use the following code to create the service /etc/systemd/system/xiaodu-flannel-tx-off.service, then enable and start it. (The service file can be downloaded using this link.)

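sudo tee /etc/systemd/system/xiaodu-flannel-tx-off.service > /dev/null << EOF
[Unit]
Description=Turn off checksum offload on flannel.1
# Assumption: the device unit name below is reconstructed, not taken from the original file
After=sys-subsystem-net-devices-flannel.1.device
[Service]
Type=oneshot
ExecStart=/sbin/ethtool -K flannel.1 tx-checksum-ip-generic off
[Install]
WantedBy=sys-subsystem-net-devices-flannel.1.device
EOF
sudo systemctl enable xiaodu-flannel-tx-off
sudo systemctl start xiaodu-flannel-tx-off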

systemd 245 and later add a TransmitChecksumOffload parameter to *.link units. You can read the docs and try it out yourself, or just use the service above.

* If you want to learn more about how Flannel and Kubernetes networking works under the hood, I strongly suggest that you read this blog post, which gives a step-by-step demonstration of how a packet is sent from one pod to another.

Solution: After switching to SyntaxHighlighter Evolved, all the codes are scrambled


If you switched from another syntax highlighting plugin to SyntaxHighlighter Evolved, and all your code is scrambled, try running the following code as a single-file plugin.

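// Report every post as format version 2 (already encoded), so the plugin does not encode it again.
function xiaodu_syntaxhighlighter_fix() {
	return 2;
}
add_filter('syntaxhighlighter_pre_getcodeformat', 'xiaodu_syntaxhighlighter_fix');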

The easiest way is to go to Plugins – Plugin Editor, and paste the code at the bottom of any enabled plugin, maybe Hello Dolly.

Long version

After almost three years I finally started working on this blog again.

One of the first things I noticed is that the syntax highlighting plugin I used, Crayon Syntax Highlighter, is dead. Well, to its credit, the server-rendered markups it generated were… fine when I started this blog, but nowadays they just look sickening to me, especially when compared to the neat client rendering solutions.

So I went to the store and downloaded the most popular choice, SyntaxHighlighter Evolved, which uses the JavaScript library SyntaxHighlighter to perform client-side highlighting. After installing it and converting all my old code tags to their markup format, I found the highlighting to be working, but all the C++ and HTML looked screwed up.

Scrambled code

As you can see, all the “<”, “>” and “&” in the code are now showing up as their HTML entities: “&lt;”, “&gt;” and “&amp;”. That is not cool, so I looked into the problem.

Looking under the hood

The first thing we need to know is how the code is stored in the post. When I clicked on the “Text” tab in the post editor (yep, the old one… I haven’t adapted to the new blocks yet), I found that the characters were displayed correctly.

Code in post editor

Then I looked further into MySQL, and the code is stored encoded, which is fine – it can be stored in the database however it fits, as long as the final output is correct… which it isn’t.

Code in MySQL

Pinpointing the plugin code

Now that I know that the plugin did an extra encoding, I looked for “htmlspecialchars” in the plugin’s GitHub repository, and found this piece of code:

	// This function determines what version of SyntaxHighlighter was used when the post was written
	// This is because the code was stored differently for different versions of SyntaxHighlighter
	function get_code_format( $post ) {
		if ( false !== $this->codeformat )
			return $this->codeformat;
		if ( empty($post) )
			$post = new stdClass();
		if ( null !== $version = apply_filters( 'syntaxhighlighter_pre_getcodeformat', null, $post ) )
			return $version;
		$version = ( empty($post->ID) || get_post_meta( $post->ID, '_syntaxhighlighter_encoded', true ) || get_post_meta( $post->ID, 'syntaxhighlighter_encoded', true ) ) ? 2 : 1;
		return apply_filters( 'syntaxhighlighter_getcodeformat', $version, $post );
	}
	// Adds a post meta saying that HTML entities are encoded (for backwards compatibility)
	function mark_as_encoded( $post_ID, $post ) {
		if ( false == $this->encoded || 'revision' == $post->post_type )
			return;
		delete_post_meta( $post_ID, 'syntaxhighlighter_encoded' ); // Previously used
		add_post_meta( $post_ID, '_syntaxhighlighter_encoded', true, true );
	}

Apparently, years ago they changed how code is stored in the post. Now if you write and save a new post with their plugin installed, they will save the code already encoded, and insert a post meta “_syntaxhighlighter_encoded = True” to mark the post as the “new (encoded) format”.

But if you are like me and used other plugins when initially posting the code and later switched to Evolved, you are out of luck, as they consider your post by default the “old format,” and will encode the code again in the final output.
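The effect of the second encoding pass is easy to reproduce. The sketch below uses Python's html.escape as a stand-in for PHP's htmlspecialchars (an assumption made for illustration; the plugin itself is PHP), and the sample line of C++ is made up.

```python
import html

source = 'if (a < b && b > 0) std::cout << "ok";'

# What the "new format" keeps in the database: encoded once.
stored = html.escape(source, quote=False)

# What the final output contains when an "old format" post is encoded again.
output = html.escape(stored, quote=False)

# A browser decodes entities exactly once when rendering the page.
print(html.unescape(stored))  # readable code
print(html.unescape(output))  # entities such as &lt; remain visible
```

Decoding the doubly encoded output once yields the singly encoded text, which is exactly the scrambled code shown in the screenshot.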


The apparent solution is to make the plugin think that all my posts are in the new format. I could add the same metadata to each of the posts, but luckily there is an easier way: use the filter “syntaxhighlighter_pre_getcodeformat” (line 8 in the code above) they provided to override the result.

So I used the plugin code at the beginning to hook it. The hook function simply returns 2, which means all my posts, with or without the metadata, will be considered the already-encoded new format, so they will not be doubly encoded.

It’s been years, is that all you have to say?

OK, fair enough. So this blog may look the same, but the tech underneath it is constantly changing.

For example, all my appliances are now hosted on my own bare-metal (as opposed to cloud-vendored like GKE) globally-distributed Kubernetes cluster. Also, I have been hiding behind CloudFlare for years to avoid the haters, but now they are mostly gone (or grown up 🙄,) so I have been thinking of new ways to distribute my content.

All of this new stuff is exciting and worth sharing, and I will write about it soon™.


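I have a Samsung Android tablet (SM-P601) with a resolution of 2560×1600 and a pixel density of 320, and certain idiotic apps often decide that it is not a mobile device. Today I discovered that a wm command can temporarily change the resolution, so I rashly set it to 640×360, and disaster struck: I made the change as root in Terminal Emulator without ADB enabled, the screen became too small to show everything, I could not type wm reset, and rebooting did not help. With the help of almighty Google, I eventually got it restored after a bumpy ride. Here are the pitfalls I hit along the way.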

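Stage 1: How to enable ADB?

A quick search showed that, sure enough, someone else had been just as foolish as me: they changed the resolution without ADB enabled and could not change it back, for example this fellow on XDA. He then worked out how to force-enable ADB through Recovery:

I use TWRP. First, long-press the power button to reboot, and hold Home + Volume Up to enter Recovery (the key combination differs between devices). Then mount System under Mount, go to Advanced – Terminal, and use vi /system/build.prop to add the following lines: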


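If you don't know vi or find it a hassle, you can use File Manager to copy /system/build.prop to /sdcard, copy it to your computer through TWRP's MTP to edit it, and then copy it back.

After rebooting into the system, adb devices on the computer shows the device, but it is marked unauthorized and adb shell is unavailable, which brings us to Stage 2. (If your device runs an older version that does not require ADB authorization, or you have authorized this computer before, you can skip to the end.)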

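Stage 2: How to authorize ADB?

Operating the screen is still impossible. Normally, after connecting over ADB, an “Allow USB debugging?” prompt appears, and tapping Allow completes the authorization. Another search turned up someone on StackOverflow who had solved this problem too.

First find the public key on your computer. It may be in the location given by the $ANDROID_SDK_HOME environment variable (if you have installed the SDK), in C:\Users\<username>\.android (Windows default), or in ~/.android (default on other systems). Once you have found it, rename it to adb_keys and transfer it through TWRP's MTP to the root of /sdcard (that is, internal storage) on the device.

Then go back to TWRP's Advanced – File Manager, find /sdcard/adb_keys and copy it into the /data/misc/adb directory. Reboot the device, and ADB will be able to connect.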


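Once the system has booted, enter adb shell. The screen must be off (locked) while you change the resolution, so it is best not to change it from the Terminal Emulator command line.


  • am display-size reset (< Android 4.3)
  • wm size reset (≥ Android 4.3)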


  • am display-size 1080x720
  • am display-density 240
  • wm size 1080x720
  • wm density 240
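The choice between am and wm can be captured in a small helper. This Python sketch only builds the command strings listed above for a given Android version; the function name and parameters are made up for illustration, and the commands still have to be run inside adb shell.

```python
def display_commands(android_version, size=None, density=None):
    """Return the shell commands for resetting or setting the display.

    android_version is a (major, minor) tuple; size is (width, height).
    With neither size nor density given, the reset command is returned.
    """
    uses_wm = tuple(android_version) >= (4, 3)  # wm replaces am from 4.3 on
    size_cmd = "wm size" if uses_wm else "am display-size"
    density_cmd = "wm density" if uses_wm else "am display-density"

    if size is None and density is None:
        return [f"{size_cmd} reset"]

    commands = []
    if size is not None:
        width, height = size
        commands.append(f"{size_cmd} {width}x{height}")
    if density is not None:
        commands.append(f"{density_cmd} {density}")
    return commands


print(display_commands((4, 4)))
print(display_commands((4, 2), size=(1080, 720), density=240))
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine; they can then be pasted into an adb shell session.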