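Quick tip: China Unicom PPPoE dial-up failures

As a China Unicom broadband customer of many years, I have noticed a quirk in its PPPoE service: if you dial several times within a short period (for example, by repeatedly unplugging and replugging the router, or by disconnecting and reconnecting from a computer), dialing starts to fail with no response from the other end (a computer may report error 678).

When this happens, the upstream line (fiber) is fine and the username and password are correct, yet dialing never succeeds.

In this situation, try the following: stop dialing for a while, for example by powering off the router or closing the dial-up dialog on the computer, wait at least 5 minutes, and then dial again. It should then succeed.

If it still fails, or if the upstream line itself has a problem (the optical signal status is shown in the optical modem's management interface), call 10010 to report the fault.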

Setting up your own IPv6 Tunnel

I have some servers that don’t come with native IPv6 connectivity, which means that in order to use the next generation protocol, they need to be tunneled by other IPv6-capable nodes over IPv4.

In the past I have exclusively gone for the Tunnel Broker service provided by Hurricane Electric. I loved their service not only because it is free and easy to set up, but also for the reasonably good quality of their tunnel, since HE is a well-known transit provider. But recently, one of my servers which I use as an Internet exit has been suffering when it tries to make connections to IPv6-enabled websites. The symptom is simple – I can ping6 some addresses but not others, and the frequency is getting higher. So, I decided to set up a private tunnel endpoint using one of my own IPv6-enabled servers.

Prerequisites

I want to mimic the Tunnel Broker service as much as possible, because it is known to work. The current service provides tunnel users with the following stuff:

  • “Server IPv4 Address”: The remote IPv4 tunnel endpoint, like 66.220.*.*.
  • “Client IPv6 Address”: An IPv6 address representing the host connecting to the tunnel, like 2001:470:c:*::2.
  • “Server IPv6 Address”: An IPv6 address representing the tunnel server, also used as the IPv6 gateway of the client host, like 2001:470:c:*::1.
  • “Routed IPv6 Prefixes”: /64 or /48 subnets given to the tunnel operator to provide IPv6 connectivity to other internal networks through the tunnel.

I made one of my IPv6-connected servers the designated tunnel server. In order to be used as such, it has the following to provide:

  • A public IPv4 address: I will use this as the “Server IPv4 Address”, which means my client host will connect to this endpoint over IPv4.
  • Three routable IPv6 addresses: The specs actually say I have 10, and I believe they would route the whole /64 to me if I set it up right. But for this particular use case, 3 is enough: *::1 is the address of the tunnel server, *::2 and *::3 as Server and Client IPv6 Address respectively.

Since I don’t have any other subnets to make routable, I don’t need to provide another routable IPv6 prefix.

Connecting tunnel client and server

We first need to make the client and server hosts reachable from each other using their IPv6 addresses. The protocol used by Tunnel Broker, and thus by my new tunnel, is Simple Internet Transition (SIT). It is natively supported by the Linux kernel and quite easy to set up. In fact, the Tunnel Broker service provides users with sample client configurations for their preferred network management tools. Here is an example using iproute2:

modprobe ipv6
ip tunnel add sit-ipv6 mode sit remote [SERVER-IPV4] local [CLIENT-IPV4] ttl 255
ip link set sit-ipv6 up
ip addr add [CLIENT-IPV6]/127 dev sit-ipv6
ip route add ::/0 dev sit-ipv6
ip -f inet6 addr

For my configuration, the client IPv6 address is *::3, and the prefix length is set to /127 so that the subnet covers exactly the two tunnel endpoints (*::2 and *::3). To persist the configuration, use the mechanism provided by your operating system. Here is an example client configuration using Netplan (used by Ubuntu 18.04, among others):

network:
  version: 2
  tunnels:
    sit-ipv6:
      mode: sit
      remote: [SERVER-IPV4]
      local: [CLIENT-IPV4]
      addresses:
        - "[CLIENT-IPV6]/127"
      gateway6: "[SERVER-IPV6]"

The thing about SIT tunnels is that they are symmetrical, so in order to set up the server end, one needs to make the following changes:

  • Switch the server and client IPv4 addresses, so that the one after “local” is the IPv4 address of the configured machine.
  • Replace CLIENT-IPV6 with SERVER-IPV6 as the interface’s IPv6 endpoint.
  • Remove the route / gateway definition, since the server already has an external IPv6 gateway.

By now, both the tunnel server and client hosts should be able to reach each other with their brand new IPv6 addresses. This can be verified by running ping6 [SERVER-IPV6] on the client side, and vice versa.
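Applying the three changes above to the iproute2 client example gives the server side. The following is a sketch rather than a tested configuration: the function name is made up for illustration, the interface name sit-ipv6 is simply reused from the client example, and the arguments stand for the [SERVER-IPV4], [CLIENT-IPV4] and [SERVER-IPV6] placeholders.

```shell
#!/bin/sh
# Sketch: server-side counterpart of the client commands above.
# Arguments (placeholders, substitute your own values):
#   $1 = SERVER-IPV4, $2 = CLIENT-IPV4, $3 = SERVER-IPV6
setup_server_tunnel() {
    server_v4=$1
    client_v4=$2
    server_v6=$3

    # "local" is now the server's own IPv4 address, "remote" the client's.
    ip tunnel add sit-ipv6 mode sit remote "$client_v4" local "$server_v4" ttl 255 &&
    ip link set sit-ipv6 up &&
    # The server end takes SERVER-IPV6; /127 covers both tunnel endpoints.
    ip addr add "$server_v6/127" dev sit-ipv6
    # No default route is added: the server keeps its existing IPv6 gateway.
}
```

Run as root on the server, it would be called as `setup_server_tunnel 192.0.2.1 198.51.100.7 2001:db8::2` (documentation-range addresses, used here only as stand-ins).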

Forwarding Tunneled Traffic

In order for the tunneled host to actually reach the global Internet, the tunnel server has to route IPv6 traffic from and to the host.

Forwarding Outgoing Traffic

Since [SERVER-IPV6] is configured to be the IPv6 gateway on the client host, all its traffic with a remote IPv6 destination address will be sent over the tunnel to the server side. By default, a server will not take the role of routing that traffic – it will only receive traffic destined to itself. To make it also forward traffic to the next hop, we need to enable packet forwarding in the kernel parameters. This can be done by running the following as root:

echo 1 > /proc/sys/net/ipv6/conf/[SERVER-TUNNEL-INTERFACE]/forwarding

This can be persisted across reboots by appending net.ipv6.conf.[SERVER-TUNNEL-INTERFACE].forwarding=1 to /etc/sysctl.conf. Note that if you run a firewall such as ip6tables, you may also need to add forwarding rules, or change the default policy of the FORWARD chain to ACCEPT.
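A quick way to confirm that the setting took effect is to read the values back. The helper below is a sketch and not part of the original setup: the function name and the SYSCTL_ROOT override are assumptions made for illustration. It also prints net.ipv6.conf.all.forwarding, which the kernel documentation describes as the global switch for IPv6 forwarding between all interfaces, so if packets are still not forwarded, that value is worth checking first.

```shell
#!/bin/sh
# Sketch: print the IPv6 forwarding flags relevant to the tunnel server.
# SYSCTL_ROOT is a hypothetical override for dry runs; it defaults to /proc/sys.
show_ipv6_forwarding() {
    iface=$1
    root=${SYSCTL_ROOT:-/proc/sys}
    for name in all "$iface"; do
        file="$root/net/ipv6/conf/$name/forwarding"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "net.ipv6.conf.$name.forwarding" "$(cat "$file")"
        else
            printf '%s=missing\n' "net.ipv6.conf.$name.forwarding"
        fi
    done
}
```

For the tunnel interface used in this article, the call would be `show_ipv6_forwarding sit-ipv6`.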

Accepting Incoming Traffic

When traffic destined for the client host arrives at the tunnel server's upstream gateway, the gateway sends a “Neighbor Solicitation Message” to resolve that address and verify its reachability. But the client host's IPv6 address is absent from all interfaces of the server host, so the server does not reply to the message, and the incoming traffic is dropped.

In order for the tunnel server to respond to the solicitation message with a “Neighbor Advertisement Message”, we need to configure an NDP proxy for the server's external interface. The first step is to enable NDP proxying in the Linux kernel:

echo 1 > /proc/sys/net/ipv6/conf/[SERVER-EXTERNAL-INTERFACE]/proxy_ndp

This parameter can be persisted in the same way as shown in the last section. Then we have to explicitly enable NDP proxy for the client IPv6 address. Using iproute2 this can be done as:

ip -6 neigh add proxy [CLIENT-IPV6] dev [SERVER-EXTERNAL-INTERFACE]

This line means that when the external router wants to reach the client IPv6 address on the interface, the server will respond with its own link-layer address. Then, when traffic destined for the client host arrives, the server forwards it to the tunnel interface, because the /127 subnet we configured above covers the IPv6 addresses of both ends. This can be seen in the routing table by running ip -6 route on the server.

The command also needs to be persisted so that client hosts do not lose connectivity after the server reboots. How to persist it depends on the network management tool used by the server. With ifupdown, the command can be written in /etc/network/interfaces; if the server uses Netplan, the command should probably go in a script under /etc/networkd-dispatcher/routable.d, since Netplan doesn't come with native hook support.
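For the Netplan case, a hook script could look roughly like the sketch below. It assumes networkd-dispatcher is installed and exports the IFACE variable to its hooks; EXT_IF and CLIENT_V6 are placeholder defaults standing in for [SERVER-EXTERNAL-INTERFACE] and [CLIENT-IPV6], and the function name is made up.

```shell
#!/bin/sh
# Sketch of a hook for /etc/networkd-dispatcher/routable.d/ (file name is up to you).
# EXT_IF and CLIENT_V6 are placeholders; replace them with your real values.
EXT_IF=${EXT_IF:-eth0}
CLIENT_V6=${CLIENT_V6:-2001:db8::3}

ndp_proxy_hook() {
    # networkd-dispatcher exports IFACE, the interface that changed state.
    [ "${IFACE:-}" = "$EXT_IF" ] || return 0
    # Same command as above; an already existing entry is not treated as fatal.
    ip -6 neigh add proxy "$CLIENT_V6" dev "$EXT_IF" || true
}

ndp_proxy_hook
```

The script has to be executable to be picked up, and the proxy_ndp kernel parameter still needs to be persisted separately as described earlier.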

Summary

I would like to revisit the route an outgoing packet will go through. Let’s say a process on the client host wants to access 2001:4860:4860::8888:

  • According to the routing table on the client end, the traffic should be forwarded to the gateway, SERVER-IPV6.
  • Then it will notice that the SERVER-IPV6 address belongs to the /127 subnet on the sit-ipv6 interface.
  • When the packet is forwarded to the sit-ipv6 tunnel interface, it will be encapsulated with an IPv4 header, and sent to the SERVER-IPV4 address. This could be across the IPv4 Internet, or a private connection if there is one.
  • The encapsulated packet will be received by the sit-ipv6 interface on the server’s end, and unpacked to its original IPv6 form.
  • Since the IPv6 destination is an external one, and we have enabled forwarding on the server, it will be routed to the external gateway according to the routing table.

When the remote server replies, the packet goes the exact opposite way back to the client host.

Magically “defeating” different CDN implementations

In this article I will show some different CDN implementations along with cases where each of them fails to bring the best performance. I am not a researcher in this area, so some of the points are based on my personal experiences.

1. Why people use CDN

Usually, when people visit a website, their browsers first ask their ISP's recursive DNS server for the site's IP address, and that server in turn queries the domain's authoritative DNS. They are then connected to whatever IP address the DNS returns, which is often distant from them (geographically far away, with high latency).

This is the problem people use a CDN to solve. By putting edge caching servers in different areas of the world (or of the target country), close to end users, they can speed up the load time of their websites and improve the user experience.

2. Traditional Geo-DNS setup

The most common and simplest solution is to use a “Geo-DNS” service to point the same domain to different edge servers with different IP addresses (this is important). When people query the IP address of a domain, the authoritative Geo-DNS can point them to the nearest edge server based on the users' IP, their ISPs' recursive DNS servers' IP, or the users' IP as provided by the recursive DNS via EDNS Client Subnet (an EDNS0 option, referred to simply as EDNS0 below) if supported.
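To see this behaviour from the outside, one can ask an authoritative server the same question on behalf of different client subnets. The sketch below assumes a dig build that supports the +subnet option (EDNS Client Subnet); the function name, domain, name server and subnets are placeholders, and a server without EDNS0 support will simply return the same answer every time.

```shell
#!/bin/sh
# Sketch: ask one authoritative server how it answers for different client subnets.
# Requires a dig build with EDNS Client Subnet support (the +subnet option).
geo_answers() {
    domain=$1
    nameserver=$2
    shift 2
    for subnet in "$@"; do
        # +short prints only the answer records; +subnet attaches the ECS option.
        answer=$(dig +short "+subnet=$subnet" "$domain" "@$nameserver" | paste -sd ' ' -)
        printf '%s -> %s\n' "$subnet" "$answer"
    done
}
```

A call such as `geo_answers www.example.com ns1.example.net 203.0.113.0/24 198.51.100.0/24` prints one line per subnet, which makes it easy to spot a subnet that is mapped to a distant edge server.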

This works fine in an ideal network setup, but it can easily fall apart. Here are some cases that I have come across:

(a) Unicast DNS

Sometimes Geo-DNS providers don't use Anycast and instead provide Unicast IP addresses for different regions. The recursive DNS has no way of telling which one is closest, so it queries a random one, which can make DNS resolution slow on the first visit.

Example: DNSPod and CloudXNS (Popular Unicasted Geo-DNS providers in China)

DNSPod servers use ChinaNet, China Unicom and China Mobile unicasted IP addresses.

(b) Bad GeoIP database

Some Geo-DNS providers don't update their GeoIP database frequently enough, or just don't have enough data.

Example:

(1) Amazon AWS CloudFront and Akamai don't have servers in China for obvious reasons, but Chinese visitors are not consistently directed to the nearest locations (Hong Kong, South Korea, Japan). Sometimes a query from China gets a response pointing to European locations, which results in ~500 ms latency.

Akamai directs ChinaNet users to Frankfurt, Germany when there are obviously better choices.

(2) Some Geo-DNS providers in China, most notably Aliyun DNS, may direct Chinese users to “Global” servers when both “Domestic” and “Global” records are set.

(c) DNS different from network exit

Sometimes people use recursive DNS servers in a network different from their actual network exit.

Example:

(1) In my university, we have mixed network exits, one in CERNET (AS4538) and one in TieTong (AS9394). Our recursive DNS has a CERNET address, so most Geo-DNS providers give us CERNET addresses or (if the website doesn't have CERNET servers) addresses in other networks, for instance ChinaNet (AS4134). But our network exit is configured to use TieTong by default, so for most websites we reach ChinaNet servers over a TieTong connection, even if they also have TieTong servers.

A more extreme case is that some networks have different routing policies for TCP and UDP (which is a violation of the OSI model's layering), so when you make a DNS query over UDP you appear with network A's address, and when you actually connect to TCP port 80 you go out through network B. Magical? But true.

(2) Sometimes recursive DNS providers and/or Geo-DNS providers don't support EDNS0. If either end lacks support, it will not work. For instance, if a user of the open recursive DNS service “114DNS” (Anycasted across several Chinese networks) is on a network that is not present in 114DNS' Anycast network, and the authoritative Geo-DNS doesn't support EDNS0, it will return an IP in the same network as the 114DNS node, which is different from the user's network.

3. TCP Anycast setup

Some modern CDN providers use the TCP Anycast technique: they provide a single IP address for their edge servers in multiple locations, and visitors are directed to the nearest location, as determined by how the provider announces its routes to other networks via BGP.

Such providers include CloudFlare and MaxCDN, which use a single Anycasted IP for their edge servers across the planet. Verizon EdgeCast uses a slightly different method: it provides several Anycasted IPs, each representing a geographical zone (Asia-Pacific, North America, South America, Europe).

A unified Anycasted IP solves many of the problems mentioned above, and it's becoming harder and harder to defeat them. But here they come:

(1) Magical routing policy (again)

The current Chinese IPv6 implementation has only one international exit (AS23911), which has two exit points: a default one in Los Angeles via HE.net (AS6939) and a premium one in Hong Kong (HKIX). When I resolve EdgeCast's IPv6 address, I get one in the 2606:2800:147::/48 network, which is Anycasted in Asia. But when I trace the route to this address, the packet goes from China to Los Angeles and back to Asia, resulting in ~400 ms latency. Even if people use an Anycasted recursive DNS (like Google's), the result is the same, since it has servers in Hong Kong. By querying the domain at OpenDNS (which doesn't have Asian servers) I get an IP in the 2606:2800:11f::/48 network, which is Anycasted in North America, and the latency is only ~200 ms (the same as to the network exit itself).

Tracing route from AS23911 to EdgeCast's IPv6 edge servers in different continents.

This only happens with EdgeCast’s “Continent-based Anycast” network. CloudFlare is not affected. But it has another kind of problem.

(2) Artificial routing deterioration

CloudFlare has edge servers everywhere, including Hong Kong, Taipei, Japan, South Korea, etc., which are all very close to Chinese users. But the major Chinese ISPs' international exit routing policies direct CloudFlare traffic to Los Angeles (ChinaNet) and San Jose (China Unicom), where it reaches the nearest edge servers in <3 hops. They did the same thing for Softlayer's Hong Kong locations, for some magical reasons: maybe price, maybe [censored] ;). The latency from both ISPs to CloudFlare's US west locations is 200~300 ms, whereas with TieTong (which uses Hong Kong as its international exit) the value is <100 ms.

ChinaNet and Unicom users get 200~300 ms latency while China Mobile (TieTong) users get 80.

This is obviously not CloudFlare's fault, because they cannot control the routing policy from another AS to themselves (unless they pay the other system to do so). If your ISP is doing this, switch to another ISP; if the whole country is doing this, maybe switch to another country?
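To check which edge location an Anycasted address actually lands on from a given network, one option is CloudFlare's /cdn-cgi/trace diagnostic path, which reports the serving location in a colo= line. The sketch below assumes the target host is proxied by CloudFlare and exposes that path; the function name and the default host are illustrative choices.

```shell
#!/bin/sh
# Sketch: print the CloudFlare edge location (colo code) serving this client.
# Assumes the host is behind CloudFlare and exposes /cdn-cgi/trace.
cf_colo() {
    host=${1:-www.cloudflare.com}
    # The trace output is a list of key=value lines; keep only the colo value.
    curl -s "https://$host/cdn-cgi/trace" | sed -n 's/^colo=//p'
}
```

Running `cf_colo` from connections on different ISPs and comparing the printed codes shows whether traffic is being carried to a distant location.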

Summary

To sum it up, when neither your network nor your customers' networks have any of the quirks above, a simple Anycasted Geo-DNS solution works fine; you don't even need a commercial CDN service. But real networks are hard, and so far a global TCP Anycast solution is the best we can do.

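Configuring a router for China Unicom PPPoE IPv6

My ISP, China Unicom in Jinan, claimed long ago that it had launched public IPv6 service, but it was not until the end of 2015 that people gradually started reporting that they could actually obtain IPv6 addresses. I won't dig into what happened there; today I finally got IPv6 working on both my computer and my router.

Update: in my testing, the latest official OpenWrt release, 15.05.1, obtains an IPv6 address out of the box with no configuration, and its repository offers more packages. I therefore recommend using the new release directly rather than outdated, closed-source forks such as PandoraBox. Support status for the Xiaomi Router Mini and other routers can be looked up on the official website.

China Unicom uses PPPoE here. On Windows an IPv6 address is obtained without any configuration, so this article focuses on configuring a router running OpenWrt.

1. The first step, of course, is to get a router that supports OpenWrt (at least Attitude Adjustment 12.09.1; Barrier Breaker 14.07 or later is recommended). I use a Xiaomi Router Mini (R1CM). Its stock firmware is a customized OpenWrt, but its interface does not let you change parameters as easily as the original system does, and it has no usable opkg package manager (similar to apt-get), so it is best to flash plain OpenWrt.

Taking my Xiaomi router as the example again: first flash the developer firmware, then enable SSH access through the official website, and then download PandoraBox (a Chinese OpenWrt fork that supports many Chinese-made routers) and flash it. The commands I used were: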

wget -O /tmp/pandora.bin http://.../xxx.bin  # URL of the PandoraBox ROM for your router
mtd -r write /tmp/pandora.bin OS1  # "OS1" may differ depending on the router model

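2. After flashing and logging in to the admin interface, first confirm that the package “odhcp6c” is listed under “System” - “Software” in the left-hand menu. If it is not, click “Update lists” first and then install it from “Available packages”. This package obtains IPv6 addresses automatically via DHCPv6, so it is essential to this tutorial. It may not be available on 12.09.1, in which case look for a similar package (its name usually contains “dhcp” and “6”).

3. Next, go to “Network” - “Interfaces” in the left-hand menu. Three interfaces exist by default: LAN, WAN and WAN6. If your system has no WAN6, create a new interface there, set its “Protocol” to “DHCPv6 client” and click “Switch protocol”.

Then configure the WAN6 interface as follows (the parentheses give the corresponding options in /etc/config/network, all within the config 'interface' 'wan6' section):

  • Basic settings:
    • Request IPv6-address: Disabled (option reqaddress 'none')
    • Request IPv6-prefix of length: Automatic (option reqprefix 'auto')
  • Physical settings:
    • Interface: Custom interface, and enter @wan (option ifname '@wan')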

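That completes the configuration for obtaining an IPv6 address via DHCPv6; click “Save” for now (there is no need to apply yet). Next, configure the PPPoE connection: find “Interfaces” - “WAN” in the left-hand menu and set the following (parentheses as above, within the config 'interface' 'wan' section):

  • Basic settings:
    • Protocol: choose PPPoE; on other kinds of networks, choose “Static address”, “DHCP client”, etc. as appropriate. (option proto 'pppoe')
    • PAP/CHAP username and password: enter your broadband dial-up username and password. (option username 'xxx' / option password 'xxx')
  • Advanced settings:
    • Enable IPv6 negotiation on the PPP link: checked (option ipv6 '1')

That completes the WAN configuration as well; again, just “Save” for now. Finally, configure “Interfaces” - “LAN” (within the config 'interface' 'lan' section):

  • Upper part, “Basic settings”:
    • IPv6 assignment length: 64 (option ip6assign '64')
  • Lower part, “IPv6 Settings”:
    • Always announce default router: checked (option ra_default '1')

All done. Now click “Save & Apply” and wait about a minute (the time needed for PPPoE dialing and IPv6 address acquisition); the router and the devices connected to it should then all obtain IPv6 addresses.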
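For those who prefer the command line over the web interface, the same settings can be sketched as uci commands over SSH. This is an untested sketch that mirrors the option names and sections listed above; the function name and its two arguments are made up, proto 'dhcpv6' is my assumption for what the “DHCPv6 client” protocol maps to, and on some OpenWrt releases ra_default may belong in /etc/config/dhcp rather than in the lan interface section.

```shell
#!/bin/sh
# Sketch: the web-interface settings above expressed as uci commands.
# $1 and $2 are placeholders for your PPPoE username and password.
apply_unicom_ipv6() {
    pppoe_user=$1
    pppoe_pass=$2

    # wan6: DHCPv6 client on top of the PPPoE link, requesting a prefix only.
    uci set network.wan6=interface
    uci set network.wan6.proto='dhcpv6'   # assumption: UCI name of "DHCPv6 client"
    uci set network.wan6.ifname='@wan'
    uci set network.wan6.reqaddress='none'
    uci set network.wan6.reqprefix='auto'

    # wan: PPPoE with IPv6 negotiation enabled.
    uci set network.wan.proto='pppoe'
    uci set network.wan.username="$pppoe_user"
    uci set network.wan.password="$pppoe_pass"
    uci set network.wan.ipv6='1'

    # lan: options placed as in the list above.
    uci set network.lan.ip6assign='64'
    uci set network.lan.ra_default='1'

    uci commit network
}
```

After calling it, for example as `apply_unicom_ipv6 myuser mypass`, the network service has to be restarted (or the router rebooted) for the changes to apply.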

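Once the router is configured successfully, the result is visible in two places:

(1) “Status” - “Overview” - “Network”:

(2) “Network” - “Interfaces”:

A brief explanation: the address 2408:802a::ee3a/64 is the IPv6 address obtained by the router's WAN port itself; commands such as ping6 run on the router reach the outside through this address. 2408:802a::0:1/64, on the other hand, is the dedicated subnet the router requested from the network, used to assign addresses to the devices connected to the router.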

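Configuring Dnsmasq to avoid Shandong Unicom's bogus NXDOMAIN hijacking ads

This post explains how to configure a Dnsmasq server so that the ISP cannot return a bogus address when a domain does not exist (NXDOMAIN). It applies to any China Telecom or China Unicom operator with this problem.

Distributions such as Ubuntu and Arch Linux now use NetworkManager (NM) by default to manage network connections. NM can work together with Dnsmasq to cache DNS lookups, and in recent versions (Ubuntu 12.10 Quantal and later, or NM 0.9.6+) Dnsmasq's configuration can be modified directly.

Reference: 使用Dnsmasq解决联通的dns劫持 (“Using Dnsmasq to fix China Unicom's DNS hijacking”). (That article also covers other uses of Dnsmasq, such as preventing DNS poisoning and blocking specific websites. I quite like the choice of websites it blocks, haha!)

1. First, determine which IP addresses you need to block. Run nslookup on a nonexistent domain a few times to get a rough idea of the IP address range. For example: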

$ nslookup fsjdlfjksljflsjakljls.com
... (omitted) ...
Name: fsjdlfjksljflsjakljls.com
Address: 123.129.254.15

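You can also look the range up online; for Shandong Unicom, for example, it is 123.129.254.11 ~ 123.129.254.19. (A small rant: they presumably deployed 9 servers for load balancing, but silly Unicom apparently didn't know how to configure it, so at first everything resolved to 123.129.254.13, and now everything resolves to .15...)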
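Repeating the lookup by hand gets tedious, so the probing can be scripted. The sketch below assumes nslookup prints the answer as an “Address:” line after a “Name:” line, as in the output shown above; the function name and the random test names are made up for illustration.

```shell
#!/bin/sh
# Sketch: resolve several random, almost certainly nonexistent names and
# collect the distinct addresses the resolver returns for them.
probe_hijack_ips() {
    tries=${1:-5}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Hypothetical random label; any unregistered name would do.
        name="nx-$$-$i-$(date +%s).com"
        # Only keep "Address" lines that follow a "Name:" line, which skips
        # the resolver's own address printed at the top of the output.
        nslookup "$name" 2>/dev/null |
            awk '/^Name:/ {seen = 1; next} seen && /^Address/ {print $NF}'
        i=$((i + 1))
    done | sort -u
}
```

Calling `probe_hijack_ips 10` prints each hijack address once; on a resolver that answers NXDOMAIN correctly it prints nothing.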

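2. Next, determine whether you are using Dnsmasq at all and, if so, which configuration file to modify. Look at the nslookup output above; if it contains a line like this: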

Server: 127.0.0.1

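then you are using a local Dnsmasq server; if it shows a public address instead, your system is not configured to use Dnsmasq.

After that, determine whether your Dnsmasq was installed manually or is the one that comes with NM. (As for configuring NM to use Dnsmasq: Ubuntu does so by default; users of Arch and other distributions should consult the relevant documentation.)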

$ ps aux | grep dnsmasq
... /usr/bin/dnsmasq ... --pid-file=/var/run/nm-dns-dnsmasq.pid ... --conf-file=/var/run/nm-dns-dnsmasq.conf ... --conf-dir=/etc/NetworkManager/dnsmasq.d

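This output shows a Dnsmasq controlled by NM: it uses the configuration file /var/run/nm-dns-dnsmasq.conf, generated dynamically by NM, plus a user-editable configuration directory, /etc/NetworkManager/dnsmasq.d, which is the one we will use. If you see nothing like this, your Dnsmasq was installed manually, and its configuration file is usually /etc/dnsmasq.conf.

3. Finally, edit the configuration in Dnsmasq's format. If Dnsmasq is controlled by NM, create xiaodu.conf under /etc/NetworkManager/dnsmasq.d with the following content; if it was installed manually, add the following to /etc/dnsmasq.conf. (One IP address per line; adjust them to match your own ISP.)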

bogus-nxdomain=123.129.254.11
bogus-nxdomain=123.129.254.12
bogus-nxdomain=123.129.254.13
bogus-nxdomain=123.129.254.14
bogus-nxdomain=123.129.254.15
bogus-nxdomain=123.129.254.16
bogus-nxdomain=123.129.254.17
bogus-nxdomain=123.129.254.18
bogus-nxdomain=123.129.254.19
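For longer ranges, the lines can be generated instead of typed. This is a small sketch that only handles a contiguous range inside a single /24; the function name and its three arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: emit one bogus-nxdomain line per address in a contiguous range.
#   $1 = first three octets, $2 = first last-octet, $3 = final last-octet
gen_bogus_nxdomain() {
    prefix=$1
    first=$2
    last=$3
    n=$first
    while [ "$n" -le "$last" ]; do
        printf 'bogus-nxdomain=%s.%s\n' "$prefix" "$n"
        n=$((n + 1))
    done
}
```

For the Shandong Unicom range above, `gen_bogus_nxdomain 123.129.254 11 19` reproduces the nine lines, and its output can be redirected into the configuration file.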

Save the file when you are done, then restart NM or Dnsmasq for the new configuration to take effect. Now query the same domain again:

$ nslookup fsjdlfjksljflsjakljls.com
... (omitted) ...
** server can't find fsjdlfjksljflsjakljls.com: NXDOMAIN

This achieves our goal: the ISP can no longer hijack lookups for nonexistent domains.