Magically “defeating” different CDN implementations

In this article I will show some different CDN implementations along with cases where each of them fails to bring the best performance. I am not a researcher in this area, so some of the points are based on my personal experiences.

1. Why people use CDN

Usually when people visit a website, their browsers first query the IP address from their ISP’s recursive DNS server, which in turn query the domain’s authoritative DNS. Then they will be connected to whatever IP address returned by the DNS which is usually distant (geographically far and has high latency) from them.

That is why people use CDN to solve this problem. By putting edge caching servers in different areas of the world (or in the target country) close to end-users, they can speed up the load time of their websites and improve user experience.

2. Traditional Geo-DNS setup

The most common and simple solution is to use a “Geo-DNS” service to point the same domain to different edge servers with different IP addresses (this is important). In this case, when people query the IP address of a domain, the authoritative Geo-DNS can point them to the nearest edge server, based on the users’ IP, their ISPs’ recursive DNS servers’ IP, or users’ IP provided by recursive DNS via EDNS Client Subnet (EDNS0) if supported.

This works fine in an ideal network setup, but it can easily fall apart. Here are some cases that I have come across:

(a) Unicast DNS

Sometimes Geo-DNS providers don’t use Anycast, instead provide Unicast IP addresses for different regions. The recursive DNS has no way of telling which one is closer to them, so it queries a random one, which can result in slow DNS resolution at first visit.

Example: DNSPod and CloudXNS (Popular Unicasted Geo-DNS providers in China)

DNSPod servers use ChinaNet, China Unicom and China Mobile unicasted IP addresses.
DNSPod servers use ChinaNet, China Unicom and China Mobile unicasted IP addresses.

(b) Bad GeoIP database

Some Geo-DNS providers don’t update their GeoIP database frequent enough, or just don’t have enough data.

Example:

(1) Amazon AWS CloudFront and Akamai don’t have servers in China for obvious reasons, but Chinese visitors are not consistently directed to nearest (Hong Kong, South Korea, Japan) locations. Sometimes a query from China can get a response of European locations, which results in ~500 ms latency.

Akamai directs ChinaNet users to Frankfurt, Germany when there are obviously better choices.
Akamai directs ChinaNet users to Frankfurt, Germany when there are obviously better choices.

(2) Some Geo-DNS providers in China, most notably Aliyun DNS. When both “Domestic” and “Global” records are set, they may direct Chinese users to “Global” servers.

(c) DNS different from network exit

Sometimes people may use recursive DNS servers in the network different from their actual network exit.

Example:

(1) In my university, we have mixed network exits, one in CERNET (AS4538) and one in TieTong (AS9394). Our recursive DNS has a CERNET address, so most Geo-DNS providers gives CERNET or (if the website doesn’t have CERNET servers) other networks’ addresses, for instance ChinaNet (AS4134). But our network exit is configured to use TieTong by default, so for most websites we are visiting ChinaNet servers with a TieTong network, even if they also have TieTong servers.

A more extreme case is that, in some networks they have different routing policies for TCP and UDP (which is a violation of OSI model), so when you do DNS query in UDP you have network A’s address, and when you actually connect to TCP port 80 you have network B. Magical? But true.

(2) Sometimes recursive DNS providers and/or Geo-DNS providers don’t support EDNS0. As long as either end doesn’t support it, it will not work. For instance, if user of open recursive DNS service “114DNS” (Anycasted across several Chinese networks) has a network that is not present in 114DNS’ Anycast network, and the authoritative Geo-DNS doesn’t support EDNS0, it will return the IP in the same network of 114DNS’ node, but different from the network of the user.

3. TCP Anycast setup

Some modern CDN providers use TCP Anycast technique, which means they provide a single IP address for their edge servers in multiple locations, and visitors are directed to the nearest location, decided by how they broadcast their routing tables to other networks.

Such providers include CloudFlare and MaxCDN, which use a single Anycasted IP for their edge servers across the planet. Verizon EdgeCast use a slightly different method where they provide several Anycasted IPs, each represent a geographical zone (Asia-Pacific, North America, South America, Europe).

A unified Anycasted IP solves many of the problems mentioned above, and it’s becoming harder and harder to defeat them. But here they comes:

(1) Magical routing policy (again)

Current Chinese IPv6 implementation has only one international exit (AS23911), which has two exit points: a default one in Los Angeles by HE.net (AS6939) and a premium one in Hong Kong (HKIX). When I resolve EdgeCast’s IPv6 address, I get one in 2606:2800:147::/48 network, which is Anycasted in Asia. But when I trace route to this address, the packet goes from China to Los Angeles and back to Asia, resulting in ~400 ms latency. Even if people use an Anycasted recursive DNS (like Google’s), since it has servers in Hong Kong, the result is the same. By querying the domain at OpenDNS (which doesn’t have Asian server) I get the IP in 2606:2800:11f::/48 network, which is Anycasted in North America, and the latency is only ~200 ms (same as the network exit’s).

Tracing route from AS23911 to EdgeCast's IPv6 edge servers in different continents.
Tracing route from AS23911 to EdgeCast’s IPv6 edge servers in different continents.

This only happens with EdgeCast’s “Continent-based Anycast” network. CloudFlare is not affected. But it has another kind of problem.

(2) Artificial routing deterioration

CloudFlare has edge servers everywhere, including Hong Kong, Taipei, Japan, South Korea, etc. which are all very close to Chinese users. But the major Chinese ISPs’ international exit routing policy directs CloudFlare traffic to Los Angeles (ChinaNet) and San Jose (China Unicom), where they are directed to the nearest edge servers in <3 hops. They did the same thing for Softlayer’s Hong Kong locations, for some magical reasons: maybe price, maybe [censored] ;). The latency from both ISP to CloudFlare’s US west locations are 200~300 ms, where with TieTong (which use Hong Kong as international exit) the value is <100 ms.

ChinaNet and Unicom users get 200~300 ms latency while China Mobile (TieTong) users get 80.
ChinaNet and Unicom users get 200~300 ms latency while China Mobile (TieTong) users get 80.

This is obviously not CloudFlare’s fault, because they cannot control the routing policy from another AS to themselves (unless they pay the other system to do so). If your ISP is doing this, switch to another ISP; If the whole country is doing this, maybe switch to another country?

Summary

To sum it up, when you and your customers’ networks don’t have any of the quirks above, a simple Anycasted Geo-DNS solution works fine – you don’t even need a commercial CDN service. But the real networks are hard, and so far a global TCP Anycast solution is the best we can do.

配置路由器使用联通 PPPoE IPv6

博主所在的济南联通,很早以前说已经开通了公众 IPv6 网络,但是从 2015 年底才开始陆续有人报告可以获取到 IPv6 地址了。这里面具体怎么回事我就不追究了,今天我也终于成功的在电脑和路由器上配置好了 IPv6 网络。

更新:经过测试,使用 OpenWrt 官方最新的 15.05.1 版本,默认安装无需配置即可获取 IPv6 地址,而且软件库中有更多软件包可用,因此建议直接使用新版,而不要使用 PandoraBox 等版本旧、不开源的分支。小米路由器 Mini其它路由器都可以在官网查询支持情况。

联通在我这里使用的是 PPPoE 拨号上网,在 Windows 系统上无需配置就可以直接获取到 IPv6 地址,所以这篇文章主要说一下 OpenWrt 系统路由器的配置。

1. 第一步当然是准备一个支持 OpenWrt 系统(版本最少是 Attitude Adjustment 12.09.1,建议是 Barrier Breaker 14.07 或更高)的路由器。博主用的是小米路由器 Mini (R1CM),其本身虽然是定制的 OpenWrt 系统,但是界面没法像原版系统一样方便的修改参数,而且没有可用的 opkg 包管理软件(类似于 apt-get),最好还是刷成原始的 OpenWrt 系统。

还是以我手中的小米路由器为例,首先要将系统刷成开发版固件,然后去官网开放 SSH 权限,再接下来就是去下载 PandoraBox(一个国产的 OpenWrt 分支,适配很多国产路由器)并刷机。我使用的刷机命令是:

2. 刷好机并进入后台后,请先确认左侧“系统”-“软件包”中有“odhcp6c”这个软件。如果没有的话,建议先“刷新列表”,然后在“可用软件包”中安装它。这个软件是用于通过 DHCP 协议自动获取 IPv6 地址的,因此对本教程至关重要。在 12.09.1 系统中可能不提供这个软件,那么建议下载类似的软件包(通常名字中含有 dhcp 和 6)。

3. 接下来,可以转到左侧“网络”-“接口”,默认内置了三个选项:LAN、WAN 和 WAN6。如果你的系统没有 WAN6,可以在接口中新建一个接口,并将“协议”设置为“DHCPv6 client”并点击“切换协议”。

接下来,在 WAN6 接口中按照如下配置:(括号内是 /etc/config/network 中对应的配置项,都在 config ‘interface’ ‘wan6’ 这一段中)

  • 基本设置:
    • Request IPv6-address:Disabled(option reqaddress ‘none’)
    • Request IPv6-prefix of length:自动(option reqprefix ‘auto’)
  • 物理设置:
    • 接口:自定义接口,[email protected](option ifname ‘@wan’)

这样完成了 DHCP 获取 IPv6 地址的配置,可以先“保存”(不必现在应用)。接下来配置一下 PPPoE 上网信息,在左侧找到“接口”-“WAN”,并进行如下配置:(括号同上,属于 config ‘interface’ ‘wan’ 这一段)

  • 基本设置:
    • 协议:选择 PPPoE,但是如果你是其它网络,可以按自己的情况选择“静态地址”、“DHCP 客户端”等。(option proto ‘pppoe’)
    • PAP/CHAP 用户名、密码:输入宽带拨号的用户名和密码即可。(option username ‘xxx’ / option password ‘xxx’)
  • 高级设置:
    • 在 PPP 链路上启用 IPv6 协商:打勾(option ipv6 ‘1’)

这样也完成了上外网配置,还是先“保存”。最后进行一下“接口”-“LAN”中的配置:(属于 config ‘interface’ ‘lan’ 这一段)

  • 上部分“基本设置”:
    • IPv6 assignment length:64(option ip6assign ’64’)
  • 下部分“IPv6 Settings”:
    • Always announce default router:打勾(option ra_default ‘1’)

全部搞定,此时选择“保存&应用”,然后等待大概一分钟(PPPoE 拨号、获取 IPv6 地址的时间),路由器和所连接设备应当都能获得 IPv6 地址了。

路由器配置成功后,可以从以下两处看到效果:

(1) “状态”-“总览”-“网络”:

(2) “网络”-“接口”:

大概解释一下其中的意思:2408:802a::ee3a/64 这个地址,是路由器 WAN 口本身获得的 IPv6 地址,在路由器上使用 ping6 等命令,就是通过这个 IP 向外访问。而2408:802a::0:1/64 这个 IPv6 网段,是路由器从网络中申请到的专用子网段,用于分配给连接到路由器的设备。

配置 Dnsmasq 来避免山东联通的错误域名劫持广告

这篇讲的是如何配置 Dnsmasq 服务器,来避免 ISP 在域名不存在 (NXDOMAIN) 时返回错误的地址。通用于各个存在此问题的电信、联通运营商。

现在 Ubuntu 或 Arch Linux 等发行版,都默认用 NetworkManager (NM) 来管理网络连接。而现在 NM 已经支持与 Dnsmasq 配合进行 DNS 解析的缓存,而且最新版本 (Ubuntu 12.10 Quantal 以上,或 NM 0.9.6+)中已经可以直接修改 Dnsmasq 的配置文件了。

参考资料: 使用Dnsmasq解决联通的dns劫持 (这篇另外也讲了一些 Dnsmasq 其他的用处,比如防止 DNS 污染、屏蔽特定网站等。我很欣赏这篇文章里屏蔽的那几个网站,哈哈!

1. 首先要确定,你要屏蔽哪些 IP 地址?使用 nslookup 后跟一个不存在的域名,多试几次,可以大概知道一个 IP 地址的范围。例如:

$ nslookup fsjdlfjksljflsjakljls.com
... (省略) ...
Name: fsjdlfjksljflsjakljls.com
Address: 123.129.254.15

当然也可以去网上查询,例如山东联通的 IP 地址范围是:123.129.254.11 ~ 123.129.254.19。(这里吐槽一下,最开始用了9台服务器估计是想负载均衡,结果愚蠢的联通不知道该怎么配置,于是最开始解析到的都是 123.129.254.13,现在又都是15了……)

2. 然后要确定,你是不是在用 Dnsmasq?是的话,应该修改哪个配置文件?注意观察上面 nslookup 的结果,如果有这样一行:

Server: 127.0.0.1

就说明你在使用 Dnsmasq 本地服务器;否则如果是一个公网地址,就说明你没有配置使用它。

之后,再确定你是手动安装的 Dnsmasq,还是 NM 自带的?(关于如何配置 NM 使用 Dnsmasq,Ubuntu 是默认如此的,Arch 等用户请查看相关文档。)

$ ps aux | grep dnsmasq
... /usr/bin/dnsmasq ... --pid-file=/var/run/nm-dns-dnsmasq.pid ... --conf-file=/var/run/nm-dns-dnsmasq.conf ... --conf-dir=/etc/NetworkManager/dnsmasq.d

这里就能看出,是一个由 NM 控制的 Dnsmasq,包含 NM 动态生成的配置文件 /var/run/nm-dns-dnsmasq.conf,以及用户可修改的配置文件夹,也就是我们用到的 /etc/NetworkManager/dnsmasq.d 。如果没有看到类似的字样,说明你的 Dnsmasq 是手动安装的,配置文件通常位于 /etc/dnsmasq.conf。

3. 最后,按照 Dnsmasq 的格式,修改配置文件。如果是 NM 控制的,请在 /etc/NetworkManager/dnsmasq.d 下创建 xiaodu.conf,并填入以下内容;如果是手动安装的,请在 /etc/dnsmasq.conf 中添加以下内容。(其中的 IP 地址每行一个,按自己 ISP 的实际情况修改

bogus-nxdomain=123.129.254.11
bogus-nxdomain=123.129.254.12
bogus-nxdomain=123.129.254.13
bogus-nxdomain=123.129.254.14
bogus-nxdomain=123.129.254.15
bogus-nxdomain=123.129.254.16
bogus-nxdomain=123.129.254.17
bogus-nxdomain=123.129.254.18
bogus-nxdomain=123.129.254.19

写入完成后,保存即可。之后,可以重新启动 NM 或 Dnsmasq,使新配置生效。现在再来查询上面的域名:

$ nslookup fsjdlfjksljflsjakljls.com
... (省略) ...
** server can't find fsjdlfjksljflsjakljls.com: NXDOMAIN

这样就实现了我们上面的目的,防止 ISP 对不存在的域名进行劫持。

手工配置 Nginx 反向代理 Apache 的全过程

现在很流行这种 LANMP 的组合,因为 Nginx 具有强力的静态文件、并发处理和反向代理能力,而 Apache 具有高安全性的 PHP 解析模块。于是今天自己手工配置了一遍这两个组合。

注意:本文载于小杜博客,因此要求读者在引用、转载、演绎或以其他形式使用本文内容时,必须遵守CC-by-sa协议(在页面顶部或底部均能看到协议大意及全文),按照协议要求对本文来源进行链接、使用和共享。

环境:主机上安装VirtualBox 4.1.2 For Windows,虚拟机为全新安装的Ubuntu Server 11.04(安装了自带的LAMP服务器,Apache 2.2.17 + PHP 5.3.5 + MySQL 5.1.54),软件包全部使用自带(不手动编译),更新源为网易163源。

虚拟机网络与主机桥接,DHCP获取到IP为192.168.1.5。

1. 安装Nginx,很简单,一行命令。Nginx版本为0.8.54。

# sudo apt-get update && sudo apt-get install nginx

2. 为Apache创建两个网站test1.du9l.com、test2.du9l.com(已经在主机上用Hosts解析到192.168.1.5),使其侦听于127.0.0.1:80,并启用网站。(以下只演示test1的配置,test2只需复制过去修改一下即可)

# sudo nano /etc/apache2/sites-available/test1

<VirtualHost 127.0.0.1:80> # 侦听的端口
ServerAdmin [email protected]
ServerName test1.du9l.com # 绑定的域名
 
DocumentRoot /var/www/test1 # 先创建好这个目录
 
<Directory />
AllowOverride All
</Directory>
 
#ErrorLog ${APACHE_LOG_DIR}/test1.err
#LogLevel warn
#CustomLog ${APACHE_LOG_DIR}/test1.log combined # 这三行是Apache访问和错误日志 请看附录(3)
</VirtualHost>

# sudo a2ensite test1

3. 为Nginx创建同样的网站,侦听于公网IP和端口80(这里是192.168.1.5:80),并启用网站。这里把所有跟反向代理和PHP有关的配置都写入xiaodu.conf,便于重复利用。还是只演示test1,如果还有其他网站,只添加新的文件,无需修改xiaodu.conf。

# sudo nano /etc/nginx/xiaodu.conf

location / {
try_files $uri $uri/ @apache;
}
 
location ~ .php$ { # 这里注意 只对.php结尾的传到代理 如果使用的是过时的.php/xxx传参 请自行修改
proxy_pass http://127.0.0.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
 
location @apache {
proxy_pass http://127.0.0.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
} # 凡是带有.php的,或找不到(试图Rewrite)的,都传给127.0.0.1:80的Apache。
 
location ~ /.ht {
deny all;
} # 禁止访问htaccess或htpasswd文件,Apache本身默认有这个配置了。

# sudo nano /etc/nginx/sites-available/test1

server {
listen   192.168.1.5:80;
#listen   [::]:80 default ipv6only=on; ## listen for ipv6 这里没有开IPv6,需要开的话直接去掉开头的#号,Apache里无需任何修改。
 
root /var/www/test1; # 要跟Apache里的一样
index index.html index.htm index.php; # 默认首页 这里的首页(例如index.html)存在的话不会去后台Rewrite,所以一定要想好,是否需要设置
 
server_name test1.du9l.com; # 绑定域名,与Apache里的对应
 
access_log /var/log/nginx/test1.log; # Nginx的访问日志 详见附录(3)
 
include xiaodu.conf; # 包含上面那个公共配置
}

# sudo ln -s /etc/nginx/sites-available/test1 /etc/nginx/sites-enabled/test1

4. 启动(或重启)两个服务,Apache里默认关闭的mod_rewrite也启用。

# sudo a2enmod rewrite

# sudo service apache2 restart

# sudo service nginx restart

5. 测试相关内容:

(1) 基本的静态文件、绑定域名测试:

(2) PHP后台处理测试(没有测MySQL,也没有必要。注意是Apache2Handler):

(3) Rewrite测试(重写到一个PHP文件的QueryString中,Nginx没有启用Rewrite支持):

扩展阅读:

(1) LNMP一键安装包中的LNMPA架构:包含Nginx、Apache、MySQL和PHP的源码编译脚本,并且有一个能自动创建主机的bash脚本,非常给力。 http://www.lnmp.org/lnmpa.html

(2) Nginx官方文档:HttpProxyModule HttpCoreModule http://wiki.nginx.org/。

参考文献:

(1) 《Windows下Nginx……》(来自“车燕冰的博客”) http://blog.sina.com.cn/s/blog_495697e60100v1n5.html 有删改,会写在下面的附录中。

(2) 《Ubuntu用Nginx……》(来自“新风宇宙”) http://www.cnblogs.com/php5/archive/2011/08/19/2146033.html

附录:

(1) Proxy传递的Host用于区分域名,否则会出现400错误。而参考文献1中有一行proxy_redirect off,这行是错误的,因为如果Apache传递了一个301/302跳转,Nginx必须反向地对Location进行处理,默认值default是正确的。

(2) 在参考文献2中后半段描述的问题,部分应用发现自己运行的Apache在非80端口,则会自行加上端口号。如果只将Apache侦听于内部IP端口(如集群中的192.168.或10.段,或单机时的127.段),而将Nginx侦听于外网IP的80端口(默认是全部IP,那样Apache将无法启动),就能完美的解决这个问题。

(3) 关于Apache的日志注意两点:一是会跟Nginx的日志冗余,尤其是访问日志,基本没有必要开,使用上面Nginx配置的日志即可。二是Apache日志如果非要用的话,请自己修改下日志格式,使用LogFormat或CustomLog命令即可,修改时注意要把访客IP地址设为X-Real-IP的地址。

(4) 这里介绍的只是一个基本架构,具体应用时可能有许多变化。例如可以在Nginx配置中设置将js、css等文件缓存多久,或者Apache是一个集群,需要用upstream来做负载均衡。更多优化方法请自行查看官方文档。

注意:本文载于小杜博客,因此要求读者在引用、转载、演绎或以其他形式使用本文内容时,必须遵守CC-by-sa协议(在页面顶部或底部均能看到协议大意及全文),按照协议要求对本文来源进行链接、使用和共享。

使用 uTorrent 的 ipfilter 过滤指定IP段

写这篇主要是有一个来访记录,查的就是如何使uTorrent屏蔽IPv4流量,其实方法很简单,我这里只是把官方教程解释一下。

uTorrent 官方关于 ipfilter 的教程:http://www.utorrent.com/intl/zh_cn/help/faq/misc

1. 首先打开 uTorrent 所在目录,如果不知道在哪,通常在 %appdata%utorrent (按Win+R后输入这个缩写点确定即可)。

2. 这个目录中有一个 ipfilter.dat 文件,如果没有的话创建一个,然后用记事本打开。

3. 按照以下格式输入IP段,每行一条(//后为解释,不需要输入):

x.x.x.x  //过滤单个IPv4地址
x.x.x.x-y.y.y.y  //过滤IPv4段
[a:b:c:d:e:f:g:h]  //过滤单个IPv6地址
[a:b:c:d:e:f:g:h]-[i:j:k:l:m:n:o:p]  //过滤IPv6段
[a:b:c:d::]/nn  //过滤IPv6子网

例如,过滤全部IPv4则输入: 0.0.0.0-255.255.255.255 。过滤全部 IPv6 则输入: [::]/0
注意:仅用ipfilter并不能很好的过滤 IPv4 或 IPv6,有时候还可能产生流量,建议配合防火墙使用。)

4. 保存这个文件。要使设置立即生效,可以在 uTorrent 中的“用户”页面上右键–“载入IP过滤”。