I released the WordPress plugin “xiaodu-jsdelivr” months ago; it scans references to static resources and replaces them with jsDelivr CDN links. Details on how the plugin works can be found in the previous blog post. Yesterday I released version 1.3, which adds a new feature called “Scan API”.
“Scan API” is a hosted service that provides plugin users with pre-calculated scan results. Previous versions of the plugin used a more direct approach: calculate the local file hash, fetch the CDN file, and match the two hashes. This obviously works, but it works slowly, because downloading remote files is time-consuming, which is why initial scans after installing the plugin have always been slow. It usually took dozens of 30-second scanning sessions to complete an initial scan of base WordPress plus several plugins and themes. Considering that every user’s website has to go through the same process (there is no import/export feature yet), it makes even less sense.
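In code, the direct approach boils down to something like the following sketch (this is just an illustration of the idea, not the plugin’s actual code; the function name and hash algorithm are placeholders of mine):

// Rough sketch of the "direct" approach: hash the local file, download the CDN
// candidate, and compare. Function name, hash choice and error handling are illustrative.
function xiaodu_jsdelivr_direct_match( $local_path, $cdn_url ) {
    $local_hash = md5_file( $local_path );    // hash of the local static file
    $response   = wp_remote_get( $cdn_url );  // downloading the remote copy is the slow part
    if ( is_wp_error( $response ) ) {
        return false;
    }
    $remote_hash = md5( wp_remote_retrieve_body( $response ) );
    return $local_hash === $remote_hash;      // replace the reference only on a match
}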
This is where the hosted API service becomes helpful. By pre-fetching and storing the hashes of base WordPress and official plugins and themes (all versions), and serving them to client plugins on demand, the service avoids the repetitive fetching and calculation. That means the scanning process can be greatly accelerated, as long as the scanned resources are present in the API storage.
Current state and future development
The ultimate goal for the service is to provide scanning support for base WordPress versions, plugins and themes. As of now, both the service and the plugin have only implemented the first part: providing hashes for all versions of WordPress published on GitHub. That means the client only uploads the WordPress version, not plugin or theme versions, and the service only scans the WordPress repository.
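To make that concrete, the client-side query amounts to something like the sketch below. The endpoint URL, payload and response shape here are purely hypothetical placeholders of mine, not the real API contract:

// Purely hypothetical sketch of the client querying the hosted Scan API.
// Endpoint, payload and response fields are placeholders, not the real contract.
function xiaodu_jsdelivr_query_scan_api( $api_key ) {
    $response = wp_remote_post( 'https://scan-api.example.com/v1/scan', array(
        'headers' => array( 'Content-Type' => 'application/json' ),
        'body'    => wp_json_encode( array(
            'api_key'   => $api_key,
            'wordpress' => get_bloginfo( 'version' ),  // only the core version is uploaded for now
        ) ),
    ) );
    if ( is_wp_error( $response ) ) {
        return null;
    }
    // Hypothetically expected to contain path => hash pairs for that WordPress version.
    return json_decode( wp_remote_retrieve_body( $response ), true );
}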
The remaining parts will be added incrementally in the future, and I have to think them through carefully. There are over 100,000 themes and plugins (combined) in the official SVN repositories, and it is unrealistic to scan and store them all. So the first development goal may be to support the most popular ones in each category.
Also, as of right now the service is completely free. For the near future I don’t have a viable plan to charge for the service, because with payments come payment gateway integrations and support requests. But I cannot guarantee that it will stay free forever: it may go paid or it may go away.
Technical details
Plugin users can stop reading here, because the following part is about technical details on how the service is built.
The API service is essentially a Python website built with the Flask framework, with these main components:
Authentication: This is provided by Auth0 (free plan), chosen for being relatively easy to use and for the wide range of login channels it supports. Auth0 provides templates for a variety of languages, frameworks and applications, which can be downloaded and modified to fit basic authentication needs. In my case (Python + Flask), Authlib is used in the template to provide the OAuth 2 functionality.
Web UI: Written with React + React Router. It is not strictly necessary to use React, or even to build a single-page application, but I chose this path to get my skills up to date. For example, create-react-app makes it quite easy to create a TypeScript-based project, whereas in the old days one had to configure TypeScript and Babel by hand. React hooks are also an interesting recent addition.
API: There are two parts: the actual “Scan API” that the client plugin queries for stored data, and the backend API that the Web UI uses to manage API keys. MongoDB is used as the permanent store for both scan data and user API keys.
Scanner: Workers that regularly download files and calculate remote hashes. A task queue (Celery) is used to manage scanning tasks, and the downloading part is handled by GitPython (later PySVN may also be used).
The whole project is deployed in my private Kubernetes cluster, using Jenkins to build the frontend and backend and push the built images to an in-cluster Docker registry. In the process of building the service, I have been constantly amazed by how far web development has evolved, with a lot of great tools and libraries now available.
I created a plugin called “xiaodu-jsdelivr”, which automatically scans references to WordPress static files, including those of plugins and themes found in the official repository, and replaces them with their canonical jsDelivr CDN addresses.
The plugin has been uploaded to the official WordPress plugin directory. You can search for “xiaodu-jsdelivr” in the plugin installer, or download the ZIP archive here to install it.
How it works
I downloaded and tested some of the existing WordPress plugins with a jsDelivr replacement feature, and noticed that most of them take a passive approach: wait for visitors to request static files, then look the file hashes up against jsDelivr’s lookup API.
My plugin instead uses a more proactive approach. It starts by scanning static files directly in the WordPress installation directory, rather than waiting for visitors’ requests. To do this, I use the official WP-Cron scheduler to perform scans in the background at fixed intervals.
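Scheduling a recurring background task with WP-Cron looks roughly like this (the hook name and interval below are illustrative, not the plugin’s actual ones):

// Illustrative WP-Cron setup; the plugin's real hook names and interval may differ.
register_activation_hook( __FILE__, function () {
    if ( ! wp_next_scheduled( 'xiaodu_jsdelivr_scan' ) ) {
        wp_schedule_event( time(), 'hourly', 'xiaodu_jsdelivr_scan' );
    }
} );

// WP-Cron fires this hook periodically, so scans run without any visitor involvement.
add_action( 'xiaodu_jsdelivr_scan', 'xiaodu_jsdelivr_run_scan' );

function xiaodu_jsdelivr_run_scan() {
    // ... walk the installation directory, hash files and record matched CDN URLs ...
}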
It then calculates local file hashes and compares them with files fetched from known URL patterns (base WordPress as well as its official plugins and themes are supported). Matched URLs are recorded and later used to replace local file references when they are requested. This is more reliable than just using the lookup API, because it can match files that have never been visited by anyone and therefore could never be discovered simply by looking up hashes.
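For a single file belonging to an official plugin, the check can be sketched as follows. The function name, hash choice and the exact path layout under https://cdn.jsdelivr.net/wp/plugins/ are assumptions of mine for illustration, not the plugin’s actual code:

// Hypothetical sketch: build a jsDelivr candidate URL for one official-plugin file,
// download it, and keep the URL only if the content hashes match.
function xiaodu_jsdelivr_match_plugin_file( $plugin_slug, $plugin_version, $relative_path ) {
    $local_path = WP_PLUGIN_DIR . "/{$plugin_slug}/{$relative_path}";
    // Illustrative path layout; the real patterns mirror the official SVN repositories.
    $candidate  = "https://cdn.jsdelivr.net/wp/plugins/{$plugin_slug}/tags/{$plugin_version}/{$relative_path}";

    $response = wp_remote_get( $candidate );
    if ( is_wp_error( $response ) || 200 !== wp_remote_retrieve_response_code( $response ) ) {
        return null;
    }
    if ( md5_file( $local_path ) === md5( wp_remote_retrieve_body( $response ) ) ) {
        return $candidate;  // recorded, and later used to rewrite the local reference
    }
    return null;
}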
Here is a screenshot of what wp-content references can transform into with the plugin enabled:
Note the four different kinds of successful scan results shown in the screenshot above: base WordPress files (lines #32 – #35), an official plugin (#36), an official theme (#38 and #42), and custom files that the fallback hash lookup successfully found (#37).
Why jsDelivr
There are a handful of popular and reliable static file CDNs available. I think other plugin developers and I probably chose jsDelivr for the same reasons:
It has native WordPress support, namely https://cdn.jsdelivr.net/wp/plugins/ and https://cdn.jsdelivr.net/wp/themes/ that point to official plugin and theme SVN repositories. Other static CDNs mostly just load from GitHub and/or NPM.
It is more reliable in mainland China, thanks to the QUANTIL (ChinaNetCenter / WangSu) nodes it uses for visitors from China. Its overall performance for those visitors is better than the alternatives, which usually perform rather poorly in the PRC.
Future development
The latest stable version will always be published to the WordPress plugin directory. The plugin is licensed under GPLv2 (or later), and the first version (1.0) already ships with all the features mentioned above.
Managing DNS for a domain name traditionally involves visiting the control panels of your authoritative DNS providers to create, modify or delete records there. But I recently discovered a new project, DNSControl by Stack Overflow, which lets one manage DNS records by editing JavaScript configuration files, similar to how Kubernetes and Ansible work.
Why did I switch?
In my experience, the main advantages of DNSControl, or rather the workflow it promotes, are the following:
Support for different authoritative DNS providers: There is no longer any need to visit the control panels of different providers. The configuration is provider-agnostic and can be applied to different, or even multiple, DNS providers, which allows administrators to easily migrate between providers or use servers from several providers simultaneously.
Specify the state instead of actions: This is analogous to managing infrastructure with Ansible versus doing it manually. Only the final state is specified in the configuration file, and the software takes care of adding or modifying records and deleting unnecessary ones.
Use scripting to simplify record descriptions: A basic subset of JavaScript can be used to describe the DNS records, which reduces repetition and eases the complexity of modifications. For example, variables (or constants) and functions can be used to generate similar DNS records in batches.
I will briefly introduce my new workflow for migrating and managing DNS below, in order to show you how it can be done.
Migrating existing zones
The first step of switching to the new workflow is to export the existing DNS zones from the current providers and migrate them into the configuration file.
If, like me, you have dozens of records in the old DNS control panel and simply don’t want to copy-paste everything by hand, DNSControl has a “get-zones” sub-command for exactly this situation. You can read the official documentation about migration; the steps I used are:
In order to read from the current provider, credentials must be generated and provided in the creds.json file. The method varies by provider and is documented on each provider’s page. For example, Cloudflare only requires an API token with sufficient permissions to access and modify zone records.
DNSControl is written in Go, so static binaries are provided on the GitHub releases page; just download the one for your platform.
With creds.json filled out and saved to the current directory, the following command can be executed to export the current records of a specific zone: dnscontrol get-zones --format=js --out=dnsconfig.js creds-name PROVIDER-IDENTIFIER your-domain.tld
Here, creds-name is the key used in creds.json, and PROVIDER-IDENTIFIER can be found in the “Identifier” column of the provider table.
Now dnsconfig.js should contain all your existing records, and you can optimize the script using JavaScript variables and functions. Note that DNSControl uses a simple JavaScript interpreter, so only use the simplest features of the language. (You will find out what not to use in the testing steps below.)
Updating DNS records
In order to create or update DNS records for a domain, first edit dnsconfig.js, modifying the arguments or variables (if created in the previous part) that belong to the domain in question. Then, to make sure that the JavaScript syntax is correct and all the changes are indeed desired, use the preview sub-command to compare the changes against the existing records online. Finally, when everything checks out, use dnscontrol push to apply the changes.
To further automate the workflow, I personally use a Git repository to version-control my dnsconfig.js configuration, and Jenkins to perform the steps above. My creds.json is kept private in Jenkins’ “Credentials” area, and mounted into the pipeline environment during execution. In this way, I can commit and push my DNS configuration to the Git server, and Jenkins will automatically check and apply the changes.
Supported providers
As of the time of writing this article, the following DNS providers are supported by DNSControl:
ActiveDirectory_PS
AXFRDDNS
Azure DNS
BIND
Cloudflare
ClouDNS
deSEC
DigitalOcean
DNSimple
Gandi_v5
Google Cloud DNS
Hurricane Electric DNS
Hetzner DNS Console
HEXONET
INWX
Linode
Microsoft DNS Server (Windows Server)
Name.com
Namecheap Provider
Netcup
NS1
Oracle Cloud
Ovh
PowerDNS
Route 53
SoftLayer DNS
Vultr
In addition, the following registrars are supported, which allow users to modify the domains’ NS records to point to the providers above:
CSC Global
DNSimple
DNS-over-HTTPS
Gandi_v5
HEXONET
Internet.bs
INWX
Name.com
Namecheap Provider
OpenSRS
Ovh
Route 53
And even if your current provider is not covered, you can easily add your own integration and possibly contribute it upstream.
If you switched from another syntax highlighting plugin to SyntaxHighlighter Evolved and all your code is now scrambled, try running the following code as a single-file plugin.
function xiaodu_syntaxhighlighter_fix() {
    // Tell SyntaxHighlighter Evolved to treat every post as code format 2,
    // i.e. the "new" already-encoded format, so it will not encode entities again.
    return 2;
}
add_filter( 'syntaxhighlighter_pre_getcodeformat', 'xiaodu_syntaxhighlighter_fix' );
The easiest way is to go to Plugins – Plugin Editor, and paste the code at the bottom of any enabled plugin, maybe Hello Dolly.
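If you would rather go the single-file plugin route mentioned above, a minimal wrapper could look like this (the plugin name in the header and the file name are arbitrary choices on my part):

<?php
/*
Plugin Name: SyntaxHighlighter Encoding Fix
Description: Treat all posts as the already-encoded code format for SyntaxHighlighter Evolved.
*/

function xiaodu_syntaxhighlighter_fix() {
    return 2;
}
add_filter( 'syntaxhighlighter_pre_getcodeformat', 'xiaodu_syntaxhighlighter_fix' );

Save it as something like syntaxhighlighter-fix.php, drop it into wp-content/plugins/, and activate it from the Plugins page.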
Long version
After almost three years I finally started working on this blog again.
One of the first things I noticed is that the syntax highlighting plugin I used, Crayon Syntax Highlighter, is dead. Well, to its credit, the server-rendered markups it generated were… fine when I started this blog, but nowadays they just look sickening to me, especially when compared to the neat client rendering solutions.
So I went to the store and downloaded the most popular choice, SyntaxHighlighter Evolved, which uses the JavaScript library SyntaxHighlighter to perform client-side highlighting. After installing it and converting all my old code tags to its markup format, I found the highlighting to be working, but all the C++ and HTML looked screwed up.
As you can see, all the “<”, “>” and “&” characters in the code are now showing up as their HTML entities: “&lt;”, “&gt;” and “&amp;”. That is not cool, so I looked into the problem.
Looking under the hood
The first thing we need to know is how the code is stored in the post. When I clicked on the “Text” tab in the post editor (yep, the old one… I haven’t adapted to the new blocks yet), I found that the characters were displayed correctly.
Then I looked further into MySQL and found that the code is stored encoded, which is fine; it can be stored in the database however it fits, as long as the final output is correct… which it isn’t.
Pinpointing the plugin code
Now that I knew the plugin was doing an extra round of encoding, I searched for “htmlspecialchars” in the plugin’s GitHub repository and found this piece of code:
// This function determines what version of SyntaxHighlighter was used when the post was written
// This is because the code was stored differently for different versions of SyntaxHighlighter
function get_code_format( $post ) {
    if ( false !== $this->codeformat )
        return $this->codeformat;
    if ( empty($post) )
        $post = new stdClass();
    if ( null !== $version = apply_filters( 'syntaxhighlighter_pre_getcodeformat', null, $post ) )
        return $version;
    $version = ( empty($post->ID) || get_post_meta( $post->ID, '_syntaxhighlighter_encoded', true ) || get_post_meta( $post->ID, 'syntaxhighlighter_encoded', true ) ) ? 2 : 1;
    return apply_filters( 'syntaxhighlighter_getcodeformat', $version, $post );
}
// Adds a post meta saying that HTML entities are encoded (for backwards compatibility)
function mark_as_encoded( $post_ID, $post ) {
    if ( false == $this->encoded || 'revision' == $post->post_type )
        return;
    delete_post_meta( $post_ID, 'syntaxhighlighter_encoded' ); // Previously used
    add_post_meta( $post_ID, '_syntaxhighlighter_encoded', true, true );
}
Apparently, years ago they changed how code is stored in posts. Now if you write and save a new post with their plugin installed, the code is saved already encoded, and a post meta “_syntaxhighlighter_encoded = true” is inserted to mark the post as using the “new (encoded) format”.
But if, like me, you used other plugins when initially posting the code and later switched to Evolved, you are out of luck: your posts are considered the “old format” by default, and the code will be encoded again in the final output.
Solution
The obvious solution is to make the plugin think that all my posts are in the new format. I could add the same metadata to each of the posts, but luckily there is an easier way: use the “syntaxhighlighter_pre_getcodeformat” filter (line 8 in the code above) that they provide to override the result.
So I hooked it with the plugin code at the beginning of this post. The hook function simply returns 2, which means all my posts, with or without the metadata, are treated as the already-encoded new format, so they will not be double-encoded.
It’s been years, is that all you have to say?
OK, fair enough. So this blog may look the same, but the tech underneath it is constantly changing.
For example, all my applications are now hosted on my own bare-metal (as opposed to cloud-managed, like GKE) globally-distributed Kubernetes cluster. Also, I have been hiding behind Cloudflare for years to avoid the haters, but now they are mostly gone (or grown up?), so I have been thinking of new ways to distribute my content.
All of this new stuff is exciting and worth sharing, and I will write about it soon™.