How the “Scan API” added in xiaodu-jsdelivr 1.3 works

I released the WordPress plugin “xiaodu-jsdelivr” months ago, which can be used to scan and replace references to static resources with jsDelivr CDN links. Details on how the plugin works can be found in the previous blog post. Yesterday, I released a new version 1.3, which contains a new feature called “Scan API”.

API Manager: https://xiaodu-jsdelivr-api.du9l.com/

What it is and why it is needed

“Scan API” is a hosted service, which can provide plugin users with pre-calculated scan results. Previous versions of the plugin used a more direct approach: Calculate local file hash, fetch CDN file and match their hashes. This obviously works, but it does slowly, because downloading remote files is a time-consuming job, which is the reason that initial scans after installing the plugin are always slow. Usually it took dozens of 30-second scanning sessions to complete an initial scan with the base WordPress, several plugins and themes. When you think about the fact that each website of all users has to go through the same process (there’s no import-export feature yet), it makes even less sense.

This is where the hosted API service becomes helpful. By pre-fetching and storing the hashes of WordPress and official plugins and themes (with all versions), and serving them to a client plugin when needed, the repetitive fetching and calculating can be avoided. That means the scanning process can be greatly accelerated, as long as the resources scanned are present in the API storage.

Flow chart of old and new processes

Current state and future development

The ultimate goal for the service is to provide scanning support for base WordPress versions, plugins and themes. As of now, both the service and plugin only implemented the first part, which is to provide hashes for all published (on GitHub) versions of WordPress. That means the client only uploads WordPress version, not plugin or theme versions; and the service only scans WordPress repository.

The remaining parts will be incrementally added in the future, which I have to think carefully. There are over 100,000 themes and plugins (combined) in the official SVN repositories, and it’s unrealistic to scan and store them all. So for the first future development goal may be to support the most popular of each category.

Also, as of right now the service is completely free. In the near future I don’t have a viable plan to charge users for the service, because with payments come payment gateway integrations and support requests. But I cannot guarantee that it will stay free forever – it may go paid or it may go away.

Technical details

Plugin users can stop reading here, because the following part is about technical details on how the service is built.

The API service is essentially a Python website built with Flask framework, with these main components:

  1. Authentication: This is provided by Auth0 (free plan), chosen for being relatively easy to use and the vast amount of login channels supported. It provides templates for a variety of languages, frameworks and applications, which can be downloaded and modified to fit the basic authentication needs. In my case (Python + Flask), Authlib is used in the template to provide the OAuth 2 functionality.
  2. Web UI: Written with React + React Router. It is not strictly necessary to use React, or even to build a single-page application, but I chose this path to get my skills up to date. For example, create-react-app is quite easy to use for creating a TypeScript-based project, while in the old days one had to configure TypeScript and Babel themselves. Also, React hooks is an interesting recent addition.
  3. API: There are two parts, one is the actual “Scan API” for the client plugin to query for stored data. The other is the backend API for the Web UI to manage API keys. MongoDB is used as permanent store for both scanning and user API key data.
  4. Scanner: Workers that regularly download and calculate remote hashes. A task queue (Celery) is used to manage scanning tasks, and the downloading part is handled by GitPython (later maybe PySVN will also be used).

The whole project is deployed in my private Kubernetes cluster, using Jenkins to build the frontend and backend and push the built images to an in-cluster docker registry. In the process of building the service, I am constantly amazed by how web development has evolved nowadays, with a lot of great tools and libraries available.

3 thoughts on “How the “Scan API” added in xiaodu-jsdelivr 1.3 works

发表评论

邮箱地址不会被公开。 必填项已用*标注