Solution: After switching to SyntaxHighlighter Evolved, all the code is scrambled

tl;dr

If you switched from another syntax highlighting plugin to SyntaxHighlighter Evolved and all your code is scrambled, try running the following code as a single-file plugin.

function xiaodu_syntaxhighlighter_fix() {
	return 2;
}
add_filter('syntaxhighlighter_pre_getcodeformat', 'xiaodu_syntaxhighlighter_fix');

The easiest way is to go to Plugins – Plugin Editor, and paste the code at the bottom of any enabled plugin, maybe Hello Dolly.

Long version

After almost three years I finally started working on this blog again.

One of the first things I noticed is that the syntax highlighting plugin I used, Crayon Syntax Highlighter, is dead. Well, to its credit, the server-rendered markup it generated was… fine when I started this blog, but nowadays it just looks sickening to me, especially when compared to the neat client-side rendering solutions.

So I went to the store and downloaded the most popular choice, SyntaxHighlighter Evolved, which uses the JavaScript library SyntaxHighlighter to perform client-side highlighting. After installing it and converting all my old code tags to their markup format, I found the highlighting to be working, but all the C++ and HTML looked screwed up.

Scrambled code

As you can see, all the “<”, “>” and “&” in the code are now showing up as their HTML entities: “&lt;”, “&gt;” and “&amp;”. That is not cool, so I looked into the problem.

Looking under the hood

The first thing we need to know is how the code is stored in the post. When I clicked on the “Text” tab in the post editor (yep, the old one… I haven’t adapted to the new blocks yet), I found that the characters were displayed correctly.

Code in post editor

Then I looked further into MySQL, and the code is stored encoded, which is fine – it can be stored in the database however it fits, as long as the final output is correct… which it isn’t.

Code in MySQL

Pinpointing the plugin code

Now that I know that the plugin did an extra encoding, I looked for “htmlspecialchars” in the plugin’s GitHub repository, and found this piece of code:

	// This function determines what version of SyntaxHighlighter was used when the post was written
	// This is because the code was stored differently for different versions of SyntaxHighlighter
	function get_code_format( $post ) {
		if ( false !== $this->codeformat )
			return $this->codeformat;
		if ( empty($post) )
			$post = new stdClass();
		if ( null !== $version = apply_filters( 'syntaxhighlighter_pre_getcodeformat', null, $post ) )
			return $version;
		$version = ( empty($post->ID) || get_post_meta( $post->ID, '_syntaxhighlighter_encoded', true ) || get_post_meta( $post->ID, 'syntaxhighlighter_encoded', true ) ) ? 2 : 1;
		return apply_filters( 'syntaxhighlighter_getcodeformat', $version, $post );
	}
	// Adds a post meta saying that HTML entities are encoded (for backwards compatibility)
	function mark_as_encoded( $post_ID, $post ) {
		if ( false == $this->encoded || 'revision' == $post->post_type )
			return;
		delete_post_meta( $post_ID, 'syntaxhighlighter_encoded' ); // Previously used
		add_post_meta( $post_ID, '_syntaxhighlighter_encoded', true, true );
	}

Apparently, years ago they changed how code is stored in the post. Now if you write and save a new post with their plugin installed, it saves the code already encoded and adds a post meta entry, “_syntaxhighlighter_encoded = true”, to mark the post as being in the “new (encoded) format”.

But if, like me, you used another plugin when you originally posted the code and later switched to Evolved, you are out of luck: your posts have no such metadata, so the plugin treats them as the “old format” by default and encodes the code again in the final output.
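The effect of that second encoding pass can be reproduced outside WordPress. The following is only a sketch in plain PHP: the sample line of C++ is made up, and it assumes the stored content was entity-encoded exactly once and that the extra pass behaves like a plain htmlspecialchars() call.

```php
<?php
// Hypothetical sample: a line of C++ as typed in the editor.
$raw = '#include <vector> // a && b';

// Assumption: the post content was saved entity-encoded once.
$stored = htmlspecialchars($raw);

// Treated as "old format", the content gets encoded a second time on output.
$doubleEncoded = htmlspecialchars($stored);

// A browser decodes one level of entities when it renders the page.
$seenIfEncodedOnce  = htmlspecialchars_decode($stored);
$seenIfEncodedTwice = htmlspecialchars_decode($doubleEncoded);

echo $seenIfEncodedOnce, PHP_EOL;   // the code as originally typed
echo $seenIfEncodedTwice, PHP_EOL;  // entities such as &lt; leak through
```

With one level of encoding the reader sees the original code; with two, the reader sees the entities themselves, which matches the symptom in the screenshot.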

Solution

The apparent solution is to make the plugin think that all my posts are in the new format. I could add the same metadata to each of the posts, but luckily there is an easier way: use the filter “syntaxhighlighter_pre_getcodeformat” (the apply_filters call on line 8 of the code above) that they provide to override the result.

So I used the plugin code at the beginning to hook it. The hook function simply returns 2, which means all my posts, with or without the metadata, will be considered the already-encoded new format, so they will not be doubly encoded.
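To see why returning 2 from the filter is enough, here is a minimal sketch of the decision made in get_code_format(), written as plain PHP so that it runs without WordPress. The function name resolve_code_format and its parameters are hypothetical stand-ins: $filtered models the value returned by the syntaxhighlighter_pre_getcodeformat filter, and the cached value and the final syntaxhighlighter_getcodeformat filter are left out.

```php
<?php
// Hypothetical stand-in for the plugin's get_code_format(); not the plugin's code.
function resolve_code_format(?int $filtered, bool $hasPostId, bool $hasEncodedMeta): int
{
    // A non-null value from the "pre" filter short-circuits everything else.
    if ($filtered !== null) {
        return $filtered;
    }
    // Otherwise: posts without an ID, or with the "encoded" meta, are format 2.
    return (!$hasPostId || $hasEncodedMeta) ? 2 : 1;
}

// A post migrated from another plugin: it has an ID but no meta.
var_dump(resolve_code_format(null, true, false)); // format 1, encoded again
var_dump(resolve_code_format(2, true, false));    // format 2, left alone
```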

It’s been years, is that all you have to say?

OK, fair enough. So this blog may look the same, but the tech underneath it is constantly changing.

For example, all my appliances are now hosted on my own bare-metal (as opposed to cloud-vendored like GKE) globally-distributed Kubernetes cluster. Also, I have been hiding behind CloudFlare for years to avoid the haters, but now they are mostly gone (or grown up 🙄,) so I have been thinking of new ways to distribute my content.

All of this new stuff is exciting and worth sharing, and I will write about it soon™.

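Problems encountered when updating emlog to 5.1.2 @emlog博客

Today the emlog blog system released version 5.1.2. Here is a summary of the problems I ran into during the update.

1. When upgrading from 5.0.1, running the upgrade script “up5.0.1to5.1.1.php” fails with the message “Invalid operation: you must complete the first step of the upgrade procedure before performing this operation; see the installation instructions” (original text: “错误操作:您必须完成升级步骤里的第一步才再进行本操作,详见安装说明”).

Solution: This happened because I applied the 5.1.1 and 5.1.2 patches directly, without upgrading the database right after updating to 5.1.1. Commenting out or deleting line 105 of the upgrade script (up5.0.1to5.1.1.php) fixes it. (Make sure you really are upgrading from 5.0.1; otherwise run the earlier upgrade scripts first.)

2. The old “代码高亮” (code highlighting) plugin conflicts with the feature built into the new version, which breaks the “Write post” page so that the editor cannot load.

Solution: Delete the code highlighting plugin under “Plugins” in the admin panel. The new KindEditor already ships with prettyprint code highlighting.

3. [W3C Validator error] (default theme) The <script>prettyPrint();</script> at the bottom is missing the type attribute.

Solution: Change it to <script type="text/javascript">. (It is in content/templates/default/footer.php.)

4. [W3C Validator error] (default theme) A <li> tag in the right sidebar is misplaced.

Solution: After line 79 of content/templates/default/module.php, insert a new line containing </li>. Probably a small slip by the author.

All of the files modified above can be downloaded here.

Since emlog is currently not truly open source (its repository is a private one hosted on bitbucket), <Correction>emlog's official Weibo account told me that it is open source now and the source is on GitHub, although the commit history of earlier versions seems to be gone</Correction>, I maintain a modified version of emlog locally, with changes to some of the default templates and code. The modified pages all conform to W3C standards (you can test this blog's home page and most posts, except for posts whose own content contains errors). If you are interested, contact me for this version or pull my repository directly.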

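emlog plugin: Custom Code (du9l_emlog_code)

This is a small plugin I made a while ago for inserting HTML code at emlog's hooks. It can be used to place your own JavaScript, analytics, and ad code.

Plugin download: [Update] official plugin page (recommended), click to open; [Update] Baidu Netdisk (recommended), click to open; local download, click here. Version 0.3, released on November 3, 2012.

Notes: 1. The download server has hotlink protection; please link to the Baidu Netdisk or the official page instead.
2. The plugin has not been submitted to the official emlog list. Use with caution.
3. By downloading and using it, you agree to the GNU GPLv3 license. Note: the version on the official site requires GPLv2, so either license is acceptable.
4. It works on PHP 5.2 and above; no 5.3 features are used.

Admin panel screenshot:

Code can be inserted at the following hooks (hook descriptions are taken from the official emlog blog):

  • index_head: front-end head extension; can be used to add front-end CSS styles, load JS, etc.
  • index_footer: home page footer extension point
  • index_loglist_top: extension point at the top of the post list, e.g. for showing announcements
  • log_related: extension point on the post reading page, for adding post-related content (parameters are not supported yet)
  • navbar: for extending the navigation bar; for example, a photo album plugin would use this hook to generate an album link in the navigation
  • comment_reply: comment reply extension point
  • rss_display: RSS output extension
  • diff_side: sidebar control extension point

Usage: Upload the plugin under Plugins in the emlog admin panel. Once it is installed, open “自定义代码” (Custom Code) in the left-hand menu, click the entry you want to edit, and save. HTML is supported (it is not filtered); PHP code is not supported (it will not be run).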

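Making emlog require the CAPTCHA to be typed in reverse to avoid spam comments

There has been a lot of spam in my blog comments lately. I couldn't be bothered to install an anti-spam plugin, so I changed the CAPTCHA check to require the code to be entered backwards, which bots or foreigners won't be able to handle.

1. Change the value stored in the session for the CAPTCHA. In /include/lib/checkcode.php, around line 15, change: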

$_SESSION['code'] = strtoupper($randCode);

to:

$_SESSION['code'] = strrev(strtoupper($randCode));

This way, the check treats the reversed string (ABCD becomes DCBA, and so on) as correct and the CAPTCHA as originally displayed as wrong.
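As a quick check of that behaviour, here is a small sketch in plain PHP (written for PHP 7 or later). The captcha_matches function is a hypothetical stand-in for emlog's own comparison, which lives elsewhere in emlog and may differ in detail; the sample code 'aB3d' is made up.

```php
<?php
// Hypothetical stand-in for emlog's comment check; not emlog's actual code.
function captcha_matches(string $input, string $sessionCode): bool
{
    return strtoupper(trim($input)) === $sessionCode;
}

$randCode    = 'aB3d';                           // made-up code shown in the image
$sessionCode = strrev(strtoupper($randCode));    // what the patched line stores

var_dump(captcha_matches('d3ba', $sessionCode)); // typed in reverse: accepted
var_dump(captcha_matches('ab3d', $sessionCode)); // typed as displayed: rejected
```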

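2. Add a hint that tells visitors how to enter the CAPTCHA. In the template's module.php (/content/templates/[template name]/module.php), find the blog_comments_post function and edit its content.

There are also quite a few official anti-spam plugins, which you can use directly to filter out most foreign spam comments. Turning on comment moderation and the CAPTCHA also works well.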

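Migrating a Drupal blog to the emlog platform

Today I finally finished this painful task, because I really couldn't stand Drupal's complexity and slowness any longer. While I was at it, I moved the blog to Sina SAE to try that platform out.

Here is how to convert the data and redirect the old URLs. First, the Drupal version is 6.22 (so I can't guarantee this works for versions 5 or 7), and emlog is 4.1.0, the latest version. You need an installed emlog, a full backup of the Drupal database, and a MySQL database with phpMyAdmin to stage the data (I set one up locally).

1. Export the Drupal database and import it into the local database. Assume the Drupal table prefix is drblog_.

2. Install emlog, go to “Data” in the admin panel, and choose to back up to local. Then import that backup into the same local database. Assume the emlog table prefix is emlog_.

3. Run the following SQL statements in the local database: (my SQL is pretty poor, so go easy on me)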

INSERT INTO `emlog_sort`(`sid`, `sortname`) 
SELECT `drblog_term_data`.`tid`, `drblog_term_data`.`name` FROM `drblog_term_data`
WHERE `drblog_term_data`.`vid` = (SELECT `drblog_vocabulary`.`vid` FROM `drblog_vocabulary` LIMIT 0,1);
INSERT INTO `emlog_blog` (`gid`, `title`, `date`, `content`, `excerpt`, `author`, `sortid`, `type`, `views`, `hide`)
SELECT `drblog_node`.`nid`, `drblog_node_revisions`.`title`, `drblog_node`.`created`, `drblog_node_revisions`.`body`, `drblog_node_revisions`.`teaser`, 1, `drblog_term_node`.`tid`, 'blog', `drblog_node_counter`.`totalcount`, IF(`drblog_node`.`status`=1, 'n', 'y')
FROM `drblog_node`, `drblog_node_revisions`, `drblog_term_node`, `drblog_node_counter`
WHERE `drblog_node_revisions`.`nid` = `drblog_node`.`nid` AND `drblog_term_node`.`nid` = `drblog_node`.`nid` AND `drblog_node_counter`.`nid` = `drblog_node`.`nid`;
UPDATE `emlog_blog` SET `emlog_blog`.`comnum` =
(SELECT COUNT(*) FROM `drblog_comments` WHERE `drblog_comments`.`nid` = `emlog_blog`.`gid`);
INSERT INTO `emlog_comment`(`cid`, `gid`, `pid`, `date`, `poster`, `comment`, `ip`)
SELECT `cid`, `nid`, `pid`, `timestamp`, `subject`, `comment`, `hostname` FROM `drblog_comments`;

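Note: This was written for my environment: there is only one user (ID=1) and only one Taxonomy (category, ID=1); the time a post was first published is used as the post date, and the comment subject is used as the commenter name in emlog.

4. Export the resulting database as an SQL backup, add the header and footer used by the emlog backup format and adjust as needed, then restore it with emlog's data restore feature.

UPDATE: Here is the .htaccess code that redirects the old URLs to the new ones. It assumes the new blog address is this blog (t.du9l.com) and covers pagination, post, category, and author pages.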

  # Moving to SAE
  RewriteCond %{QUERY_STRING} page=([0-9]+)
  RewriteRule ^.*$ http://t.du9l.com/page/%1 [L,R=301]
  RewriteRule ^node/([0-9]+)$ http://t.du9l.com/post/$1 [L,R=301]
  RewriteRule ^blog/([0-9]+)$ http://t.du9l.com/author/$1 [L,R=301]
  RewriteRule ^taxonomy/term/([0-9]+)$ http://t.du9l.com/sort/$1 [L,R=301]
  RewriteRule ^.*$ http://t.du9l.com/ [L,R=301]
  # Moving to SAE
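To sanity-check the mapping without an Apache server, the rules can be mirrored in a few lines of plain PHP. This is only a sketch: map_old_url is a hypothetical helper that is not part of Drupal or emlog, it expects the path without a leading slash (as rules in a .htaccess file see it), and it models only which target is chosen, not how Apache treats the query string on the redirect itself.

```php
<?php
// Hypothetical helper mirroring the rewrite rules above, in the same order.
function map_old_url(string $path, string $query = ''): string
{
    $base = 'http://t.du9l.com';

    // The RewriteCond guards only the rule right after it: any request whose
    // query string contains page=N is sent to /page/N.
    if (preg_match('/page=([0-9]+)/', $query, $m)) {
        return "$base/page/{$m[1]}";
    }

    $rules = [
        '#^node/([0-9]+)$#'          => '/post/',
        '#^blog/([0-9]+)$#'          => '/author/',
        '#^taxonomy/term/([0-9]+)$#' => '/sort/',
    ];
    foreach ($rules as $pattern => $prefix) {
        if (preg_match($pattern, $path, $m)) {
            return $base . $prefix . $m[1];
        }
    }

    // Catch-all: everything else goes to the new home page.
    return "$base/";
}

echo map_old_url('node/42'), PHP_EOL;
echo map_old_url('taxonomy/term/3'), PHP_EOL;
echo map_old_url('node', 'page=2'), PHP_EOL;
```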