Debugging random lagging on Android

My Samsung tablet with Android (4.4.2, rooted) has been experiencing random lagging for a while. This affects me most when I’m playing games on it. Earlier today I finally decided to find the root of the problem.

Finding the cause

The real fix is to find out which application is causing the lag and disable it. I first went to Settings > General > Developer Options (if you don’t have it, go to Settings > General > About phone/device and tap “Build number” seven or more times to enable it) and ticked “Show CPU usage”. An overlay then appears in the top-right corner of the screen, listing the top processes (like the Linux top command).

Then I played games on the device and waited for the random lag to happen. When the device started to respond slowly, I watched the CPU usage overlay and saw “gzip” topping the list. If you are not familiar with Linux, gzip is a compression tool similar to RAR or 7-Zip.

Wrapping (first attempt)

I was expecting some Android application to be causing the lag; instead it was gzip, a command that anyone can call. The problem became finding out who called it. Since my device is rooted, I decided to replace the original gzip binary with a wrapper that logs who called it before actually gzipping. Here is the wrapper shell script I used (first version, read on for the second version):

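#!/system/bin/sh
# Wrapper installed as /system/bin/gzip: log the caller, then run the real gzip.

OPERATION=/system/bin/gzip.real
LOGFILE=/sdcard/gziplog.txt

# Log the time, the caller's identity and the command line ($* joins the arguments).
echo "$(date) + $EUID $(id) + $OPERATION $*" >> "$LOGFILE"

# Replace this shell with the real gzip so that stdin and stdout pass straight through.
exec "$OPERATION" "$@"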

The script does two things: (1) it logs the time, the user ID (the $EUID variable turned out to be empty and therefore useless, but the id command worked) and the command-line parameters to $LOGFILE (a plain text file on the internal storage); then (2) it hands control over to the real gzip, so that gzip keeps working.

Then I used Root Explorer to rename /system/bin/gzip to /system/bin/gzip.real and put my script at /system/bin/gzip (remember to set its permissions to match gzip.real, usually 0755).
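
For those who prefer a root shell to Root Explorer, the same swap could be sketched roughly as follows. The function name install_wrapper, its arguments and the /sdcard/gzip-wrapper.sh path are hypothetical; the directory argument exists only so that the steps can be rehearsed on a scratch directory first. Making /system writable beforehand is assumed and not shown.

```shell
# install_wrapper BIN_DIR WRAPPER_SCRIPT
# Hypothetical helper: moves BIN_DIR/gzip aside and installs the wrapper in its place.
install_wrapper() {
    bin_dir=$1
    wrapper=$2

    # Refuse to run twice, otherwise the real binary would be overwritten by the wrapper.
    if [ -e "$bin_dir/gzip.real" ]; then
        echo "gzip.real already exists in $bin_dir" >&2
        return 1
    fi

    # Keep the original binary under a new name.
    mv "$bin_dir/gzip" "$bin_dir/gzip.real" || return 1

    # Copy the script's contents into place and give it the usual 0755 mode.
    cat "$wrapper" > "$bin_dir/gzip" || return 1
    chmod 755 "$bin_dir/gzip"
}

# On the device (as root, with /system writable) this might be called as:
#   install_wrapper /system/bin /sdcard/gzip-wrapper.sh
```

The check for an existing gzip.real is there because running the swap a second time would otherwise replace the real binary with the wrapper itself.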

After using the device for a while and going through several bouts of lag, I read the log file and found these lines:

Fri Aug 28 13:53:22 CST 2015 +  uid=2000(shell) gid=2000(shell) groups=1007(log),1009(mount),1015(sdcard_rw),1028(sdcard_r),3003(inet),3006(net_bw_stats) context=u:r:dumpstate:s0 + /system/bin/gzip.real -6
Fri Aug 28 13:54:32 CST 2015 +  uid=2000(shell) gid=2000(shell) groups=1007(log),1009(mount),1015(sdcard_rw),1028(sdcard_r),3003(inet),3006(net_bw_stats) context=u:r:dumpstate:s0 + /system/bin/gzip.real -6
Fri Aug 28 13:55:43 CST 2015 +  uid=2000(shell) gid=2000(shell) groups=1007(log),1009(mount),1015(sdcard_rw),1028(sdcard_r),3003(inet),3006(net_bw_stats) context=u:r:dumpstate:s0 + /system/bin/gzip.real -6
Fri Aug 28 13:56:53 CST 2015 +  uid=2000(shell) gid=2000(shell) groups=1007(log),1009(mount),1015(sdcard_rw),1028(sdcard_r),3003(inet),3006(net_bw_stats) context=u:r:dumpstate:s0 + /system/bin/gzip.real -6

Observe that id produced “uid=2000(shell) gid=2000(shell) ...” and that the command called is “gzip -6”. This means gzip was called by the shell user rather than by a specific application, and that the data to be compressed was piped into gzip and piped out again, since there are no input or output file names. (Spoiler: the log already shows “context=u:r:dumpstate:s0”, which indicates that it was called by “dumpstate”, but I didn’t notice it the first time.)
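
Reading the log by eye is how the context got overlooked, so here is a small sketch of how it could be summarised instead. It assumes the log has been copied to a PC where sed, sort and awk are available, and that the lines have the first-version layout shown above; the function names log_contexts and log_intervals are made up for this example.

```shell
# log_contexts LOGFILE: print each distinct SELinux context seen in the wrapper log.
log_contexts() {
    sed -n 's/.*context=\([^ ]*\).*/\1/p' "$1" | sort -u
}

# log_intervals LOGFILE: print the seconds elapsed between consecutive entries.
# Assumes the time of day is the 4th field (as in "Fri Aug 28 13:53:22 CST 2015")
# and that the entries do not span midnight.
log_intervals() {
    awk '$4 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
        split($4, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) print now - prev
        prev = now
        seen = 1
    }' "$1"
}
```

For the four entries above, log_contexts reports the single context u:r:dumpstate:s0, and log_intervals reports 70, 71 and 70 seconds, which suggests that gzip was being launched roughly every 70 seconds.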

Wrapping (second attempt)

Then I decided to take the surveillance up a level and replaced the first version with the following wrapper:

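#!/system/bin/sh
# Second version: also log the process tree and a copy of the data piped in.

OPERATION=/system/bin/gzip.real
LOGFILE=/sdcard/gziplog.txt
# Left unquoted when used, so that each splits into "busybox0" plus the applet name.
TEE="/system/bin/busybox0 tee"
PSTREE="/system/bin/busybox0 pstree"

# [$$] is the wrapper's own PID, so the entry can be matched against the tree below.
echo "[$$] $(date) + $EUID $(id) + $OPERATION $*" >> "$LOGFILE"
$PSTREE -p >> "$LOGFILE"

# No exec here: tee copies stdin into the log while passing it on to the real gzip.
$TEE -a "$LOGFILE" | "$OPERATION" "$@"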

The difference from the previous version is that it uses Busybox’s “pstree” command to output the process tree at that moment, and the “tee” command to record the data piped in while passing it on. For this to work, I downloaded the busybox-armv7l binary (my device has an ARMv7 architecture) from the official website and put it at /system/bin/busybox0 with the same permissions as gzip.real. You can also use a Busybox installer app and modify the busybox path accordingly.
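
The tee trick is easy to try in isolation before touching a system binary. The sketch below wraps it in a hypothetical helper called tap (not part of Busybox or of the wrapper above) and uses tr as a harmless stand-in for gzip.real.

```shell
# tap LOGFILE COMMAND [ARGS...]
# Hypothetical helper: append a copy of stdin to LOGFILE while COMMAND consumes it.
tap() {
    log=$1
    shift
    tee -a "$log" | "$@"
}

# Example with a stand-in for gzip.real:
#   printf 'abc' | tap /tmp/tap.log tr a-z A-Z
# stdout receives ABC, and /tmp/tap.log gains the untouched input abc.
```

Because the command is the last stage of the pipeline, the exit status of tap is that of the command, not of tee.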

After a while, I got the following log this time:

[8691] Fri Aug 28 14:30:46 CST 2015 +  uid=2000(shell) gid=2000(shell) groups=1007(log),1009(mount),1015(sdcard_rw),1028(sdcard_r),3003(inet),3006(net_bw_stats) context=u:r:dumpstate:s0 + /system/bin/gzip.real -6
init(1)-+-Binder_2(2299)---{Binder_1}(2360)
        |-DaemonServer(5419)---app_process(8668)
        |-DaemonServer(6687)
<...snip...>
        |-debuggerd(2276)---dumpstate(8690)-+-gzip(8691)---busybox0(8695)
        |                                   `-top(8696)
<...snip...>
========================================================
== dumpstate: 2015-08-28 14:30:46
========================================================

Build: KOT49H.P601ZCUCNH1
Build fingerprint: 'samsung/lt033gzc/lt033g:4.4.2/KOT49H/P601ZCUCNH1:user/release-keys'
Bootloader: P601ZCUCNH1
Radio: unknown
<...snip...>

Observe that [8691] is the gzip wrapper’s PID, and that it was called by “dumpstate”, which in turn was called by “debuggerd”. The wrapper also logged the data to be compressed, which is 5 MB long and takes 0.14 s to compress on my PC, as opposed to a poor mobile device.
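
If Busybox is not at hand, the caller chain could in principle be recovered by walking the PPid entries under /proc using nothing but shell built-ins. This is only a sketch: the function names proc_field and ancestry are invented here, and it assumes that /proc/PID/status of the ancestors is readable by the wrapper, which SELinux or the kernel configuration on a given device may not allow.

```shell
# proc_field PID KEY: print the value of KEY (e.g. Name, PPid) from /proc/PID/status.
proc_field() {
    while read -r key value; do
        if [ "$key" = "$2:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# ancestry PID: print "name(pid)" for PID and each of its ancestors, one per line.
# The walk stops at the top of the tree (PPid 0) or at the first unreadable entry.
ancestry() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        echo "$(proc_field "$pid" Name)($pid)"
        pid=$(proc_field "$pid" PPid)
    done
}

# In the wrapper this could stand in for the pstree line, for example:
#   ancestry $$ >> "$LOGFILE"
```

Unlike pstree, this prints only the wrapper's own line of ancestors rather than the whole process tree, which keeps the log much shorter.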

Final solution

A quick Google search told me that this mechanism is for developers to debug the system state, and since I’m not a developer in this area, I decided to disable it once and for all. I did this by using Root Explorer to revoke the execute permissions of the two binaries /system/bin/debuggerd and /system/bin/dumpstate.
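
From a root shell, revoking the permissions might look like the sketch below. The helper name disable_binaries is hypothetical, the directory argument is there so that it can be tried on a scratch directory first, and the original mode of both files is assumed to be 0755. Octal modes are used on the assumption that the device's chmod may not understand symbolic ones.

```shell
# disable_binaries BIN_DIR NAME...
# Hypothetical helper: set each BIN_DIR/NAME to mode 0644 so it can no longer be executed.
disable_binaries() {
    bin_dir=$1
    shift
    for name in "$@"; do
        chmod 644 "$bin_dir/$name" || return 1
    done
}

# On the device (as root, with /system writable):
#   disable_binaries /system/bin debuggerd dumpstate
# To undo, restore the previous mode (0755 is assumed here):
#   chmod 755 /system/bin/debuggerd /system/bin/dumpstate
```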

Now the lagging is gone.

2 thoughts on “Debugging random lagging on Android”

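  1. My understanding is that debuggerd does not just run by itself for no reason. From what I have read, debuggerd is what detects application crashes, so some application must have crashed in the background and triggered debuggerd. That real culprit is what should be tracked down.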
