

How does WeChat monitor image files?
Source: Bob Mok


WeChat relies on two basic tools to censor its mainland Chinese users: a keyword list and a list of forbidden websites, both used to filter and block communications.





Analyzing an image in real time to decide whether it should be censored is labour-intensive, time-consuming and therefore expensive. WeChat instead checks every image against previously catalogued “ban items” by looking up the image's hash in a hash index. In layman's terms, the hash is a short code computed from the file's contents. Because this lookup is fast enough to run in real time, it is the most efficient way to catch images that match previously identified items.
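The hash lookup described above can be sketched in a few lines of Python. This is only an illustration: the article does not say which hash function WeChat uses, so SHA-256 is an assumption here, and the names `banned_hashes`, `file_hash` and `is_banned` are hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-in for the "ban items" hash index.
banned_hashes = set()


def file_hash(data: bytes) -> str:
    """Return a short fixed-length code for the file's bytes.

    SHA-256 is an assumption; the article does not name the hash function.
    """
    return hashlib.sha256(data).hexdigest()


def is_banned(data: bytes) -> bool:
    """Real-time check: a single set lookup, no image analysis involved."""
    return file_hash(data) in banned_hashes
```

A lookup like this only catches files whose hash is already in the index, which is consistent with the article's point that it works on previously identified items.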


When an image has no match, it then undergoes “content surveillance”. This process checks whether the image is visually similar to any blacklisted image. Text inside the image is also extracted and analyzed to determine whether any of it is blacklisted. If either check is positive, the image's hash is added to the “ban items” list, which allows real-time censorship in the future. Earlier testing found that content surveillance was never performed in real time, so a sensitive image file was not censored the first time it was transmitted.
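The two-stage flow in this paragraph can be modelled roughly as follows. It is a sketch under stated assumptions, not WeChat's actual implementation: the class `TwoStageFilter`, its methods, and the `looks_sensitive` callback (a stand-in for the visual-similarity and text-extraction checks) are all hypothetical, and SHA-256 is again an assumed hash function.

```python
import hashlib
from collections import deque
from typing import Callable


class TwoStageFilter:
    """Toy model: a fast hash check in real time, slow analysis afterwards."""

    def __init__(self, looks_sensitive: Callable[[bytes], bool]):
        self.banned = set()      # hash index of "ban items"
        self.pending = deque()   # files waiting for content surveillance
        self.looks_sensitive = looks_sensitive  # hypothetical slow analysis

    @staticmethod
    def _digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()  # assumed hash function

    def send(self, data: bytes) -> bool:
        """Real-time path. Returns True if the file is delivered."""
        if self._digest(data) in self.banned:
            return False             # known ban item: blocked immediately
        self.pending.append(data)    # unknown file: deliver now, analyse later
        return True

    def run_content_surveillance(self) -> None:
        """Deferred path. Analyses queued files and updates the hash index."""
        while self.pending:
            data = self.pending.popleft()
            if self.looks_sensitive(data):
                self.banned.add(self._digest(data))
```

In this model the first `send` of a sensitive file succeeds, and only after `run_content_surveillance` has run does a second `send` of the same file fail, which mirrors the behaviour the testers observed.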


This explains why some people who recently tried to beat the system succeeded for a while before the system caught up with them. Apparently someone wrote simple Morse code messages on paper, scanned the paper into a picture and circulated the picture as part of their messages. It worked for a while, and then the system learned of this through its artificial intelligence capability and put a stop to it.


In controlled testing, computer experts did not detect censorship in communications among non-China-registered accounts. They nevertheless have proof that such accounts are subject to content surveillance. This was confirmed when politically sensitive content sent exclusively between non-China-registered accounts was identified as sensitive and subsequently censored when transmitted between China-registered accounts. The content in question had never previously been sent to, or transmitted between, China-registered accounts.


The testers further explored whether sensitive documents sent to, or from, China-registered accounts were being monitored and censored using a hash index. By sending sensitive documents to a China-registered account, they could observe which files were censored. They found that UTF8-encoded plain text (*.txt), Microsoft Word (*.docx) and Portable Document Format (*.pdf) documents were censored if they contained certain sensitive keyword combinations such as “Falun Gong + Falun Dafa”. In other words, these files never reached their destinations and were never displayed on China-registered accounts, so the recipients had no idea that anyone had sent them.
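A keyword-combination check of the kind described here might look like the sketch below. It assumes that a combination matches only when every keyword in it appears in the document's extracted text, which is one reading of the “+” notation. The list `BLACKLISTED_COMBINATIONS` and the function `matches_blacklist` are hypothetical names, and extracting text from *.docx or *.pdf files is left out.

```python
# Hypothetical blacklist: each entry is a set of keywords that must ALL
# appear in the text for the document to count as a match.
BLACKLISTED_COMBINATIONS = [
    {"Falun Gong", "Falun Dafa"},
]


def matches_blacklist(text: str, combinations=BLACKLISTED_COMBINATIONS) -> bool:
    """Return True if every keyword of at least one combination is in text."""
    return any(
        all(keyword in text for keyword in combo)
        for combo in combinations
    )


def check_plain_text_file(raw: bytes) -> bool:
    """Decode a UTF-8 *.txt payload and run the combination check on it."""
    return matches_blacklist(raw.decode("utf-8", errors="replace"))
```

Under this reading, a document that mentions only one keyword of a combination would pass, while one containing both would be flagged.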


By sending the files in two separate batches, the testers confirmed that documents undergo file-hash surveillance, and that a file is not censored in real time until it has gone through non-real-time content surveillance and its hash has been added to the hash index.


Through these experiments, experts seek to understand the methods WeChat uses to conduct its censorship so that users can devise better means of combating the practice. I will explore this further in the coming weeks.





©2013 - 2024 chinesenewsgroup.com Chinese News Group Ltd. 大中资讯网. All rights reserved. 
Distribution, transmission or republication of any material from chinesenewsgroup.com is strictly prohibited without the prior written permission of Chinese News Group Ltd.