Author Date

2023-03-17

Degree Name

BS

Department

Electrical and Computer Engineering

College

Ira A. Fulton College of Engineering and Technology

Defense Date

2023-03-10

Publication Date

2023-03-17

First Faculty Advisor

Philip B. Lundrigan

First Faculty Reader

Steve Richardson

Honors Coordinator

Karl F. Warnick

Keywords

censorship, Chinese, homophone, keyword filtering

Abstract

As the scope of Chinese language censorship expands, individuals will seek to bypass such censorship efforts. One of the most prevalent techniques in such censorship is automated keyword filtering. This research focuses on building a command-line tool that can bypass automated keyword filters for both traditional and simplified Chinese characters using a two-part approach. The first part involves detecting sensitive words in user-inputted text by using phrase matching techniques to identify character strings that have been censored in the past. The second part centers around generating possible obfuscated homonym alternatives. The tool relies on a compiled list of banned and potentially banned phrases from previous research to determine what is deemed “sensitive.” Alternate characters to generate the obfuscated text are drawn from a standardized list of the most commonly used Chinese characters. Further research is needed to automate updating the list of sensitive phrases and to detect phrases that are similar, but not identical, to those that have been censored in the past.
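
The two-part approach described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis tool itself: the phrase list, the substitution table, and the function names `find_sensitive`, `alternatives`, and `obfuscate` are all hypothetical, and the toy data stands in for the compiled list of banned phrases and the standardized character list that the abstract mentions. The sketch also assumes exact substring matching and a hand-written table of same- or similar-sounding characters.

```python
from itertools import product

# Hypothetical toy data, for illustration only. The actual tool draws on a
# compiled list of previously censored phrases and a standardized list of
# commonly used characters; neither is reproduced here.
SENSITIVE_PHRASES = ["和谐"]
SOUND_ALIKES = {"和": ["河"], "谐": ["蟹"]}


def find_sensitive(text, phrases=SENSITIVE_PHRASES):
    """Part 1: return (start_index, phrase) for each exact match in text."""
    hits = []
    for phrase in phrases:
        start = text.find(phrase)
        while start != -1:
            hits.append((start, phrase))
            start = text.find(phrase, start + 1)
    return sorted(hits)


def alternatives(phrase, table=SOUND_ALIKES):
    """Part 2: list every variant of phrase built by swapping in table entries."""
    # Each position may keep its original character or use a listed substitute.
    choices = [[ch] + table.get(ch, []) for ch in phrase]
    variants = ("".join(combo) for combo in product(*choices))
    return [v for v in variants if v != phrase]


def obfuscate(text):
    """Replace each detected phrase with one of its generated alternatives."""
    for _, phrase in find_sensitive(text):
        candidates = alternatives(phrase)
        if candidates:
            # Pick the last candidate, which uses the last listed substitute
            # for every character that has an entry in the table.
            text = text.replace(phrase, candidates[-1])
    return text
```

Under these assumptions, `obfuscate("建设和谐社会")` swaps the detected phrase for a similar-sounding one and leaves the rest of the sentence untouched. A real tool would presumably rank or let the user choose among the candidates returned by `alternatives` rather than always taking the last one.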
