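The Android game we are targeting this time is called "单词英雄" (Word Hero); you may want to download it first.
The game screen looks like this: [screenshot]
You attack by choosing the meaning of a word: a correct choice triggers a normal attack, while a wrong one only triggers a token attack. After playing for a while, I figured the game could be automated: use PIL together with OCR to read the word and the answer options from a screenshot, translate the English word into Chinese, and fuzzy-match that Chinese translation against the options to pick the right one.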
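The fuzzy-matching step can be sketched with the standard library alone. This is a minimal sketch, not the original program: the function name `best_option` and the sample data are hypothetical, and it assumes the translation and the OCR-recognized options are already available as Unicode strings.

```python
# -*- coding: utf-8 -*-
import difflib


def best_option(translation, options):
    """Return (index, score) of the option most similar to the translation."""
    # ratio() is 1.0 for identical strings and 0.0 when nothing matches.
    scores = [difflib.SequenceMatcher(None, translation, option).ratio()
              for option in options]
    best = max(range(len(options)), key=lambda i: scores[i])
    return best, scores[best]


# Hypothetical data: a dictionary meaning and four OCR-recognized choices.
translation = u"苹果;苹果树"
options = [u"香蕉", u"苹果", u"橘子", u"葡萄"]
index, score = best_option(translation, options)
```

A similarity score is used rather than an exact comparison because OCR output and dictionary translations rarely agree character for character; a score near zero for every option would suggest that recognition failed and the click should be skipped.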
After digging through a lot of material I got started. The program uses the following:
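PIL: Python Imaging Library, the well-known image processing module.
pytesser: a Python module that drives tesseract-ocr to perform recognition.
Tesseract-OCR: the OCR engine that turns images into text; it can recognize English, Chinese, and other languages.
autopy: a Python module for simulating mouse and keyboard input.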
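Installation steps (Windows 7):
(1) Install PIL. Download it from http://www.pythonware.com/products/pil/ and install Python Imaging Library 1.1.7 for Python 2.7.
(2) Install pytesser. Download it from http://code.google.com/p/pytesser/, unpack it, and place the folder directly under C:/Python27/Lib/site-packages. In site-packages, create a file named pytesser.pth whose content is C:/Python27/Lib/site-packages/pytesser_v0.0.1
(3) Install the Tesseract OCR engine. From https://github.com/tesseract-ocr/tesseract/wiki/Downloads, download the "Windows installer of tesseract-ocr 3.02.02 (including English language data)" and run it.
(4) Install the language pack. Download chi_sim.traineddata, the Simplified Chinese language data, from https://github.com/tesseract-ocr/tessdata and put it in the tessdata folder under the Tesseract OCR installation directory so that Simplified Chinese can be recognized.
(5) Modify pytesser.py in C:/Python27/Lib/site-packages/pytesser_v0.0.1: add a language selection parameter, language, to the original image_to_string function. With language='chi_sim' it recognizes Chinese; the default is eng (English).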
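Steps (2) and (4) can be checked with a short script. This is a sketch under assumptions: the helper names `write_pth` and `has_language` are hypothetical, and the demo runs in a temporary directory instead of the real site-packages or tessdata folders.

```python
import os
import tempfile


def write_pth(site_packages, package_dir, name="pytesser.pth"):
    """Create a .pth file; Python adds each path listed in it to sys.path."""
    pth_path = os.path.join(site_packages, name)
    with open(pth_path, "w") as handle:
        handle.write(package_dir + "\n")
    return pth_path


def has_language(tessdata_dir, language):
    """Check that <language>.traineddata is present in the tessdata folder."""
    data_file = os.path.join(tessdata_dir, language + ".traineddata")
    return os.path.isfile(data_file)


# Demo in a throwaway directory so the real installation is not touched.
demo_dir = tempfile.mkdtemp()
pth_file = write_pth(demo_dir, "C:/Python27/Lib/site-packages/pytesser_v0.0.1")
chinese_ready = has_language(demo_dir, "chi_sim")  # no data file copied yet
```

On a real machine the first argument of `write_pth` would be C:/Python27/Lib/site-packages, and `has_language` would be pointed at the tessdata folder of wherever Tesseract was installed.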
The modified pytesser.py:
"""OCR in Python using the Tesseract engine from Google
http://code.google.com/p/pytesser/
by Michael J.T. O'Kelly
V 0.0.1, 3/10/07"""

import Image
import subprocess

import util
import errors

tesseract_exe_name = 'tesseract'  # Name of executable to be called at command line
scratch_image_name = "temp.bmp"  # This file must be .bmp or other Tesseract-compatible format
scratch_text_name_root = "temp"  # Leave out the .txt extension
cleanup_scratch_flag = True  # Temporary files cleaned up after OCR operation


def call_tesseract(input_filename, output_filename, language):
    """Calls external tesseract.exe on input file (restrictions on types),
    outputting output_filename+'.txt'"""
    # "-l" selects the recognition language, e.g. "eng" or "chi_sim".
    args = [tesseract_exe_name, input_filename, output_filename, "-l", language]
    proc = subprocess.Popen(args)
    retcode = proc.wait()
    if retcode != 0:
        errors.check_for_errors()


def image_to_string(im, cleanup=cleanup_scratch_flag, language="eng"):
    """Converts im to file, applies tesseract, and fetches resulting text.
    If cleanup=True, delete scratch files after operation."""
    try:
        util.image_to_scratch(im, scratch_image_name)
        call_tesseract(scratch_image_name, scratch_text_name_root, language)
        text = util.retrieve_text(scratch_text_name_root)
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return text


def image_file_to_string(filename, cleanup=cleanup_scratch_flag,
                         graceful_errors=True, language="eng"):
    """Applies tesseract to filename; or, if image is incompatible and
    graceful_errors=True, converts to compatible format and then applies
    tesseract.  Fetches resulting text.
    If cleanup=True, delete scratch files after operation."""
    try:
        try:
            call_tesseract(filename, scratch_text_name_root, language)
            text = util.retrieve_text(scratch_text_name_root)
        except errors.Tesser_General_Exception:
            if graceful_errors:
                im = Image.open(filename)
                # Pass language along so the fallback path honors it too.
                text = image_to_string(im, cleanup, language)
            else:
                raise
    finally:
        if cleanup:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return text


if __name__ == '__main__':
    im = Image.open('phototest.tif')
    text = image_to_string(im)
    print text
    try:
        text = image_file_to_string('fnord.tif', graceful_errors=False)
    except errors.Tesser_General_Exception, value:
        print "fnord.tif is incompatible filetype.  Try graceful_errors=True"
        print value
    text = image_file_to_string('fnord.tif', graceful_errors=True)
    print "fnord.tif contents:", text
    text = image_file_to_string('fonts_test.png', graceful_errors=True)
    print text
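With this change, recognizing Chinese text from pytesser comes down to image_to_string(im, language='chi_sim'). What the module does underneath can be reproduced without pytesser, as in the sketch below. The helper names `tesseract_args` and `ocr_file` and the file name options.png are hypothetical; the sketch assumes tesseract is on the PATH and that the engine writes its result as UTF-8 text to the output base name plus ".txt".

```python
import subprocess


def tesseract_args(image_path, output_base, language="eng", exe="tesseract"):
    """Build the same argument list that call_tesseract passes to Popen."""
    return [exe, image_path, output_base, "-l", language]


def ocr_file(image_path, output_base="temp", language="eng"):
    """Run tesseract and return the text it writes to <output_base>.txt."""
    subprocess.check_call(tesseract_args(image_path, output_base, language))
    with open(output_base + ".txt", "rb") as handle:
        return handle.read().decode("utf-8")


# Hypothetical file name; a real run needs tesseract and chi_sim installed:
# text = ocr_file("options.png", language="chi_sim")
args = tesseract_args("options.png", "temp", "chi_sim")
```

Building the argument list in a separate function makes it easy to print the exact command and run it by hand in a console when recognition results look wrong.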