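For work, I need to log in to a website that asks for a captcha. I had looked into captcha recognition before, but I could never get hold of the exact captcha I needed. I only came back to the problem this Friday, and yesterday I finally solved it.
Now to the main topic:
Python version: 3.4.3
Required libraries: PIL, selenium, tesseract
Code first: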
#coding:utf-8
import subprocess
import time

from PIL import Image
from PIL import ImageOps
from selenium import webdriver


def cleanImage(imagePath):
    # Open the screenshot and convert it to grayscale, so the threshold works
    # on a single channel and the result can be saved as JPEG
    image = Image.open(imagePath).convert("L")
    # Threshold every pixel so the image is strictly black or white
    image = image.point(lambda x: 0 if x < 143 else 255)
    # Pad the image with a white border
    borderImage = ImageOps.expand(image, border=20, fill='white')
    borderImage.save(imagePath)


def getAuthCode(driver, url="http://localhost/"):
    captchaUrl = url + "common/random"
    driver.get(captchaUrl)
    time.sleep(0.5)
    driver.save_screenshot("captcha.jpg")  # take a screenshot and save it
    #urlretrieve(captchaUrl, "captcha.jpg")
    time.sleep(0.5)
    cleanImage("captcha.jpg")
    # tesseract writes the recognized text to captcha.txt
    p = subprocess.Popen(["tesseract", "captcha.jpg", "captcha"],
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    p.communicate()  # wait for tesseract to finish and drain both pipes
    with open("captcha.txt", "r") as f:
        # Remove spaces and newlines from the OCR output
        captchaResponse = f.read().replace(" ", "").replace("\n", "")
    print("Captcha solution attempt: " + captchaResponse)
    if len(captchaResponse) == 4:
        return captchaResponse
    else:
        return False


def withoutCookieLogin(url="http://org.cfu666.com/"):
    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get(url)
    # Note: find_element_by_xpath is the older Selenium API;
    # newer releases use driver.find_element(By.XPATH, ...)
    while True:
        authCode = getAuthCode(driver, url)
        if authCode:
            driver.back()  # return from the captcha page to the login form
            driver.find_element_by_xpath("//input[@id='orgCode' and @name='orgCode']").clear()
            driver.find_element_by_xpath("//input[@id='orgCode' and @name='orgCode']").send_keys("orgCode")
            driver.find_element_by_xpath("//input[@id='account' and @name='username']").clear()
            driver.find_element_by_xpath("//input[@id='account' and @name='username']").send_keys("username")
            driver.find_element_by_xpath("//input[@type='password' and @name='password']").clear()
            driver.find_element_by_xpath("//input[@type='password' and @name='password']").send_keys("password")
            driver.find_element_by_xpath("//input[@type='text' and @name='authCode']").send_keys(authCode)
            driver.find_element_by_xpath("//button[@type='submit']").click()
            try:
                time.sleep(3)
                # This menu entry only exists after a successful login
                driver.find_element_by_xpath("//*[@id='side-menu']/li[2]/ul/li/a").click()
                return driver
            except Exception:
                # Login failed, most likely a wrong captcha: reload and retry
                print("authCode Error:", authCode)
                driver.refresh()
    return driver


driver = withoutCookieLogin("http://localhost/")
driver.get("http://localhost/enterprise/add/")
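The whitespace cleanup in getAuthCode can be pulled out into a small helper that is easy to check on its own. The following is only a sketch: the name normalize_captcha and its expected_length parameter are my own, not part of the script above. It strips every kind of whitespace rather than just spaces and newlines, which may help if the OCR output ends with other control characters such as a form feed.

```python
def normalize_captcha(raw, expected_length=4):
    """Strip all whitespace from OCR output; return None if the length is off.

    Hypothetical helper for illustration, not part of the original script.
    """
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, "\n", "\r", form feeds), so joining the pieces
    # removes all of it in one step.
    text = "".join(raw.split())
    if len(text) == expected_length:
        return text
    return None


print(normalize_captcha(" 7k 2P\n\x0c"))  # 7k2P
print(normalize_captcha("7k2\n"))         # None
```

Returning None instead of False keeps the same truthiness check (if authCode:) working in the calling loop.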
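How to get the captcha we need
On the way to getting this captcha I fell into a lot of pits and read a lot of articles. Many of them teach you how to recognize a captcha, but they do not explain how to obtain the captcha image that your current login actually needs.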
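One likely reason a separately downloaded image does not match, and this is an assumption about how such sites typically behave rather than something verified against this particular site, is that the server generates a fresh code on every request and remembers it per session. The toy simulation below illustrates that idea. FakeCaptchaServer and the session names are made up for the example. It is also why the script takes the screenshot through the same driver instead of using the commented-out urlretrieve call.

```python
import itertools


class FakeCaptchaServer:
    """Toy stand-in for a site that keeps one expected captcha per session."""

    def __init__(self):
        self._counter = itertools.count()
        self._expected = {}  # session id -> code the server currently expects

    def get_captcha(self, session_id):
        # Every request produces a new code and overwrites the old one.
        code = "{:04d}".format(next(self._counter))
        self._expected[session_id] = code
        return code

    def login(self, session_id, code):
        return self._expected.get(session_id) == code


server = FakeCaptchaServer()

# Same session: the browser loads the captcha itself, then submits the form.
seen_in_browser = server.get_captcha("browser")
same_session_ok = server.login("browser", seen_in_browser)

# Separate download: it runs in its own session, so the code it receives
# is not the one the server stored for the browser session.
server.get_captcha("browser")
downloaded = server.get_captcha("downloader")
separate_download_ok = server.login("browser", downloaded)

print(same_session_ok)       # True
print(separate_download_ok)  # False
```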