Published: 2019-09-06 09:16:18 | Editor: auto | Views: 3818
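Please credit http://blog.csdn.net/boksic when reposting. Questions are welcome in the comments.
As a scripting language that handles network operations flexibly, Python should make this easy to implement. Here is a first attempt, where url is the target address: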
import urllib2
url = 'http://blog.csdn.net'
file = urllib2.urlopen(url)
content = file.read()
It appears the site rejects plain test requests like this one.
import urllib2
def test():
    url = 'http://blog.csdn.net'
    req = urllib2.Request(url)
    # Present the request as coming from a regular browser
    req.add_header("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)")
    file = urllib2.urlopen(req)
    content = file.read()
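For readers on Python 3, where urllib2 was folded into urllib.request, the same idea might look like the sketch below. The helper names build_request and fetch_status, and the timeout value, are assumptions made for illustration and are not part of the original post.

```python
import urllib.request

USER_AGENT = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"

def build_request(url):
    """Return a Request carrying a browser-like User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

def fetch_status(url, timeout=10):
    """Send the request and return the HTTP status code (hypothetical helper)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.status
```

Calling fetch_status('http://blog.csdn.net') would send the request. Keeping request construction separate from sending makes the header logic easy to check without touching the network.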
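With the HTTP header added, the request is recognized as coming from a normal browser and succeeds. The timeit module can be used to measure how long each call takes:
import timeit
t = timeit.Timer("test()", "from __main__ import test")
# Run test() 10 times and print the average time per call
print t.timeit(10)/10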
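The averaging step can also be written with a callable instead of a statement string. The following is a minimal Python 3 sketch; average_seconds and fake_request are hypothetical names, and fake_request merely stands in for test() so the example runs without network access.

```python
import timeit

def average_seconds(func, runs=10):
    """Call func `runs` times and return the mean time per call in seconds."""
    total = timeit.timeit(func, number=runs)
    return total / runs

def fake_request():
    """Stand-in for test(): does a little CPU work instead of network I/O."""
    sum(range(1000))
```

For example, average_seconds(fake_request, runs=10) gives the per-call average in the same way as t.timeit(10)/10 does.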
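The line content = file.read() is not actually needed when the only goal is to register page views, since sending the request is enough. With that line commented out, each call averages 0.26 seconds.
To speed things up, the next attempt requests the page from multiple threads. The complete code is: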
import urllib2
import timeit
import thread
import time
i = 0  # total number of completed requests, shared by all threads
mylock = thread.allocate_lock()
def test(no, r):
    global i
    url = 'http://blog.csdn.net'
    # Note: range(1, r) yields r - 1 iterations
    for j in range(1, r):
        req = urllib2.Request(url)
        req.add_header("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)")
        file = urllib2.urlopen(req)
        print file.getcode()
        # Guard the shared counter with the lock
        mylock.acquire()
        i += 1
        mylock.release()
        print i
    thread.exit_thread()
def fast():
    thread.start_new_thread(test, (1, 50))
    thread.start_new_thread(test, (2, 50))
fast()
# Keep the main thread alive long enough for the workers to finish
time.sleep(15)
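The same pattern of several workers sharing a lock-protected counter can be sketched in Python 3 with the higher-level threading module. This is only an illustration under stated assumptions: run_workers and its parameters are hypothetical names, task stands for whatever function sends one request, and join() replaces the fixed time.sleep(15) wait.

```python
import threading

def run_workers(task, num_threads=2, calls_per_thread=50):
    """Run task() calls_per_thread times on each thread; return the total call count."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(calls_per_thread):
            task()
            # Protect the shared counter, as mylock does in the original
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has finished
    return counter
```

Unlike the range(1, r) loop in the original, each worker here performs exactly calls_per_thread calls, so run_workers(task, 2, 50) counts 100 completed calls.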