Sending HTTP(S) requests with Python's socket module



Based on the above, I wrote the following code:

import socket

# First attempt: the whole request is sent as one triple-quoted string.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('www.baidu.com', 80))
s.send('''GET / HTTP/1.1
Host: zh.lianjia.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: https://www.baidu.com/link?url=4J5Kx--GLdLFESJhkfRePU8Ac_0agnTcOtB-b3kfnX8VNdZ_6TPqOyJGKVXkTczg&ck=6140.3.83.296.315.287.208.155&shh=www.baidu.com&sht=94886267_hao_pg&wd=&eqid=af98b98700060b77000000065aef0524
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9,en-CA;q=0.8,en;q=0.7
Cookie: lianjia_uuid=ce61c41c-25b0-46d6-a0a0-d57a75ee8706; UM_distinctid=1631f588055f9-0286722badd3ec-b34356b-1fa400-1631f58805657f; _ga=GA1.2.43397143.1525239286; _smt_uid=5ae94e02.558be516; _jzqx=1.1525248800.1525335927.1.jzqsr=zh%2Elianjia%2Ecom|jzqct=/ershoufang/xiangzhouqu/.-; _jzqc=1; _jzqckmp=1; _gid=GA1.2.1028411676.1525594529; select_city=440400; all-lj=c60bf575348a3bc08fb27ee73be8c666; _qzjc=1; CNZZDATA1254525948=963210960-1525238218-https%253A%252F%252Fwww.lianjia.com%252F%7C1525608956; CNZZDATA1255633284=1054798284-1525238580-https%253A%252F%252Fwww.lianjia.com%252F%7C1525608969; lianjia_ssid=c046ddb3-3e66-4809-998a-52ade335fdfc; _qzja=1.1070225156.1525239298260.1525603274282.1525613866775.1525609113492.1525613866775.0.0.0.92.9; _qzjto=29.3.0; _jzqa=1.3750161754444366000.1525239284.1525603274.1525613867.9; _jzqy=1.1525239284.1525613867.3.jzqsr=baidu.jzqsr=baidu; Hm_lvt_9152f8221cb6243a53c83b956842be8a=1525607433,1525607626,1525609113,1525613867; Hm_lpvt_9152f8221cb6243a53c83b956842be8a=1525613867; _qzjb=1.1525613866775.1.0.0.0; _jzqb=1.1.10.1525613867.1; CNZZDATA1255604082=964175865-1525237915-https%253A%252F%252Fwww.lianjia.com%252F%7C1525612833
'''.encode())

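The response was always 400 (Bad Request). I was stuck here for a long time; what finally worked was sending the request one line at a time, with \r\n appended to each line. HTTP requires the request line and every header to end in CRLF and the header section to end with an empty line, whereas the triple-quoted string above contains only bare \n line breaks and no terminating blank line.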
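As a minimal sketch of that framing rule, the request can also be assembled once with explicit CRLF line endings instead of being sent piece by piece. The helper name `build_request` and the trimmed header set (including `Connection: close`) are illustrative assumptions, not the code from this post:

```python
def build_request(method, path, headers):
    """Assemble an HTTP/1.1 request head with CRLF line endings."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends in \r\n, and an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Hypothetical, shortened header set used only for illustration.
request = build_request("GET", "/ershoufang/", {
    "Host": "zh.lianjia.com",
    "Connection": "close",
})
```

The resulting bytes can be passed to `sock.sendall(request)` in a single call, which also avoids the partial writes that `send()` is allowed to make.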

import socket

# Second attempt: send the request line and each header separately, each ending in \r\n.
sock = socket.socket()
sock.connect(('zh.lianjia.com', 80))
sock.send('GET /ershoufang/ HTTP/1.1\r\n'.encode())
sock.send('Host: zh.lianjia.com\r\n'.encode())
sock.send('Connection: keep-alive\r\n'.encode())
sock.send('Cache-Control: no-cache\r\n'.encode())
sock.send('Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n'.encode())
sock.send('Upgrade-Insecure-Requests: 1\r\n'.encode())
sock.send('User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36\r\n'.encode())
sock.send('Accept-Encoding: gzip, deflate, br\r\n'.encode())
# The last header ends with \r\n\r\n: the extra empty line closes the header section.
sock.send('Cookie: lianjia_uuid=ce61c41c-25b0-46d6-a0a0-d57a75ee8706; UM_distinctid=1631f588055f9-0286722badd3ec-b34356b-1fa400-1631f58805657f; _ga=GA1.2.43397143.1525239286; _smt_uid=5ae94e02.558be516; _jzqx=1.1525248800.1525335927.1.jzqsr=zh%2Elianjia%2Ecom|jzqct=/ershoufang/xiangzhouqu/.-; _jzqc=1; _jzqy=1.1525239284.1525594526.2.jzqsr=baidu.jzqsr=baidu|jzqct=%E9%93%BE%E5%AE%B6; _jzqckmp=1; _gid=GA1.2.1028411676.1525594529; Hm_lvt_9152f8221cb6243a53c83b956842be8a=1525594526,1525594536,1525594804,1525595210; select_city=440400; all-lj=c60bf575348a3bc08fb27ee73be8c666; _qzjc=1; lianjia_ssid=99306d63-8ee5-a53c-a740-2d3021f3db2f; CNZZDATA1255604082=964175865-1525237915-https%253A%252F%252Fwww.lianjia.com%252F%7C1525602095; _jzqa=1.3750161754444366000.1525239284.1525594526.1525603274.8; CNZZDATA1254525948=963210960-1525238218-https%253A%252F%252Fwww.lianjia.com%252F%7C1525603556; CNZZDATA1255633284=1054798284-1525238580-https%253A%252F%252Fwww.lianjia.com%252F%7C1525603557; Hm_lpvt_9152f8221cb6243a53c83b956842be8a=1525606057; _jzqb=1.9.10.1525603274.1; _qzja=1.1070225156.1525239298260.1525597069547.1525603274282.1525605398368.1525606071025.0.0.0.86.8; _qzjb=1.1525603274282.9.0.0.0; _qzjto=23.2.0\r\n\r\n'.encode())
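The snippet above only sends the request; the status code still has to be read back from the socket. A rough sketch of a receive loop follows. The function name `recv_until_closed` is hypothetical, and the loop assumes the server closes the connection when it is done (for example after a `Connection: close` request). With `keep-alive` it would block until a timeout, so `sock.settimeout(...)` may be needed in practice. The demonstration uses a local socket pair in place of a real server:

```python
import socket


def recv_until_closed(sock, bufsize=4096):
    """Collect bytes from sock until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes means the peer closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


# Local demonstration with a connected socket pair instead of a real server.
server_side, client_side = socket.socketpair()
server_side.sendall(b"HTTP/1.1 301 Moved Permanently\r\n\r\n")
server_side.close()
raw_response = recv_until_closed(client_side)
client_side.close()
```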

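Now the response was always a redirect, status code 301! I searched for a long time without finding the cause. Entering the URL directly in the browser's address bar and capturing with Fiddler did not show any 301 response either. Finally I entered http://zh.lianjia.com/ershoufang in Fiddler's Composer and captured both a 301 and a 200; the 200 came from https://zh.lianjia.com/ershoufang, as shown in the figure below.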

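That revealed the cause: the difference between http and https. (The Location header of the 301 response actually shows this, but a single "s" is so inconspicuous that I overlooked it, which is why I was stuck for so long.)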
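The Location check described above is easy to automate, so the scheme change does not have to be spotted by eye. The following is only a sketch: `parse_response_head` is a hypothetical helper, and the sample bytes are made up to resemble the redirect rather than captured from the server:

```python
def parse_response_head(raw):
    """Split a raw HTTP response into (status_code, headers_dict)."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


# Made-up sample modelled on the redirect described above (not captured output).
sample = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Location: https://zh.lianjia.com/ershoufang\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)
status, headers = parse_response_head(sample)
is_https_redirect = status == 301 and headers["location"].startswith("https://")
```

Header names are lowercased because HTTP header names are case-insensitive; repeated headers would overwrite each other in this simple dict, which is acceptable for a quick Location check.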

All that remained was to send the request over HTTPS. The code is below; the main change is in how the socket is created and connected. Note that the port is 443.

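import socket, ssl
context = ssl.create_default_context()  # ssl.wrap_socket() was removed in Python 3.12
sock = context.wrap_socket(socket.socket(), server_hostname='zh.lianjia.com')
sock.connect(('zh.lianjia.com', 443))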
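Putting the pieces together, a complete HTTPS request over a raw socket might look like the sketch below. It uses `ssl.create_default_context()`, which verifies the certificate and hostname; the module-level `ssl.wrap_socket()` is no longer available from Python 3.12 on. The function names `build_get`, `open_connection` and `fetch`, the 10-second timeout, and the minimal header set are all assumptions made for illustration:

```python
import socket
import ssl


def build_get(host, path="/"):
    """Return a minimal GET request with CRLF line endings."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # ask the server to close so the read loop ends
        "\r\n"
    ).encode("ascii")


def open_connection(host, port=443, timeout=10):
    """Return a TLS-wrapped socket connected to host:port."""
    context = ssl.create_default_context()  # verifies certificate and hostname
    raw_sock = socket.create_connection((host, port), timeout=timeout)
    return context.wrap_socket(raw_sock, server_hostname=host)


def fetch(host, path="/"):
    """Send the request over HTTPS and return the raw response bytes."""
    with open_connection(host) as sock:
        sock.sendall(build_get(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `fetch('zh.lianjia.com', '/ershoufang/')` requires network access. Because the request does not send `Accept-Encoding`, the body should normally arrive uncompressed, though it may still use chunked transfer encoding.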

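I feel there is a lot here that I don't understand deeply enough, and my school has not covered the application layer yet. I will dig into it further later; corrections are welcome.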
