Python Crawler Course, Third Cohort: Homework

[Python Crawler] Week 4 Homework: Printing the Jianshu Homepage

2017-08-02  infinite昊昊

Setup: install the requests package if it is not installed yet, and install the Google Chrome browser.

Crawler basics: learn what a URL, the request headers, and the page source are, and get a basic understanding of HTML tags.
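To make these terms concrete, here is a minimal sketch that uses only the Python standard library. It splits the homepage URL requested in this post into its parts and lists the tags found in a tiny HTML snippet. The `TagCollector` class and the `SAMPLE_HTML` string are made up for illustration; they are not part of the homework.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Split the homepage URL used in this post into its components.
parts = urlsplit("http://www.jianshu.com/")
print(parts.scheme)  # the protocol part of the URL
print(parts.netloc)  # the host name
print(parts.path)    # the path on that host


class TagCollector(HTMLParser):
    """Record the name of every opening tag, in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Called by HTMLParser once for each opening tag it encounters.
        self.tags.append(tag)


# Hypothetical miniature page, standing in for real page source.
SAMPLE_HTML = (
    "<html><head><title>Demo</title></head>"
    "<body><a href='/'>Home</a></body></html>"
)

collector = TagCollector()
collector.feed(SAMPLE_HTML)
print(collector.tags)
```

The same `TagCollector` could be fed the text of a real response, although a page as large as the Jianshu homepage would produce a much longer list.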

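Using the requests package: the get method sends a GET request and returns a response object whose text attribute holds the page source.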

Print the source code of the Jianshu homepage

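import requests  # import the requests library
response = requests.get("http://www.jianshu.com/")  # send a GET request to the Jianshu homepage
print(response.status_code)  # print the HTTP status code
print(response.text)  # print the response body (the page source)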

Output:
"C:\Program Files\Python36\python.exe" "E:/python project/jianshushouye.py"
200
<!DOCTYPE html>




<html>

<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=Edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0,user-scalable=no">


<meta http-equiv="Cache-Control" content="no-siteapp" />
<meta http-equiv="Cache-Control" content="no-transform" />
<meta name="applicable-device" content="pc,mobile">
<meta name="MobileOptimized" content="width"/>
<meta name="HandheldFriendly" content="true"/>
<meta name="mobile-agent" content="format=html5;url=http://localhost/">

<meta name="description" content="简书是一个优质的创作社区,在这里,你可以任性地创作,一篇短文、一张照片、一首诗、一幅画……我们相信,每个人都是生活中的艺术家,有着无穷的创造力。">
<meta name="keywords" content="简书,简书官网,图文编辑软件,简书下载,图文创作,创作软件,原创社区,小说,散文,写作,阅读">

<meta name="360-site-verification" content="604a14b53c6b871206001285921e81d8" />
<meta property="wb:webmaster" content="294ec9de89e7fadb" />
<meta property="qc:admins" content="104102651453316562112116375" />
<meta property="qc:admins" content="11635613706305617" />
<meta property="qc:admins" content="1163561616621163056375" />
<meta name="google-site-verification" content="cV4-qkUJZR6gmFeajx_UyPe47GW9vY6cnCrYtCHYNh4" />
<meta name="google-site-verification" content="HF7lfF8YEGs1qtCE-kPml8Z469e2RHhGajy6JPVy5XI" />
<meta http-equiv="mobile-agent" content="format=html5; url=http://localhost/">


<meta name="apple-mobile-web-app-title" content="简书">
<title>简书 - 创作你的创作</title>
<meta name="csrf-param" content="authenticity_token" />
<meta name="csrf-token" content="lFwn5L5XKzBfwZB5EvHN+pcdozh26wz6f6F2FBhL7uV2rsNyJjodvAcVUbQLboTnJleTHkZp3Tee44svUBONsQ==" />

<link rel="stylesheet" media="all" href="//cdn2.jianshu.io/assets/web-85629238feda813871f6.css" />

<link rel="stylesheet" media="all" href="//cdn2.jianshu.io/assets/web/pages/home/index/entry-85629238feda813871f6.css" />

<link href="//cdn2.jianshu.io/assets/favicons/favicon-783beb88ed621ceab614de960376ac0c.ico" rel="icon">
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/57-47624b2e2161e8eb144462c85db0a5ff.png" sizes="57x57" />
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/72-c00cde7cf98fc49e50cbb3ee1dcd5804.png" sizes="72x72" />
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/76-e8af0bdeaf1ba31e303b1fde8b5e66c4.png" sizes="76x76" />
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/114-f4c78569bbf1977e8382a5fd90c9237a.png" sizes="114x114" />
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/120-cf10c3711dba269522743729efe66bbc.png" sizes="120x120" />
<link rel="apple-touch-icon-precomposed" href="//cdn2.jianshu.io/assets/apple-touch-icons/152-7bd60457b5f3ecbf1343f0e6241be4f8.png" sizes="152x152" />
</head>

<body lang="zh-CN" class="reader-black-font">

<nav class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="width-limit">

<a class="logo" href="/">

</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/ad73e614982f">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/05b00dea008f">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/1441f4ae075d">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/e2073c34b346">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/ef4f2422125f">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/cefb44685547">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/023be4bb211b">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/addcee2f9c78">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/46ba19c138d8">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/62478ec15b74">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/94b24595f47c">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/81040fa04d07">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/deeea9e09cbc">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/cb1105e29520">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/16caeb2ea4ab">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/55b597320c4e">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/0166492c089b">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/8e43e13b1c79">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/fec5a2fdd236">



</a>
<div class="content">
<div class="author">
<a class="avatar" target="_blank" href="/u/07edceabfca0">



<div class="note-title">黛玉早报170802 —— 《亲爱的先生,你要爱你的寂寞》</div>
</a> <a target="_blank" class="note" href="/p/1d8d1045d94c?utm_medium=index-jianshu-daily-note&utm_source=desktop">

<div class="note-title">黛玉早报170801 —— 《你还在向朝九晚五的工作要安全感吗?》</div>
</a> </div>

<div data-vcomp="recommended-author-list"></div>
</div>
</div>
</div>
<div data-vcomp="side-tool"></div>
<footer class="container">
<div class="row">
<div class="col-xs-17 main">

<a target="_blank" href="http://www.jianshu.com/c/jppzD2">关于简书</a><em> · </em><a target="_blank" href="http://www.jianshu.com/contact">联系我们</a><em> · </em><a target="_blank" href="http://www.jianshu.com/c/bfeec2e13990">加入我们</a><em> · </em><a target="_blank" href="http://www.jianshu.com/p/ab3a3856cdaf">作者成书计划</a><em> · </em><a target="_blank" href="http://www.jianshu.com/press">品牌与徽标</a><em> · </em><a target="_blank" href="http://www.jianshu.com/faqs">帮助中心</a><em> · </em><a target="_blank" href="http://www.jianshu.com/p/cabc8fa39830">合作伙伴</a> <div class="icp">
©2012-2017 上海佰集信息科技有限公司 / Tel:021-61995350 / 简书 / 沪ICP备11018329号-5 / <a target="_blank" href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=31010602000064">沪公网安备31010602000064号</a>
<a target="_blank" href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=31010602000064">


</a> </div>
</div>
</div>
</footer>
<script type="application/json" data-name="page-data">{"user_signed_in":false,"locale":"zh-CN","os":"other","read_mode":"day","read_font":"font2"}</script>
<script src="//cdn2.jianshu.io/assets/babel-polyfill-e1333b32713277e2cd1f.js"></script>
<script src="//cdn2.jianshu.io/assets/web-base-85629238feda813871f6.js"></script>
<script src="//cdn2.jianshu.io/assets/web-e483d55d248edc3d2c6f.js"></script>
<script src="//cdn2.jianshu.io/assets/web/pages/home/index/entry-ce0e91352ce2a0da92bc.js"></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'UA-35169517-1', 'auto');
ga('send', 'pageview');
</script>

<script>
var _hmt = _hmt || [];
(function() {
var hm = document.createElement("script");
hm.src = "//hm.baidu.com/hm.js?0c0e9d9b1e7d617b3e6842e85b9fb068";
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(hm, s);
})();
</script>

<script>
(function(){
var bp = document.createElement('script');
var curProtocol = window.location.protocol.split(':')[0];
if (curProtocol === 'https') {
bp.src = 'https://zz.bdstatic.com/linksubmit/push.js';
}
else {
bp.src = 'http://push.zhanzhang.baidu.com/push.js';
}
var s = document.getElementsByTagName("script")[0];
s.parentNode.insertBefore(bp, s);
})();
</script>

</body>
</html>
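The script above worked as-is when this was written, but some sites reject requests that lack a browser-like User-Agent header, and a request without a timeout can hang indefinitely. The following is a hedged sketch of a slightly more defensive variant, not the method used for the homework. The names `DEFAULT_HEADERS`, `build_request`, and `fetch_html`, the User-Agent string, and the 10-second timeout are all assumptions chosen for illustration.

```python
import requests

# Hypothetical header value, chosen for illustration only.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; homework-crawler)"}


def build_request(url, headers=None):
    """Prepare a GET request without sending it, so its headers can be inspected."""
    request = requests.Request("GET", url, headers=headers or DEFAULT_HEADERS)
    return request.prepare()


def fetch_html(url, timeout=10):
    """Send the prepared request and return the page source as text."""
    prepared = build_request(url)
    with requests.Session() as session:
        response = session.send(prepared, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of returning an error page.
    response.raise_for_status()
    return response.text


# Inspect the request headers offline; nothing is sent over the network here.
prepared = build_request("http://www.jianshu.com/")
print(prepared.method, prepared.url)
print(prepared.headers["User-Agent"])

# Example call (needs network access):
# print(fetch_html("http://www.jianshu.com/")[:200])
```

Preparing the request separately makes the request headers visible before anything is sent, which ties back to the "request headers" concept listed under crawler basics.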
