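The Road to Python Web Scraping: Preparing the Environment
Installing Python 3
1. Installing Python 3 on macOS
On macOS, Homebrew is the recommended way to install Python 3. What is Homebrew? It is the missing package manager for macOS. To install Homebrew, just enter the following in a terminal: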
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
To learn more about Homebrew, visit the Homebrew website.
Once Homebrew is installed, we can start installing Python 3.
1.1 Search for the package:
brew search python3
The result below shows that a python3 package exists and can be installed, as shown in Figure 1-1.
Figure 1-1
1.2 Install the package:
brew install python3
The console output during installation is as follows:
==> Installing dependencies for python: gdbm, openssl, readline, sqlite, xz
==> Installing python dependency: gdbm
==> Downloading https://homebrew.bintray.com/bottles/gdbm-1.14.1_1.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring gdbm-1.14.1_1.high_sierra.bottle.tar.gz
🍺 /usr/local/Cellar/gdbm/1.14.1_1: 20 files, 555.7KB
==> Installing python dependency: openssl
==> Downloading https://homebrew.bintray.com/bottles/openssl-1.0.2o_1.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring openssl-1.0.2o_1.high_sierra.bottle.tar.gz
==> Caveats
A CA file has been bootstrapped using certificates from the SystemRoots
keychain. To add additional certificates (e.g. the certificates added in
the System keychain), place .pem files in
/usr/local/etc/openssl/certs
and run
/usr/local/opt/openssl/bin/c_rehash
This formula is keg-only, which means it was not symlinked into /usr/local,
because Apple has deprecated use of OpenSSL in favor of its own TLS and crypto libraries.
If you need to have this software first in your PATH run:
echo 'export PATH="/usr/local/opt/openssl/bin:$PATH"' >> ~/.bash_profile
For compilers to find this software you may need to set:
LDFLAGS: -L/usr/local/opt/openssl/lib
CPPFLAGS: -I/usr/local/opt/openssl/include
==> Summary
🍺 /usr/local/Cellar/openssl/1.0.2o_1: 1,791 files, 12.3MB
==> Installing python dependency: readline
==> Downloading https://homebrew.bintray.com/bottles/readline-7.0.3_1.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring readline-7.0.3_1.high_sierra.bottle.tar.gz
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides the BSD libedit library, which shadows libreadline.
In order to prevent conflicts when programs look for libreadline we are
defaulting this GNU Readline installation to keg-only.
For compilers to find this software you may need to set:
LDFLAGS: -L/usr/local/opt/readline/lib
CPPFLAGS: -I/usr/local/opt/readline/include
==> Summary
🍺 /usr/local/Cellar/readline/7.0.3_1: 46 files, 1.5MB
==> Installing python dependency: sqlite
==> Downloading https://homebrew.bintray.com/bottles/sqlite-3.23.1.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring sqlite-3.23.1.high_sierra.bottle.tar.gz
==> Caveats
This formula is keg-only, which means it was not symlinked into /usr/local,
because macOS provides an older sqlite3.
If you need to have this software first in your PATH run:
echo 'export PATH="/usr/local/opt/sqlite/bin:$PATH"' >> ~/.bash_profile
For compilers to find this software you may need to set:
LDFLAGS: -L/usr/local/opt/sqlite/lib
CPPFLAGS: -I/usr/local/opt/sqlite/include
==> Summary
🍺 /usr/local/Cellar/sqlite/3.23.1: 11 files, 3MB
==> Installing python dependency: xz
==> Downloading https://homebrew.bintray.com/bottles/xz-5.2.4.high_sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring xz-5.2.4.high_sierra.bottle.tar.gz
🍺 /usr/local/Cellar/xz/5.2.4: 92 files, 1MB
==> Installing python
==> Downloading https://homebrew.bintray.com/bottles/python-3.6.5.high_sierra.bottle.1.tar.gz
######################################################################## 100.0%
==> Pouring python-3.6.5.high_sierra.bottle.1.tar.gz
==> /usr/local/Cellar/python/3.6.5/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/C
==> /usr/local/Cellar/python/3.6.5/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/C
==> /usr/local/Cellar/python/3.6.5/bin/python3 -s setup.py --no-user-cfg install --force --verbose --install-scripts=/usr/local/C
==> Caveats
Python has been installed as
/usr/local/bin/python3
Unversioned symlinks python, python-config, pip etc. pointing to
python3, python3-config, pip3 etc., respectively, have been installed into
/usr/local/opt/python/libexec/bin
If you need Homebrew's Python 2.7 run
brew install python@2
Pip, setuptools, and wheel have been installed. To update them run
pip3 install --upgrade pip setuptools wheel
You can install Python packages with
pip3 install <package>
They will install into the site-package directory
/usr/local/lib/python3.6/site-packages
See: https://docs.brew.sh/Homebrew-and-Python
==> Summary
🍺 /usr/local/Cellar/python/3.6.5: 4,736 files, 99.2MB
1.3 Verify that the installation succeeded.
Open a terminal and run python3 and pip3 -V in turn to check, as shown in Figure 1-2.
Figure 1-2
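The same check can also be scripted. The sketch below is not part of the original walkthrough: it reports the interpreter version and path, and whether a pip3 executable can be found on PATH. The helper name describe_environment is hypothetical.

```python
import shutil
import sys


def describe_environment():
    """Collect basic facts about the interpreter that runs this script."""
    return {
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "is_python3": sys.version_info[0] == 3,
        "executable": sys.executable,
        # shutil.which returns None when pip3 is not on PATH.
        "pip3_path": shutil.which("pip3"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{}: {}".format(key, value))
```

Save it to a file and run it with python3. If pip3_path prints None, pip3 is probably not on PATH in the current shell.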
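Installing request libraries
A crawler's work can be roughly divided into three steps: fetching pages, parsing pages, and storing data.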
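To make the three steps concrete, here is a minimal sketch that uses only the standard library. It assumes a stubbed fetch step that returns a fixed HTML string, so it runs without network access. The names fetch, parse, store, crawl and SAMPLE_HTML are hypothetical, and a real crawler would replace the stub with an actual HTTP request.

```python
import sqlite3
from html.parser import HTMLParser

# Stand-in for a downloaded page so the sketch runs without network access.
SAMPLE_HTML = """
<html><head><title>Demo page</title></head>
<body><a href="/a">A</a><a href="/b">B</a></body></html>
"""


def fetch(url):
    """Step 1: fetch the page (stubbed: always returns SAMPLE_HTML)."""
    return SAMPLE_HTML


class LinkParser(HTMLParser):
    """Step 2: collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def parse(html):
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def store(conn, page_url, links):
    """Step 3: save the extracted links in a SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (page TEXT, href TEXT)")
    conn.executemany(
        "INSERT INTO links VALUES (?, ?)", [(page_url, href) for href in links]
    )
    conn.commit()


def crawl(conn, url):
    """Run the three steps in order and return the links that were stored."""
    links = parse(fetch(url))
    store(conn, url, links)
    return links
```

For a quick try, open an in-memory database with sqlite3.connect(":memory:") and pass it to crawl together with any URL; the stub ignores the URL and parses the sample page.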
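When fetching pages, we need to simulate a browser sending requests to the server, and for that we rely on Python libraries that issue those requests. Common choices include requests, Selenium, and aiohttp.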
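One part of "simulating a browser" is sending a browser-style User-Agent header. The sketch below illustrates the idea with the standard library's urllib.request rather than any of the libraries named above, so it runs before anything is installed. It only builds the request and does not send it. The User-Agent string and the function name are assumptions made for illustration.

```python
from urllib.request import Request

# Hypothetical User-Agent string chosen for illustration only.
BROWSER_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4)"


def build_browser_like_request(url):
    """Build (but do not send) a GET request with a browser-style User-Agent."""
    return Request(url, headers={"User-Agent": BROWSER_UA})


request = build_browser_like_request("https://example.com/")
print(request.get_method(), request.full_url)
# urllib stores header names via str.capitalize(), hence "User-agent" here.
print(request.get_header("User-agent"))
```

Passing the request object to urllib.request.urlopen would actually send it.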
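Installing requests
requests is a third-party library that does not ship with Python, so we have to install it ourselves. References: 1. GitHub | 2. PyPI | 3. Official documentation | 4. Chinese documentation
Install requests with pip by running: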
pip3 install requests
To verify the installation, open a terminal and enter python3 to start the interactive interpreter.
>>> import requests
If no error message appears, requests has been installed successfully.
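The import check can also be done without opening the interactive interpreter. The following sketch uses importlib.util.find_spec, which looks a module up without importing it. The helper name is_installed is hypothetical, and the list of module names simply repeats the libraries mentioned in this article.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("requests", "selenium", "aiohttp"):
        status = "installed" if is_installed(name) else "missing"
        print("{}: {}".format(name, status))
```

Run it with the same python3 that pip3 installs into; a different interpreter may report different results.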
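Installing Selenium
Selenium is an automated testing tool. What can we do with it? It can drive a browser to perform specific actions such as clicking and scrolling. This approach is very effective for pages rendered by JavaScript. References: 1. Official website | 2. GitHub | 3. PyPI | 4. Chinese documentation
Install selenium with pip by running: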
pip3 install selenium
This time the command failed with the following error message:
Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/selenium'
Consider using the `--user` option or check the permissions.
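Let's set this problem aside for now. Note that the error message itself suggests two directions: rerun the install with the --user option, or check the directory permissions.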
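Errno 13 means pip could not write to one of the directories it was installing into. As a rough diagnostic, and not a fix taken from the original article, the sketch below prints the system-wide and per-user package directories of the current interpreter and whether the current user can write to them. The function names are hypothetical, and the directory pip complained about may differ from the ones checked here.

```python
import os
import site
import sysconfig


def install_targets():
    """Return the system-wide and per-user package directories."""
    return {
        # Where a plain `pip3 install` puts pure-Python packages.
        "system": sysconfig.get_paths()["purelib"],
        # Where `pip3 install --user` puts them.
        "user": site.getusersitepackages(),
    }


def is_writable(path):
    """Check write access on the directory or its nearest existing parent."""
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            return False
        path = parent
    return os.access(path, os.W_OK)


if __name__ == "__main__":
    for label, path in install_targets().items():
        print("{}: {} (writable: {})".format(label, path, is_writable(path)))
```

If the system directory is not writable but the user directory is, that is consistent with the `--user` hint in the error message.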
To verify the installation, open a terminal and enter python3 to start the interactive interpreter.
>>> import selenium
No error is raised, so the installation succeeded!