Python Web Scraping ———08.01.201
Python requests, brew, postgresql
Just writing down what I've learned about web data scraping so that I won't forget everything and have to start all over the next time I need the technique.
Today let's introduce "requests"
import requests
See the requests documentation for details of requests.request/get/post/put/patch/delete. requests is very powerful in terms of how much it takes in and returns: you can specify headers, timeout, and proxies all in one call. urllib is messy by comparison: if you are using Python 2.x, it can be really annoying trying to figure out urllib and urllib2 and how their features map onto six.moves.urllib.
Sending the request returns a requests.Response object. You can look at its raw, text, or content attributes to see what is inside (raw is only useful if the request was made with stream=True). Furthermore, you can call response.json() to decode a JSON body into Python objects.
Powerful tool.
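To make the notes above concrete, here is a minimal sketch of passing headers, params, timeout, and proxies in one call. The User-Agent string, the example.com URL, and the helper names fetch and preview are my own placeholders, not anything requests defines. preview only builds the request without sending it, which is handy for checking what would go over the wire.

```python
import requests

# Hypothetical header set; the User-Agent value is a placeholder.
HEADERS = {"User-Agent": "my-scraper/0.1"}


def fetch(url, params=None, timeout=10, proxies=None):
    """Send a GET request with headers, timeout, and proxies in one call."""
    response = requests.get(
        url,
        headers=HEADERS,
        params=params,
        timeout=timeout,   # seconds; without it the call can hang forever
        proxies=proxies,   # e.g. {"https": "http://127.0.0.1:8080"} or None
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of failing silently
    return response


def preview(url, params=None):
    """Build the request without sending it, to inspect the URL and headers."""
    return requests.Request("GET", url, headers=HEADERS, params=params).prepare()
```

With a response in hand, response.text gives the decoded string, response.content gives the raw bytes, and response.json() parses a JSON body. For a dry run, preview("https://example.com/api", params={"page": 2}).url shows the final URL with the query string attached.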
Getting familiar with "brew" and "postgresql"
I need a database to store all the data I have scraped, and after some research and talking to people, I decided to go with "postgresql."
I ended up using "brew" for the install. It is easy to check whether "brew" is already installed:
apple$ which brew
It turned out that "brew" was already installed, so I updated it and checked for problems:
apple$ brew update
apple$ brew doctor
To install "postgresql" with "brew" and then search for the formula to confirm it is there:
apple$ brew install postgresql
apple$ brew search postgresql
After installation, you can start and stop postgresql:
apple$ brew services start postgresql
apple$ brew services list #check which services are running
apple$ brew services stop postgresql
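The same kind of check as "which brew" can be done from Python, which is useful at the top of a scraping script that depends on these tools. This is only a sketch using the standard library; the function names find_tools and missing_tools are mine, and the default list of commands is an assumption about what the script needs.

```python
import shutil


def find_tools(names=("brew", "psql", "pg_ctl")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=("brew", "psql", "pg_ctl")):
    """Return the command names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

If missing_tools() returns a non-empty list, the script can stop early with a clear message instead of failing halfway through.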
Other useful commands:
apple$ brew outdated
apple$ brew upgrade
apple$ brew cleanup
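#These read the way they sound: outdated lists formulae that have newer versions, upgrade upgrades them, and cleanup removes old versions and stale downloads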
This website could be useful as reference:
http://blog.csdn.net/evane1890/article/details/38759073
Getting familiar with working with "postgresql" in the terminal
Make sure that you have started postgres:
apple$ brew services start postgres
OR
apple$ pg_ctl -D /usr/local/var/postgres start
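To confirm from a script that the server really came up, one simple option is to test whether anything accepts TCP connections on the postgres port. This sketch assumes the default localhost:5432; the function name is my own. It only shows that the port is open, not that it is postgres answering or that a login would succeed.

```python
import socket


def postgres_is_listening(host="localhost", port=5432, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing is listening; the with-block closes the socket again.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```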
Super useful website:
https://www.codementor.io/devops/tutorial/getting-started-postgresql-server-mac-osx
Creating a user and a database:
psql postgres #connect to the default database named postgres
postgres=# \du #check on roles and each role's access
CREATE ROLE username WITH LOGIN PASSWORD 'quoted password' [OPTIONS]; #create a role with a login password
postgres=# ALTER ROLE janton CREATEDB; #allow the role to create databases
postgres=# \q
psql postgres -U janton #reconnect as the new user
postgres=> CREATE DATABASE teld;
postgres=> GRANT ALL PRIVILEGES ON DATABASE teld TO janton;
postgres=> \list
postgres=> \connect teld;
postgres=> \dt #check on the tables
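Since the point of the database is to hold scraped data, here is a rough sketch of writing rows into teld from Python. It assumes the third-party psycopg2 driver is installed, that the local server lets janton connect without a password, and that a table called pages with url and body columns is what you want. The table, the columns, and the helper names are all hypothetical.

```python
# Hypothetical table for scraped pages; adjust the columns to your data.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS pages (
    id   SERIAL PRIMARY KEY,
    url  TEXT NOT NULL,
    body TEXT
)
"""
INSERT_SQL = "INSERT INTO pages (url, body) VALUES (%s, %s)"


def build_dsn(dbname="teld", user="janton", host="localhost", port=5432):
    """Build a libpq-style connection string for the database created above."""
    return f"dbname={dbname} user={user} host={host} port={port}"


def save_pages(rows, dsn=None):
    """Insert (url, body) tuples into the pages table, creating it if needed."""
    import psycopg2  # third-party driver, assumed installed (pip install psycopg2-binary)

    conn = psycopg2.connect(dsn or build_dsn())
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                cur.execute(CREATE_TABLE_SQL)
                cur.executemany(INSERT_SQL, rows)
    finally:
        conn.close()  # "with conn" does not close the connection itself
```

After save_pages([("https://example.com", "<html>...</html>")]) runs, \dt inside psql should list the pages table.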