Python Basics Practice: Lesson 1 Notes

2017-02-04 Sugeei

Exercise 1

max returns the largest value in a group of numbers

>>> max([5,3,2,1])
5
>>> max(8,4,5)
8
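
Exercise 1 asks for the largest of three numbers found by explicit comparison rather than by calling max. A minimal sketch of that idea follows; the helper name largest_of_three is hypothetical and not part of the original notes.

```python
def largest_of_three(a, b, c):
    """Return the largest of three numbers using plain comparisons."""
    largest = a
    if b > largest:
        largest = b
    if c > largest:
        largest = c
    return largest


# Integers and floats can be mixed freely in the comparison.
print(largest_of_three(3, 7.5, 2))
```

Wrapping the comparisons in a function also covers exercise 2, which asks for the same logic as a reusable function.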

Sorting a list: list.sort

>>> a= [3, 6, 1.1, 4]
>>> a.sort()
>>> a
[1.1, 3, 4, 6]
>>> a[-1]
6

sort works in place on the original list.
sort returns None.

Sorting with sorted()

It is not possible to sort a dict, only to get a representation of a dict that is sorted. Dicts were inherently orderless when this was written (since Python 3.7 they keep insertion order, but they still have no sort method), whereas other types, such as lists and tuples, are ordered. So you need a sorted representation, which will be a list, probably a list of tuples.
For instance,

import operator
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
# sort on values
sorted_x = sorted(x.items(), key=operator.itemgetter(1))

result

[(0, 0), (2, 1), (1, 2), (4, 3), (3, 4)]

sorted_x will be a list of tuples sorted by the second element in each tuple. dict(sorted_x) == x.
And for those wishing to sort on keys instead of values:

import operator
x = {1: 2, 3: 4, 4: 3, 2: 1, 0: 0}
# sort on keys
sorted_x = sorted(x.items(), key=operator.itemgetter(0))

result

[(0, 0), (1, 2), (2, 1), (3, 4), (4, 3)]

Additional example
how-to-sort-python-dictionary-by-keys http://www.saltycrane.com/blog/2007/09/how-to-sort-python-dictionary-by-keys/

# sort a dict by value (Python 3: items() replaces iteritems(), and a lambda can no longer unpack a tuple argument)
for key, value in sorted(mydict.items(), key=lambda kv: (kv[1], kv[0])):
    print("%s: %s" % (key, value))

Using operator.itemgetter

>>> import operator
>>> operator.itemgetter(1)('ABCDEFG')
'B'
>>> operator.itemgetter(1,3,5)('ABCDEFG')
('B', 'D', 'F')

Using itemgetter() to retrieve specific fields from a list of tuples:

>>> from operator import itemgetter
>>> inventory = [('apple', 3), ('banana', 2), ('pear', 5), ('orange', 1)]
>>> getcount = itemgetter(1)
>>> list(map(getcount, inventory))
[3, 2, 5, 1]
>>> sorted(inventory, key=getcount)
[('orange', 1), ('banana', 2), ('apple', 3), ('pear', 5)]

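Exercise 2

The range function

>>> list(range(5))
[0, 1, 2, 3, 4]
>>> list(range(1,5))
[1, 2, 3, 4]

Note that the right boundary is excluded. (In Python 3, range is lazy, so list() is used to display its values.)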

Exercise 3

List slicing: take the last n items with [-n:]

>>> a = [1,2,3,4,5]
>>> a[-3:]
[3, 4, 5]
>>> a[-1:]
[5]

Take every n-th item: [::n]

# Python 3.6
>>> a = range(1,10)
>>> list(a)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Take every 2nd value

>>> list(a[::2])
[1, 3, 5, 7, 9]

Take every 3rd value

>>> list(a[::3])
[1, 4, 7]
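
The slices above can be combined with a list comprehension to cover exercises 3.1 and 3.2. This is a minimal sketch; the variable names are illustrative and not from the original notes.

```python
# Numbers from 1 to 20000 that are divisible by 3 but not by 5.
numbers = [n for n in range(1, 20001) if n % 3 == 0 and n % 5 != 0]

first_ten = numbers[:10]          # the first 10 items
last_five = numbers[-5:]          # the last 5 items
even_index = numbers[::2]         # items at indexes 0, 2, 4, ...
reversed_numbers = numbers[::-1]  # a reversed copy; numbers itself is unchanged

print(first_ten)
print(last_five)
```

Slicing with [::-1] returns a new list, whereas numbers.reverse() would reverse the list in place and return None.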

Exercise 4

math.log computes logarithms

>>> import math
>>> math.log10(100)
2.0
>>> math.log(4,2)
2.0

Custom base

>>> math.log(2,1.2)
3.8017840169239308
>>> math.log(2,1.02)
35.0027887811465
>>> math.log(2,1.03)
23.449772250437736
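
Exercise 4 asks how many years a fixed-rate deposit needs to double. With annual compounding the balance after n years is (1 + rate)**n times the principal, so the answer is the base-(1 + rate) logarithm of 2, which is what the math.log(2, 1.02) and math.log(2, 1.03) calls above compute. A sketch, assuming annual compounding and a hypothetical function name years_to_double:

```python
import math


def years_to_double(rate):
    """Smallest whole number of years after which a deposit at least doubles.

    rate is the yearly interest rate as a fraction, e.g. 0.03 for 3%.
    Assumes interest is compounded once per year.
    """
    if rate <= 0:
        raise ValueError('rate must be positive')
    # Round up: after 23.45 years the deposit has not yet been credited
    # the interest of the 24th year.
    return math.ceil(math.log(2, 1 + rate))


print(years_to_double(0.03))
```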

Exercise 5

Find all perfect numbers below 10000

for i in range(1,10000):
    n=0
    list5=[]
    for j in range(1,i):
        if i%j==0:
            n+=j 
            list5.append(j)
    if i==n:
        print(i,list5)

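1 is treated as a perfect number here as well (strictly, by the usual definition based on proper divisors, it is not), but since range(1,1) is empty the first version cannot report it.
The code is modified as follows:

for i in range(1,10000):
    n=1
    list5=[1]
    for j in range(2, i//2+1):
        if i%j==0:
            n+=j
            list5.append(j)
    if i==n:
        print(i,list5)

When i = 1, the for loop body never runs and execution goes straight to the if check.
range(2, i//2+1) means the inner loop only tests the numbers from 2 to i//2 as divisors, since no proper divisor of i is larger than i/2. (// is integer division; in Python 3, i/2 is a float and range rejects it.)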

Output

# 
1=1 
6=1 +2 +3 
28=1 +2 +4 +7 +14 
496=1 +2 +4 +8 +16 +31 +62 +124 +248 
8128=1 +2 +4 +8 +16 +32 +64 +127 +254 +508 +1016 +2032 +4064
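
The same search can be written with divisors collected in pairs, which keeps the inner loop to about sqrt(n) steps. This is a sketch rather than the original author's method: the function names are hypothetical, and it follows the standard definition, so 1 is not reported.

```python
def proper_divisors(n):
    """Return the sorted divisors of n that are smaller than n."""
    divisors = set()
    d = 1
    while d * d <= n:
        if n % d == 0:
            # d and n // d are both divisors, so record the pair together.
            divisors.add(d)
            divisors.add(n // d)
        d += 1
    divisors.discard(n)  # n itself is not a proper divisor
    return sorted(divisors)


def perfect_numbers(limit):
    """Return every perfect number below limit."""
    return [n for n in range(1, limit) if sum(proper_divisors(n)) == n]


for number in perfect_numbers(10000):
    print(number, proper_divisors(number))
```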

Exercise 6

Using re.findall

>>> import re
>>> article = r'''you are a good student it is my honor to present it, habit,  young, You're my'''
>>> re.findall(r'\w+', article)
['you', 'are', 'a', 'good', 'student', 'it', 'is', 'my', 'honor', 'to', 'present', 'it', 'habit', 'young', 'You', 're', 'my']
>>> print(len(re.findall(r'\w+', article)))
17

re.findall returns a list of all matches
'\w+' matches a word (one or more word characters)
len() returns the length of the list
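
The list returned by re.findall can feed a word-frequency count directly, which is what the word-frequency exercise needs. A minimal sketch, assuming collections.Counter for the tally and lower-casing so that 'You' and 'you' are counted together (both choices are assumptions, not from the original notes):

```python
import re
from collections import Counter

article = "you are a good student it is my honor to present it, habit,  young, You're my"

# Lower-case first so 'You' and 'you' are counted as the same word.
words = re.findall(r'\w+', article.lower())
frequency = Counter(words)

# most_common() orders the entries from the highest count to the lowest.
for word, count in frequency.most_common(3):
    print('%s:%d' % (word, count))
```

To write the result to a file, the same loop could call f.write inside a with open('out.txt', 'w') block instead of print.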

>>> article = r'''you are a good student it is my honor to present it, habit,  young, You're my 1 12 999'''
>>> print(len(re.findall('[0-9]+', article)))
3
>>> print(len(re.findall('[0-9]', article)))
6
>>> print(len(re.findall(r'\d', article)))
6
>>> print(len(re.findall(r'\d+', article)))
3

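What '+' does in a pattern: it repeats the preceding item one or more times, so '\d+' matches a whole run of digits
'\d' matches a single digit

What parentheses do in a regular expression: when the pattern contains a group, findall returns the text of the group rather than the whole match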

>>> s="gaxxIxxefahxxlovexxhoghexxpythonxxghaweoif"  
>>> r=re.compile('xx.*?xx')  
>>> r.findall(s)
['xxIxx', 'xxlovexx', 'xxpythonxx']

>>> r=re.compile('xx(.*?)xx')  
>>> r.findall(s)
['I', 'love', 'python']

Greedy and non-greedy matching in regular expressions

# Greedy matching: match as much as possible
>>> s="gaxxIxxefahxxlovexxhoghexxpythonxxghaweoif"  
>>> r=re.compile('xx.*xx')  
>>> r.findall(s)
['xxIxxefahxxlovexxhoghexxpythonxx']

# Non-greedy matching: match as little as possible
>>> r=re.compile('xx.*?xx')  
>>> r.findall(s)
['xxIxx', 'xxlovexx', 'xxpythonxx']
>>> re.findall('\w+?@\w+?', 'abc@ef@mn')
['abc@e', 'f@m']

Exercise 7

The difference between append() and extend()

>>> a = [1,2,3]
>>> b = [2,3,4]
>>> a.append(b)
>>> a
[1, 2, 3, [2, 3, 4]]

>>> a = [1,2,3]
>>> b = [2,3,4]
>>> a.extend(b)
>>> a
[1, 2, 3, 2, 3, 4]

Both append and extend return None
Both append and extend modify the original list in place

Exercise 8

Defining and initializing a class

Consider how the variables differ between the following two ways of defining a class

class EnglishLearning(object):
    word_list = {}

class EnglishLearning(object):
    def __init__(self):
        self.word_list = {}

Run the code below

#!/usr/bin/python
# -*- coding: utf-8 -*-

class counter(object):
    count = 5
    def __init__(self):
        # __class__ gives access to the class's shared attributes; here it accesses the class variable count
        self.__class__.count += 1
        # self.count = 0
        self.getcount()

    def getcount(self):
        print(self.count)


if __name__ == '__main__':
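    print(counter.count) # initial value is 5
    a = counter() # count is incremented to 6; it is a class variable, so all counter instances share the same count, i.e. they reference the same object
    print(counter.count)  # also 6
    b = counter() # count is incremented again to 7
    print(counter.count)  # also 7  #[Note1]
    b.count += 3  #[Note2]
    b.getcount() # b now has its own instance attribute bound to a new int object with value 10
    a.getcount() # a still reads the class variable, which is still 7
    c = counter() # creating c rebinds the class variable to a new int object, 8; c and the class counter both refer to it # [Note3]
    print(counter.count)  # also 8
    a.getcount() # 8
    b.getcount() # 10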

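The value of count is shared by all instances of the class
The value of count can be modified from outside the class, so this is not a safe way to define a class.

Note1: the count of class counter, of instance a and of instance b all reference the same int object
Note2: the count reference in instance b is rebound to a newly created int object
Note3: after instance c is created, the count of class counter, of instance a and of instance c reference the same int object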
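
The difference between the two EnglishLearning definitions at the start of this exercise shows up most clearly with a mutable attribute. A minimal sketch, using the hypothetical class names SharedWords and OwnWords so that the two variants can coexist:

```python
class SharedWords(object):
    word_list = {}          # class attribute: one dict shared by every instance


class OwnWords(object):
    def __init__(self):
        self.word_list = {}  # instance attribute: a fresh dict per instance


a, b = SharedWords(), SharedWords()
a.word_list['apple'] = 'a fruit'   # mutates the single shared dict
print(b.word_list)                 # b sees the entry added through a

c, d = OwnWords(), OwnWords()
c.word_list['apple'] = 'a fruit'   # only c's own dict changes
print(d.word_list)                 # d's dict is still empty
```

Unlike count += 1 on an int, which rebinds the name to a new object, adding a key mutates the dict in place, so with the class-level dict every learner would end up sharing one word list.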

Exercise 9

Using eval

>>> type(eval('12'))
<type 'int'>
>>> type(eval('12.2'))
<type 'float'>
>>> type(eval('a'))  # works only because a variable named a, bound to a string, already exists in this session
<type 'str'>
>>> type(eval('a1'))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'a1' is not defined

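Use if int == type(eval(input)) to check whether the input is an integer. It distinguishes integers from floats correctly. Use try...except to handle the exception raised when the input is letters. Keep in mind that eval executes whatever expression it is given, so it should not be used on untrusted input.

Code example

#!/usr/bin/python
# -*- coding: utf-8 -*-

def check_input(prompt):
    while True:
        rawinput = raw_input(prompt + ": ")  # Python 2; use input() in Python 3
        try:
            if int == type(eval(rawinput)):
                return rawinput
        except Exception:
            # eval raised, e.g. NameError for an undefined name
            print('Input must be an integer! Please try again!')
        else:
            # eval succeeded but the result is not an int
            print('Input must be an integer! Please try again!')

def handleInput():
    num1 = check_input('Please enter the first integer')
    num2 = check_input('Please enter the second integer')
    print('First number:' + num1)
    print('Second number:' + num2)
    print('Sum of the two numbers:'+ str(int(num1)+int(num2)))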

a = 11
handleInput()

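Before calling handleInput(), create a variable 'a' and assign it a value. Run in debug mode, enter 'a' at the prompt for the first integer and an integer at the prompt for the second, and trace the code.
Delete the assignment 'a = 11' and run in debug mode again, still entering 'a' at the first prompt and an integer at the second, and trace the code.
Compare how the two runs differ, and why?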

try/except/else/finally

The statements in the else block are executed if execution falls off the bottom of the try - if there was no exception. Honestly, I've never found a need.

However, Handling Exceptions notes:

The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn’t raised by the code being protected by the try ... except statement.

So, if you have a method that could, for example, throw an IOError, and you want to catch exceptions it raises, but there's something else you want to do if the first operation succeeds, and you don't want to catch an IOError from that operation, you might write something like this:

    try:
        operation_that_can_throw_ioerror()
    except IOError:
        handle_the_exception_somehow()
    else:
        # we don't want to catch the IOError if it's raised
        another_operation_that_can_throw_ioerror()
    finally:
        something_we_always_need_to_do()

If you just put another_operation_that_can_throw_ioerror() after operation_that_can_throw_ioerror, the except would catch the second call's errors. And if you put it after the whole try block, it'll always be run, and not until after the finally. The else lets you make sure

the second operation's only run if there's no exception,
it's run before the finally block, and
any IOErrors it raises aren't caught here
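
The order in which the four clauses run can be checked with a small runnable sketch. The function name trace_parse and the events list are hypothetical, introduced only to record which branches execute:

```python
def trace_parse(text):
    """Convert text to int and record which clauses of the statement ran."""
    events = []
    try:
        value = int(text)
    except ValueError:
        events.append('except')   # runs only when int() raised
        value = None
    else:
        events.append('else')     # runs only when the try body raised nothing
    finally:
        events.append('finally')  # always runs, after except or else
    return value, events


print(trace_parse('12'))
print(trace_parse('a'))
```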

Using isinstance

>>> isinstance('12', int)
False
>>> isinstance('12', str)
True
>>> isinstance(12, str)
False
>>> isinstance(12, int)
True
>>> isinstance(12.2, int)
False
>>> isinstance(12.2, float)
True

Using isdigit

>>> '12'.isdigit()
True
>>> 'a'.isdigit()
False
>>> '1.2'.isdigit()
False
>>> '-1'.isdigit()
False

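An alternative approach

#!/usr/bin/python
# -*- coding: utf-8 -*-

import re

def check_input(prompt):
    while True:
        rawinput = raw_input(prompt + ": ")  # Python 2; use input() in Python 3
        # Split all possible inputs into two kinds: negative numbers, which start with '-', and everything else (non-negative numbers and other text)
        if re.match('-', rawinput):
            digits = rawinput[1:]   # drop the sign to get the absolute value of a negative number
        else:
            digits = rawinput
        if digits.isdigit():
            return rawinput
        else:
            print('Input must be an integer! Please try again!')

def handleInput():
    num1 = check_input('Please enter the first integer')
    num2 = check_input('Please enter the second integer')
    print('First number:' + num1)
    print('Second number:' + num2)
    print('Sum of the two numbers:'+ str(int(num1)+int(num2)))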

handleInput()
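
A third option, sketched here for Python 3, is to let int() do the validation and catch the ValueError it raises for text such as 'a' or '1.2'. The function name ask_integer and its read parameter are hypothetical; read defaults to the built-in input and exists only so the loop can be exercised without a keyboard.

```python
def ask_integer(prompt, read=input):
    """Keep asking until the text that was read converts cleanly with int()."""
    while True:
        text = read(prompt + ': ')
        try:
            return int(text)
        except ValueError:
            print('Input must be an integer! Please try again!')


# Simulate a user typing 'a', then '1.2', then '-7'.
answers = iter(['a', '1.2', '-7'])
print(ask_integer('Please enter an integer', read=lambda _: next(answers)))
```

This avoids eval entirely and handles the minus sign without a separate branch.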

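Exercise 10

Recommended way to work with files:

with open(filename, 'r+') as fp:  # 'r+' allows both reading and writing
    data = fp.read()
    fp.write('new text')

The file is closed properly even if an exception is raised inside the with block.

#!/usr/bin/python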
# -*- coding: utf-8 -*-
# read a file, replace some words and write back

import re

with open('r.txt', 'r') as f:
    c = f.read()
    o = re.sub('y', '**',c)
    print(o)

with open('a.txt','w') as f:
    f.write(o)

Using try...except

try:
    open('x.txt','r') # x.txt does not exist
except FileNotFoundError as e:
    print(e)

If you are not sure which exception will be raised, catch Exception directly:

except Exception as e:
    print(e)

Opening a file: 'r', 'w', 'r+', 'w+'

'r' opens the file for reading. 'r' is the default, so it can be omitted.

open(path, 'r')
open(path)

'r+' opens the file for both reading and writing; the file is not truncated.

open(path, 'r+')

'w' opens the file for writing (truncates the file to 0 bytes)

open(path, 'w')

w+ opens the file for writing (truncates the file to 0 bytes) but also lets you read from it

open(path, 'w+')
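
The with statement and the exception handling above can be combined into the core of exercise 10. A minimal Python 3 sketch: the function name read_text is hypothetical and the messages are English renderings of the ones the exercise asks for.

```python
def read_text(path):
    """Return (content, message); exactly one of the two is None."""
    try:
        with open(path, 'r') as f:
            return f.read(), None
    except FileNotFoundError:
        return None, 'The file was not found!'
    except Exception:
        # Any other failure, such as a permission or decoding error.
        return None, 'The file could not be read!'


content, message = read_text('no_such_file_for_this_demo.txt')
print(message)
```

A caller can loop on this function, asking for a new path whenever message is not None.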

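Exercise 11

Goals for the replacement

Match the target word only, and skip other words that contain it: when matching you or it, young and edit must not match.
Punctuation around a replaced word is kept: you're becomes **'re, with the apostrophe preserved.
Target words at the start and end of the text can be matched.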

Optimization 1: \b

>>> article = r'''you are a good student it is my honor to present it, habit,  young, You're my'''
>>> result = re.sub(r'\b(I|he|she|it|we|they|me|you|him|her|us|them)\b', "**", article, flags=re.IGNORECASE)
>>> result
"** are a good student ** is my honor to present **, habit,  young, **'re my"

re.sub(pattern, repl, string) replaces every match of pattern in string with repl
re.IGNORECASE makes the match case-insensitive
\b matches a word boundary: the empty position between a word character and a non-word character, or the start or end of the string

Optimization 2: (?<!...) and (?!...)

Compare version 1 and version 2:

# version 1
>>> article = r'''you are a good student it is my honor to present it, habit,  young, You're my'''
>>> re.sub('(\Wyou\W)|(\Wit\W)', '**', article, flags=re.IGNORECASE)
'you are a good student**is my honor to present** habit,  young,**re my'

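The problem with version 1 is that the characters before and after the target word are replaced as well, and the you at the start of the string is missed.

# version 2
>>> re.sub(r'(?<!\w)you(?!\w)|(?<!\w)it(?!\w)', '**', "You're it is t young tyou it's item you",flags=re.IGNORECASE)
"**'re ** is t young tyou **'s item **"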

(?<!...) Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
(?!...) Matches if ... doesn't match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it's not followed by 'Asimov'.

\b vs \W

Why does this not match?

>>> import re
>>> a = 'you you ayoun you'
>>> re.findall('\byou\b', a) 
[]

\W does match, but only the you that has a non-word character on both sides

>>> re.findall('\Wyou\W', a)
[' you ']

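With the r prefix on the pattern the match succeeds: in an ordinary string '\b' is the backspace character, while r'\b' reaches the regex engine as a word boundary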

>>> re.findall(r'\byou\b', a)
['you', 'you', 'you']

This works as well; the start and end of the string are matched.

>>> a = '''you're you ayoun you'''
>>> re.findall(r'\byou\b', a)
['you', 'you', 'you']

Now try \W again. \W fails at the start and end of the string, because it has to consume an actual character.

>>> re.findall('\Wyou\W', a)
[' you ']

Reference: http://www.aichengxu.com/python/8521648.htm

Optimization 3: using group()

>>> re.sub('\W(you|it)\W', lambda m:m.group(0).replace(m.group(1), '**'), article,flags=re.IGNORECASE)
"you are a good student ** is my honor to present **, habit,  young, **'re my"

Basic syntax

>>> a = re.search('\W(you)\W', r''' you are a good student， it is my honor to present it, habit,  young, You're my''',
              flags=re.IGNORECASE)
>>> a.group(0)
' you '
>>> a.group(1)
'you'
>>> a.group(0).replace(a.group(1),'**')
' ** '

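What lambda m:m.group(0).replace(m.group(1), '**') means: m.group(0) is the whole match including the surrounding characters and m.group(1) is the word itself, so only the word inside the match is replaced and the surrounding characters are kept.
This method cannot match "you" at the start or end of the text; in the example above, the 'you' at the start of the text was not replaced. One option is to pad the text with a non-word character such as a space or a quote at both ends after reading it, and then match.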

>>> article = r'''you are a good student it is my honor to present it, habit,  young, You're my'''
>>> article = ' ' + article + ' '
>>> article
" you are a good student it is my honor to present it, habit,  young, You're my "
>>> re.sub('\W(you|it)\W', lambda m:m.group(0).replace(m.group(1), '**'), article,flags=re.IGNORECASE)
" ** are a good student ** is my honor to present **, habit,  young, **'re my "

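For use on a whole novel, the word-boundary approach can be wrapped in a function with a precompiled pattern. This is a sketch rather than the original author's code: the names PRONOUNS, PRONOUN_RE and mask_pronouns are hypothetical. Because of re.IGNORECASE, capitalized words such as 'US' or 'IT' would be masked too.

```python
import re

PRONOUNS = ('I', 'you', 'he', 'she', 'it', 'we', 'they',
            'me', 'him', 'her', 'us', 'them')

# \b on both sides keeps 'young', 'habit' and 'item' from matching.
PRONOUN_RE = re.compile(r'\b(?:' + '|'.join(PRONOUNS) + r')\b', re.IGNORECASE)


def mask_pronouns(text):
    """Replace every standalone pronoun in text with '**'."""
    return PRONOUN_RE.sub('**', text)


article = "you are a good student it is my honor to present it, habit,  young, You're my"
print(mask_pronouns(article))
```

No padding of the text is needed here, because \b matches at the start and end of the string without consuming a character.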
Exercise 12

Ways to traverse a folder: os.walk vs os.listdir

#!/usr/bin/python
# -*- coding: utf-8 -*-

import os

# os.listdir
a = os.listdir(os.getcwd())
print(a)
print('-' * 20)
# output looks like ['a.txt', 'js', 'js2', 'tess.py']

# os.walk
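for dirpath, dirnames, filenames in os.walk(os.getcwd()):
    print(dirpath)      # the directory currently being visited
    print('=' * 20)
    print(dirnames)     # names of its subdirectories
    print('*' * 20)
    for name in filenames:
        print(os.path.join(dirpath, name))  # full path of each file
    print('+' * 20)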

Converting between dates and strings: the time module

>>> import time
>>> timestr = '2017-01-11'
>>> timeformat = time.strptime(timestr, '%Y-%m-%d')
>>> timeformat
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=11, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=11, tm_isdst=-1)
>>> weekday = timeformat.tm_wday 
>>> weekday
2

# Note how the return value maps to the day of the week
# Monday through Sunday correspond to 0 through 6
>>> time.strptime('2017-01-14', '%Y-%m-%d')
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=14, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=5, tm_yday=14, tm_isdst=-1)
>>> time.strptime('2017-01-15', '%Y-%m-%d')
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=15, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=15, tm_isdst=-1)
>>> time.strptime('2017-01-16', '%Y-%m-%d')
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=16, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=16, tm_isdst=-1)
>>> time.strptime('2017-01-17', '%Y-%m-%d')
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=17, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=1, tm_yday=17, tm_isdst=-1)

Converting between dates and strings: the datetime module

# Monday through Sunday correspond to 0 through 6
>>> import datetime
>>> datetime.datetime.strptime('2017-01-17', '%Y-%m-%d')
datetime.datetime(2017, 1, 17, 0, 0)
>>> datetime.datetime.strptime('2017-01-17', '%Y-%m-%d').weekday()
1
>>> datetime.datetime.strptime('2017-01-16', '%Y-%m-%d').weekday()
0
>>> datetime.datetime.strptime('2017-01-15', '%Y-%m-%d').weekday()
6
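
The date parsing above is enough to build the new file name that exercise 12 asks for. A sketch under stated assumptions: the names NAME_RE and renamed are hypothetical, and W is taken to be isoweekday() (Monday = 1 through Sunday = 7), which is one reading of "Monday is the first day of the week".

```python
import re
from datetime import datetime

# Matches names such as output_2016.10.21.txt and captures the date part.
NAME_RE = re.compile(r'^output_(\d{4}\.\d{2}\.\d{2})\.txt$')


def renamed(filename):
    """Return the output_YYYY-MM-DD-W.txt name, or None if filename does not qualify."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group(1), '%Y.%m.%d')
    except ValueError:
        return None  # right shape but not a real date, e.g. month 13
    return 'output_%s-%d.txt' % (day.strftime('%Y-%m-%d'), day.isoweekday())


print(renamed('output_2016.10.21.txt'))
```

Combined with os.listdir or os.walk, each qualifying file could then be passed to os.rename with the name this function returns.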

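Supplementary material

Object references

id() returns a value closely tied to the memory address; you can think of it as the object's address number in memory.
The same number means the same object.
In the example below, when b=10 runs, Python does not allocate new memory but adds a reference that points the variable b at the object with id 30844492, whose value is 10. (CPython caches small integers, so this sharing is an implementation detail rather than a guarantee for every value.)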

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
>>> a = 10
>>> id(a)
30844492
>>> b = 10
>>> id(b)
30844492

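Reference: https://my.oschina.net/leejun2005/blog/145911 (recommended)

Mutable and immutable objects

In Python, everything is an object.
Numbers and strings are immutable objects
Lists and dicts are mutable objects
Reference: http://www.tuicool.com/articles/3ym6zqI

# The following shows that strings cannot be modified in place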
>>> a = 'abc'
>>> b = a.replace('a','A')
>>> b
'Abc'
>>> a
'abc'

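1. Define any three numbers (a mix of integers and floats), compare them, and print the largest.
2. Rewrite the previous exercise as a function that outputs the largest of its three input values.
3.1 Use a list comprehension to generate all numbers from 1 to 20000 that are divisible by 3 but not by 5.
3.2 Practice slicing: from that list take the first 10 numbers, the last 5 numbers and the numbers at even indexes, then reverse the list.
4. Define a function for a small task: given a bank's fixed-deposit interest rate (input), compute after how many years principal plus interest doubles.
5. A number that is exactly equal to the sum of its factors is called a "perfect number". For example, the factors of 6 are 1, 2 and 3, and 6=1+2+3, so 6 is a perfect number. Write a program that finds all perfect numbers below 10000 and prints each one with its factors.
6. Copy the HTML source of a web page into input.txt, then count and print the number of English letters, spaces, digits and other characters in it.
7. Copy a passage of English text from the web (the longer the better) into input.txt, count the frequency of each word (number of occurrences), and write the words to out.txt in order of frequency, one "word:count" per line.
8. Wang wants to use his computer to record the English words he masters each day. Design a program and a suitable data structure so that Wang can record newly learned English words with their Chinese translations and easily look up the Chinese from the English. Implement a class that supports: 1. memorizing words (adding a word) 2. looking up words (translating a word) 3. changing a word's meaning 4. going through past CET-4/CET-6 exam papers to find high-frequency words and sort them by frequency
9. Write a function that takes two integers and prints their sum. If the input is not an integer (a letter, a float, etc.), the program terminates and prints the exception information (exception handling). Modify the program so that on non-integer input it shows the message "Input must be an integer!" and asks the user to enter the value again until it is valid.
10. Ask for a file path or file name and check whether the file exists. If it does, open the file and print its contents to the screen; if not, show "The file was not found!" and ask for input again; if the file exists but an exception occurs while reading it, show "The file could not be read!" and ask for input again. (Hint: use exception handling. The exception for "file not found" is FileNotFoundError; match any other exception with a plain except.)
11. Find an English novel (converted to txt format) and replace all the pronouns in it (I, you, he, she, it, we, you, they, me, you, him, her, it, us, you, them) with **
12. Find all files in a folder whose names have the form output_YYYY.MM.DD.txt (e.g. output_2016.10.21.txt). Read the date from each file name and work out which day of the week it is. Rename the file to output_YYYY-MM-DD-W.txt (YYYY: four-digit year, MM: two-digit month, DD: two-digit day, W: one-digit day of the week, with Monday as the first day of the week)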

What you should know about encodings

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)