NLP Study Notes: Basic Skills

2019-01-02, by 半笔闪

I. String Operations

1. Stripping whitespace and special characters


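s = ' hello, world!'

# strip() with no argument removes leading and trailing whitespace
print(s.strip())                    # 'hello, world!'

# The argument is a set of characters to remove, not a literal prefix
print(s.lstrip(' hello, '))         # 'world!'

# Only the right end is stripped, so the leading space remains
print(s.rstrip('!'))                # ' hello, world'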
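Because lstrip() and rstrip() treat their argument as a set of characters rather than a prefix or suffix, they can remove more than intended. The sketch below contrasts this with str.removeprefix(), which assumes Python 3.9 or later; the sample string is made up for illustration.

```python
# lstrip(chars) removes any leading characters found in `chars`,
# so it can eat into the text that follows the intended prefix.
s = 'hello, hold on'

stripped = s.lstrip('hello, ')      # strips every leading h/e/l/o/,/space
print(stripped)                     # d on

# str.removeprefix() (Python 3.9+) removes the exact prefix only.
exact = s.removeprefix('hello, ')
print(exact)                        # hold on
```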

2. Concatenating strings


s1 = 'hello'
s2 = 'world'

s = s1 + s2
print(s)                            # helloworld
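The + operator works well for two strings. For a list of pieces, or when a separator is needed, str.join() and f-strings are common alternatives. A minimal sketch with made-up sample values:

```python
words = ['hello', 'world']

# join() builds the result in one pass and lets you choose the separator.
joined = ' '.join(words)
print(joined)                       # hello world

# An f-string is convenient when mixing fixed text and variables.
s1, s2 = 'hello', 'world'
greeting = f'{s1}, {s2}!'
print(greeting)                     # hello, world!
```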

3. Finding a substring with index()


s1 = 'hello'
s2 = 'e'

# index() returns the position of the first match
print(s1.index(s2))                 # 1
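index() raises ValueError when the substring is not present, so calls on untrusted input are usually wrapped in try/except. A small sketch; using None as the fallback value is an assumption made for this example:

```python
s1 = 'hello'

try:
    position = s1.index('z')
except ValueError:
    # index() raises ValueError when the substring is absent
    position = None

print(position)                     # None
```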

4. Comparing strings


# In Python 2 the built-in cmp() function could be used directly;
# it was removed in Python 3.
import operator

s1 = 'hello'
s2 = 'hell'

print(operator.eq(s1, s2))          # False, equivalent to s1 == s2
print(operator.lt(s1, s2))          # False, equivalent to s1 < s2
print(operator.le(s1, s2))          # False, equivalent to s1 <= s2
print(operator.gt(s1, s2))          # True,  equivalent to s1 > s2
print(operator.ge(s1, s2))          # True,  equivalent to s1 >= s2
print(operator.ne(s1, s2))          # True,  equivalent to s1 != s2
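Strings are compared lexicographically by code point, and a string that is a proper prefix of another sorts before it. If a three-way result like Python 2's cmp() is needed, it can be rebuilt from the comparison operators. The cmp helper below is a hypothetical stand-in written for this example, not a built-in:

```python
def cmp(a, b):
    """Hypothetical stand-in for Python 2's cmp(): returns -1, 0 or 1."""
    # True and False behave as 1 and 0 in arithmetic
    return (a > b) - (a < b)

s1 = 'hello'
s2 = 'hell'

print(s1 > s2)                      # True, same as operator.gt(s1, s2)
print(cmp(s1, s2))                  # 1
print(cmp(s2, s1))                  # -1
print(cmp(s1, s1))                  # 0
```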

5. Converting case


s1 = 'Hello'

print(s1.upper())                   # HELLO
print(s1.lower())                   # hello
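For case-insensitive matching of non-English text, lower() is not always enough. str.casefold() applies more aggressive folding and is the usual choice for caseless comparison. A short sketch with sample words chosen for illustration:

```python
a = 'Straße'
b = 'STRASSE'

# lower() keeps the 'ß', so the two strings still differ
lower_equal = a.lower() == b.lower()
print(lower_equal)                  # False

# casefold() maps 'ß' to 'ss', so the comparison succeeds
casefold_equal = a.casefold() == b.casefold()
print(casefold_equal)               # True
```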

6. Reversing a string


s1 = 'hello'

# A slice with step -1 walks the string from the last character to the first
print(s1[::-1])                     # olleh
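The same slicing idiom works on lists, which makes it easy to reverse word order instead of character order. A minimal sketch with a made-up sample string:

```python
text = 'hello world'

# reversed() yields characters from last to first; join() reassembles them.
reversed_chars = ''.join(reversed(text))
print(reversed_chars)               # dlrow olleh

# Reverse the order of the words rather than the characters.
reversed_words = ' '.join(text.split()[::-1])
print(reversed_words)               # world hello
```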

7. Finding a substring with find()


s1 = 'hello'
s2 = 'el'

# find() returns the position of the first match, or -1 if there is none
print(s1.find(s2))                  # 1
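find() also accepts a start position, which makes it possible to collect every occurrence in a loop. The helper name find_all is hypothetical and exists only for this sketch:

```python
def find_all(text, sub):
    """Return the start index of every (possibly overlapping) match of sub."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume searching one character after the previous match
        start = text.find(sub, start + 1)
    return positions

print(find_all('hello', 'l'))       # [2, 3]
print(find_all('hello', 'z'))       # []
```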

8. Splitting a string


s1 = 'I, want, to, say, hello, world'
s2 = ','

# The space after each comma stays attached to the following item
print(s1.split(s2))                 # ['I', ' want', ' to', ' say', ' hello', ' world']
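Since splitting on ',' leaves a leading space on most items, a common follow-up is to strip each piece. A minimal sketch; the second sample string is made up for illustration:

```python
s1 = 'I, want, to, say, hello, world'

# Strip each piece to drop the space that follows every comma
tokens = [part.strip() for part in s1.split(',')]
print(tokens)                       # ['I', 'want', 'to', 'say', 'hello', 'world']

# With no argument, split() separates on any run of whitespace
whitespace_tokens = 'a  b \t c'.split()
print(whitespace_tokens)            # ['a', 'b', 'c']
```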

9. Finding the most frequent letter in a string


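import re
from collections import Counter

def get_max_frequency_char(text):
    # Lowercase first so that 'A' and 'a' are counted together
    text = text.lower()
    # Keep ASCII letters only
    result = re.findall('[a-zA-Z]', text)
    count = Counter(result)
    # Highest count; max() raises ValueError if text contains no letters
    m = max(count.values())
    # On a tie, return the alphabetically first letter
    return sorted([x for (x, y) in count.items() if y == m])[0]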
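As a usage illustration, here is a variant that keeps the same tie-breaking rule as the function above but returns None when the input contains no letters. The name most_frequent_letter and the None fallback are assumptions made for this sketch:

```python
from collections import Counter

def most_frequent_letter(text):
    """Return the most frequent ASCII letter, or None if there are no letters.

    Ties are broken alphabetically.
    """
    letters = [ch for ch in text.lower() if 'a' <= ch <= 'z']
    if not letters:
        return None
    count = Counter(letters)
    # Sort key: highest count first, then alphabetical order
    return min(count, key=lambda ch: (-count[ch], ch))

print(most_frequent_letter('Hello World!'))   # l
print(most_frequent_letter('AaBb'))           # a
print(most_frequent_letter('1234'))           # None
```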

II. Regular Expressions
