Python unicodedata usage

2019-05-29 吃鱼喵了个鱼

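1. unicodedata.lookup(): looks up a character by its name in the Unicode database and returns that character. A KeyError is raised if the name is not found.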
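A minimal sketch of a name lookup, assuming Python 3 and the standard-library unicodedata module. The sample names and the variable names are chosen here for illustration only.

```python
import unicodedata

# Look up characters by their official Unicode names.
brace = unicodedata.lookup("LEFT CURLY BRACKET")
nine = unicodedata.lookup("DIGIT NINE")
print(brace)  # {
print(nine)   # 9

# A name that is not in the database raises KeyError.
try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)  # True
```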


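2. unicodedata.name(): returns the name of a character; it is the inverse of unicodedata.lookup(). If the character has no name, a ValueError is raised unless a default value is supplied.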
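A minimal sketch of looking up names, again assuming Python 3. The "<unnamed>" default string and the variable names are illustrative choices, not part of the module.

```python
import unicodedata

brace_name = unicodedata.name("{")
letter_name = unicodedata.name("A")
print(brace_name)   # LEFT CURLY BRACKET
print(letter_name)  # LATIN CAPITAL LETTER A

# Control characters such as "\n" have no name, so pass a default
# as the second argument to avoid a ValueError.
newline_name = unicodedata.name("\n", "<unnamed>")
print(newline_name)  # <unnamed>

# name() and lookup() are inverses of each other.
round_trip = unicodedata.lookup(unicodedata.name("€"))
print(round_trip == "€")  # True
```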


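3. unicodedata.decimal(): returns the decimal value assigned to a decimal digit character, as an integer. If the character has no decimal value, a ValueError is raised unless a default value is supplied.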
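A minimal sketch, assuming Python 3. The sample characters and the -1 default are picked here for illustration.

```python
import unicodedata

ascii_nine = unicodedata.decimal("9")
arabic_three = unicodedata.decimal("\u0663")  # ARABIC-INDIC DIGIT THREE
print(ascii_nine)    # 9
print(arabic_three)  # 3

# "a" has no decimal value; the second argument is returned instead
# of raising ValueError.
fallback = unicodedata.decimal("a", -1)
print(fallback)  # -1
```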


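4. unicodedata.digit(): converts a valid digit character (a single character, not a longer string) to its integer digit value. If the character has no digit value, a ValueError is raised unless a default value is supplied.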
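A minimal sketch, assuming Python 3. It also shows a character (SUPERSCRIPT TWO) that has a digit value but no decimal value, which is one practical difference between digit() and decimal(). The variable names are illustrative.

```python
import unicodedata

plain = unicodedata.digit("9")
superscript = unicodedata.digit("\u00b2")  # SUPERSCRIPT TWO
circled = unicodedata.digit("\u2460")      # CIRCLED DIGIT ONE
print(plain, superscript, circled)  # 9 2 1

# SUPERSCRIPT TWO is a digit but not a decimal digit, so decimal()
# falls back to the default supplied here.
superscript_decimal = unicodedata.decimal("\u00b2", None)
print(superscript_decimal)  # None

# VULGAR FRACTION ONE HALF has no digit value at all.
half_digit = unicodedata.digit("\u00bd", None)
print(half_digit)  # None
```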


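5. unicodedata.numeric(): converts a character that represents a number to a float. Unlike unicodedata.digit(), it accepts any character with a numeric value (fractions, Roman numerals, CJK numerals and so on), not only the digits 0 to 9. If the character has no numeric value, a ValueError is raised unless a default value is supplied.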
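A minimal sketch, assuming Python 3. The sample characters are chosen here to show values that digit() cannot return.

```python
import unicodedata

nine = unicodedata.numeric("9")
half = unicodedata.numeric("\u00bd")   # VULGAR FRACTION ONE HALF
eight = unicodedata.numeric("\u2167")  # ROMAN NUMERAL EIGHT
four = unicodedata.numeric("四")        # CJK numeral four
print(nine, half, eight, four)  # 9.0 0.5 8.0 4.0

# A letter has no numeric value; the default is returned instead of
# raising ValueError.
not_a_number = unicodedata.numeric("a", None)
print(not_a_number)  # None
```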


6. unicodedata.category(): returns the Unicode general category of a character as a two-letter code string.
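A minimal sketch, assuming Python 3. The sample characters are arbitrary, and each returned code is one of the two-letter general categories.

```python
import unicodedata

samples = ["A", "a", "1", " ", ",", "$", "+", "\n", "中"]

# Map each sample character to its general category code.
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, code in categories.items():
    # repr() keeps whitespace and control characters visible.
    print(repr(ch), code)
```

On a standard CPython build this is expected to report Lu, Ll, Nd, Zs, Po, Sc, Sm, Cc and Lo respectively.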


The Unicode general categories are as follows:

Code Description

[Cc] Other, Control

[Cf] Other, Format

[Cn] Other, Not Assigned (no characters in the file have this property)

[Co] Other, Private Use

[Cs] Other, Surrogate

[LC] Letter, Cased

[Ll] Letter, Lowercase

[Lm] Letter, Modifier

[Lo] Letter, Other

[Lt] Letter, Titlecase

[Lu] Letter, Uppercase

[Mc] Mark, Spacing Combining

[Me] Mark, Enclosing

[Mn] Mark, Nonspacing

[Nd] Number, Decimal Digit

[Nl] Number, Letter

[No] Number, Other

[Pc] Punctuation, Connector

[Pd] Punctuation, Dash

[Pe] Punctuation, Close

[Pf] Punctuation, Final quote (may behave like Ps or Pe depending on usage)

[Pi] Punctuation, Initial quote (may behave like Ps or Pe depending on usage)

[Po] Punctuation, Other

[Ps] Punctuation, Open

[Sc] Symbol, Currency

[Sk] Symbol, Modifier

[Sm] Symbol, Math

[So] Symbol, Other

[Zl] Separator, Line

[Zp] Separator, Paragraph

[Zs] Separator, Space
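Because the first letter of each code identifies the major class (L for letters, N for numbers, P for punctuation and so on), the codes can be used to filter text. The following is a rough sketch under that assumption; strip_punctuation is a hypothetical helper written for this example, not part of unicodedata.

```python
import unicodedata

def strip_punctuation(text):
    """Drop every character whose general category starts with "P"."""
    # Hypothetical helper: keeps letters, digits, spaces and symbols,
    # removes Pc, Pd, Pe, Pf, Pi, Po and Ps characters.
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith("P")
    )

cleaned = strip_punctuation("Hello, world! (test)")
print(cleaned)  # Hello world test
```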

The results of the code above, in order, are as follows:

[Screenshot: debug output]