[Reading notes: r4ds] 14. Strings

2019-10-11  茶思饭

Read online:
R for Data Science
GitHub: https://github.com/hadley/r4ds


II Data Wrangle

Data wrangling consists of three steps: import, tidy, and transform.

This chapter covers string manipulation, mainly with the stringr package.

library(tidyverse)
library(stringr)

14.2 String basics

double_quote <- "\""  # or '"'
single_quote <- '\''  # or "'"

Alternatively, avoid escaping altogether by using the other kind of quote on the outside: write ' inside " ", and " inside ' '.

x <- c("\"", "\\")
x
#> [1] "\"" "\\"
writeLines(x)
#> "
#> \
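# How control characters are displayed depends on the console; the outputs below are from the author's session.
a <- "abc\\efg\r12456"   # "\r" is a carriage return; "\\" is a literal backslash
a
#> [1] "abc\\efg\r12456"
writeLines(a)            # the cursor returns to the start of the line, so "12456" overwrites "abc\e" and the rest stays
#> 12456fg
a <- "abc\\efg\b12456"   # "\b" is a backspace; here the console erased the preceding "g"
writeLines(a)
#> abc\ef12456
a <- "abc\\efg\a12456"   # "\a" is the alert (bell) character; this console showed it as a placeholder glyph
writeLines(a)
#> abc\efg�12456
a <- "abc\\efg\f12456"   # "\f" is a form feed; this console cleared the page and left only "12456"
writeLines(a)
#> 12456
a <- "abc\\efg\v123456"  # "\v" is a vertical tab; this console also showed it as a placeholder glyph
writeLines(a)
#> abc\efg�123456
a <- "abc\\efg\12456"    # "\124" is read as an octal character code and inserts "T"
writeLines(a)
#> abc\efgT56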
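Because consoles render control characters differently, it can help to inspect what an escape sequence actually stores. The following is a minimal sketch using only base R (`utf8ToInt()` and `nchar()`); the variable names are illustrative and not from the book.

```r
# Each escape sequence is a single character; utf8ToInt() shows its code point.
escapes <- c(bell = "\a", backspace = "\b", tab = "\t", newline = "\n",
             vtab = "\v", formfeed = "\f", cr = "\r")
codes <- vapply(escapes, utf8ToInt, integer(1))
codes                      # named integer vector: 7, 8, 9, 10, 11, 12, 13

n_chars <- nchar("a\\b")   # 3: the escaped backslash counts as one character
octal_T <- "\124"          # octal 124 is decimal 84, the letter "T"
identical(octal_T, "T")    # TRUE
```

Looking at code points sidesteps the question of how a given terminal draws a bell or a form feed.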
str_c("x", "y")
#> [1] "xy"
str_c("x", "y", sep = ", ")
#> [1] "x, y"
str_c("prefix-", c("a", "b", "c"), "-suffix")
#> [1] "prefix-a-suffix" "prefix-b-suffix" "prefix-c-suffix"
name <- "Hadley"
time_of_day <- "morning"
birthday <- FALSE

str_c(
  "Good ", time_of_day, " ", name,
  if (birthday) " and HAPPY BIRTHDAY",
  "."
)
#> [1] "Good morning Hadley."
x <- c("abc", NA)
str_c("|-", x, "-|")
#> [1] "|-abc-|" NA
str_c("|-", str_replace_na(x), "-|")
#> [1] "|-abc-|" "|-NA-|"
str_c(c("x", "y", "z"), collapse = ", ")
#> [1] "x, y, z"
x <- c("Apple", "Banana", "Pear")
str_sub(x, 1, 3)
#> [1] "App" "Ban" "Pea"
# negative numbers count backwards from end
str_sub(x, -3, -1)
#> [1] "ple" "ana" "ear"

-- str_sub() does not fail when the string is too short; it returns as much as it can.

str_sub("a", 1, 5)
#> [1] "a"

-- str_sub() can also be used on the left side of an assignment to modify part of a string.

str_sub(x, 1, 1) <- str_to_lower(str_sub(x, 1, 1))
x
#> [1] "apple"  "banana" "pear"

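-- str_to_lower() converts to lower case
-- str_to_upper() converts to upper case
-- str_to_title() converts to title case: the first letter of every word is capitalised
-- str_to_sentence() converts to sentence case: only the first letter of each sentence is capitalised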
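A quick side-by-side of the four case functions listed above. This is a small sketch; the input string is made up for illustration, and `str_to_sentence()` assumes stringr 1.4.0 or later.

```r
library(stringr)

s <- "the quick brown fox"

lower    <- str_to_lower("R For Data Science")  # "r for data science"
upper    <- str_to_upper(s)                     # "THE QUICK BROWN FOX"
title    <- str_to_title(s)                     # "The Quick Brown Fox"
sentence <- str_to_sentence(s)                  # "The quick brown fox"
```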

x <- c("apple", "eggplant", "banana")
str_sort(x, locale = "en")  # English
#> [1] "apple"    "banana"   "eggplant"
str_sort(x, locale = "haw") # Hawaiian
#> [1] "apple"    "eggplant" "banana"

-- en: English
-- zh: Chinese
-- fr: French
-- ja: Japanese
-- de: German
-- es: Spanish
...

Exercises:

Write a function that turns (e.g.) a vector c("a", "b", "c") into the string a, b, and c. Think carefully about what it should do if given a vector of length 0, 1, or 2.

str_commasep <- function(x, delim = ",") {
  n <- length(x)
  x <-str_replace_na(x)
  if (n == 0) {
    ""
  } else if (n == 1) {
    x
  } else if (n == 2) {
    # no comma before and when n == 2
    str_c(x[[1]], "and", x[[2]], sep = " ")
  } else {
    # commas after all n - 1 elements
    not_last <- str_c(x[seq_len(n - 1)], delim)
    # prepend "and" to the last element
    last <- str_c("and", x[[n]], sep = " ")
    # combine parts with spaces
    str_c(c(not_last, last), collapse = " ")
  }
}
str_commasep("")
#> [1] ""
str_commasep("a")
#> [1] "a"
str_commasep(c("a", "b"))
#> [1] "a and b"
str_commasep(c("a", "b", "c"))
#> [1] "a, b, and c"
str_commasep(c("a", "b", "c", "d"))
#> [1] "a, b, c, and d"
## Author: Richard_Zhou
## Link: https://www.jianshu.com/p/4790b00dc238

14.3 Matching patterns with regular expressions


x <- "a\\b"
writeLines(x)
#> a\b

str_view(x, "\\\\")

14.3.2.1 Exercises

1.How would you match the literal string "$^$"?

str_view("$^$","\\$\\^\\$")
  2. Given the corpus of common words in stringr::words, create regular expressions that find all words that:
    Start with “y”.
    End with “x”
    Are exactly three letters long. (Don’t cheat by using str_length()!)
    Have seven letters or more.
str_view(stringr::words,"^y",match=T)
str_view(stringr::words, "x$",match=T)
str_view(stringr::words,"^...$",match=T)
str_view(stringr::words, "^.......",match=T)

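Other pattern-matching tools:

\d: matches any digit.
\s: matches any whitespace (e.g. space, tab, newline).
[abc]: matches a, b, or c.
[^abc]: matches anything except a, b, or c.

Inside [], the characters $ . | ? * + ( ) [ { match themselves without a "\". A few characters keep a special meaning inside [] and must still be escaped with a backslash to match literally: ] \ ^ and -.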
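The classes above can be tried out with str_detect(). This is a small sketch on a made-up vector; note that in an R string each regex backslash has to be doubled.

```r
library(stringr)

x <- c("abc", "a1", "a b", "a.c", "x^y")

has_digit     <- str_detect(x, "\\d")       # FALSE  TRUE FALSE FALSE FALSE
has_space     <- str_detect(x, "\\s")       # FALSE FALSE  TRUE FALSE FALSE
only_abc      <- str_detect(x, "^[abc]+$")  #  TRUE FALSE FALSE FALSE FALSE
has_non_abc   <- str_detect(x, "[^abc]")    # FALSE  TRUE  TRUE  TRUE  TRUE
literal_dot   <- str_detect(x, "[.]")       # "." is literal inside []: only "a.c"
literal_caret <- str_detect(x, "[\\^]")     # "^" must be escaped inside []: only "x^y"
```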

14.3.3.1 Exercises

1.Create regular expressions to find all words that:
Start with a vowel.
That only contain consonants. (Hint: thinking about matching “not”-vowels.)
End with ed, but not with eed.
End with ing or ise.
Empirically verify the rule “i before e except after c”.
Is “q” always followed by a “u”?

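str_view(stringr::words, "^[aeiou]", match = TRUE)
str_view(stringr::words, "^[^aeiou]+$", match = TRUE)   # only consonants: every character is a non-vowel
str_view(stringr::words, "(^|[^e])ed$", match = TRUE)   # "(^|[^e])" also allows the word "ed" itself
str_view(stringr::words, "(ing|ise)$", match = TRUE)    # parentheses make "$" apply to both alternatives
str_view(stringr::words, "[^c]ie|cei", match = TRUE)    # words that follow the rule
str_view(stringr::words, "cie|[^c]ei", match = TRUE)    # words that break the rule
str_view(stringr::words, "q[^u]", match = TRUE)         # no matches: every "q" here is followed by a "u"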

2.Write a regular expression that matches a word if it’s probably written in British English, not American English.

str_view(stringr::words, "re$", match = TRUE)          # -re endings: British -re, American -er
str_view(stringr::words, "our$", match = TRUE)         # -our endings: British -our, American usually -or
str_view(stringr::words, "ise$", match = TRUE)         # -ise/-ize verbs: British accepts both, American always uses -ize
str_view(stringr::words, "yse$", match = TRUE)         # -yse verbs: British -yse, American always -yze
str_view(stringr::words, "ll(ed|ing)$", match = TRUE)  # vowel + l stems: British doubles the l before a vowel suffix, American does not
str_view(stringr::words, "ae|oe", match = TRUE)        # ae/oe digraphs: British keeps both letters, American writes a single e
str_view(stringr::words, "ence$", match = TRUE)        # -ence nouns: some (e.g. defence) are spelled -ense in American English
str_view(stringr::words, "ogue$", match = TRUE)        # -ogue nouns: British -ogue, American -og or -ogue

Create a regular expression that will match telephone numbers as commonly written in your country.
"(0[0-9]{2,3})-"          # landline: the area code and the hyphen that follows it
"1([1-9]{2})([0-9]{8})"   # mobile: 11 digits starting with 1

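14.3.4 Repetition

x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"
str_view(x, "C{2,3}")   # greedy by default: matches the longest string possible
str_view(x, "C{2,3}?")  # a trailing ? makes it lazy: matches the shortest string possible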
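To see the difference between greedy and lazy matching as plain values rather than in the viewer, str_extract() can be used instead of str_view(). This sketch assumes x is the Roman-numeral example string from the book.

```r
library(stringr)

x <- "1888 is the longest year in Roman numerals: MDCCCLXXXVIII"

greedy    <- str_extract(x, "C{2,3}")    # "CCC": takes as many as allowed
lazy      <- str_extract(x, "C{2,3}?")   # "CC":  stops as soon as it can
greedy_lx <- str_extract(x, "C[LX]+")    # "CLXXX"
lazy_lx   <- str_extract(x, "C[LX]+?")   # "CL"
```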

14.3.4.1 Exercises

  1. Describe the equivalents of ?, +, * in {m,n} form.
    ?={0,1}
    +={1,}
    *={0,}

  2. Describe in words what these regular expressions match: (read carefully to see if I’m using a regular expression or a string that defines a regular expression.)

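    1. ^.*$ ## any string: .* matches any run of characters (other than newlines) from start to end
    2. "\\{.+\\}" ## the regex \{.+\}: curly braces around at least one character
    3. \d{4}-\d{2}-\d{2} ## four digits, a hyphen, two digits, a hyphen, two digits (e.g. a date such as 2019-10-11)
    4. "\\\\{4}" ## the regex \\{4}: four backslashes in a row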
  3. Create regular expressions to find all words that:

    1. Start with three consonants.
      str_view(stringr::words, "^[^aoeiu]{3}",match=T)
    2. Have three or more vowels in a row.
      str_view(stringr::words, "[aoeiu]{3,}",match=T)
    3. Have two or more vowel-consonant pairs in a row.
      str_view(stringr::words, "([aoeiu][^aoeiu]){2,}",match=T)
  4. Solve the beginner regexp crosswords at https://regexcrossword.com/challenges/beginner.

14.3.5 Grouping and backreferences

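Backreferences in regular expressions
Backreferences are convenient because they let you repeat a pattern without writing it out again. Use \# (where # is the group number) to refer to a group defined earlier, that is, to the text matched by a parenthesised part of the pattern. For example, to match text that starts with abc, continues with xyz, and then repeats abc, you can write the regular expression "abcxyzabc", or use a backreference and write "(abc)xyz\\1", where \1 stands for the first group (abc). Likewise \2 is the second group, \3 the third, and so on.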
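A short sketch of the backreference from the paragraph above, plus a second pattern that finds a repeated pair of letters. The vector fruit_names is made up for illustration.

```r
library(stringr)

# \\1 must repeat exactly what group 1 matched
abc_twice <- str_detect(c("abcxyzabc", "abcxyzabd"), "(abc)xyz\\1")  # TRUE FALSE

fruit_names <- c("banana", "coconut", "apple", "pepper")
with_pair <- str_subset(fruit_names, "(..)\\1")    # "banana" "coconut" "pepper"
the_pair  <- str_extract(fruit_names, "(..)\\1")   # "anan" "coco" NA "pepe"
```

"apple" is not matched because "pp" is a repeated single letter, (.)\1, not a repeated pair.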

14.3.5.1 Exercises

  1. Describe, in words, what these expressions will match:
    (.)\1\1 ## 3个相同字符aaa
    "(.)(.)\\2\\1" ## 2个字符的回文结构abba
    (..)\1 ## 任意2个字符的重复结构abab
    "(.).\\1.\\1" ## 类似abaa的结构
    "(.)(.)(.).*\\3\\2\\1" ## 3个连续字符及其回文结构,中间可以间隔任意字符abcxxcba
  2. Construct regular expressions to match words that:
    Start and end with the same character.
    str_view(stringr::words, "^(.).*\\1$",match=T)
    Contain a repeated pair of letters (e.g. “church” contains “ch” repeated twice.)
    str_view(stringr::words, "(..).*\\1",match=T)
    Contain one letter repeated in at least three places (e.g. “eleven” contains three “e”s.)
    str_view(stringr::words, "(.).*\\1.*\\1",match=T)

14.4 Tools


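Regular expressions are passed to the stringr functions through the pattern argument. What you can do with a pattern falls into four groups:

Detect: determine whether a string contains the pattern
Locate: return the start and end positions of the match
Extract: return the text that matched
Replace: substitute the match and return the modified string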
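The four tasks can be compared on a single input. This is a minimal sketch with one stringr function per task; the pattern "an" is chosen only for illustration.

```r
library(stringr)

x <- c("apple", "banana", "pear")

detected  <- str_detect(x, "an")         # FALSE TRUE FALSE
located   <- str_locate(x, "an")         # integer matrix with columns "start" and "end"
extracted <- str_extract(x, "an")        # NA "an" NA
replaced  <- str_replace(x, "an", "AN")  # "apple" "bANana" "pear" (first match only)
```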

14.4.1 Detect matches

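str_detect() determines whether each element of a character vector matches a pattern; it returns a logical vector of the same length as the input.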

x <- c("apple", "banana", "pear")
str_detect(x, "e")
#> [1]  TRUE FALSE  TRUE
str_count(x, "a")
#> [1] 1 3 1
str_count("abababa", "aba")
#> [1] 2
str_view_all("abababa", "aba")
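Since str_detect() returns a logical vector, sum() and mean() turn it into a count and a proportion of matching elements. A small sketch, which also shows that str_count() is vectorised over the pattern and never counts overlapping matches:

```r
library(stringr)

x <- c("apple", "banana", "pear")

n_with_e     <- sum(str_detect(x, "e"))    # 2 elements contain an "e"
share_with_e <- mean(str_detect(x, "e"))   # 2/3 of the elements

# Matches never overlap: "aba" fits twice in "abababa", "aa" twice in "aaaa"
counts <- str_count(c("abababa", "aaaa"), c("aba", "aa"))   # 2 2
```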

14.4.2 Extract matches

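1. str_subset() keeps only the elements that match the pattern

str_subset(x,"e")
[1] "apple" "pear" 
  2. str_extract() ## returns the first match in each string
    str_extract_all() ## returns every match, as a list

With simplify = TRUE, str_extract_all() returns the matches as a matrix instead.

str_extract(x,"e")
[1] "e" NA  "e"
## all matches, returned as a list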
str_extract_all(x,"a")
[[1]]
[1] "a"

[[2]]
[1] "a" "a" "a"

[[3]]
[1] "a"
## all matches, returned as a matrix
str_extract_all(x,"a",simplify = T)
     [,1] [,2] [,3]
[1,] "a"  ""   ""  
[2,] "a"  "a"  "a" 
[3,] "a"  ""   ""  

14.4.3 Grouped matches

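  1. str_match() returns a character matrix.
    The first column is the complete match (what str_extract() would return); the following columns hold the text captured by each pair of parentheses, one column per group. The pattern in this example has two groups.
    If the data is in a tibble, tidyr::extract() is often more convenient: it works like str_match() but requires you to name the matches, which become new columns.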
str_match(x,"([aoeiu]).*([aoeiu])")
     [,1]    [,2] [,3]
[1,] "apple" "a"  "e" 
[2,] "anana" "a"  "a" 
[3,] "ea"    "e"  "a" 
tibble(x=x) %>%
 tidyr::extract(x,c("vowel1","vowel2"),"([aoeiu]).*([aoeiu])",
remove=FALSE)  ## keep the original column
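The matrix returned by str_match() can be indexed by column to pull the groups apart. A small sketch on made-up "key=value" strings (the vector settings is hypothetical):

```r
library(stringr)

settings <- c("width=120", "height=80", "no value here")

m <- str_match(settings, "(\\w+)=(\\d+)")
is.matrix(m)                   # TRUE: one row per input, one column per group plus the full match
keys   <- m[, 2]               # "width" "height" NA
values <- as.integer(m[, 3])   # 120 80 NA
```

Inputs that do not match produce a row of NA, so the result always lines up with the input vector.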

14.4.4 Replacing matches

  1. str_replace() and str_replace_all() replace matches with new strings
x <- c("apple", "pear", "banana")
str_replace(x, "[aeiou]", "-")
#> [1] "-pple"  "p-ar"   "b-nana"
str_replace_all(x, "[aeiou]", "-")
#> [1] "-ppl-"  "p--r"   "b-n-n-"
x <- c("1 house", "2 cars", "3 people")
str_replace_all(x, c("1" = "one", "2" = "two", "3" = "three"))
#> [1] "one house"    "two cars"     "three people"
sentences %>% 
  str_replace("([^ ]+) ([^ ]+) ([^ ]+)", "\\1 \\3 \\2") %>% # swap the second and third words
  head(5) 
#> [1] "The canoe birch slid on the smooth planks." 
#> [2] "Glue sheet the to the dark blue background."
#> [3] "It's to easy tell the depth of a well."     
#> [4] "These a days chicken leg is a rare dish."   
#> [5] "Rice often is served in round bowls."
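The same backreference mechanism works for reordering any captured pieces. A minimal sketch that rewrites made-up ISO dates as day/month/year, and uses str_replace_all() to collapse runs of spaces:

```r
library(stringr)

iso_dates <- c("2019-10-11", "2020-01-02")

# \\1 = year, \\2 = month, \\3 = day
dmy <- str_replace(iso_dates, "(\\d{4})-(\\d{2})-(\\d{2})", "\\3/\\2/\\1")
# "11/10/2019" "02/01/2020"

squeezed <- str_replace_all("a   b    c", " +", " ")   # "a b c"
```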

14.4.1.1 Exercises

1.For each of the following challenges, try solving it by using both a single regular expression, and a combination of multiple str_detect() calls.

Find all words that start or end with x.

words[str_detect(words,"^x|x$")]

Find all words that start with a vowel and end with a consonant.

words[str_detect(words,"^[aoeiu].*[^aoeiu]$")]

Are there any words that contain at least one of each different vowel?

words[str_detect(words, "a") & str_detect(words, "e") & str_detect(words, "i") &
        str_detect(words, "o") & str_detect(words, "u")]
  2. What word has the highest number of vowels? What word has the highest proportion of vowels? (Hint: what is the denominator?)
words[str_count(words,"[aoeiu]") == max(str_count(words,"[aoeiu]"))]
words[str_count(words,"[aoeiu]")/str_length(words) == max(str_count(words,"[aoeiu]")/str_length(words))]

14.4.2.1 Exercises

  1. In the previous example, you might have noticed that the regular expression matched “flickered”, which is not a colour. Modify the regex to fix the problem.
colours <- c("red", "orange", "yellow", "green", "blue", "purple")
colour_match <- str_c(colours, collapse = "|")
colour_match
#> [1] "red|orange|yellow|green|blue|purple"
has_colour <- str_subset(sentences, colour_match)
matches <- str_extract(has_colour, colour_match)
more <- sentences[str_count(sentences, colour_match) > 1]
str_view_all(more, colour_match)
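
Modified regular expression: wrap the alternatives in word boundaries (\b) so that a colour name only matches as a whole word.

colour_match2 <- str_c("\\b(", str_c(colours, collapse = "|"), ")\\b")
colour_match2
#> [1] "\\b(red|orange|yellow|green|blue|purple)\\b"
more2 <- sentences[str_count(sentences, colour_match2) > 1]
str_view_all(more2, colour_match2)
  2. From the Harvard sentences data, extract: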
    The first word from each sentence.
str_extract(sentences, "^[^ ]+")

All words ending in ing.

str_extract_all(sentences, "\\b[A-Za-z]+ing\\b") %>% unlist()

All plurals.

str_extract(sentences,"([a-z]+)(((s|x|sh|ch)es)|ies|[aoeiu]ys|ves|[^aeiu']s)[ .]")

14.4.3.1 Exercises

  1. Find all words that come after a “number” like “one”, “two”, “three” etc. Pull out both the number and the word.
numb <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten")
number <- str_c("\\b(", str_c(numb, collapse = "|"), ") +([^ ]+)")
str_subset(sentences, number) %>% str_match(number)

Find all contractions. Separate out the pieces before and after the apostrophe.

str_subset(sentences,"'") %>% str_match("([^ ]+)'([^ ]+)")

14.4.4.1 Exercises

  1. Replace all forward slashes in a string with backslashes.
x <- c("ab/c", "abbc/edf")
x
#> [1] "ab/c"     "abbc/edf"
str_replace_all(x, "/", "\\\\")   # in a replacement string, "\\\\" yields one literal backslash
#> [1] "ab\\c"     "abbc\\edf"
  2. Implement a simple version of str_to_lower() using replace_all().
paste0('"',LETTERS,'"',"=",'"',letters,'"') %>% str_c(collapse = ",") %>% writeLines()
#"A"="a","B"="b","C"="c","D"="d","E"="e","F"="f","G"="g","H"="h","I"="i","J"="j","K"="k","L"="l","M"="m","N"="n","O"="o","P"="p","Q"="q","R"="r","S"="s","T"="t","U"="u","V"="v","W"="w","X"="x","Y"="y","Z"="z"
str_replace_all(sentences,c("A"="a","B"="b","C"="c","D"="d","E"="e","F"="f","G"="g","H"="h","I"="i","J"="j","K"="k","L"="l","M"="m","N"="n","O"="o","P"="p","Q"="q","R"="r","S"="s","T"="t","U"="u","V"="v","W"="w","X"="x","Y"="y","Z"="z"))%>% head(5)
[1] "the birch canoe slid on the smooth planks."  "glue the sheet to the dark blue background."
[3] "it's easy to tell the depth of a well."      "these days a chicken leg is a rare dish."   
[5] "rice is often served in round bowls."       
c("A"="a","B"="b","C"="c","D"="d","E"="e","F"="f","G"="g","H"="h","I"="i","J"="j","K"="k","L"="l","M"="m","N"="n","O"="o","P"="p","Q"="q","R"="r","S"="s","T"="t","U"="u","V"="v","W"="w","X"="x","Y"="y","Z"="z")
##  A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z 
## "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" 
lower_map <- setNames(letters, LETTERS)   # named vector: names are the patterns, values the replacements
lower_map
##   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P   Q   R   S   T   U   V   W   X   Y   Z 
## "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" 
str_replace_all(sentences, lower_map) %>% head(5)
#[1] "the birch canoe slid on the smooth planks."               
#[2] "glue the sheet to the dark blue background."              
#[3] "it's easy to tell the depth of a well."                   
#[4] "these days a chicken leg is a rare dish."                 
#[5] "rice is often served in round bowls."  
  3. Switch the first and last letters in words. Which of those strings are still words?
str_replace_all(words,"(^.)(.*)(.$)","\\3\\2\\1")
str_replace_all(words,"(^.)(.*)(.$)","\\3\\2\\1") %>% str_subset(paste0("^",str_c(words,collapse = "$|^"),"$"))
 [1] "a"          "america"    "area"       "dad"        "dead"       "lead"       "read"       "depend"     "god"       
[10] "educate"    "else"       "encourage"  "engine"     "europe"     "evidence"   "example"    "excuse"     "exercise"  
[19] "expense"    "experience" "eye"        "dog"        "health"     "high"       "knock"      "deal"       "level"     
[28] "local"      "nation"     "on"         "non"        "no"         "rather"     "dear"       "refer"      "remember"  
[37] "serious"    "stairs"     "test"       "tonight"    "transport"  "treat"      "trust"      "window"     "yesterday" 

14.4.5 Splitting

words <- c("These are   some words.")
str_count(words, boundary("word"))
#> [1] 4
str_split(words, " ")[[1]]
#> [1] "These"  "are"    ""       ""       "some"   "words."
str_split(words, boundary("word"))[[1]]
#> [1] "These" "are"   "some"  "words"
fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)
str_split_fixed(fruits, " and ", 3)
#>      [,1]         [,2]      [,3]               
#> [1,] "apples"     "oranges" "pears and bananas"
#> [2,] "pineapples" "mangos"  "guavas"           
str_split_fixed(fruits, " and ", 4)
#>      [,1]         [,2]      [,3]     [,4]     
#> [1,] "apples"     "oranges" "pears"  "bananas"
#> [2,] "pineapples" "mangos"  "guavas" ""       
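For comparison, plain str_split() returns a list with one vector per input string, and its n and simplify arguments give behaviour similar to str_split_fixed(). A small sketch reusing the fruits example:

```r
library(stringr)

fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)

pieces <- str_split(fruits, " and ")   # list of two character vectors
n_pieces <- lengths(pieces)            # 4 3

# n caps the number of pieces; simplify = TRUE returns a matrix instead of a list
first_two <- str_split(fruits, " and ", n = 2, simplify = TRUE)
first_two[1, ]   # "apples" "oranges and pears and bananas"
```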

14.4.5.1 Exercises

1.Split up a string like "apples, pears, and bananas" into individual components.

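x <- "apples, pears, and bananas"
str_split(x, ", +(and +)?")[[1]]
#> [1] "apples"  "pears"   "bananas"

2.Why is it better to split up by boundary("word") than " "?
boundary("word") splits on word boundaries, so punctuation and runs of whitespace neither end up in the pieces nor produce empty strings.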

  3. What does splitting with an empty string ("") do? Experiment, and then read the documentation.
    The string is split into individual characters:
str_split(words,"")[[1]]
 [1] "T" "h" "e" "s" "e" " " "a" "r" "e" " " " " " " "s" "o" "m" "e" " " "w" "o" "r" "d" "s" "."

14.4.6 Find matches

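str_locate() and str_locate_all() give the start and end positions of each match. Use str_locate() to find where the pattern matches, then str_sub() to extract or modify that part of the string.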
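The locate-then-substring workflow described above can be sketched as follows; the string and pattern are chosen only for illustration.

```r
library(stringr)

x <- "banana"

loc <- str_locate(x, "nan")   # one row per string: start = 3, end = 5
found <- str_sub(x, loc[1, "start"], loc[1, "end"])   # "nan"

# str_sub() on the left of an assignment modifies that range in place
str_sub(x, loc[1, "start"], loc[1, "end"]) <- "NAN"
x                             # "baNANa"

# str_locate_all() returns a list with one matrix of all matches per string
all_a <- str_locate_all("banana", "a")[[1]]   # starts at 2, 4, 6
```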

14.5 Other types of pattern

bananas <- c("banana", "Banana", "BANANA")
str_extract(bananas,"banana")
#> [1] "banana" NA       NA      
str_extract(bananas,regex("banana",ignore_case = TRUE))
#> [1] "banana" "Banana" "BANANA"
x <- "Line 1\nLine 2\nLine 3"
str_extract_all(x, "^Line")[[1]]
#> [1] "Line"
str_extract_all(x, regex("^Line", multiline = TRUE))[[1]]
#> [1] "Line" "Line" "Line"
str_split(words,regex("\\ ",comments = T))[[1]]## with comments = TRUE an escaped space is a literal space, so this splits on spaces
# [1] "These"  "are"    ""       ""       "some"   "words."
str_split(words,regex(" ",comments = T))[[1]]## the unescaped space in the pattern is ignored, leaving an empty pattern that splits between every character
# [1] ""  "T" "h" "e" "s" "e" " " "a" "r" "e" " " " " " " "s" "o" "m" "e" " " "w" "o" "r" "d" "s" "." ""
str_split(words,regex("[. ]",comments = T))[[1]] ## the space is ignored inside [] as well, so only "." splits
# [1] "These are   some words" ""                      
str_split(words,regex("[.\\ ]",comments = T))[[1]]## splits on "." or a space
# [1] "These" "are"   ""      ""      "some"  "words" "" 

In addition, three other functions can be used in place of regex(): fixed(), coll(), and boundary().

 str_extract_all(words, boundary("word"))[[1]]
[1] "These" "are"   "some"  "words"
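A brief sketch of how the alternatives to regex() differ, on made-up inputs: fixed() matches the pattern as literal text, coll() compares using collation rules (here with ignore_case = TRUE), and boundary() matches boundaries such as words.

```r
library(stringr)

x <- c("a.b", "axb")

as_regex <- str_detect(x, ".")          # TRUE TRUE:  "." means any character
as_fixed <- str_detect(x, fixed("."))   # TRUE FALSE: "." is a literal dot

# coll() is slower than fixed() but respects locale-aware comparison rules
ci <- str_detect(c("Banana", "BANANA"), coll("banana", ignore_case = TRUE))

n_words <- str_count("These are   some words.", boundary("word"))   # 4
```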

14.5.1 Exercises

  1. How would you find all strings containing \ with regex() vs. with fixed()?
x <- "Line 1\\\\Line 2\nLine 3"
writeLines(x)
#Line 1\\Line 2
#Line 3
str_view_all(x, regex("\\\\"))
str_view_all(x,fixed("\\"))
  2. What are the five most common words in sentences?
str_split(sentences,boundary("word")) %>%  ## split into words
  unlist %>%  str_to_lower%>%   ## flatten the list and convert to lower case
  table() %>% sort(decreasing = T) %>%  ## count with table(), order with sort()
  head(5) # show the top five
# .
# the   a  of  to and 
# 751 202 132 123 118 

14.6 Other uses of regular expressions

14.7 stringi

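stringr is built on top of the stringi package. stringr exposes the most commonly needed string functions (46 by the count used here), while stringi has 234 and is considerably more comprehensive. For more complex string processing, turn to stringi: the two packages work very similarly, and the main difference is the prefix, str_ versus stri_.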
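A small sketch comparing a few stringr calls with their stringi counterparts. The prefix swap is only approximate: stringi usually encodes the matching engine in the function name (for example stri_detect_regex() and stri_detect_fixed() rather than a single stri_detect call with a modifier).

```r
library(stringr)
library(stringi)

x <- c("apple", "banana", NA)

len_r <- str_length(x);       len_i <- stri_length(x)              # 5 6 NA
det_r <- str_detect(x, "an"); det_i <- stri_detect_regex(x, "an")  # FALSE TRUE NA
sub_r <- str_sub(x, 1, 3);    sub_i <- stri_sub(x, 1, 3)           # "app" "ban" NA
```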

14.7.1 Exercises

  1. Find the stringi functions that:
    Count the number of words.
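    library(stringi)
    stri_count_words(sentences) %>% head()
    Find duplicated strings.
    stri_duplicated(c("a", "b", "a", NA, "a", NA)) ### TRUE for each element that repeats an earlier one
    stri_extract_first_regex(stringr::words, "(.)\\1") ### a different task: the first doubled letter in each word (NA if none)
    Generate random text.
    stri_rand_strings(3, 5) ### three random strings of five characters each
    stri_rand_lipsum(1) ### one paragraph of lorem ipsum text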

  2. How do you control the language that stri_sort() uses for sorting?

 stri_sort(sample(LETTERS))
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
stri_sort(sample(LETTERS),decreasing = TRUE) 
 [1] "Z" "Y" "X" "W" "V" "U" "T" "S" "R" "Q" "P" "O" "N" "M" "L" "K" "J" "I" "H" "G" "F" "E" "D" "C" "B" "A"

- Set locale = "xx" to sort according to the rules of that language or region.

stri_sort(c("hladny", "chladny"), locale="pl_PL")
[1] "chladny" "hladny" 
stri_sort(c("hladny", "chladny"), locale="sk_SK")
[1] "hladny"  "chladny"

- Set numeric = TRUE to sort digit sequences by their numeric value.

stri_sort(c(1, 100, 2, 101, 11, 10))
[1] "1"   "10"  "100" "101" "11"  "2"  
stri_sort(c(1, 100, 2, 101, 11, 10), numeric=TRUE)
[1] "1"   "2"   "10"  "11"  "100" "101"