提取Word中特定字符中的文本到Excel
2019-06-16 本文已影响0人
叶知行
image.png
image.png
要求:提取Word中每个《》之间的文本。(代码在Excel中)
代码1:查找替换
Sub 提取特定字符中间的内容() '查找替换
Dim wdapp As Object, wdoc As Object '声明wdapp和wdoc变量为对象类型
Set wdapp = CreateObject("Word.Application") '将新建word程序对象赋给变量wdapp
Dim Findchar As String '要查找的字符
Findchar = "《*》"
Set wdoc = wdapp.Documents.Open(ThisWorkbook.Path & "\范例.docx") '打开word文档
With wdoc.Content.Find '此处针对全文档
.MatchWildcards = True '使用通配符
Do While .Execute(FindText:=Findchar) = True '将内容返回到Excel
k = k + 1
Cells(k, 1) = .Parent '此代码没有处理符号
Cells(k, 2) = Replace(Replace(.Parent, "《", ""), "》", "") '此代码去除符号
Loop
End With
wdoc.Close False '关闭word文档,不保存更改。
wdapp.Quit '关闭word程序
Set wdapp = Nothing '释放内存
Set wdoc = Nothing '释放内存
End Sub
代码2:Split方法
Sub 提取特定字符中间的内容1() 'split
Dim wdapp As Object, wdoc As Object '声明wdapp和wdoc变量为对象类型
Set wdapp = CreateObject("Word.Application") '将新建word程序对象赋给变量wdapp
Dim schar As String '字符串
Set wdoc = wdapp.Documents.Open(ThisWorkbook.Path & "\范例.docx") '打开word文档
schar = wdoc.Content '将文档内容赋值给字符串
wdoc.Close False '关闭word文档,不保存更改。
wdapp.Quit '关闭word程序
Set wdapp = Nothing '释放内存
Set wdoc = Nothing '释放内存
Dim s
s = Split(schar, "《")
For Each s1 In s
If InStr(s1, "》") > 0 Then
k = k + 1
Cells(k, 1) = Split(s1, "》")(0)
End If
Next
End Sub
代码3:正则表达式,三个表达式都能够实现结果,注意submatchse就行。
Sub 提取特定字符中间的内容2() '正则
Dim wdapp As Object, wdoc As Object '声明wdapp和wdoc变量为对象类型
Set wdapp = CreateObject("Word.Application") '将新建word程序对象赋给变量wdapp
Dim schar As String '字符串
Set wdoc = wdapp.Documents.Open(ThisWorkbook.Path & "\范例.docx") '打开word文档
schar = wdoc.Content '将文档内容赋值给字符串
wdoc.Close False '关闭word文档,不保存更改。
wdapp.Quit '关闭word程序
Set wdapp = Nothing '释放内存
Set wdoc = Nothing '释放内存
With CreateObject("vbscript.regexp")
.Global = True
' .Pattern = "《([^》]+)》"
' .Pattern = "《(.*?)》"
.Pattern = "[^《]+(?=》)"
Set matc = .Execute(schar)
For Each mat In matc
k = k + 1
' Cells(k, 1) = mat.submatchse(0)
Cells(k, 1) = mat.Value
Next
End With
End Sub
结果:
image.png