
Converting Markdown to docx and PDF to docx

2018-08-04  水之心

Converting PDF to docx

Garbled Chinese text during conversion

Specify a Chinese font with the -V option:

-V mainfont="SimSun"

Converting other formats to docx

  1. Word docx:

    pandoc -s MANUAL.txt -o example29.docx
    
  2. LaTeX math to docx:

    pandoc -s math.tex -o example30.docx
    
  3. Markdown to docx:

    pandoc -s m.md -o m.docx
    
  4. Docx with a reference docx:

    pandoc --reference-doc twocolumns.docx -o UsersGuide.docx MANUAL.txt
    

Fixing garbled Chinese text

pandoc -V mainfont="SimSun" --reference-doc twocolumns.docx -o UsersGuide.docx MANUAL.txt

Here twocolumns.docx is used as the template when MANUAL.txt is written to UsersGuide.docx, so that UsersGuide.docx ends up with the same formatting as twocolumns.docx.
For more, see: Pandoc Demos
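
As far as I can tell, docx output takes its fonts from the styles in the reference document, so editing a reference file is one way to control the Chinese font. The following is a minimal sketch, assuming pandoc 2.x or later is on PATH; custom-reference.docx, sample.md and sample.docx are hypothetical file names that do not come from the commands above.

```shell
#!/bin/sh
# Sketch only: the file names below are hypothetical.
if command -v pandoc >/dev/null 2>&1; then
    # Write pandoc's built-in default reference.docx to a file we can edit.
    pandoc -o custom-reference.docx --print-default-data-file reference.docx

    # (Manual step) open custom-reference.docx in Word, set a Chinese font
    # such as SimSun on the styles you use, then save the file.

    # A tiny sample input containing Chinese text.
    printf '# 标题\n\n这是一段中文。\n' > sample.md

    # Convert, taking the styles from the customized reference document.
    pandoc --reference-doc custom-reference.docx -o sample.docx sample.md
else
    echo "pandoc not found; skipping the demo" >&2
fi
```

Starting from pandoc's own default reference file keeps the style names pandoc expects, so only the font settings need to change.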

Other

Docx to markdown, including math:

pandoc -s example30.docx -t markdown -o example35.md

EPUB to plain text:

pandoc MANUAL.epub -t plain -o example36.text

If no input-files are specified, input is read from stdin. Output goes to stdout by default. For output to a file, use the -o option:

pandoc -o output.html input.txt

By default, pandoc produces a document fragment. To produce a standalone document (e.g. a valid HTML file including <head> and <body>), use the -s or --standalone flag:

pandoc -s -o output.html input.txt
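
The difference between a fragment and a standalone document is easiest to see side by side. This is a small sketch, assuming pandoc is on PATH; demo.txt, fragment.html and standalone.html are hypothetical file names, and the title metadata is added only so the standalone HTML has a non-empty title.

```shell
#!/bin/sh
# Sketch only: the file names below are hypothetical.
if command -v pandoc >/dev/null 2>&1; then
    # A two-line Markdown input.
    printf '# Hello\n\nSome text.\n' > demo.txt

    # Without -s: only the converted body content is written.
    pandoc -o fragment.html demo.txt

    # With -s: a complete HTML document with <head> and <body>.
    pandoc -s --metadata title="Demo" -o standalone.html demo.txt

    # Show the first line of each file to compare them.
    head -n 1 fragment.html
    head -n 1 standalone.html
else
    echo "pandoc not found; skipping the demo" >&2
fi
```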

Character encoding

Pandoc uses the UTF-8 character encoding for both input and output. If your local character encoding is not UTF-8, you should pipe input and output through iconv:

iconv -t utf-8 input.txt | pandoc | iconv -f utf-8

Note that in some output formats (such as HTML, LaTeX, ConTeXt, RTF, OPML, DocBook, and Texinfo), information about the character encoding is included in the document header, which will only be included if you use the -s/--standalone option.
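
For Chinese text, the local encoding is often GBK rather than UTF-8. The sketch below shows only the iconv half of the pipeline above, assuming an iconv build that knows the GBK charset; the file names are hypothetical. It writes the two characters 中文 as GBK bytes, converts them to UTF-8 (the form pandoc expects), and converts back.

```shell
#!/bin/sh
# Hypothetical sample: "中文" stored as GBK bytes (D6 D0 CE C4) plus a newline.
printf '\326\320\316\304\n' > input-gbk.txt

# Convert explicitly from GBK to UTF-8, the encoding pandoc reads.
iconv -f GBK -t UTF-8 input-gbk.txt > input-utf8.txt

# Convert back to GBK, as the trailing iconv in a pipeline would do.
iconv -f UTF-8 -t GBK input-utf8.txt > roundtrip-gbk.txt

# The round trip should reproduce the original bytes exactly.
cmp input-gbk.txt roundtrip-gbk.txt && echo "round trip ok"
```

With pandoc installed, the same two conversions wrap the pandoc call: iconv -f GBK -t UTF-8 input-gbk.txt | pandoc | iconv -f UTF-8 -t GBK. Naming the source encoding with -f avoids relying on the locale default.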

