How to convert MediaWiki wikitext to GitHub-flavored markdown

From 清冽之泉

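I want to publish this site's content to a WeChat Official Account. Copying the wikitext over and editing it by hand is tedious, so the only sensible option is to automate the path from wikitext to the Official Account.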

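  1. wikitext straight to the WeChat Official Account platform: few people take this route, and it is hard to get working
  2. markdown straight to the WeChat Official Account platform: many people take this route, and it is easy to get working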

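So the problem becomes how to convert wikitext to markdown reasonably cleanly. I tried pandoc, but the results were not reliable enough. With help from ChatGPT, I wrote a small wikitext-to-markdown tool of my own; it has to be run inside Emacs.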

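The conversion rules below deliberately skip tables, because tables are far too much work and absurdly complex; a screenshot is simpler. They also skip image URLs, because instead of hunting for an image host it is easier to upload the images separately to the WeChat Official Account backend. The ordered-list marker # is converted to * because markdown treats a leading # as a level-1 heading, which is certainly not what we want. It becomes * rather than a number out of laziness: I do not have the patience to number every item, so ordered lists simply become unordered ones. Unordered lists (*) and preformatted text (pre) are written the same way in wikitext and markdown, so they are left untouched.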
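If numbered lists are wanted after all, the numbering itself can be automated rather than typed by hand. The following is only a sketch and not part of the converter: the function name my-wikitext-number-ordered-lists is made up for illustration, and it assumes flat (non-nested) lists written as "# item". It would have to run before the # -> * rule.

```elisp
;; Hypothetical helper, not part of the original converter.
;; Assumption: ordered lists are flat and written as "# item".
(defun my-wikitext-number-ordered-lists (&optional start end)
  "Rewrite \"# item\" lines between START and END as \"1. item\".
Numbering restarts after any line that is not a list item.
Lines inside ``` fences are left alone."
  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (point-min) (point-max))))
  (save-excursion
    (save-restriction
      (narrow-to-region (or start (point-min)) (or end (point-max)))
      (goto-char (point-min))
      (let ((counter 0)
            (in-code-block nil))
        (while (not (eobp))
          (cond
           ;; A fence line toggles code-block state and ends any list.
           ((looking-at "```")
            (setq in-code-block (not in-code-block)
                  counter 0))
           ;; One leading "#" followed by a non-"#" character is an item.
           ((and (not in-code-block)
                 (looking-at "#[ \t]*\\([^#\n]\\)"))
            (setq counter (1+ counter))
            (replace-match (format "%d. \\1" counter) t))
           ;; Any other line ends the current list.
           (t (setq counter 0)))
          (forward-line 1))))))
```

With this pass, two consecutive lines "# one" and "# two" become "1. one" and "2. two", and a later list starts again from 1.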

;; wikitext -> gfm converter
;;
;; Design principles:
;; 1 All rules run in order
;; 2 Each rule maps to one block of code
;; 3 No abstraction; readability comes first
;;
;; Rules:
;; 1 <code> and </code> -> `
;; 2 <syntaxhighlight> -> ```
;; 3 # -> *
;; 4 ====== -> ######
;; 5 ===== -> #####
;; 6 ==== -> ####
;; 7 === -> ###
;; 8 == -> ##
;; 9 <del> -> ~~
;; 10 ---- -> ---

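(defun wikitext-convert-all (&optional start end)
  "Convert simple wikitext to GFM markdown.
Interactively, act on the active region, or on the whole buffer
when no region is active.  From Lisp, START and END delimit the
text to convert; each defaults to the matching buffer boundary."

  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (point-min) (point-max))))

  (save-excursion
    (save-restriction
      ;; START and END are optional, so fall back to the buffer bounds.
      (narrow-to-region (or start (point-min)) (or end (point-max)))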

      ;; --------------------------------------------------
      ;; 1. inline <code>
      ;; --------------------------------------------------

      (goto-char (point-min))
      (while (re-search-forward "<code>\\([^<\n]*\\)</code>" nil t)
        (replace-match (concat "`" (match-string 1) "`") t t))

      (goto-char (point-min))
      (while (re-search-forward "<code>" nil t)
        (replace-match "`"))

      (goto-char (point-min))
      (while (re-search-forward "</code>" nil t)
        (replace-match "`"))

      ;; --------------------------------------------------
      ;; 2. syntaxhighlight
      ;; --------------------------------------------------

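      ;; Opening tags with a lang attribute (quoted or unquoted).
      (goto-char (point-min))
      (while (re-search-forward "<syntaxhighlight\\s-+lang=[\"']?\\([^\"' \t>]+\\)[\"']?[^>]*>" nil t)
        (replace-match (concat "```" (match-string 1)) t t))

      ;; Any remaining opening tags (no lang) become plain fences,
      ;; so every closing fence below has a matching opening fence.
      (goto-char (point-min))
      (while (re-search-forward "<syntaxhighlight[^>]*>" nil t)
        (replace-match "```" t t))

      (goto-char (point-min))
      (while (re-search-forward "</syntaxhighlight>" nil t)
        (replace-match "```"))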

      ;; --------------------------------------------------
      ;; 3. lists: leading # -> *
      ;; --------------------------------------------------

      (let ((in-code-block nil))
        (goto-char (point-min))
        (while (not (eobp))
          (cond
           ;; A ``` fence line toggles code-block state.
           ((looking-at "```")
            (setq in-code-block (not in-code-block)))
           ;; Outside code blocks, convert only the leading run of #
           ;; (the list marker); any # in the item text is kept.
           ((and (not in-code-block) (looking-at "#+"))
            (replace-match (make-string (- (match-end 0) (match-beginning 0)) ?*)
                           t t)))
          (forward-line 1)))

      ;; --------------------------------------------------
      ;; 4. headings
      ;; --------------------------------------------------

      (dolist (n '(6 5 4 3 2))
        (goto-char (point-min))
        (let ((pat (format "^=\\{%d\\}[ \t]*\\(.*?\\S-\\)[ \t]*=\\{%d\\}[ \t]*$" n n))
              (rep (concat (make-string n ?#) " \\1")))
          (while (re-search-forward pat nil t)
            (replace-match rep))))

      ;; --------------------------------------------------
      ;; 5. other
      ;; --------------------------------------------------

      (goto-char (point-min))
      (while (re-search-forward "<del>" nil t)
        (replace-match "~~"))

      (goto-char (point-min))
      (while (re-search-forward "</del>" nil t)
        (replace-match "~~"))

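      ;; Wikitext accepts four or more dashes as a horizontal rule.
      ;; No explicit (widen) is needed: save-restriction restores
      ;; the original narrowing on exit.
      (goto-char (point-min))
      (while (re-search-forward "^-\\{4,\\}[ \t]*$" nil t)
        (replace-match "---" t t)))))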
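
For trying the rules on a snippet without opening the page in a buffer, a small wrapper along these lines may help. This is a sketch, not part of the tool: my-wikitext-convert-string is a made-up name, and the wrapper assumes the converter accepts a start and an end position, as wikitext-convert-all does.

```elisp
;; Hypothetical wrapper, not part of the original converter.
(defun my-wikitext-convert-string (text &optional converter)
  "Return TEXT after running CONVERTER over it in a temporary buffer.
CONVERTER is called with the buffer start and end positions and
defaults to `wikitext-convert-all'."
  (with-temp-buffer
    (insert text)
    (funcall (or converter #'wikitext-convert-all) (point-min) (point-max))
    (buffer-string)))

;; Example call, assuming `wikitext-convert-all' has been evaluated:
;; (my-wikitext-convert-string "== Title ==\n# first\n<del>old</del>\n----\n")
```

Passing the converter as an argument keeps the wrapper usable with any other region-based function, which also makes it easy to check on its own.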

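Once the markdown is ready, paste it into Doocs; the default style already looks quite good. Going beyond the default style means spending a lot of time hand-tuning the appearance, and that will have to wait until I have time.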

The ideal workflow is:

MediaWiki
   ↓
Elisp converter
   ↓
Markdown
   ↓
Local md editor
   ↓
WeChat Official Account