An elisp function to convert legal headers to wikitext format

From 清冽之泉
Revision as of 16:22, 13 March 2025 by Mwroot

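This Emacs function makes it easy to process legal texts downloaded from the national law database (国家法律数据库), rendering their headers as wikitext headings. For the resulting layout, see the statutes shared in this site's reposted-articles section.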

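(require 'subr-x) ; provides `string-trim' on older Emacs versions

(defun convert-legal-headers-to-wikitext ()
  "Convert legal document headers in the current buffer to MediaWiki wikitext.
Lines beginning with 第…编, 第…章, or 第…节 are converted:
- 编 (part)    -> level-2 heading (== title ==)
- 章 (chapter) -> level-3 heading (=== title ===)
- 节 (section) -> level-4 heading (==== title ====)
Runs of mixed whitespace are first collapsed to a single half-width space,
and the half-width spaces left in each heading are then converted to
full-width spaces."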
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (not (eobp))
      (let ((line (buffer-substring-no-properties
                   (line-beginning-position) (line-end-position))))
        (cond
         ;; Handle 编 (part)
         ((string-match "^\\s-*\\(第[^[:space:]]*编\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
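            ;; Collapse runs of half-width and full-width spaces into a single
            ;; half-width space, then trim leading and trailing whitespace
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            ;; Replace the remaining half-width spaces with full-width spaces (U+3000)
            (setq title (replace-regexp-in-string " " "\u3000" title))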
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "== %s ==" title))))
         ;; Handle 章 (chapter)
         ((string-match "^\\s-*\\(第[^[:space:]]*章\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            (setq title (replace-regexp-in-string " " "\u3000" title))
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "=== %s ===" title))))
         ;; Handle 节 (section)
         ((string-match "^\\s-*\\(第[^[:space:]]*节\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            (setq title (replace-regexp-in-string " " "\u3000" title))
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "==== %s ====" title)))))
        (forward-line 1)))))
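
The per-line transformation used above can also be tried out on plain strings, without touching a buffer. The following is a minimal sketch, not part of the original function: the helper name `my/legal-heading-to-wikitext` is hypothetical, and it spells out the whitespace set (space, tab, U+3000) explicitly instead of relying on `\\s-` and `[:space:]`, whose meaning depends on the current syntax table.

```elisp
(require 'subr-x) ; provides `string-trim' on older Emacs versions

(defun my/legal-heading-to-wikitext (line)
  "Return LINE as a wikitext heading, or nil if LINE is not a legal header.
Hypothetical helper for illustration only."
  (let ((levels '(("编" . 2) ("章" . 3) ("节" . 4))) ; marker . heading level
        (result nil))
    (dolist (entry levels)
      (when (and (null result)
                 (string-match
                  (concat "^[ \t\u3000]*\\(第[^ \t\u3000]*" (car entry)
                          "\\)[ \t\u3000]*\\(.*\\)$")
                  line))
        (let* (;; Join the number part and the title text with one space
               (title (concat (match-string 1 line) " " (match-string 2 line)))
               ;; Collapse runs of half-width, tab, and full-width whitespace
               (title (replace-regexp-in-string "[ \t\u3000]+" " " title))
               (title (string-trim title))
               ;; Use a full-width space (U+3000) as the final separator
               (title (replace-regexp-in-string " " "\u3000" title))
               (marks (make-string (cdr entry) ?=)))
          (setq result (format "%s %s %s" marks title marks)))))
    result))
```

Evaluating `(my/legal-heading-to-wikitext "第二章 合同的订立")` in `M-:` or `ielm` should give a level-3 heading, while an article line such as `第一条 …` yields nil and would be left unchanged.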