An elisp function to convert legal headers to wikitext format
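This Emacs function makes it easy to process statute texts downloaded from the national law database (国家法律数据库) so that their headings are rendered as wikitext. For the resulting layout, see the statutes shared in the [好文转载] (reposted articles) section of this site.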
(require 'subr-x) ; for `string-trim' on Emacs versions that do not preload it

(defun convert-legal-headers-to-wikitext ()
  "Convert legal-document headings in the current buffer to MediaWiki wikitext.
Lines that begin with “第…编”, “第…章”, or “第…节” are rewritten as headings:
- 编 (part)    -> level-2 heading (== title ==)
- 章 (chapter) -> level-3 heading (=== title ===)
- 节 (section) -> level-4 heading (==== title ====)
Runs of half-width and full-width spaces in the title are first collapsed
into a single half-width space; the remaining half-width spaces are then
converted to full-width spaces (U+3000)."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (not (eobp))
      (let ((line (buffer-substring-no-properties
                   (line-beginning-position) (line-end-position))))
        (cond
         ;; Handle 编 (part) headings.
         ((string-match "^\\s-*\\(第[^[:space:]]*编\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
            ;; Collapse runs of half-width and full-width (U+3000) spaces
            ;; into one half-width space, then trim both ends.
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            ;; Convert each half-width space to a full-width space.
            (setq title (replace-regexp-in-string " " "\u3000" title))
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "== %s ==" title))))
         ;; Handle 章 (chapter) headings.
         ((string-match "^\\s-*\\(第[^[:space:]]*章\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            (setq title (replace-regexp-in-string " " "\u3000" title))
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "=== %s ===" title))))
         ;; Handle 节 (section) headings.
         ((string-match "^\\s-*\\(第[^[:space:]]*节\\)\\s-*\\(.*\\)$" line)
          (let* ((part1 (match-string 1 line))
                 (part2 (match-string 2 line))
                 (title (concat part1 " " part2)))
            (setq title (replace-regexp-in-string "[ \u3000]+" " " title))
            (setq title (string-trim title))
            (setq title (replace-regexp-in-string " " "\u3000" title))
            (delete-region (line-beginning-position) (line-end-position))
            (insert (format "==== %s ====" title))))))
      (forward-line 1))))
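
The three branches above differ only in the suffix character and the number of equals signs, so the per-line conversion could also be sketched as one table-driven helper. The names `my/legal-heading-levels` and `my/legal-heading-to-wikitext` are hypothetical and not part of the original function. This sketch also swaps the syntax-class regexps for explicit character classes (space, tab, U+3000), on the assumption that those are the only whitespace characters in the downloaded text, so the match does not depend on the buffer's syntax table.

```elisp
(require 'subr-x) ; for `string-trim' on Emacs versions that do not preload it

;; Hypothetical table: each entry pairs a heading suffix character
;; with the wikitext heading level it should produce.
(defconst my/legal-heading-levels
  '(("编" . 2) ("章" . 3) ("节" . 4))
  "Alist mapping a heading suffix to a wikitext heading level.")

(defun my/legal-heading-to-wikitext (line)
  "Return LINE as a wikitext heading, or nil if LINE is not a heading.
Hypothetical helper sketching the per-line conversion only."
  (let ((result nil))
    (dolist (entry my/legal-heading-levels)
      (when (and (null result)
                 (string-match
                  (concat "^[ \t\u3000]*\\(第[^ \t\u3000]*" (car entry)
                          "\\)[ \t\u3000]*\\(.*\\)$")
                  line))
        (let* ((title (concat (match-string 1 line) " "
                              (match-string 2 line)))
               (marks (make-string (cdr entry) ?=)))
          ;; Collapse mixed half-width/full-width spaces, then trim.
          (setq title (string-trim
                       (replace-regexp-in-string "[ \u3000]+" " " title)))
          ;; Use full-width spaces inside the final heading.
          (setq title (replace-regexp-in-string " " "\u3000" title))
          (setq result (format "%s %s %s" marks title marks)))))
    result))

;; Example call; the result separates number and name with U+3000.
(my/legal-heading-to-wikitext "第一编  总则")
```

Because the helper returns nil for lines that are not headings, a buffer-walking command could call it on each line and replace only the lines for which it returns a string.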