How to install the CirrusSearch extension for MediaWiki
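MediaWiki's built-in site search is somewhat limited, but with the CirrusSearch extension installed it becomes close to perfect. CirrusSearch gets its search capability mainly from Elasticsearch, an external piece of software. In my experience, a machine with only 2 GB of RAM is not worth the trouble: memory is exhausted within seconds of starting. On machines with a bit more memory, though, the experience is excellent. Over several installations I have found that, for a small site like this one running MediaWiki with CirrusSearch, CirrusSearch memory use settles at about 1.3 GB and the whole site's at about 2.6 GB.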
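Before going further it can help to check how much RAM the machine has. The sketch below reads `MemTotal` from `/proc/meminfo` (Linux only) and compares it against a threshold. The 3 GB threshold is my own assumption, chosen to sit a little above the 2.6 GB figure mentioned above; it is not an official requirement, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if total_kb is at least min_kb.
has_enough_memory() {
  total_kb=$1
  min_kb=$2
  [ "$total_kb" -ge "$min_kb" ]
}

# Read total RAM in kB from /proc/meminfo (Linux only).
total_memory_kb() {
  awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Assumed threshold: 3 GB, a little above the ~2.6 GB observed above.
MIN_KB=$((3 * 1024 * 1024))

if [ -r /proc/meminfo ]; then
  if has_enough_memory "$(total_memory_kb)" "$MIN_KB"; then
    echo "Memory looks sufficient for Elasticsearch plus MediaWiki."
  else
    echo "Less than 3 GB of RAM; Elasticsearch may exhaust memory."
  fi
fi
```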
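Install dependencies
Make sure each of the following is a version compatible with your MediaWiki version. In practice this is usually just the version in your Linux distribution's stable repository.
- Install Elasticsearch separately and start its service
- Install php via apt
- Install curl via apt
- Install openjdk via apt
- Install composer via apt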
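The list above can be checked quickly from a shell. A minimal sketch, assuming a Debian-style system: `missing_commands` is a hypothetical helper that prints whichever of the given commands are not on `PATH`. The apt package names in the comment are assumptions and vary by distribution and release.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Commands provided by the dependencies listed above.
missing=$(missing_commands php curl java composer)

if [ -n "$missing" ]; then
  echo "Missing commands:"
  echo "$missing"
  # Assumed Debian/Ubuntu package names; adjust for your distribution:
  #   sudo apt install php curl default-jre-headless composer
else
  echo "All dependency commands found."
fi
```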
Installing and running Elasticsearch in detail:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.2-amd64.deb.sha512
shasum -a 512 -c elasticsearch-7.10.2-amd64.deb.sha512
sudo dpkg -i elasticsearch-7.10.2-amd64.deb
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service
sudo systemctl start elasticsearch.service
sudo systemctl status elasticsearch.service
Check that Elasticsearch started successfully:
# Check command
curl -X GET "localhost:9200/?pretty"
# Reference output (name, UUID, and build fields will differ on your machine)
{
  "name" : "Cp9sag6",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "AT78_T_DTp-1qgIasfxtQqA",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "f2733455d",
    "build_date" : "2016-03-30T09:51:41.449Z",
    "build_snapshot" : false,
    "lucene_version" : "8.5.0",
    "minimum_wire_compatibility_version" : "1.3.3",
    "minimum_index_compatibility_version" : "1.3.3"
  },
  "tagline" : "You Know, for Search"
}
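To script the check above rather than eyeball the JSON, one option is to pull out the `number` field under `version` and compare it with the version you installed. A minimal sketch using only `sed`; `es_version_from_json` is a hypothetical helper, and `localhost:9200` is the address used above.

```shell
#!/bin/sh
# Hypothetical helper: read the root-endpoint JSON on stdin and
# print the value of the "number" field (the Elasticsearch version).
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

EXPECTED="7.10.2"

# -s: silent; --max-time: give up if the service is not answering.
actual=$(curl -s --max-time 5 "localhost:9200/" | es_version_from_json)

if [ "$actual" = "$EXPECTED" ]; then
  echo "Elasticsearch $actual is up."
else
  echo "Unexpected or missing version: '${actual}' (expected $EXPECTED)."
fi
```

A text match like this is only a rough check; a real JSON parser would be more robust if one is available.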
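Install the extensions
The extensions here are MediaWiki Extensions, all of which can be downloaded from the official MediaWiki website.
Installing these extensions via git is not recommended, because the master branch is very likely incompatible with the MediaWiki stable branch. Instead, download from the extension page the version that matches your MediaWiki stable branch. Upstream has most likely already sorted out compatibility there, so this is convenient and less error-prone.
In plain terms: download from the web page, then extract with tar. This gets you the matching compatible version.
Extraction command: tar -xzf Elastica-REL1_42-78f2f84.tar.gz -C /var/www/mediawiki/extensions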
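The `REL1_42` part of the tarball name is the MediaWiki release branch, so the right download can be worked out from your MediaWiki version. A minimal sketch: `rel_branch` is a hypothetical helper that maps a version string such as `1.42.3` to its branch name, and the version value used here is only an example.

```shell
#!/bin/sh
# Hypothetical helper: map a MediaWiki version (e.g. 1.42.3) to its
# release branch name (e.g. REL1_42) by keeping major and minor only.
rel_branch() {
  major=$(printf '%s' "$1" | cut -d. -f1)
  minor=$(printf '%s' "$1" | cut -d. -f2)
  printf 'REL%s_%s\n' "$major" "$minor"
}

# Example value; use the version shown on your wiki's Special:Version page.
MW_VERSION="1.42.3"
echo "Download the extension snapshots built for $(rel_branch "$MW_VERSION")"
```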
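Install Elastica
- Download the Elastica extension and extract it into the extensions/ folder.
- Optional: inside the Elastica folder, run
sudo composer install --no-dev --no-plugins --no-scripts
(Using sudo is not ideal, but I have not found a better way so far.)
- Add the following to LocalSettings.php:
wfLoadExtension( 'Elastica' );
If you downloaded the extension from the web page, you most likely do not need the optional composer step, because compatibility has already been taken care of for the MediaWiki stable release.
If you got the extension via git, you most likely do need the composer step, because the extension version, the master branch, and the MediaWiki stable release are quite likely mutually incompatible, and index generation will then fail to find the relevant APIs.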
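Install CirrusSearch
- Download the CirrusSearch extension and extract it into the extensions/ folder.
- Optional: inside the CirrusSearch folder, run
sudo composer install --no-dev --no-plugins --no-scripts
(Using sudo is not ideal, but I have not found a better way so far.)
- Add the following to LocalSettings.php:
wfLoadExtension( 'CirrusSearch' );
If you downloaded the extension from the web page, you most likely do not need the optional composer step, because compatibility has already been taken care of for the MediaWiki stable release.
If you got the extension via git, you most likely do need the composer step, because the extension version, the master branch, and the MediaWiki stable release are quite likely mutually incompatible, and index generation will then fail to find the relevant APIs.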
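Whether the optional composer step is needed can be guessed from the extracted folder itself. The sketch below assumes that a folder with a `composer.json` but no `vendor/autoload.php` still needs `composer install`. That heuristic, the `needs_composer` name, and the install path are my assumptions, not something the extensions document.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the extension folder declares
# composer dependencies that have not been installed yet.
needs_composer() {
  [ -f "$1/composer.json" ] && [ ! -f "$1/vendor/autoload.php" ]
}

# Assumed install path, taken from the tar command earlier.
EXT_DIR="/var/www/mediawiki/extensions"

for ext in Elastica CirrusSearch; do
  if needs_composer "$EXT_DIR/$ext"; then
    echo "$ext: run composer install --no-dev inside $EXT_DIR/$ext"
  else
    echo "$ext: no composer step needed (or folder not found)"
  fi
done
```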
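Composer examples
If you still insist on installing the two extensions above via git, here are two samples of what a successful composer run looks like.
# Sample composer output after installing from git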
Do not run Composer as root/super user! See https://getcomposer.org/root for details
Continue as root/super user [yes]?
No composer.lock file present. Updating dependencies to latest instead of installing from lock file. See https://getcomposer.org/install for more information.
Loading composer repositories with package information
Updating dependencies
Lock file operations: 45 installs, 0 updates, 0 removals
- Locking symfony/string (v7.2.0)
- Locking tysonandre/var_representation_polyfill (0.1.3)
- Locking webmozart/assert (1.11.0)
Writing lock file
Installing dependencies from lock file
Package operations: 9 installs, 0 updates, 0 removals
- Downloading nyholm/dsn (2.0.1)
- Downloading elasticsearch/elasticsearch (v7.17.1)
- Downloading ruflin/elastica (7.3.1)
- Installing react/promise (v3.2.0): Extracting archive
- Installing ezimuel/guzzlestreams (3.1.0): Extracting archive
9 package suggestions were added by new dependencies, use `composer suggest` to see details.
Generating autoload files
4 packages you are using are looking for funding.
Use the `composer fund` command to find out more!
# Sample composer output after installing from the web page
Do not run Composer as root/super user! See https://getcomposer.org/root for details
Continue as root/super user [yes]?
Installing dependencies from lock file
Verifying lock file contents can be installed on current platform.
Nothing to install, update or remove
Generating autoload files
4 packages you are using are looking for funding.
Use the `composer fund` command to find out more!
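Size comparison
This section has no tutorial value; it is only a comparison.
Install method | State | Elastica | CirrusSearch |
---|---|---|---|
Web download | Before extraction | 568KB | 13MB |
Web download | After extraction | 7.3MB | 92MB |
git | Before composer | 1.6MB | 78MB |
git | After composer | 8.3MB | 118MB |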
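Generate the index
This is really just configuring CirrusSearch.
Change the configuration
First, make sure Elasticsearch is installed and enabled at boot as described above. Make sure your LocalSettings.php contains these three lines; in effect you are only adding the third: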
wfLoadExtension( 'Elastica' );
wfLoadExtension( 'CirrusSearch' );
$wgDisableSearchUpdate = true;
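A quick way to confirm the three lines are present before moving on is to grep for them. A minimal sketch: `has_line` is a hypothetical helper doing a fixed-string match, and the LocalSettings.php path is an assumption based on the install path used earlier.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file contains the exact text.
# -F: treat the pattern as a fixed string; -q: no output, status only.
has_line() {
  grep -Fq -- "$2" "$1"
}

# Assumed location of LocalSettings.php.
SETTINGS="/var/www/mediawiki/LocalSettings.php"

if [ -r "$SETTINGS" ]; then
  for line in \
    "wfLoadExtension( 'Elastica' );" \
    "wfLoadExtension( 'CirrusSearch' );" \
    '$wgDisableSearchUpdate = true;'
  do
    if has_line "$SETTINGS" "$line"; then
      echo "found:   $line"
    else
      echo "missing: $line"
    fi
  done
fi
```

This matches text only, so it will not notice a line that is commented out or written with different spacing.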
Generate the index
Next, generate the Elasticsearch index:
sudo php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php
# Successful output looks like this
Updating cluster ...
indexing namespaces...
Indexing namespaces...done
content index...
Fetching Elasticsearch version...7.10.2...ok
Scanning available plugins...none
Validating mappings...
Validating mapping...ok
Validating aliases...
Validating some_your-wiki_content alias...ok
Validating some_your-wiki alias...ok
Updating tracking indexes...done
general index...
Fetching Elasticsearch version...7.10.2...ok
Scanning available plugins...none
Validating aliases...
Validating some_your-wiki_general alias...ok
Validating some_your-wiki alias...ok
Updating tracking indexes...done
Change the configuration again
Then, remove from LocalSettings.php the line you just added:
$wgDisableSearchUpdate = true;
Populate the index
Next, populate the index:
sudo php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipLinks --indexOnSkip
sudo php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/ForceSearchIndex.php --skipParse
# Successful output of the first command
[ some_your-wiki] Indexed 10 pages ending at 13 at 146/second
[ some_your-wiki] Indexed 10 pages ending at 25 at 235/second
[ some_your-wiki] Indexed 10 pages ending at 331 at 437/second
[ some_your-wiki] Indexed 10 pages ending at 341 at 440/second
[ some_your-wiki] Indexed 4 pages ending at 347 at 439/second
Indexed a total of 324 pages at 439/second
# Successful output of the second command
[ some_your-wiki] Indexed 10 pages ending at 25 at 235/second
[ some_your-wiki] Indexed 10 pages ending at 39 at 284/second
[ some_your-wiki] Indexed 10 pages ending at 331 at 437/second
[ some_your-wiki] Indexed 10 pages ending at 341 at 440/second
[ some_your-wiki] Indexed 4 pages ending at 347 at 439/second
Indexed a total of 324 pages at 439/second
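When running these commands from a script, the final summary line is a convenient thing to check. A minimal sketch, assuming the output format shown above: `indexed_total` is a hypothetical helper that extracts the page count from the `Indexed a total of ... pages` line.

```shell
#!/bin/sh
# Hypothetical helper: read ForceSearchIndex.php output on stdin and
# print the number from the "Indexed a total of N pages" summary line.
indexed_total() {
  sed -n 's/^Indexed a total of \([0-9][0-9]*\) pages.*/\1/p' | tail -n 1
}

# Usage sketch (commented out because it needs a working wiki):
#   total=$(sudo php "$MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/ForceSearchIndex.php" \
#             --skipLinks --indexOnSkip | indexed_total)
#   [ "${total:-0}" -gt 0 ] || echo "Nothing was indexed; check the steps above."

# Demonstration on two lines of the sample output from above.
printf '%s\n' \
  '[ some_your-wiki] Indexed 4 pages ending at 347 at 439/second' \
  'Indexed a total of 324 pages at 439/second' | indexed_total
```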
Change the configuration once more
Finally, add to LocalSettings.php:
$wgSearchType = 'CirrusSearch';
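With that, site search is working on your MediaWiki site. For example, searching for “清冽之泉” immediately returns every page whose text or title contains those four characters.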
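The result can also be checked from the command line through the MediaWiki Action API's `list=search` module. A minimal sketch: `API_URL` is left for you to set to your wiki's `api.php` address (the example URL in the comment is a placeholder), and `total_hits` is a hypothetical helper that pulls `totalhits` out of the JSON response with `sed` rather than a real JSON parser.

```shell
#!/bin/sh
# Hypothetical helper: read an Action API search response on stdin
# and print the value of "totalhits".
total_hits() {
  sed -n 's/.*"totalhits"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Set this before running, e.g. API_URL="https://wiki.example.org/api.php"
# (placeholder URL; the request is skipped while it is empty).
API_URL="${API_URL:-}"

if [ -n "$API_URL" ]; then
  # -G sends the --data-urlencode pairs as a URL-encoded GET query string.
  hits=$(curl -s --max-time 10 -G "$API_URL" \
    --data-urlencode "action=query" \
    --data-urlencode "list=search" \
    --data-urlencode "srsearch=清冽之泉" \
    --data-urlencode "format=json" | total_hits)
  echo "Total hits: ${hits:-unknown}"
fi
```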