AI mode documentation (#2620)

This commit is contained in:
Soxoj
2026-05-05 22:21:00 +02:00
committed by GitHub
parent 52c8917e2c
commit 79e93ab715
6 changed files with 173 additions and 2 deletions
+20
View File
@@ -69,6 +69,7 @@ See also: [Quick start](https://maigret.readthedocs.io/en/latest/quick-start.htm
- Fetches an [auto-updated site database](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update) from GitHub each run (once per 24 hours), and falls back to the built-in database if offline. - Fetches an [auto-updated site database](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update) from GitHub each run (once per 24 hours), and falls back to the built-in database if offline.
- Works with Tor and I2P websites; able to check domains. - Works with Tor and I2P websites; able to check domains.
- Ships with a [web interface](#web-interface) for browsing results as a graph and downloading reports in every format from a single page. - Ships with a [web interface](#web-interface) for browsing results as a graph and downloading reports in every format from a single page.
- Optional [AI analysis mode](#ai-analysis) (`--ai`) that turns raw findings into a short investigation summary using an OpenAI-compatible API.
For the complete feature list, see the [features documentation](https://maigret.readthedocs.io/en/latest/features.html). For the complete feature list, see the [features documentation](https://maigret.readthedocs.io/en/latest/features.html).
@@ -195,6 +196,9 @@ maigret user --tags us
# search for three usernames on all available sites # search for three usernames on all available sites
maigret user1 user2 user3 -a maigret user1 user2 user3 -a
# AI-assisted investigation summary (needs OPENAI_API_KEY)
maigret user --ai
``` ```
Run `maigret --help` for all options. Docs: [CLI options](https://maigret.readthedocs.io/en/latest/command-line-options.html), [more examples](https://maigret.readthedocs.io/en/latest/usage-examples.html). Running into 403s or timeouts? See [TROUBLESHOOTING.md](TROUBLESHOOTING.md). Run `maigret --help` for all options. Docs: [CLI options](https://maigret.readthedocs.io/en/latest/command-line-options.html), [more examples](https://maigret.readthedocs.io/en/latest/usage-examples.html). Running into 403s or timeouts? See [TROUBLESHOOTING.md](TROUBLESHOOTING.md).
@@ -230,6 +234,22 @@ See the full [library usage guide](https://maigret.readthedocs.io/en/latest/libr
- `--parse URL` — parse a profile page, extract IDs/usernames, and use them to kick off a recursive search. - `--parse URL` — parse a profile page, extract IDs/usernames, and use them to kick off a recursive search.
- `--permute` — generate likely username variants from two or more inputs (e.g. `john doe``johndoe`, `j.doe`, …) and search for all of them. - `--permute` — generate likely username variants from two or more inputs (e.g. `john doe``johndoe`, `j.doe`, …) and search for all of them.
- `--self-check [--auto-disable]` — verify `usernameClaimed` / `usernameUnclaimed` pairs against live sites for maintainers auditing the database. - `--self-check [--auto-disable]` — verify `usernameClaimed` / `usernameUnclaimed` pairs against live sites for maintainers auditing the database.
- `--ai` / `--ai-model` — run the [AI analysis](#ai-analysis) over the search results and stream a short investigation summary to the terminal.
<a id="ai-analysis"></a>
### AI analysis
`--ai` collects the search results, builds an internal Markdown report, and sends it to an OpenAI-compatible chat completion endpoint to produce a short, neutral investigation summary (likely real name, location, occupation, interests, languages, confidence, follow-up leads). Per-site progress is suppressed and the model's output is streamed to stdout.
```bash
export OPENAI_API_KEY=sk-...
maigret user --ai
# pick a different model
maigret user --ai --ai-model gpt-4o-mini
```
The key can also be set as `openai_api_key` in `settings.json`. The endpoint defaults to `https://api.openai.com/v1`, but `openai_api_base_url` in `settings.json` can point to any OpenAI-compatible API (Azure OpenAI, OpenRouter, a local server, …). See the [settings docs](https://maigret.readthedocs.io/en/latest/settings.html) for the full list of options.
### Tor / I2P / proxies ### Tor / I2P / proxies
+20
View File
@@ -70,6 +70,7 @@ maigret YOUR_USERNAME
- 每次运行时(每 24 小时一次)从 GitHub 拉取一份[自动更新的站点数据库](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update);离线时会回退到内置数据库。 - 每次运行时(每 24 小时一次)从 GitHub 拉取一份[自动更新的站点数据库](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update);离线时会回退到内置数据库。
- 可访问 Tor 与 I2P 站点;支持检查域名。 - 可访问 Tor 与 I2P 站点;支持检查域名。
- 自带一个 [Web 界面](#web-interface),可在同一页面将结果以图谱方式浏览,并下载各种格式的报告。 - 自带一个 [Web 界面](#web-interface),可在同一页面将结果以图谱方式浏览,并下载各种格式的报告。
- 可选的 [AI 分析模式](#ai-analysis)(`--ai`),通过 OpenAI 兼容 API 将原始搜索结果整理成一份简短的调查摘要。
完整特性列表请见[特性文档](https://maigret.readthedocs.io/en/latest/features.html)。 完整特性列表请见[特性文档](https://maigret.readthedocs.io/en/latest/features.html)。
@@ -199,6 +200,9 @@ maigret user --tags us
# 同时在所有站点上搜索三个用户名 # 同时在所有站点上搜索三个用户名
maigret user1 user2 user3 -a maigret user1 user2 user3 -a
# AI 辅助调查摘要(需要 OPENAI_API_KEY)
maigret user --ai
``` ```
完整选项请运行 `maigret --help`。文档:[命令行选项](https://maigret.readthedocs.io/en/latest/command-line-options.html)、[更多示例](https://maigret.readthedocs.io/en/latest/usage-examples.html)。遇到 403 或超时?参见 [TROUBLESHOOTING.md](TROUBLESHOOTING.md)。 完整选项请运行 `maigret --help`。文档:[命令行选项](https://maigret.readthedocs.io/en/latest/command-line-options.html)、[更多示例](https://maigret.readthedocs.io/en/latest/usage-examples.html)。遇到 403 或超时?参见 [TROUBLESHOOTING.md](TROUBLESHOOTING.md)。
@@ -234,6 +238,22 @@ maigret --web 5000
- `--parse URL` —— 解析一个个人主页,从中提取 ID/用户名,并以此为起点发起递归搜索。 - `--parse URL` —— 解析一个个人主页,从中提取 ID/用户名,并以此为起点发起递归搜索。
- `--permute` —— 基于两个或更多输入生成可能的用户名变体(例如 `john doe``johndoe``j.doe` …)并对其逐一搜索。 - `--permute` —— 基于两个或更多输入生成可能的用户名变体(例如 `john doe``johndoe``j.doe` …)并对其逐一搜索。
- `--self-check [--auto-disable]` —— 维护者用于核对数据库的工具:针对线上站点验证 `usernameClaimed` / `usernameUnclaimed` 配对是否仍然有效。 - `--self-check [--auto-disable]` —— 维护者用于核对数据库的工具:针对线上站点验证 `usernameClaimed` / `usernameUnclaimed` 配对是否仍然有效。
- `--ai` / `--ai-model` —— 启用 [AI 分析](#ai-analysis),将搜索结果交给 OpenAI 兼容 API,并把简短的调查摘要流式输出到终端。
<a id="ai-analysis"></a>
### AI 分析
`--ai` 会先收集搜索结果、在内存中构建 Markdown 报告,再将其发送到一个 OpenAI 兼容的 chat completion 接口,生成一份简短、克制的调查摘要(最可能的真实姓名、所在地、职业、兴趣、语言、置信度以及后续线索)。开启该模式后,逐站点的进度输出会被静默,模型的输出会以流式方式打印到 stdout。
```bash
export OPENAI_API_KEY=sk-...
maigret user --ai
# 切换到其它模型
maigret user --ai --ai-model gpt-4o-mini
```
API key 也可以写入 `settings.json``openai_api_key` 字段。接口地址默认为 `https://api.openai.com/v1`,通过在 `settings.json` 中设置 `openai_api_base_url`,可以指向任何 OpenAI 兼容的服务(Azure OpenAI、OpenRouter、本地推理服务等)。完整选项见[配置文档](https://maigret.readthedocs.io/en/latest/settings.html)。
### Tor / I2P / 代理 ### Tor / I2P / 代理
+56
View File
@@ -161,6 +161,14 @@ ndjson (one report per username). E.g. ``--json ndjson``
``-M``, ``--md`` - Generate a Markdown report (general report on all ``-M``, ``--md`` - Generate a Markdown report (general report on all
usernames). See :ref:`markdown-report` below. usernames). See :ref:`markdown-report` below.
``--ai`` - Run an AI-powered analysis of the search results using an
OpenAI-compatible chat completion API. The internal Markdown report is
sent to the model, which returns a short investigation summary that is
streamed to the terminal. See :ref:`ai-analysis` below.
``--ai-model`` - Model name to use with ``--ai``. Defaults to
``openai_model`` from settings (``gpt-4o`` out of the box).
``-fo``, ``--folderoutput`` - Results will be saved to this folder, ``-fo``, ``--folderoutput`` - Results will be saved to this folder,
``results`` by default. Will be created if doesnt exist. ``results`` by default. Will be created if doesnt exist.
@@ -242,3 +250,51 @@ The Markdown format is optimized for LLM context windows. You can feed the repor
The structured Markdown with per-site sections makes it easy for AI tools to extract relationships, cross-reference identities, and identify patterns across accounts. The structured Markdown with per-site sections makes it easy for AI tools to extract relationships, cross-reference identities, and identify patterns across accounts.
For a built-in alternative that calls the model for you and prints the
summary directly, see :ref:`ai-analysis` below.
.. _ai-analysis:
AI analysis (built-in)
----------------------
The ``--ai`` flag turns the search results into a short investigation
summary by sending the internal Markdown report to an OpenAI-compatible
chat completion API and streaming the model's reply to the terminal.
.. code-block:: console
export OPENAI_API_KEY=sk-...
maigret username --ai
# use a smaller / cheaper model
maigret username --ai --ai-model gpt-4o-mini
While ``--ai`` is active, per-site progress lines and the short text
report at the end are suppressed so the streamed summary is the main
output. The Markdown report itself is built in memory and is **not**
written to disk by ``--ai`` alone — combine with ``--md`` if you also
want the file on disk.
The summary follows a fixed format with sections for the most likely
real name, location, occupation, interests, languages, main website,
username variants, number of platforms, active years, a confidence
rating, and a short list of follow-up leads. The model is instructed
to rely only on what is supported by the report and to avoid mixing
clearly unrelated profiles into the main identity.
**Configuration.** The API key is resolved from
``settings.openai_api_key`` first, then from the ``OPENAI_API_KEY``
environment variable. The endpoint defaults to
``https://api.openai.com/v1`` and can be redirected to any
OpenAI-compatible service (Azure OpenAI, OpenRouter, a local server,
…) by setting ``openai_api_base_url`` in ``settings.json``. See
:ref:`settings` for the full list of options.
.. note::
``--ai`` makes a network request to the configured chat completion
endpoint and sends the full Markdown report (which contains the
gathered profile data). Use it only with providers and accounts
you trust with that data.
+27
View File
@@ -147,6 +147,33 @@ Also, there is a short text report in the CLI output after the end of a searchin
.. warning:: .. warning::
XMind 8 mindmaps are incompatible with XMind 2022! XMind 8 mindmaps are incompatible with XMind 2022!
AI analysis
-----------
Maigret can produce a short, human-readable investigation summary on top
of the raw search results using the ``--ai`` flag. It builds the
internal Markdown report, sends it to an OpenAI-compatible chat
completion endpoint, and streams the model's reply directly to the
terminal.
.. code-block:: console
export OPENAI_API_KEY=sk-...
maigret username --ai
The summary uses a fixed format with the most likely real name,
location, occupation, interests, languages, main website, username
variants, number of platforms, active years, a confidence rating, and a
short list of follow-up leads. While ``--ai`` is active, per-site
progress and the short text report are suppressed so the streamed
summary is the main output.
The endpoint, model, and API key are configured via ``settings.json``
(``openai_api_key``, ``openai_model``, ``openai_api_base_url``) or the
``OPENAI_API_KEY`` environment variable. Any OpenAI-compatible API can
be used (Azure OpenAI, OpenRouter, a local server, …). See
:ref:`ai-analysis` and :ref:`settings` for details.
Tags Tags
---- ----
+48
View File
@@ -101,3 +101,51 @@ This is recommended for **Docker containers**, **CI pipelines**, and **air-gappe
- URL of the metadata file (for custom mirrors) - URL of the metadata file (for custom mirrors)
**Using a custom database** with ``--db`` always skips auto-update — you are explicitly choosing your data source. **Using a custom database** with ``--db`` always skips auto-update — you are explicitly choosing your data source.
.. _ai-analysis-settings:
AI analysis
-----------
The ``--ai`` flag (see :ref:`ai-analysis`) talks to an OpenAI-compatible
chat completion API. Three settings control how that request is made:
.. list-table::
:header-rows: 1
:widths: 35 25 40
* - Setting
- Default
- Description
* - ``openai_api_key``
- ``""`` (empty)
- API key. If empty, Maigret falls back to the ``OPENAI_API_KEY``
environment variable.
* - ``openai_model``
- ``gpt-4o``
- Default model name. Overridable per-run with ``--ai-model``.
* - ``openai_api_base_url``
- ``https://api.openai.com/v1``
- Base URL of the chat completion API. Point this at any
OpenAI-compatible service (Azure OpenAI, OpenRouter, a local
server, …) to use it instead of OpenAI directly.
Example ``~/.maigret/settings.json`` snippet using a non-OpenAI
endpoint:
.. code-block:: json
{
"openai_api_key": "sk-...",
"openai_model": "gpt-4o-mini",
"openai_api_base_url": "https://openrouter.ai/api/v1"
}
The key resolution order is ``settings.openai_api_key````OPENAI_API_KEY``
environment variable; the first non-empty value wins.
.. note::
``--ai`` sends the full internal Markdown report (which contains the
gathered profile data) to the configured endpoint. Only use providers
and accounts you trust with that data.
+2 -2
View File
@@ -1,7 +1,7 @@
{ {
"version": 1, "version": 1,
"updated_at": "2026-05-05T17:17:59Z", "updated_at": "2026-05-05T20:17:24Z",
"sites_count": 3155, "sites_count": 3154,
"min_maigret_version": "0.6.0", "min_maigret_version": "0.6.0",
"data_sha256": "acf9d9fef8412bf05fa09d50c1ae363e5c8394597b1aaa3f98a9a1c4e31ca356", "data_sha256": "acf9d9fef8412bf05fa09d50c1ae363e5c8394597b1aaa3f98a9a1c4e31ca356",
"data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json" "data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json"