From 79e93ab715e4660429cf1f556b7697c2ec846ab4 Mon Sep 17 00:00:00 2001
From: Soxoj <31013580+soxoj@users.noreply.github.com>
Date: Tue, 5 May 2026 22:21:00 +0200
Subject: [PATCH] AI mode documentation (#2620)
---
README.md | 20 ++++++++++
README.zh-CN.md | 20 ++++++++++
docs/source/command-line-options.rst | 56 ++++++++++++++++++++++++++++
docs/source/features.rst | 27 ++++++++++++++
docs/source/settings.rst | 48 ++++++++++++++++++++++++
maigret/resources/db_meta.json | 4 +-
6 files changed, 173 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index fff26d2..aea33c0 100644
--- a/README.md
+++ b/README.md
@@ -69,6 +69,7 @@ See also: [Quick start](https://maigret.readthedocs.io/en/latest/quick-start.htm
- Fetches an [auto-updated site database](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update) from GitHub each run (once per 24 hours), and falls back to the built-in database if offline.
- Works with Tor and I2P websites; able to check domains.
- Ships with a [web interface](#web-interface) for browsing results as a graph and downloading reports in every format from a single page.
+- Optional [AI analysis mode](#ai-analysis) (`--ai`) that turns raw findings into a short investigation summary using an OpenAI-compatible API.
For the complete feature list, see the [features documentation](https://maigret.readthedocs.io/en/latest/features.html).
@@ -195,6 +196,9 @@ maigret user --tags us
# search for three usernames on all available sites
maigret user1 user2 user3 -a
+
+# AI-assisted investigation summary (needs OPENAI_API_KEY)
+maigret user --ai
```
Run `maigret --help` for all options. Docs: [CLI options](https://maigret.readthedocs.io/en/latest/command-line-options.html), [more examples](https://maigret.readthedocs.io/en/latest/usage-examples.html). Running into 403s or timeouts? See [TROUBLESHOOTING.md](TROUBLESHOOTING.md).
@@ -230,6 +234,22 @@ See the full [library usage guide](https://maigret.readthedocs.io/en/latest/libr
- `--parse URL` — parse a profile page, extract IDs/usernames, and use them to kick off a recursive search.
- `--permute` — generate likely username variants from two or more inputs (e.g. `john doe` → `johndoe`, `j.doe`, …) and search for all of them.
- `--self-check [--auto-disable]` — verify `usernameClaimed` / `usernameUnclaimed` pairs against live sites for maintainers auditing the database.
+- `--ai` / `--ai-model` — run the [AI analysis](#ai-analysis) over the search results and stream a short investigation summary to the terminal.
+
+
+### AI analysis
+
+`--ai` collects the search results, builds an internal Markdown report, and sends it to an OpenAI-compatible chat completion endpoint to produce a short, neutral investigation summary (likely real name, location, occupation, interests, languages, confidence, follow-up leads). Per-site progress is suppressed and the model's output is streamed to stdout.
+
+```bash
+export OPENAI_API_KEY=sk-...
+maigret user --ai
+
+# pick a different model
+maigret user --ai --ai-model gpt-4o-mini
+```
+
+The key can also be set as `openai_api_key` in `settings.json`. The endpoint defaults to `https://api.openai.com/v1`, but `openai_api_base_url` in `settings.json` can point to any OpenAI-compatible API (Azure OpenAI, OpenRouter, a local server, …). See the [settings docs](https://maigret.readthedocs.io/en/latest/settings.html) for the full list of options.
### Tor / I2P / proxies
diff --git a/README.zh-CN.md b/README.zh-CN.md
index fc1e0d2..f2367f0 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -70,6 +70,7 @@ maigret YOUR_USERNAME
- 每次运行时(每 24 小时一次)从 GitHub 拉取一份[自动更新的站点数据库](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update);离线时会回退到内置数据库。
- 可访问 Tor 与 I2P 站点;支持检查域名。
- 自带一个 [Web 界面](#web-interface),可在同一页面将结果以图谱方式浏览,并下载各种格式的报告。
+- 可选的 [AI 分析模式](#ai-analysis)(`--ai`),通过 OpenAI 兼容 API 将原始搜索结果整理成一份简短的调查摘要。
完整特性列表请见[特性文档](https://maigret.readthedocs.io/en/latest/features.html)。
@@ -199,6 +200,9 @@ maigret user --tags us
# 同时在所有站点上搜索三个用户名
maigret user1 user2 user3 -a
+
+# AI 辅助调查摘要(需要 OPENAI_API_KEY)
+maigret user --ai
```
完整选项请运行 `maigret --help`。文档:[命令行选项](https://maigret.readthedocs.io/en/latest/command-line-options.html)、[更多示例](https://maigret.readthedocs.io/en/latest/usage-examples.html)。遇到 403 或超时?参见 [TROUBLESHOOTING.md](TROUBLESHOOTING.md)。
@@ -234,6 +238,22 @@ maigret --web 5000
- `--parse URL` —— 解析一个个人主页,从中提取 ID/用户名,并以此为起点发起递归搜索。
- `--permute` —— 基于两个或更多输入生成可能的用户名变体(例如 `john doe` → `johndoe`、`j.doe` …)并对其逐一搜索。
- `--self-check [--auto-disable]` —— 维护者用于核对数据库的工具:针对线上站点验证 `usernameClaimed` / `usernameUnclaimed` 配对是否仍然有效。
+- `--ai` / `--ai-model` —— 启用 [AI 分析](#ai-analysis),将搜索结果交给 OpenAI 兼容 API,并把简短的调查摘要流式输出到终端。
+
+
+### AI 分析
+
+`--ai` 会先收集搜索结果、在内存中构建 Markdown 报告,再将其发送到一个 OpenAI 兼容的 chat completion 接口,生成一份简短、克制的调查摘要(最可能的真实姓名、所在地、职业、兴趣、语言、置信度以及后续线索)。开启该模式后,逐站点的进度输出会被静默,模型的输出会以流式方式打印到 stdout。
+
+```bash
+export OPENAI_API_KEY=sk-...
+maigret user --ai
+
+# 切换到其它模型
+maigret user --ai --ai-model gpt-4o-mini
+```
+
+API key 也可以写入 `settings.json` 的 `openai_api_key` 字段。接口地址默认为 `https://api.openai.com/v1`,通过在 `settings.json` 中设置 `openai_api_base_url`,可以指向任何 OpenAI 兼容的服务(Azure OpenAI、OpenRouter、本地推理服务等)。完整选项见[配置文档](https://maigret.readthedocs.io/en/latest/settings.html)。
### Tor / I2P / 代理
diff --git a/docs/source/command-line-options.rst b/docs/source/command-line-options.rst
index b111d91..ef8a9b3 100644
--- a/docs/source/command-line-options.rst
+++ b/docs/source/command-line-options.rst
@@ -161,6 +161,14 @@ ndjson (one report per username). E.g. ``--json ndjson``
``-M``, ``--md`` - Generate a Markdown report (general report on all
usernames). See :ref:`markdown-report` below.
+``--ai`` - Run an AI-powered analysis of the search results using an
+OpenAI-compatible chat completion API. The internal Markdown report is
+sent to the model, which returns a short investigation summary that is
+streamed to the terminal. See :ref:`ai-analysis` below.
+
+``--ai-model`` - Model name to use with ``--ai``. Defaults to
+``openai_model`` from settings (``gpt-4o`` out of the box).
+
``-fo``, ``--folderoutput`` - Results will be saved to this folder,
``results`` by default. Will be created if doesn’t exist.
@@ -242,3 +250,51 @@ The Markdown format is optimized for LLM context windows. You can feed the repor
The structured Markdown with per-site sections makes it easy for AI tools to extract relationships, cross-reference identities, and identify patterns across accounts.
+For a built-in alternative that calls the model for you and prints the
+summary directly, see :ref:`ai-analysis` below.
+
+.. _ai-analysis:
+
+AI analysis (built-in)
+----------------------
+
+The ``--ai`` flag turns the search results into a short investigation
+summary by sending the internal Markdown report to an OpenAI-compatible
+chat completion API and streaming the model's reply to the terminal.
+
+.. code-block:: console
+
+ export OPENAI_API_KEY=sk-...
+ maigret username --ai
+
+ # use a smaller / cheaper model
+ maigret username --ai --ai-model gpt-4o-mini
+
+While ``--ai`` is active, per-site progress lines and the short text
+report at the end are suppressed so the streamed summary is the main
+output. The Markdown report itself is built in memory and is **not**
+written to disk by ``--ai`` alone — combine with ``--md`` if you also
+want the file on disk.
+
+The summary follows a fixed format with sections for the most likely
+real name, location, occupation, interests, languages, main website,
+username variants, number of platforms, active years, a confidence
+rating, and a short list of follow-up leads. The model is instructed
+to rely only on what is supported by the report and to avoid mixing
+clearly unrelated profiles into the main identity.
+
+**Configuration.** The API key is resolved from
+``settings.openai_api_key`` first, then from the ``OPENAI_API_KEY``
+environment variable. The endpoint defaults to
+``https://api.openai.com/v1`` and can be redirected to any
+OpenAI-compatible service (Azure OpenAI, OpenRouter, a local server,
+…) by setting ``openai_api_base_url`` in ``settings.json``. See
+:ref:`settings` for the full list of options.
+
+.. note::
+
+ ``--ai`` makes a network request to the configured chat completion
+ endpoint and sends the full Markdown report (which contains the
+ gathered profile data). Use it only with providers and accounts
+ you trust with that data.
+
diff --git a/docs/source/features.rst b/docs/source/features.rst
index 00e3c45..2fa7387 100644
--- a/docs/source/features.rst
+++ b/docs/source/features.rst
@@ -147,6 +147,33 @@ Also, there is a short text report in the CLI output after the end of a searchin
.. warning::
XMind 8 mindmaps are incompatible with XMind 2022!
+AI analysis
+-----------
+
+Maigret can produce a short, human-readable investigation summary on top
+of the raw search results using the ``--ai`` flag. It builds the
+internal Markdown report, sends it to an OpenAI-compatible chat
+completion endpoint, and streams the model's reply directly to the
+terminal.
+
+.. code-block:: console
+
+ export OPENAI_API_KEY=sk-...
+ maigret username --ai
+
+The summary uses a fixed format with the most likely real name,
+location, occupation, interests, languages, main website, username
+variants, number of platforms, active years, a confidence rating, and a
+short list of follow-up leads. While ``--ai`` is active, per-site
+progress and the short text report are suppressed so the streamed
+summary is the main output.
+
+The endpoint, model, and API key are configured via ``settings.json``
+(``openai_api_key``, ``openai_model``, ``openai_api_base_url``) or the
+``OPENAI_API_KEY`` environment variable. Any OpenAI-compatible API can
+be used (Azure OpenAI, OpenRouter, a local server, …). See
+:ref:`ai-analysis` and :ref:`settings` for details.
+
Tags
----
diff --git a/docs/source/settings.rst b/docs/source/settings.rst
index dc36000..09449f0 100644
--- a/docs/source/settings.rst
+++ b/docs/source/settings.rst
@@ -101,3 +101,51 @@ This is recommended for **Docker containers**, **CI pipelines**, and **air-gappe
- URL of the metadata file (for custom mirrors)
**Using a custom database** with ``--db`` always skips auto-update — you are explicitly choosing your data source.
+
+.. _ai-analysis-settings:
+
+AI analysis
+-----------
+
+The ``--ai`` flag (see :ref:`ai-analysis`) talks to an OpenAI-compatible
+chat completion API. Three settings control how that request is made:
+
+.. list-table::
+ :header-rows: 1
+ :widths: 35 25 40
+
+ * - Setting
+ - Default
+ - Description
+ * - ``openai_api_key``
+ - ``""`` (empty)
+ - API key. If empty, Maigret falls back to the ``OPENAI_API_KEY``
+ environment variable.
+ * - ``openai_model``
+ - ``gpt-4o``
+ - Default model name. Overridable per-run with ``--ai-model``.
+ * - ``openai_api_base_url``
+ - ``https://api.openai.com/v1``
+ - Base URL of the chat completion API. Point this at any
+ OpenAI-compatible service (Azure OpenAI, OpenRouter, a local
+ server, …) to use it instead of OpenAI directly.
+
+Example ``~/.maigret/settings.json`` snippet using a non-OpenAI
+endpoint:
+
+.. code-block:: json
+
+ {
+ "openai_api_key": "sk-...",
+ "openai_model": "gpt-4o-mini",
+ "openai_api_base_url": "https://openrouter.ai/api/v1"
+ }
+
+The key resolution order is ``settings.openai_api_key`` → ``OPENAI_API_KEY``
+environment variable; the first non-empty value wins.
+
+.. note::
+
+ ``--ai`` sends the full internal Markdown report (which contains the
+ gathered profile data) to the configured endpoint. Only use providers
+ and accounts you trust with that data.
diff --git a/maigret/resources/db_meta.json b/maigret/resources/db_meta.json
index d563902..e372d27 100644
--- a/maigret/resources/db_meta.json
+++ b/maigret/resources/db_meta.json
@@ -1,7 +1,7 @@
{
"version": 1,
- "updated_at": "2026-05-05T17:17:59Z",
- "sites_count": 3155,
+ "updated_at": "2026-05-05T20:17:24Z",
+ "sites_count": 3154,
"min_maigret_version": "0.6.0",
"data_sha256": "acf9d9fef8412bf05fa09d50c1ae363e5c8394597b1aaa3f98a9a1c4e31ca356",
"data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json"