mirror of
https://github.com/soxoj/maigret.git
synced 2026-05-07 06:24:35 +00:00
AI mode documentation (#2620)
This commit is contained in:
@@ -69,6 +69,7 @@ See also: [Quick start](https://maigret.readthedocs.io/en/latest/quick-start.htm
|
||||
- Fetches an [auto-updated site database](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update) from GitHub each run (once per 24 hours), and falls back to the built-in database if offline.
|
||||
- Works with Tor and I2P websites; able to check domains.
|
||||
- Ships with a [web interface](#web-interface) for browsing results as a graph and downloading reports in every format from a single page.
|
||||
- Optional [AI analysis mode](#ai-analysis) (`--ai`) that turns raw findings into a short investigation summary using an OpenAI-compatible API.
|
||||
|
||||
For the complete feature list, see the [features documentation](https://maigret.readthedocs.io/en/latest/features.html).
|
||||
|
||||
@@ -195,6 +196,9 @@ maigret user --tags us
|
||||
|
||||
# search for three usernames on all available sites
|
||||
maigret user1 user2 user3 -a
|
||||
|
||||
# AI-assisted investigation summary (needs OPENAI_API_KEY)
|
||||
maigret user --ai
|
||||
```
|
||||
|
||||
Run `maigret --help` for all options. Docs: [CLI options](https://maigret.readthedocs.io/en/latest/command-line-options.html), [more examples](https://maigret.readthedocs.io/en/latest/usage-examples.html). Running into 403s or timeouts? See [TROUBLESHOOTING.md](TROUBLESHOOTING.md).
|
||||
@@ -230,6 +234,22 @@ See the full [library usage guide](https://maigret.readthedocs.io/en/latest/libr
|
||||
- `--parse URL` — parse a profile page, extract IDs/usernames, and use them to kick off a recursive search.
|
||||
- `--permute` — generate likely username variants from two or more inputs (e.g. `john doe` → `johndoe`, `j.doe`, …) and search for all of them.
|
||||
- `--self-check [--auto-disable]` — verify `usernameClaimed` / `usernameUnclaimed` pairs against live sites for maintainers auditing the database.
|
||||
- `--ai` / `--ai-model` — run the [AI analysis](#ai-analysis) over the search results and stream a short investigation summary to the terminal.
|
||||
|
||||
<a id="ai-analysis"></a>
|
||||
### AI analysis
|
||||
|
||||
`--ai` collects the search results, builds an internal Markdown report, and sends it to an OpenAI-compatible chat completion endpoint to produce a short, neutral investigation summary (likely real name, location, occupation, interests, languages, confidence, follow-up leads). Per-site progress is suppressed and the model's output is streamed to stdout.
|
||||
|
||||
```bash
|
||||
export OPENAI_API_KEY=sk-...
|
||||
maigret user --ai
|
||||
|
||||
# pick a different model
|
||||
maigret user --ai --ai-model gpt-4o-mini
|
||||
```
|
||||
|
||||
The key can also be set as `openai_api_key` in `settings.json`. The endpoint defaults to `https://api.openai.com/v1`, but `openai_api_base_url` in `settings.json` can point to any OpenAI-compatible API (Azure OpenAI, OpenRouter, a local server, …). See the [settings docs](https://maigret.readthedocs.io/en/latest/settings.html) for the full list of options.
|
||||
|
||||
### Tor / I2P / proxies
|
||||
|
||||
|
||||
@@ -70,6 +70,7 @@ maigret YOUR_USERNAME
|
||||
- 每次运行时(每 24 小时一次)从 GitHub 拉取一份[自动更新的站点数据库](https://maigret.readthedocs.io/en/latest/settings.html#database-auto-update);离线时会回退到内置数据库。
|
||||
- 可访问 Tor 与 I2P 站点;支持检查域名。
|
||||
- 自带一个 [Web 界面](#web-interface),可在同一页面将结果以图谱方式浏览,并下载各种格式的报告。
|
||||
- 可选的 [AI 分析模式](#ai-analysis)(`--ai`),通过 OpenAI 兼容 API 将原始搜索结果整理成一份简短的调查摘要。
|
||||
|
||||
完整特性列表请见[特性文档](https://maigret.readthedocs.io/en/latest/features.html)。
|
||||
|
||||
@@ -199,6 +200,9 @@ maigret user --tags us
|
||||
|
||||
# 同时在所有站点上搜索三个用户名
|
||||
maigret user1 user2 user3 -a
|
||||
|
||||
# AI 辅助调查摘要(需要 OPENAI_API_KEY)
|
||||
maigret user --ai
|
||||
```
|
||||
|
||||
完整选项请运行 `maigret --help`。文档:[命令行选项](https://maigret.readthedocs.io/en/latest/command-line-options.html)、[更多示例](https://maigret.readthedocs.io/en/latest/usage-examples.html)。遇到 403 或超时?参见 [TROUBLESHOOTING.md](TROUBLESHOOTING.md)。
|
||||
@@ -234,6 +238,22 @@ maigret --web 5000
|
||||
- `--parse URL` —— 解析一个个人主页,从中提取 ID/用户名,并以此为起点发起递归搜索。
|
||||
- `--permute` —— 基于两个或更多输入生成可能的用户名变体(例如 `john doe` → `johndoe`、`j.doe` …)并对其逐一搜索。
|
||||
- `--self-check [--auto-disable]` —— 维护者用于核对数据库的工具:针对线上站点验证 `usernameClaimed` / `usernameUnclaimed` 配对是否仍然有效。
|
||||
- `--ai` / `--ai-model` —— 启用 [AI 分析](#ai-analysis),将搜索结果交给 OpenAI 兼容 API,并把简短的调查摘要流式输出到终端。
|
||||
|
||||
<a id="ai-analysis"></a>
|
||||
### AI 分析
|
||||
|
||||
`--ai` 会先收集搜索结果、在内存中构建 Markdown 报告,再将其发送到一个 OpenAI 兼容的 chat completion 接口,生成一份简短、克制的调查摘要(最可能的真实姓名、所在地、职业、兴趣、语言、置信度以及后续线索)。开启该模式后,逐站点的进度输出会被静默,模型的输出会以流式方式打印到 stdout。
|
||||
|
||||
```bash
|
||||
export OPENAI_API_KEY=sk-...
|
||||
maigret user --ai
|
||||
|
||||
# 切换到其它模型
|
||||
maigret user --ai --ai-model gpt-4o-mini
|
||||
```
|
||||
|
||||
API key 也可以写入 `settings.json` 的 `openai_api_key` 字段。接口地址默认为 `https://api.openai.com/v1`,通过在 `settings.json` 中设置 `openai_api_base_url`,可以指向任何 OpenAI 兼容的服务(Azure OpenAI、OpenRouter、本地推理服务等)。完整选项见[配置文档](https://maigret.readthedocs.io/en/latest/settings.html)。
|
||||
|
||||
### Tor / I2P / 代理
|
||||
|
||||
|
||||
@@ -161,6 +161,14 @@ ndjson (one report per username). E.g. ``--json ndjson``
|
||||
``-M``, ``--md`` - Generate a Markdown report (general report on all
|
||||
usernames). See :ref:`markdown-report` below.
|
||||
|
||||
``--ai`` - Run an AI-powered analysis of the search results using an
|
||||
OpenAI-compatible chat completion API. The internal Markdown report is
|
||||
sent to the model, which returns a short investigation summary that is
|
||||
streamed to the terminal. See :ref:`ai-analysis` below.
|
||||
|
||||
``--ai-model`` - Model name to use with ``--ai``. Defaults to
|
||||
``openai_model`` from settings (``gpt-4o`` out of the box).
|
||||
|
||||
``-fo``, ``--folderoutput`` - Results will be saved to this folder,
|
||||
``results`` by default. Will be created if doesn’t exist.
|
||||
|
||||
@@ -242,3 +250,51 @@ The Markdown format is optimized for LLM context windows. You can feed the repor
|
||||
|
||||
The structured Markdown with per-site sections makes it easy for AI tools to extract relationships, cross-reference identities, and identify patterns across accounts.
|
||||
|
||||
For a built-in alternative that calls the model for you and prints the
|
||||
summary directly, see :ref:`ai-analysis` below.
|
||||
|
||||
.. _ai-analysis:
|
||||
|
||||
AI analysis (built-in)
|
||||
----------------------
|
||||
|
||||
The ``--ai`` flag turns the search results into a short investigation
|
||||
summary by sending the internal Markdown report to an OpenAI-compatible
|
||||
chat completion API and streaming the model's reply to the terminal.
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
export OPENAI_API_KEY=sk-...
|
||||
maigret username --ai
|
||||
|
||||
# use a smaller / cheaper model
|
||||
maigret username --ai --ai-model gpt-4o-mini
|
||||
|
||||
While ``--ai`` is active, per-site progress lines and the short text
|
||||
report at the end are suppressed so the streamed summary is the main
|
||||
output. The Markdown report itself is built in memory and is **not**
|
||||
written to disk by ``--ai`` alone — combine with ``--md`` if you also
|
||||
want the file on disk.
|
||||
|
||||
The summary follows a fixed format with sections for the most likely
|
||||
real name, location, occupation, interests, languages, main website,
|
||||
username variants, number of platforms, active years, a confidence
|
||||
rating, and a short list of follow-up leads. The model is instructed
|
||||
to rely only on what is supported by the report and to avoid mixing
|
||||
clearly unrelated profiles into the main identity.
|
||||
|
||||
**Configuration.** The API key is resolved from
|
||||
``settings.openai_api_key`` first, then from the ``OPENAI_API_KEY``
|
||||
environment variable. The endpoint defaults to
|
||||
``https://api.openai.com/v1`` and can be redirected to any
|
||||
OpenAI-compatible service (Azure OpenAI, OpenRouter, a local server,
|
||||
…) by setting ``openai_api_base_url`` in ``settings.json``. See
|
||||
:ref:`settings` for the full list of options.
|
||||
|
||||
.. note::
|
||||
|
||||
``--ai`` makes a network request to the configured chat completion
|
||||
endpoint and sends the full Markdown report (which contains the
|
||||
gathered profile data). Use it only with providers and accounts
|
||||
you trust with that data.
|
||||
|
||||
|
||||
@@ -147,6 +147,33 @@ Also, there is a short text report in the CLI output after the end of a searchin
|
||||
.. warning::
|
||||
XMind 8 mindmaps are incompatible with XMind 2022!
|
||||
|
||||
AI analysis
|
||||
-----------
|
||||
|
||||
Maigret can produce a short, human-readable investigation summary on top
|
||||
of the raw search results using the ``--ai`` flag. It builds the
|
||||
internal Markdown report, sends it to an OpenAI-compatible chat
|
||||
completion endpoint, and streams the model's reply directly to the
|
||||
terminal.
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
export OPENAI_API_KEY=sk-...
|
||||
maigret username --ai
|
||||
|
||||
The summary uses a fixed format with the most likely real name,
|
||||
location, occupation, interests, languages, main website, username
|
||||
variants, number of platforms, active years, a confidence rating, and a
|
||||
short list of follow-up leads. While ``--ai`` is active, per-site
|
||||
progress and the short text report are suppressed so the streamed
|
||||
summary is the main output.
|
||||
|
||||
The endpoint, model, and API key are configured via ``settings.json``
|
||||
(``openai_api_key``, ``openai_model``, ``openai_api_base_url``) or the
|
||||
``OPENAI_API_KEY`` environment variable. Any OpenAI-compatible API can
|
||||
be used (Azure OpenAI, OpenRouter, a local server, …). See
|
||||
:ref:`ai-analysis` and :ref:`settings` for details.
|
||||
|
||||
Tags
|
||||
----
|
||||
|
||||
|
||||
@@ -101,3 +101,51 @@ This is recommended for **Docker containers**, **CI pipelines**, and **air-gappe
|
||||
- URL of the metadata file (for custom mirrors)
|
||||
|
||||
**Using a custom database** with ``--db`` always skips auto-update — you are explicitly choosing your data source.
|
||||
|
||||
.. _ai-analysis-settings:
|
||||
|
||||
AI analysis
|
||||
-----------
|
||||
|
||||
The ``--ai`` flag (see :ref:`ai-analysis`) talks to an OpenAI-compatible
|
||||
chat completion API. Three settings control how that request is made:
|
||||
|
||||
.. list-table::
|
||||
:header-rows: 1
|
||||
:widths: 35 25 40
|
||||
|
||||
* - Setting
|
||||
- Default
|
||||
- Description
|
||||
* - ``openai_api_key``
|
||||
- ``""`` (empty)
|
||||
- API key. If empty, Maigret falls back to the ``OPENAI_API_KEY``
|
||||
environment variable.
|
||||
* - ``openai_model``
|
||||
- ``gpt-4o``
|
||||
- Default model name. Overridable per-run with ``--ai-model``.
|
||||
* - ``openai_api_base_url``
|
||||
- ``https://api.openai.com/v1``
|
||||
- Base URL of the chat completion API. Point this at any
|
||||
OpenAI-compatible service (Azure OpenAI, OpenRouter, a local
|
||||
server, …) to use it instead of OpenAI directly.
|
||||
|
||||
Example ``~/.maigret/settings.json`` snippet using a non-OpenAI
|
||||
endpoint:
|
||||
|
||||
.. code-block:: json
|
||||
|
||||
{
|
||||
"openai_api_key": "sk-...",
|
||||
"openai_model": "gpt-4o-mini",
|
||||
"openai_api_base_url": "https://openrouter.ai/api/v1"
|
||||
}
|
||||
|
||||
The key resolution order is ``settings.openai_api_key`` → ``OPENAI_API_KEY``
|
||||
environment variable; the first non-empty value wins.
|
||||
|
||||
.. note::
|
||||
|
||||
``--ai`` sends the full internal Markdown report (which contains the
|
||||
gathered profile data) to the configured endpoint. Only use providers
|
||||
and accounts you trust with that data.
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
{
|
||||
"version": 1,
|
||||
"updated_at": "2026-05-05T17:17:59Z",
|
||||
"sites_count": 3155,
|
||||
"updated_at": "2026-05-05T20:17:24Z",
|
||||
"sites_count": 3154,
|
||||
"min_maigret_version": "0.6.0",
|
||||
"data_sha256": "acf9d9fef8412bf05fa09d50c1ae363e5c8394597b1aaa3f98a9a1c4e31ca356",
|
||||
"data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json"
|
||||
|
||||
Reference in New Issue
Block a user