Update of Readme and documentation (#2514)

* Big readme update

* Readme and documentation update

* Readme structure update

* Small fixes

* Changelog update
Soxoj
2026-04-17 17:42:36 +02:00
committed by GitHub
parent f74f82ee13
commit 37ce4fe728
11 changed files with 589 additions and 97 deletions
+1 -1
@@ -96,7 +96,7 @@ You should make your git commits from your maigret git repo folder, or else the
If you already know which site has a false-positive and want to fix it specifically, go to the next step.
Otherwise, simply run a search with a random username (e.g. `laiuhi3h4gi3u4hgt`) and check the results.
Alternatively, you can use `the Telegram bot <https://t.me/osint_maigret_bot>`_.
Alternatively, you can use `the Telegram bot <https://t.me/maigret_search_bot>`_.
2. Open the account link in your browser and check:
+2
@@ -29,6 +29,7 @@ You may be interested in:
- :doc:`Usage examples <usage-examples>`
- :doc:`Command line options <command-line-options>`
- :doc:`Features list <features>`
- :doc:`Library usage <library-usage>`
.. toctree::
:hidden:
@@ -39,6 +40,7 @@ You may be interested in:
usage-examples
command-line-options
features
library-usage
philosophy
supported-identifier-types
tags
+2 -3
@@ -4,7 +4,7 @@ Installation
============
Maigret can be installed using pip or Docker, or launched directly from a cloned repository.
Also, it is available online via `official Telegram bot <https://t.me/osint_maigret_bot>`_,
It is also available online via the `official Telegram bot <https://t.me/maigret_search_bot>`_;
the bot's source code is `available on GitHub <https://github.com/soxoj/maigret-tg-bot>`_.
Windows Standalone EXE-binaries
@@ -45,8 +45,7 @@ Press one of the buttons below and follow the instructions to launch it in your
Local installation from PyPI
----------------------------
Please note that the sites database in the PyPI package may be outdated.
If you encounter frequent false positive results, we recommend installing the latest development version from GitHub instead.
Maigret ships with a bundled site database. After installation from PyPI (or any other method), it can **automatically fetch a newer compatible database from GitHub** when you run it—see :ref:`database-auto-update` in :doc:`settings`.
.. note::
Python 3.10 or higher and pip are required; **Python 3.11 is recommended.**
+139
@@ -0,0 +1,139 @@
.. _library-usage:
Library usage
=============
Maigret's CLI is a thin wrapper around an async Python API. You can embed Maigret in your own tools, pipelines, and OSINT workflows — no need to shell out.
This page covers the common patterns. For the full argument list of the underlying function, see ``maigret.checking.maigret`` in the source.
Installation
------------
.. code-block:: bash
pip install maigret
Minimal example
---------------
A working end-to-end search against the top 500 sites:
.. code-block:: python

   import asyncio
   import logging

   from maigret import search as maigret_search
   from maigret.sites import MaigretDatabase

   # Load the bundled site database.
   # The path is relative to the current working directory, assumed here
   # to be the root of a cloned repository; adjust it for your setup.
   db = MaigretDatabase().load_from_path(
       "maigret/resources/data.json"
   )

   # Pick which sites to scan (same filtering the CLI uses)
   sites = db.ranked_sites_dict(top=500)

   results = asyncio.run(
       maigret_search(
           username="soxoj",
           site_dict=sites,
           logger=logging.getLogger("maigret"),
           timeout=30,
           is_parsing_enabled=True,
       )
   )

   # Print the profile URL for every site where the account was found
   for site_name, result in results.items():
       if result["status"].is_found():
           print(site_name, result["url_user"])
Key points:
- ``maigret_search`` is an ``async`` function — wrap it with ``asyncio.run(...)`` or ``await`` it from inside your own event loop.
- ``is_parsing_enabled=True`` turns on ``socid_extractor`` so ``result["ids_data"]`` is populated with profile fields (bio, linked accounts, uids, etc.).
- Each entry in the returned dict has a ``"status"`` object with ``is_found()``, plus ``url_user``, ``http_status``, ``rank``, ``ids_data``, and more.
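The result shape described in the key points can be flattened into a simple report. The following is a minimal sketch that relies only on the keys named above (``status``, ``url_user``, ``ids_data``). ``summarize_found`` and ``_StubStatus`` are hypothetical names, and the sample values are made up so the snippet runs without a network search.

```python
from typing import Any, Dict, List


class _StubStatus:
    """Hypothetical stand-in for the status object; only is_found() is assumed."""

    def __init__(self, found: bool) -> None:
        self._found = found

    def is_found(self) -> bool:
        return self._found


def summarize_found(results: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect the profile URL and extracted fields for every found account."""
    report = []
    for site_name, result in results.items():
        if not result["status"].is_found():
            continue
        report.append(
            {
                "site": site_name,
                "url": result["url_user"],
                # Assumption: ids_data can be absent or empty, so default to {}
                "ids": result.get("ids_data") or {},
            }
        )
    return report


# Sample data shaped like the returned dict (values are illustrative only)
sample = {
    "GitHub": {
        "status": _StubStatus(True),
        "url_user": "https://github.com/example_user",
        "ids_data": {"fullname": "Example User"},
    },
    "Reddit": {
        "status": _StubStatus(False),
        "url_user": "https://www.reddit.com/user/example_user",
    },
}
report = summarize_found(sample)
```

With real data, ``summarize_found(results)`` would be called on the dict returned by ``maigret_search``.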
Filtering sites
---------------
``ranked_sites_dict`` accepts the same filters as the CLI:
.. code-block:: python

   # All sites tagged as coding, top 200 by rank
   sites = db.ranked_sites_dict(top=200, tags=["coding"])

   # Exclude NSFW and dating sites
   sites = db.ranked_sites_dict(excluded_tags=["nsfw", "dating"])

   # Only specific sites by name
   sites = db.ranked_sites_dict(names=["GitHub", "Reddit", "VK"])

   # Include disabled sites (useful for maintenance / self-check)
   sites = db.ranked_sites_dict(disabled=True)
Running inside an existing event loop
-------------------------------------
If your application already runs an asyncio loop (FastAPI, aiohttp server, a Discord bot, etc.), ``await`` ``maigret_search`` directly instead of calling ``asyncio.run``:
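.. code-block:: python

   # Assumes `maigret_search` and `sites` from the minimal example,
   # and logger = logging.getLogger("maigret")
   async def check_username(username: str) -> dict:
       results = await maigret_search(
           username=username,
           site_dict=sites,
           logger=logger,
           timeout=30,
       )
       # Map each site where the account was found to its profile URL
       return {
           name: r["url_user"]
           for name, r in results.items()
           if r["status"].is_found()
       }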
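When several usernames need checking from the same loop, the coroutine can be fanned out with ``asyncio.gather``. The sketch below is one possible pattern, not part of Maigret: ``check_many`` and ``fake_search`` are hypothetical names, and ``fake_search`` stands in for a coroutine such as ``check_username`` so the snippet runs offline.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

SearchFn = Callable[[str], Awaitable[Dict[str, Any]]]


async def check_many(
    usernames: List[str], search: SearchFn, limit: int = 2
) -> Dict[str, Dict[str, Any]]:
    """Run one search per username, with at most `limit` running at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(username: str) -> Dict[str, Any]:
        async with semaphore:
            return await search(username)

    # gather() returns results in the same order as the input usernames
    results = await asyncio.gather(*(run_one(u) for u in usernames))
    return dict(zip(usernames, results))


async def fake_search(username: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a coroutine that awaits maigret_search."""
    await asyncio.sleep(0)
    return {"ExampleSite": f"https://example.com/{username}"}


# asyncio.run is used only so the sketch runs standalone;
# inside a running loop you would `await check_many(...)` instead.
batch = asyncio.run(check_many(["alice", "bob"], fake_search))
```

The semaphore bounds how many searches run at once; each search already opens many connections of its own, so a small limit is a reasonable starting point.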
Routing through a proxy
-----------------------
The same proxy / Tor / I2P flags the CLI exposes are plain keyword arguments:
.. code-block:: python

   # Runs inside a coroutine; `sites` and `logger` are defined as in the examples above
   results = await maigret_search(
       username="soxoj",
       site_dict=sites,
       logger=logger,
       proxy="socks5://127.0.0.1:1080",
       tor_proxy="socks5://127.0.0.1:9050",  # used for .onion sites
       i2p_proxy="http://127.0.0.1:4444",    # used for .i2p sites
       timeout=30,
   )
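The comments above state which proxy applies to which kind of site. The following sketch only illustrates that rule on plain URLs; it is not Maigret's internal routing code, and ``pick_proxy`` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlsplit


def pick_proxy(
    url: str,
    proxy: Optional[str] = None,
    tor_proxy: Optional[str] = None,
    i2p_proxy: Optional[str] = None,
) -> Optional[str]:
    """Return the proxy that the rule above would select for `url`."""
    host = urlsplit(url).hostname or ""
    if host.endswith(".onion"):
        return tor_proxy  # .onion sites go through Tor
    if host.endswith(".i2p"):
        return i2p_proxy  # .i2p sites go through I2P
    return proxy  # everything else uses the general proxy


general = "socks5://127.0.0.1:1080"
tor = "socks5://127.0.0.1:9050"
i2p = "http://127.0.0.1:4444"

onion_choice = pick_proxy("http://example.onion/user", general, tor, i2p)
i2p_choice = pick_proxy("http://example.i2p/user", general, tor, i2p)
clearnet_choice = pick_proxy("https://github.com/soxoj", general, tor, i2p)
```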
Full function signature
-----------------------
.. code-block:: python

   async def maigret(
       username: str,
       site_dict: Dict[str, MaigretSite],
       logger,
       query_notify=None,
       proxy=None,
       tor_proxy=None,
       i2p_proxy=None,
       timeout=30,
       is_parsing_enabled=False,
       id_type="username",
       debug=False,
       forced=False,
       max_connections=100,
       no_progressbar=False,
       cookies=None,
       retries=0,
       check_domains=False,
   ) -> QueryResultWrapper
See :doc:`command-line-options` for a description of each option — the semantics match the CLI flags one-to-one.
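Because every option is a plain keyword argument, an application can keep its search settings in one place and unpack them into the call. The sketch below assumes the parameter names and defaults shown in the signature above; ``SearchOptions`` is a hypothetical container covering a subset of them, not a Maigret class.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class SearchOptions:
    """Hypothetical settings holder; names and defaults mirror the signature above."""

    proxy: Optional[str] = None
    tor_proxy: Optional[str] = None
    i2p_proxy: Optional[str] = None
    timeout: int = 30
    is_parsing_enabled: bool = False
    id_type: str = "username"
    max_connections: int = 100
    no_progressbar: bool = False
    retries: int = 0

    def as_kwargs(self) -> Dict[str, Any]:
        """Return the options as a dict suitable for ** unpacking."""
        return asdict(self)


options = SearchOptions(timeout=10, is_parsing_enabled=True, retries=2)
kwargs = options.as_kwargs()

# Intended use (not executed here, since it needs Maigret and network access):
#   results = await maigret_search(
#       username="soxoj", site_dict=sites, logger=logger, **kwargs
#   )
```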
+24
@@ -3,6 +3,10 @@
Philosophy
==========
*Commissioner Jules Maigret is a fictional French police detective created by Georges Simenon.
His investigative method is based on understanding the personalities of different people and their
interactions.*
TL;DR: Username => Dossier
Maigret is designed to gather all the available information about a person by their username.
@@ -15,3 +19,23 @@ All this information forms some dossier, but it also useful for other tools and
Each collected piece of data has a label of a certain format (for example, ``follower_count`` for the number
of subscribers or ``created_at`` for account creation time) so that it can be parsed and analyzed by various
systems and stored in databases.
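As an illustration of why uniform labels help, the sketch below stores labeled fields as rows in an in-memory SQLite table. Only the labels ``follower_count`` and ``created_at`` come from the text; the values, the ``example_user`` name, and the table layout are assumptions made for the example.

```python
import sqlite3

# Illustrative values only; real fields come from a Maigret search
dossier = {"follower_count": "42", "created_at": "2020-01-01T00:00:00"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fields (username TEXT, label TEXT, value TEXT)")

# One row per labeled field, so new labels need no schema change
conn.executemany(
    "INSERT INTO fields VALUES (?, ?, ?)",
    [("example_user", label, value) for label, value in dossier.items()],
)

rows = conn.execute(
    "SELECT label, value FROM fields WHERE username = ? ORDER BY label",
    ("example_user",),
).fetchall()
conn.close()
```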
Origins
-------
Maigret started from studying what OSINT investigators actually use in practice — and from
the realization that many popular tools do not deliver real investigative value. The original
research behind this observation is summarized in the article
`What's wrong with namecheckers <https://soxoj.medium.com/whats-wrong-with-namecheckers-981e5cba600e>`_.
For a broader landscape of username-checking tools, see the curated
`OSINT namecheckers list <https://github.com/soxoj/osint-namecheckers-list>`_.
Two ideas grew out of that research:
- `socid-extractor <https://github.com/soxoj/socid-extractor>`_ — a library focused on pulling
structured identity data (user IDs, full names, linked accounts, bios, timestamps, etc.) out of
account pages and public API responses, so that finding an account is not the end of the pipeline.
- **Maigret** itself — which started as a fork of
`Sherlock <https://github.com/sherlock-project/sherlock>`_ but has long since outgrown the
original project in coverage, extraction depth, and check reliability. Today Maigret is used
as a component by major OSINT vendors in their commercial products.