Mirror of https://github.com/soxoj/maigret.git
Synced 2026-05-15 19:05:43 +00:00

Compare commits (1 commit)

| Author | SHA1 | Date |
|---|---|---|
| | ca54db6fb7 | |
@@ -2,7 +2,7 @@ name: Build docker image and push to DockerHub

on:
  push:
    branches: [ main, dev ]
    branches: [ main ]

jobs:
  docker:
@@ -10,62 +10,24 @@ jobs:
    steps:
      -
        name: Set up QEMU
        uses: docker/setup-qemu-action@v3
        uses: docker/setup-qemu-action@v1
      -
        name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        uses: docker/setup-buildx-action@v1
      -
        name: Login to DockerHub
        uses: docker/login-action@v3
        uses: docker/login-action@v1
        with:
          username: ${{ secrets.DOCKER_HUB_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}
      -
        name: Extract metadata (CLI)
        id: meta_cli
        uses: docker/metadata-action@v5
        with:
          images: ${{ secrets.DOCKER_HUB_USERNAME }}/maigret
          tags: |
            type=raw,value=latest,enable={{is_default_branch}}
            type=ref,event=branch
            type=sha,prefix=
      -
        name: Extract metadata (Web UI)
        id: meta_web
        uses: docker/metadata-action@v5
        with:
          images: ${{ secrets.DOCKER_HUB_USERNAME }}/maigret
          tags: |
            type=raw,value=web,enable={{is_default_branch}}
            type=ref,event=branch,suffix=-web
            type=sha,prefix=web-
      -
        name: Build and push (CLI, default)
        id: docker_build_cli
        uses: docker/build-push-action@v6
        name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          target: cli
          tags: ${{ steps.meta_cli.outputs.tags }}
          labels: ${{ steps.meta_cli.outputs.labels }}
          tags: ${{ secrets.DOCKER_HUB_USERNAME }}/maigret:latest
          platforms: linux/amd64,linux/arm64
          cache-from: type=gha
          cache-to: type=gha,mode=max
      -
        name: Build and push (Web UI)
        id: docker_build_web
        uses: docker/build-push-action@v6
        with:
          push: true
          target: web
          tags: ${{ steps.meta_web.outputs.tags }}
          labels: ${{ steps.meta_web.outputs.labels }}
          platforms: linux/amd64,linux/arm64
          cache-from: type=gha
          cache-to: type=gha,mode=max
      -
        name: Image digests
        run: |
          echo "cli: ${{ steps.docker_build_cli.outputs.digest }}"
          echo "web: ${{ steps.docker_build_web.outputs.digest }}"
        name: Image digest
        run: echo ${{ steps.docker_build.outputs.digest }}
+1 -10
@@ -1,4 +1,4 @@
FROM python:3.11-slim AS base
FROM python:3.11-slim
LABEL maintainer="Soxoj <soxoj@protonmail.com>"
WORKDIR /app
RUN pip install --no-cache-dir --upgrade pip
@@ -15,13 +15,4 @@ COPY . .
RUN YARL_NO_EXTENSIONS=1 python3 -m pip install --no-cache-dir .
# For production use, set FLASK_HOST to a specific IP address for security
ENV FLASK_HOST=0.0.0.0

# Web UI variant: auto-launches the web interface on $PORT
FROM base AS web
ENV PORT=5000
EXPOSE 5000
ENTRYPOINT ["sh", "-c", "exec maigret --web \"$PORT\""]

# Default variant (last stage = `docker build .` target): CLI, backwards-compatible
FROM base AS cli
ENTRYPOINT ["maigret"]
@@ -109,7 +109,7 @@ Download a standalone EXE from [Releases](https://github.com/soxoj/maigret/relea

Run Maigret in the browser via cloud shells or Jupyter notebooks:

<a href="https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=cloudshell-tutorial.md"><img src="https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png" alt="Open in Cloud Shell" height="50"></a>
[](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md)
<a href="https://repl.it/github/soxoj/maigret"><img src="https://replit.com/badge/github/soxoj/maigret" alt="Run on Replit" height="50"></a>

<a href="https://colab.research.google.com/gist/soxoj/879b51bc3b2f8b695abb054090645000/maigret-collab.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" height="45"></a>
@@ -140,27 +140,15 @@ maigret username

### Docker

Two image variants are published:

- `soxoj/maigret:latest` — CLI mode (default)
- `soxoj/maigret:web` — auto-launches the [web interface](#web-interface)

```bash
# official image (CLI)
# official image
docker pull soxoj/maigret

# CLI usage
# usage
docker run -v /mydir:/app/reports soxoj/maigret:latest username --html

# Web UI (open http://localhost:5000)
docker run -p 5000:5000 soxoj/maigret:web

# Web UI on a custom port
docker run -e PORT=8080 -p 8080:8080 soxoj/maigret:web

# manual build
docker build -t maigret .                    # CLI image (default target)
docker build --target web -t maigret-web .   # Web UI image
docker build -t maigret .
```

### Troubleshooting
@@ -1,69 +0,0 @@
# Maigret

<div align="center">
  <img src="https://raw.githubusercontent.com/soxoj/maigret/main/static/maigret.png" height="220" alt="Maigret logo"/>
</div>

**Maigret** collects a dossier on a person **by username only**, checking for accounts on a huge number of sites and gathering all the available information from web pages. No API keys required.

## Installation

Google Cloud Shell does not ship with all the system libraries Maigret needs (`libcairo2-dev`, `pkg-config`). The helper script below installs them and then builds Maigret from the cloned source.

Copy the command and run it in the Cloud Shell terminal:

```bash
./utils/cloudshell_install.sh
```

When the script finishes, verify the install:

```bash
maigret --version
```

## Usage examples

Run a basic search for a username. By default Maigret checks the **500 highest-ranked sites by traffic** — pass `-a` to scan the full database of 3,000+ sites.

```bash
maigret soxoj
```

Search several usernames at once:

```bash
maigret user1 user2 user3
```

Narrow the run to sites related to cryptocurrency via the `crypto` tag (you can also use country tags):

```bash
maigret vitalik.eth --tags crypto
```

Generate reports in HTML, PDF, and XMind 8 formats:

```bash
maigret soxoj --html
maigret soxoj --pdf
maigret soxoj --xmind
```

Download a generated report from Cloud Shell to your local machine:

```bash
cloudshell download reports/report_soxoj.pdf
```

Tune reliability on flaky networks — raise the timeout and retry failed checks:

```bash
maigret soxoj --timeout 60 --retries 2
```

For the full list of options, see `maigret --help` or the [CLI documentation](https://maigret.readthedocs.io/en/latest/command-line-options.html).

## Further reading

Full project documentation: [maigret.readthedocs.io](https://maigret.readthedocs.io/)
@@ -84,6 +84,9 @@ ids. Useful for repeated scanning with found known irrelevant usernames.

``--db`` - Load Maigret database from a local JSON file or a valid online
JSON file. See :ref:`custom-database` below.

``--extra-db`` - Load an **additional** sites database on top of
``--db`` (overlay). Repeatable. See :ref:`extra-database` below.

``--no-autoupdate`` - Disable the automatic database update check that
runs at startup. The currently cached (or bundled) database is used
as-is.
@@ -139,6 +142,47 @@ disabled and all sites scanned, looks like::

    --db LLM/maigret_private_db.json \
    --no-autoupdate -a

.. _extra-database:

Overlaying additional databases (``--extra-db``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``--extra-db FILE`` loads an additional sites database **on top of**
``--db``, rather than replacing it. The flag is repeatable, so multiple
extras can be layered in one invocation::

    python3 -m maigret username \
        --extra-db private_sites.json \
        --extra-db team_sites.json -a

Each extra accepts the same three forms as ``--db`` (HTTP(S) URL,
absolute or cwd-relative local path, or module-relative path).

**Merge semantics.** Sites, engines, and tags are merged into the main
database. On duplicate names, **last wins**: a site or engine defined
later (either in a subsequent ``--extra-db`` or in an ``--extra-db``
that re-defines a name from ``--db``) overrides the earlier definition.
Tag lists are deduplicated while preserving first-seen order.
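The last-wins merge and first-seen tag deduplication described above can be sketched as follows. This is a simplified model, not Maigret's actual implementation; ``merge_extra`` and the dictionary shapes are illustrative:

```python
def merge_extra(main: dict, extra: dict) -> dict:
    """Overlay an extra database onto the main one (illustrative sketch).

    Sites and engines merge by name with last-wins semantics; tag lists
    are deduplicated while preserving first-seen order.
    """
    merged = dict(main)
    for section in ("sites", "engines"):
        combined = dict(merged.get(section, {}))
        combined.update(extra.get(section, {}))  # last-wins on duplicate names
        merged[section] = combined
    # Deduplicate tags, keeping the first occurrence of each
    seen = set()
    tags = []
    for tag in merged.get("tags", []) + extra.get("tags", []):
        if tag not in seen:
            seen.add(tag)
            tags.append(tag)
    merged["tags"] = tags
    return merged


main_db = {"sites": {"A": {"url": "old"}}, "tags": ["us", "crypto"]}
extra_db = {"sites": {"A": {"url": "new"}, "B": {}}, "tags": ["crypto", "de"]}
result = merge_extra(main_db, extra_db)
print(result["sites"]["A"])  # {'url': 'new'} — the extra's definition wins
print(result["tags"])        # ['us', 'crypto', 'de']
```

Layering multiple extras is just repeated application of the same overlay, which is what makes the "last wins" ordering intuitive.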

**Auto-update.** Extras are never auto-updated — they are read exactly
as provided, regardless of ``--no-autoupdate`` / ``--force-update``.

**Save behaviour.** While any ``--extra-db`` is active, Maigret **skips
every database save** — including the implicit end-of-run save, the
``--self-check --auto-disable`` save, and the ``--submit`` save. This
prevents silently writing merged (main + extras) content back into the
main ``--db`` file. If you need to persist edits, run Maigret again
without ``--extra-db``. You will see a warning at startup::

    [!] Database modifications will NOT be persisted while --extra-db is active.

**Missing or unreadable extra.** Maigret exits with a non-zero status —
extras are opt-in, so a silent skip would hide configuration errors.

**Not supported with** ``--web``. The web UI reloads its own database
from the main ``--db`` path, so extras would be invisible. Passing both
exits with an error.

Reports
-------
@@ -142,28 +142,18 @@ There are few options for sites data.json helpful in various cases:

``protection`` (site protection tracking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``protection`` field records what kind of anti-bot protection a site uses. Maigret reads this field and automatically applies the appropriate bypass mechanism where one exists.

Two categories of tag:

- **Load-bearing.** Maigret changes its HTTP client or headers based on the tag. Currently only ``tls_fingerprint`` (switches to ``curl_cffi`` with Chrome-class TLS).
- **Documentation-only.** Maigret does **not** change behavior based on the tag; it records *why* the site is hard so a future solver can target the right set of sites without re-auditing.

Within the documentation-only tags, there is a further split that dictates whether the site is ``disabled: true``:

- ``ip_reputation`` is the **only** doc-tag that **keeps the site enabled**. It means "works for most users, fails from datacenter/cloud IPs." Disabling would silently hide a working site from anyone with a clean IP. The fix is **external** to Maigret (residential IP or ``--proxy``).
- ``cf_js_challenge``, ``cf_firewall``, ``aws_waf_js_challenge``, ``ddos_guard_challenge``, ``custom_bot_protection``, ``js_challenge`` all pair with ``disabled: true``. They mean "does not work for anyone right now"; the tag identifies the provider so that when a bypass ships, every site with that tag can be re-enabled in one pass.

The ``protection`` field records what kind of anti-bot protection a site uses. Maigret reads this field and automatically applies the appropriate bypass mechanism.

Supported values:

- ``tls_fingerprint`` *(load-bearing; site stays enabled)* — the site fingerprints the TLS handshake (JA3/JA4) and blocks non-browser clients. Maigret automatically uses ``curl_cffi`` with Chrome browser emulation to bypass this. Requires the ``curl_cffi`` package (included as a dependency). Examples: Instagram, NPM, Codepen, Kickstarter, Letterboxd.
- ``ip_reputation`` *(documentation-only; site stays enabled)* — the site blocks requests from datacenter/cloud IPs regardless of headers or TLS. Cannot be bypassed automatically; run Maigret from a regular internet connection (not a datacenter) or use a proxy (``--proxy``). The site is **not** marked ``disabled`` because it continues to work for users on residential IPs. Examples: Reddit, Patreon, Figma, OnlyFans.
- ``cf_js_challenge`` *(documentation-only; pair with ``disabled: true``)* — Cloudflare Managed Challenge / Turnstile JS challenge. Symptom: HTTP 403 with ``cf-mitigated: challenge`` header; body contains ``challenges.cloudflare.com``, ``_cf_chl_opt``, ``window._cf_chl``, or "Just a moment". Not bypassable via ``curl_cffi`` TLS impersonation (verified across Chrome 123/124/131, Safari 17/18, Firefox 133/135, Edge 101 — all return the same 403 challenge page); a real browser executing the challenge JS is required to obtain the clearance cookie. Sites stay ``disabled: true`` until a CF-challenge solver is integrated. Examples: DMOJ, Elakiri, Fanlore, Bdoutdoors, TheStudentRoom, forum.hr.
- ``cf_firewall`` *(documentation-only; pair with ``disabled: true``)* — Cloudflare firewall rule / bot score block (WAF action=block, **not** action=challenge). Symptom: HTTP 403 served by Cloudflare (``server: cloudflare``, ``cf-ray`` header) **without** JS-challenge markers — body typically shows "Access denied", "Attention Required", or just a bare 1015/1016/1020 error page. Unlike ``ip_reputation``, residential IPs are **not** sufficient to bypass — Cloudflare decides based on a composite of bot score, TLS fingerprint, UA, ASN, and custom site-owner rules, so ``curl_cffi`` Chrome impersonation from a residential line still returns 403. Sites stay ``disabled: true`` until a per-site bypass (cookies, real browser, or residential+clean session) is found. Examples: Fark, Fodors, Huntingnet, Hunttalk.
- ``aws_waf_js_challenge`` *(documentation-only; pair with ``disabled: true``)* — the site is protected by AWS WAF with a JavaScript challenge. Symptom: HTTP 202 with empty body and ``x-amzn-waf-action: challenge`` header (a token-granting challenge that requires executing the CAPTCHA/challenge JS bundle). Neither ``curl_cffi`` TLS impersonation nor User-Agent changes bypass this — a real browser or the official AWS WAF challenge-solver SDK is required. Sites stay ``disabled: true`` until a solver is integrated. Example: Dreamwidth.
- ``ddos_guard_challenge`` *(documentation-only; pair with ``disabled: true``)* — DDoS-Guard (ddos-guard.net) anti-bot page. Symptom: HTTP 403 with ``server: ddos-guard`` header; body contains "DDoS-Guard". DDoS-Guard fingerprints different UAs per source IP, so a single User-Agent override does not work across environments; a JS-capable bypass or DDoS-Guard-aware solver is required. Sites stay ``disabled: true`` until a solver is integrated. Example: ForumHouse.
- ``js_challenge`` *(documentation-only; pair with ``disabled: true``)* — **fallback** for JavaScript-challenge systems whose provider cannot be identified (custom in-house challenge pages that are not Cloudflare, AWS WAF, or any other recognized vendor). Prefer a provider-specific tag whenever the provider can be pinned down from response headers or body signatures.
- ``custom_bot_protection`` *(documentation-only; pair with ``disabled: true``)* — **fallback** for non-JS-challenge bot protection served by a custom/in-house system (not Cloudflare, not AWS WAF, not DDoS-Guard). Typical symptom: HTTP 403 from the site's own origin server (``server: nginx``, AWS ELB, etc.) with a branded block page, returned regardless of TLS fingerprint or residential IP. Not generically bypassable; investigate per site (cookies, session, proxy geography). Examples: Hackerearth ("HackerEarth Guardian"), FreelanceJob (nginx-level block).
- ``tls_fingerprint`` — the site fingerprints the TLS handshake (JA3/JA4) and blocks non-browser clients. Maigret automatically uses ``curl_cffi`` with Chrome browser emulation to bypass this. Requires the ``curl_cffi`` package (included as a dependency). Examples: Instagram, NPM, Codepen, Kickstarter, Letterboxd.
- ``ip_reputation`` — the site blocks requests from datacenter/cloud IPs regardless of headers or TLS. Cannot be bypassed automatically; run Maigret from a regular internet connection (not a datacenter) or use a proxy (``--proxy``). Examples: Reddit, Patreon, Figma.
- ``cf_js_challenge`` — Cloudflare Managed Challenge / Turnstile JS challenge. Symptom: HTTP 403 with ``cf-mitigated: challenge`` header; body contains ``challenges.cloudflare.com``, ``_cf_chl_opt``, ``window._cf_chl``, or "Just a moment". Not bypassable via ``curl_cffi`` TLS impersonation (verified across Chrome 123/124/131, Safari 17/18, Firefox 133/135, Edge 101 — all return the same 403 challenge page); a real browser executing the challenge JS is required to obtain the clearance cookie. Documentation-only flag; sites stay ``disabled: true`` until a CF-challenge solver is integrated. Examples: DMOJ, Elakiri, Fanlore, Bdoutdoors, TheStudentRoom, forum.hr.
- ``cf_firewall`` — Cloudflare firewall rule / bot score block (WAF action=block, **not** action=challenge). Symptom: HTTP 403 served by Cloudflare (``server: cloudflare``, ``cf-ray`` header) **without** JS-challenge markers — body typically shows "Access denied", "Attention Required", or just a bare 1015/1016/1020 error page. Unlike ``ip_reputation``, residential IPs are **not** sufficient to bypass — Cloudflare decides based on a composite of bot score, TLS fingerprint, UA, ASN, and custom site-owner rules, so ``curl_cffi`` Chrome impersonation from a residential line still returns 403. Documentation-only flag; sites stay ``disabled: true`` until a per-site bypass (cookies, real browser, or residential+clean session) is found. Examples: Fark, Fodors, Huntingnet, Hunttalk.
- ``aws_waf_js_challenge`` — the site is protected by AWS WAF with a JavaScript challenge. Symptom: HTTP 202 with empty body and ``x-amzn-waf-action: challenge`` header (a token-granting challenge that requires executing the CAPTCHA/challenge JS bundle). Neither ``curl_cffi`` TLS impersonation nor User-Agent changes bypass this — a real browser or the official AWS WAF challenge-solver SDK is required. Currently marked for documentation only; sites using this protection stay ``disabled: true`` until a solver is integrated. Example: Dreamwidth.
- ``ddos_guard_challenge`` — DDoS-Guard (ddos-guard.net) anti-bot page. Symptom: HTTP 403 with ``server: ddos-guard`` header; body contains "DDoS-Guard". DDoS-Guard fingerprints different UAs per source IP, so a single User-Agent override does not work across environments; a JS-capable bypass or DDoS-Guard-aware solver is required. Documentation-only flag; sites stay ``disabled: true`` until a solver is integrated. Example: ForumHouse.
- ``js_challenge`` — **fallback** for JavaScript-challenge systems whose provider cannot be identified (custom in-house challenge pages that are not Cloudflare, AWS WAF, or any other recognized vendor). Prefer a provider-specific tag whenever the provider can be pinned down from response headers or body signatures.
- ``custom_bot_protection`` — **fallback** for non-JS-challenge bot protection served by a custom/in-house system (not Cloudflare, not AWS WAF, not DDoS-Guard). Typical symptom: HTTP 403 from the site's own origin server (``server: nginx``, AWS ELB, etc.) with a branded block page, returned regardless of TLS fingerprint or residential IP. Not generically bypassable; investigate per site (cookies, session, proxy geography). Examples: Hackerearth ("HackerEarth Guardian"), FreelanceJob (nginx-level block).

**Rule: prefer provider-specific protection tags.** When a site is blocked by an identifiable anti-bot vendor, always record the vendor in the tag (``cf_js_challenge``, ``cf_firewall``, ``aws_waf_js_challenge``, ``ddos_guard_challenge``, and future additions such as ``sucuri_challenge``, ``incapsula_challenge``). The generic ``js_challenge`` and ``custom_bot_protection`` tags are reserved for custom/unknown systems. Rationale: bypass solvers are inherently provider-specific (a Cloudflare Turnstile solver does not help with AWS WAF); recording the provider in advance lets us fan out fixes the moment a per-provider solver is added, without re-auditing every disabled site. The same principle applies to other protection categories when the provider is identifiable.
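To illustrate the rule above, here is a sketch of what a provider-tagged sites entry and a "fan-out re-enable" pass could look like. The site name, URL, and field layout are illustrative (modeled on a typical sites entry), not copied from the real database:

```python
# Hypothetical sites entry for a Cloudflare-challenged site: the tag names
# the provider (cf_js_challenge, not the generic js_challenge), and the
# documentation-only tag pairs with "disabled": True.
sites = {
    "ExampleForum": {
        "url": "https://example-forum.tld/user/{username}",  # invented URL
        "disabled": True,
        "protection": "cf_js_challenge",
    },
    "ExampleShop": {
        "url": "https://example-shop.tld/{username}",
        "disabled": False,
        "protection": "ip_reputation",  # doc-only, but site stays enabled
    },
}

# When a Cloudflare-challenge solver ships, every affected site can be
# re-enabled in one pass, without re-auditing each one:
to_reenable = [
    name for name, site in sites.items()
    if site.get("protection") == "cf_js_challenge" and site.get("disabled")
]
print(to_reenable)  # ['ExampleForum']
```

The same query pattern works for any future provider tag, which is exactly why recording the vendor up front pays off.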
-158
@@ -1,158 +0,0 @@
"""Maigret AI Analysis Module

Provides AI-powered analysis of search results using OpenAI-compatible APIs.
"""

import asyncio
import json
import os
import sys
import threading

import aiohttp


def load_ai_prompt() -> str:
    """Load the AI system prompt from the resources directory."""
    maigret_path = os.path.dirname(os.path.realpath(__file__))
    prompt_path = os.path.join(maigret_path, "resources", "ai_prompt.txt")
    with open(prompt_path, "r", encoding="utf-8") as f:
        return f.read()


def resolve_api_key(settings) -> str | None:
    """Resolve the OpenAI API key from settings or an environment variable.

    Priority: settings.openai_api_key > OPENAI_API_KEY env var.
    """
    key = getattr(settings, "openai_api_key", None)
    if key:
        return key
    return os.environ.get("OPENAI_API_KEY")


class _Spinner:
    """Simple animated spinner for terminal output."""

    FRAMES = ["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]

    def __init__(self, text=""):
        self.text = text
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._spin, daemon=True)
        self._thread.start()

    def _spin(self):
        i = 0
        while not self._stop.is_set():
            frame = self.FRAMES[i % len(self.FRAMES)]
            sys.stderr.write(f"\r{frame} {self.text}")
            sys.stderr.flush()
            i += 1
            self._stop.wait(0.08)

    def stop(self):
        self._stop.set()
        if self._thread:
            self._thread.join()
        sys.stderr.write("\r\033[2K")
        sys.stderr.flush()


async def print_streaming(text: str, delay: float = 0.04):
    """Print text word by word with a delay, simulating streaming LLM output."""
    words = text.split(" ")
    for i, word in enumerate(words):
        if i > 0:
            sys.stdout.write(" ")
        sys.stdout.write(word)
        sys.stdout.flush()
        await asyncio.sleep(delay)
    sys.stdout.write("\n")
    sys.stdout.flush()


async def get_ai_analysis(
    api_key: str,
    markdown_report: str,
    model: str = "gpt-4o",
    api_base_url: str = "https://api.openai.com/v1",
) -> str:
    """Send the markdown report to an OpenAI-compatible API and return the analysis.

    Uses streaming to display tokens as they arrive.
    Raises on HTTP errors with descriptive messages.
    """
    system_prompt = load_ai_prompt()

    url = f"{api_base_url.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "stream": True,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": markdown_report},
        ],
    }

    spinner = _Spinner("Analysing the data with AI...")
    spinner.start()
    first_token = True
    full_response = []

    try:
        async with aiohttp.ClientSession() as session:
            async with session.post(url, json=payload, headers=headers) as resp:
                if resp.status == 401:
                    raise RuntimeError("Invalid OpenAI API key (HTTP 401)")
                if resp.status == 429:
                    raise RuntimeError("OpenAI API rate limit exceeded (HTTP 429)")
                if resp.status != 200:
                    body = await resp.text()
                    raise RuntimeError(
                        f"OpenAI API error (HTTP {resp.status}): {body[:500]}"
                    )

                async for line in resp.content:
                    decoded = line.decode("utf-8").strip()
                    if not decoded or not decoded.startswith("data: "):
                        continue

                    data_str = decoded[len("data: "):]
                    if data_str == "[DONE]":
                        break

                    try:
                        chunk = json.loads(data_str)
                    except json.JSONDecodeError:
                        continue

                    delta = chunk.get("choices", [{}])[0].get("delta", {})
                    content = delta.get("content", "")
                    if not content:
                        continue

                    if first_token:
                        spinner.stop()
                        print()
                        first_token = False

                    sys.stdout.write(content)
                    sys.stdout.flush()
                    full_response.append(content)  # collect so the return value is non-empty
    except Exception:
        spinner.stop()
        raise

    if first_token:
        # No tokens received — stop spinner anyway
        spinner.stop()

    print()
    return "".join(full_response)
+2 -14
@@ -247,15 +247,9 @@ class CurlCffiChecker(CheckerBase):

    async def check(self) -> Tuple[Optional[str], int, Optional[CheckError]]:
        try:
            async with CurlCffiAsyncSession() as session:
                # Strip the User-Agent so curl_cffi can use the impersonated browser's
                # matching UA. Mixing a random UA with a Chrome TLS fingerprint trips
                # composite bot scoring (e.g. Cloudflare returns a JS challenge for
                # "Chrome 91 UA + Chrome 131 TLS"). Keep any site-specific custom headers.
                headers = {k: v for k, v in (self.headers or {}).items()
                           if k.lower() not in ('user-agent', 'connection')}
                kwargs = {
                    'url': self.url,
                    'headers': headers or None,
                    'headers': self.headers,
                    'allow_redirects': self.allow_redirects,
                    'timeout': self.timeout if self.timeout else 10,
                    'impersonate': self.browser_emulate,
@@ -351,11 +345,7 @@ def process_site_result(
    username = results_info["username"]
    is_parsing_enabled = results_info["parsing_enabled"]
    url = results_info.get("url_user")
    url_probe = results_info.get("url_probe") or url
    if url_probe != url:
        logger.info(f"{url_probe} (display: {url})")
    else:
        logger.info(url)
    logger.info(url)

    status = results_info.get("status")
    if status is not None:
@@ -613,8 +603,6 @@ def make_site_result(
        for k, v in site.get_params.items():
            url_probe += f"&{k}={v}"

    results_site["url_probe"] = url_probe

    if site.request_method:
        request_method = site.request_method.lower()
    elif site.check_type == "status_code" and site.request_head_only:
+77 -84
@@ -202,6 +202,17 @@ def setup_arguments_parser(settings: Settings):
        default=settings.sites_db_path,
        help="Load Maigret database from a JSON file or HTTP web resource.",
    )
    parser.add_argument(
        "--extra-db",
        metavar="EXTRA_DB_FILE",
        dest="extra_db_files",
        action="append",
        default=[],
        help="Load an additional sites database on top of --db. Repeatable. "
        "Accepts a local path (absolute or cwd-relative) or HTTP(S) URL. "
        "Never auto-updated. Changes from --self-check / --submit are NOT "
        "persisted when any --extra-db is loaded.",
    )
    parser.add_argument(
        "--no-autoupdate",
        action="store_true",
@@ -494,21 +505,6 @@ def setup_arguments_parser(settings: Settings):
        " (one report per username).",
    )

    report_group.add_argument(
        "--ai",
        action="store_true",
        dest="ai",
        default=False,
        help="Generate an AI-powered analysis of the search results using OpenAI API. "
        "Requires OPENAI_API_KEY env var or openai_api_key in settings.",
    )
    report_group.add_argument(
        "--ai-model",
        dest="ai_model",
        default=settings.openai_model,
        help="OpenAI model to use for AI analysis (default: gpt-4o).",
    )

    parser.add_argument(
        "--reports-sorting",
        default=settings.report_sorting,
@@ -611,7 +607,6 @@ async def main():
        print_found_only=not args.print_not_found,
        skip_check_errors=not args.print_check_errors,
        color=not args.no_color,
        silent=args.ai,
    )

    # Create object with all information about sites we are aware of.
@@ -630,6 +625,46 @@ async def main():
            )
        else:
            raise

    for extra_arg in args.extra_db_files:
        try:
            extra_path = resolve_db_path(
                db_file_arg=extra_arg,
                no_autoupdate=True,
                meta_url=settings.db_update_meta_url,
                check_interval_hours=settings.autoupdate_check_interval_hours,
                color=not args.no_color,
            )
        except FileNotFoundError as e:
            logger.error(f"--extra-db: {e}")
            sys.exit(2)

        before = len(db.sites)
        try:
            db.load_extra_from_path(extra_path)
        except Exception as e:
            logger.error(f"Failed to load extra database from {extra_path}: {e}")
            sys.exit(2)
        query_notify.success(
            f'Loaded extra database: {extra_path} '
            f'(+{len(db.sites) - before} new, {len(db.sites)} total sites)'
        )

    if args.extra_db_files:
        query_notify.warning(
            'Database modifications will NOT be persisted while --extra-db is active.'
        )

    def save_db_if_safe(reason: str) -> bool:
        if args.extra_db_files:
            logger.warning(
                f"Skipping database save ({reason}): --extra-db is active; "
                "modifications are in-memory only."
            )
            return False
        db.save_to_file(db_file)
        return True

    get_top_sites_for_id = lambda x: db.ranked_sites_dict(
        top=args.top_sites,
        tags=args.tags,
@@ -645,7 +680,7 @@ async def main():
    submitter = Submitter(db=db, logger=logger, settings=settings, args=args)
    is_submitted = await submitter.dialog(args.new_site_to_submit, args.cookie_file)
    if is_submitted:
        db.save_to_file(db_file)
        save_db_if_safe("post-submit")
    await submitter.close()

    # Database self-checking
@@ -679,8 +714,8 @@ async def main():
            'y',
            '',
        ):
            db.save_to_file(db_file)
            print('Database was successfully updated.')
            if save_db_if_safe("post-self-check"):
                print('Database was successfully updated.')
            else:
                print('Updates will be applied only for current search session.')

@@ -703,6 +738,14 @@ async def main():

    # Web interface
    if args.web is not None:
        if args.extra_db_files:
            logger.error(
                '--web is not compatible with --extra-db: the web UI reloads '
                'the database from --db only, so extras would be silently '
                'ignored. Remove --extra-db or use the CLI mode.'
            )
            sys.exit(2)

        from maigret.web.app import app

        app.config["MAIGRET_DB_FILE"] = db_file
@@ -727,33 +770,17 @@ async def main():
        + get_dict_ascii_tree(usernames, prepend="\t")
    )

    if args.ai:
        from .ai import resolve_api_key

        if not resolve_api_key(settings):
            query_notify.warning(
                'AI analysis requires an OpenAI API key. '
                'Set OPENAI_API_KEY environment variable or add '
                'openai_api_key to settings.json.'
            )
            sys.exit(1)

    if not site_data:
        query_notify.warning('No sites to check, exiting!')
        sys.exit(2)

    if args.ai:
        query_notify.warning(
            f'Starting a search on top {len(site_data)} sites from the Maigret database...'
        )
        if not args.all_sites:
            query_notify.warning(
                f'Starting AI-assisted search on top {len(site_data)} sites from the Maigret database...'
                'You can run search by full list of sites with flag `-a`', '!'
            )
    else:
        query_notify.warning(
            f'Starting a search on top {len(site_data)} sites from the Maigret database...'
        )
        if not args.all_sites:
            query_notify.warning(
                'You can run search by full list of sites with flag `-a`', '!'
            )

    already_checked = set()
    general_results = []
@@ -806,12 +833,11 @@ async def main():
        check_domains=args.with_domains,
    )

    if not args.ai:
        errs = errors.notify_about_errors(
            results, query_notify, show_statistics=args.verbose
        )
        for e in errs:
            query_notify.warning(*e)
    errs = errors.notify_about_errors(
        results, query_notify, show_statistics=args.verbose
    )
    for e in errs:
        query_notify.warning(*e)

    if args.reports_sorting == "data":
        results = sort_report_by_data_points(results)
@@ -900,46 +926,13 @@ async def main():
        save_graph_report(filename, general_results, db)
        query_notify.warning(f'Graph report on all usernames saved in (unknown)')
|
||||
|
||||
if not args.ai:
|
||||
text_report = get_plaintext_report(report_context)
|
||||
if text_report:
|
||||
query_notify.info('Short text report:')
|
||||
print(text_report)
|
||||
|
||||
if args.ai:
|
||||
from .ai import get_ai_analysis, resolve_api_key
|
||||
from .report import generate_markdown_report
|
||||
|
||||
api_key = resolve_api_key(settings)
|
||||
|
||||
run_flags = []
|
||||
if args.tags:
|
||||
run_flags.append(f"--tags {args.tags}")
|
||||
if args.site_list:
|
||||
run_flags.append(f"--site {','.join(args.site_list)}")
|
||||
if args.all_sites:
|
||||
run_flags.append("--all-sites")
|
||||
run_info = {
|
||||
"sites_count": sum(len(d) for _, _, d in general_results),
|
||||
"flags": " ".join(run_flags) if run_flags else None,
|
||||
}
|
||||
|
||||
md_report = generate_markdown_report(report_context, run_info=run_info)
|
||||
|
||||
try:
|
||||
await get_ai_analysis(
|
||||
api_key=api_key,
|
||||
markdown_report=md_report,
|
||||
model=args.ai_model,
|
||||
api_base_url=getattr(
|
||||
settings, 'openai_api_base_url', 'https://api.openai.com/v1'
|
||||
),
|
||||
)
|
||||
except Exception as e:
|
||||
query_notify.warning(f'AI analysis failed: {e}')
|
||||
text_report = get_plaintext_report(report_context)
|
||||
if text_report:
|
||||
query_notify.info('Short text report:')
|
||||
print(text_report)
|
||||
|
||||
# update database
|
||||
db.save_to_file(db_file)
|
||||
save_db_if_safe("end-of-run")
|
||||
|
||||
|
||||
def run():
|
||||
|
||||
@@ -123,7 +123,6 @@ class QueryNotifyPrint(QueryNotify):
|
||||
print_found_only=False,
|
||||
skip_check_errors=False,
|
||||
color=True,
|
||||
silent=False,
|
||||
):
|
||||
"""Create Query Notify Print Object.
|
||||
|
||||
@@ -150,7 +149,6 @@ class QueryNotifyPrint(QueryNotify):
|
||||
self.print_found_only = print_found_only
|
||||
self.skip_check_errors = skip_check_errors
|
||||
self.color = color
|
||||
self.silent = silent
|
||||
|
||||
return
|
||||
|
||||
@@ -189,9 +187,6 @@ class QueryNotifyPrint(QueryNotify):
|
||||
Nothing.
|
||||
"""
|
||||
|
||||
if self.silent:
|
||||
return
|
||||
|
||||
title = f"Checking {id_type}"
|
||||
if self.color:
|
||||
print(
|
||||
@@ -241,9 +236,6 @@ class QueryNotifyPrint(QueryNotify):
|
||||
Return Value:
|
||||
Nothing.
|
||||
"""
|
||||
if self.silent:
|
||||
return
|
||||
|
||||
notify = None
|
||||
self.result = result
|
||||
|
||||
|
||||
+7
-16
@@ -30,18 +30,14 @@ UTILS


def filter_supposed_data(data):
    # interesting fields
    allowed_fields = ["fullname", "gender", "location", "age"]

    def _first(v):
        if isinstance(v, (list, tuple)):
            return v[0] if v else ""
        return v

    return {
        CaseConverter.snake_to_title(k): _first(v)
    filtered_supposed_data = {
        CaseConverter.snake_to_title(k): v[0]
        for k, v in data.items()
        if k in allowed_fields
    }
    return filtered_supposed_data
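The `_first` helper introduced in this hunk is what makes the filter tolerant of both scalar and list values (the old code's `v[0]` crashed on empty lists and scalars). Its behavior, re-implemented standalone from the diff above:

```python
def _first(v):
    # Unwrap the first element of a list/tuple, fall back to "" for an
    # empty sequence, and pass scalars through unchanged.
    if isinstance(v, (list, tuple)):
        return v[0] if v else ""
    return v


# Mixed scalar/list input, as supposed-data dictionaries may contain either.
normalized = {
    k: _first(v)
    for k, v in {"location": ["Paris", "Lyon"], "age": [], "gender": "male"}.items()
}
```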


def sort_report_by_data_points(results):
@@ -245,7 +241,7 @@ def save_graph_report(filename: str, username_results: list, db: MaigretDatabase
    # Generate interactive visualization
    from pyvis.network import Network  # type: ignore[import-untyped]

    nt = Network(notebook=True, height="100vh", width="100%")
    nt = Network(notebook=True, height="750px", width="100%")
    nt.from_nx(G)
    nt.show(filename)

@@ -271,7 +267,7 @@ def _md_format_value(value) -> str:
    return s


def generate_markdown_report(context: dict, run_info: dict = None) -> str:
def save_markdown_report(filename: str, context: dict, run_info: dict = None):
    username = context.get("username", "unknown")
    generated_at = context.get("generated_at", "")
    brief = context.get("brief", "")
@@ -395,13 +391,8 @@ def generate_markdown_report(context: dict, run_info: dict = None) -> str:
        "CCPA, and similar).\n"
    )

    return "\n".join(lines)


def save_markdown_report(filename: str, context: dict, run_info: dict = None):
    content = generate_markdown_report(context, run_info)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(content)
        f.write("\n".join(lines))
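The refactor in this hunk splits report generation from file I/O so the markdown can also be consumed in memory (the AI analysis path passes it straight to the model). A minimal sketch of the pattern with simplified stand-in bodies, not the real report module:

```python
def generate_markdown_report(context: dict, run_info: dict = None) -> str:
    # Pure function: build the report text and return it.
    lines = [f"# Report for {context.get('username', 'unknown')}"]
    if run_info and run_info.get("flags"):
        lines.append(f"Run flags: {run_info['flags']}")
    return "\n".join(lines)


def save_markdown_report(filename: str, context: dict, run_info: dict = None):
    # Thin I/O wrapper around the generator, matching the diff's split.
    content = generate_markdown_report(context, run_info)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(content)


md = generate_markdown_report({"username": "alice"}, {"flags": "--all-sites"})
```

Keeping the generator side-effect free is what lets the CLI reuse one rendering for both the saved report and the AI prompt.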


"""

@@ -1,62 +0,0 @@
You are an OSINT analyst that converts raw username-investigation reports into a short, clean human-readable summary.

Your task:
Read the attached account-discovery report and produce a concise report in exactly this style:

# Investigation Summary

Name: <most likely real full name>
Location: <most likely current location>
Occupation: <short combined description based only on strong signals>
Interests: <3–6 broad interests inferred from platform types, bios, and activity>
Languages: <languages supported by strong evidence only>
Website: <main personal website if clearly present>
Username: <main username> (variant: <variant usernames if any>)
Platforms: <number> profiles, active from <first year> to <last year>
Confidence: <High / Medium / Low> — <one short explanation why>

# Other leads

- <lead 1>
- <lead 2>
- <lead 3 if needed>

Rules:
1. Use only information supported by the report.
2. Resolve identity using consistency of username, full name, bio, links, company, and location.
3. Prefer strong repeated signals over one-off weak signals.
4. If one profile clearly conflicts with the rest, mention it in "Other leads" as a likely false positive instead of mixing it into the main identity.
5. Keep the tone analytical and neutral.
6. Do not mention every platform individually.
7. Do not include raw URLs except for the main website.
8. Do not mention NSFW/adult platforms in the main summary unless they are the only source for a critical lead; if such a profile looks inconsistent, mention it only as a likely false positive.
9. "Occupation" should be a compact merged description, for example: "Chief Product Officer (CPO) at ..., entrepreneur, OSINT community founder".
10. "Interests" should be broad categories, not noisy tags. Convert raw platform/tag evidence into natural categories like OSINT, software development, blogging, gaming, streaming, etc.
11. "Languages" should only include languages clearly supported by bios, texts, country tags, or profile content.
12. For "Platforms", count the profiles reported as found by the report summary, not manually deduplicated.
13. For active years, use the earliest and latest reliable dates from the consistent identity cluster. Ignore obvious outlier dates if they belong to likely false positives or weak profiles.
14. For confidence:
- High = strong consistency across username, name, bio, links, location, and/or company
- Medium = partial consistency with some gaps
- Low = mostly username-only matches
15. If some field is not reliably known, omit speculation and use the best cautious wording possible.
16. For "Name", output only the most likely real personal name in clean canonical form.
- Remove nicknames, handles, aliases, or bracketed parts such as "(Soxoj)".
- Example: "Dmitriy (Soxoj) Danilov" -> "Dmitriy Danilov".
17. For "Website", output only the plain domain or URL as text, not a markdown hyperlink.
18. In "Other leads", do not label conflicting profiles as "false positive", "likely unrelated", or "potentially a false positive".
- Instead, use neutral intelligence wording such as:
"Accounts were found that are most likely unrelated to the main identity, but may indicate possible cross-border activity and should be verified."
19. When describing anomalies in "Other leads", prefer cautious investigative phrasing:
- "may be unrelated"
- "requires verification"
- "could indicate separate activity"
- "should be checked manually"
20. Do not include nicknames or aliases inside the Name field unless they are clearly part of the legal or real-world name.

Output requirements:
- Return only the final formatted text.
- Keep it short.
- No preamble, no explanations.

Now analyze the following report

+73
-393
@@ -40,7 +40,7 @@
        ],
        "alexaRank": 3,
        "urlMain": "https://www.youtube.com/",
        "url": "https://www.youtube.com/@{username}/about",
        "url": "https://www.youtube.com/@{username}",
        "usernameClaimed": "test",
        "usernameUnclaimed": "noonewouldeverusethis777"
    },
@@ -63,7 +63,7 @@
        ],
        "alexaRank": 3,
        "urlMain": "https://www.youtube.com/",
        "url": "https://www.youtube.com/@{username}/about",
        "url": "https://www.youtube.com/@{username}",
        "usernameClaimed": "test",
        "usernameUnclaimed": "noonewouldeverusethis777"
    },
@@ -100,7 +100,7 @@
            "sec-ch-ua": "Google Chrome\";v=\"87\", \" Not;A Brand\";v=\"99\", \"Chromium\";v=\"87\"",
            "authorization": "Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA",
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36",
            "x-guest-token": "2048070238281826593"
            "x-guest-token": "2045154491230572773"
        },
        "errors": {
            "Bad guest token": "x-guest-token update required"
@@ -296,7 +296,7 @@
            "method": "vimeo"
        },
        "headers": {
            "Authorization": "jwt eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE3NzcxMzM4ODAsInVzZXJfaWQiOm51bGwsImFwcF9pZCI6NTg0NzksInNjb3BlcyI6InB1YmxpYyIsInRlYW1fdXNlcl9pZCI6bnVsbCwianRpIjoiZjFiMGJjNWUtMjIyOC00Y2I1LWFlNmItODk0YjZhNGNmODJhIn0.YCPXekRrHnJy8iH1gX4iVuNURiw6sU_FlmsfHnM2oug"
            "Authorization": "jwt eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE3NzY0Mzg3MjAsInVzZXJfaWQiOm51bGwsImFwcF9pZCI6NTg0NzksInNjb3BlcyI6InB1YmxpYyIsInRlYW1fdXNlcl9pZCI6bnVsbCwianRpIjoiNjY0OWY3ZWItMThjZS00ODU1LWIzNmEtNWY3MzRkOGZhNjAyIn0.l1SRcr5UqvxqYLveW7MTECKSfkgsbh1y9QZqZmBX1EI"
        },
        "urlProbe": "https://api.vimeo.com/users/{username}?fields=name%2Cgender%2Cbio%2Curi%2Clink%2Cbackground_video%2Clocation_details%2Cpictures%2Cverified%2Cmetadata.public_videos.total%2Cavailable_for_hire%2Ccan_work_remotely%2Cmetadata.connections.videos.total%2Cmetadata.connections.albums.total%2Cmetadata.connections.followers.total%2Cmetadata.connections.following.total%2Cmetadata.public_videos.total%2Cmetadata.connections.vimeo_experts.is_enrolled%2Ctotal_collection_count%2Ccreated_time%2Cprofile_preferences%2Cmembership%2Cclients%2Cskills%2Cproject_types%2Crates%2Ccategories%2Cis_expert%2Cprofile_discovery%2Cwebsites%2Ccontact_emails&fetch_user_profile=1",
        "checkType": "status_code",
@@ -491,10 +491,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Reddit": {
        "disabled": true,
        "protection": [
            "custom_bot_protection"
        ],
        "tags": [
            "discussion",
            "news",
@@ -515,7 +511,10 @@
        "url": "https://www.reddit.com/user/{username}",
        "urlProbe": "https://api.reddit.com/user/{username}/about",
        "usernameClaimed": "blue",
        "usernameUnclaimed": "noonewouldeverusethis7"
        "usernameUnclaimed": "noonewouldeverusethis7",
        "protection": [
            "tls_fingerprint"
        ]
    },
    "Tumblr": {
        "tags": [
@@ -1335,18 +1334,10 @@
        "usernameClaimed": "Blue",
        "usernameUnclaimed": "noonewouldeverusethis7",
        "alexaRank": 242,
        "presenseStrs": [
            "class=\"gs_a\""
        ],
        "absenceStrs": [
            "did not match any articles",
            "not match"
        ],
        "errors": {
            "Our systems have detected unusual traffic": "Google rate-limit / captcha",
            "/sorry/index": "Google rate-limit / captcha",
            "unusual traffic from your computer network": "Google rate-limit / captcha"
        },
        "tags": [
            "education",
            "research"
@@ -1622,10 +1613,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Quora": {
        "protection": [
            "cf_js_challenge",
            "tls_fingerprint"
        ],
        "tags": [
            "education"
        ],
@@ -1792,10 +1779,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Patreon": {
        "disabled": true,
        "protection": [
            "cf_js_challenge"
        ],
        "tags": [
            "finance"
        ],
@@ -2061,10 +2044,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Shutterstock": {
        "disabled": true,
        "protection": [
            "custom_bot_protection"
        ],
        "tags": [
            "music",
            "photo",
@@ -2828,10 +2807,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "PyPi": {
        "disabled": true,
        "protection": [
            "custom_bot_protection"
        ],
        "tags": [
            "coding"
        ],
@@ -2843,7 +2818,10 @@
        "urlMain": "https://pypi.org/",
        "url": "https://pypi.org/user/{username}",
        "usernameClaimed": "adam",
        "usernameUnclaimed": "noonewouldeverusethis7"
        "usernameUnclaimed": "noonewouldeverusethis7",
        "protection": [
            "tls_fingerprint"
        ]
    },
    "Pastebin": {
        "tags": [
@@ -3512,7 +3490,8 @@
        "usernameUnclaimed": "noonewouldeverusethis7",
        "alexaRank": 1426,
        "absenceStrs": [
            "<title>false | GeeksforGeeks Profile"
            "not found",
            "404"
        ],
        "tags": [
            "coding",
@@ -3653,10 +3632,6 @@
        "disabled": true
    },
    "Redbubble": {
        "disabled": true,
        "protection": [
            "cf_js_challenge"
        ],
        "tags": [
            "shopping"
        ],
@@ -3665,7 +3640,10 @@
        "urlMain": "https://www.redbubble.com/",
        "url": "https://www.redbubble.com/people/{username}",
        "usernameClaimed": "blue",
        "usernameUnclaimed": "noonewouldeverusethis77777"
        "usernameUnclaimed": "noonewouldeverusethis77777",
        "protection": [
            "tls_fingerprint"
        ]
    },
    "codeberg.org": {
        "tags": [
@@ -5467,13 +5445,7 @@
        "tags": [
            "gaming"
        ],
        "checkType": "message",
        "presenseStrs": [
            "class=\"profile-container\""
        ],
        "absenceStrs": [
            "request-error"
        ],
        "checkType": "status_code",
        "alexaRank": 2067,
        "urlMain": "https://www.roblox.com/",
        "url": "https://www.roblox.com/user.aspx?username={username}",
@@ -5641,9 +5613,6 @@
        "alexaRank": 2472
    },
    "OnlyFans": {
        "protection": [
            "ip_reputation"
        ],
        "tags": [
            "porn"
        ],
@@ -5653,8 +5622,8 @@
            "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
            "user-id": "0",
            "x-bc": "0a106d301866494c873ae3a05bc3c97cee59a749",
            "time": "1777132991121",
            "sign": "57203:3723aa7d500e76eabca29df74e4e97c483f14204:66d:69cfa6d8",
            "time": "1776790550214",
            "sign": "57203:31541b62efa9f19fafc79ca8002b1d0f12335c1d:6d2:69cfa6d8",
            "referer": "https://onlyfans.com/",
            "cookie": "__cf_bm=YovfzPN0T_wg6F60L5eZKPOQvlGESws3UDGgEkmPb9A-1776790253-1.0.1.1-KRZgptNe5P9epBZSdITa12VfTEDlDdLckPY3I2FDAacvCPxOj0PqeK86J5mcC7UQ_TM8_O24bAh21ElYINovqk2386EoJYyLmknHJ5UsFts"
        },
@@ -5974,10 +5943,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Muckrack": {
        "disabled": true,
        "protection": [
            "cf_js_challenge"
        ],
        "absenceStrs": [
            "(404) Page Not Found"
        ],
@@ -6092,9 +6057,6 @@
        "tags": [
            "freelance"
        ],
        "protection": [
            "tls_fingerprint"
        ],
        "checkType": "message",
        "absenceStrs": [
            "\"users\":{}"
@@ -6719,10 +6681,6 @@
        "usernameUnclaimed": "noonewouldeverusethis777"
    },
    "MyFitnessPal": {
        "disabled": true,
        "protection": [
            "custom_bot_protection"
        ],
        "tags": [
            "sport"
        ],
@@ -6922,9 +6880,6 @@
        "tags": [
            "music"
        ],
        "protection": [
            "tls_fingerprint"
        ],
        "checkType": "message",
        "presenseStrs": [
            "Points:"
@@ -7040,10 +6995,6 @@
        ]
    },
    "LibraryThing": {
        "protection": [
            "cf_js_challenge",
            "tls_fingerprint"
        ],
        "tags": [
            "books"
        ],
@@ -7217,10 +7168,6 @@
        ]
    },
    "Speedrun.com": {
        "protection": [
            "cf_js_challenge",
            "tls_fingerprint"
        ],
        "tags": [
            "gaming"
        ],
@@ -7429,10 +7376,6 @@
        "tags": [
            "gaming"
        ],
        "protection": [
            "tls_fingerprint"
        ],
        "ignore403": true,
        "checkType": "status_code",
        "alexaRank": 5699,
        "urlMain": "https://www.moddb.com/",
@@ -7775,10 +7718,6 @@
        "usernameUnclaimed": "noonewouldeverusethis77777"
    },
    "Morguefile": {
        "disabled": true,
        "protection": [
            "cf_js_challenge"
        ],
        "absenceStrs": [
            "free photographs for commercial use"
        ],
@@ -7983,14 +7922,11 @@
        "alexaRank": 6720
    },
    "Kick": {
        "protection": [
            "tls_fingerprint"
        ],
        "url": "https://kick.com/{username}",
        "urlMain": "https://kick.com/",
        "urlProbe": "https://kick.com/api/v2/channels/{username}",
        "checkType": "status_code",
        "usernameClaimed": "xqc",
        "usernameClaimed": "blue",
        "usernameUnclaimed": "noonewouldeverusethis7",
        "alexaRank": 6474,
        "tags": [
@@ -8393,9 +8329,6 @@
    "Muse Score": {
        "url": "https://musescore.com/{username}",
        "urlMain": "https://musescore.com/",
        "protection": [
            "tls_fingerprint"
        ],
        "checkType": "status_code",
        "usernameClaimed": "arrangeme",
        "usernameUnclaimed": "noonewouldeverusethis7",
@@ -8435,10 +8368,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "PlanetMinecraft": {
        "protection": [
            "cf_js_challenge",
            "tls_fingerprint"
        ],
        "tags": [
            "gaming"
        ],
@@ -9425,10 +9354,6 @@
        "alexaRank": 8514
    },
    "Rate Your Music": {
        "disabled": true,
        "protection": [
            "cf_js_challenge"
        ],
        "tags": [
            "music"
        ],
@@ -9965,10 +9890,6 @@
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "JeuxVideo": {
        "protection": [
            "cf_js_challenge",
            "tls_fingerprint"
        ],
        "tags": [
            "fr",
            "gaming"
@@ -10062,8 +9983,7 @@
    },
    "Anime-planet": {
        "protection": [
            "tls_fingerprint",
            "ip_reputation"
            "tls_fingerprint"
        ],
        "tags": [
            "anime"
@@ -10555,6 +10475,17 @@
        "usernameClaimed": "blue",
        "usernameUnclaimed": "noonewouldeverusethis7"
    },
    "Fotolog": {
        "tags": [
            "photo"
        ],
        "engine": "engine404get",
        "urlMain": "http://fotolog.com",
        "url": "http://fotolog.com/{username}",
        "usernameUnclaimed": "noonewouldeverusethis7",
        "usernameClaimed": "red",
        "alexaRank": 11693
|
||||
},
|
||||
"PushSquare": {
|
||||
"tags": [
|
||||
"gaming",
|
||||
@@ -10684,10 +10615,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Lomography": {
|
||||
"disabled": true,
|
||||
"protection": [
|
||||
"cf_js_challenge"
|
||||
],
|
||||
"absenceStrs": [
|
||||
"<title>404 · Lomography</title>"
|
||||
],
|
||||
@@ -10947,10 +10874,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Liberapay": {
|
||||
"disabled": true,
|
||||
"protection": [
|
||||
"cf_js_challenge"
|
||||
],
|
||||
"tags": [
|
||||
"finance"
|
||||
],
|
||||
@@ -11111,7 +11034,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Joomlart": {
|
||||
"disabled": true,
|
||||
"tags": [
|
||||
"coding"
|
||||
],
|
||||
@@ -11224,14 +11146,12 @@
|
||||
"alexaRank": 14969,
|
||||
"urlMain": "https://www.vivino.com/",
|
||||
"url": "https://www.vivino.com/users/{username}",
|
||||
"urlProbe": "https://api.vivino.com/users/{username}",
|
||||
"usernameClaimed": "adam",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Flyertalk": {
|
||||
"protection": [
|
||||
"tls_fingerprint",
|
||||
"ip_reputation"
|
||||
"tls_fingerprint"
|
||||
],
|
||||
"tags": [
|
||||
"travel"
|
||||
@@ -11425,17 +11345,7 @@
|
||||
"tags": [
|
||||
"us"
|
||||
],
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"'s profile - Garden.org</title>"
|
||||
],
|
||||
"absenceStrs": [
|
||||
"<title>Member List - Garden.org</title>"
|
||||
],
|
||||
"errors": {
|
||||
"Just a moment": "Cloudflare challenge",
|
||||
"challenges.cloudflare.com": "Cloudflare challenge"
|
||||
},
|
||||
"checkType": "response_url",
|
||||
"alexaRank": 17338,
|
||||
"urlMain": "https://garden.org",
|
||||
"url": "https://garden.org/users/profile/{username}/",
|
||||
@@ -11888,10 +11798,6 @@
|
||||
"alexaRank": 20421
|
||||
},
|
||||
"Smule": {
|
||||
"protection": [
|
||||
"cf_js_challenge",
|
||||
"tls_fingerprint"
|
||||
],
|
||||
"tags": [
|
||||
"music"
|
||||
],
|
||||
@@ -13217,12 +13123,8 @@
|
||||
"url": "https://hive.blog/@{username}",
|
||||
"urlMain": "https://hive.blog/",
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"class=\"UserProfile\""
|
||||
],
|
||||
"absenceStrs": [
|
||||
"<title>User Not Found - Hive</title>",
|
||||
"class=\"NotFound"
|
||||
"<title>User Not Found - Hive</title>"
|
||||
],
|
||||
"usernameClaimed": "mango-juice",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7",
|
||||
@@ -13383,10 +13285,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Smogon": {
|
||||
"disabled": true,
|
||||
"protection": [
|
||||
"custom_bot_protection"
|
||||
],
|
||||
"tags": [
|
||||
"gaming"
|
||||
],
|
||||
@@ -13438,9 +13336,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"PromptBase": {
|
||||
"protection": [
|
||||
"ip_reputation"
|
||||
],
|
||||
"absenceStrs": [
|
||||
"NotFound"
|
||||
],
|
||||
@@ -14067,13 +13962,10 @@
|
||||
"gb",
|
||||
"hk"
|
||||
],
|
||||
"protection": [
|
||||
"tls_fingerprint"
|
||||
],
|
||||
"checkType": "status_code",
|
||||
"alexaRank": 49143,
|
||||
"urlMain": "https://www.mybuilder.com",
|
||||
"url": "https://www.mybuilder.com/profile/{username}",
|
||||
"url": "https://www.mybuilder.com/profile/view/{username}",
|
||||
"usernameClaimed": "adam",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
@@ -14824,14 +14716,7 @@
|
||||
"tags": [
|
||||
"gaming"
|
||||
],
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"class=\"profile-avatar\""
|
||||
],
|
||||
"errors": {
|
||||
"Just a moment": "Cloudflare challenge",
|
||||
"challenges.cloudflare.com": "Cloudflare challenge"
|
||||
},
|
||||
"checkType": "response_url",
|
||||
"alexaRank": 65342,
|
||||
"urlMain": "https://www.thesimsresource.com/",
|
||||
"url": "https://www.thesimsresource.com/members/{username}/",
|
||||
@@ -15404,7 +15289,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Knowem": {
|
||||
"disabled": true,
|
||||
"tags": [
|
||||
"business"
|
||||
],
|
||||
@@ -15674,7 +15558,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Polywork": {
|
||||
"disabled": true,
|
||||
"checkType": "message",
|
||||
"absenceStrs": [
|
||||
">404</h3>",
|
||||
@@ -15817,13 +15700,9 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Designspiration": {
|
||||
"protection": [
|
||||
"cf_js_challenge",
|
||||
"tls_fingerprint"
|
||||
],
|
||||
"checkType": "status_code",
|
||||
"urlMain": "https://designspiration.com/",
|
||||
"url": "https://designspiration.com/{username}/",
|
||||
"urlMain": "https://www.designspiration.net/",
|
||||
"url": "https://www.designspiration.net/{username}/",
|
||||
"usernameClaimed": "blue",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7",
|
||||
"alexaRank": 89022,
|
||||
@@ -16036,10 +15915,6 @@
|
||||
"usernameClaimed": "admin"
|
||||
},
|
||||
"Movieforums": {
|
||||
"disabled": true,
|
||||
"protection": [
|
||||
"cf_js_challenge"
|
||||
],
|
||||
"tags": [
|
||||
"forum",
|
||||
"la"
|
||||
@@ -16838,8 +16713,8 @@
|
||||
"ru"
|
||||
],
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"class=\"userprofile\""
|
||||
"absenceStrs": [
|
||||
"Пользователь с таким именем не найден"
|
||||
],
|
||||
"alexaRank": 160156,
|
||||
"urlMain": "https://www.rusfootball.info/",
|
||||
@@ -17765,7 +17640,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"the-mainboard.com": {
|
||||
"disabled": true,
|
||||
"tags": [
|
||||
"forum",
|
||||
"us"
|
||||
@@ -17989,7 +17863,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Onlyfinder": {
|
||||
"disabled": true,
|
||||
"absenceStrs": [
|
||||
"\"rows\":[]"
|
||||
],
|
||||
@@ -18156,6 +18029,18 @@
|
||||
],
|
||||
"alexaRank": 379171
|
||||
},
|
||||
"Pitomec": {
|
||||
"tags": [
|
||||
"ru",
|
||||
"ua"
|
||||
],
|
||||
"checkType": "status_code",
|
||||
"alexaRank": 228310,
|
||||
"urlMain": "https://www.pitomec.ru",
|
||||
"url": "https://www.pitomec.ru/{username}",
|
||||
"usernameClaimed": "adam",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"Loveplanet": {
|
||||
"disabled": true,
|
||||
"tags": [
|
||||
@@ -18660,9 +18545,6 @@
|
||||
"us"
|
||||
],
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"class=\"data_head\""
|
||||
],
|
||||
"absenceStrs": [
|
||||
"The user you requested does not exist, no matter how much you wish this might be the case."
|
||||
],
|
||||
@@ -18835,6 +18717,24 @@
|
||||
"ua"
|
||||
]
|
||||
},
|
||||
"SQL.ru": {
|
||||
"tags": [
|
||||
"ru"
|
||||
],
|
||||
"checkType": "message",
|
||||
"presenseStrs": [
|
||||
"По вашему запросу найдено"
|
||||
],
|
||||
"absenceStrs": [
|
||||
"Извините",
|
||||
" но по вашему запросу ничего не найдено"
|
||||
],
|
||||
"url": "https://www.sql.ru/forum/actualsearch.aspx?a={username}&ma=0",
|
||||
"urlMain": "https://www.sql.ru",
|
||||
"usernameClaimed": "buser",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7",
|
||||
"alexaRank": 285351
|
||||
},
|
||||
"Pepper PL": {
|
||||
"url": "https://www.pepper.pl/profile/{username}",
|
||||
"urlMain": "https://www.pepper.pl/",
|
||||
@@ -18928,10 +18828,6 @@
|
||||
},
|
||||
"Math10": {
|
||||
"urlSubpath": "/forum",
|
||||
"disabled": true,
|
||||
"protection": [
|
||||
"cf_js_challenge"
|
||||
],
|
||||
"tags": [
|
||||
"forum",
|
||||
"ru"
|
||||
@@ -19146,7 +19042,6 @@
|
||||
"usernameUnclaimed": "noonewouldeverusethis7"
|
||||
},
|
||||
"mcfc-fan.ru": {
|
||||
"disabled": true,
|
||||
"engine": "uCoz",
|
||||
"urlMain": "http://mcfc-fan.ru",
|
||||
"usernameUnclaimed": "noonewouldeverusethis7",
|
||||
@@ -28621,7 +28516,6 @@
|
||||
]
|
||||
},
|
||||
"TikTok Online Viewer": {
|
||||
"disabled": true,
|
||||
"errors": {
|
||||
"Website unavailable": "Site error",
|
||||
"is currently offline": "Site error"
|
||||
@@ -35458,220 +35352,6 @@
|
||||
"url": "https://op.gg/lol/summoners/search?q={username}®ion=th",
|
||||
"usernameClaimed": "adam",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"hiveon.com forum": {
"tags": [
"coding",
"ru"
],
"urlMain": "https://hiveon.com/forum",
"engine": "Discourse",
"usernameClaimed": "Rony",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"forum.manticoresearch.com": {
"tags": [
"coding"
],
"urlMain": "https://forum.manticoresearch.com",
"engine": "Discourse",
"usernameClaimed": "gloria",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"forum.jscourse.com": {
"tags": [
"coding",
"ru"
],
"urlMain": "https://forum.jscourse.com",
"engine": "Discourse",
"usernameClaimed": "kharkovhipster",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"forums.grandstream.com": {
"tags": [
"coding"
],
"checkType": "status_code",
"urlMain": "https://forums.grandstream.com",
"url": "https://forums.grandstream.com/u/{username}/summary",
"urlProbe": "https://forums.grandstream.com/u/{username}.json",
"usernameClaimed": "EricPitz",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"support.wirenboard.com": {
"tags": [
"coding",
"ru"
],
"urlMain": "https://support.wirenboard.com",
"engine": "Discourse",
"usernameClaimed": "enginPetr",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"forum.cs-cart.ru": {
"tags": [
"coding",
"ru"
],
"urlMain": "https://forum.cs-cart.ru",
"engine": "Discourse",
"usernameClaimed": "a.shishkin",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"instantcms.ru": {
"tags": [
"coding",
"ru"
],
"checkType": "status_code",
"urlMain": "https://instantcms.ru",
"url": "https://instantcms.ru/users/{username}",
"usernameClaimed": "fuze",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"wewin.ru": {
"tags": [
"ru"
],
"checkType": "status_code",
"urlMain": "https://wewin.ru",
"url": "https://wewin.ru/user/{username}",
"usernameClaimed": "dimok",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"myslo.ru": {
"tags": [
"news",
"ru"
],
"checkType": "status_code",
"urlMain": "https://myslo.ru",
"url": "https://myslo.ru/user/profile/{username}",
"usernameClaimed": "admin",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"add-groups.com": {
"tags": [
"messaging",
"ru"
],
"checkType": "status_code",
"urlMain": "https://add-groups.com",
"url": "https://add-groups.com/user/{username}",
"usernameClaimed": "admin",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"Profi.ru": {
"tags": [
"freelance",
"ru"
],
"checkType": "status_code",
"urlMain": "https://profi.ru",
"url": "https://profi.ru/profile/{username}/",
"usernameClaimed": "petrov",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"mover.uz": {
"tags": [
"video"
],
"checkType": "status_code",
"urlMain": "https://mover.uz",
"url": "https://mover.uz/channel/{username}",
"usernameClaimed": "AlterEgo",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"BitPapa": {
"tags": [
"crypto",
"ru"
],
"checkType": "message",
"presenseStrs": [
"avgEscrowReleaseTime"
],
"urlMain": "https://bitpapa.com",
"url": "https://bitpapa.com/ru/user/{username}",
"usernameClaimed": "admin",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"minfin.com.ua": {
"tags": [
"finance",
"ua"
],
"checkType": "message",
"presenseStrs": [
"Был на сайте"
],
"absenceStrs": [
"\"isRequestFailed\":true"
],
"urlMain": "https://minfin.com.ua",
"url": "https://minfin.com.ua/users/{username}/",
"usernameClaimed": "Maksim",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"Pexels": {
"tags": [
"photo"
],
"checkType": "status_code",
"protection": [
"tls_fingerprint"
],
"urlMain": "https://www.pexels.com",
"url": "https://www.pexels.com/ru-ru/@{username}",
"usernameClaimed": "jess",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"BestGore": {
"tags": [
"video"
],
"checkType": "status_code",
"protection": [
"tls_fingerprint"
],
"urlMain": "https://bestgore.fun",
"url": "https://bestgore.fun/c/{username}/videos",
"usernameClaimed": "user",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"AirNFTs": {
"tags": [
"crypto",
"nft"
],
"checkType": "message",
"protection": [
"tls_fingerprint"
],
"presenseStrs": [
"accountCreatedAt"
],
"urlMain": "https://app.airnfts.com",
"url": "https://app.airnfts.com/creators/{username}",
"usernameClaimed": "demo",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"GreasyFork": {
"tags": [
"coding"
],
"checkType": "message",
"presenseStrs": [
"class=\"user-list\""
],
"absenceStrs": [
"<p>No users!</p>"
],
"urlMain": "https://greasyfork.org",
"url": "https://greasyfork.org/en/users?q={username}",
"usernameClaimed": "jcunews",
"usernameUnclaimed": "noonewouldeverusethis7"
}
},
"engines": {
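The site records above follow Maigret's database schema: `checkType` selects the detection strategy ("status_code" treats a successful response on the profile URL as a claimed username, while "message" looks for the `presenseStrs` markers in the page body and rejects pages containing any `absenceStrs`), and `urlProbe`, when present, is requested instead of `url`. A minimal, hypothetical sketch of that decision logic — the function name and the simplified semantics are assumptions for illustration, not Maigret's actual implementation:

```python
def is_claimed(entry: dict, status: int, body: str) -> bool:
    # "status_code" check: the profile exists iff the server answers 200.
    if entry.get("checkType") == "status_code":
        return status == 200
    # "message" check: every presence marker found, no absence marker found.
    present = all(s in body for s in entry.get("presenseStrs", []))
    absent = not any(s in body for s in entry.get("absenceStrs", []))
    return present and absent

# Trimmed-down copy of the BitPapa entry from the diff above.
bitpapa = {
    "checkType": "message",
    "presenseStrs": ["avgEscrowReleaseTime"],
}
print(is_claimed(bitpapa, 200, '{"avgEscrowReleaseTime": 12}'))  # True
print(is_claimed(bitpapa, 200, '{"error": "not found"}'))        # False
```

This also shows why the README statistics below flag "status_code" and incomplete "message" checks as false-positive risks: a site that returns 200 (or a marker-free error page) for unknown users will look like a hit.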
@@ -1,8 +1,8 @@
{
"version": 1,
"updated_at": "2026-04-29T14:56:55Z",
"sites_count": 3157,
"updated_at": "2026-04-23T15:02:48Z",
"sites_count": 3142,
"min_maigret_version": "0.6.0",
"data_sha256": "5dac8f1c045ea650d5872cf9dfd7f224410eaadba0f2b7eb60514cc51ba0097a",
"data_sha256": "1e1ed6da2aa9db0f34171f61a044c20bbd1ed53a0430dec4a9ce8f8543655d1a",
"data_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/data.json"
}
@@ -55,9 +55,6 @@
"pdf_report": false,
"html_report": false,
"md_report": false,
"openai_api_key": "",
"openai_model": "gpt-4o",
"openai_api_base_url": "https://api.openai.com/v1",
"web_interface_port": 5000,
"no_autoupdate": false,
"db_update_meta_url": "https://raw.githubusercontent.com/soxoj/maigret/main/maigret/resources/db_meta.json",
@@ -516,6 +516,15 @@ class MaigretDatabase:
        else:
            return self.load_from_file(path)

    def load_extra_from_path(self, path: str) -> "MaigretDatabase":
        """Merge an additional DB on top of self. Last-wins on duplicate
        site/engine names; tags deduped preserving first-seen order."""
        self.load_from_path(path)
        self._sites = list({s.name: s for s in self._sites}.values())
        self._engines = list({e.name: e for e in self._engines}.values())
        self._tags = list(dict.fromkeys(self._tags))
        return self

    def load_from_http(self, url: str) -> "MaigretDatabase":
        is_url_valid = url.startswith("http://") or url.startswith("https://")
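The dedup idiom in the hunk above leans on Python's guaranteed dict insertion order: building `{obj.name: obj}` keeps keys in first-seen order while later duplicates overwrite earlier values, which is exactly the documented last-wins merge. A standalone sketch — the `Site` dataclass here is illustrative, not Maigret's actual class:

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    url: str

# Later entries overwrite earlier ones under the same name (last-wins),
# while key order -- and thus resulting list order -- stays first-seen.
sites = [Site("GitHub", "old"), Site("GitLab", "gl"), Site("GitHub", "new")]
merged = list({s.name: s for s in sites}.values())
print([(s.name, s.url) for s in merged])  # [('GitHub', 'new'), ('GitLab', 'gl')]

# Tag dedup via dict.fromkeys preserves first-seen order the same way.
print(list(dict.fromkeys(["ru", "coding", "ru"])))  # ['ru', 'coding']
```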
@@ -232,14 +232,14 @@ graphemeu = "0.7.2"

[[package]]
name = "arabic-reshaper"
version = "3.0.1"
version = "3.0.0"
description = "Reconstruct Arabic sentences to be used in applications that do not support Arabic"
optional = false
python-versions = ">=3.10"
python-versions = "*"
groups = ["main"]
files = [
{file = "arabic_reshaper-3.0.1-py3-none-any.whl", hash = "sha256:41c5adc2420f85758eada7e880251c4b6a2adbd83377bd27e5d4eba71f648bc7"},
{file = "arabic_reshaper-3.0.1.tar.gz", hash = "sha256:a0d9b2a9fa29b5f2c1d705f407adf6ca4242405b9cac0e5cc09e6c4f3f8fb68c"},
{file = "arabic_reshaper-3.0.0-py3-none-any.whl", hash = "sha256:3f71d5034bb694204a239a6f1ebcf323ac3c5b059de02259235e2016a1a5e2dc"},
{file = "arabic_reshaper-3.0.0.tar.gz", hash = "sha256:ffcd13ba5ec007db71c072f5b23f420da92ac7f268512065d49e790e62237099"},
]

[package.extras]

@@ -418,14 +418,14 @@ files = [

[[package]]
name = "certifi"
version = "2026.4.22"
version = "2026.2.25"
description = "Python package for providing Mozilla's CA Bundle."
optional = false
python-versions = ">=3.7"
groups = ["main"]
files = [
{file = "certifi-2026.4.22-py3-none-any.whl", hash = "sha256:3cb2210c8f88ba2318d29b0388d1023c8492ff72ecdde4ebdaddbb13a31b1c4a"},
{file = "certifi-2026.4.22.tar.gz", hash = "sha256:8d455352a37b71bf76a79caa83a3d6c25afee4a385d632127b6afb3963f1c580"},
{file = "certifi-2026.2.25-py3-none-any.whl", hash = "sha256:027692e4402ad994f1c42e52a4997a9763c646b73e4096e4d5d6db8af1d6f0fa"},
{file = "certifi-2026.2.25.tar.gz", hash = "sha256:e887ab5cee78ea814d3472169153c2d12cd43b14bd03329a39a9c6e2e80bfba7"},
]

[[package]]

@@ -1261,14 +1261,14 @@ lxml = ["lxml ; platform_python_implementation == \"CPython\""]

[[package]]
name = "idna"
version = "3.13"
version = "3.12"
description = "Internationalized Domain Names in Applications (IDNA)"
optional = false
python-versions = ">=3.8"
groups = ["main"]
files = [
{file = "idna-3.13-py3-none-any.whl", hash = "sha256:892ea0cde124a99ce773decba204c5552b69c3c67ffd5f232eb7696135bc8bb3"},
{file = "idna-3.13.tar.gz", hash = "sha256:585ea8fe5d69b9181ec1afba340451fba6ba764af97026f92a91d4eef164a242"},
{file = "idna-3.12-py3-none-any.whl", hash = "sha256:60ffaa1858fac94c9c124728c24fcde8160f3fb4a7f79aa8cdd33a9d1af60a67"},
{file = "idna-3.12.tar.gz", hash = "sha256:724e9952cc9e2bd7550ea784adb098d837ab5267ef67a1ab9cf7846bdbdd8254"},
]

[package.extras]

@@ -3168,14 +3168,14 @@ png = ["pypng"]

[[package]]
name = "reportlab"
version = "4.5.0"
version = "4.4.10"
description = "The Reportlab Toolkit"
optional = false
python-versions = "<4,>=3.9"
groups = ["main", "dev"]
files = [
{file = "reportlab-4.5.0-py3-none-any.whl", hash = "sha256:b8cc8996947d84e805368b47b2376070966f091d029351a0d8a1f238984c2c7f"},
{file = "reportlab-4.5.0.tar.gz", hash = "sha256:e595932789ab7a107ba253e83f7815622708a9fd49920d0d6a909880eb66ac75"},
{file = "reportlab-4.4.10-py3-none-any.whl", hash = "sha256:5abc815746ae2bc44e7ff25db96814f921349ca814c992c7eac3c26029bf7c24"},
{file = "reportlab-4.4.10.tar.gz", hash = "sha256:5cbbb34ac3546039d0086deb2938cdec06b12da3cdb836e813258eb33cd28487"},
]

[package.dependencies]

@@ -1,5 +1,5 @@
maigret @ https://github.com/soxoj/maigret/archive/refs/heads/main.zip
pefile==2023.2.7 # do not bump while pyinstaller is 6.11.1, there is a conflict
psutil==7.2.2
pyinstaller==6.20.0
pyinstaller==6.19.0
pywin32-ctypes==0.2.3
@@ -1,5 +1,5 @@

## List of supported sites (search methods): total 3157
## List of supported sites (search methods): total 3142

Rank data fetched from Majestic Million by domains.

@@ -22,7 +22,7 @@ Rank data fetched from Majestic Million by domains.
1.  [WordPress (https://wordpress.com)](https://wordpress.com)*: top 50, blog*
1.  [Google Plus (archived) (https://plus.google.com)](https://plus.google.com)*: top 50, social*
1.  [Telegram (https://t.me/)](https://t.me/)*: top 50, messaging*
1.  [Reddit (https://www.reddit.com/)](https://www.reddit.com/)*: top 50, discussion, news, social*, search is disabled
1.  [Reddit (https://www.reddit.com/)](https://www.reddit.com/)*: top 50, discussion, news, social*
1.  [Tumblr (https://www.tumblr.com)](https://www.tumblr.com)*: top 100, blog, social*
1.  [Spotify (https://open.spotify.com/)](https://open.spotify.com/)*: top 100, music*
1.  [Archive.org (https://archive.org)](https://archive.org)*: top 100, archive*, search is disabled

@@ -101,7 +101,7 @@ Rank data fetched from Majestic Million by domains.
1.  [OP.GG LoL Vietnam (https://www.op.gg/)](https://www.op.gg/)*: top 500, gaming, vn*
1.  [OP.GG LoL Thailand (https://www.op.gg/)](https://www.op.gg/)*: top 500, gaming, th*
1.  [Xing (https://www.xing.com/)](https://www.xing.com/)*: top 500, de, eu*
1.  [Patreon (https://www.patreon.com/)](https://www.patreon.com/)*: top 500, finance*, search is disabled
1.  [Patreon (https://www.patreon.com/)](https://www.patreon.com/)*: top 500, finance*
1.  [DeviantART (https://deviantart.com)](https://deviantart.com)*: top 500, art, photo*
1.  [Gofundme (https://www.gofundme.com)](https://www.gofundme.com)*: top 500, finance*
1.  [Zhihu (https://www.zhihu.com/)](https://www.zhihu.com/)*: top 500, cn*, search is disabled

@@ -117,7 +117,7 @@ Rank data fetched from Majestic Million by domains.
1.  [OK (https://ok.ru/)](https://ok.ru/)*: top 1K, ru, social*
1.  [Photobucket (https://photobucket.com/)](https://photobucket.com/)*: top 1K, photo*, search is disabled
1.  [Udemy (https://www.udemy.com)](https://www.udemy.com)*: top 1K, education*, search is disabled
1.  [Shutterstock (https://www.shutterstock.com)](https://www.shutterstock.com)*: top 1K, music, photo, stock*, search is disabled
1.  [Shutterstock (https://www.shutterstock.com)](https://www.shutterstock.com)*: top 1K, music, photo, stock*
1.  [MixCloud (https://www.mixcloud.com/)](https://www.mixcloud.com/)*: top 1K, music*
1.  [NPM (https://www.npmjs.com/)](https://www.npmjs.com/)*: top 1K, coding*
1.  [NPM-Package (https://www.npmjs.com/)](https://www.npmjs.com/)*: top 1K, coding*

@@ -139,7 +139,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Gumroad (https://www.gumroad.com/)](https://www.gumroad.com/)*: top 1K, shopping*
1.  [Upwork (https://upwork.com)](https://upwork.com)*: top 1K, freelance*
1.  [Yumpu (https://www.yumpu.com)](https://www.yumpu.com)*: top 1K, stock*, search is disabled
1.  [PyPi (https://pypi.org/)](https://pypi.org/)*: top 1K, coding*, search is disabled
1.  [PyPi (https://pypi.org/)](https://pypi.org/)*: top 1K, coding*
1.  [Douban (https://www.douban.com)](https://www.douban.com)*: top 1K, cn*
1.  [LonelyPlanet (https://www.lonelyplanet.com)](https://www.lonelyplanet.com)*: top 1K, travel*, search is disabled
1.  [Figma (https://www.figma.com/)](https://www.figma.com/)*: top 1K, design*

@@ -183,7 +183,7 @@ Rank data fetched from Majestic Million by domains.
1.  [AllTrails (https://www.alltrails.com/)](https://www.alltrails.com/)*: top 5K, sport, travel*, search is disabled
1.  [Habr (https://habr.com/)](https://habr.com/)*: top 5K, blog, discussion, ru*
1.  [AllRecipes (https://www.allrecipes.com/)](https://www.allrecipes.com/)*: top 5K, hobby*
1.  [Redbubble (https://www.redbubble.com/)](https://www.redbubble.com/)*: top 5K, shopping*, search is disabled
1.  [Redbubble (https://www.redbubble.com/)](https://www.redbubble.com/)*: top 5K, shopping*
1.  [Diigo (https://www.diigo.com/)](https://www.diigo.com/)*: top 5K, bookmarks*
1.  [Windy (https://windy.com/)](https://windy.com/)*: top 5K, maps*
1.  [Codecanyon (https://codecanyon.net)](https://codecanyon.net)*: top 5K, coding, shopping*

@@ -229,7 +229,7 @@ Rank data fetched from Majestic Million by domains.
1.  [beacons.ai (https://beacons.ai)](https://beacons.ai)*: top 5K, links*
1.  [Artsy (https://www.artsy.net)](https://www.artsy.net)*: top 5K, art*, search is disabled
1.  [IFTTT (https://www.ifttt.com/)](https://www.ifttt.com/)*: top 5K, tech*
1.  [Muckrack (https://muckrack.com)](https://muckrack.com)*: top 5K, news*, search is disabled
1.  [Muckrack (https://muckrack.com)](https://muckrack.com)*: top 5K, news*
1.  [Crunchyroll (https://www.crunchyroll.com/)](https://www.crunchyroll.com/)*: top 5K, forum, movies*, search is disabled
1.  [Odysee (https://odysee.com/)](https://odysee.com/)*: top 5K, video*
1.  [Replit (https://replit.com/)](https://replit.com/)*: top 5K, coding*

@@ -258,7 +258,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Ultimate-Guitar (https://ultimate-guitar.com/)](https://ultimate-guitar.com/)*: top 5K, music*
1.  [ChaturBate (https://chaturbate.com)](https://chaturbate.com)*: top 5K, porn, webcam*
1.  [HackerOne (https://hackerone.com/)](https://hackerone.com/)*: top 5K, coding, hacking*
1.  [MyFitnessPal (https://www.myfitnesspal.com/)](https://www.myfitnesspal.com/)*: top 5K, sport*, search is disabled
1.  [MyFitnessPal (https://www.myfitnesspal.com/)](https://www.myfitnesspal.com/)*: top 5K, sport*
1.  [Plurk (https://gab.com/)](https://gab.com/)*: top 5K, social, tw, us*
1.  [Contently (https://contently.com/)](https://contently.com/)*: top 5K, freelance*
1.  [MyMiniFactory (https://www.myminifactory.com/)](https://www.myminifactory.com/)*: top 5K, 3d, shopping*

@@ -310,7 +310,7 @@ Rank data fetched from Majestic Million by domains.
1.  [MercadoLivre (https://www.mercadolivre.com.br)](https://www.mercadolivre.com.br)*: top 10K, br*
1.  [Tinder (https://tinder.com/)](https://tinder.com/)*: top 10K, dating*
1.  [Anobii (https://www.anobii.com)](https://www.anobii.com)*: top 10K, books*
1.  [Morguefile (https://morguefile.com)](https://morguefile.com)*: top 10K, photo*, search is disabled
1.  [Morguefile (https://morguefile.com)](https://morguefile.com)*: top 10K, photo*
1.  [Velog (https://velog.io/)](https://velog.io/)*: top 10K, blog, coding, kr*
1.  [Kick (https://kick.com/)](https://kick.com/)*: top 10K, streaming*
1.  [domestika.org (https://www.domestika.org)](https://www.domestika.org)*: top 10K, education*

@@ -360,7 +360,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Paltalk (https://www.paltalk.com)](https://www.paltalk.com)*: top 10K, messaging*
1.  [NICommunityForum (https://www.native-instruments.com/forum/)](https://www.native-instruments.com/forum/)*: top 10K, forum*
1.  [Ccm (https://ccm.net)](https://ccm.net)*: top 10K, fr*
1.  [Rate Your Music (https://rateyourmusic.com/)](https://rateyourmusic.com/)*: top 10K, music*, search is disabled
1.  [Rate Your Music (https://rateyourmusic.com/)](https://rateyourmusic.com/)*: top 10K, music*
1.  [VideoHive (https://videohive.net)](https://videohive.net)*: top 10K, video*
1.  [authorSTREAM (http://www.authorstream.com/)](http://www.authorstream.com/)*: top 10K, documents, in, sharing*, search is disabled
1.  [Airliners (https://www.airliners.net/)](https://www.airliners.net/)*: top 10K, hobby, photo*, search is disabled

@@ -407,6 +407,7 @@ Rank data fetched from Majestic Million by domains.
1.  [hi5 (http://www.hi5.com)](http://www.hi5.com)*: top 100K, social*, search is disabled
1.  [Diary.ru (https://diary.ru)](https://diary.ru)*: top 100K, blog, ru*, search is disabled
1.  [MirTesen (https://mirtesen.ru)](https://mirtesen.ru)*: top 100K, news, ru*, search is disabled
1.  [Fotolog (http://fotolog.com)](http://fotolog.com)*: top 100K, photo*
1.  [Aufeminin (https://www.aufeminin.com)](https://www.aufeminin.com)*: top 100K, fr*
1.  [Coderwall (https://coderwall.com/)](https://coderwall.com/)*: top 100K, coding*
1.  [PCPartPicker (https://pcpartpicker.com)](https://pcpartpicker.com)*: top 100K, shopping, tech*, search is disabled

@@ -416,13 +417,13 @@ Rank data fetched from Majestic Million by domains.
1.  [TheStudentRoom (https://www.thestudentroom.co.uk)](https://www.thestudentroom.co.uk)*: top 100K, forum, gb*, search is disabled
1.  [Codementor (https://www.codementor.io/)](https://www.codementor.io/)*: top 100K, coding*
1.  [N4g (https://n4g.com/)](https://n4g.com/)*: top 100K, gaming, news*
1.  [Lomography (https://www.lomography.com)](https://www.lomography.com)*: top 100K, photo*, search is disabled
1.  [Lomography (https://www.lomography.com)](https://www.lomography.com)*: top 100K, photo*
1.  [pixelfed.social (https://pixelfed.social/)](https://pixelfed.social/)*: top 100K, art, photo*
1.  [Hackerearth (https://www.hackerearth.com)](https://www.hackerearth.com)*: top 100K, freelance*, search is disabled
1.  [Weedmaps (https://weedmaps.com)](https://weedmaps.com)*: top 100K, us*
1.  [Redtube (https://www.redtube.com/)](https://www.redtube.com/)*: top 100K, porn*
1.  [Neoseeker (https://www.neoseeker.com)](https://www.neoseeker.com)*: top 100K, forum, gaming*
1.  [Liberapay (https://liberapay.com)](https://liberapay.com)*: top 100K, finance*, search is disabled
1.  [Liberapay (https://liberapay.com)](https://liberapay.com)*: top 100K, finance*
1.  [Sythe (https://www.sythe.org)](https://www.sythe.org)*: top 100K, forum*
1.  [FilmWeb (https://www.filmweb.pl/user/adam)](https://www.filmweb.pl/user/adam)*: top 100K, movies, pl*
1.  [Listal (https://listal.com/)](https://listal.com/)*: top 100K, movies, music*
@@ -437,7 +438,7 @@ Rank data fetched from Majestic Million by domains.
1.  [mastodon.social (https://chaos.social/)](https://chaos.social/)*: top 100K, social*
1.  [notabug.org (https://notabug.org/)](https://notabug.org/)*: top 100K, coding*
1.  [Livemaster (https://www.livemaster.ru)](https://www.livemaster.ru)*: top 100K, ru*, search is disabled
1.  [Joomlart (https://www.joomlart.com)](https://www.joomlart.com)*: top 100K, coding*, search is disabled
1.  [Joomlart (https://www.joomlart.com)](https://www.joomlart.com)*: top 100K, coding*
1.  [Trinixy (https://trinixy.ru)](https://trinixy.ru)*: top 100K, news, ru*
1.  [TripIt (https://tripit.com)](https://tripit.com)*: top 100K, travel*, search is disabled
1.  [Mydramalist (https://mydramalist.com)](https://mydramalist.com)*: top 100K, kr, movies*

@@ -577,7 +578,7 @@ Rank data fetched from Majestic Million by domains.
1.  [BabyBlog.ru (https://www.babyblog.ru/)](https://www.babyblog.ru/)*: top 100K, ru*
1.  [7Cups (https://www.7cups.com/)](https://www.7cups.com/)*: top 100K, medicine*, search is disabled
1.  [CTFtime (https://ctftime.org/)](https://ctftime.org/)*: top 100K, hacking*
1.  [Smogon (https://www.smogon.com)](https://www.smogon.com)*: top 100K, gaming*, search is disabled
1.  [Smogon (https://www.smogon.com)](https://www.smogon.com)*: top 100K, gaming*
1.  [LOR (https://linux.org.ru/)](https://linux.org.ru/)*: top 100K, ru*
1.  [Mouthshut (https://www.mouthshut.com/)](https://www.mouthshut.com/)*: top 100K, in*
1.  [Eva (https://eva.ru/)](https://eva.ru/)*: top 100K, ru*, search is disabled

@@ -678,7 +679,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Partyflock (https://partyflock.nl)](https://partyflock.nl)*: top 100K, nl*
1.  [Trisquel (https://trisquel.info)](https://trisquel.info)*: top 100K, eu*
1.  [Pokemon Showdown (https://pokemonshowdown.com)](https://pokemonshowdown.com)*: top 100K, gaming*
1.  [Knowem (https://knowem.com/)](https://knowem.com/)*: top 100K, business*, search is disabled
1.  [Knowem (https://knowem.com/)](https://knowem.com/)*: top 100K, business*
1.  [MoiKrug (https://moikrug.ru/)](https://moikrug.ru/)*: top 100K, career*
1.  [Medikforum (https://www.medikforum.ru)](https://www.medikforum.ru)*: top 100K, de, forum, nl, ru, ua*, search is disabled
1.  [mynickname.com (https://mynickname.com)](https://mynickname.com)*: top 100K, social*

@@ -693,7 +694,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Govloop (https://www.govloop.com)](https://www.govloop.com)*: top 100K, education*
1.  [Wakatime (https://wakatime.com)](https://wakatime.com)*: top 100K, ng, ve*
1.  [Cqham (http://www.cqham.ru)](http://www.cqham.ru)*: top 100K, ru, tech*
1.  [Designspiration (https://designspiration.com/)](https://designspiration.com/)*: top 100K, design*
1.  [Designspiration (https://www.designspiration.net/)](https://www.designspiration.net/)*: top 100K, design*
1.  [Politforums (https://www.politforums.net/)](https://www.politforums.net/)*: top 100K, forum, ru*
1.  [NameMC (https://namemc.com/)](https://namemc.com/)*: top 100K, gaming*
1.  [EuroFootball (https://www.euro-football.ru)](https://www.euro-football.ru)*: top 100K, ru*

@@ -701,7 +702,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Truelancer (https://www.truelancer.com)](https://www.truelancer.com)*: top 100K, in*
1.  [Icheckmovies (https://www.icheckmovies.com/)](https://www.icheckmovies.com/)*: top 100K, movies*
1.  [Likee (https://likee.video)](https://likee.video)*: top 100K, video*
1.  [Polywork (https://www.polywork.com)](https://www.polywork.com)*: top 100K, career*, search is disabled
1.  [Polywork (https://www.polywork.com)](https://www.polywork.com)*: top 100K, career*
1.  [ForumHouse (https://www.forumhouse.ru/)](https://www.forumhouse.ru/)*: top 100K, forum, ru*, search is disabled
1.  [AnimeSuperHero (https://animesuperhero.com)](https://animesuperhero.com)*: top 100K, forum*, search is disabled
1.  [SportsTracker (https://www.sports-tracker.com/)](https://www.sports-tracker.com/)*: top 100K, pt, ru*

@@ -710,7 +711,7 @@ Rank data fetched from Majestic Million by domains.
1.  [fotostrana.ru (https://fotostrana.ru)](https://fotostrana.ru)*: top 100K, ru*
1.  [bigfooty.com (https://www.bigfooty.com/forum/)](https://www.bigfooty.com/forum/)*: top 100K, au, forum*
1.  [Tl (https://tl.net)](https://tl.net)*: top 10M, de, dk*
1.  [Movieforums (https://www.movieforums.com)](https://www.movieforums.com)*: top 10M, forum, la*, search is disabled
1.  [Movieforums (https://www.movieforums.com)](https://www.movieforums.com)*: top 10M, forum, la*
1.  [Crevado (https://crevado.com/)](https://crevado.com/)*: top 10M, design*
1.  [Monkeytype (https://monkeytype.com/)](https://monkeytype.com/)*: top 10M, gaming*
1.  [Mylot (https://www.mylot.com/)](https://www.mylot.com/)*: top 10M, pl*

@@ -804,7 +805,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Jigidi (https://www.jigidi.com/)](https://www.jigidi.com/)*: top 10M, hobby*
1.  [Allhockey (https://allhockey.ru/)](https://allhockey.ru/)*: top 10M, ru*
1.  [Runitonce (https://www.runitonce.com/)](https://www.runitonce.com/)*: top 10M, ca*
1.  [Onlyfinder (https://onlyfinder.com)](https://onlyfinder.com)*: top 10M, webcam*, search is disabled
1.  [Onlyfinder (https://onlyfinder.com)](https://onlyfinder.com)*: top 10M, webcam*
1.  [Postila (https://postila.ru/)](https://postila.ru/)*: top 10M, ru*
1.  [Chemport (https://www.chemport.ru)](https://www.chemport.ru)*: top 10M, forum, ru*
1.  [Vapenews (https://vapenews.ru/)](https://vapenews.ru/)*: top 10M, ru*

@@ -823,7 +824,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Loveplanet (https://loveplanet.ru)](https://loveplanet.ru)*: top 10M, dating, ru*, search is disabled
1.  [sevenstring.org (https://sevenstring.org)](https://sevenstring.org)*: top 10M, forum*, search is disabled
1.  [Bikepost (https://bikepost.ru)](https://bikepost.ru)*: top 10M, ru*
1.  [the-mainboard.com (http://the-mainboard.com/index.php)](http://the-mainboard.com/index.php)*: top 10M, forum, us*, search is disabled
1.  [the-mainboard.com (http://the-mainboard.com/index.php)](http://the-mainboard.com/index.php)*: top 10M, forum, us*
1.  [australianfrequentflyer.com.au (https://www.australianfrequentflyer.com.au/community/)](https://www.australianfrequentflyer.com.au/community/)*: top 10M, au, forum*
1.  [4stor (https://4stor.ru)](https://4stor.ru)*: top 10M, ru*
1.  [subaruoutback.org (https://subaruoutback.org)](https://subaruoutback.org)*: top 10M, forum, us*

@@ -833,6 +834,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Snooth (https://www.snooth.com/)](https://www.snooth.com/)*: top 10M, news*
1.  [svtperformance.com (https://svtperformance.com)](https://svtperformance.com)*: top 10M, forum, us*
1.  [DefensiveCarry (https://www.defensivecarry.com)](https://www.defensivecarry.com)*: top 10M, us*
1.  [Pitomec (https://www.pitomec.ru)](https://www.pitomec.ru)*: top 10M, ru, ua*
1.  [GotovimDoma (https://gotovim-doma.ru)](https://gotovim-doma.ru)*: top 10M, ru*, search is disabled
1.  [Chollometro (https://www.chollometro.com/)](https://www.chollometro.com/)*: top 10M, es, shopping*
1.  [Hpc (https://hpc.ru)](https://hpc.ru)*: top 10M, ru*

@@ -868,8 +870,9 @@ Rank data fetched from Majestic Million by domains.
1.  [Affiliatefix (https://www.affiliatefix.com)](https://www.affiliatefix.com)*: top 10M, forum*
1.  [Shophelp (https://shophelp.ru/)](https://shophelp.ru/)*: top 10M, ru*
1.  [BeerMoneyForum (https://www.beermoneyforum.com)](https://www.beermoneyforum.com)*: top 10M, finance, forum, gambling*, search is disabled
1.  [Math10 (https://www.math10.com/)](https://www.math10.com/)*: top 10M, forum, ru*, search is disabled
1.  [Math10 (https://www.math10.com/)](https://www.math10.com/)*: top 10M, forum, ru*
1.  [Pepper PL (https://www.pepper.pl/)](https://www.pepper.pl/)*: top 10M, pl*
1.  [SQL.ru (https://www.sql.ru)](https://www.sql.ru)*: top 10M, ru*
1.  [sigtalk.com (https://sigtalk.com)](https://sigtalk.com)*: top 10M, forum, us*
1.  [mir-stalkera.ru (http://mir-stalkera.ru)](http://mir-stalkera.ru)*: top 10M, gaming, ru*
1.  [Pedsovet (https://pedsovet.su/)](https://pedsovet.su/)*: top 10M, ru*, search is disabled

@@ -887,7 +890,7 @@ Rank data fetched from Majestic Million by domains.
1.  [lada-vesta.net (http://www.lada-vesta.net)](http://www.lada-vesta.net)*: top 10M, auto, forum, ru*
1.  [Sysadmins (https://sysadmins.ru)](https://sysadmins.ru)*: top 10M, forum, ru, tech*
1.  [Plug.DJ (https://plug.dj/)](https://plug.dj/)*: top 10M, music*, search is disabled
1.  [mcfc-fan.ru (http://mcfc-fan.ru)](http://mcfc-fan.ru)*: top 10M, ru, sport*, search is disabled
1.  [mcfc-fan.ru (http://mcfc-fan.ru)](http://mcfc-fan.ru)*: top 10M, ru, sport*
1.  [Hipforums (https://www.hipforums.com/)](https://www.hipforums.com/)*: top 10M, forum, ru*, search is disabled
1.  [Rusfishing (https://www.rusfishing.ru)](https://www.rusfishing.ru)*: top 10M, ru*
1.  [jeepgarage.org (https://jeepgarage.org)](https://jeepgarage.org)*: top 10M, forum, us*

@@ -2402,7 +2405,7 @@ Rank data fetched from Majestic Million by domains.
1.  [Terminator (http://terminator-scc.net.ru)](http://terminator-scc.net.ru)*: top 100M, ru*
1.  [Thedaftclub (https://www.thedaftclub.com)](https://www.thedaftclub.com)*: top 100M*, search is disabled
1.  [Thephysicsforum (https://www.thephysicsforum.com)](https://www.thephysicsforum.com)*: top 100M, forum*, search is disabled
1.  [TikTok Online Viewer (https://ttonlineviewer.com)](https://ttonlineviewer.com)*: top 100M*, search is disabled
1.  [TikTok Online Viewer (https://ttonlineviewer.com)](https://ttonlineviewer.com)*: top 100M*
1.  [Tkgr (http://tkgr.ru/)](http://tkgr.ru/)*: top 100M, ru*
1.  [Torrent-soft (https://torrent-soft.net)](https://torrent-soft.net)*: top 100M, ru*
1.  [TotalStavki (https://totalstavki.ru)](https://totalstavki.ru)*: top 100M, ru*, search is disabled
@@ -3142,52 +3145,34 @@ Rank data fetched from Majestic Million by domains.
1.  [discuss.flarum.org.cn (https://discuss.flarum.org.cn)](https://discuss.flarum.org.cn)*: top 100M, cn, forum*
1.  [flarum.es (https://flarum.es)](https://flarum.es)*: top 100M, es, forum*
1.  [forum.fibra.click (https://forum.fibra.click)](https://forum.fibra.click)*: top 100M, forum, it*
1.  [hiveon.com forum (https://hiveon.com/forum)](https://hiveon.com/forum)*: top 100M, coding, ru*
1.  [forum.manticoresearch.com (https://forum.manticoresearch.com)](https://forum.manticoresearch.com)*: top 100M, coding*
1.  [forum.jscourse.com (https://forum.jscourse.com)](https://forum.jscourse.com)*: top 100M, coding, ru*
1.  [forums.grandstream.com (https://forums.grandstream.com)](https://forums.grandstream.com)*: top 100M, coding*
1.  [support.wirenboard.com (https://support.wirenboard.com)](https://support.wirenboard.com)*: top 100M, coding, ru*
1.  [forum.cs-cart.ru (https://forum.cs-cart.ru)](https://forum.cs-cart.ru)*: top 100M, coding, ru*
1.  [instantcms.ru (https://instantcms.ru)](https://instantcms.ru)*: top 100M, coding, ru*
1.  [wewin.ru (https://wewin.ru)](https://wewin.ru)*: top 100M, ru*
1.  [myslo.ru (https://myslo.ru)](https://myslo.ru)*: top 100M, news, ru*
1.  [add-groups.com (https://add-groups.com)](https://add-groups.com)*: top 100M, messaging, ru*
1.  [Profi.ru (https://profi.ru)](https://profi.ru)*: top 100M, freelance, ru*
1.  [mover.uz (https://mover.uz)](https://mover.uz)*: top 100M, video*
1.  [BitPapa (https://bitpapa.com)](https://bitpapa.com)*: top 100M, crypto, ru*
1.  [minfin.com.ua (https://minfin.com.ua)](https://minfin.com.ua)*: top 100M, finance, ua*
1.  [Pexels (https://www.pexels.com)](https://www.pexels.com)*: top 100M, photo*
1.  [BestGore (https://bestgore.fun)](https://bestgore.fun)*: top 100M, video*
1.  [AirNFTs (https://app.airnfts.com)](https://app.airnfts.com)*: top 100M, crypto, nft*
1.  [GreasyFork (https://greasyfork.org)](https://greasyfork.org)*: top 100M, coding*

The list was updated at (2026-04-29)
The list was updated at (2026-04-23)

## Statistics

Enabled/total sites: 2523/3157 = 79.92%
Enabled/total sites: 2529/3142 = 80.49%

Incomplete message checks: 316/2523 = 12.52% (false positive risks)
Incomplete message checks: 320/2529 = 12.65% (false positive risks)

Status code checks: 633/2523 = 25.09% (false positive risks)
Status code checks: 632/2529 = 24.99% (false positive risks)

False positive risk (total): 37.61%
False positive risk (total): 37.64%

Sites with probing: 500px, Armchairgm, BinarySearch (disabled), BleachFandom, Bluesky, BongaCams, Boosty, BuyMeACoffee, Calendly, Cent, Chess, Code Sandbox, Code Snippet Wiki, DailyMotion, Discord, Diskusjon.no, Disqus, Docker Hub, Duolingo, FandomCommunityCentral, GitHub, GitLab, Google Plus (archived), Gravatar, HackTheBox, Hackerrank, Hashnode, Holopin, Imgur, Issuu, Keybase, Kick, Kvinneguiden, LeetCode, Lesswrong, Livejasmin, LocalCryptos (disabled), Medium, MicrosoftLearn, MixCloud, Monkeytype, NPM, Niftygateway, Omg.lol, OnlyFans, Paragraph, Picsart, Plurk, Polarsteps, Rarible, Reddit (disabled), Reddit Search (Pushshift) (disabled), Revolut.me, RoyalCams, Scratch, Soop, SportsTracker, Spotify, StackOverflow, Substack, TAP'D, Topcoder, Trello, Twitch, Twitter, Twitter Shadowban (disabled), UnstoppableDomains, Vimeo, Vivino, Warframe Market, Warpcast, Weibo, Wikipedia, Yapisal (disabled), YouNow, en.brickimedia.org, forums.grandstream.com, nightbot, notabug.org, qiwi.me (disabled)
Sites with probing: 500px, Armchairgm, BinarySearch (disabled), BleachFandom, Bluesky, BongaCams, Boosty, BuyMeACoffee, Calendly, Cent, Chess, Code Sandbox, Code Snippet Wiki, DailyMotion, Discord, Diskusjon.no, Disqus, Docker Hub, Duolingo, FandomCommunityCentral, GitHub, GitLab, Google Plus (archived), Gravatar, HackTheBox, Hackerrank, Hashnode, Holopin, Imgur, Issuu, Keybase, Kick, Kvinneguiden, LeetCode, Lesswrong, Livejasmin, LocalCryptos (disabled), Medium, MicrosoftLearn, MixCloud, Monkeytype, NPM, Niftygateway, Omg.lol, OnlyFans, Paragraph, Picsart, Plurk, Polarsteps, Rarible, Reddit, Reddit Search (Pushshift) (disabled), Revolut.me, RoyalCams, Scratch, Soop, SportsTracker, Spotify, StackOverflow, Substack, TAP'D, Topcoder, Trello, Twitch, Twitter, Twitter Shadowban (disabled), UnstoppableDomains, Vimeo, Warframe Market, Warpcast, Weibo, Wikipedia, Yapisal (disabled), YouNow, en.brickimedia.org, nightbot, notabug.org, qiwi.me (disabled)

Sites with activation: OnlyFans, Twitter, Vimeo, Weibo

Top 20 profile URLs:
- (709) `{urlMain}/index/8-0-{username} (uCoz)`
- (312) `/{username}`
- (314) `/{username}`
- (223) `{urlMain}{urlSubpath}/members/?username={username} (XenForo)`
- (172) `/user/{username}`
- (140) `/profile/{username}`
- (170) `/user/{username}`
- (138) `/profile/{username}`
- (127) `{urlMain}{urlSubpath}/search.php?author={username} (phpBB/Search)`
- (120) `{urlMain}{urlSubpath}/member.php?username={username} (vBulletin)`
- (116) `/u/{username}`
- (95) `/users/{username}`
- (92) `{urlMain}/u/{username}/summary (Discourse)`
- (68) `/@{username}`
- (93) `/users/{username}`
- (87) `{urlMain}/u/{username}/summary (Discourse)`
- (70) `/@{username}`
- (55) `/wiki/User:{username}`
- (45) `SUBDOMAIN`
- (38) `/members/?username={username}`
@@ -3200,39 +3185,39 @@ Top 20 profile URLs:

Sites by engine:
- `uCoz`: 634/709 (89.4%)
- `XenForo`: 181/223 (81.2%)
- `uCoz`: 635/709 (89.6%)
- `XenForo`: 182/223 (81.6%)
- `phpBB/Search`: 120/127 (94.5%)
- `vBulletin`: 31/120 (25.8%)
- `Discourse`: 86/92 (93.5%)
- `phpBB`: 21/27 (77.8%)
- `Discourse`: 81/87 (93.1%)
- `phpBB`: 22/27 (81.5%)
- `engine404`: 19/23 (82.6%)
- `op.gg`: 17/17 (100.0%)
- `Flarum`: 15/15 (100.0%)
- `Wordpress/Author`: 7/9 (77.8%)
- `engineRedirect`: 3/4 (75.0%)
- `engine404get`: 3/3 (100.0%)
- `phpBB2/Search`: 2/3 (66.7%)
- `engine404get`: 2/2 (100.0%)

Top 20 tags:
- (1057) `NO_TAGS` (non-standard)
- (750) `forum`
- (128) `gaming`
- (88) `coding`
- (80) `coding`
- (58) `photo`
- (46) `tech`
- (45) `social`
- (42) `news`
- (41) `news`
- (39) `blog`
- (33) `music`
- (31) `shopping`
- (29) `crypto`
- (27) `finance`
- (25) `video`
- (27) `crypto`
- (26) `finance`
- (25) `sharing`
- (23) `video`
- (23) `education`
- (22) `freelance`
- (21) `art`
- (21) `freelance`
- (18) `hobby`
- (17) `sport`

@@ -56,110 +56,3 @@ async def test_import_aiohttp_cookies(cookie_test_server):
    print(f"Server response: {result}")

    assert result == {'cookies': {'a': 'b'}}


# ---- OnlyFans signing tests (pure-compute, no network) ----


class _FakeSite:
    """Minimal stand-in for MaigretSite with the attributes onlyfans() touches."""

    def __init__(self, headers=None, activation=None):
        self.headers = headers or {}
        self.activation = activation or {
            "static_param": "jLM8LXHU1CGcuCzPMNwWX9osCScVuP4D",
            "checksum_indexes": [28, 3, 16, 32, 25, 24, 23, 0, 26],
            "checksum_constant": -180,
            "format": "57203:{}:{:x}:69cfa6d8",
            "url": "https://onlyfans.com/api2/v2/init",
        }


class _FakeResponse:
    def __init__(self, cookies=None):
        self.cookies = cookies or {}


def test_onlyfans_sets_xbc_when_zero(monkeypatch):
    site = _FakeSite(headers={"x-bc": "0", "cookie": "existing=1"})

    # Prevent any real network. If _sign path still fires requests.get, fail loudly.
    import maigret.activation as act_mod

    def boom(*a, **kw):  # pragma: no cover - sanity
        raise AssertionError("requests.get should not run when cookie is present")

    monkeypatch.setattr(act_mod.__dict__.get("requests", None) or __import__("requests"), "get", boom, raising=False)

    logger = Mock()
    ParsingActivator.onlyfans(site, logger, url="https://onlyfans.com/api2/v2/users/adam")

    # x-bc must be rewritten to a non-zero hex token
    assert site.headers["x-bc"] != "0"
    assert len(site.headers["x-bc"]) == 40  # 20 bytes → 40 hex chars
    # time / sign headers set for target URL
    assert "time" in site.headers and site.headers["time"].isdigit()
    assert site.headers["sign"].startswith("57203:")


def test_onlyfans_fetches_init_cookie_when_missing(monkeypatch):
    """When cookie header is absent, init endpoint is called and its cookies stored."""
    site = _FakeSite(headers={"x-bc": "already_set_token", "user-id": "0"})

    import requests

    captured = {}

    def fake_get(url, headers=None, timeout=15):
        captured["url"] = url
        captured["headers"] = dict(headers or {})
        return _FakeResponse(cookies={"sess": "abc123", "csrf": "xyz"})

    monkeypatch.setattr(requests, "get", fake_get)

    logger = Mock()
    ParsingActivator.onlyfans(site, logger, url="https://onlyfans.com/api2/v2/users/adam")

    # init request made
    assert captured["url"] == site.activation["url"]
    # headers passed to init include freshly generated time/sign
    assert "time" in captured["headers"]
    assert captured["headers"]["sign"].startswith("57203:")
    # cookie header populated from response
    assert site.headers["cookie"] == "sess=abc123; csrf=xyz"


def test_onlyfans_signature_is_deterministic_for_same_time(monkeypatch):
    """Two calls with patched time produce identical signatures."""
    site1 = _FakeSite(headers={"x-bc": "token", "cookie": "c=1"})
    site2 = _FakeSite(headers={"x-bc": "token", "cookie": "c=1"})

    import maigret.activation
    monkeypatch.setattr(maigret.activation, "_time", __import__("time"), raising=False)

    fixed = 1_700_000_000.123
    import time as time_mod
    monkeypatch.setattr(time_mod, "time", lambda: fixed)

    logger = Mock()
    ParsingActivator.onlyfans(site1, logger, url="https://onlyfans.com/api2/v2/users/adam")
    ParsingActivator.onlyfans(site2, logger, url="https://onlyfans.com/api2/v2/users/adam")

    assert site1.headers["time"] == site2.headers["time"]
    assert site1.headers["sign"] == site2.headers["sign"]


def test_onlyfans_sign_differs_per_path(monkeypatch):
    """Different target URLs must yield different signatures."""
    site = _FakeSite(headers={"x-bc": "token", "cookie": "c=1"})

    import time as time_mod
    monkeypatch.setattr(time_mod, "time", lambda: 1_700_000_000.0)

    logger = Mock()
    ParsingActivator.onlyfans(site, logger, url="https://onlyfans.com/api2/v2/users/adam")
    sig_adam = site.headers["sign"]

    ParsingActivator.onlyfans(site, logger, url="https://onlyfans.com/api2/v2/users/bob")
    sig_bob = site.headers["sign"]

    assert sig_adam != sig_bob
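The four tests above pin down the signing contract end to end. As a compact reference, here is a sketch of how such a `sign` header can be computed from the activation payload; the function name and exact hash layout are assumptions based on the widely documented OnlyFans signing scheme, not maigret's implementation:

```python
import hashlib
import time
from urllib.parse import urlparse


def make_sign(activation: dict, url: str, user_id: str = "0") -> dict:
    """Hypothetical sketch: derive time/sign headers from the activation payload.

    sha1 over static_param, timestamp, request path and user id; a checksum is
    summed from fixed digest positions and offset by checksum_constant.
    """
    now = str(int(time.time()))
    path = urlparse(url).path
    digest = hashlib.sha1(
        "\n".join([activation["static_param"], now, path, user_id]).encode()
    ).hexdigest()
    ascii_sum = sum(ord(digest[i]) for i in activation["checksum_indexes"])
    checksum = abs(ascii_sum + activation["checksum_constant"])
    # "format" is e.g. "57203:{}:{:x}:69cfa6d8" -> prefix:digest:checksum_hex:suffix
    return {"time": now, "sign": activation["format"].format(digest, checksum)}
```

This matches the properties the tests assert: with a fixed clock the digest depends only on the payload and path, so identical inputs give identical signatures and different paths give different ones.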

@@ -1,22 +1,7 @@
from argparse import ArgumentTypeError

from mock import Mock
import pytest

from maigret import search
from maigret.checking import (
    detect_error_page,
    extract_ids_data,
    parse_usernames,
    update_results_info,
    get_failed_sites,
    timeout_check,
    debug_response_logging,
    process_site_result,
)
from maigret.errors import CheckError
from maigret.result import MaigretCheckResult, MaigretCheckStatus
from maigret.sites import MaigretSite

def site_result_except(server, username, **kwargs):
@@ -82,386 +67,3 @@ async def test_checking_by_message_negative(httpserver, local_test_db):

    result = await search('unclaimed', site_dict=sites_dict, logger=Mock())
    assert result['Message']['status'].is_found() is True


# ---- Pure-function unit tests (no network) ----


def test_detect_error_page_site_specific():
    err = detect_error_page(
        "Please enable JavaScript to proceed",
        200,
        {"Please enable JavaScript to proceed": "Scraping protection"},
        ignore_403=False,
    )
    assert err is not None
    assert err.type == "Site-specific"
    assert err.desc == "Scraping protection"


def test_detect_error_page_403():
    err = detect_error_page("some body", 403, {}, ignore_403=False)
    assert err is not None
    assert err.type == "Access denied"


def test_detect_error_page_403_ignored():
    # XenForo engine uses ignore403 because member-not-found also returns 403
    assert detect_error_page("not found body", 403, {}, ignore_403=True) is None


def test_detect_error_page_999_linkedin():
    # LinkedIn returns 999 on bot suspicion — must NOT be reported as Server error
    assert detect_error_page("", 999, {}, ignore_403=False) is None


def test_detect_error_page_500():
    err = detect_error_page("", 503, {}, ignore_403=False)
    assert err is not None
    assert err.type == "Server"
    assert "503" in err.desc


def test_detect_error_page_ok():
    assert detect_error_page("hello world", 200, {}, ignore_403=False) is None
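For orientation, here is a minimal model of `detect_error_page` that is consistent with these six tests; the real function in `maigret.checking` takes the same decisions but may differ in detail, and `CheckError` here is a stub:

```python
from typing import Optional


class CheckError:
    """Stub mirroring the type/desc attributes of maigret.errors.CheckError."""

    def __init__(self, type_: str, desc: str = ""):
        self.type = type_
        self.desc = desc


def detect_error_page(html_text, status_code, error_strings, ignore_403) -> Optional[CheckError]:
    # Site-specific error markers take priority over generic status checks.
    for flag, msg in (error_strings or {}).items():
        if flag in (html_text or ""):
            return CheckError("Site-specific", msg)
    if status_code == 403 and not ignore_403:
        return CheckError("Access denied", "403 status code")
    # Only real 5xx codes count as server errors; LinkedIn's 999 does not.
    if 500 <= status_code < 600:
        return CheckError("Server", f"{status_code} status code")
    return None
```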


def test_parse_usernames_single_username():
    logger = Mock()
    result = parse_usernames({"profile_username": "alice"}, logger)
    assert result == {"alice": "username"}


def test_parse_usernames_list_of_usernames():
    logger = Mock()
    result = parse_usernames({"other_usernames": "['alice', 'bob']"}, logger)
    assert result == {"alice": "username", "bob": "username"}


def test_parse_usernames_malformed_list():
    logger = Mock()
    result = parse_usernames({"other_usernames": "not-a-list"}, logger)
    # should swallow the error and just return empty
    assert result == {}
    assert logger.warning.called


def test_parse_usernames_supported_id():
    logger = Mock()
    # "telegram" is in SUPPORTED_IDS per socid_extractor
    from maigret.checking import SUPPORTED_IDS
    if SUPPORTED_IDS:
        key = next(iter(SUPPORTED_IDS))
        result = parse_usernames({key: "some_value"}, logger)
        assert result.get("some_value") == key
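The behaviour pinned down above can be modelled roughly like this; a sketch only, where the `SUPPORTED_IDS` subset is illustrative and the real parser in `maigret.checking` is richer:

```python
import ast

SUPPORTED_IDS = ("telegram_id", "yandex_public_id")  # illustrative subset


def parse_usernames(extracted: dict, logger) -> dict:
    """Map extracted identifiers to {value: id_type}."""
    usernames = {}
    for key, value in extracted.items():
        if "usernames" in key:
            # Stringified list such as "['alice', 'bob']"
            try:
                for name in ast.literal_eval(value):
                    usernames[name] = "username"
            except (ValueError, SyntaxError) as e:
                logger.warning(e)
        elif "username" in key:
            usernames[value] = "username"
        elif key in SUPPORTED_IDS:
            usernames[value] = key
    return usernames
```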


def test_update_results_info_links():
    info = {"username": "test"}
    result = update_results_info(
        info,
        {"links": "['https://example.com/a', 'https://example.com/b']", "website": "https://example.com/w"},
        {"alice": "username"},
    )
    assert result["ids_usernames"] == {"alice": "username"}
    assert "https://example.com/w" in result["ids_links"]
    assert "https://example.com/a" in result["ids_links"]


def test_update_results_info_no_website():
    info = {}
    result = update_results_info(info, {"links": "[]"}, {})
    assert result["ids_links"] == []


def test_extract_ids_data_bad_html_returns_empty():
    logger = Mock()
    # Random HTML should not raise — returns {} if nothing matches
    out = extract_ids_data("<html><body>nothing special</body></html>", logger, Mock(name="Site"))
    assert isinstance(out, dict)


def test_get_failed_sites_filters_permanent_errors():
    # Temporary errors (Request timeout, Connecting failure, etc.) are retryable → returned.
    # Permanent ones (Captcha, Access denied, etc.) and results without error → filtered out.
    good_status = MaigretCheckResult("u", "S1", "https://s1", MaigretCheckStatus.CLAIMED)
    timeout_err = MaigretCheckResult(
        "u", "S2", "https://s2", MaigretCheckStatus.UNKNOWN,
        error=CheckError("Request timeout", "slow server"),
    )
    captcha_err = MaigretCheckResult(
        "u", "S3", "https://s3", MaigretCheckStatus.UNKNOWN,
        error=CheckError("Captcha", "Cloudflare"),
    )
    results = {
        "S1": {"status": good_status},
        "S2": {"status": timeout_err},
        "S3": {"status": captcha_err},
        "S4": {},  # no status at all
    }
    failed = get_failed_sites(results)
    # Only the temporary-error site is retry-worthy
    assert failed == ["S2"]


def test_timeout_check_valid():
    assert timeout_check("2.5") == 2.5
    assert timeout_check("30") == 30.0


def test_timeout_check_invalid():
    with pytest.raises(ArgumentTypeError):
        timeout_check("abc")
    with pytest.raises(ArgumentTypeError):
        timeout_check("0")
    with pytest.raises(ArgumentTypeError):
        timeout_check("-1")
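`timeout_check` is fully specified by the two tests above; a sketch that satisfies them (maigret's own implementation may word the error messages differently):

```python
from argparse import ArgumentTypeError


def timeout_check(value):
    """Validate a CLI timeout argument: a strictly positive number of seconds."""
    try:
        timeout = float(value)
    except ValueError:
        raise ArgumentTypeError(f"Timeout '{value}' must be a number.")
    if timeout <= 0:
        raise ArgumentTypeError(f"Timeout '{value}' must be a positive number.")
    return timeout
```

Returning the parsed float lets the function double as an argparse `type=` callable, which is why invalid input raises `ArgumentTypeError` rather than `ValueError`.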


def test_debug_response_logging_writes(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)
    debug_response_logging("https://example.com", "<html>hi</html>", 200, None)
    out = (tmp_path / "debug.log").read_text()
    assert "https://example.com" in out
    assert "200" in out


def test_debug_response_logging_no_response(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)
    debug_response_logging("https://example.com", None, None, CheckError("Timeout"))
    out = (tmp_path / "debug.log").read_text()
    assert "No response" in out


def _make_site(data_overrides=None):
    base = {
        "url": "https://x/{username}",
        "urlMain": "https://x",
        "checkType": "status_code",
        "usernameClaimed": "a",
        "usernameUnclaimed": "b",
    }
    if data_overrides:
        base.update(data_overrides)
    return MaigretSite("TestSite", base)


def test_process_site_result_no_response_returns_info():
    site = _make_site()
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    out = process_site_result(None, Mock(), Mock(), info, site)
    assert out is info


def test_process_site_result_status_already_set():
    site = _make_site()
    pre = MaigretCheckResult("a", "S", "u", MaigretCheckStatus.ILLEGAL)
    info = {"username": "a", "parsing_enabled": False, "status": pre, "url_user": "u"}
    # Since status is already set, function returns without changes
    out = process_site_result(("<html/>", 200, None), Mock(), Mock(), info, site)
    assert out["status"] is pre


def test_process_site_result_status_code_claimed():
    site = _make_site({"checkType": "status_code"})
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    out = process_site_result(("<html/>", 200, None), Mock(), Mock(), info, site)
    assert out["status"].status == MaigretCheckStatus.CLAIMED
    assert out["http_status"] == 200


def test_process_site_result_status_code_available():
    site = _make_site({"checkType": "status_code"})
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    out = process_site_result(("<html/>", 404, None), Mock(), Mock(), info, site)
    assert out["status"].status == MaigretCheckStatus.AVAILABLE


def test_process_site_result_message_claimed():
    site = _make_site({
        "checkType": "message",
        "presenseStrs": ["profile-name"],
        "absenceStrs": ["not found"],
    })
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    out = process_site_result(("<div class='profile-name'>Alice</div>", 200, None), Mock(), Mock(), info, site)
    assert out["status"].status == MaigretCheckStatus.CLAIMED


def test_process_site_result_message_available_by_absence():
    site = _make_site({
        "checkType": "message",
        "presenseStrs": ["profile-name"],
        "absenceStrs": ["not found"],
    })
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    out = process_site_result(("<h1>not found</h1> profile-name too", 200, None), Mock(), Mock(), info, site)
    # absence marker wins even if presence marker also appears
    assert out["status"].status == MaigretCheckStatus.AVAILABLE


def test_process_site_result_with_error_is_unknown():
    site = _make_site({"checkType": "status_code"})
    info = {"username": "a", "parsing_enabled": False, "url_user": "https://x/a"}
    resp = ("body", 403, CheckError("Captcha", "Cloudflare"))
    out = process_site_result(resp, Mock(), Mock(), info, site)
    assert out["status"].status == MaigretCheckStatus.UNKNOWN
    assert out["status"].error is not None
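The decision table these `process_site_result` tests encode can be condensed into a small classifier. All names here are hypothetical; the real function also performs parsing, IDs extraction and more check types:

```python
from enum import Enum


class Status(Enum):
    CLAIMED = "claimed"
    AVAILABLE = "available"
    UNKNOWN = "unknown"


def classify(check_type, html_text, status_code, error,
             presence_strs=(), absence_strs=()):
    """Reduce a response to a check status, mirroring the assertions above."""
    if error is not None:
        return Status.UNKNOWN
    if check_type == "status_code":
        # Any 2xx means the profile page exists.
        return Status.CLAIMED if 200 <= status_code < 300 else Status.AVAILABLE
    if check_type == "message":
        # An absence marker wins even when a presence marker also appears.
        if any(s in html_text for s in absence_strs):
            return Status.AVAILABLE
        if any(s in html_text for s in presence_strs):
            return Status.CLAIMED
        return Status.AVAILABLE
    return Status.UNKNOWN
```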


# ---- CurlCffiChecker: TLS impersonation header sanitisation ----


class _FakeCurlResponse:
    def __init__(self, text="ok", status_code=200):
        self.text = text
        self.status_code = status_code


class _FakeCurlSession:
    """Captures the kwargs of the last .get/.post/.head call for assertions."""

    last_method = None
    last_kwargs = None

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False

    async def get(self, **kwargs):
        type(self).last_method = 'get'
        type(self).last_kwargs = kwargs
        return _FakeCurlResponse()

    async def post(self, **kwargs):
        type(self).last_method = 'post'
        type(self).last_kwargs = kwargs
        return _FakeCurlResponse()

    async def head(self, **kwargs):
        type(self).last_method = 'head'
        type(self).last_kwargs = kwargs
        return _FakeCurlResponse()


@pytest.fixture
def fake_curl_cffi(monkeypatch):
    """Replace CurlCffiAsyncSession with a recorder. Resets capture between tests."""
    from maigret import checking
    _FakeCurlSession.last_method = None
    _FakeCurlSession.last_kwargs = None
    monkeypatch.setattr(checking, 'CurlCffiAsyncSession', _FakeCurlSession)
    return _FakeCurlSession


@pytest.mark.asyncio
async def test_curl_cffi_strips_random_user_agent_to_let_impersonation_drive_ua(fake_curl_cffi):
    """Regression: maigret used to forward `get_random_user_agent()` (often Chrome 91)
    to curl_cffi alongside `impersonate="chrome"` (Chrome 131 TLS). Cloudflare composite
    bot scoring rejects the resulting "Chrome 91 UA + Chrome 131 TLS" combo with a JS
    challenge. The fix strips User-Agent and Connection from the headers passed to
    curl_cffi so the impersonation default UA wins.
    """
    from maigret.checking import CurlCffiChecker

    checker = CurlCffiChecker(logger=Mock(), browser_emulate='chrome')
    checker.prepare(
        url='https://example.com/u/test',
        headers={
            "User-Agent": "Mozilla/5.0 ... Chrome/91.0.4472.124 ...",  # maigret default
            "Connection": "close",  # maigret default
        },
        allow_redirects=True,
        timeout=10,
        method='get',
    )
    await checker.check()

    sent = fake_curl_cffi.last_kwargs
    assert fake_curl_cffi.last_method == 'get'
    assert sent['impersonate'] == 'chrome'
    # The whole point of the fix: random UA must not leak through.
    assert sent['headers'] is None or 'User-Agent' not in sent['headers']
    assert sent['headers'] is None or 'user-agent' not in {k.lower() for k in sent['headers']}
    # Connection: close also stripped (interferes with impersonation defaults).
    assert sent['headers'] is None or 'Connection' not in sent['headers']


@pytest.mark.asyncio
async def test_curl_cffi_preserves_site_specific_headers(fake_curl_cffi):
    """Site-specific headers (e.g. Content-Type for POST APIs, auth tokens, cookies)
    must survive the User-Agent strip — only UA and Connection are removed.
    """
    from maigret.checking import CurlCffiChecker

    checker = CurlCffiChecker(logger=Mock(), browser_emulate='chrome')
    checker.prepare(
        url='https://example.com/api',
        headers={
            "User-Agent": "Mozilla/5.0 random",
            "Connection": "close",
            "Content-Type": "application/json",
            "X-Csrf-Token": "abc123",
        },
        allow_redirects=True,
        timeout=10,
        method='get',
    )
    await checker.check()

    sent_headers = fake_curl_cffi.last_kwargs['headers']
    assert sent_headers is not None
    assert sent_headers.get("Content-Type") == "application/json"
    assert sent_headers.get("X-Csrf-Token") == "abc123"
    # Sanity: stripped pair is gone
    assert "User-Agent" not in sent_headers
    assert "Connection" not in sent_headers


@pytest.mark.asyncio
async def test_curl_cffi_handles_empty_headers(fake_curl_cffi):
    """No headers at all → headers kwarg is None (not an empty dict that could confuse
    curl_cffi's impersonation header injection)."""
    from maigret.checking import CurlCffiChecker

    checker = CurlCffiChecker(logger=Mock(), browser_emulate='chrome')
    checker.prepare(
        url='https://example.com/u/test',
        headers=None,
        allow_redirects=True,
        timeout=10,
        method='get',
    )
    await checker.check()

    assert fake_curl_cffi.last_kwargs['headers'] is None
    assert fake_curl_cffi.last_kwargs['impersonate'] == 'chrome'


@pytest.mark.asyncio
async def test_curl_cffi_strips_ua_for_post_too(fake_curl_cffi):
    """The same UA-strip must apply on POST (e.g. Discord-style POST username probes
    with `tls_fingerprint`)."""
    from maigret.checking import CurlCffiChecker

    checker = CurlCffiChecker(logger=Mock(), browser_emulate='chrome')
    checker.prepare(
        url='https://example.com/api/check',
        headers={
            "User-Agent": "Mozilla/5.0 random",
            "Content-Type": "application/json",
        },
        allow_redirects=True,
        timeout=10,
        method='post',
        payload={"username": "test"},
    )
    await checker.check()

    sent = fake_curl_cffi.last_kwargs
    assert fake_curl_cffi.last_method == 'post'
    assert sent['json'] == {"username": "test"}
    assert "User-Agent" not in sent['headers']
    assert sent['headers'].get("Content-Type") == "application/json"
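Across all four tests, the sanitisation contract reduces to one small helper. The name is an assumption; the behaviour is exactly what the assertions pin down:

```python
def sanitize_impersonation_headers(headers):
    """Drop headers that would conflict with curl_cffi's impersonation defaults.

    Only User-Agent and Connection are removed, case-insensitively; everything
    else (Content-Type, auth tokens, cookies) passes through. An empty result
    becomes None so the impersonation profile injects its own browser headers.
    """
    if not headers:
        return None
    cleaned = {
        k: v for k, v in headers.items()
        if k.lower() not in ("user-agent", "connection")
    }
    return cleaned or None
```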


+33 -2
@@ -49,10 +49,9 @@ DEFAULT_ARGS: Dict[str, Any] = {
    'with_domains': False,
    'xmind': False,
    'md': False,
    'ai': False,
    'ai_model': 'gpt-4o',
    'no_autoupdate': False,
    'force_update': False,
    'extra_db_files': [],
}


@@ -128,6 +127,38 @@ def test_args_exclude_tags(argparser):
        assert getattr(args, arg) == want_args[arg]


def test_args_single_extra_db(argparser):
    args = argparser.parse_args('--extra-db extras.json username'.split())

    want_args = dict(DEFAULT_ARGS)
    want_args.update(
        {
            'extra_db_files': ['extras.json'],
            'username': ['username'],
        }
    )

    for arg in vars(args):
        assert getattr(args, arg) == want_args[arg]


def test_args_multiple_extra_dbs(argparser):
    args = argparser.parse_args(
        '--extra-db a.json --extra-db https://example.com/b.json username'.split()
    )

    want_args = dict(DEFAULT_ARGS)
    want_args.update(
        {
            'extra_db_files': ['a.json', 'https://example.com/b.json'],
            'username': ['username'],
        }
    )

    for arg in vars(args):
        assert getattr(args, arg) == want_args[arg]
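The `--extra-db` behaviour these tests expect maps directly onto argparse's `append` action. A sketch of the wiring; the real parser lives in maigret's CLI module and carries many more flags:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--extra-db",
    action="append",       # each repetition appends another path/URL
    dest="extra_db_files",
    default=[],
    metavar="PATH_OR_URL",
    help="additional sites database (JSON file or URL); may be given multiple times",
)
parser.add_argument("username", nargs="*")

args = parser.parse_args(
    ["--extra-db", "a.json", "--extra-db", "https://example.com/b.json", "username"]
)
```

With `action="append"`, an absent flag leaves the `default=[]` in place, so the `DEFAULT_ARGS` baseline above stays an empty list.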


def test_args_tags_with_exclude_tags(argparser):
    args = argparser.parse_args('--tags coding --exclude-tags porn username'.split())


+11 -11
@@ -26,7 +26,7 @@ async def test_simple_asyncio_executor():
    executor = AsyncioSimpleExecutor(logger=logger)
    assert await executor.run(tasks) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    assert executor.execution_time > 0.2
    assert executor.execution_time < 1.0
    assert executor.execution_time < 0.3


@pytest.mark.asyncio
@@ -37,7 +37,7 @@ async def test_asyncio_progressbar_executor():
    # no guarantees for the results order
    assert sorted(await executor.run(tasks)) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    assert executor.execution_time > 0.2
    assert executor.execution_time < 1.0
    assert executor.execution_time < 0.3


@pytest.mark.asyncio
@@ -48,7 +48,7 @@ async def test_asyncio_progressbar_semaphore_executor():
    # no guarantees for the results order
    assert sorted(await executor.run(tasks)) == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    assert executor.execution_time > 0.2
    assert executor.execution_time < 1.1
    assert executor.execution_time < 0.4


@pytest.mark.slow
@@ -59,12 +59,12 @@ async def test_asyncio_progressbar_queue_executor():
    executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=2)
    assert await executor.run(tasks) == [0, 1, 3, 2, 4, 6, 7, 5, 9, 8]
    assert executor.execution_time > 0.5
    assert executor.execution_time < 1.4
    assert executor.execution_time < 0.7

    executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=3)
    assert await executor.run(tasks) == [0, 3, 1, 4, 6, 2, 7, 9, 5, 8]
    assert executor.execution_time > 0.4
    assert executor.execution_time < 1.3
    assert executor.execution_time < 0.6

    executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=5)
    assert await executor.run(tasks) in (
@@ -72,12 +72,12 @@
        [0, 3, 6, 1, 4, 9, 7, 2, 5, 8],
    )
    assert executor.execution_time > 0.3
    assert executor.execution_time < 1.2
    assert executor.execution_time < 0.5

    executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=10)
    assert await executor.run(tasks) == [0, 3, 6, 9, 1, 4, 7, 2, 5, 8]
    assert executor.execution_time > 0.2
    assert executor.execution_time < 1.1
    assert executor.execution_time < 0.4


@pytest.mark.asyncio
@@ -88,13 +88,13 @@ async def test_asyncio_queue_generator_executor():
    results = [result async for result in executor.run(tasks)]  # type: ignore[arg-type]
    assert results == [0, 1, 3, 2, 4, 6, 7, 5, 9, 8]
    assert executor.execution_time > 0.5
    assert executor.execution_time < 1.3
    assert executor.execution_time < 0.6

    executor = AsyncioQueueGeneratorExecutor(logger=logger, in_parallel=3)
    results = [result async for result in executor.run(tasks)]  # type: ignore[arg-type]
    assert results == [0, 3, 1, 4, 6, 2, 7, 9, 5, 8]
    assert executor.execution_time > 0.4
    assert executor.execution_time < 1.2
    assert executor.execution_time < 0.5

    executor = AsyncioQueueGeneratorExecutor(logger=logger, in_parallel=5)
    results = [result async for result in executor.run(tasks)]  # type: ignore[arg-type]
@@ -103,10 +103,10 @@
        [0, 3, 6, 1, 4, 9, 7, 2, 5, 8],
    )
    assert executor.execution_time > 0.3
    assert executor.execution_time < 1.1
    assert executor.execution_time < 0.4

    executor = AsyncioQueueGeneratorExecutor(logger=logger, in_parallel=10)
    results = [result async for result in executor.run(tasks)]  # type: ignore[arg-type]
    assert results == [0, 3, 6, 9, 1, 4, 7, 2, 5, 8]
    assert executor.execution_time > 0.2
    assert executor.execution_time < 1.0
    assert executor.execution_time < 0.3
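For context, a minimal executor of the shape these tests appear to exercise: tasks are `(coroutine_function, args)` pairs, results keep submission order, and wall-clock duration is recorded in `execution_time`. This is a sketch under those assumptions, not maigret's implementation:

```python
import asyncio
import time


class AsyncioSimpleExecutor:
    """Run all tasks concurrently with asyncio.gather and time the batch."""

    def __init__(self, logger=None):
        self.logger = logger
        self.execution_time = 0.0

    async def run(self, tasks):
        start = time.monotonic()
        # gather preserves the submission order of its awaitables
        results = await asyncio.gather(*(f(*args) for f, args in tasks))
        self.execution_time = time.monotonic() - start
        return list(results)
```

The queue-based variants tested above differ in that a fixed number of workers (`in_parallel`) drains a shared queue, which is why their result order depends on the concurrency level.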

@@ -10,15 +10,8 @@ import xmind  # type: ignore[import-untyped]
from jinja2 import Template

from maigret.report import (
    filter_supposed_data,
    sort_report_by_data_points,
    _md_format_value,
    generate_csv_report,
    generate_txt_report,
    save_csv_report,
    save_txt_report,
    save_json_report,
    save_markdown_report,
    save_xmind_report,
    save_html_report,
    save_pdf_report,
@@ -463,223 +456,3 @@ def test_text_report_broken():

    assert brief_part in report_text
    assert 'us' in report_text
    assert 'photo' in report_text


def test_filter_supposed_data():
    data = {
        'fullname': ['Alice'],
        'gender': ['female'],
        'location': ['Berlin'],
        'age': ['30'],
        'email': ['x@y.z'],  # not allowed, must be dropped
        'bio': ['hi'],  # not allowed
    }
    result = filter_supposed_data(data)
    assert result == {
        'Fullname': 'Alice',
        'Gender': 'female',
        'Location': 'Berlin',
        'Age': '30',
    }
|
||||
|
||||
|
||||
def test_filter_supposed_data_empty():
|
||||
assert filter_supposed_data({}) == {}
|
||||
assert filter_supposed_data({'nope': ['v']}) == {}
|
||||
|
||||
|
||||
def test_filter_supposed_data_scalar_values():
|
||||
# Strings and scalars must be kept whole — previously v[0] on "Alice"
|
||||
# silently returned "A" instead of "Alice".
|
||||
data = {
|
||||
'fullname': 'Alice',
|
||||
'gender': 'female',
|
||||
'location': 'Berlin',
|
||||
'age': 30,
|
||||
}
|
||||
assert filter_supposed_data(data) == {
|
||||
'Fullname': 'Alice',
|
||||
'Gender': 'female',
|
||||
'Location': 'Berlin',
|
||||
'Age': 30,
|
||||
}
|
||||
|
||||
|
||||
def test_filter_supposed_data_empty_list_yields_empty_string():
|
||||
# Edge case: list value present but empty should not crash with IndexError.
|
||||
assert filter_supposed_data({'fullname': []}) == {'Fullname': ''}
|
||||
|
||||
|
||||
def test_filter_supposed_data_mixed_values():
|
||||
# List and scalar mixed in the same payload.
|
||||
data = {'fullname': ['Alice', 'Alicia'], 'gender': 'female'}
|
||||
assert filter_supposed_data(data) == {
|
||||
'Fullname': 'Alice',
|
||||
'Gender': 'female',
|
||||
}
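The tests above pin down the contract precisely: a whitelist of keys, capitalized output keys, list values reduced to their first element (empty lists degrading to `''`), and scalars kept whole. A minimal reimplementation inferred from those assertions (a sketch, not the actual `maigret.report` code):

```python
# Whitelist inferred from the tests; anything else is dropped.
SUPPOSED_KEYS = ('fullname', 'gender', 'location', 'age')


def filter_supposed_data(data):
    out = {}
    for key, value in data.items():
        if key not in SUPPOSED_KEYS:
            continue
        if isinstance(value, list):
            # Keep the first candidate; an empty list degrades to ''.
            value = value[0] if value else ''
        out[key.capitalize()] = value
    return out
```

Note that scalars pass through untouched, which is exactly the regression `test_filter_supposed_data_scalar_values` guards against.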


def test_sort_report_by_data_points():
    status_many = MaigretCheckResult('', '', '', MaigretCheckStatus.CLAIMED)
    status_many.ids_data = {'a': 1, 'b': 2, 'c': 3}
    status_one = MaigretCheckResult('', '', '', MaigretCheckStatus.CLAIMED)
    status_one.ids_data = {'a': 1}
    status_none = MaigretCheckResult('', '', '', MaigretCheckStatus.CLAIMED)

    results = {
        'few': {'status': status_one},
        'many': {'status': status_many},
        'zero': {'status': status_none},
        'nostatus': {},
    }
    sorted_out = sort_report_by_data_points(results)
    keys = list(sorted_out.keys())
    # site with 3 ids_data fields must come first
    assert keys[0] == 'many'
    # site with 1 field next
    assert keys[1] == 'few'
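The ordering asserted here can be satisfied by ranking sites on the number of `ids_data` fields their status carries, descending. A hypothetical sketch (the real `sort_report_by_data_points` lives in `maigret.report`; the tolerant `getattr` covers entries with no status at all, like `'nostatus'` above):

```python
def sort_report_by_data_points(results):
    # More ids_data fields on the status -> earlier in the report.
    def data_points(item):
        status = item[1].get('status')
        return len(getattr(status, 'ids_data', None) or {})

    return dict(sorted(results.items(), key=data_points, reverse=True))
```

Since `sorted` is stable, sites with equally many data points keep their original relative order.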


def test_md_format_value_list():
    assert _md_format_value(['a', 'b', 'c']) == 'a, b, c'


def test_md_format_value_url():
    assert _md_format_value('https://example.com') == '[https://example.com](https://example.com)'
    assert _md_format_value('http://x.y') == '[http://x.y](http://x.y)'


def test_md_format_value_plain():
    assert _md_format_value('hello') == 'hello'
    assert _md_format_value(42) == '42'


def test_save_csv_report():
    filename = 'report_test.csv'
    save_csv_report(filename, 'test', EXAMPLE_RESULTS)
    with open(filename) as f:
        content = f.read()
    assert 'username,name,url_main' in content
    assert 'test,GitHub' in content


def test_save_txt_report():
    filename = 'report_test.txt'
    save_txt_report(filename, 'test', EXAMPLE_RESULTS)
    with open(filename) as f:
        content = f.read()
    assert 'https://www.github.com/test' in content
    assert 'Total Websites Username Detected On : 1' in content


def test_save_json_report_simple():
    filename = 'report_test.json'
    save_json_report(filename, 'test', EXAMPLE_RESULTS, 'simple')
    with open(filename) as f:
        data = json.load(f)
    assert 'GitHub' in data


def test_save_json_report_ndjson():
    filename = 'report_test_ndjson.json'
    save_json_report(filename, 'test', EXAMPLE_RESULTS, 'ndjson')
    with open(filename) as f:
        lines = f.readlines()
    assert len(lines) == 1
    assert json.loads(lines[0])['sitename'] == 'GitHub'


def _markdown_context_with_rich_ids():
    """Build a context with found accounts, ids_data (incl. image, url, list) to exercise all branches."""
    found_result = copy.deepcopy(GOOD_RESULT)
    found_result.tags = ['photo', 'us']
    found_result.ids_data = {
        "fullname": "Alice",
        "name": "Alice A.",
        "location": "Berlin",
        "bio": "Photographer",
        "external_url": "https://example.com/profile",
        "image": "https://example.com/avatar.png",  # must be skipped
        "aliases": ["alice", "alicea"],  # list value
        "last_online": "2024-01-02 10:00:00",
    }
    data = {
        'Github': {
            'username': 'alice',
            'parsing_enabled': True,
            'url_main': 'https://github.com/',
            'url_user': 'https://github.com/alice',
            'status': found_result,
            'http_status': 200,
            'is_similar': False,
            'rank': 1,
            'site': MaigretSite('Github', {}),
            'found': True,
            'ids_data': found_result.ids_data,
        },
        'Similar': {
            'username': 'alice',
            'url_user': 'https://other.com/alice',
            'is_similar': True,
            'found': True,
            'status': copy.deepcopy(GOOD_RESULT),
        },
    }
    return {
        'username': 'alice',
        'generated_at': '2024-01-02 10:00',
        'brief': 'Search returned 1 account',
        'countries_tuple_list': [('us', 1)],
        'interests_tuple_list': [('photo', 1)],
        'first_seen': '2023-01-01',
        'results': [('alice', 'username', data)],
    }


def test_save_markdown_report():
    filename = 'report_test.md'
    context = _markdown_context_with_rich_ids()
    save_markdown_report(filename, context, run_info={'sites_count': 100, 'flags': '--top-sites 100'})
    with open(filename) as f:
        content = f.read()
    assert '# Report by searching on username "alice"' in content
    assert '## Summary' in content
    assert '## Accounts found' in content
    assert '### Github' in content
    assert '[https://github.com/alice](https://github.com/alice)' in content
    assert 'Ethical use' in content
    assert '100 sites checked' in content
    # image field must NOT appear in per-site listing
    assert 'avatar.png' not in content
    # list field rendered with join
    assert 'alice, alicea' in content
    # external url formatted as markdown link
    assert '[https://example.com/profile](https://example.com/profile)' in content


def test_save_markdown_report_minimal_context():
    """No run_info, no first_seen: exercise the fallback branches."""
    filename = 'report_test_min.md'
    context = {
        'username': 'bob',
        'brief': 'nothing found',
        'results': [],
    }
    save_markdown_report(filename, context)
    with open(filename) as f:
        content = f.read()
    assert '# Report by searching on username "bob"' in content
    assert '## Summary' in content


def test_get_plaintext_report_minimal():
    """Minimal context without countries/interests."""
    context = {
        'brief': 'Nothing to report.',
        'interests_tuple_list': [],
        'countries_tuple_list': [],
    }
    out = get_plaintext_report(context)
    assert 'Nothing to report.' in out
    assert 'Countries:' not in out
    assert 'Interests' not in out

@@ -1,5 +1,6 @@
"""Maigret Database test functions"""

import json
from typing import Any, Dict

from maigret.sites import MaigretDatabase, MaigretSite
@@ -96,6 +97,163 @@ def test_site_strip_engine_data_with_site_prior_updates():
    assert amperka_stripped.json == UPDATED_EXAMPLE_DB['sites']['Amperka']


def _write_db(tmp_path, name, data):
    p = tmp_path / name
    p.write_text(json.dumps(data), encoding='utf-8')
    return str(p)


def test_extra_db_new_site(tmp_path):
    db = MaigretDatabase()
    db.load_from_json(EXAMPLE_DB)
    assert len(db.sites) == 1

    extra = {
        'engines': {},
        'sites': {
            'ExampleExtra': {
                'tags': ['us'],
                'checkType': 'status_code',
                'url': 'https://example.com/{username}',
                'urlMain': 'https://example.com/',
                'usernameClaimed': 'test',
                'usernameUnclaimed': 'noonewouldeverusethis7',
            }
        },
        'tags': ['us'],
    }
    db.load_extra_from_path(_write_db(tmp_path, 'extra.json', extra))

    assert len(db.sites) == 2
    assert set(db.sites_dict.keys()) == {'Amperka', 'ExampleExtra'}
    assert len(db._sites) == len(db.sites_dict)


def test_extra_db_site_override_last_wins(tmp_path):
    db = MaigretDatabase()
    db.load_from_json(EXAMPLE_DB)
    assert db.sites_dict['Amperka'].url_main == 'http://forum.amperka.ru'

    extra = {
        'engines': {},
        'sites': {
            'Amperka': {
                'engine': 'XenForo',
                'rank': 1,
                'tags': ['overridden'],
                'urlMain': 'https://overridden.example',
                'usernameClaimed': 'adam',
                'usernameUnclaimed': 'noonewouldeverusethis7',
            }
        },
        'tags': [],
    }
    db.load_extra_from_path(_write_db(tmp_path, 'extra.json', extra))

    assert len(db.sites) == 1
    amperka = db.sites_dict['Amperka']
    assert amperka.url_main == 'https://overridden.example'
    assert 'overridden' in amperka.tags


def test_extra_db_engine_override(tmp_path):
    main = {
        'engines': {
            'Proto': {
                'presenseStrs': ['orig'],
                'site': {
                    'absenceStrs': ['original absence'],
                    'checkType': 'message',
                    'url': '{urlMain}/orig/{username}',
                },
            }
        },
        'sites': {
            'MainSite': {
                'engine': 'Proto',
                'rank': 1,
                'tags': [],
                'urlMain': 'https://main.example',
                'usernameClaimed': 'a',
                'usernameUnclaimed': 'noonewouldeverusethis7',
            }
        },
        'tags': [],
    }
    db = MaigretDatabase()
    db.load_from_json(main)

    extra = {
        'engines': {
            'Proto': {
                'presenseStrs': ['overridden'],
                'site': {
                    'absenceStrs': ['overridden absence'],
                    'checkType': 'message',
                    'url': '{urlMain}/overridden/{username}',
                },
            }
        },
        'sites': {
            'ExtraSite': {
                'engine': 'Proto',
                'rank': 10,
                'tags': [],
                'urlMain': 'https://extra.example',
                'usernameClaimed': 'a',
                'usernameUnclaimed': 'noonewouldeverusethis7',
            }
        },
        'tags': [],
    }
    db.load_extra_from_path(_write_db(tmp_path, 'extra.json', extra))

    assert len(db._engines) == 1
    assert db.engines_dict['Proto'].presenseStrs == ['overridden']
    extra_site = db.sites_dict['ExtraSite']
    assert extra_site.absence_strs == ['overridden absence']
    main_site = db.sites_dict['MainSite']
    assert main_site.absence_strs == ['original absence']


def test_extra_db_tag_dedup(tmp_path):
    db = MaigretDatabase()
    db.load_from_json({'engines': {}, 'sites': {}, 'tags': ['forum', 'ru']})

    extra = {'engines': {}, 'sites': {}, 'tags': ['forum', 'us']}
    db.load_extra_from_path(_write_db(tmp_path, 'extra.json', extra))

    assert db._tags.count('forum') == 1
    assert sorted(db._tags) == ['forum', 'ru', 'us']


def test_extra_db_chain_last_wins(tmp_path):
    db = MaigretDatabase()
    db.load_from_json(EXAMPLE_DB)

    def site_with_url(url):
        return {
            'engines': {},
            'sites': {
                'Amperka': {
                    'engine': 'XenForo',
                    'rank': 1,
                    'tags': ['ru'],
                    'urlMain': url,
                    'usernameClaimed': 'adam',
                    'usernameUnclaimed': 'noonewouldeverusethis7',
                }
            },
            'tags': [],
        }

    db.load_extra_from_path(_write_db(tmp_path, 'a.json', site_with_url('https://a')))
    db.load_extra_from_path(_write_db(tmp_path, 'b.json', site_with_url('https://b')))

    assert len(db.sites) == 1
    assert db.sites_dict['Amperka'].url_main == 'https://b'


def test_saving_site_error():
    db = MaigretDatabase()


@@ -1,5 +0,0 @@
#!/bin/bash
set -e

sudo apt-get update && sudo apt-get install -y libcairo2-dev pkg-config
pip install .