Soxoj
b145e7b26f
feat(core): add POST request support, new sites, migrate to Majestic Million ranking ( #2317 )
...
* feat(core): add POST request support, new sites, migrate to Majestic Million ranking
- Added native POST request support to the Maigret engine (requestMethod, requestPayload) to enable querying modern JSON registration endpoints.
- Replaced the discontinued Alexa rank API with the Majestic Million dataset for global popularity sorting and automated CI updates.
- Fixed multiple false positives among top 500 sites and bypassed standard anti-bot protections using custom User-Agents.
- Updated public documentation and internal playbooks to reflect the new features.
* feat(data): apply all data.json site check updates from main branch
- Added CTFtime and PentesterLab (new sites added in main)
- Removed forums.imore.com (deleted in main as dead site)
- Disabled 5 sites per main branch fixes: Librusec, MirTesen, amateurvoyeurforum.com, forums.stevehoffman.tv, vegalab
- Fixed 5 site checks per main branch: SoundCloud, Taplink, Setlist, RoyalCams, club.cnews.ru (switched from status_code to message checkType with proper markers)
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/a1d194d9-c0ff-4e2b-974c-c5e4b59548bf
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-03-24 22:08:42 +01:00
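The POST support described in the entry above ( #2317 ) can be pictured with a short sketch. Only the field names `requestMethod` and `requestPayload` come from the commit message; the entry layout, the `{username}` substitution inside the payload, the endpoint URL, and the `build_request` helper are assumptions made for illustration, not Maigret's actual implementation.

```python
import json
from urllib.request import Request

# Hypothetical site entry; only requestMethod/requestPayload are named in the commit.
example_site = {
    "urlProbe": "https://example.com/api/username-check",  # placeholder endpoint
    "requestMethod": "POST",
    "requestPayload": {"username": "{username}"},
}


def build_request(site: dict, username: str) -> Request:
    """Build (but do not send) the HTTP request for one site entry."""
    url = site["urlProbe"].format(username=username)
    method = site.get("requestMethod", "GET")
    payload = site.get("requestPayload")
    data = None
    headers = {}
    if payload is not None:
        # Assumption: string values in the payload may carry a {username} placeholder.
        filled = {
            key: value.format(username=username) if isinstance(value, str) else value
            for key, value in payload.items()
        }
        data = json.dumps(filled).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)


req = build_request(example_site, "alice")
```

The request is only constructed here, so the sketch runs without network access; sending it and interpreting the response is left out.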
Copilot
3e56c95e16
Fix SoundCloud false-positive: switch to message-based check ( #2355 )
...
* Initial plan
* Fix SoundCloud false-positive: switch from status_code to message checkType
SoundCloud returns HTTP 200 for non-existent user profiles (soft 404),
causing status_code check to report CLAIMED for random usernames.
Switch to message checkType with:
- presenseStrs: hydratable user marker in server-rendered HTML
- absenceStrs: generic page title for non-existent users
Markers sourced from WhatsMyName project's verified SoundCloud entry.
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/8aa10eef-78bf-4251-bf42-473cd94c7ef4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 15:12:56 +01:00
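The switch from a `status_code` check to a `message` check in the SoundCloud entry above ( #2355 ) rests on a simple idea: ignore the HTTP status and look for marker strings in the body. The following is a simplified sketch of that idea, not Maigret's code; the function name and the marker strings are hypothetical placeholders rather than the real SoundCloud markers.

```python
from typing import Sequence


def classify_by_message(
    body: str, presence_strs: Sequence[str], absence_strs: Sequence[str]
) -> str:
    """Return "CLAIMED" or "AVAILABLE" from page content alone."""
    # Any absence marker means we got the generic "not found" page,
    # even if the server answered with HTTP 200 (a soft 404).
    if any(marker in body for marker in absence_strs):
        return "AVAILABLE"
    # When presence markers are configured, at least one must appear.
    if presence_strs and not any(marker in body for marker in presence_strs):
        return "AVAILABLE"
    return "CLAIMED"


# Hypothetical markers and page bodies, for illustration only.
PRESENCE = ["HYPOTHETICAL_USER_MARKER"]
ABSENCE = ["<title>Hypothetical generic title</title>"]
soft_404_page = "<html><title>Hypothetical generic title</title></html>"
profile_page = "<html><script>HYPOTHETICAL_USER_MARKER</script></html>"
```

Both example pages would arrive with status 200, which is why a status-only check cannot tell them apart.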
Copilot
28f35f9a4f
Fix club.cnews.ru false positive: switch from status_code to message checkType ( #2342 )
...
* Initial plan
* Fix club.cnews.ru false positive: switch from status_code to message checkType with absence strings
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/af131d2f-c7b5-4798-8ad1-86bab2673fe4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 10:52:23 +01:00
Julio César Suástegui
79cea49526
feat: add CTFtime and PentesterLab site support ( #2318 )
...
Add two cybersecurity platforms for username enumeration:
- CTFtime (ctftime.org) - CTF competition platform
- PentesterLab (pentesterlab.com) - Security training platform
Both verified working with status_code check type.
Returns 200 for existing users, 404 for non-existent.
Co-authored-by: Julio César Suástegui <juliosuas@users.noreply.github.com >
2026-03-24 10:52:07 +01:00
Copilot
eb541dcf51
Disable MirTesen site check (false positive) ( #2350 )
...
* Initial plan
* Disable MirTesen site check to fix false-positive probe
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/61c86064-423d-4f1b-8277-2838f747dd89
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 09:51:31 +01:00
Copilot
4c97025a32
Disable Librusec site check (false positive) ( #2349 )
...
* Initial plan
* Disable Librusec site check to fix false-positive probe
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-03-24 09:51:16 +01:00
Copilot
d3f13ac295
Fix false-positive site probe: Re-enable Taplink with message checkType ( #2326 )
...
* Initial plan
* Disable Taplink site check to fix false-positive detections
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/ef9281f4-ba67-4760-a6e2-57564ac4ea94
* Re-enable Taplink with message checkType and absenceStrs
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/db3e572e-b79b-4cec-ac7f-062e76144660
* Improve Taplink absenceStrs: add Russian variant and presenseStrs
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/28e24317-e8b9-45f6-bad5-0e549b891313
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 21:36:36 +01:00
Copilot
00a9249229
[WIP] Fix invalid link on forums.imore.com ( #2337 )
...
* Initial plan
* Remove dead forums.imore.com site from database
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/c83530d0-d24f-45fc-aca3-ae1e46ece33c
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:20:28 +01:00
Copilot
005863c2e0
Fix Setlist site check: switch to message checkType with proper markers ( #2333 )
...
* Initial plan
* Disable Setlist site check due to false positives (soft 404)
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/8c552ca6-51e5-4e79-a791-ddd6f27d2461
* Fix Setlist check: switch to message checkType with proper markers
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/3c387df6-1dfe-451f-96d8-b4b6455f7857
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:18:33 +01:00
Copilot
e3aada6aef
Fix RoyalCams site check using BongaCams white-label pattern ( #2334 )
...
* Initial plan
* Disable RoyalCams site check to fix false-positive probe
The Telegram Maigret bot auto-probe reported CLAIMED for three random
usernames. The status_code checkType is unreliable as the site returns
200 for non-existent user profiles (soft 404). Disabling the site check
until a reliable detection method can be established.
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/05b3d513-fe15-477d-a455-0c9ddf0b8b51
* Fix RoyalCams: switch to message checkType using BongaCams white-label pattern
RoyalCams runs on the BongaCams platform. Applied the same fix pattern:
- Switch from status_code to message checkType
- Use Portuguese locale (pt.royalcams.com) as urlProbe
- absenceStrs matches generic title on non-existent profiles
- presenseStrs matches Portuguese profile title for existing users
- Add browser-like headers matching BongaCams config
- Remove disabled flag
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/2f6a9523-278a-4992-ba7c-c320de14bfa4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:16:45 +01:00
Copilot
9b35fc1ab0
[WIP] Fix false-positive probe for vegalab site ( #2336 )
...
* Initial plan
* Disable vegalab site check: domain is dead (DNS does not resolve), causing false positives
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/98430e81-5dcb-4cb3-9aaa-f8c5ce86d026
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:09:46 +01:00
Copilot
146bc0481b
Disable forums.stevehoffman.tv due to false positives ( #2331 )
...
* Initial plan
* Disable forums.stevehoffman.tv to fix false-positive site probe
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/39fea4a9-ec6d-4a12-b34b-1a3486d647e4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:08:15 +01:00
Copilot
5930a3022e
Disable false-positive site probe: amateurvoyeurforum.com ( #2332 )
...
* Initial plan
* Disable amateurvoyeurforum.com site check to fix false positives
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/e7fcad2b-4511-4e6d-b186-411951170e0a
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:07:42 +01:00
Copilot
b1a211c3cd
Disable forums.developer.nvidia.com (auth-gated user profiles) ( #2305 )
...
* Initial plan
* disable forums.developer.nvidia.com due to auth-locked user pages
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/b8f41f15-8588-4aac-a443-af5e2aaa1918
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-22 22:43:51 +01:00
Copilot
56d0c9f2f1
Remove dead site xxxforum.org ( #2310 )
...
* Initial plan
* Remove broken site xxxforum.org from data.json and sites.md
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/bfbd3aa8-bfb1-480a-b2e7-a2c40fc69def
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-22 22:43:21 +01:00
Copilot
01049b730d
Fix Love.Mail.ru: update to numeric-only identifiers and new profile URL ( #2307 )
...
* Initial plan
* fix: update Love.Mail.ru to use numeric-only identifiers (#1264 )
- Add regexCheck to enforce numeric-only IDs (^\d+$)
- Update usernameClaimed/usernameUnclaimed to numeric values
- Site remains disabled pending live verification
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/6de16097-6bc1-424a-beb1-1d2ec6b99944
* fix: update Love.Mail.ru URL to /profile/ path, enable check with verified ID
Use maintainer-provided working link https://love.mail.ru/profile/1838153357 .
- Change URL pattern from /ru/{username} to /profile/{username}
- Set usernameClaimed to 1838153357
- Remove disabled flag to enable the check
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/ac07d38e-46e2-42d3-9e93-eda3e5cfbcc3
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-22 22:42:59 +01:00
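The Love.Mail.ru change above ( #2307 ) combines a numeric-only `regexCheck` with the new `/profile/{username}` URL pattern. A minimal sketch of how such a pre-check might gate URL construction follows; the pattern and URL template are taken from the commit message, while the helper name and the decision to return `None` for rejected identifiers are assumptions.

```python
import re
from typing import Optional

URL_TEMPLATE = "https://love.mail.ru/profile/{username}"  # pattern from the commit
NUMERIC_ID = re.compile(r"^\d+$")  # regexCheck from the commit


def build_profile_url(identifier: str) -> Optional[str]:
    """Return the profile URL, or None if the identifier fails the regex check."""
    if NUMERIC_ID.search(identifier) is None:
        # Non-numeric identifiers are skipped instead of being requested.
        return None
    return URL_TEMPLATE.format(username=identifier)
```

Rejecting identifiers before any request is made avoids probing URLs that cannot correspond to a profile.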
Copilot
4f397fed1c
Re-enable taplink.cc with browser User-Agent to bypass Cloudflare ( #2308 )
...
* Initial plan
* fix(taplink): re-enable taplink.cc with browser User-Agent header to bypass Cloudflare
Remove disabled flag and add a Chrome User-Agent header to help
bypass Cloudflare bot detection for taplink.cc profile checks.
If Cloudflare still blocks requests, maigret's built-in error
detection will gracefully mark results as UNKNOWN.
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/271904b6-e358-4aeb-b503-21c9b91186d9
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-22 22:10:44 +01:00
Soxoj
959b2be136
feat(sites): fix false positives: disable 74 broken sites, fix 8 with API probes and better markers ( #2302 )
...
- Disable 74 sites: Cloudflare/captcha blocks, identical responses,
dead domains, vBulletin/phpBB engine failures
- Fix Roblox, Salon24.pl, Planetaexcel → status_code (clear 404 signal)
- Fix en.brickimedia.org → message with "noarticletext" absenceStr
- Fix Arduino → narrower title-based presenseStrs/absenceStrs
- Re-enable Fandom (3 wikis) via MediaWiki api.php urlProbe
- Re-enable Substack via /api/v1/user/{}/public_profile urlProbe
- Re-enable hashnode via GraphQL GET urlProbe (URL-encoded query)
- Document lessons: engine template drift, search-by-author fragility,
always-200 sites, TLS degradation, API bypassing Cloudflare,
GraphQL GET support, URL-encoding for template safety
2026-03-22 20:47:51 +01:00
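The "URL-encoding for template safety" lesson in the entry above ( #2302 ) can be illustrated as follows. A GraphQL query contains literal braces, which would collide with a `{username}` placeholder if the probe URL is later filled in with `str.format`. This sketch percent-encodes the query text around the placeholder; the endpoint, the query, and the helper name are hypothetical and do not reflect the real hashnode configuration.

```python
from urllib.parse import quote

PLACEHOLDER = "{username}"


def build_probe_template(endpoint: str, query_before: str, query_after: str) -> str:
    """Build a GET probe URL whose only braces are the username placeholder."""
    # safe="" forces every reserved character, including braces, to be encoded.
    encoded_before = quote(query_before, safe="")
    encoded_after = quote(query_after, safe="")
    return f"{endpoint}?query={encoded_before}{PLACEHOLDER}{encoded_after}"


# Hypothetical endpoint and query, split around the username position.
template = build_probe_template(
    "https://example.com/graphql",
    'query{user(username:"',
    '"){name}}',
)
probe_url = template.format(username="alice")
```

Because the literal braces become `%7B` and `%7D`, the later `format` call sees exactly one replacement field.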
Soxoj
97cc4b46d9
Improve site-check quality: fix broken site configs, add diagnostic utilities, and make self-check report-only by default with opt-in auto-disable. ( #2301 )
...
- Fix VK and TradingView checkType; add Reddit and Microsoft Learn API-style probes where appropriate; adjust or disable entries that are unreliable under anti-bot protection.
- Self-check: stop aggressive auto-disable; default to reporting issues only; add --auto-disable and --diagnose for optional fixes and deeper output.
- Tooling: add utils/site_check.py and utils/check_top_n.py (and related helpers) to inspect and rank site behavior against the top-N list
- Scope: aligns with fixing top-traffic / high-impact sites and making diagnostics repeatable without silently flipping disabled flags
2026-03-22 16:48:35 +01:00
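The report-only default with opt-in auto-disable described in the entry above ( #2301 ) might look roughly like this. The flag names `--auto-disable` and `--diagnose` and the `disabled` flag appear in the commit messages; the parser, the `handle_failures` helper, and the data layout are assumptions and not Maigret's real CLI code.

```python
import argparse
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Self-check sketch (illustrative only)")
    parser.add_argument("--auto-disable", action="store_true",
                        help="opt in to disabling sites that fail the self-check")
    parser.add_argument("--diagnose", action="store_true",
                        help="print more detailed output for failing sites")
    return parser


def handle_failures(failing: List[str], sites: Dict[str, dict], auto_disable: bool) -> List[str]:
    """Always report failures; set the disabled flag only when opted in."""
    report = []
    for name in failing:
        report.append(f"{name}: check failed")
        if auto_disable:
            sites[name]["disabled"] = True
    return report
```

With no flags given, the site database is left untouched and the caller only receives the report lines.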
Soxoj
227a25bfa1
Twitter fixed, mirrors mechanism improvement ( #2299 )
2026-03-22 01:14:17 +01:00
Soxoj
f99091f5f7
Fixed false positives in top-500 ( #2292 )
2026-03-21 23:35:59 +01:00
Soxoj
fb26ccd1f6
Disabled some sites giving false positive results ( #2170 )
2025-08-22 03:10:47 +02:00
Soxoj
bebadb0362
Bump to 0.5.0 ( #2108 )
2025-08-10 13:10:50 +02:00
MR-VL
d90d8a8ac9
Disable AskFM ( #2037 )
2025-07-13 16:16:49 +02:00
Darlyson Rangel
c9e38632ca
Disable ICQ site ( #1993 )
2025-06-28 23:46:09 +02:00
Pierre-Yves Lapersonne
f76ea5d738
[ #2010 ] Add 6 more websites to manage ( #2009 )
...
* feat: add `framapiaf.org` in supported web sites, add tag `mastodon` (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* feat: add `write.as` in supported web sites, add tag `writefreely` (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* feat: add `programming.dev` in supported web sites, add tag `lemmy` (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* feat: add `mamot.fr` in supported web sites (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* feat: add `pixelfed.social` in supported web sites, add tag `pixelfed` (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* feat: add `Outgress` in supported web sites (#2010 )
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
* Updated the list of supported sites
---------
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info >
Co-authored-by: Soxoj <soxoj@protonmail.com >
2025-06-28 23:33:29 +02:00
Soxoj
97e5f600d0
Async generator-executor for site checks ( #1978 )
2024-12-17 22:48:11 +01:00
Soxoj
f212bc9bc8
Site check fixes
2024-12-12 21:39:35 +01:00
Soxoj
4dd82bf4c9
Fixed Gravatar parsing (socid_extractor)
2024-12-12 02:30:29 +01:00
Soxoj
64ae391a4a
Updated Vimeo, CNET, DailyMotion
2024-12-11 01:17:20 +01:00
Soxoj
127d9032c3
Fixed Vimeo, activation/probing mechanisms improvements
2024-12-11 00:56:00 +01:00
Soxoj
81a817a39f
Improved "submit new site" mode, added tests, fixed top-500 sites ( #1952 )
2024-12-10 18:02:43 +01:00
Soxoj
51ab988e36
Fixed ProductHunt check ( #1951 )
2024-12-09 17:06:03 +01:00
Soxoj
5517636850
Updated OP.GG checks ( #1950 )
...
* Updated OP.GG checks
* Finalized LoL, added Valorant, disabled Archive.org
2024-12-09 15:59:19 +01:00
Soxoj
c66d776f8a
Refactoring, test coverage increased to 60% ( #1943 )
2024-12-08 02:13:28 +01:00
Soxoj
4b1317789d
Refactored self-check method, code formatting, small lint fixes ( #1942 )
2024-12-07 18:05:30 +01:00
Soxoj
8b7d8073d9
Fixed Linktr and discourse.mozilla.org ( #1941 )
2024-12-07 17:11:39 +01:00
Soxoj
2aa1ea39a0
Site fixes ( #1940 )
2024-12-06 14:27:38 +01:00
Soxoj
cd789ed138
Fixed Ebay and BongaCams checks ( #1939 )
2024-12-06 13:32:51 +01:00
Soxoj
5641456ba0
Weibo site check fix, activation mechanism added ( #1938 )
2024-12-06 11:31:20 +01:00
Soxoj
f04de78682
Activation mechanism documentation added ( #1935 )
...
Fixed a few site checks
2024-12-06 01:35:19 +01:00
Soxoj
cb9f01c106
Fixed Figma check ( #1932 )
...
Fixed cookies bug
Improved self-check mode: don't disable sites because of check errors
2024-12-04 19:21:27 +01:00
Soxoj
1cb25946dd
Disabled Figma check ( #1928 )
2024-12-04 00:27:55 +01:00
Soxoj
d15e12750b
Sites fixes ( #1917 )
...
* Some sites fixes
* Sites stats updated
2024-12-01 03:19:36 +01:00
Soxoj
b370bc4c44
Sites checks fixes ( #1896 )
...
Fixed incorrect site names, added method to compare sites
2024-11-26 13:29:43 +01:00
Soxoj
d8a05807ba
New sites added ( #1888 )
2024-11-25 18:24:20 +01:00
Soxoj
86d51bced0
Added 7 sites, implemented integration with Marple, docs update ( #1881 )
...
* Added 5 sites, implemented integration with Marple
* Added 2 more sites, updated docs
* Updated sites list
2024-11-25 14:41:34 +01:00
Soxoj
54b864f167
Disabled unavailable sites ( #1880 )
2024-11-24 17:19:31 +01:00
Soxoj
24e545b62c
Added dev documentation, fixed some sites, removed GitHub issue links from reports ( #1869 )
2024-11-23 18:45:56 +01:00
synth
05db32f28f
Fixed 1 site, PyInstaller workflow, Google Colab example ( #1558 )
...
* Updated example colab file (Due to latest update)
* Fix RobertsSpaceIndustries URI
* Fix PyInstaller workflow
* Fix example.ipynb (read desc.)
Currently the version installed via pip3 doesn't appear to contain the latest data.json file, resulting in many false positives.
* Fix non-existent users (read desc.)
Fixed non-existent usernames for the following:
Telegram (t.me)
TikBuddy (tikbuddy.com)
FurAffinity (furaffinity.net)
2024-10-21 15:58:16 +02:00