Soxoj
b79f8aca28
Add site checks: 18 new sites ( #2575 )
2026-04-29 16:55:47 +02:00
Soxoj
1352bd35c6
Fix site checks: 5 fixed, 4 disabled; fix UA leak bug ( #2569 )
2026-04-26 14:51:44 +02:00
Soxoj
3960510b63
Fix site checks: 7 fixed, 1 disabled ( #2565 )
False-positive site probe issues #2531 , #2542 , #2556 , #2559 , #2560 , #2561 , #2563 , #2496 .
2026-04-26 12:34:52 +02:00
Soxoj
e962b8c693
Fix site checks: 5 fixed; readme fix ( #2562 )
* Fix site checks: 5 fixed; readme fix
* Logging improvements
* Improve YouTube data extraction
2026-04-25 18:15:38 +02:00
Soxoj
25026e21ea
Fix site checks: 4 → ip_reputation, 9 fixed, 16 disabled, 3 dead dele… ( #2555 )
* Fix site checks: 4 → ip_reputation, 9 fixed, 16 disabled, 3 dead deleted; clarify ip_reputation tag semantics
* Improved test coverage
2026-04-23 21:17:07 +02:00
Soxoj
5e1cc45c17
Fix site checks: 12 fixed, 19 disabled; add new protection tags ( #2550 )
2026-04-22 20:25:41 +02:00
Soxoj
d9b361b626
Fix site checks: 3 → ip_reputation, 10 fixed, 6 disabled, 2 dead deleted ( #2549 )
2026-04-22 12:46:53 +02:00
Soxoj
0131f0b64c
Add OnlyFans with activation mechanism; updated site ranks ( #2546 )
2026-04-21 19:03:45 +02:00
Soxoj
98f03c153b
Add 3 crypto sites (Polymarket, Zora, Revolut.me), added crypto inves… ( #2538 )
* Add 3 crypto sites (Polymarket, Zora, Revolut.me), added crypto investigation use case page in docs
* Added fintech tag
* Updated sites metadata
2026-04-21 11:08:48 +02:00
Soxoj
1f823e8322
Fix site checks: 3 fixed, 2 → ip_reputation, 7 disabled, 1 dead deleted ( #2539 )
2026-04-21 10:58:45 +02:00
Soxoj
d6905a8fd8
Fix site checks: 4 fixed, 14 → ip_reputation, 8 disabled, 5 dead deleted ( #2537 )
2026-04-21 00:40:24 +02:00
Soxoj
7d216638fa
fix site checks: 14 sites → ip_reputation, 7 disabled, 5 dead deleted ( #2536 )
2026-04-20 23:51:18 +02:00
Soxoj
fb71f26fd0
Fix site checks: recover 6 CF sites via tls_fingerprint, 500px GraphQL, delete 4 dead domains ( #2535 )
2026-04-20 22:41:51 +02:00
Soxoj
f74f82ee13
Fixed: Hack MD, DailyKos, Mywed, WikimapiaSearch, TikTok Online Viewer ( #2526 , #2522 , #2523 , #2500 , #2496 ); Disabled: Radiokot, Lurkmore, Mylespaul, AppleDiscussions, Loveplanet ( #2524 , #2511 , #2498 ) ( #2528 )
2026-04-17 17:04:50 +02:00
Soxoj
dc8751ac55
Added 3 sites, fixed 6, disabled 8 ( #2505 )
2026-04-10 12:26:41 +02:00
Copilot
9303b1686d
Disable Kinja.com site check ( #2503 )
2026-04-10 12:16:28 +02:00
Soxoj
5e24117e93
Fix false positives ( #2499 )
* Re-disable 29 false positives from #2478
2026-04-09 17:48:45 +02:00
Soxoj
777e503e30
Re-enable 69 stale-disabled sites validated via self-check ( #2478 )
Total: 2539 → 2608 enabled sites (+69).
2026-04-09 12:27:48 +02:00
Soxoj
b213f6e079
vBulletin cleanup, Flarum sites, engine stats, UA bump ( #2476 )
2026-04-09 01:17:24 +02:00
Copilot
fbb8255518
Update HackTheBox and Wikipedia to use new API endpoints ( #2470 )
* Initial plan
* Update HackTheBox and Wikipedia to use new API endpoints for username checking
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/6dc9147c-787f-4f4f-8903-7b9873ac6ac9
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-04-08 10:23:07 +02:00
Soxoj
6db1df2ddb
Fix failing test for custom DB path resolution ( #2468 )
* Fix `--db` bug
* Fix test_resolve_db_path_custom_file to create the file before testing
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/3ea7b2e8-0565-4fca-8ec2-eff8eb4ee617
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-04-08 00:53:57 +02:00
Soxoj
6834483360
Fix Spotify, add Spotify Community forum ( #2467 )
2026-04-07 18:25:13 +02:00
Soxoj
3fd34afb77
Sites fixes ( #2464 )
2026-04-06 21:41:16 +02:00
Soxoj
ad95302745
Add Markdown reports for LLM analysis ( #2463 )
2026-04-06 18:26:43 +02:00
Soxoj
6d0a22b738
False positive fixes ( #2460 )
* Fix false positives: APClips, Taplink, gentoo, Discord.bio, ChaturBate; disable 7Cups, playtime, openriskmanual, reactos; update tags
* Fix db_meta.json regeneration in update_site_data.py (inline instead of module import)
* Fix false positives: disable Bit.ly, Firearmstalk, Needrom, Travelblog; fix gentoo, Discord.bio, brickimedia via API; remove dead sites dreamhost, typepad
2026-04-04 19:08:51 +02:00
Soxoj
abce3c9be4
Fix false positives ( #2459 )
* Fix false positives: APClips, Taplink, gentoo, Discord.bio, ChaturBate; disable 7Cups, playtime, openriskmanual, reactos; update tags
* Fix db_meta.json regeneration in update_site_data.py (inline instead of module import)
2026-04-04 18:22:21 +02:00
Soxoj
269d50eedc
DB update mechanism ( #2458 )
* Database update mechanism
2026-04-04 18:00:50 +02:00
Soxoj
e8f4318e5d
Added Crypto/Web3 site checks ( #2457 )
2026-04-04 16:49:12 +02:00
Julio César Suástegui
eeb38ccdc0
fix(data): update InterPals absence string to match current site response ( #2442 )
The previous absence string 'The requested user does not exist or is inactive'
no longer matches the live site response. InterPals now returns 'User not found'
for non-existent profiles, causing false positives for all username searches.
Tested against interpals.net/noneownsthisusername (non-existent) and
interpals.net/blue (claimed) to confirm detection accuracy.
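The failure mode described above can be sketched in a few lines. This is only an illustration: the two absence strings are taken from this commit message, while the helper `is_claimed` and the sample page body are hypothetical and do not reproduce Maigret's own code.

```python
OLD_ABSENCE = "The requested user does not exist or is inactive"
NEW_ABSENCE = "User not found"


def is_claimed(body: str, absence_str: str) -> bool:
    """Treat the profile as claimed unless the absence string is in the page."""
    return absence_str not in body


# Hypothetical body of a page for a non-existent profile.
missing_profile_body = "<h1>User not found</h1>"

print(is_claimed(missing_profile_body, OLD_ABSENCE))  # True  (false positive)
print(is_claimed(missing_profile_body, NEW_ABSENCE))  # False
```

With the stale marker, every non-existent profile looks claimed, which matches the "false positives for all username searches" symptom reported here.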
Closes #2433
Co-authored-by: Julio César Suástegui <juliosuas@users.noreply.github.com >
2026-04-03 13:43:33 +02:00
Soxoj
5d502eaef6
Add site protection tracking system, fix broken site checks (Instagra… ( #2452 )
* Add site protection tracking system, fix broken site checks (Instagram, StackOverflow, LeetCode, Boosty, LiveLib), preserve unicode in data.json
* Update poetry.lock by running poetry lock
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/14333f41-67d5-4e28-a782-9730b31fc667
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-04-02 20:28:20 +02:00
Soxoj
5aa7f6429b
Overhaul site tags and naming: add social tag to 33 networks, fill mi… ( #2430 )
* Overhaul site tags and naming: add social tag to 33 networks, fill missing tags for 213 top-1000 sites, clean up false us/in country tags (~374 sites), normalize site names to Title Case, add tag validation tests, document tagging and naming rules
Remove LLM folder: ask @soxoj for the up-to-date version!
* Remove LLM/ from version control
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com >
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com >
2026-03-28 19:48:16 +01:00
Soxoj
a5d337b765
Tags and site names improvements ( #2427 )
- Added social tag to social networks (33 sites)
- Fixed wrong tags (8 sites)
- Filled empty tags for 213 sites in top-1000
- Country tag cleanup (~374 sites)
- Site naming normalization (75 sites)
- New tests (3)
- Documentation updates
2026-03-28 15:42:12 +01:00
Soxoj
51b452ad71
Add urlProbes ( #2425 )
2026-03-28 00:08:02 +01:00
Soxoj
fa1a4d1b4a
Sites re-check ( #2423 )
2026-03-27 22:41:55 +01:00
Soxoj
656fe1df24
Added Max.ru check; --no-progressbar flag fixed ( #2386 )
2026-03-25 11:48:12 +01:00
Soxoj
bc3d9faad9
Fix false-positive site checks reported by Maigret Bot ( #2376 )
2026-03-24 23:01:11 +01:00
Soxoj
b145e7b26f
feat(core): add POST request support, new sites, migrate to Majestic Million ranking ( #2317 )
* feat(core): add POST request support, new sites, migrate to Majestic Million ranking
- Added native POST request support to the Maigret engine (requestMethod, requestPayload) to enable querying modern JSON registration endpoints.
- Replaced the discontinued Alexa rank API with the Majestic Million dataset for global popularity sorting and automated CI updates.
- Fixed multiple false positives among top 500 sites and bypassed standard anti-bot protections using custom User-Agents.
- Updated public documentation and internal playbooks to reflect the new features.
* feat(data): apply all data.json site check updates from main branch
- Added CTFtime and PentesterLab (new sites added in main)
- Removed forums.imore.com (deleted in main as dead site)
- Disabled 5 sites per main branch fixes: Librusec, MirTesen, amateurvoyeurforum.com, forums.stevehoffman.tv, vegalab
- Fixed 5 site checks per main branch: SoundCloud, Taplink, Setlist, RoyalCams, club.cnews.ru (switched from status_code to message checkType with proper markers)
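As a rough sketch of how a site entry with the new POST fields might be turned into a request: only the key names `requestMethod` and `requestPayload` come from the commit message above. The URL, the payload shape, the `{username}` substitution convention and the helper `build_request` are assumptions made for illustration, and the request is built but never sent.

```python
import json
import urllib.request

# Hypothetical site entry: only the keys requestMethod and requestPayload
# are named in the commit message; the URL and payload shape are made up.
site = {
    "url": "https://example.com/api/check_username",
    "requestMethod": "POST",
    "requestPayload": {"username": "{username}"},
}


def build_request(site: dict, username: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for a site entry."""
    method = site.get("requestMethod", "GET")
    payload = site.get("requestPayload")
    data = None
    headers = {}
    if payload is not None:
        # Assumed convention: substitute the username into the JSON payload.
        body = json.dumps(payload).replace("{username}", username)
        data = body.encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(site["url"], data=data, headers=headers, method=method)


req = build_request(site, "alice")
print(req.get_method())  # POST
print(req.data)          # b'{"username": "alice"}'
```

Entries without the two keys fall back to a plain GET in this sketch, so existing site checks would be unaffected.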
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/a1d194d9-c0ff-4e2b-974c-c5e4b59548bf
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-03-24 22:08:42 +01:00
Copilot
3e56c95e16
Fix SoundCloud false-positive: switch to message-based check ( #2355 )
* Initial plan
* Fix SoundCloud false-positive: switch from status_code to message checkType
SoundCloud returns HTTP 200 for non-existent user profiles (soft 404),
causing status_code check to report CLAIMED for random usernames.
Switch to message checkType with:
- presenseStrs: hydratable user marker in server-rendered HTML
- absenceStrs: generic page title for non-existent users
Markers sourced from WhatsMyName project's verified SoundCloud entry.
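The difference between the two check types can be sketched as follows. This is a minimal illustration, not Maigret's actual implementation: the function names, the marker strings and the sample HTML are all hypothetical, and only the field names `presenseStrs` and `absenceStrs` come from the commit message.

```python
def check_by_status_code(status: int) -> str:
    """Naive check: any 2xx response counts as an existing profile."""
    return "CLAIMED" if 200 <= status < 300 else "AVAILABLE"


def check_by_message(body: str, presense_strs: list[str], absence_strs: list[str]) -> str:
    """Marker-based check: inspect the page body instead of the status code."""
    if any(marker in body for marker in absence_strs):
        return "AVAILABLE"
    if any(marker in body for marker in presense_strs):
        return "CLAIMED"
    return "UNKNOWN"


# Hypothetical soft-404 response: HTTP 200, but the body is a generic page.
soft_404_status = 200
soft_404_body = "<title>Generic page title</title>"

# Hypothetical markers, standing in for the real ones from the site entry.
presense = ['"hydratable":"user"']
absence = ["<title>Generic page title</title>"]

print(check_by_status_code(soft_404_status))               # CLAIMED (false positive)
print(check_by_message(soft_404_body, presense, absence))  # AVAILABLE
```

Checking absence markers first is a design choice of this sketch: a soft-404 page is rejected even if it happens to contain a presence marker.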
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/8aa10eef-78bf-4251-bf42-473cd94c7ef4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 15:12:56 +01:00
Copilot
28f35f9a4f
Fix club.cnews.ru false positive: switch from status_code to message checkType ( #2342 )
* Initial plan
* Fix club.cnews.ru false positive: switch from status_code to message checkType with absence strings
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/af131d2f-c7b5-4798-8ad1-86bab2673fe4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 10:52:23 +01:00
Julio César Suástegui
79cea49526
feat: add CTFtime and PentesterLab site support ( #2318 )
Add two cybersecurity platforms for username enumeration:
- CTFtime (ctftime.org) - CTF competition platform
- PentesterLab (pentesterlab.com) - Security training platform
Both were verified working with the status_code check type:
each returns 200 for existing users and 404 for non-existent ones.
Co-authored-by: Julio César Suástegui <juliosuas@users.noreply.github.com >
2026-03-24 10:52:07 +01:00
Copilot
eb541dcf51
Disable MirTesen site check (false positive) ( #2350 )
* Initial plan
* Disable MirTesen site check to fix false-positive probe
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/61c86064-423d-4f1b-8277-2838f747dd89
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-24 09:51:31 +01:00
Copilot
4c97025a32
Disable Librusec site check (false positive) ( #2349 )
* Initial plan
* Disable Librusec site check to fix false-positive probe
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
2026-03-24 09:51:16 +01:00
Copilot
d3f13ac295
Fix false-positive site probe: Re-enable Taplink with message checkType ( #2326 )
* Initial plan
* Disable Taplink site check to fix false-positive detections
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/ef9281f4-ba67-4760-a6e2-57564ac4ea94
* Re-enable Taplink with message checkType and absenceStrs
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/db3e572e-b79b-4cec-ac7f-062e76144660
* Improve Taplink absenceStrs: add Russian variant and presenseStrs
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/28e24317-e8b9-45f6-bad5-0e549b891313
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 21:36:36 +01:00
Copilot
00a9249229
[WIP] Fix invalid link on forums.imore.com ( #2337 )
* Initial plan
* Remove dead forums.imore.com site from database
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/c83530d0-d24f-45fc-aca3-ae1e46ece33c
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:20:28 +01:00
Copilot
005863c2e0
Fix Setlist site check: switch to message checkType with proper markers ( #2333 )
* Initial plan
* Disable Setlist site check due to false positives (soft 404)
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/8c552ca6-51e5-4e79-a791-ddd6f27d2461
* Fix Setlist check: switch to message checkType with proper markers
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/3c387df6-1dfe-451f-96d8-b4b6455f7857
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:18:33 +01:00
Copilot
e3aada6aef
Fix RoyalCams site check using BongaCams white-label pattern ( #2334 )
* Initial plan
* Disable RoyalCams site check to fix false-positive probe
The Telegram Maigret bot auto-probe reported CLAIMED for three random
usernames. The status_code checkType is unreliable as the site returns
200 for non-existent user profiles (soft 404). Disabling the site check
until a reliable detection method can be established.
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/05b3d513-fe15-477d-a455-0c9ddf0b8b51
* Fix RoyalCams: switch to message checkType using BongaCams white-label pattern
RoyalCams runs on the BongaCams platform. Applied the same fix pattern:
- Switch from status_code to message checkType
- Use Portuguese locale (pt.royalcams.com) as urlProbe
- absenceStrs matches generic title on non-existent profiles
- presenseStrs matches Portuguese profile title for existing users
- Add browser-like headers matching BongaCams config
- Remove disabled flag
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/2f6a9523-278a-4992-ba7c-c320de14bfa4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:16:45 +01:00
Copilot
9b35fc1ab0
[WIP] Fix false-positive probe for vegalab site ( #2336 )
* Initial plan
* Disable vegalab site check: domain is dead (DNS does not resolve), causing false positives
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/98430e81-5dcb-4cb3-9aaa-f8c5ce86d026
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:09:46 +01:00
Copilot
146bc0481b
Disable forums.stevehoffman.tv due to false positives ( #2331 )
* Initial plan
* Disable forums.stevehoffman.tv to fix false-positive site probe
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/39fea4a9-ec6d-4a12-b34b-1a3486d647e4
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:08:15 +01:00
Copilot
5930a3022e
Disable false-positive site probe: amateurvoyeurforum.com ( #2332 )
* Initial plan
* Disable amateurvoyeurforum.com site check to fix false positives
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/e7fcad2b-4511-4e6d-b186-411951170e0a
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-23 20:07:42 +01:00
Copilot
b1a211c3cd
Disable forums.developer.nvidia.com (auth-gated user profiles) ( #2305 )
* Initial plan
* disable forums.developer.nvidia.com due to auth-locked user pages
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/b8f41f15-8588-4aac-a443-af5e2aaa1918
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com >
2026-03-22 22:43:51 +01:00