Commit Graph

23 Commits

Author SHA1 Message Date
Sayon Dey f897598f98 Fix outdated Google Colab setup instructions (#2591) 2026-05-02 15:21:16 +02:00
Soxoj 25026e21ea Fix site checks: 4 → ip_reputation, 9 fixed, 16 disabled, 3 dead dele… (#2555)
* Fix site checks: 4 → ip_reputation, 9 fixed, 16 disabled, 3 dead deleted; clarify ip_reputation tag semantics

* Improved test coverage
2026-04-23 21:17:07 +02:00
Soxoj 5e1cc45c17 Fix site checks: 12 fixed, 19 disabled; add new protection tags (#2550) 2026-04-22 20:25:41 +02:00
Soxoj d9b361b626 Fix site checks: 3 → ip_reputation, 10 fixed, 6 disabled, 2 dead deleted (#2549) 2026-04-22 12:46:53 +02:00
Soxoj fb71f26fd0 Fix site checks: recover 6 CF sites via tls_fingerprint, 500px GraphQL, delete 4 dead domains (#2535) 2026-04-20 22:41:51 +02:00
Soxoj 37ce4fe728 Update of Readme and documentation (#2514)
* Big readme update

* Readme and documentation update

* Readme structure update

* Small fixes

* Changelog update
2026-04-17 17:42:36 +02:00
Soxoj 5d502eaef6 Add site protection tracking system, fix broken site checks (Instagra… (#2452)
* Add site protection tracking system, fix broken site checks (Instagram, StackOverflow, LeetCode, Boosty, LiveLib), preserve unicode in data.json

* Update poetry.lock by running poetry lock

Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-04-02 20:28:20 +02:00
Soxoj a5d337b765 Tags and site names improvements (#2427)
- Added social tag to social networks (33 sites)
- Fixed wrong tags (8 sites)
- Filled empty tags for 213 sites in top-1000
- Country tag cleanup (~374 sites)
- Site naming normalization (75 sites)
- New tests (3)
- Documentation updates
2026-03-28 15:42:12 +01:00
Soxoj b145e7b26f feat(core): add POST request support, new sites, migrate to Majestic Million ranking (#2317)
* feat(core): add POST request support, new sites, migrate to Majestic Million ranking
- Added native POST request support to the Maigret engine (requestMethod, requestPayload) to enable querying modern JSON registration endpoints.
- Replaced the discontinued Alexa rank API with the Majestic Million dataset for global popularity sorting and automated CI updates.
- Fixed multiple false positives among top 500 sites and bypassed standard anti-bot protections using custom User-Agents.
- Updated public documentation and internal playbooks to reflect the new features.

* feat(data): apply all data.json site check updates from main branch

- Added CTFtime and PentesterLab (new sites added in main)
- Removed forums.imore.com (deleted in main as dead site)
- Disabled 5 sites per main branch fixes: Librusec, MirTesen, amateurvoyeurforum.com, forums.stevehoffman.tv, vegalab
- Fixed 5 site checks per main branch: SoundCloud, Taplink, Setlist, RoyalCams, club.cnews.ru (switched from status_code to message checkType with proper markers)

Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-03-24 22:08:42 +01:00
Soxoj 97cc4b46d9 Improve site-check quality: fix broken site configs, add diagnostic utilities, and make self-check report-only by default with opt-in auto-disable. (#2301)
- Fix VK and TradingView checkType; add Reddit and Microsoft Learn API-style probes where appropriate; adjust or disable entries that are unreliable under anti-bot protection.
- Self-check: stop aggressive auto-disable; default to reporting issues only; add --auto-disable and --diagnose for optional fixes and deeper output.
- Tooling: add utils/site_check.py and utils/check_top_n.py (and related helpers) to inspect and rank site behavior against the top-N list.
- Scope: aligns with fixing top-traffic / high-impact sites and making diagnostics repeatable without silently flipping disabled flags.
2026-03-22 16:48:35 +01:00
Soxoj 227a25bfa1 Twitter fixed, mirrors mechanism improvement (#2299) 2026-03-22 01:14:17 +01:00
Soxoj f99091f5f7 Fixed false positives in top-500 (#2292) 2026-03-21 23:35:59 +01:00
Soxoj bebadb0362 Bump to 0.5.0 (#2108) 2025-08-10 13:10:50 +02:00
Soxoj 4b1317789d Refactored self-check method, code formatting, small lint fixes (#1942) 2024-12-07 18:05:30 +01:00
Soxoj f04de78682 Activation mechanism documentation added (#1935)
A few site checks fixed
2024-12-06 01:35:19 +01:00
Soxoj e982be4109 Installation docs update (#1927) 2024-12-03 20:23:49 +01:00
Soxoj 2f93963a0a Refactored sites module, updated documentation (#1918) 2024-12-01 11:41:41 +01:00
Soxoj 86d51bced0 Added 7 sites, implemented integration with Marple, docs update (#1881)
* Added 5 sites, implemented integration with Marple

* Added 2 more sites, updated docs

* Updated sites list
2024-11-25 14:41:34 +01:00
Soxoj 24e545b62c Added dev documentation, fixed some sites, removed GitHub issue links from reports (#1869) 2024-11-23 18:45:56 +01:00
Soxoj f7c7809d8d Bump to 0.4.4 (#621) 2022-09-03 14:30:24 +03:00
Soxoj 19956f74ca Bump to 0.4.3 2022-04-13 22:58:21 +03:00
Soxoj 8a865a1ce6 Op.gg fixes (#363)
* Fixed op.gg sites

* Added testing docs, fixed some errors

* Updated site list and statistics
2022-02-26 14:16:13 +03:00
Soxoj 31fc656721 Added package publishing instructions (#356) 2022-02-23 16:46:58 +03:00