Copilot
01049b730d
Fix Love.Mail.ru: update to numeric-only identifiers and new profile URL ( #2307 )
...
* Initial plan
* fix: update Love.Mail.ru to use numeric-only identifiers (#1264)
- Add regexCheck to enforce numeric-only IDs (^\d+$)
- Update usernameClaimed/usernameUnclaimed to numeric values
- Site remains disabled pending live verification
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/6de16097-6bc1-424a-beb1-1d2ec6b99944
* fix: update Love.Mail.ru URL to /profile/ path, enable check with verified ID
Use maintainer-provided working link https://love.mail.ru/profile/1838153357.
- Change URL pattern from /ru/{username} to /profile/{username}
- Set usernameClaimed to 1838153357
- Remove disabled flag to enable the check
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/ac07d38e-46e2-42d3-9e93-eda3e5cfbcc3
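
The two changes above (numeric-only identifiers and the /profile/ URL pattern) can be sketched in a few lines of Python. This is only an illustration: the regex `^\d+$` and the URL pattern come from the commit message, while the helper name `build_profile_url` is hypothetical and is not part of maigret.

```python
import re
from typing import Optional

# Pattern quoted in the commit message: identifiers must consist of digits only.
NUMERIC_ID = re.compile(r"^\d+$")

# URL pattern quoted in the commit message.
URL_TEMPLATE = "https://love.mail.ru/profile/{username}"


def build_profile_url(username: str) -> Optional[str]:
    """Hypothetical helper: return the profile URL, or None for a non-numeric ID."""
    if not NUMERIC_ID.match(username):
        return None  # skipped before any request would be made
    return URL_TEMPLATE.format(username=username)


print(build_profile_url("1838153357"))
print(build_profile_url("soxoj"))
```

Rejecting non-numeric input up front avoids sending requests that can never match a profile.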
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
2026-03-22 22:42:59 +01:00
Copilot
4f397fed1c
Re-enable taplink.cc with browser User-Agent to bypass Cloudflare ( #2308 )
...
* Initial plan
* fix(taplink): re-enable taplink.cc with browser User-Agent header to bypass Cloudflare
Remove disabled flag and add a Chrome User-Agent header to help
bypass Cloudflare bot detection for taplink.cc profile checks.
If Cloudflare still blocks requests, maigret's built-in error
detection will gracefully mark results as UNKNOWN.
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
Agent-Logs-Url: https://github.com/soxoj/maigret/sessions/271904b6-e358-4aeb-b503-21c9b91186d9
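
As a rough illustration of the header change described above, the following standard-library sketch attaches a browser-like User-Agent to a request object. The exact User-Agent string and the profile URL are assumptions (the commit message does not quote them), and `make_request` is a hypothetical helper, not maigret code. No request is actually sent.

```python
import urllib.request

# Assumed value: the commit only says "a Chrome User-Agent header".
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def make_request(url: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a request with a browser UA."""
    return urllib.request.Request(url, headers={"User-Agent": CHROME_UA})


# Hypothetical profile URL, used only to construct the request object.
req = make_request("https://taplink.cc/example")
print(req.get_header("User-agent"))
```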
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: soxoj <31013580+soxoj@users.noreply.github.com>
2026-03-22 22:10:44 +01:00
Soxoj
959b2be136
feat(sites): fix false positives: disable 74 broken sites, fix 8 with API probes and better markers ( #2302 )
...
- Disable 74 sites: Cloudflare/captcha blocks, identical responses,
dead domains, vBulletin/phpBB engine failures
- Fix Roblox, Salon24.pl, Planetaexcel → status_code (clear 404 signal)
- Fix en.brickimedia.org → message with "noarticletext" absenceStr
- Fix Arduino → narrower title-based presenseStrs/absenceStrs
- Re-enable Fandom (3 wikis) via MediaWiki api.php urlProbe
- Re-enable Substack via /api/v1/user/{}/public_profile urlProbe
- Re-enable hashnode via GraphQL GET urlProbe (URL-encoded query)
- Document lessons: engine template drift, search-by-author fragility,
always-200 sites, TLS degradation, API bypassing Cloudflare,
GraphQL GET support, URL-encoding for template safety
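
The "URL-encoded query" point can be illustrated with a small sketch: a GraphQL query contains literal braces, which clash with a `{username}`-style placeholder if the URL is later filled in with `str.format`. Percent-encoding everything except the placeholder avoids the clash. The endpoint, the query text, and the helper name below are all hypothetical; the actual hashnode probe is not shown in this log.

```python
from urllib.parse import parse_qs, quote, urlsplit

PLACEHOLDER = "{username}"


def graphql_probe_template(endpoint: str, query: str) -> str:
    """Hypothetical helper: build a GET URL template for a GraphQL query.

    Every piece around the placeholder is percent-encoded, so GraphQL braces
    become %7B / %7D and only the placeholder keeps literal braces.
    """
    parts = query.split(PLACEHOLDER)
    encoded = PLACEHOLDER.join(quote(part, safe="") for part in parts)
    return f"{endpoint}?query={encoded}"


# Hypothetical endpoint and query, for illustration only.
template = graphql_probe_template(
    "https://example.com/graphql",
    'query { user(username: "{username}") { name } }',
)
url = template.format(username="alice")
print(url)
```

Decoding the `query` parameter of the resulting URL gives back the original query with the username substituted.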
2026-03-22 20:47:51 +01:00
Soxoj
97cc4b46d9
Improve site-check quality: fix broken site configs, add diagnostic utilities, and make self-check report-only by default with opt-in auto-disable. ( #2301 )
...
- Fix VK and TradingView checkType; add Reddit and Microsoft Learn API-style probes where appropriate; adjust or disable entries that are unreliable under anti-bot protection.
- Self-check: stop aggressive auto-disable; default to reporting issues only; add --auto-disable and --diagnose for optional fixes and deeper output.
- Tooling: add utils/site_check.py and utils/check_top_n.py (and related helpers) to inspect and rank site behavior against the top-N list
- Scope: aligns with fixing top-traffic / high-impact sites and making diagnostics repeatable without silently flipping disabled flags
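
The report-only default with opt-in auto-disable might look roughly like the sketch below. Only the flag names `--auto-disable` and `--diagnose` come from the commit message; treating them as boolean switches, the parser itself, and the `self_check_outcome` helper are assumptions made for illustration and do not reproduce maigret's actual implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser with the two flags named in the commit message."""
    parser = argparse.ArgumentParser(prog="self-check-sketch")
    # Assumed to be simple on/off switches.
    parser.add_argument("--auto-disable", action="store_true")
    parser.add_argument("--diagnose", action="store_true")
    return parser


def self_check_outcome(failing_sites, auto_disable=False):
    """Hypothetical helper: always report failures, disable only on request."""
    reported = sorted(failing_sites)
    disabled = list(reported) if auto_disable else []
    return {"reported": reported, "disabled": disabled}


args = build_parser().parse_args([])  # no flags: report-only
print(self_check_outcome({"SiteB", "SiteA"}, auto_disable=args.auto_disable))
```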
2026-03-22 16:48:35 +01:00
Soxoj
227a25bfa1
Twitter fixed, mirrors mechanism improvement ( #2299 )
2026-03-22 01:14:17 +01:00
Soxoj
f99091f5f7
Fixed false positives in top-500 ( #2292 )
2026-03-21 23:35:59 +01:00
Tang Vu
4cd1fccaa3
♻️ Refactor: Hardcoded relative path for database file ( #2285 )
...
* refactor: hardcoded relative path for database file
`app.config['MAIGRET_DB_FILE']` is set to a hardcoded relative path `os.path.join('maigret', 'resources', 'data.json')`. If the Flask application is executed from a different working directory (other than the repository root), it will fail to find the database file and crash.
Affected files: app.py, settings.py
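
One common way to avoid this class of bug is to anchor the path to the location of a module file rather than to the current working directory. The sketch below shows that general technique only; the helper name `resolve_db_file` and the assumed directory layout (a `resources/data.json` next to the given module) are illustrative and may differ from the fix actually merged.

```python
import os


def resolve_db_file(module_file: str) -> str:
    """Hypothetical helper: locate data.json relative to a module file.

    The result does not depend on the process's working directory.
    """
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_dir, "resources", "data.json")


# In a real module one would typically pass __file__; a fixed path is used here.
example = resolve_db_file(os.path.join(os.sep, "opt", "maigret", "app.py"))
print(example)
```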
2026-03-21 18:06:36 +01:00
Soxoj
48ca13dc4d
Make web interface accessible for Docker deployment by default ( #2189 )
2025-08-31 16:14:42 +02:00
Soxoj
fb26ccd1f6
Disabled some sites giving false positive results ( #2170 )
2025-08-22 03:10:47 +02:00
Soxoj
bebadb0362
Bump to 0.5.0 ( #2108 )
2025-08-10 13:10:50 +02:00
MR-VL
d90d8a8ac9
Disable AskFM ( #2037 )
2025-07-13 16:16:49 +02:00
Darlyson Rangel
c9e38632ca
Disable ICQ site ( #1993 )
2025-06-28 23:46:09 +02:00
Pierre-Yves Lapersonne
f76ea5d738
[ #2010 ] Add 6 more websites to manage ( #2009 )
...
* feat: add `framapiaf.org` in supported web sites, add tag `mastodon` (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* feat: add `write.as` in supported web sites, add tag `writefreely` (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* feat: add `programming.dev` in supported web sites, add tag `lemmy` (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* feat: add `mamot.fr` in supported web sites (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* feat: add `pixelfed.social` in supported web sites, add tag `pixelfed` (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* feat: add `Outgress` in supported web sites (#2010)
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
* Updated the list of supported sites
---------
Signed-off-by: Pierre-Yves Lapersonne <dev@pylapersonne.info>
Co-authored-by: Soxoj <soxoj@protonmail.com>
2025-06-28 23:33:29 +02:00
pykereaper
b21ac36b27
Fix usage of data.json files from web ( #2020 )
2025-06-28 23:20:02 +02:00
pykereaper
0f7aa2c456
Pass db_file configuration to web interface ( #2019 )
...
* pass db_file configuration to web interface
* Autoformatting
---------
Co-authored-by: Soxoj <soxoj@protonmail.com>
2025-06-28 23:15:56 +02:00
Soxoj
97e5f600d0
Async generator-executor for site checks ( #1978 )
2024-12-17 22:48:11 +01:00
overcuriousity
36ce285572
make graph more meaningful ( #1977 )
...
* make graph more meaningful
If a search is launched with multiple usernames, an additional site node is created for each site where both usernames are found.
Advantages:
- easier to recognize that the users are connected to each other
- better detection of false positives when a search is launched with two fake usernames (site node = definite false positive)
* fix Graph linking report.py
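
The shared-site-node idea can be sketched with plain Python data structures. This is not the code from report.py; the function name, the `site:` prefix, and the input shape (a mapping from username to the sites where it was found) are assumptions chosen for illustration.

```python
from collections import defaultdict


def shared_site_edges(found):
    """Hypothetical helper: link usernames through sites they have in common.

    `found` maps each username to the site names where it was found.
    Returns a set of (username, site_node) edges, one site node per shared site.
    """
    users_by_site = defaultdict(set)
    for username, sites in found.items():
        for site in sites:
            users_by_site[site].add(username)

    edges = set()
    for site, users in users_by_site.items():
        if len(users) > 1:  # only sites where several usernames were found
            for username in users:
                edges.add((username, f"site:{site}"))
    return edges


edges = shared_site_edges({"alice": ["GitHub", "Reddit"], "bob": ["GitHub"]})
print(sorted(edges))
```

With two deliberately fake usernames, any edge produced this way points at a site that reports accounts that cannot exist, which is the false-positive signal the commit describes.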
2024-12-17 16:51:19 +01:00
overcuriousity
c2e3e96cb7
Improving the web interface ( #1975 )
...
* update web interface with commandline options
* improve web interface
* update README images of web interface
* fix bug in app.py
* fix web interface
2024-12-17 16:50:49 +01:00
Soxoj
c3dfe9cb4d
Small docs and parameters fixes for web interface mode ( #1973 )
2024-12-16 17:18:22 +01:00
overcuriousity
88d68490f3
Created web frontend launched via --web flag ( #1967 )
...
Author: overcuriousity
Co-authored-by: Soxoj <soxoj@protonmail.com>
2024-12-16 14:24:03 +01:00
Soxoj
cb01535565
Preparation of 0.5.0 alpha version ( #1966 )
2024-12-13 12:51:31 +01:00
Soxoj
c4af0a4df0
Fixed flaky tests to check cookies ( #1965 )
2024-12-13 12:37:58 +01:00
Soxoj
b2283a5b04
Merge pull request #1961 from overcuriousity/main
...
fix bad linux filename generation
2024-12-12 22:07:21 +01:00
Soxoj
f212bc9bc8
Site check fixes
2024-12-12 21:39:35 +01:00
overcuriousity
b8c62f95ae
fix bad linux filename generation
...
Currently maigret parses URLs as usernames related to Gravatar. This leads to bad output filenames on my Linux host: the slashes make it try to write into subfolders, and the script aborts with the error "file does not exist".
Applied a simple fix: replace all "/" with "_" when generating output filenames.
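
The fix described above amounts to a single string replacement. A minimal sketch, with a hypothetical helper name and a made-up example identifier:

```python
def safe_report_name(identifier: str) -> str:
    """Hypothetical helper: make an identifier usable as a single filename part."""
    # Slashes would otherwise be interpreted as directory separators.
    return identifier.replace("/", "_")


print(safe_report_name("https://gravatar.com/user"))
```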
2024-12-12 15:00:54 +01:00
Soxoj
2653c617f8
Merge pull request #1958 from soxoj/gravatar-pypi-fix
...
Fixed Gravatar parsing (socid_extractor)
2024-12-12 02:32:35 +01:00
Soxoj
4dd82bf4c9
Fixed Gravatar parsing (socid_extractor)
2024-12-12 02:30:29 +01:00
Ikko Eltociear Ashimine
f8ab484cd2
chore: update submit.py
...
futher -> further
2024-12-11 23:23:45 +09:00
Soxoj
64ae391a4a
Updated Vimeo, CNET, DailyMotion
2024-12-11 01:17:20 +01:00
Soxoj
127d9032c3
Fixed Vimeo, activation/probing mechanisms improvements
2024-12-11 00:56:00 +01:00
Soxoj
81a817a39f
Improved "submit new site" mode, added tests, fixed top-500 sites ( #1952 )
2024-12-10 18:02:43 +01:00
Soxoj
51ab988e36
Fixed ProductHunt check ( #1951 )
2024-12-09 17:06:03 +01:00
Soxoj
5517636850
Updated OP.GG checks ( #1950 )
...
* Updated OP.GG checks
* Finalized LoL, added Valorant, disabled Archive.org
2024-12-09 15:59:19 +01:00
Soxoj
4eada16b94
Added a test for submitter ( #1944 )
2024-12-08 13:35:27 +01:00
Soxoj
c66d776f8a
Refactoring, test coverage increased to 60% ( #1943 )
2024-12-08 02:13:28 +01:00
Soxoj
4b1317789d
Refactored self-check method, code formatting, small lint fixes ( #1942 )
2024-12-07 18:05:30 +01:00
Soxoj
8b7d8073d9
Fixed Linktr and discourse.mozilla.org ( #1941 )
2024-12-07 17:11:39 +01:00
Soxoj
2aa1ea39a0
Site fixes ( #1940 )
2024-12-06 14:27:38 +01:00
Soxoj
cd789ed138
Fixed Ebay and BongaCams checks ( #1939 )
2024-12-06 13:32:51 +01:00
Soxoj
5641456ba0
Weibo site check fix, activation mechanism added ( #1938 )
2024-12-06 11:31:20 +01:00
Soxoj
f04de78682
Activation mechanism documentation added ( #1935 )
...
Few site checks fixed
2024-12-06 01:35:19 +01:00
Soxoj
cb9f01c106
Fixed Figma check ( #1932 )
...
Fixed cookies bug
Improved self-check mode: don't disable sites because of check errors
2024-12-04 19:21:27 +01:00
Soxoj
1cb25946dd
Disabled Figma check ( #1928 )
2024-12-04 00:27:55 +01:00
Soxoj
2f93963a0a
Refactored sites module, updated documentation ( #1918 )
2024-12-01 11:41:41 +01:00
Soxoj
d15e12750b
Sites fixes ( #1917 )
...
* Some sites fixes
* Sites stats updated
2024-12-01 03:19:36 +01:00
Soxoj
e96d09dee7
Permutator output and documentation updates ( #1914 )
2024-11-29 13:15:03 +01:00
Soxoj
15702bd9f4
Fixed dateutil parsing error for CDT timezone ( #1907 )
2024-11-29 12:02:41 +01:00
Soxoj
2e2a47a12b
Close http connections ( #1595 ) ( #1905 )
2024-11-27 15:28:10 +01:00
Soxoj
8a98aa9eaa
Retries set to 0 by default, refactored code of executor with progress ( #1899 )
...
* Retries set to 0 by default, refactored code of executor with progress
2024-11-26 19:07:15 +01:00
Soxoj
ee25c61fc2
Maigret bot support (custom progress function fixed) ( #1898 )
...
* Fixed progress/close functions
* Fixed tests: execution time increased with alive_progressbar
2024-11-26 15:54:26 +01:00