Compare commits

..

2 Commits

Author SHA1 Message Date
dezort12 b62aec4882 disable-sites 2023-01-31 13:18:56 -05:00
dezort12 4062dab288 disable-donationalerts 2023-01-27 14:30:40 -05:00
66 changed files with 1216 additions and 6336 deletions
-2
View File
@@ -1,5 +1,3 @@
# These are supported funding model platforms # These are supported funding model platforms
patreon: soxoj patreon: soxoj
github: soxoj
buy_me_a_coffee: soxoj
+10 -60
View File
@@ -2,71 +2,21 @@ name: Package exe with PyInstaller - Windows
on: on:
push: push:
branches: [ main, dev ] branches: [ main ]
jobs: jobs:
build: build:
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- name: Checkout - uses: actions/checkout@v2
uses: actions/checkout@v4 - name: PyInstaller Windows
uses: JackMcKew/pyinstaller-action-windows@main
with:
path: pyinstaller
- name: TEST PyInstaller Windows Build - uses: actions/upload-artifact@v2
shell: bash
run: |
echo "test" > maigret_standalone_win32
- name: TEST Upload PyInstaller Binary to Workflow as Artifact
uses: actions/upload-artifact@v4
with: with:
name: maigret_standalone_win32 name: maigret_standalone_win32
path: maigret_standalone_win32 path: pyinstaller/dist/windows # or path/to/artifact
# - name: PyInstaller Windows Build
# uses: JackMcKew/pyinstaller-action-windows@main
# with:
# path: pyinstaller
# - name: Upload PyInstaller Binary to Workflow as Artifact
# uses: actions/upload-artifact@v4
# with:
# name: maigret_standalone_win32
# path: pyinstaller/dist/windows
- name: Download PyInstaller Binary
uses: actions/download-artifact@v4
with:
name: maigret_standalone_win32
- name: Remove Previous Release
uses: soxoj/delete-release-action@v1
with:
release_name: ${{ github.ref_name }}
env:
GITHUB_TOKEN: ${{ github.token }}
# test change
- name: Create New Release
uses: actions/create-release@v1
id: create_release
with:
draft: false
prerelease: true
release_name: Windows Release [${{ github.ref_name }}]
tag_name: ${{ github.ref_name }}-${{ github.run_number }}
body: |
This is a development release, built from the branch **${{ github.ref_name }}**.
Download the attached file "maigret_standalone_win32.zip" to get the Windows executable.
env:
GITHUB_TOKEN: ${{ github.token }}
- name: Upload PyInstaller Binary to Release
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ github.token }}
with:
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: ./maigret_standalone_win32
asset_name: maigret_standalone_win32
asset_content_type: application/zip
+4 -4
View File
@@ -13,7 +13,7 @@ jobs:
runs-on: ubuntu-latest runs-on: ubuntu-latest
strategy: strategy:
matrix: matrix:
python-version: ["3.10", "3.11", "3.12"] python-version: [3.7, 3.8, 3.9]
steps: steps:
- uses: actions/checkout@v2 - uses: actions/checkout@v2
@@ -24,8 +24,8 @@ jobs:
- name: Install dependencies - name: Install dependencies
run: | run: |
python -m pip install --upgrade pip python -m pip install --upgrade pip
python -m pip install poetry python -m pip install -r test-requirements.txt
python -m poetry install --with dev if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
- name: Test with pytest - name: Test with pytest
run: | run: |
poetry run pytest --reruns 3 --reruns-delay 5 pytest --reruns 3 --reruns-delay 5
-5
View File
@@ -1,6 +1,5 @@
# Virtual Environment # Virtual Environment
venv/ venv/
.venv/
# Editor Configurations # Editor Configurations
.vscode/ .vscode/
@@ -39,7 +38,3 @@ htmlcov/
# Maigret files # Maigret files
settings.json settings.json
# other
*.egg-info
build
-16
View File
@@ -1,16 +0,0 @@
version: 2
build:
os: ubuntu-22.04
tools:
python: "3.10"
sphinx:
configuration: docs/source/conf.py
formats:
- pdf
python:
install:
- requirements: docs/requirements.txt
-23
View File
@@ -2,10 +2,6 @@
Hey! I'm really glad you're reading this. Maigret contains a lot of sites, and it is very hard to keep all the sites operational. That's why any fix is important. Hey! I'm really glad you're reading this. Maigret contains a lot of sites, and it is very hard to keep all the sites operational. That's why any fix is important.
## Code of Conduct
Please read and follow the [Code of Conduct](CODE_OF_CONDUCT.md) to foster a welcoming and inclusive community.
## How to add a new site ## How to add a new site
#### Beginner level #### Beginner level
@@ -31,23 +27,4 @@ Always write a clear log message for your commits. One-line messages are fine fo
## Coding conventions ## Coding conventions
### General Guidelines
- Try to follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) for Python code style.
- Ensure your code passes all tests before submitting a pull request.
### Code Style
- **Indentation**: Use 4 spaces per indentation level.
- **Imports**:
- Standard library imports should be placed at the top.
- Third-party imports should follow.
- Group imports logically.
### Naming Conventions
- **Variables and Functions**: Use `snake_case`.
- **Classes**: Use `CamelCase`.
- **Constants**: Use `UPPER_CASE`.
Start reading the code and you'll get the hang of it. ;) Start reading the code and you'll get the hang of it. ;)
+1 -1
View File
@@ -1,4 +1,4 @@
FROM python:3.10-slim FROM python:3.9-slim
LABEL maintainer="Soxoj <soxoj@protonmail.com>" LABEL maintainer="Soxoj <soxoj@protonmail.com>"
WORKDIR /app WORKDIR /app
RUN pip install --no-cache-dir --upgrade pip RUN pip install --no-cache-dir --upgrade pip
-128
View File
@@ -1,128 +0,0 @@
@echo off
REM Maigret Windows installer / launcher.
REM Requires administrator rights, Python 3.8+ and pip3.

REM Step 1: check if running as admin.
REM "net session" fails with a non-zero errorlevel when not elevated.
echo Administrative permissions required. Detecting permissions...
net session >nul 2>&1
if %errorLevel% neq 0 (
    cls
    echo Failure: You MUST run this as administrator, otherwise commands will fail.
    pause >nul
    exit /b 1
)

REM Step 2: Check if Python and pip3 are installed
python --version >nul 2>&1
if %errorlevel% neq 0 (
    echo Python is not installed. Please install Python 3.8 or higher.
    pause
    exit /b
)
pip3 --version >nul 2>&1
if %errorlevel% neq 0 (
    echo pip3 is not installed. Please install pip3.
    pause
    exit /b
)

REM Step 3: Check Python version (exits 1 if below 3.8)
python -c "import sys; exit(0) if sys.version_info >= (3,8) else exit(1)"
if %errorlevel% neq 0 (
    echo Python version 3.8 or higher is required.
    pause
    exit /b
)

:1
cls
:::===============================================================
:::  ______                 __  __       _                 _
::: | ____|                |  \/  |     (_)               | |
::: | |__   __ _ ___ _   _ | \  / | __ _ _  __ _ _ __ ___| |_
::: |  __| / _` / __| | | || |\/| |/ _` | |/ _` | '__/ _ \ __|
::: | |___| (_| \__ \ |_| || |  | | (_| | | (_| | | |  __/ |_
::: |______\__,_|___/\__, ||_|  |_|\__,_|_|\__, |_|  \___|\__|
:::                   __/ |                 __/ |
:::                  |___/                 |___/
:::
:::===============================================================
echo.
REM Print the ASCII banner above by re-reading this script's ::: lines
for /f "delims=: tokens=*" %%A in ('findstr /b ::: "%~f0"') do @echo(%%A
echo.
echo ----------------------------------------------------------------
echo Python 3.8 or higher and pip3 required.
echo ----------------------------------------------------------------
echo Press [I] to begin installation.
echo Press [R] If already installed.
echo ----------------------------------------------------------------
choice /c IR
if %errorlevel%==1 goto install1
if %errorlevel%==2 goto after

:install1
cls
echo ========================================================
echo Maigret Installation Script
echo ========================================================
echo.
echo --------------------------------------------------------
echo If your pip installation is outdated, it could cause
echo cryptography to fail on installation.
echo --------------------------------------------------------
echo check for and install pip updates now?
echo --------------------------------------------------------
choice /c YN
if %errorlevel%==1 goto install2
if %errorlevel%==2 goto install3

:install2
cls
python -m pip install --upgrade pip
goto:install3

:install3
cls
echo ========================================================
echo Maigret Installation Script
echo ========================================================
echo.
echo --------------------------------------------------------
echo Install requirements and maigret?
echo --------------------------------------------------------
choice /c YN
if %errorlevel%==1 goto install4
if %errorlevel%==2 goto 1

:install4
cls
pip install .
pip install maigret
goto:after

:after
cls
echo ========================================================
echo Maigret Background Search
echo ========================================================
echo.
echo --------------------------------------------------------
echo Please Enter Username / Email
echo --------------------------------------------------------
set /p input=
maigret %input%
echo.
echo.
echo.
echo.
pause
goto:after
+4 -4
View File
@@ -10,16 +10,16 @@ rerun-tests:
lint: lint:
@echo 'syntax errors or undefined names' @echo 'syntax errors or undefined names'
flake8 --count --select=E9,F63,F7,F82 --show-source --statistics ${LINT_FILES} flake8 --count --select=E9,F63,F7,F82 --show-source --statistics ${LINT_FILES} maigret.py
@echo 'warning' @echo 'warning'
flake8 --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --ignore=E731,W503,E501 ${LINT_FILES} flake8 --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics --ignore=E731,W503,E501 ${LINT_FILES} maigret.py
@echo 'mypy' @echo 'mypy'
mypy --check-untyped-defs ${LINT_FILES} mypy ${LINT_FILES}
speed: speed:
time python3 -m maigret --version time python3 ./maigret.py --version
python3 -c "import timeit; t = timeit.Timer('import maigret'); print(t.timeit(number = 1000000))" python3 -c "import timeit; t = timeit.Timer('import maigret'); print(t.timeit(number = 1000000))"
python3 -X importtime -c "import maigret" 2> maigret-import.log python3 -X importtime -c "import maigret" 2> maigret-import.log
python3 -m tuna maigret-import.log python3 -m tuna maigret-import.log
+17 -54
View File
@@ -3,35 +3,27 @@
<p align="center"> <p align="center">
<p align="center"> <p align="center">
<a href="https://pypi.org/project/maigret/"> <a href="https://pypi.org/project/maigret/">
<img alt="PyPI version badge for Maigret" src="https://img.shields.io/pypi/v/maigret?style=flat-square" /> <img alt="PyPI" src="https://img.shields.io/pypi/v/maigret?style=flat-square">
</a> </a>
<a href="https://pypi.org/project/maigret/"> <a href="https://pypi.org/project/maigret/">
<img alt="PyPI download count for Maigret" src="https://img.shields.io/pypi/dw/maigret?style=flat-square" /> <img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dw/maigret?style=flat-square">
</a> </a>
<a href="https://github.com/soxoj/maigret"> <a href="https://pypi.org/project/maigret/">
<img alt="Minimum Python version required: 3.10+" src="https://img.shields.io/badge/Python-3.10%2B-brightgreen?style=flat-square" /> <img alt="Views" src="https://komarev.com/ghpvc/?username=maigret&color=brightgreen&label=views&style=flat-square">
</a>
<a href="https://github.com/soxoj/maigret/blob/main/LICENSE">
<img alt="License badge for Maigret" src="https://img.shields.io/github/license/soxoj/maigret?style=flat-square" />
</a>
<a href="https://github.com/soxoj/maigret">
<img alt="View count for Maigret project" src="https://komarev.com/ghpvc/?username=maigret&color=brightgreen&label=views&style=flat-square" />
</a> </a>
</p> </p>
<p align="center"> <p align="center">
<img src="https://raw.githubusercontent.com/soxoj/maigret/main/static/maigret.png" height="300"/> <img src="https://raw.githubusercontent.com/soxoj/maigret/main/static/maigret.png" height="200"/>
</p> </p>
</p> </p>
<i>The Commissioner Jules Maigret is a fictional French police detective, created by Georges Simenon. His investigation method is based on understanding the personality of different people and their interactions.</i> <i>The Commissioner Jules Maigret is a fictional French police detective, created by Georges Simenon. His investigation method is based on understanding the personality of different people and their interactions.</i>
<b>👉👉👉 [Online Telegram bot](https://t.me/osint_maigret_bot)</b>
## About ## About
**Maigret** collects a dossier on a person **by username only**, checking for accounts on a huge number of sites and gathering all the available information from web pages. No API keys required. Maigret is an easy-to-use and powerful fork of [Sherlock](https://github.com/sherlock-project/sherlock). **Maigret** collects a dossier on a person **by username only**, checking for accounts on a huge number of sites and gathering all the available information from web pages. No API keys required. Maigret is an easy-to-use and powerful fork of [Sherlock](https://github.com/sherlock-project/sherlock).
Currently supported more than 3000 sites ([full list](https://github.com/soxoj/maigret/blob/main/sites.md)), search is launched against 500 popular sites in descending order of popularity by default. Also supported checking of Tor sites, I2P sites, and domains (via DNS resolving). Currently supported more than 2500 sites ([full list](https://github.com/soxoj/maigret/blob/main/sites.md)), search is launched against 500 popular sites in descending order of popularity by default. Also supported checking of Tor sites, I2P sites, and domains (via DNS resolving).
## Main features ## Main features
@@ -45,13 +37,11 @@ See full description of Maigret features [in the documentation](https://maigret.
## Installation ## Installation
‼️ Maigret is available online via [official Telegram bot](https://t.me/osint_maigret_bot).
Maigret can be installed using pip, Docker, or simply can be launched from the cloned repo. Maigret can be installed using pip, Docker, or simply can be launched from the cloned repo.
Standalone EXE-binaries for Windows are located in [Releases section](https://github.com/soxoj/maigret/releases) of GitHub repository. Standalone EXE-binaries for Windows are located in [Releases section](https://github.com/soxoj/maigret/releases) of GitHub repository.
Also, you can run Maigret using cloud shells and Jupyter notebooks (see buttons below). Also you can run Maigret using cloud shells and Jupyter notebooks (see buttons below).
[![Open in Cloud Shell](https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md) [![Open in Cloud Shell](https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png)](https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md)
<a href="https://repl.it/github/soxoj/maigret"><img src="https://replit.com/badge/github/soxoj/maigret" alt="Run on Replit" height="50"></a> <a href="https://repl.it/github/soxoj/maigret"><img src="https://replit.com/badge/github/soxoj/maigret" alt="Run on Replit" height="50"></a>
@@ -61,7 +51,7 @@ Also, you can run Maigret using cloud shells and Jupyter notebooks (see buttons
### Package installing ### Package installing
**NOTE**: Python 3.10 or higher and pip is required, **Python 3.11 is recommended.** **NOTE**: Python 3.7 or higher and pip is required, **Python 3.8 is recommended.**
```bash ```bash
# install from pypi # install from pypi
@@ -76,12 +66,10 @@ maigret username
```bash ```bash
# or clone and install manually # or clone and install manually
git clone https://github.com/soxoj/maigret && cd maigret git clone https://github.com/soxoj/maigret && cd maigret
pip3 install -r requirements.txt
# build and install
pip3 install .
# usage # usage
maigret username ./maigret.py username
``` ```
### Docker ### Docker
@@ -100,17 +88,12 @@ docker build -t maigret .
## Usage examples ## Usage examples
```bash ```bash
# make HTML, PDF, and Xmind8 reports # make HTML and PDF reports
maigret user --html maigret user --html --pdf
maigret user --pdf
maigret user --xmind #Output not compatible with xmind 2022+
# search on sites marked with tags photo & dating # search on sites marked with tags photo & dating
maigret user --tags photo,dating maigret user --tags photo,dating
# search on sites marked with tag us
maigret user --tags us
# search for three usernames on all available sites # search for three usernames on all available sites
maigret user1 user2 user3 -a maigret user1 user2 user3 -a
``` ```
@@ -120,42 +103,22 @@ Use `maigret --help` to get full options description. Also options [are document
## Contributing ## Contributing
Maigret has open-source code, so you may contribute your own sites by adding them to `data.json` file, or bring changes to it's code! Maigret has open-source code, so you may contribute your own sites by adding them to `data.json` file, or bring changes to it's code!
If you want to contribute, don't forget to activate statistics update hook, command for it would look like this: `git config --local core.hooksPath .githooks/`
For more information about development and contribution, please read the [development documentation](https://maigret.readthedocs.io/en/latest/development.html). You should make your git commits from your maigret git repo folder, or else the hook wouldn't find the statistics update script.
## Demo with page parsing and recursive username search ## Demo with page parsing and recursive username search
### Video (asciinema)
<a href="https://asciinema.org/a/Ao0y7N0TTxpS0pisoprQJdylZ">
<img src="https://asciinema.org/a/Ao0y7N0TTxpS0pisoprQJdylZ.svg" alt="asciicast" width="600">
</a>
### Reports
[PDF report](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.pdf), [HTML report](https://htmlpreview.github.io/?https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.html) [PDF report](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.pdf), [HTML report](https://htmlpreview.github.io/?https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotographycars.html)
![animation of recursive search](https://raw.githubusercontent.com/soxoj/maigret/main/static/recursive_search.svg)
![HTML report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_html_screenshot.png) ![HTML report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_html_screenshot.png)
![XMind 8 report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_xmind_screenshot.png) ![XMind 8 report screenshot](https://raw.githubusercontent.com/soxoj/maigret/main/static/report_alexaimephotography_xmind_screenshot.png)
[Full console output](https://raw.githubusercontent.com/soxoj/maigret/main/static/recursive_search.md) [Full console output](https://raw.githubusercontent.com/soxoj/maigret/main/static/recursive_search.md)
## Disclaimer
**This tool is intended for educational and lawful purposes only.** The developers do not endorse or encourage any illegal activities or misuse of this tool. Regulations regarding the collection and use of personal data vary by country and region, including but not limited to GDPR in the EU, CCPA in the USA, and similar laws worldwide.
It is your sole responsibility to ensure that your use of this tool complies with all applicable laws and regulations in your jurisdiction. Any illegal use of this tool is strictly prohibited, and you are fully accountable for your actions.
The authors and developers of this tool bear no responsibility for any misuse or unlawful activities conducted by its users.
## SOWEL classification
This tool uses the following OSINT techniques:
- [SOTL-2.2. Search For Accounts On Other Platforms](https://sowel.soxoj.com/other-platform-accounts)
- [SOTL-6.1. Check Logins Reuse To Find Another Account](https://sowel.soxoj.com/logins-reuse)
- [SOTL-6.2. Check Nicknames Reuse To Find Another Account](https://sowel.soxoj.com/nicknames-reuse)
## License ## License
MIT © [Maigret](https://github.com/soxoj/maigret)<br/> MIT © [Maigret](https://github.com/soxoj/maigret)<br/>
Executable
+18
View File
@@ -0,0 +1,18 @@
#!/usr/bin/env python3
"""Standalone entry point for Maigret (used by the PyInstaller build)."""
import asyncio
import sys

from maigret.maigret import main


def run():
    """Run Maigret's async main; exit with status 1 on Ctrl-C."""
    try:
        # asyncio.run() creates and closes its own event loop; the previous
        # get_event_loop()/run_until_complete() pattern is deprecated since
        # Python 3.10 and leaked the loop on exit.
        asyncio.run(main())
    except KeyboardInterrupt:
        print('Maigret is interrupted.')
        sys.exit(1)


if __name__ == "__main__":
    run()
-1
View File
@@ -1,2 +1 @@
sphinx-copybutton sphinx-copybutton
sphinx_rtd_theme
+3 -3
View File
@@ -18,7 +18,7 @@ Parsing of account pages and online documents
Maigret will try to extract information about the document/account owner Maigret will try to extract information about the document/account owner
(including username and other ids) and will make a search by the (including username and other ids) and will make a search by the
extracted username and ids. See examples in the :ref:`extracting-information-from-pages` section. extracted username and ids. :doc:`Examples <extracting-information-from-pages>`.
Main options Main options
------------ ------------
@@ -28,8 +28,8 @@ Options are also configurable through settings files, see
``--tags`` - Filter sites for searching by tags: sites categories and ``--tags`` - Filter sites for searching by tags: sites categories and
two-letter country codes (**not a language!**). E.g. photo, dating, sport; jp, us, global. two-letter country codes (**not a language!**). E.g. photo, dating, sport; jp, us, global.
Multiple tags can be associated with one site. **Warning**: tags markup is Multiple tags can be associated with one site. **Warning: tags markup is
not stable now. Read more :doc:`in the separate section <tags>`. not stable now.**
``-n``, ``--max-connections`` - Allowed number of concurrent connections ``-n``, ``--max-connections`` - Allowed number of concurrent connections
**(default: 100)**. **(default: 100)**.
+1 -1
View File
@@ -3,7 +3,7 @@
# -- Project information # -- Project information
project = 'Maigret' project = 'Maigret'
copyright = '2024, soxoj' copyright = '2021, soxoj'
author = 'soxoj' author = 'soxoj'
release = '0.4.4' release = '0.4.4'
+5 -104
View File
@@ -3,37 +3,16 @@
Development Development
============== ==============
Frequently Asked Questions
--------------------------
1. Where to find the list of supported sites?
The human-readable list of supported sites is available in the `sites.md <https://github.com/soxoj/maigret/blob/main/sites.md>`_ file in the repository.
It's been generated automatically from the main JSON file with the list of supported sites.
The machine-readable JSON file with the list of supported sites is available in the
`data.json <https://github.com/soxoj/maigret/blob/main/maigret/resources/data.json>`_ file in the directory `resources`.
2. Which methods to check the account presence are supported?
The supported methods (``checkType`` values in ``data.json``) are:
- ``message`` - the most reliable method, checks if any string from ``presenceStrs`` is present and none of the strings from ``absenceStrs`` are present in the HTML response
- ``status_code`` - checks that status code of the response is 2XX
- ``response_url`` - check if there is not redirect and the response is 2XX
See the details of check mechanisms in the `checking.py <https://github.com/soxoj/maigret/blob/main/maigret/checking.py#L339>`_ file.
Testing Testing
------- -------
It is recommended to use Python 3.10 for testing. It is recommended to use Python 3.7/3.8 for testing due to some conflicts in 3.9.
Install test requirements: Install test requirements:
.. code-block:: console .. code-block:: console
poetry install --with dev pip install -r test-requirements.txt
Use the following commands to check Maigret: Use the following commands to check Maigret:
@@ -41,74 +20,19 @@ Use the following commands to check Maigret:
.. code-block:: console .. code-block:: console
# run linter and typing checks # run linter and typing checks
# order of checks: # order of checks%
# - critical syntax errors or undefined names # - critical syntax errors or undefined names
# - flake checks # - flake checks
# - mypy checks # - mypy checks
make lint make lint
# run testing with coverage html report # run testing with coverage html report
# current test coverage is 58% # current test coverage is 60%
make test make text
# open html report # open html report
open htmlcov/index.html open htmlcov/index.html
# get flamechart of imports to estimate startup time
make speed
How to fix false-positives
-----------------------------------------------
If you want to work with sites database, don't forget to activate statistics update git hook, command for it would look like this: ``git config --local core.hooksPath .githooks/``.
You should make your git commits from your maigret git repo folder, or else the hook wouldn't find the statistics update script.
1. Determine the problematic site.
If you already know which site has a false-positive and want to fix it specifically, go to the next step.
Otherwise, simply run a search with a random username (e.g. `laiuhi3h4gi3u4hgt`) and check the results.
Alternatively, you can use `the Telegram bot <https://t.me/osint_maigret_bot>`_.
2. Open the account link in your browser and check:
- If the site is completely gone, remove it from the list
- If the site still works but looks different, update in data.json how we check it
- If the site requires login to view profiles, disable checking it
3. Find the site in the `data.json <https://github.com/soxoj/maigret/blob/main/maigret/resources/data.json>`_ file.
If the ``checkType`` method is not ``message`` and you are going to fix check, update it:
- put ``message`` in ``checkType``
- put in ``absenceStrs`` a keyword that is present in the HTML response for an non-existing account
- put in ``presenceStrs`` a keyword that is present in the HTML response for an existing account
If you have trouble determining the right keywords, you can use automatic detection by passing the account URL with the ``--submit`` option:
.. code-block:: console
maigret --submit https://my.mail.ru/bk/alex
To disable checking, set ``disabled`` to ``true`` or simply run:
.. code-block:: console
maigret --self-check --site My.Mail.ru@bk.ru
To debug the check method using the response HTML, you can run:
.. code-block:: console
maigret soxoj --site My.Mail.ru@bk.ru -d 2> response.txt
There are few options for sites data.json helpful in various cases:
- ``engine`` - a predefined check for the sites of certain type (e.g. forums), see the ``engines`` section in the JSON file
- ``headers`` - a dictionary of additional headers to be sent to the site
- ``requestHeadOnly`` - set to ``true`` if it's enough to make a HEAD request to the site
- ``regexCheck`` - a regex to check if the username is valid, in case of frequent false-positives
How to publish new version of Maigret How to publish new version of Maigret
------------------------------------- -------------------------------------
@@ -175,26 +99,3 @@ PyPi package.
- **Press "Publish release" button** - **Press "Publish release" button**
8. That's all, now you can simply wait push to PyPi. You can monitor it in Action page: https://github.com/soxoj/maigret/actions/workflows/python-publish.yml 8. That's all, now you can simply wait push to PyPi. You can monitor it in Action page: https://github.com/soxoj/maigret/actions/workflows/python-publish.yml
Documentation updates
---------------------
Documentations is auto-generated and auto-deployed from the ``docs`` directory.
To manually update documentation:
1. Change something in the ``.rst`` files in the ``docs/source`` directory.
2. Install ``pip install -r requirements.txt`` in the docs directory.
3. Run ``make singlehtml`` in the terminal in the docs directory.
4. Open ``build/singlehtml/index.html`` in your browser to see the result.
5. If everything is ok, commit and push your changes to GitHub.
Roadmap
-------
.. warning::
This roadmap requires updating to reflect the current project status and future plans.
.. figure:: https://i.imgur.com/kk8cFdR.png
:target: https://i.imgur.com/kk8cFdR.png
:align: center
@@ -0,0 +1,35 @@
.. _extracting-information-from-pages:
Extracting information from pages
=================================
Maigret can parse URLs and content of web pages by URLs to extract info about account owner and other meta information.
You must specify the URL with the option ``--parse``, it's can be a link to an account or an online document. List of supported sites `see here <https://github.com/soxoj/socid-extractor#sites>`_.
After the end of the parsing phase, Maigret will start the search phase by :doc:`supported identifiers <supported-identifier-types>` found (usernames, ids, etc.).
Examples
--------
.. code-block:: console
$ maigret --parse https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit\#gid\=0
Scanning webpage by URL https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit#gid=0...
┣╸org_name: Gooten
┗╸mime_type: application/vnd.google-apps.ritz
Scanning webpage by URL https://clients6.google.com/drive/v2beta/files/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw?fields=alternateLink%2CcopyRequiresWriterPermission%2CcreatedDate%2Cdescription%2CdriveId%2CfileSize%2CiconLink%2Cid%2Clabels(starred%2C%20trashed)%2ClastViewedByMeDate%2CmodifiedDate%2Cshared%2CteamDriveId%2CuserPermission(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cpermissions(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cparents(id)%2Ccapabilities(canMoveItemWithinDrive%2CcanMoveItemOutOfDrive%2CcanMoveItemOutOfTeamDrive%2CcanAddChildren%2CcanEdit%2CcanDownload%2CcanComment%2CcanMoveChildrenWithinDrive%2CcanRename%2CcanRemoveChildren%2CcanMoveItemIntoTeamDrive)%2Ckind&supportsTeamDrives=true&enforceSingleParent=true&key=AIzaSyC1eQ1xj69IdTMeii5r7brs3R90eck-m7k...
┣╸created_at: 2016-02-16T18:51:52.021Z
┣╸updated_at: 2019-10-23T17:15:47.157Z
┣╸gaia_id: 15696155517366416778
┣╸fullname: Nadia Burgess
┣╸email: nadia@gooten.com
┣╸image: https://lh3.googleusercontent.com/a-/AOh14GheZe1CyNa3NeJInWAl70qkip4oJ7qLsD8vDy6X=s64
┗╸email_username: nadia
.. code-block:: console
$ maigret.py --parse https://steamcommunity.com/profiles/76561199113454789
Scanning webpage by URL https://steamcommunity.com/profiles/76561199113454789...
┣╸steam_id: 76561199113454789
┣╸nickname: Pok
┗╸username: Machine42
+2 -121
View File
@@ -14,95 +14,13 @@ Also, Maigret use found ids and usernames from links to start a recursive search
Enabled by default, can be disabled with ``--no-extracting``. Enabled by default, can be disabled with ``--no-extracting``.
.. code-block:: text
$ python3 -m maigret soxoj --timeout 5
[-] Starting a search on top 500 sites from the Maigret database...
[!] You can run search by full list of sites with flag `-a`
[*] Checking username soxoj on:
...
[+] GitHub: https://github.com/soxoj
├─uid: 31013580
├─image: https://avatars.githubusercontent.com/u/31013580?v=4
├─created_at: 2017-08-14T17:03:07Z
├─location: Amsterdam, Netherlands
├─follower_count: 1304
├─following_count: 54
├─fullname: Soxoj
├─public_gists_count: 3
├─public_repos_count: 88
├─twitter_username: sox0j
├─bio: Head of OSINT Center of Excellence in @SocialLinks-IO
├─is_company: Social Links
└─blog_url: soxoj.com
...
Recursive search Recursive search
---------------- ----------------
Maigret has the ability to scan account pages for :ref:`common identifiers <supported-identifier-types>` and usernames found in links. Maigret can extract some :ref:`common ids <supported-identifier-types>` and usernames from links on the account page (often people placed links to their other accounts) and immediately start new searches. All the gathered information will be displayed in CLI output and reports.
When people include links to their other social media accounts, Maigret can automatically detect and initiate new searches for those profiles.
Any information discovered through this process will be shown in both the command-line interface output and generated reports.
Enabled by default, can be disabled with ``--no-recursion``. Enabled by default, can be disabled with ``--no-recursion``.
.. code-block:: text
$ python3 -m maigret soxoj --timeout 5
[-] Starting a search on top 500 sites from the Maigret database...
[!] You can run search by full list of sites with flag `-a`
[*] Checking username soxoj on:
...
[+] GitHub: https://github.com/soxoj
├─uid: 31013580
├─image: https://avatars.githubusercontent.com/u/31013580?v=4
├─created_at: 2017-08-14T17:03:07Z
├─location: Amsterdam, Netherlands
├─follower_count: 1304
├─following_count: 54
├─fullname: Soxoj
├─public_gists_count: 3
├─public_repos_count: 88
├─twitter_username: sox0j <===== another username found here
├─bio: Head of OSINT Center of Excellence in @SocialLinks-IO
├─is_company: Social Links
└─blog_url: soxoj.com
...
Searching |████████████████████████████████████████| 500/500 [100%] in 9.1s (54.85/s)
[-] You can see detailed site check errors with a flag `--print-errors`
[*] Checking username sox0j on:
[+] Telegram: https://t.me/sox0j
├─fullname: @Sox0j
...
Username permutations
---------------------
Maigret can generate permutations of usernames. Just pass a few usernames in the CLI and use ``--permute`` flag.
Thanks to `@balestek <https://github.com/balestek>`_ for the idea and implementation.
.. code-block:: text
$ python3 -m maigret --permute hope dream --timeout 5
[-] 12 permutations from hope dream to check...
├─ hopedream
├─ _hopedream
├─ hopedream_
├─ hope_dream
├─ hope-dream
├─ hope.dream
├─ dreamhope
├─ _dreamhope
├─ dreamhope_
├─ dream_hope
├─ dream-hope
└─ dream.hope
[-] Starting a search on top 500 sites from the Maigret database...
[!] You can run search by full list of sites with flag `-a`
[*] Checking username hopedream on:
...
Reports Reports
------- -------
@@ -116,8 +34,7 @@ HTML/PDF reports contain:
Also, there is a short text report in the CLI output after the end of a searching phase. Also, there is a short text report in the CLI output after the end of a searching phase.
.. warning:: **Warning**: XMind 8 mindmaps are incompatible with XMind 2022!
XMind 8 mindmaps are incompatible with XMind 2022!
Tags Tags
---- ----
@@ -153,42 +70,6 @@ The Maigret database contains not only the original websites, but also mirrors,
It allows getting additional info about the person and checking the existence of the account even if the main site is unavailable (bot protection, captcha, etc.) It allows getting additional info about the person and checking the existence of the account even if the main site is unavailable (bot protection, captcha, etc.)
.. _extracting-information-from-pages:
Extraction of information from account pages
--------------------------------------------
Maigret can parse URLs and content of web pages by URLs to extract info about account owner and other meta information.
You must specify the URL with the option ``--parse``; it can be a link to an account or an online document. List of supported sites `see here <https://github.com/soxoj/socid-extractor#sites>`_.
After the end of the parsing phase, Maigret will start the search phase by :doc:`supported identifiers <supported-identifier-types>` found (usernames, ids, etc.).
.. code-block:: console
$ maigret --parse https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit\#gid\=0
Scanning webpage by URL https://docs.google.com/spreadsheets/d/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw/edit#gid=0...
┣╸org_name: Gooten
┗╸mime_type: application/vnd.google-apps.ritz
Scanning webpage by URL https://clients6.google.com/drive/v2beta/files/1HtZKMLRXNsZ0HjtBmo0Gi03nUPiJIA4CC4jTYbCAnXw?fields=alternateLink%2CcopyRequiresWriterPermission%2CcreatedDate%2Cdescription%2CdriveId%2CfileSize%2CiconLink%2Cid%2Clabels(starred%2C%20trashed)%2ClastViewedByMeDate%2CmodifiedDate%2Cshared%2CteamDriveId%2CuserPermission(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cpermissions(id%2Cname%2CemailAddress%2Cdomain%2Crole%2CadditionalRoles%2CphotoLink%2Ctype%2CwithLink)%2Cparents(id)%2Ccapabilities(canMoveItemWithinDrive%2CcanMoveItemOutOfDrive%2CcanMoveItemOutOfTeamDrive%2CcanAddChildren%2CcanEdit%2CcanDownload%2CcanComment%2CcanMoveChildrenWithinDrive%2CcanRename%2CcanRemoveChildren%2CcanMoveItemIntoTeamDrive)%2Ckind&supportsTeamDrives=true&enforceSingleParent=true&key=AIzaSyC1eQ1xj69IdTMeii5r7brs3R90eck-m7k...
┣╸created_at: 2016-02-16T18:51:52.021Z
┣╸updated_at: 2019-10-23T17:15:47.157Z
┣╸gaia_id: 15696155517366416778
┣╸fullname: Nadia Burgess
┣╸email: nadia@gooten.com
┣╸image: https://lh3.googleusercontent.com/a-/AOh14GheZe1CyNa3NeJInWAl70qkip4oJ7qLsD8vDy6X=s64
┗╸email_username: nadia
.. code-block:: console
$ maigret.py --parse https://steamcommunity.com/profiles/76561199113454789
Scanning webpage by URL https://steamcommunity.com/profiles/76561199113454789...
┣╸steam_id: 76561199113454789
┣╸nickname: Pok
┗╸username: Machine42
Simple API Simple API
---------- ----------
+7 -22
View File
@@ -3,44 +3,29 @@
Welcome to the Maigret docs! Welcome to the Maigret docs!
============================ ============================
**Maigret** is an easy-to-use and powerful OSINT tool for collecting a dossier on a person by a username (alias) only. **Maigret** is an easy-to-use and powerful OSINT tool for collecting a dossier on a person by username only.
This is achieved by checking for accounts on a huge number of sites and gathering all the available information from web pages. This is achieved by checking for accounts on a huge number of sites and gathering all the available information from web pages.
The project's main goal give to OSINT researchers and pentesters a **universal tool** to get maximum information The project's main goal - give to OSINT researchers and pentesters a **universal tool** to get maximum information about a subject and integrate it with other tools in automatization pipelines.
about a person of interest by a username and integrate it with other tools in automatization pipelines.
.. warning::
**This tool is intended for educational and lawful purposes only.**
The developers do not endorse or encourage any illegal activities or misuse of this tool.
Regulations regarding the collection and use of personal data vary by country and region,
including but not limited to GDPR in the EU, CCPA in the USA, and similar laws worldwide.
It is your sole responsibility to ensure that your use of this tool complies with all applicable laws
and regulations in your jurisdiction. Any illegal use of this tool is strictly prohibited,
and you are fully accountable for your actions.
The authors and developers of this tool bear no responsibility for any misuse
or unlawful activities conducted by its users.
You may be interested in: You may be interested in:
------------------------- -------------------------
- :doc:`Quick start <quick-start>` - :doc:`Command line options description <command-line-options>` and :doc:`usage examples <usage-examples>`
- :doc:`Usage examples <usage-examples>`
- :doc:`Command line options <command-line-options>`
- :doc:`Features list <features>` - :doc:`Features list <features>`
- :doc:`Project roadmap <roadmap>`
.. toctree:: .. toctree::
:hidden: :hidden:
:caption: Sections :caption: Sections
quick-start
installation
usage-examples
command-line-options command-line-options
extracting-information-from-pages
features features
philosophy philosophy
roadmap
supported-identifier-types supported-identifier-types
tags tags
usage-examples
settings settings
development development
-88
View File
@@ -1,88 +0,0 @@
.. _installation:
Installation
============
Maigret can be installed using pip, Docker, or simply can be launched from the cloned repo.
Also, it is available online via `official Telegram bot <https://t.me/osint_maigret_bot>`_,
source code of a bot is `available on GitHub <https://github.com/soxoj/maigret-tg-bot>`_.
Package installing
------------------
Please note that the sites database in the PyPI package may be outdated.
If you encounter frequent false positive results, we recommend installing the latest development version from GitHub instead.
.. note::
Python 3.10 or higher and pip is required, **Python 3.11 is recommended.**
.. code-block:: bash
# install from pypi
pip3 install maigret
# usage
maigret username
Development version (GitHub)
----------------------------
.. code-block:: bash
git clone https://github.com/soxoj/maigret && cd maigret
pip3 install .
# OR
pip3 install git+https://github.com/soxoj/maigret.git
# usage
maigret username
# OR use poetry in case you plan to develop Maigret
pip3 install poetry
poetry run maigret
Cloud shells and Jupyter notebooks
----------------------------------
In case you don't want to install Maigret locally, you can use cloud shells and Jupyter notebooks.
.. image:: https://user-images.githubusercontent.com/27065646/92304704-8d146d80-ef80-11ea-8c29-0deaabb1c702.png
:target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/soxoj/maigret&tutorial=README.md
:alt: Open in Cloud Shell
.. image:: https://replit.com/badge/github/soxoj/maigret
:target: https://repl.it/github/soxoj/maigret
:alt: Run on Replit
:height: 50
.. image:: https://colab.research.google.com/assets/colab-badge.svg
:target: https://colab.research.google.com/gist/soxoj/879b51bc3b2f8b695abb054090645000/maigret-collab.ipynb
:alt: Open In Colab
:height: 45
.. image:: https://mybinder.org/badge_logo.svg
:target: https://mybinder.org/v2/gist/soxoj/9d65c2f4d3bec5dd25949197ea73cf3a/HEAD
:alt: Open In Binder
:height: 45
Windows standalone EXE-binaries
-------------------------------
Standalone EXE-binaries for Windows are located in the `Releases section <https://github.com/soxoj/maigret/releases>`_ of GitHub repository.
Currently, the new binary is created automatically after each commit to the main branch, but is not deployed to the Releases section automatically.
Docker
------
.. code-block:: bash
# official image of the development version, updated from the github repo
docker pull soxoj/maigret
# usage
docker run -v /mydir:/app/reports soxoj/maigret:latest username --html
# manual build
docker build -t maigret .
Binary file not shown.

Before

Width:  |  Height:  |  Size: 375 KiB

+1 -1
View File
@@ -5,7 +5,7 @@ Philosophy
TL;DR: Username => Dossier TL;DR: Username => Dossier
Maigret is designed to gather all the available information about person by his username. Maigret is designed to gather all the available information about person by his usernname.
What kind of information is this? First, links to person accounts. Secondly, all the machine-extractable What kind of information is this? First, links to person accounts. Secondly, all the machine-extractable
pieces of info, such as: other usernames, full name, URLs to people's images, birthday, location (country, pieces of info, such as: other usernames, full name, URLs to people's images, birthday, location (country,
-15
View File
@@ -1,15 +0,0 @@
.. _quick-start:
Quick start
===========
After :doc:`installing Maigret <installation>`, you can begin searching by providing one or more usernames to look up:
``maigret username1 username2 ...``
Maigret will search for accounts with the specified usernames across a vast number of websites. It will provide you with a list
of URLs to any discovered accounts, along with relevant information extracted from those profiles.
.. image:: maigret_screenshot.png
:alt: Maigret search results screenshot
:align: center
+18
View File
@@ -0,0 +1,18 @@
.. _roadmap:
Roadmap
=======
.. figure:: https://i.imgur.com/kk8cFdR.png
:target: https://i.imgur.com/kk8cFdR.png
:align: center
Current status
--------------
- Sites DB stats - ok
- Scan sessions stats - ok
- Site engine autodetect - ok
- Engines for all the sites - WIP
- Unified reporting flow - ok
- Retries - ok
-3
View File
@@ -3,9 +3,6 @@
Settings Settings
============== ==============
.. warning::
The settings system is under development and may be subject to change.
Options are also configurable through settings files. See Options are also configurable through settings files. See
`settings JSON file <https://github.com/soxoj/maigret/blob/main/maigret/resources/settings.json>`_ `settings JSON file <https://github.com/soxoj/maigret/blob/main/maigret/resources/settings.json>`_
for the list of currently supported options. for the list of currently supported options.
+1 -2
View File
@@ -5,8 +5,7 @@ Tags
The use of tags allows you to select a subset of the sites from big Maigret DB for search. The use of tags allows you to select a subset of the sites from big Maigret DB for search.
.. warning:: **Warning: tags markup is not stable now.**
Tags markup is still not stable.
There are several types of tags: There are several types of tags:
+9 -26
View File
@@ -3,66 +3,49 @@
Usage examples Usage examples
============== ==============
1. Search for accounts with username ``machine42`` on top 500 sites (by default, according to Alexa rank) from the Maigret DB. Start a search for accounts with username ``machine42`` on top 500 sites from the Maigret DB.
.. code-block:: console .. code-block:: console
maigret machine42 maigret machine42
2. Search for accounts with username ``machine42`` on **all sites** from the Maigret DB. Start a search for accounts with username ``machine42`` on **all sites** from the Maigret DB.
.. code-block:: console .. code-block:: console
maigret machine42 -a maigret machine42 -a
.. note:: Start a search [...] and generate HTML and PDF reports.
Maigret will search for accounts on a huge number of sites,
and some of them may return false positive results. At the moment, we are working on autorepair mode to deliver
the most accurate results.
If you experience many false positives, you can do the following:
- Install the last development version of Maigret from GitHub
- Run Maigret with ``--self-check`` flag and agree on disabling of problematic sites
3. Search for accounts with username ``machine42`` and generate HTML and PDF reports.
.. code-block:: console .. code-block:: console
maigret machine42 -HP maigret machine42 -a -HP
or Start a search for accounts with username ``machine42`` only on Facebook.
.. code-block:: console
maigret machine42 -a --html --pdf
4. Search for accounts with username ``machine42`` on Facebook only.
.. code-block:: console .. code-block:: console
maigret machine42 --site Facebook maigret machine42 --site Facebook
5. Extract information from the Steam page by URL and start a search for accounts with found username ``machine42``. Extract information from the Steam page by URL and start a search for accounts with found username ``machine42``.
.. code-block:: console .. code-block:: console
maigret --parse https://steamcommunity.com/profiles/76561199113454789 maigret --parse https://steamcommunity.com/profiles/76561199113454789
6. Search for accounts with username ``machine42`` only on US and Japanese sites. Start a search for accounts with username ``machine42`` only on US and Japanese sites.
.. code-block:: console .. code-block:: console
maigret machine42 --tags en,jp maigret machine42 --tags en,jp
7. Search for accounts with username ``machine42`` only on sites related to software development. Start a search for accounts with username ``machine42`` only on sites related to software development.
.. code-block:: console .. code-block:: console
maigret machine42 --tags coding maigret machine42 --tags coding
8. Search for accounts with username ``machine42`` on uCoz sites only (mostly CIS countries). Start a search for accounts with username ``machine42`` on uCoz sites only (mostly CIS countries).
.. code-block:: console .. code-block:: console
+65 -40
View File
@@ -1,43 +1,68 @@
{ {
"nbformat": 4, "cells": [
"nbformat_minor": 0, {
"metadata": { "cell_type": "code",
"colab": { "execution_count": null,
"provenance": [] "metadata": {
}, "id": "8v6PEfyXb0Gx"
"kernelspec": { },
"name": "python3", "outputs": [],
"display_name": "Python 3" "source": [
}, "# clone the repo\n",
"language_info": { "!git clone https://github.com/soxoj/maigret\n",
"name": "python" "!pip3 install -r maigret/requirements.txt"
} ]
}, },
"cells": [ {
{ "cell_type": "code",
"cell_type": "code", "execution_count": null,
"execution_count": null, "metadata": {
"metadata": { "id": "cXOQUAhDchkl"
"id": "acxNWJOUmLc4" },
}, "outputs": [],
"outputs": [], "source": [
"source": [ "# help\n",
"!git clone https://github.com/soxoj/maigret\n", "!python3 maigret/maigret.py --help"
"!pip3 install ./maigret/\n", ]
"from IPython.display import clear_output\n", },
"clear_output()\n", {
"username = str(input(\"Username >> \"))\n", "cell_type": "code",
"!maigret {username} -a -n 10" "execution_count": null,
] "metadata": {
}, "id": "SjDmpN4QGnJu"
{ },
"cell_type": "code", "outputs": [],
"source": [], "source": [
"metadata": { "# search\n",
"id": "S3SmapMHmOoD" "!python3 maigret/maigret.py user"
}, ]
"execution_count": null, }
"outputs": [] ],
} "metadata": {
] "colab": {
"collapsed_sections": [],
"include_colab_link": true,
"name": "maigret.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 1
} }
Executable
+21
View File
@@ -0,0 +1,21 @@
#!/usr/bin/env python3
"""Console entry point for Maigret: runs the async main() and handles Ctrl-C."""
import asyncio
import sys

from maigret.maigret import main


def run():
    """Execute Maigret's async main(), exiting with status 1 on KeyboardInterrupt."""
    try:
        # Compare the full version tuple: the original `sys.version_info.minor >= 10`
        # check would misbehave on any future major version bump (e.g. 4.0 has
        # minor == 0 and would wrongly take the legacy branch).
        if sys.version_info >= (3, 10):
            # asyncio.run() creates, runs and closes the event loop itself.
            asyncio.run(main())
        else:
            # On older interpreters fall back to the classic loop API;
            # get_event_loop() still implicitly creates a loop there.
            loop = asyncio.get_event_loop()
            loop.run_until_complete(main())
    except KeyboardInterrupt:
        print('Maigret is interrupted.')
        sys.exit(1)


if __name__ == "__main__":
    run()
+120 -159
View File
@@ -1,39 +1,39 @@
# Standard library imports
import ast
import asyncio import asyncio
import logging import logging
import random
import re
import ssl
import sys
from typing import Dict, List, Optional, Tuple
from urllib.parse import quote
# Third party imports
import aiodns
from alive_progress import alive_bar
from aiohttp import ClientSession, TCPConnector, http_exceptions
from aiohttp.client_exceptions import ClientConnectorError, ServerDisconnectedError
from python_socks import _errors as proxy_errors
from socid_extractor import extract
try: try:
from mock import Mock from mock import Mock
except ImportError: except ImportError:
from unittest.mock import Mock from unittest.mock import Mock
# Local imports import re
from . import errors import ssl
import sys
import tqdm
import random
from typing import Tuple, Optional, Dict, List
from urllib.parse import quote
import aiodns
import tqdm.asyncio
from python_socks import _errors as proxy_errors
from socid_extractor import extract
from aiohttp import TCPConnector, ClientSession, http_exceptions
from aiohttp.client_exceptions import ServerDisconnectedError, ClientConnectorError
from .activation import ParsingActivator, import_aiohttp_cookies from .activation import ParsingActivator, import_aiohttp_cookies
from . import errors
from .errors import CheckError from .errors import CheckError
from .executors import ( from .executors import (
AsyncExecutor, AsyncExecutor,
AsyncioSimpleExecutor, AsyncioSimpleExecutor,
AsyncioProgressbarQueueExecutor, AsyncioProgressbarQueueExecutor,
) )
from .result import QueryResult, QueryStatus from .result import QueryResult, QueryStatus
from .sites import MaigretDatabase, MaigretSite from .sites import MaigretDatabase, MaigretSite
from .types import QueryOptions, QueryResultWrapper from .types import QueryOptions, QueryResultWrapper
from .utils import ascii_data_display, get_random_user_agent from .utils import get_random_user_agent, ascii_data_display
SUPPORTED_IDS = ( SUPPORTED_IDS = (
@@ -57,120 +57,119 @@ class CheckerBase:
class SimpleAiohttpChecker(CheckerBase): class SimpleAiohttpChecker(CheckerBase):
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
self.proxy = kwargs.get('proxy') proxy = kwargs.get('proxy')
self.cookie_jar = kwargs.get('cookie_jar') cookie_jar = kwargs.get('cookie_jar')
self.logger = kwargs.get('logger', Mock()) self.logger = kwargs.get('logger', Mock())
self.url = None
self.headers = None # moved here to speed up the launch of Maigret
self.allow_redirects = True from aiohttp_socks import ProxyConnector
self.timeout = 0
self.method = 'get' # make http client session
connector = ProxyConnector.from_url(proxy) if proxy else TCPConnector(ssl=False)
connector.verify_ssl = False
self.session = ClientSession(
connector=connector, trust_env=True, cookie_jar=cookie_jar
)
def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'): def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
self.url = url if method == 'get':
self.headers = headers request_method = self.session.get
self.allow_redirects = allow_redirects else:
self.timeout = timeout request_method = self.session.head
self.method = method
return None future = request_method(
url=url,
headers=headers,
allow_redirects=allow_redirects,
timeout=timeout,
)
return future
async def close(self): async def close(self):
pass await self.session.close()
async def check(self, future) -> Tuple[str, int, Optional[CheckError]]:
html_text = None
status_code = 0
error: Optional[CheckError] = CheckError("Unknown")
async def _make_request(self, session, url, headers, allow_redirects, timeout, method, logger) -> Tuple[str, int, Optional[CheckError]]:
try: try:
request_method = session.get if method == 'get' else session.head response = await future
async with request_method(
url=url,
headers=headers,
allow_redirects=allow_redirects,
timeout=timeout,
) as response:
status_code = response.status
response_content = await response.content.read()
charset = response.charset or "utf-8"
decoded_content = response_content.decode(charset, "ignore")
error = CheckError("Connection lost") if status_code == 0 else None status_code = response.status
logger.debug(decoded_content) response_content = await response.content.read()
charset = response.charset or "utf-8"
decoded_content = response_content.decode(charset, "ignore")
html_text = decoded_content
return decoded_content, status_code, error error = None
if status_code == 0:
error = CheckError("Connection lost")
self.logger.debug(html_text)
except asyncio.TimeoutError as e: except asyncio.TimeoutError as e:
return None, 0, CheckError("Request timeout", str(e)) error = CheckError("Request timeout", str(e))
except ClientConnectorError as e: except ClientConnectorError as e:
return None, 0, CheckError("Connecting failure", str(e)) error = CheckError("Connecting failure", str(e))
except ServerDisconnectedError as e: except ServerDisconnectedError as e:
return None, 0, CheckError("Server disconnected", str(e)) error = CheckError("Server disconnected", str(e))
except http_exceptions.BadHttpMessage as e: except http_exceptions.BadHttpMessage as e:
return None, 0, CheckError("HTTP", str(e)) error = CheckError("HTTP", str(e))
except proxy_errors.ProxyError as e: except proxy_errors.ProxyError as e:
return None, 0, CheckError("Proxy", str(e)) error = CheckError("Proxy", str(e))
except KeyboardInterrupt: except KeyboardInterrupt:
return None, 0, CheckError("Interrupted") error = CheckError("Interrupted")
except Exception as e: except Exception as e:
# python-specific exceptions
if sys.version_info.minor > 6 and ( if sys.version_info.minor > 6 and (
isinstance(e, ssl.SSLCertVerificationError) isinstance(e, ssl.SSLCertVerificationError)
or isinstance(e, ssl.SSLError) or isinstance(e, ssl.SSLError)
): ):
return None, 0, CheckError("SSL", str(e)) error = CheckError("SSL", str(e))
else: else:
logger.debug(e, exc_info=True) self.logger.debug(e, exc_info=True)
return None, 0, CheckError("Unexpected", str(e)) error = CheckError("Unexpected", str(e))
async def check(self) -> Tuple[str, int, Optional[CheckError]]: if error == "Invalid proxy response":
from aiohttp_socks import ProxyConnector self.logger.debug(error, exc_info=True)
connector = ProxyConnector.from_url(self.proxy) if self.proxy else TCPConnector(ssl=False)
connector.verify_ssl = False
async with ClientSession( return str(html_text), status_code, error
connector=connector,
trust_env=True,
cookie_jar=self.cookie_jar.copy() if self.cookie_jar else None,
) as session:
html_text, status_code, error = await self._make_request(
session,
self.url,
self.headers,
self.allow_redirects,
self.timeout,
self.method,
self.logger
)
if error and str(error) == "Invalid proxy response":
self.logger.debug(error, exc_info=True)
return str(html_text) if html_text else '', status_code, error
class ProxiedAiohttpChecker(SimpleAiohttpChecker): class ProxiedAiohttpChecker(SimpleAiohttpChecker):
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
self.proxy = kwargs.get('proxy') proxy = kwargs.get('proxy')
self.cookie_jar = kwargs.get('cookie_jar') cookie_jar = kwargs.get('cookie_jar')
self.logger = kwargs.get('logger', Mock()) self.logger = kwargs.get('logger', Mock())
# moved here to speed up the launch of Maigret
from aiohttp_socks import ProxyConnector
connector = ProxyConnector.from_url(proxy)
connector.verify_ssl = False
self.session = ClientSession(
connector=connector, trust_env=True, cookie_jar=cookie_jar
)
class AiodnsDomainResolver(CheckerBase): class AiodnsDomainResolver(CheckerBase):
if sys.platform == 'win32': # Temporary workaround for Windows
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
loop = asyncio.get_event_loop() loop = asyncio.get_event_loop()
self.logger = kwargs.get('logger', Mock()) self.logger = kwargs.get('logger', Mock())
self.resolver = aiodns.DNSResolver(loop=loop) self.resolver = aiodns.DNSResolver(loop=loop)
def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'): def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
self.url = url return self.resolver.query(url, 'A')
return None
async def check(self) -> Tuple[str, int, Optional[CheckError]]: async def check(self, future) -> Tuple[str, int, Optional[CheckError]]:
status = 404 status = 404
error = None error = None
text = '' text = ''
try: try:
res = await self.resolver.query(self.url, 'A') res = await future
text = str(res[0].host) text = str(res[0].host)
status = 200 status = 200
except aiodns.error.DNSError: except aiodns.error.DNSError:
@@ -189,7 +188,7 @@ class CheckerMock:
def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'): def prepare(self, url, headers=None, allow_redirects=True, timeout=0, method='get'):
return None return None
async def check(self) -> Tuple[str, int, Optional[CheckError]]: async def check(self, future) -> Tuple[str, int, Optional[CheckError]]:
await asyncio.sleep(0) await asyncio.sleep(0)
return '', 0, None return '', 0, None
@@ -375,16 +374,8 @@ def process_site_result(
if extracted_ids_data: if extracted_ids_data:
new_usernames = {} new_usernames = {}
for k, v in extracted_ids_data.items(): for k, v in extracted_ids_data.items():
if "username" in k and not "usernames" in k: if "username" in k:
new_usernames[v] = "username" new_usernames[v] = "username"
elif "usernames" in k:
try:
tree = ast.literal_eval(v)
if type(tree) == list:
for n in tree:
new_usernames[n] = "username"
except Exception as e:
logger.warning(e)
if k in SUPPORTED_IDS: if k in SUPPORTED_IDS:
new_usernames[v] = k new_usernames[v] = k
@@ -424,8 +415,6 @@ def make_site_result(
headers = { headers = {
"User-Agent": get_random_user_agent(), "User-Agent": get_random_user_agent(),
# tell server that we want to close connection after request
"Connection": "close",
} }
headers.update(site.headers) headers.update(site.headers)
@@ -532,8 +521,7 @@ def make_site_result(
# Store future request object in the results object # Store future request object in the results object
results_site["future"] = future results_site["future"] = future
results_site["checker"] = checker
results_site["checker"] = checker
return results_site return results_site
@@ -541,19 +529,14 @@ def make_site_result(
async def check_site_for_username( async def check_site_for_username(
site, username, options: QueryOptions, logger, query_notify, *args, **kwargs site, username, options: QueryOptions, logger, query_notify, *args, **kwargs
) -> Tuple[str, QueryResultWrapper]: ) -> Tuple[str, QueryResultWrapper]:
default_result = make_site_result( default_result = make_site_result(site, username, options, logger, retry=kwargs.get('retry'))
site, username, options, logger, retry=kwargs.get('retry') future = default_result.get("future")
) if not future:
# future = default_result.get("future")
# if not future:
# return site.name, default_result
checker = default_result.get("checker")
if not checker:
print(f"error, no checker for {site.name}")
return site.name, default_result return site.name, default_result
response = await checker.check() checker = default_result["checker"]
response = await checker.check(future=future)
response_result = process_site_result( response_result = process_site_result(
response, query_notify, logger, default_result, site response, query_notify, logger, default_result, site
@@ -565,8 +548,8 @@ async def check_site_for_username(
async def debug_ip_request(checker, logger): async def debug_ip_request(checker, logger):
checker.prepare(url="https://icanhazip.com") future = checker.prepare(url="https://icanhazip.com")
ip, status, check_error = await checker.check() ip, status, check_error = await checker.check(future)
if ip: if ip:
logger.debug(f"My IP is: {ip.strip()}") logger.debug(f"My IP is: {ip.strip()}")
else: else:
@@ -684,11 +667,8 @@ async def maigret(
executor = AsyncioSimpleExecutor(logger=logger) executor = AsyncioSimpleExecutor(logger=logger)
else: else:
executor = AsyncioProgressbarQueueExecutor( executor = AsyncioProgressbarQueueExecutor(
logger=logger, logger=logger, in_parallel=max_connections, timeout=timeout + 0.5,
in_parallel=max_connections, *args, **kwargs
timeout=timeout + 0.5,
*args,
**kwargs,
) )
# make options objects for all the requests # make options objects for all the requests
@@ -730,10 +710,7 @@ async def maigret(
tasks_dict[sitename] = ( tasks_dict[sitename] = (
check_site_for_username, check_site_for_username,
[site, username, options, logger, query_notify], [site, username, options, logger, query_notify],
{ {'default': (sitename, default_result), 'retry': retries-attempts+1},
'default': (sitename, default_result),
'retry': retries - attempts + 1,
},
) )
cur_results = await executor.run(tasks_dict.values()) cur_results = await executor.run(tasks_dict.values())
@@ -756,8 +733,10 @@ async def maigret(
# closing http client session # closing http client session
await clearweb_checker.close() await clearweb_checker.close()
await tor_checker.close() if tor_proxy:
await i2p_checker.close() await tor_checker.close()
if i2p_proxy:
await i2p_checker.close()
# notify caller that all queries are finished # notify caller that all queries are finished
query_notify.finish() query_notify.finish()
@@ -792,7 +771,7 @@ def timeout_check(value):
async def site_self_check( async def site_self_check(
site: MaigretSite, site: MaigretSite,
logger: logging.Logger, logger,
semaphore, semaphore,
db: MaigretDatabase, db: MaigretDatabase,
silent=False, silent=False,
@@ -838,9 +817,6 @@ async def site_self_check(
result = results_dict[site.name]["status"] result = results_dict[site.name]["status"]
if result.error and 'Cannot connect to host' in result.error.desc:
changes["disabled"] = True
site_status = result.status site_status = result.status
if site_status != status: if site_status != status:
@@ -868,24 +844,18 @@ async def site_self_check(
if changes["disabled"] != site.disabled: if changes["disabled"] != site.disabled:
site.disabled = changes["disabled"] site.disabled = changes["disabled"]
logger.info(f"Switching disabled status of {site.name} to {site.disabled}")
db.update_site(site) db.update_site(site)
if not silent: if not silent:
action = "Disabled" if site.disabled else "Enabled" action = "Disabled" if site.disabled else "Enabled"
print(f"{action} site {site.name}...") print(f"{action} site {site.name}...")
# remove service tag "unchecked"
if "unchecked" in site.tags:
site.tags.remove("unchecked")
db.update_site(site)
return changes return changes
async def self_check( async def self_check(
db: MaigretDatabase, db: MaigretDatabase,
site_data: dict, site_data: dict,
logger: logging.Logger, logger,
silent=False, silent=False,
max_connections=10, max_connections=10,
proxy=None, proxy=None,
@@ -899,7 +869,6 @@ async def self_check(
def disabled_count(lst): def disabled_count(lst):
return len(list(filter(lambda x: x.disabled, lst))) return len(list(filter(lambda x: x.disabled, lst)))
unchecked_old_count = len([site for site in all_sites.values() if "unchecked" in site.tags])
disabled_old_count = disabled_count(all_sites.values()) disabled_old_count = disabled_count(all_sites.values())
for _, site in all_sites.items(): for _, site in all_sites.items():
@@ -909,30 +878,22 @@ async def self_check(
future = asyncio.ensure_future(check_coro) future = asyncio.ensure_future(check_coro)
tasks.append(future) tasks.append(future)
if tasks: for f in tqdm.asyncio.tqdm.as_completed(tasks):
with alive_bar(len(tasks), title='Self-checking', force_tty=True) as progress: await f
for f in asyncio.as_completed(tasks):
await f
progress() # Update the progress bar
unchecked_new_count = len([site for site in all_sites.values() if "unchecked" in site.tags])
disabled_new_count = disabled_count(all_sites.values()) disabled_new_count = disabled_count(all_sites.values())
total_disabled = disabled_new_count - disabled_old_count total_disabled = disabled_new_count - disabled_old_count
if total_disabled: if total_disabled >= 0:
if total_disabled >= 0: message = "Disabled"
message = "Disabled" else:
else: message = "Enabled"
message = "Enabled" total_disabled *= -1
total_disabled *= -1
if not silent: if not silent:
print( print(
f"{message} {total_disabled} ({disabled_old_count} => {disabled_new_count}) checked sites. " f"{message} {total_disabled} ({disabled_old_count} => {disabled_new_count}) checked sites. "
"Run with `--info` flag to get more information" "Run with `--info` flag to get more information"
) )
if unchecked_new_count != unchecked_old_count: return total_disabled != 0
print(f"Unchecked sites verified: {unchecked_old_count - unchecked_new_count}")
return total_disabled != 0 or unchecked_new_count != unchecked_old_count
-6
View File
@@ -58,12 +58,6 @@ COMMON_ERRORS = {
'Сайт заблокирован хостинг-провайдером': CheckError( 'Сайт заблокирован хостинг-провайдером': CheckError(
'Site-specific', 'Site is disabled (Beget)' 'Site-specific', 'Site is disabled (Beget)'
), ),
'Generated by cloudfront (CloudFront)': CheckError(
'Request blocked', 'Cloudflare'
),
'/cdn-cgi/challenge-platform/h/b/orchestrate/chl_page': CheckError(
'Just a moment: bot redirect challenge', 'Cloudflare'
)
} }
ERRORS_TYPES = { ERRORS_TYPES = {
+33 -68
View File
@@ -1,13 +1,12 @@
import asyncio import asyncio
import sys
import time import time
from typing import Any, Iterable, List import tqdm
import sys
import alive_progress from typing import Iterable, Any, List
from alive_progress import alive_bar
from .types import QueryDraft from .types import QueryDraft
def create_task_func(): def create_task_func():
if sys.version_info.minor > 6: if sys.version_info.minor > 6:
create_asyncio_task = asyncio.create_task create_asyncio_task = asyncio.create_task
@@ -35,14 +34,9 @@ class AsyncExecutor:
class AsyncioSimpleExecutor(AsyncExecutor): class AsyncioSimpleExecutor(AsyncExecutor):
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs) super().__init__(*args, **kwargs)
self.semaphore = asyncio.Semaphore(kwargs.get('in_parallel', 100))
async def _run(self, tasks: Iterable[QueryDraft]): async def _run(self, tasks: Iterable[QueryDraft]):
async def sem_task(f, args, kwargs): futures = [f(*args, **kwargs) for f, args, kwargs in tasks]
async with self.semaphore:
return await f(*args, **kwargs)
futures = [sem_task(f, args, kwargs) for f, args, kwargs in tasks]
return await asyncio.gather(*futures) return await asyncio.gather(*futures)
@@ -52,20 +46,9 @@ class AsyncioProgressbarExecutor(AsyncExecutor):
async def _run(self, tasks: Iterable[QueryDraft]): async def _run(self, tasks: Iterable[QueryDraft]):
futures = [f(*args, **kwargs) for f, args, kwargs in tasks] futures = [f(*args, **kwargs) for f, args, kwargs in tasks]
total_tasks = len(futures)
results = [] results = []
for f in tqdm.asyncio.tqdm.as_completed(futures):
# Use alive_bar for progress tracking results.append(await f)
with alive_bar(total_tasks, title='Searching', force_tty=True) as progress:
# Chunk progress updates for efficiency
async def track_task(task):
result = await task
progress() # Update progress bar once task completes
return result
# Use gather to run tasks concurrently and track progress
results = await asyncio.gather(*(track_task(f) for f in futures))
return results return results
@@ -83,12 +66,8 @@ class AsyncioProgressbarSemaphoreExecutor(AsyncExecutor):
async def semaphore_gather(tasks: Iterable[QueryDraft]): async def semaphore_gather(tasks: Iterable[QueryDraft]):
coros = [_wrap_query(q) for q in tasks] coros = [_wrap_query(q) for q in tasks]
results = [] results = []
for f in tqdm.asyncio.tqdm.as_completed(coros):
# Use alive_bar correctly as a context manager results.append(await f)
with alive_bar(len(coros), title='Searching', force_tty=True) as progress:
for f in asyncio.as_completed(coros):
results.append(await f)
progress() # Update the progress bar
return results return results
return await semaphore_gather(tasks) return await semaphore_gather(tasks)
@@ -98,35 +77,27 @@ class AsyncioProgressbarQueueExecutor(AsyncExecutor):
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs) super().__init__(*args, **kwargs)
self.workers_count = kwargs.get('in_parallel', 10) self.workers_count = kwargs.get('in_parallel', 10)
self.progress_func = kwargs.get('progress_func', tqdm.tqdm)
self.queue = asyncio.Queue(self.workers_count) self.queue = asyncio.Queue(self.workers_count)
self.timeout = kwargs.get('timeout') self.timeout = kwargs.get('timeout')
# Pass a progress function; alive_bar by default
self.progress_func = kwargs.get('progress_func', alive_bar)
self.progress = None
# TODO: tests
async def increment_progress(self, count): async def increment_progress(self, count):
"""Update progress by calling the provided progress function.""" update_func = self.progress.update
if self.progress: if asyncio.iscoroutinefunction(update_func):
if asyncio.iscoroutinefunction(self.progress): await update_func(count)
await self.progress(count) else:
else: update_func(count)
self.progress(count) await asyncio.sleep(0)
await asyncio.sleep(0)
# TODO: tests
async def stop_progress(self): async def stop_progress(self):
"""Stop the progress tracking.""" stop_func = self.progress.close
if hasattr(self.progress, "close") and self.progress: if asyncio.iscoroutinefunction(stop_func):
close_func = self.progress.close await stop_func()
if asyncio.iscoroutinefunction(close_func): else:
await close_func() stop_func()
else: await asyncio.sleep(0)
close_func()
await asyncio.sleep(0)
async def worker(self): async def worker(self):
"""Consume tasks from the queue and process them."""
while True: while True:
try: try:
f, args, kwargs = self.queue.get_nowait() f, args, kwargs = self.queue.get_nowait()
@@ -141,33 +112,27 @@ class AsyncioProgressbarQueueExecutor(AsyncExecutor):
result = kwargs.get('default') result = kwargs.get('default')
self.results.append(result) self.results.append(result)
await self.increment_progress(1)
if self.progress:
await self.increment_progress(1)
self.queue.task_done() self.queue.task_done()
async def _run(self, queries: Iterable[QueryDraft]): async def _run(self, queries: Iterable[QueryDraft]):
"""Main runner function to execute tasks with progress tracking."""
self.results: List[Any] = [] self.results: List[Any] = []
queries_list = list(queries) queries_list = list(queries)
min_workers = min(len(queries_list), self.workers_count) min_workers = min(len(queries_list), self.workers_count)
workers = [create_task_func()(self.worker()) for _ in range(min_workers)] workers = [create_task_func()(self.worker()) for _ in range(min_workers)]
# Initialize the progress bar self.progress = self.progress_func(total=len(queries_list))
if self.progress_func:
with self.progress_func(len(queries_list), title="Searching", force_tty=True) as bar:
self.progress = bar # Assign alive_bar's callable to self.progress
# Add tasks to the queue for t in queries_list:
for t in queries_list: await self.queue.put(t)
await self.queue.put(t)
# Wait for tasks to complete await self.queue.join()
await self.queue.join()
# Cancel any remaining workers for w in workers:
for w in workers: w.cancel()
w.cancel()
await self.stop_progress()
return self.results return self.results
+7 -45
View File
@@ -1,7 +1,6 @@
""" """
Maigret main module Maigret main module
""" """
import ast
import asyncio import asyncio
import logging import logging
import os import os
@@ -41,10 +40,9 @@ from .submit import Submitter
from .types import QueryResultWrapper from .types import QueryResultWrapper
from .utils import get_dict_ascii_tree from .utils import get_dict_ascii_tree
from .settings import Settings from .settings import Settings
from .permutator import Permute
def notify_about_errors(search_results: QueryResultWrapper, query_notify, show_statistics=False): def notify_about_errors(search_results: QueryResultWrapper, query_notify):
errs = errors.extract_and_group(search_results) errs = errors.extract_and_group(search_results)
was_errs_displayed = False was_errs_displayed = False
for e in errs: for e in errs:
@@ -58,17 +56,12 @@ def notify_about_errors(search_results: QueryResultWrapper, query_notify, show_s
query_notify.warning(text, '!') query_notify.warning(text, '!')
was_errs_displayed = True was_errs_displayed = True
if show_statistics:
query_notify.warning(f'Verbose error statistics:')
for e in errs:
text = f'{e["err"]}: {round(e["perc"],2)}%'
query_notify.warning(text, '!')
if was_errs_displayed: if was_errs_displayed:
query_notify.warning( query_notify.warning(
'You can see detailed site check errors with a flag `--print-errors`' 'You can see detailed site check errors with a flag `--print-errors`'
) )
def extract_ids_from_page(url, logger, timeout=5) -> dict: def extract_ids_from_page(url, logger, timeout=5) -> dict:
results = {} results = {}
# url, headers # url, headers
@@ -92,17 +85,8 @@ def extract_ids_from_page(url, logger, timeout=5) -> dict:
else: else:
print(get_dict_ascii_tree(info.items(), new_line=False), ' ') print(get_dict_ascii_tree(info.items(), new_line=False), ' ')
for k, v in info.items(): for k, v in info.items():
# TODO: merge with the same functionality in checking module if 'username' in k:
if 'username' in k and not 'usernames' in k:
results[v] = 'username' results[v] = 'username'
elif 'usernames' in k:
try:
tree = ast.literal_eval(v)
if type(tree) == list:
for n in tree:
results[n] = 'username'
except Exception as e:
logger.warning(e)
if k in SUPPORTED_IDS: if k in SUPPORTED_IDS:
results[v] = k results[v] = k
@@ -211,12 +195,6 @@ def setup_arguments_parser(settings: Settings):
choices=SUPPORTED_IDS, choices=SUPPORTED_IDS,
help="Specify identifier(s) type (default: username).", help="Specify identifier(s) type (default: username).",
) )
parser.add_argument(
"--permute",
action="store_true",
default=False,
help="Permute at least 2 usernames to generate more possible usernames.",
)
parser.add_argument( parser.add_argument(
"--db", "--db",
metavar="DB_FILE", metavar="DB_FILE",
@@ -499,7 +477,7 @@ async def main():
arg_parser = setup_arguments_parser(settings) arg_parser = setup_arguments_parser(settings)
args = arg_parser.parse_args() args = arg_parser.parse_args()
# Re-set logging level based on args # Re-set loggging level based on args
if args.debug: if args.debug:
log_level = logging.DEBUG log_level = logging.DEBUG
elif args.info: elif args.info:
@@ -514,10 +492,6 @@ async def main():
for u in args.username for u in args.username
if u and u not in ['-'] and u not in args.ignore_ids_list if u and u not in ['-'] and u not in args.ignore_ids_list
} }
original_usernames = ""
if args.permute and len(usernames) > 1 and args.id_type == 'username':
original_usernames = " ".join(usernames.keys())
usernames = Permute(usernames).gather(method='strict')
parsing_enabled = not args.disable_extracting parsing_enabled = not args.disable_extracting
recursive_search_enabled = not args.disable_recursive_search recursive_search_enabled = not args.disable_recursive_search
@@ -569,11 +543,7 @@ async def main():
# Database self-checking # Database self-checking
if args.self_check: if args.self_check:
if len(site_data) == 0: print('Maigret sites database self-checking...')
query_notify.warning('No sites to self-check with the current filters! Exiting...')
return
query_notify.success(f'Maigret sites database self-check started for {len(site_data)} sites...')
is_need_update = await self_check( is_need_update = await self_check(
db, db,
site_data, site_data,
@@ -592,9 +562,7 @@ async def main():
print('Database was successfully updated.') print('Database was successfully updated.')
else: else:
print('Updates will be applied only for current search session.') print('Updates will be applied only for current search session.')
print('Scan sessions flags stats: ' + str(db.get_scan_stats(site_data)))
if args.verbose or args.debug:
query_notify.info('Scan sessions flags stats: ' + str(db.get_scan_stats(site_data)))
# Database statistics # Database statistics
if args.stats: if args.stats:
@@ -613,12 +581,6 @@ async def main():
query_notify.warning('No usernames to check, exiting.') query_notify.warning('No usernames to check, exiting.')
sys.exit(0) sys.exit(0)
if len(usernames) > 1 and args.permute and args.id_type == 'username':
query_notify.warning(
f"{len(usernames)} permutations from {original_usernames} to check..." +
get_dict_ascii_tree(usernames, prepend="\t")
)
if not site_data: if not site_data:
query_notify.warning('No sites to check, exiting!') query_notify.warning('No sites to check, exiting!')
sys.exit(2) sys.exit(2)
@@ -682,7 +644,7 @@ async def main():
check_domains=args.with_domains, check_domains=args.with_domains,
) )
notify_about_errors(results, query_notify, show_statistics=args.verbose) notify_about_errors(results, query_notify)
if args.reports_sorting == "data": if args.reports_sorting == "data":
results = sort_report_by_data_points(results) results = sort_report_by_data_points(results)
-4
View File
@@ -211,10 +211,6 @@ class QueryNotifyPrint(QueryNotify):
else: else:
print(msg) print(msg)
def success(self, message, symbol="+"):
msg = f"[{symbol}] {message}"
self._colored_print(Fore.GREEN, msg)
def warning(self, message, symbol="-"): def warning(self, message, symbol="-"):
msg = f"[{symbol}] {message}" msg = f"[{symbol}] {message}"
self._colored_print(Fore.YELLOW, msg) self._colored_print(Fore.YELLOW, msg)
-26
View File
@@ -1,26 +0,0 @@
# License MIT. by balestek https://github.com/balestek
from itertools import permutations
class Permute:
def __init__(self, elements: dict):
self.separators = ["", "_", "-", "."]
self.elements = elements
def gather(self, method: str = "strict" or "all") -> dict:
permutations_dict = {}
for i in range(1, len(self.elements) + 1):
for subset in permutations(self.elements, i):
if i == 1:
if method == "all":
permutations_dict[subset[0]] = self.elements[subset[0]]
permutations_dict["_" + subset[0]] = self.elements[subset[0]]
permutations_dict[subset[0] + "_"] = self.elements[subset[0]]
else:
for separator in self.separators:
perm = separator.join(subset)
permutations_dict[perm] = self.elements[subset[0]]
if separator == "":
permutations_dict["_" + perm] = self.elements[subset[0]]
permutations_dict[perm + "_"] = self.elements[subset[0]]
return permutations_dict
+2 -6
View File
@@ -8,7 +8,6 @@ from datetime import datetime
from typing import Dict, Any from typing import Dict, Any
import xmind import xmind
from dateutil.tz import gettz
from dateutil.parser import parse as parse_datetime_str from dateutil.parser import parse as parse_datetime_str
from jinja2 import Template from jinja2 import Template
@@ -17,8 +16,6 @@ from .result import QueryStatus
from .sites import MaigretDatabase from .sites import MaigretDatabase
from .utils import is_country_tag, CaseConverter, enrich_link_str from .utils import is_country_tag, CaseConverter, enrich_link_str
ADDITIONAL_TZINFO = {"CDT": gettz("America/Chicago")}
SUPPORTED_JSON_REPORT_FORMATS = [ SUPPORTED_JSON_REPORT_FORMATS = [
"simple", "simple",
"ndjson", "ndjson",
@@ -295,8 +292,8 @@ def generate_report_context(username_results: list):
first_seen = created_at first_seen = created_at
else: else:
try: try:
known_time = parse_datetime_str(first_seen, tzinfos=ADDITIONAL_TZINFO) known_time = parse_datetime_str(first_seen)
new_time = parse_datetime_str(created_at, tzinfos=ADDITIONAL_TZINFO) new_time = parse_datetime_str(created_at)
if new_time < known_time: if new_time < known_time:
first_seen = created_at first_seen = created_at
except Exception as e: except Exception as e:
@@ -305,7 +302,6 @@ def generate_report_context(username_results: list):
first_seen, first_seen,
created_at, created_at,
str(e), str(e),
exc_info=True,
) )
for k, v in status.ids_data.items(): for k, v in status.ids_data.items():
+288 -1611
View File
File diff suppressed because it is too large Load Diff
+2 -11
View File
@@ -1,30 +1,21 @@
{ {
"presence_strings": [ "presence_strings": [
"user not found",
"404",
"Page not found",
"error 404",
"username", "username",
"not found", "not found",
"пользователь", "пользователь",
"profile", "profile",
"lastname", "lastname",
"firstname", "firstname",
"DisplayName",
"biography", "biography",
"title",
"birthday", "birthday",
"репутация", "репутация",
"информация", "информация",
"e-mail", "e-mail"
"body",
"html",
"style"
], ],
"supposed_usernames": [ "supposed_usernames": [
"alex", "god", "admin", "red", "blue", "john" "alex", "god", "admin", "red", "blue", "john"
], ],
"retries_count": 0, "retries_count": 1,
"sites_db_path": "resources/data.json", "sites_db_path": "resources/data.json",
"timeout": 30, "timeout": 30,
"max_connections": 100, "max_connections": 100,
+1
View File
@@ -68,6 +68,7 @@
<div class="row-mb"> <div class="row-mb">
<div class="col-md"> <div class="col-md">
<div class="card flex-md-row mb-4 box-shadow h-md-250"> <div class="card flex-md-row mb-4 box-shadow h-md-250">
<span style="position: absolute; right: 10px;"><a href="https://github.com/soxoj/maigret/issues/new?assignees=soxoj&amp;labels=bug&amp;template=report-false-result.md&amp;title=Invalid%20result%20{{ v.url_user }}">Invalid?</a></span>
<img class="card-img-right flex-auto d-md-block" alt="Photo" style="width: 200px; height: 200px; object-fit: scale-down;" src="{{ v.status and v.status.ids_data and v.status.ids_data.image or 'https://i.imgur.com/040fmbw.png' }}" data-holder-rendered="true"> <img class="card-img-right flex-auto d-md-block" alt="Photo" style="width: 200px; height: 200px; object-fit: scale-down;" src="{{ v.status and v.status.ids_data and v.status.ids_data.image or 'https://i.imgur.com/040fmbw.png' }}" data-holder-rendered="true">
<div class="card-body d-flex flex-column align-items-start" style="padding-top: 0;"> <div class="card-body d-flex flex-column align-items-start" style="padding-top: 0;">
<h3 class="mb-0" style="padding-top: 1rem;"> <h3 class="mb-0" style="padding-top: 1rem;">
+1
View File
@@ -64,6 +64,7 @@
<div class="sitebox" style="margin-top: 20px;" > <div class="sitebox" style="margin-top: 20px;" >
<div> <div>
<div> <div>
<span class="invalid-button"><a href="https://github.com/soxoj/maigret/issues/new?assignees=soxoj&amp;labels=bug&amp;template=report-false-result.md&amp;title=Invalid%20result%20{{ v.url_user }}">Invalid?</a></span>
<table> <table>
<tr> <tr>
<td valign="top"> <td valign="top">
+38 -120
View File
@@ -21,7 +21,6 @@ class MaigretEngine:
class MaigretSite: class MaigretSite:
# Fields that should not be serialized when converting site to JSON
NOT_SERIALIZABLE_FIELDS = [ NOT_SERIALIZABLE_FIELDS = [
"name", "name",
"engineData", "engineData",
@@ -32,65 +31,37 @@ class MaigretSite:
"urlRegexp", "urlRegexp",
] ]
# Username known to exist on the site
username_claimed = "" username_claimed = ""
# Username known to not exist on the site
username_unclaimed = "" username_unclaimed = ""
# Additional URL path component, e.g. /forum in https://example.com/forum/users/{username}
url_subpath = "" url_subpath = ""
# Main site URL (the main page)
url_main = "" url_main = ""
# Full URL pattern for username page, e.g. https://example.com/forum/users/{username}
url = "" url = ""
# Whether site is disabled. Not used by Maigret without --use-disabled argument
disabled = False disabled = False
# Whether a positive result indicates accounts with similar usernames rather than exact matches
similar_search = False similar_search = False
# Whether to ignore 403 status codes
ignore403 = False ignore403 = False
# Site category tags
tags: List[str] = [] tags: List[str] = []
# Type of identifier (username, gaia_id etc); see SUPPORTED_IDS in checking.py
type = "username" type = "username"
# Custom HTTP headers
headers: Dict[str, str] = {} headers: Dict[str, str] = {}
# Error message substrings
errors: Dict[str, str] = {} errors: Dict[str, str] = {}
# Site activation requirements
activation: Dict[str, Any] = {} activation: Dict[str, Any] = {}
# Regular expression for username validation
regex_check = None regex_check = None
# URL to probe site status
url_probe = None url_probe = None
# Type of check to perform
check_type = "" check_type = ""
# Whether to only send HEAD requests (GET by default)
request_head_only = "" request_head_only = ""
# GET parameters to include in requests
get_params: Dict[str, Any] = {} get_params: Dict[str, Any] = {}
# Substrings in HTML response that indicate profile exists
presense_strs: List[str] = [] presense_strs: List[str] = []
# Substrings in HTML response that indicate profile doesn't exist
absence_strs: List[str] = [] absence_strs: List[str] = []
# Site statistics
stats: Dict[str, Any] = {} stats: Dict[str, Any] = {}
# Site engine name
engine = None engine = None
# Engine-specific configuration
engine_data: Dict[str, Any] = {} engine_data: Dict[str, Any] = {}
# Engine instance
engine_obj: Optional["MaigretEngine"] = None engine_obj: Optional["MaigretEngine"] = None
# Future for async requests
request_future = None request_future = None
# Alexa traffic rank
alexa_rank = None alexa_rank = None
# Source (in case a site is a mirror of another site)
source = None source = None
# URL protocol (http/https)
protocol = '' protocol = ''
def __init__(self, name, information): def __init__(self, name, information):
@@ -109,37 +80,6 @@ class MaigretSite:
def __str__(self): def __str__(self):
return f"{self.name} ({self.url_main})" return f"{self.name} ({self.url_main})"
def __is_equal_by_url_or_name(self, url_or_name_str: str):
lower_url_or_name_str = url_or_name_str.lower()
lower_url = self.url.lower()
lower_name = self.name.lower()
lower_url_main = self.url_main.lower()
return \
lower_name == lower_url_or_name_str or \
(lower_url_main and lower_url_main == lower_url_or_name_str) or \
(lower_url_main and lower_url_main in lower_url_or_name_str) or \
(lower_url_main and lower_url_or_name_str in lower_url_main) or \
(lower_url and lower_url_or_name_str in lower_url)
def __eq__(self, other):
if isinstance(other, MaigretSite):
# Compare only relevant attributes, not internal state like request_future
attrs_to_compare = [
'name', 'url_main', 'url_subpath', 'type', 'headers',
'errors', 'activation', 'regex_check', 'url_probe',
'check_type', 'request_head_only', 'get_params',
'presense_strs', 'absence_strs', 'stats', 'engine',
'engine_data', 'alexa_rank', 'source', 'protocol'
]
return all(getattr(self, attr) == getattr(other, attr)
for attr in attrs_to_compare)
elif isinstance(other, str):
# Compare only by name (exactly) or url_main (partial similarity)
return self.__is_equal_by_url_or_name(other)
return False
def update_detectors(self): def update_detectors(self):
if "url" in self.__dict__: if "url" in self.__dict__:
url = self.url url = self.url
@@ -161,10 +101,6 @@ class MaigretSite:
return None return None
def extract_id_from_url(self, url: str) -> Optional[Tuple[str, str]]: def extract_id_from_url(self, url: str) -> Optional[Tuple[str, str]]:
"""
Extracts username from url.
It's outdated, detects only a format of https://example.com/{username}
"""
if not self.url_regexp: if not self.url_regexp:
return None return None
@@ -287,15 +223,6 @@ class MaigretDatabase:
def sites_dict(self): def sites_dict(self):
return {site.name: site for site in self._sites} return {site.name: site for site in self._sites}
def has_site(self, site: MaigretSite):
for s in self._sites:
if site == s:
return True
return False
def __contains__(self, site):
return self.has_site(site)
def ranked_sites_dict( def ranked_sites_dict(
self, self,
reverse=False, reverse=False,
@@ -307,17 +234,6 @@ class MaigretDatabase:
): ):
""" """
Ranking and filtering of the sites list Ranking and filtering of the sites list
Args:
reverse (bool, optional): Reverse the sorting order. Defaults to False.
top (int, optional): Maximum number of sites to return. Defaults to sys.maxsize.
tags (list, optional): List of tags to filter sites by. Defaults to empty list.
names (list, optional): List of site names (or urls, see MaigretSite.__eq__) to filter by. Defaults to empty list.
disabled (bool, optional): Whether to include disabled sites. Defaults to True.
id_type (str, optional): Type of identifier to filter by. Defaults to "username".
Returns:
dict: Dictionary of filtered and ranked sites, with site names as keys and MaigretSite objects as values
""" """
normalized_names = list(map(str.lower, names)) normalized_names = list(map(str.lower, names))
normalized_tags = list(map(str.lower, tags)) normalized_tags = list(map(str.lower, tags))
@@ -504,64 +420,66 @@ class MaigretDatabase:
return results return results
def get_db_stats(self, is_markdown=False): def get_db_stats(self, is_markdown=False):
# Initialize counters
sites_dict = self.sites_dict sites_dict = self.sites_dict
urls = {} urls = {}
tags = {} tags = {}
output = ""
disabled_count = 0 disabled_count = 0
total_count = len(sites_dict)
message_checks = 0
message_checks_one_factor = 0 message_checks_one_factor = 0
status_checks = 0 status_checks = 0
# Collect statistics for _, site in sites_dict.items():
for site in sites_dict.values():
# Count disabled sites
if site.disabled: if site.disabled:
disabled_count += 1 disabled_count += 1
# Count URL types
url_type = site.get_url_template() url_type = site.get_url_template()
urls[url_type] = urls.get(url_type, 0) + 1 urls[url_type] = urls.get(url_type, 0) + 1
# Count check types for enabled sites if site.check_type == 'message' and not site.disabled:
if not site.disabled: message_checks += 1
if site.check_type == 'message': if site.absence_strs and site.presense_strs:
if not (site.absence_strs and site.presense_strs): continue
message_checks_one_factor += 1 message_checks_one_factor += 1
elif site.check_type == 'status_code':
status_checks += 1 if site.check_type == 'status_code':
status_checks += 1
# Count tags
if not site.tags: if not site.tags:
tags["NO_TAGS"] = tags.get("NO_TAGS", 0) + 1 tags["NO_TAGS"] = tags.get("NO_TAGS", 0) + 1
for tag in filter(lambda x: not is_country_tag(x), site.tags): for tag in filter(lambda x: not is_country_tag(x), site.tags):
tags[tag] = tags.get(tag, 0) + 1 tags[tag] = tags.get(tag, 0) + 1
# Calculate percentages enabled_count = total_count-disabled_count
total_count = len(sites_dict) enabled_perc = round(100*enabled_count/total_count, 2)
enabled_count = total_count - disabled_count output += f"Enabled/total sites: {enabled_count}/{total_count} = {enabled_perc}%\n\n"
enabled_perc = round(100 * enabled_count / total_count, 2)
checks_perc = round(100 * message_checks_one_factor / enabled_count, 2)
status_checks_perc = round(100 * status_checks / enabled_count, 2)
# Format output checks_perc = round(100*message_checks_one_factor/enabled_count, 2)
separator = "\n\n" output += f"Incomplete message checks: {message_checks_one_factor}/{enabled_count} = {checks_perc}% (false positive risks)\n\n"
output = [
f"Enabled/total sites: {enabled_count}/{total_count} = {enabled_perc}%",
f"Incomplete message checks: {message_checks_one_factor}/{enabled_count} = {checks_perc}% (false positive risks)",
f"Status code checks: {status_checks}/{enabled_count} = {status_checks_perc}% (false positive risks)",
f"False positive risk (total): {checks_perc + status_checks_perc:.2f}%",
self._format_top_items("profile URLs", urls, 20, is_markdown),
self._format_top_items("tags", tags, 20, is_markdown, self._tags),
]
return separator.join(output) status_checks_perc = round(100*status_checks/enabled_count, 2)
output += f"Status code checks: {status_checks}/{enabled_count} = {status_checks_perc}% (false positive risks)\n\n"
def _format_top_items(self, title, items_dict, limit, is_markdown, valid_items=None): output += f"False positive risk (total): {checks_perc+status_checks_perc:.2f}%\n\n"
"""Helper method to format top items lists"""
output = f"Top {limit} {title}:\n" top_urls_count = 20
for item, count in sorted(items_dict.items(), key=lambda x: x[1], reverse=True)[:limit]: output += f"Top {top_urls_count} profile URLs:\n"
for url, count in sorted(urls.items(), key=lambda x: x[1], reverse=True)[:top_urls_count]:
if count == 1: if count == 1:
break break
mark = " (non-standard)" if valid_items is not None and item not in valid_items else "" output += f"- ({count})\t`{url}`\n" if is_markdown else f"{count}\t{url}\n"
output += f"- ({count})\t`{item}`{mark}\n" if is_markdown else f"{count}\t{item}{mark}\n"
top_tags_count = 20
output += f"\nTop {top_tags_count} tags:\n"
for tag, count in sorted(tags.items(), key=lambda x: x[1], reverse=True)[:top_tags_count]:
mark = ""
if tag not in self._tags:
mark = " (non-standard)"
output += f"- ({count})\t`{tag}`{mark}\n" if is_markdown else f"{count}\t{tag}{mark}\n"
return output return output
+21 -124
View File
@@ -1,12 +1,11 @@
import asyncio import asyncio
import json import json
import re import re
from typing import List from typing import List, Tuple
from xml.etree import ElementTree import xml.etree.ElementTree as ET
from aiohttp import TCPConnector, ClientSession from aiohttp import TCPConnector, ClientSession
import requests import requests
import cloudscraper import cloudscraper
from colorama import Fore, Style
from .activation import import_aiohttp_cookies from .activation import import_aiohttp_cookies
from .checking import maigret from .checking import maigret
@@ -37,13 +36,12 @@ class CloudflareSession:
async def close(self): async def close(self):
pass pass
class Submitter: class Submitter:
HEADERS = { HEADERS = {
"User-Agent": get_random_user_agent(), "User-Agent": get_random_user_agent(),
} }
SEPARATORS = "\"'\n" SEPARATORS = "\"'"
RATIO = 0.6 RATIO = 0.6
TOP_FEATURES = 5 TOP_FEATURES = 5
@@ -56,7 +54,6 @@ class Submitter:
self.logger = logger self.logger = logger
from aiohttp_socks import ProxyConnector from aiohttp_socks import ProxyConnector
proxy = self.args.proxy proxy = self.args.proxy
cookie_jar = None cookie_jar = None
if args.cookie_file: if args.cookie_file:
@@ -72,7 +69,7 @@ class Submitter:
def get_alexa_rank(site_url_main): def get_alexa_rank(site_url_main):
url = f"http://data.alexa.com/data?cli=10&url={site_url_main}" url = f"http://data.alexa.com/data?cli=10&url={site_url_main}"
xml_data = requests.get(url).text xml_data = requests.get(url).text
root = ElementTree.fromstring(xml_data) root = ET.fromstring(xml_data)
alexa_rank = 0 alexa_rank = 0
try: try:
@@ -138,27 +135,20 @@ class Submitter:
if status == QueryStatus.CLAIMED: if status == QueryStatus.CLAIMED:
changes["disabled"] = True changes["disabled"] = True
elif status == QueryStatus.CLAIMED: elif status == QueryStatus.CLAIMED:
print( self.logger.warning(
f"{Fore.YELLOW}[!] Not found `{username}` in {site.name}, must be claimed{Style.RESET_ALL}" f"Not found `{username}` in {site.name}, must be claimed"
) )
self.logger.warning(site.json) self.logger.info(results_dict[site.name])
changes["disabled"] = True changes["disabled"] = True
else: else:
print( self.logger.warning(
f"{Fore.YELLOW}[!] Found `{username}` in {site.name}, must be available{Style.RESET_ALL}" f"Found `{username}` in {site.name}, must be available"
) )
self.logger.warning(site.json) self.logger.info(results_dict[site.name])
changes["disabled"] = True changes["disabled"] = True
else:
print(f"{Fore.GREEN}[+] {username} is successfully checked: {status} in {site.name}{Style.RESET_ALL}")
self.logger.info(f"Site {site.name} checking is finished") self.logger.info(f"Site {site.name} checking is finished")
# remove service tag "unchecked"
if "unchecked" in site.tags:
site.tags.remove("unchecked")
changes["tags"] = site.tags
return changes return changes
def generate_additional_fields_dialog(self, engine: MaigretEngine, dialog): def generate_additional_fields_dialog(self, engine: MaigretEngine, dialog):
@@ -173,9 +163,7 @@ class Submitter:
fields['urlSubpath'] = f'/{subpath}' fields['urlSubpath'] = f'/{subpath}'
return fields return fields
async def detect_known_engine( async def detect_known_engine(self, url_exists, url_mainpage) -> [List[MaigretSite], str]:
self, url_exists, url_mainpage
) -> [List[MaigretSite], str]:
resp_text = '' resp_text = ''
try: try:
r = await self.session.get(url_mainpage) r = await self.session.get(url_mainpage)
@@ -233,8 +221,7 @@ class Submitter:
return [], resp_text return [], resp_text
@staticmethod def extract_username_dialog(self, url):
def extract_username_dialog(url):
url_parts = url.rstrip("/").split("/") url_parts = url.rstrip("/").split("/")
supposed_username = url_parts[-1].strip('@') supposed_username = url_parts[-1].strip('@')
entered_username = input( entered_username = input(
@@ -293,10 +280,6 @@ class Submitter:
a_minus_b = tokens_a.difference(tokens_b) a_minus_b = tokens_a.difference(tokens_b)
b_minus_a = tokens_b.difference(tokens_a) b_minus_a = tokens_b.difference(tokens_a)
# additional filtering by html response
a_minus_b = [t for t in a_minus_b if not t in non_exists_resp_text]
b_minus_a = [t for t in b_minus_a if not t in exists_resp_text]
if len(a_minus_b) == len(b_minus_a) == 0: if len(a_minus_b) == len(b_minus_a) == 0:
print("The pages for existing and non-existing account are the same!") print("The pages for existing and non-existing account are the same!")
@@ -313,8 +296,6 @@ class Submitter:
:top_features_count :top_features_count
] ]
self.logger.debug([(keyword, match_fun(keyword)) for keyword in presence_list])
print("Detected text features of existing account: " + ", ".join(presence_list)) print("Detected text features of existing account: " + ", ".join(presence_list))
features = input("If features was not detected correctly, write it manually: ") features = input("If features was not detected correctly, write it manually: ")
@@ -324,8 +305,6 @@ class Submitter:
absence_list = sorted(b_minus_a, key=match_fun, reverse=True)[ absence_list = sorted(b_minus_a, key=match_fun, reverse=True)[
:top_features_count :top_features_count
] ]
self.logger.debug([(keyword, match_fun(keyword)) for keyword in absence_list])
print( print(
"Detected text features of non-existing account: " + ", ".join(absence_list) "Detected text features of non-existing account: " + ", ".join(absence_list)
) )
@@ -350,76 +329,6 @@ class Submitter:
site = MaigretSite(url_mainpage.split("/")[-1], site_data) site = MaigretSite(url_mainpage.split("/")[-1], site_data)
return site return site
async def add_site(self, site):
sem = asyncio.Semaphore(1)
print(f"{Fore.BLUE}{Style.BRIGHT}[*] Adding site {site.name}, let's check it...{Style.RESET_ALL}")
result = await self.site_self_check(site, sem)
if result["disabled"]:
print(
f"Checks failed for {site.name}, please, verify them manually."
)
return {
"valid": False,
"reason": "checks_failed",
}
while True:
print("\nAvailable fields to edit:")
editable_fields = {
'1': 'name',
'2': 'tags',
'3': 'url',
'4': 'url_main',
'5': 'username_claimed',
'6': 'username_unclaimed',
'7': 'presense_strs',
'8': 'absence_strs',
}
for num, field in editable_fields.items():
current_value = getattr(site, field)
print(f"{num}. {field} (current: {current_value})")
print("0. finish editing")
print("10. reject and block domain")
print("11. invalid params, remove")
choice = input("\nSelect field number to edit (0-8): ").strip()
if choice == '0':
break
if choice == '10':
return {
"valid": False,
"reason": "manual block",
}
if choice == '11':
return {
"valid": False,
"reason": "remove",
}
if choice in editable_fields:
field = editable_fields[choice]
current_value = getattr(site, field)
new_value = input(f"Enter new value for {field} (current: {current_value}): ").strip()
if field in ['tags', 'presense_strs', 'absence_strs']:
new_value = list(map(str.strip, new_value.split(',')))
if new_value:
setattr(site, field, new_value)
print(f"Updated {field} to: {new_value}")
self.logger.info(site.json)
self.db.update_site(site)
return {
"valid": True,
}
async def dialog(self, url_exists, cookie_file): async def dialog(self, url_exists, cookie_file):
domain_raw = self.URL_RE.sub("", url_exists).strip().strip("/") domain_raw = self.URL_RE.sub("", url_exists).strip().strip("/")
domain_raw = domain_raw.split("/")[0] domain_raw = domain_raw.split("/")[0]
@@ -452,16 +361,14 @@ class Submitter:
print('Detecting site engine, please wait...') print('Detecting site engine, please wait...')
sites = [] sites = []
text = None
try: try:
sites, text = await self.detect_known_engine(url_exists, url_exists) sites, text = await self.detect_known_engine(url_exists, url_exists)
except KeyboardInterrupt: except KeyboardInterrupt:
print('Engine detect process is interrupted.') print('Engine detect process is interrupted.')
if 'cloudflare' in text.lower(): if 'cloudflare' in text.lower():
print( print('Cloudflare protection detected. I will use cloudscraper for futher work')
'Cloudflare protection detected. I will use cloudscraper for futher work'
)
# self.session = CloudflareSession() # self.session = CloudflareSession()
if not sites: if not sites:
@@ -469,16 +376,11 @@ class Submitter:
redirects = False redirects = False
if self.args.verbose: if self.args.verbose:
redirects = ( redirects = 'y' in input('Should we do redirects automatically? [yN] ').lower()
'y' in input('Should we do redirects automatically? [yN] ').lower()
)
sites = [ sites = [
await self.check_features_manually( await self.check_features_manually(
url_exists, url_exists, url_mainpage, cookie_file, redirects,
url_mainpage,
cookie_file,
redirects,
) )
] ]
@@ -498,7 +400,7 @@ class Submitter:
if not found: if not found:
print( print(
f"{Fore.RED}[!] The check for site '{chosen_site.name}' failed!{Style.RESET_ALL}" f"Sorry, we couldn't find params to detect account presence/absence in {chosen_site.name}."
) )
print( print(
"Try to run this mode again and increase features count or choose others." "Try to run this mode again and increase features count or choose others."
@@ -522,18 +424,13 @@ class Submitter:
chosen_site.name = input("Change site name if you want: ") or chosen_site.name chosen_site.name = input("Change site name if you want: ") or chosen_site.name
chosen_site.tags = list(map(str.strip, input("Site tags: ").split(','))) chosen_site.tags = list(map(str.strip, input("Site tags: ").split(',')))
# rank = Submitter.get_alexa_rank(chosen_site.url_main) rank = Submitter.get_alexa_rank(chosen_site.url_main)
# if rank: if rank:
# print(f'New alexa rank: {rank}') print(f'New alexa rank: {rank}')
# chosen_site.alexa_rank = rank chosen_site.alexa_rank = rank
self.logger.debug(chosen_site.json) self.logger.debug(chosen_site.json)
site_data = chosen_site.strip_engine_data() site_data = chosen_site.strip_engine_data()
self.logger.debug(site_data.json) self.logger.debug(site_data.json)
self.db.update_site(site_data) self.db.update_site(site_data)
if self.args.db:
print(f"{Fore.GREEN}[+] Maigret DB is saved to {self.args.db}.{Style.RESET_ALL}")
self.db.save_to_file(self.args.db)
return True return True
-47
View File
@@ -1,47 +0,0 @@
# Download this first to avoid compatibility issues:
#
# sudo zypper in python3-devel
# sudo zypper in python3-dev
#
# Then run 'pip3 install -r opensuse.txt' as usual.
#
aiodns>=3.0.0
aiohttp>=3.8.6
aiohttp-socks>=0.7.1
arabic-reshaper~=3.0.0
async-timeout
attrs>=22.2.0
certifi>=2023.7.22
chardet>=5.0.0
colorama
future>=0.18.3
future-annotations>=1.0.0
html5lib>=1.1
idna>=3.4
Jinja2
lxml>=4.9.2
MarkupSafe
mock>=4.0.3
multidict
pycountry>=22.3.5
PyPDF2>=3.0.1
PySocks>=1.7.1
python-bidi>=0.4.2
requests
requests-futures>=1.0.0
six>=1.16.0
socid-extractor>=0.0.24
soupsieve>=2.3.2.post1
stem>=1.8.1
torrequest>=0.1.0
tqdm
typing-extensions
webencodings>=0.5.1
svglib
xhtml2pdf~=0.2.11
XMind>=1.2.0
yarl
networkx
pyvis>=0.2.1
reportlab
cloudscraper>=1.2.71
Generated
-2869
View File
File diff suppressed because it is too large Load Diff
+4 -4
View File
@@ -1,5 +1,5 @@
maigret @ https://github.com/soxoj/maigret/archive/refs/heads/main.zip maigret @ https://github.com/soxoj/maigret/archive/refs/heads/main.zip
pefile==2023.2.7 # do not bump while pyinstaller is 6.11.1, there is a conflict pefile==2022.5.30
psutil==6.1.0 psutil==5.9.2
pyinstaller==6.11.1 pyinstaller @ https://github.com/pyinstaller/pyinstaller/archive/develop.zip
pywin32-ctypes==0.2.3 pywin32-ctypes==0.2.0
-90
View File
@@ -1,90 +0,0 @@
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
name = "maigret"
version = "0.4.4"
description = "🕵️‍♂️ Collect a dossier on a person by username from thousands of sites."
authors = ["Soxoj <soxoj@protonmail.com>"]
readme = "README.md"
license = "MIT License"
homepage = "https://pypi.org/project/maigret"
documentation = "https://maigret.readthedocs.io"
repository = "https://github.com/soxoj/maigret"
classifiers = [
"Development Status :: 5 - Production/Stable",
"Programming Language :: Python :: 3",
"Intended Audience :: Information Technology",
"Operating System :: OS Independent",
"License :: OSI Approved :: MIT License",
"Natural Language :: English"
]
[tool.poetry.urls]
"Bug Tracker" = "https://github.com/soxoj/maigret/issues"
[tool.poetry.dependencies]
# poetry install
# Install only production dependencies:
# poetry install --without dev
# Install with dev dependencies:
# poetry install --with dev
python = "^3.10"
aiodns = "^3.0.0"
aiohttp = "^3.11.9"
aiohttp-socks = "^0.9.1"
arabic-reshaper = "^3.0.0"
async-timeout = "^5.0.1"
attrs = "^24.2.0"
certifi = "^2024.8.30"
chardet = "^5.0.0"
colorama = "^0.4.6"
future = "^1.0.0"
future-annotations= "^1.0.0"
html5lib = "^1.1"
idna = "^3.4"
Jinja2 = "^3.1.3"
lxml = "^5.3.0"
MarkupSafe = "^3.0.2"
mock = "^5.1.0"
multidict = "^6.0.4"
pycountry = "^24.6.1"
PyPDF2 = "^3.0.1"
PySocks = "^1.7.1"
python-bidi = "^0.6.3"
requests = "^2.31.0"
requests-futures = "^1.0.2"
six = "^1.16.0"
socid-extractor = "^0.0.26"
soupsieve = "^2.6"
stem = "^1.8.1"
torrequest = "^0.1.0"
alive_progress = "^3.2.0"
typing-extensions = "^4.8.0"
webencodings = "^0.5.1"
xhtml2pdf = "^0.2.11"
XMind = "^1.2.0"
yarl = "^1.18.3"
networkx = "^2.6.3"
pyvis = "^0.3.2"
reportlab = "^4.2.0"
cloudscraper = "^1.2.71"
[tool.poetry.group.dev.dependencies]
# How to add a new dev dependency: poetry add black --group dev
# Install dev dependencies with: poetry install --with dev
flake8 = "^7.1.1"
pytest = "^8.3.4"
pytest-asyncio = "^0.24.0"
pytest-cov = "^6.0.0"
pytest-httpserver = "^1.0.0"
pytest-rerunfailures = "^15.0"
reportlab = "^4.2.0"
mypy = "^1.13.0"
tuna = "^0.5.11"
[tool.poetry.scripts]
# Run with: poetry run maigret <username>
maigret = "maigret.maigret:run"
+39
View File
@@ -0,0 +1,39 @@
aiodns==3.0.0
aiohttp==3.8.3
aiohttp-socks==0.7.1
arabic-reshaper==2.1.4
async-timeout==4.0.2
attrs==22.1.0
certifi==2022.9.24
chardet==5.0.0
colorama==0.4.6
future==0.18.2
future-annotations==1.0.0
html5lib==1.1
idna==3.4
Jinja2==3.1.2
lxml==4.9.1
MarkupSafe==2.1.1
mock==4.0.3
multidict==6.0.2
pycountry==22.3.5
PyPDF2==2.10.8
PySocks==1.7.1
python-bidi==0.4.2
requests==2.28.1
requests-futures==1.0.0
six==1.16.0
socid-extractor>=0.0.21
soupsieve==2.3.2.post1
stem==1.8.1
torrequest==0.1.0
tqdm==4.64.1
typing-extensions==4.4.0
webencodings==0.5.1
xhtml2pdf==0.2.8
XMind==1.2.0
yarl==1.8.1
networkx==2.5.1
pyvis==0.2.1
reportlab==3.6.11
cloudscraper==1.2.66
+9
View File
@@ -0,0 +1,9 @@
[egg_info]
tag_build =
tag_date = 0
[flake8]
per-file-ignores = __init__.py:F401
[mypy]
ignore_missing_imports = True
+26
View File
@@ -0,0 +1,26 @@
from setuptools import (
setup,
find_packages,
)
with open('README.md') as fh:
long_description = fh.read()
with open('requirements.txt') as rf:
requires = rf.read().splitlines()
setup(name='maigret',
version='0.4.4',
description='Collect a dossier on a person by username from a huge number of sites',
long_description=long_description,
long_description_content_type="text/markdown",
url='https://github.com/soxoj/maigret',
install_requires=requires,
entry_points={'console_scripts': ['maigret = maigret.maigret:run']},
packages=find_packages(),
include_package_data=True,
author='Soxoj',
author_email='soxoj@protonmail.com',
license='MIT',
zip_safe=False)
+232 -292
View File
File diff suppressed because it is too large Load Diff
+35 -24
View File
@@ -1,32 +1,43 @@
title: Maigret name: maigret2
icon: static/maigret.png adopt-info: maigret2
name: maigret summary: SOCMINT / Instagram
summary: 🕵️‍♂️ Collect a dossier on a person by username from thousands of sites.
description: | description: |
**Maigret** collects a dossier on a person **by username only**, checking for accounts on a huge number of sites and gathering all the available information from web pages. No API keys required. Maigret is an easy-to-use and powerful fork of Sherlock. Test Test Test
Currently supported more than 3000 sites, search is launched against 500 popular sites in descending order of popularity by default. Also supported checking of Tor sites, I2P sites, and domains (via DNS resolving).
version: 0.4.4
license: MIT license: MIT
base: core22
confinement: strict
source-code: https://github.com/soxoj/maigret base: core20
issues: grade: stable
- https://github.com/soxoj/maigret/issues confinement: strict
donation: compression: lzo
- https://patreon.com/soxoj
contact: architectures:
- mailto:soxoj@protonmail.com - build-on: amd64
apps:
maigret2:
command: bin/maigret
environment:
LC_ALL: C.UTF-8
plugs:
- home
- network
parts: parts:
maigret: maigret2:
plugin: python plugin: python
source: . source: https://github.com/soxoj/maigret
source-type: git
type: app build-packages:
apps: - python3-pip
maigret: - python3-six
command: bin/maigret - python3
plugs: [ network, network-bind, home ]
stage-packages:
- python3
- python3-six
override-pull: |
snapcraftctl pull
snapcraftctl set-version "$(git describe --tags | sed 's/^v//' | cut -d "-" -f1)"
Binary file not shown.

Before

Width:  |  Height:  |  Size: 45 KiB

After

Width:  |  Height:  |  Size: 9.0 KiB

File diff suppressed because one or more lines are too long

Before

Width:  |  Height:  |  Size: 1.6 MiB

After

Width:  |  Height:  |  Size: 44 KiB

+8
View File
@@ -0,0 +1,8 @@
reportlab==3.6.11
flake8==5.0.4
pytest==7.2.0
pytest-asyncio==0.16.0;python_version<"3.7"
pytest-asyncio==0.20.1;python_version>="3.7"
pytest-cov==4.0.0
pytest-httpserver==1.0.6
pytest-rerunfailures==10.2
+1 -1
View File
@@ -19,7 +19,7 @@ empty_mark = Mark('', (), {})
def by_slow_marker(item): def by_slow_marker(item):
return item.get_closest_marker('slow', default=empty_mark).name return item.get_closest_marker('slow', default=empty_mark)
def pytest_collection_modifyitems(items): def pytest_collection_modifyitems(items):
+9 -28
View File
@@ -1,44 +1,25 @@
{ {
"engines": {}, "engines": {},
"sites": { "sites": {
"ValidActive": { "GooglePlayStore": {
"tags": ["global", "us"], "tags": ["global", "us"],
"disabled": false, "disabled": false,
"checkType": "status_code", "checkType": "status_code",
"alexaRank": 1, "alexaRank": 1,
"url": "https://play.google.com/store/apps/developer?id={username}", "url": "https://play.google.com/store/apps/developer?id={username}",
"urlMain": "https://play.google.com/store", "urlMain": "https://play.google.com/store",
"usernameClaimed": "OpenAI", "usernameClaimed": "Facebook_nosuchname",
"usernameUnclaimed": "noonewouldeverusethis7" "usernameUnclaimed": "noonewouldeverusethis7"
}, },
"InvalidActive": { "Reddit": {
"tags": ["global", "us"], "tags": ["news", "social", "us"],
"disabled": false,
"checkType": "status_code", "checkType": "status_code",
"alexaRank": 1, "presenseStrs": ["totalKarma"],
"url": "https://play.google.com/store/apps/dev?id={username}",
"urlMain": "https://play.google.com/store",
"usernameClaimed": "OpenAI",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"ValidInactive": {
"tags": ["global", "us"],
"disabled": true, "disabled": true,
"checkType": "status_code", "alexaRank": 17,
"alexaRank": 1, "url": "https://www.reddit.com/user/{username}",
"url": "https://play.google.com/store/apps/developer?id={username}", "urlMain": "https://www.reddit.com/",
"urlMain": "https://play.google.com/store", "usernameClaimed": "blue",
"usernameClaimed": "OpenAI",
"usernameUnclaimed": "noonewouldeverusethis7"
},
"InvalidInactive": {
"tags": ["global", "us"],
"disabled": true,
"checkType": "status_code",
"alexaRank": 1,
"url": "https://play.google.com/store/apps/dev?id={username}",
"urlMain": "https://play.google.com/store",
"usernameClaimed": "OpenAI",
"usernameUnclaimed": "noonewouldeverusethis7" "usernameUnclaimed": "noonewouldeverusethis7"
} }
} }
+1 -2
View File
@@ -41,8 +41,7 @@ async def test_import_aiohttp_cookies():
f.write(COOKIES_TXT) f.write(COOKIES_TXT)
cookie_jar = import_aiohttp_cookies(cookies_filename) cookie_jar = import_aiohttp_cookies(cookies_filename)
# new aiohttp support assert list(cookie_jar._cookies.keys()) == ['xss.is', 'httpbin.org']
assert list(cookie_jar._cookies.keys()) in (['xss.is', 'httpbin.org'], [('xss.is', '/'), ('httpbin.org', '/')], [('xss.is', ''), ('httpbin.org', '')])
url = 'https://httpbin.org/cookies' url = 'https://httpbin.org/cookies'
connector = aiohttp.TCPConnector(ssl=False) connector = aiohttp.TCPConnector(ssl=False)
+1 -2
View File
@@ -23,12 +23,11 @@ DEFAULT_ARGS: Dict[str, Any] = {
'no_progressbar': False, 'no_progressbar': False,
'parse_url': '', 'parse_url': '',
'pdf': False, 'pdf': False,
'permute': False,
'print_check_errors': False, 'print_check_errors': False,
'print_not_found': False, 'print_not_found': False,
'proxy': None, 'proxy': None,
'reports_sorting': 'default', 'reports_sorting': 'default',
'retries': 0, 'retries': 1,
'self_check': False, 'self_check': False,
'site_list': [], 'site_list': [],
'stats': False, 'stats': False,
-3
View File
@@ -13,7 +13,4 @@ def test_tags_validity(default_db):
if tag not in tags: if tag not in tags:
unknown_tags.add(tag) unknown_tags.add(tag)
# make sure all tags are known
# if you see "unchecked" tag error, please, do
# maigret --db `pwd`/maigret/resources/data.json --self-check --tag unchecked --use-disabled-sites
assert unknown_tags == set() assert unknown_tags == set()
+4 -4
View File
@@ -55,12 +55,12 @@ async def test_asyncio_progressbar_queue_executor():
executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=2) executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=2)
assert await executor.run(tasks) == [0, 1, 3, 2, 4, 6, 7, 5, 9, 8] assert await executor.run(tasks) == [0, 1, 3, 2, 4, 6, 7, 5, 9, 8]
assert executor.execution_time > 0.5 assert executor.execution_time > 0.5
assert executor.execution_time < 0.7 assert executor.execution_time < 0.6
executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=3) executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=3)
assert await executor.run(tasks) == [0, 3, 1, 4, 6, 2, 7, 9, 5, 8] assert await executor.run(tasks) == [0, 3, 1, 4, 6, 2, 7, 9, 5, 8]
assert executor.execution_time > 0.4 assert executor.execution_time > 0.4
assert executor.execution_time < 0.6 assert executor.execution_time < 0.5
executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=5) executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=5)
assert await executor.run(tasks) in ( assert await executor.run(tasks) in (
@@ -68,9 +68,9 @@ async def test_asyncio_progressbar_queue_executor():
[0, 3, 6, 1, 4, 9, 7, 2, 5, 8], [0, 3, 6, 1, 4, 9, 7, 2, 5, 8],
) )
assert executor.execution_time > 0.3 assert executor.execution_time > 0.3
assert executor.execution_time < 0.5 assert executor.execution_time < 0.4
executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=10) executor = AsyncioProgressbarQueueExecutor(logger=logger, in_parallel=10)
assert await executor.run(tasks) == [0, 3, 6, 9, 1, 4, 7, 2, 5, 8] assert await executor.run(tasks) == [0, 3, 6, 9, 1, 4, 7, 2, 5, 8]
assert executor.execution_time > 0.2 assert executor.execution_time > 0.2
assert executor.execution_time < 0.4 assert executor.execution_time < 0.3
+55 -12
View File
@@ -35,22 +35,65 @@ RESULTS_EXAMPLE = {
@pytest.mark.slow @pytest.mark.slow
@pytest.mark.asyncio def test_self_check_db_positive_disable(test_db):
async def test_self_check_db(test_db): logger = Mock()
# initalize logger to debug assert test_db.sites[0].disabled is False
loop = asyncio.get_event_loop()
loop.run_until_complete(
self_check(test_db, test_db.sites_dict, logger, silent=True)
)
assert test_db.sites[0].disabled is True
@pytest.mark.slow
@pytest.mark.skip(reason="broken, fixme")
def test_self_check_db_positive_enable(test_db):
logger = Mock() logger = Mock()
assert test_db.sites_dict['InvalidActive'].disabled is False test_db.sites[0].disabled = True
assert test_db.sites_dict['ValidInactive'].disabled is True test_db.sites[0].username_claimed = 'Skyeng'
assert test_db.sites_dict['ValidActive'].disabled is False assert test_db.sites[0].disabled is True
assert test_db.sites_dict['InvalidInactive'].disabled is True
await self_check(test_db, test_db.sites_dict, logger, silent=False) loop = asyncio.get_event_loop()
loop.run_until_complete(
self_check(test_db, test_db.sites_dict, logger, silent=True)
)
assert test_db.sites_dict['InvalidActive'].disabled is True assert test_db.sites[0].disabled is False
assert test_db.sites_dict['ValidInactive'].disabled is False
assert test_db.sites_dict['ValidActive'].disabled is False
assert test_db.sites_dict['InvalidInactive'].disabled is True @pytest.mark.slow
def test_self_check_db_negative_disabled(test_db):
logger = Mock()
test_db.sites[0].disabled = True
assert test_db.sites[0].disabled is True
loop = asyncio.get_event_loop()
loop.run_until_complete(
self_check(test_db, test_db.sites_dict, logger, silent=True)
)
assert test_db.sites[0].disabled is True
@pytest.mark.skip(reason='broken, fixme')
@pytest.mark.slow
def test_self_check_db_negative_enabled(test_db):
logger = Mock()
test_db.sites[0].disabled = False
test_db.sites[0].username_claimed = 'Skyeng'
assert test_db.sites[0].disabled is False
loop = asyncio.get_event_loop()
loop.run_until_complete(
self_check(test_db, test_db.sites_dict, logger, silent=True)
)
assert test_db.sites[0].disabled is False
@pytest.mark.slow @pytest.mark.slow
-17
View File
@@ -202,20 +202,3 @@ def test_get_url_template():
}, },
) )
assert site.get_url_template() == "SUBDOMAIN" assert site.get_url_template() == "SUBDOMAIN"
def test_has_site_url_or_name(default_db):
# by the same url or partial match
assert default_db.has_site("https://aback.com.ua/user/") == True
assert default_db.has_site("https://aback.com.ua") == True
# acceptable partial match
assert default_db.has_site("https://aback.com.ua/use") == True
assert default_db.has_site("https://aback.com") == True
# by name
assert default_db.has_site("Aback") == True
# false
assert default_db.has_site("https://aeifgoai3h4g8a3u4g5") == False
assert default_db.has_site("aeifgoai3h4g8a3u4g5") == False
+14 -6
View File
@@ -3,13 +3,23 @@
This module generates the listing of supported sites in file `SITES.md` This module generates the listing of supported sites in file `SITES.md`
and pretty prints file with sites data. and pretty prints file with sites data.
""" """
import aiohttp
import asyncio import asyncio
import json import json
import sys
import requests
import logging import logging
import threading
import xml.etree.ElementTree as ET
from datetime import datetime
from argparse import ArgumentParser, RawDescriptionHelpFormatter from argparse import ArgumentParser, RawDescriptionHelpFormatter
from maigret.maigret import get_response import tqdm.asyncio
from maigret.sites import MaigretDatabase, MaigretEngine
from maigret.maigret import get_response, site_self_check
from maigret.sites import MaigretSite, MaigretDatabase, MaigretEngine
from maigret.utils import CaseConverter
async def check_engine_of_site(site_name, sites_with_engines, future, engine_name, semaphore, logger): async def check_engine_of_site(site_name, sites_with_engines, future, engine_name, semaphore, logger):
async with semaphore: async with semaphore:
@@ -88,10 +98,8 @@ if __name__ == '__main__':
tasks.append(future) tasks.append(future)
# progress bar # progress bar
with alive_progress(len(tasks), title='Checking sites') as progress: for f in tqdm.asyncio.tqdm.as_completed(tasks):
for f in asyncio.as_completed(tasks): loop.run_until_complete(f)
loop.run_until_complete(f)
progress()
print(f'Total detected {len(new_engine_sites)} sites on engine {engine_name}') print(f'Total detected {len(new_engine_sites)} sites on engine {engine_name}')
# dict with new found engine sites # dict with new found engine sites
+3 -5
View File
@@ -3,7 +3,7 @@ import json
import random import random
import re import re
import alive_progress import tqdm.asyncio
from mock import Mock from mock import Mock
import requests import requests
@@ -181,7 +181,7 @@ if __name__ == '__main__':
raw_maigret_data = json.dumps({site.name: site.json for site in sites_subset}) raw_maigret_data = json.dumps({site.name: site.json for site in sites_subset})
new_sites = [] new_sites = []
for site in alive_progress.alive_it(urls): for site in tqdm.asyncio.tqdm(urls):
site_lowercase = site.lower() site_lowercase = site.lower()
domain_raw = URL_RE.sub('', site_lowercase).strip().strip('/') domain_raw = URL_RE.sub('', site_lowercase).strip().strip('/')
@@ -271,9 +271,7 @@ if __name__ == '__main__':
future = asyncio.ensure_future(check_coro) future = asyncio.ensure_future(check_coro)
tasks.append(future) tasks.append(future)
with alive_progress(len(tasks), title='Checking sites') as progress: for f in tqdm.asyncio.tqdm.as_completed(tasks, timeout=TIMEOUT):
for f in asyncio.as_completed(tasks):
progress()
try: try:
loop.run_until_complete(f) loop.run_until_complete(f)
except asyncio.exceptions.TimeoutError: except asyncio.exceptions.TimeoutError:
+4 -4
View File
@@ -3,12 +3,13 @@
This module generates the listing of supported sites in file `SITES.md` This module generates the listing of supported sites in file `SITES.md`
and pretty prints file with sites data. and pretty prints file with sites data.
""" """
import json
import sys import sys
import requests import requests
import logging import logging
import threading import threading
import xml.etree.ElementTree as ET import xml.etree.ElementTree as ET
from datetime import datetime, timezone from datetime import datetime
from argparse import ArgumentParser, RawDescriptionHelpFormatter from argparse import ArgumentParser, RawDescriptionHelpFormatter
from maigret.maigret import MaigretDatabase from maigret.maigret import MaigretDatabase
@@ -26,10 +27,9 @@ RANKS.update({
SEMAPHORE = threading.Semaphore(20) SEMAPHORE = threading.Semaphore(20)
def get_rank(domain_to_query, site, print_errors=True): def get_rank(domain_to_query, site, print_errors=True):
with SEMAPHORE: with SEMAPHORE:
# Retrieve ranking data via alexa API #Retrieve ranking data via alexa API
url = f"http://data.alexa.com/data?cli=10&url={domain_to_query}" url = f"http://data.alexa.com/data?cli=10&url={domain_to_query}"
xml_data = requests.get(url).text xml_data = requests.get(url).text
root = ET.fromstring(xml_data) root = ET.fromstring(xml_data)
@@ -137,7 +137,7 @@ Rank data fetched from Alexa by domains.
site_file.write(f'1. {favicon} [{site}]({url_main})*: top {valid_rank}{tags}*{note}\n') site_file.write(f'1. {favicon} [{site}]({url_main})*: top {valid_rank}{tags}*{note}\n')
db.update_site(site) db.update_site(site)
site_file.write(f'\nThe list was updated at ({datetime.now(timezone.utc).date()})\n') site_file.write(f'\nThe list was updated at ({datetime.utcnow()} UTC)\n')
db.save_to_file(args.base_file) db.save_to_file(args.base_file)
statistics_text = db.get_db_stats(is_markdown=True) statistics_text = db.get_db_stats(is_markdown=True)
+27 -13
View File
@@ -1,38 +1,56 @@
#!/usr/bin/env python3
import asyncio import asyncio
import logging import logging
import maigret import maigret
# top popular sites from the Maigret database
TOP_SITES_COUNT = 300 TOP_SITES_COUNT = 300
# Maigret HTTP requests timeout
TIMEOUT = 10 TIMEOUT = 10
# max parallel requests
MAX_CONNECTIONS = 50 MAX_CONNECTIONS = 50
def main(): if __name__ == '__main__':
# setup logging and asyncio
logger = logging.getLogger('maigret') logger = logging.getLogger('maigret')
logger.setLevel(logging.WARNING) logger.setLevel(logging.WARNING)
loop = asyncio.get_event_loop() loop = asyncio.get_event_loop()
# setup Maigret
db = maigret.MaigretDatabase().load_from_file('./maigret/resources/data.json') db = maigret.MaigretDatabase().load_from_file('./maigret/resources/data.json')
# also can be downloaded from web
# db = MaigretDatabase().load_from_url(MAIGRET_DB_URL)
# user input
username = input('Enter username to search: ') username = input('Enter username to search: ')
sites_count = int(input(
sites_count_raw = input(
f'Select the number of sites to search ({TOP_SITES_COUNT} for default, {len(db.sites_dict)} max): ' f'Select the number of sites to search ({TOP_SITES_COUNT} for default, {len(db.sites_dict)} max): '
)) or TOP_SITES_COUNT )
sites_count = int(sites_count_raw) or TOP_SITES_COUNT
sites = db.ranked_sites_dict(top=sites_count) sites = db.ranked_sites_dict(top=sites_count)
show_progressbar = input('Do you want to show a progressbar? [Yn] ').lower() != 'n' show_progressbar_raw = input('Do you want to show a progressbar? [Yn] ')
extract_info = input( show_progressbar = show_progressbar_raw.lower() != 'n'
extract_info_raw = input(
'Do you want to extract additional info from accounts\' pages? [Yn] ' 'Do you want to extract additional info from accounts\' pages? [Yn] '
).lower() != 'n' )
use_notifier = input( extract_info = extract_info_raw.lower() != 'n'
use_notifier_raw = input(
'Do you want to use notifier for displaying results while searching? [Yn] ' 'Do you want to use notifier for displaying results while searching? [Yn] '
).lower() != 'n' )
use_notifier = use_notifier_raw.lower() != 'n'
notifier = None notifier = None
if use_notifier: if use_notifier:
notifier = maigret.Notifier(print_found_only=True, skip_check_errors=True) notifier = maigret.Notifier(print_found_only=True, skip_check_errors=True)
# search!
search_func = maigret.search( search_func = maigret.search(
username=username, username=username,
site_dict=sites, site_dict=sites,
@@ -40,7 +58,7 @@ def main():
logger=logger, logger=logger,
max_connections=MAX_CONNECTIONS, max_connections=MAX_CONNECTIONS,
query_notify=notifier, query_notify=notifier,
no_progressbar=not show_progressbar, no_progressbar=(not show_progressbar),
is_parsing_enabled=extract_info, is_parsing_enabled=extract_info,
) )
@@ -51,7 +69,3 @@ def main():
for sitename, data in results.items(): for sitename, data in results.items():
is_found = data['status'].is_found() is_found = data['status'].is_found()
print(f'{sitename} - {"Found!" if is_found else "Not found"}') print(f'{sitename} - {"Found!" if is_found else "Not found"}')
if __name__ == '__main__':
main()