async-link-runner

A CLI-based Python tool for automated, asynchronous browsing of a list of URLs in a headless browser. The tool executes visits in cycles, supports controlled concurrency, logs outcomes, and rotates IP after each cycle using Tor.

Features

  • 🚀 Async Execution - Visit multiple URLs concurrently using Playwright and asyncio
  • 🔄 Cycle Management - Run multiple cycles with automatic Tor IP rotation between cycles
  • 📊 Structured Logging - JSON lines format + console output with rich formatting
  • ⚙️ Configurable - Control concurrency, delays, timeouts, and retries via CLI
  • 🛑 Graceful Shutdown - Handle Ctrl+C cleanly, properly closing browser instances
  • 🔁 Retry Logic - Configurable retry mechanism for failed requests
  • 🌐 Headless Browsing - Realistic page loads with headless Chromium
  • 🕵️ Anti-Detection Features - Stealth mode, randomized user-agents, realistic browser behavior
  • 🔐 IP Anonymization - Tor SOCKS5 proxy with automatic circuit rotation

Requirements

  • OS: Kali Linux / Ubuntu (latest LTS recommended)
  • Python: 3.10+
  • Tor: Running Tor daemon for IP rotation
  • pip: Python package manager

Installation

1. Clone or Navigate to Project

git clone https://github.com/Danyalkhattak/async-link-runner.git
cd async-link-runner

2. Create Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt
playwright install chromium

4. Prepare URLs File

Create or edit urls.txt with one URL per line:

https://example.com
https://example.org
https://example.net
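
A minimal sketch of how such a file could be loaded and validated. `load_urls` and `is_valid_url` are hypothetical helpers, not necessarily what main.py does. The sketch assumes that blank lines and `#` comments are ignored and that only absolute http/https URLs are accepted.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_valid_url(line: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parsed = urlparse(line)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def load_urls(path: str) -> list[str]:
    """Read one URL per line, skipping blanks, comments, and invalid entries."""
    urls = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are ignored
        if is_valid_url(line):
            urls.append(line)
        else:
            print(f"warning: skipping invalid URL: {line!r}")
    return urls
```

Calling `load_urls("urls.txt")` on the file above would return the three URLs in order.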

Usage

Basic Command

python main.py --file urls.txt --cycles 5 --concurrency 3

Full Example

python main.py \
  --file urls.txt \
  --cycles 20 \
  --concurrency 20 \
  --delay-min 1 \
  --delay-max 5 \
  --timeout 10000 \
  --retries 3 \
  --headless true

CLI Arguments

Flag           Type   Default  Description
--file         str    -        Path to file containing URLs (one per line)
--cycles       int    -        Number of cycles (omit for infinite)
--concurrency  int    5        Parallel browser pages
--delay-min    float  1        Min delay between visits (seconds)
--delay-max    float  5        Max delay between visits (seconds)
--timeout      int    10000    Page load timeout (ms)
--retries      int    3        Retry attempts per URL
--headless     bool   true     Run browser in headless mode
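
For illustration, the flags above could be declared with `argparse` roughly as follows. This is a sketch rather than the actual main.py. It assumes `--file` is mandatory (the table lists no default for it) and that `--headless` accepts the literal strings `true` and `false`. `build_parser` and `str_to_bool` are hypothetical names.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse 'true'/'false' style values, as in `--headless true`."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    # Assumption: --file is required because no default is documented.
    parser.add_argument("--file", type=str, required=True,
                        help="Path to file containing URLs (one per line)")
    # None stands for "run cycles until interrupted".
    parser.add_argument("--cycles", type=int, default=None,
                        help="Number of cycles (omit for infinite)")
    parser.add_argument("--concurrency", type=int, default=5,
                        help="Parallel browser pages")
    parser.add_argument("--delay-min", type=float, default=1.0,
                        help="Min delay between visits (seconds)")
    parser.add_argument("--delay-max", type=float, default=5.0,
                        help="Max delay between visits (seconds)")
    parser.add_argument("--timeout", type=int, default=10000,
                        help="Page load timeout (ms)")
    parser.add_argument("--retries", type=int, default=3,
                        help="Retry attempts per URL")
    parser.add_argument("--headless", type=str_to_bool, default=True,
                        help="Run browser in headless mode")
    return parser
```

With this layout, `build_parser().parse_args(["--file", "urls.txt"])` leaves `cycles` as `None`, which matches the "omit for infinite" behavior described in the table.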

Output

Console Output

Real-time progress with rich formatting:

✓ https://example.com
✓ https://example.org
✗ https://example.net - Timeout

Logs Directory

  • logs/visits.jsonl - JSON lines format (one entry per visit)
  • logs/visits.log - Consolidated log file

JSON Log Format

{"url": "https://example.com", "status": "success", "timestamp": "2026-04-12T10:00:00Z"}
{"url": "https://site.com", "status": "fail", "error": "Timeout", "timestamp": "2026-04-12T10:00:05Z"}
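
Because each line is a standalone JSON object, the log is easy to post-process. The following sketch counts visits by status and tallies error messages. `summarize` is a hypothetical helper, and the field names are taken from the sample entries above.

```python
import json
from collections import Counter


def summarize(lines):
    """Count visits by status and tally error messages from JSON lines."""
    statuses = Counter()
    errors = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        entry = json.loads(line)
        statuses[entry["status"]] += 1
        if "error" in entry:
            errors[entry["error"]] += 1
    return statuses, errors


sample = [
    '{"url": "https://example.com", "status": "success", "timestamp": "2026-04-12T10:00:00Z"}',
    '{"url": "https://site.com", "status": "fail", "error": "Timeout", "timestamp": "2026-04-12T10:00:05Z"}',
]
statuses, errors = summarize(sample)
print(dict(statuses))  # {'success': 1, 'fail': 1}
print(dict(errors))    # {'Timeout': 1}
```

To run it against a real log, pass an open file instead of the sample list, for example `summarize(open("logs/visits.jsonl", encoding="utf-8"))`.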

How It Works

  1. Load URLs - Reads and validates URLs from file
  2. Start Cycle - Launches Chromium browser
  3. Concurrent Visits - Opens pages in parallel (respecting concurrency limit)
  4. Randomized User Agents - Each visit uses a different realistic browser user-agent
  5. Stealth Mode - Applies anti-detection measures to bypass bot protection
  6. Realistic Behavior - Simulates user interactions (scrolling, delays, page interactions)
  7. Tor IP Routing - All traffic routed through Tor SOCKS5 proxy
  8. Logging - Records success/failure with timestamps
  9. IP Rotation - Rotates Tor IP after cycle completion
  10. Repeat - Continues to next cycle until complete or interrupted
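
The concurrency and cycle steps above can be sketched with an `asyncio.Semaphore`. This is an illustration only: `visit` is a stand-in that sleeps instead of opening a Playwright page, the Tor steps are left out, and `run_cycle` and `run_cycles` are hypothetical names rather than functions known to exist in main.py.

```python
import asyncio
import random


async def visit(url: str) -> str:
    """Stand-in for a real page visit; the actual tool uses Playwright here."""
    await asyncio.sleep(0.01)
    return "success"


async def run_cycle(urls, concurrency=5, delay_min=0.0, delay_max=0.0, visit_fn=visit):
    """Visit every URL once, with at most `concurrency` visits in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url):
        async with semaphore:
            # Random pause before each visit, mirroring --delay-min/--delay-max.
            await asyncio.sleep(random.uniform(delay_min, delay_max))
            try:
                status = await visit_fn(url)
            except Exception as exc:
                status = f"fail: {exc}"
            return url, status

    return await asyncio.gather(*(bounded(u) for u in urls))


async def run_cycles(urls, cycles=None, **kwargs):
    """Run `cycles` cycles in sequence; None means loop until cancelled."""
    completed = 0
    while cycles is None or completed < cycles:
        results = await run_cycle(urls, **kwargs)
        completed += 1
        failed = sum(1 for _, status in results if status != "success")
        print(f"cycle {completed}: {len(results)} visits, {failed} failed")
    return completed
```

For example, `asyncio.run(run_cycles(["https://example.com"], cycles=2, concurrency=3))` runs two cycles. The semaphore is what keeps the number of simultaneous visits at or below the `--concurrency` value, however long the URL list is.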

Anti-Detection Features

This tool uses multiple stealth measures to avoid detection:

1. Playwright Stealth Plugin

  • Hides automation indicators from JavaScript detection
  • Removes navigator.webdriver flag
  • Masks Chromium fingerprints

2. Randomized User-Agents

Each request uses a different realistic browser user-agent:

  • Chrome (Windows, macOS, Linux)
  • Firefox (Windows, macOS, Linux)
  • Safari (macOS)

3. Realistic Browser Behavior

  • Random delays between page visits (1-5 seconds configurable)
  • Page scrolling and interactions
  • Network idle waiting instead of just DOM content
  • Variable retry delays

4. Tor IP Anonymization

  • Routes all traffic through Tor SOCKS5 proxy
  • Changes exit IP between cycles
  • Different IP = different "user" for each cycle

5. Per-Context Isolation

  • Each URL visited in separate browser context
  • Isolated cookies, storage, and state
  • No fingerprint carryover between visits

Tor Setup

For IP rotation to work, Tor must be running with control port enabled.

1. Install Tor

# Linux (Debian/Ubuntu/Kali)
sudo apt-get install tor

# macOS
brew install tor

2. Configure Tor Control Port

Edit /etc/tor/torrc (Linux) or /usr/local/etc/tor/torrc (macOS):

# Find and uncomment/add these lines:
ControlPort 9051
CookieAuthentication 1

Cookie authentication needs no password. To set the cookie file path explicitly:

CookieAuthentication 1
CookieAuthFile /var/run/tor/control.authcookie

3. Start Tor Daemon

# Linux
sudo service tor start

# Or run in foreground (debugging)
tor

# Check if running
ps aux | grep tor

4. Verify Tor is Working

# Test SOCKS5 proxy
curl -x socks5://127.0.0.1:9050 https://check.torproject.org

# Test control port (requires stem installed)
python -c "from stem.control import Controller; Controller.from_port().authenticate()"

5. Run async-link-runner

Once Tor is running, the tool will:

  • Route all traffic through Tor SOCKS5 proxy (port 9050)
  • Rotate your IP between each cycle via Tor control port (9051)
  • Log IP rotation status

Note: IP rotation may take 3-5 seconds per rotation due to Tor circuit building.

Graceful Shutdown

Press Ctrl+C to shut down gracefully. The tool then:

  • Closes all open browser pages and contexts
  • Flushes all logs
  • Exits cleanly without crashes
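
One common way to get this behavior with asyncio is to turn Ctrl+C into a stop flag and do the cleanup in a `finally` block. The sketch below assumes a Unix event loop (`add_signal_handler` is not available on Windows) and uses placeholder cleanup steps. All function names are hypothetical and do not describe the actual main.py.

```python
import asyncio
import signal


def install_sigint_handler(stop: asyncio.Event) -> None:
    """Turn Ctrl+C into a stop request instead of a KeyboardInterrupt (Unix only)."""
    asyncio.get_running_loop().add_signal_handler(signal.SIGINT, stop.set)


async def run_until_stopped(stop: asyncio.Event, cleanup_log: list) -> None:
    """Placeholder visit loop: does small units of work until asked to stop."""
    try:
        while not stop.is_set():
            await asyncio.sleep(0.01)  # stand-in for one page visit
    finally:
        # Stand-ins for closing pages/contexts and flushing logs.
        cleanup_log.append("close browser")
        cleanup_log.append("flush logs")


async def main() -> None:
    stop = asyncio.Event()
    install_sigint_handler(stop)
    cleanup_log: list = []
    await run_until_stopped(stop, cleanup_log)
    print("shutdown steps:", cleanup_log)
```

Running `asyncio.run(main())` and pressing Ctrl+C lets the loop finish its current step, run the cleanup, and return normally instead of unwinding with a traceback.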

Error Handling

  • Failed URL Visit: Retries per --retries argument before logging as failed
  • Network Errors: Caught and logged without crashing the application
  • Browser Launch Failure: Logged and cycle aborts gracefully
  • Invalid URLs: Skipped with warning during loading
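
A retry wrapper along these lines would produce the behavior described above. It is a sketch under stated assumptions: `retries` is read as the number of extra attempts after the first one, the wait between attempts grows with the attempt number plus some jitter, and the returned dict mirrors the JSON log fields. `visit_with_retries` and `visit_fn` are hypothetical names.

```python
import asyncio
import random
from datetime import datetime, timezone


def _now() -> str:
    """UTC timestamp in the same shape as the log examples."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


async def visit_with_retries(url, visit_fn, retries=3, base_delay=0.5):
    """Call visit_fn(url) up to 1 + retries times and return a log-style dict."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            await visit_fn(url)
            return {"url": url, "status": "success", "timestamp": _now()}
        except Exception as exc:  # network errors are recorded, not raised
            last_error = exc
            if attempt < retries:
                # Variable delay: grows with each attempt, with random jitter.
                jitter = random.uniform(0.5, 1.5)
                await asyncio.sleep(base_delay * (attempt + 1) * jitter)
    message = str(last_error) or type(last_error).__name__
    return {"url": url, "status": "fail", "error": message, "timestamp": _now()}
```

Because every exception is caught inside the wrapper, one failing URL yields a `fail` entry rather than aborting the other visits in the cycle.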

Performance Tuning

High Volume Requests

# Omit --cycles to run indefinitely
python main.py --file urls.txt --concurrency 20 --delay-min 0.5 --delay-max 2

Conservative Testing

python main.py --file urls.txt --cycles 2 --concurrency 2 --delay-min 3 --delay-max 5

Memory Optimization

  • Browser contexts are closed immediately after each URL visit
  • Concurrency limit prevents resource exhaustion
  • Adjust --concurrency based on available RAM

Troubleshooting

"File not found" Error

Ensure urls.txt exists in the same directory as main.py, or provide full path:

python main.py --file /path/to/urls.txt

Browser Launch Failures

Ensure Playwright is properly installed:

playwright install chromium

Tor IP Rotation Not Working

Error: "Could not connect to Tor control port 9051"

  1. Check if Tor is running:
sudo service tor status
# or
ps aux | grep tor
  2. Verify control port is enabled in /etc/tor/torrc:
ControlPort 9051
CookieAuthentication 1
  3. Restart Tor after config changes:
sudo service tor restart
  4. Test control port connectivity:
nc -zv 127.0.0.1 9051
  5. Verify SOCKS5 proxy works:
curl -x socks5://127.0.0.1:9050 https://check.torproject.org
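
If `nc` is not installed, the same connectivity check can be done from Python with the standard library. This sketch only tests whether the ports accept TCP connections; it does not authenticate to Tor or send any traffic through it. `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports as documented above: 9050 for SOCKS5, 9051 for the control port.
    for name, port in (("SOCKS", 9050), ("control", 9051)):
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"Tor {name} port {port}: {state}")
```

A "closed" result for 9051 while 9050 is open usually points to the `ControlPort` line being missing or commented out in torrc.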

Error: "Could not connect to Tor SOCKS port 9050"

The browser can't reach the Tor SOCKS5 proxy. Ensure:

  • Tor is running and listening on localhost:9050
  • No firewall blocking 127.0.0.1:9050
  • Tor wasn't started with restricted binding

Try restarting Tor:

sudo service tor restart
sleep 2
python main.py --file urls.txt --cycles 1

Slow Performance

  • Reduce --concurrency if system runs out of memory
  • Increase --delay-min/--delay-max if being rate-limited
  • Reduce --timeout if pages hang indefinitely

Project Structure

async-link-runner/
├── main.py                    # Main application (entry point)
├── requirements.txt           # Python dependencies
├── urls.txt                   # List of URLs to visit
├── logs/
│   ├── visits.jsonl          # JSON lines format logs
│   └── visits.log            # Consolidated logs
├── README.md                  # This file
└── REQUIREMENTS.md            # Original specifications

Dependencies

  • playwright - Headless browser automation
  • aiofiles - Async file I/O for logging
  • rich - Enhanced console formatting and logging

Disclaimer

This tool is intended strictly for legitimate testing, monitoring, and development purposes. Users are responsible for complying with website Terms of Service and applicable laws.

Do NOT use this tool to:

  • Evade safeguards or security measures
  • Generate artificial traffic or inflate metrics
  • Perform denial-of-service (DoS) attacks
  • Violate website terms of service
  • Misuse network resources

Unauthorized access or abuse of computer systems is illegal. Use responsibly.

License

MIT License - See LICENSE file for details

Contributing

Contributions are welcome! Please submit issues and pull requests.

Support

For issues or questions, please open an issue on the repository.
