This comprehensive guide will walk you through setting up the Job Prospect Automation system from scratch.
Before you begin, ensure you have:
- Python 3.8 or higher installed on your system
- pip package manager (usually comes with Python)
- Git for cloning the repository
- Internet connection for API access
- API accounts for required services (details below)
- Operating System: Windows, macOS, or Linux
- RAM: Minimum 4GB (8GB recommended for large operations)
- Storage: At least 1GB free space
- Network: Stable internet connection for API calls
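To confirm the Python requirement programmatically, a quick standard-library check (illustrative, not part of the project):

```python
import sys

# The guide requires Python 3.8 or higher; fail fast if the
# interpreter is older.
if sys.version_info < (3, 8):
    raise SystemExit("Python 3.8 or higher is required")
print(f"Python {sys.version_info.major}.{sys.version_info.minor} OK")
```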
# Clone the repository
git clone <repository-url>
cd job-prospect-automation
# Verify the structure
ls -la

Expected directory structure:
job-prospect-automation/
├── controllers/
├── services/
├── models/
├── utils/
├── tests/
├── examples/
├── docs/
├── requirements.txt
├── cli.py
├── main.py
└── README.md
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
# Verify activation (should show venv path)
which python

# Create conda environment
conda create -n job-prospect-automation python=3.9
# Activate environment
conda activate job-prospect-automation

# Install all required packages
pip install -r requirements.txt
# Verify installation
pip list | grep -E "(notion|hunter|openai|scrapy|crawl4ai)"

If you encounter issues, try:
# Upgrade pip first
pip install --upgrade pip
# Install with verbose output
pip install -r requirements.txt -v

# Test CLI access
python cli.py --help
# Test dry-run mode
python cli.py --dry-run status

Expected output should show the CLI help and a dry-run confirmation.
You need to obtain API keys from several services for full functionality:
- Notion Integration Token - For data storage and organization
- Hunter.io API Key - For email discovery and verification
- Azure OpenAI API Key - For AI-powered email generation and data parsing (recommended)
- OpenAI API Key - Alternative to Azure OpenAI for AI features
- Resend API Key - For automated email sending and delivery tracking
Steps:
- Go to Notion Developers
- Click "Create new integration"
- Fill in integration details:
- Name: "Job Prospect Automation"
- Associated workspace: Select your workspace
- Capabilities: Read, Insert, Update content
- Click "Submit"
- Copy the "Internal Integration Token"
Important: After creating the integration, you must:
- Share your Notion workspace with the integration
- Or let the system create a new database (recommended for first-time users)
Steps:
- Sign up at Hunter.io
- Verify your email address
- Go to your dashboard
- Navigate to "API" section
- Copy your API key
Free Tier Limits:
- 25 requests per month
- 10 requests per minute
- Email finder and verifier access
Paid Plans: Higher limits and additional features
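The system spaces out Hunter.io calls to stay within these per-minute limits (see HUNTER_REQUESTS_PER_MINUTE below). As a rough illustration of what such throttling looks like — not the project's actual implementation — a minimal per-minute limiter:

```python
import time

class RateLimiter:
    """Spaces out calls so at most `rpm` requests happen per minute.
    Illustrative sketch only; the project has its own rate limiting."""

    def __init__(self, rpm: int):
        self.min_interval = 60.0 / rpm  # seconds between requests
        self.last_call = 0.0

    def wait(self) -> None:
        # Sleep just long enough to keep the minimum spacing.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Free tier: 10 requests/minute -> one request every 6 seconds.
limiter = RateLimiter(10)
print(limiter.min_interval)  # 6.0
```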
Why Azure OpenAI?
- Enterprise-grade reliability and security
- Better rate limits and availability
- Predictable pricing with reserved capacity
- Regional deployment options
Steps:
- Create Azure account at Azure Portal
- Apply for Azure OpenAI access at aka.ms/oai/access
- Create Azure OpenAI resource in your subscription
- Deploy GPT-4 models for email generation and AI parsing
- Get API key and endpoint from Azure Portal
Configuration:
USE_AZURE_OPENAI=true
AZURE_OPENAI_API_KEY=your_azure_api_key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4-deployment

Steps:
- Create account at OpenAI Platform
- Add payment method (required for API access)
- Go to "API Keys" section
- Click "Create new secret key"
- Copy the key immediately (won't be shown again)
Configuration:
USE_AZURE_OPENAI=false
OPENAI_API_KEY=sk-your_openai_key_here

Usage Costs:
- GPT-3.5-turbo: ~$0.002 per 1K tokens
- GPT-4: ~$0.03 per 1K tokens
- Typical email generation: $0.01-0.05 per email
- AI parsing: ~$0.02-0.10 per operation
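These per-token prices make spend easy to estimate. A back-of-the-envelope calculator (prices are the approximate figures above and change over time):

```python
def estimate_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Approximate API cost for a call totalling `tokens` tokens."""
    return tokens / 1000 * price_per_1k_tokens

# A ~500-token generated email at GPT-4's ~$0.03 per 1K tokens:
print(estimate_cost(500, 0.03))   # 0.015
# The same email on GPT-3.5-turbo at ~$0.002 per 1K tokens:
print(estimate_cost(500, 0.002))  # 0.001
```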
Why Resend?
- High email deliverability rates
- Real-time tracking and analytics
- Webhook support for delivery status
- Developer-friendly API
Steps:
- Sign up at Resend
- Verify your email address
- Add and verify your domain (recommended)
- Create API key with sending permissions
- Configure DNS records for better deliverability
Configuration:
RESEND_API_KEY=re_your_resend_api_key
SENDER_EMAIL=your-name@yourdomain.com
SENDER_NAME=Your Full Name

- Copy the example file:
  cp .env.example .env
- Edit the .env file:
  # On Windows
  notepad .env
  # On macOS/Linux
  nano .env
- Add your API keys:

  # Required API Keys
  NOTION_TOKEN=secret_your_actual_notion_token_here
  HUNTER_API_KEY=your_actual_hunter_api_key_here

  # Azure OpenAI Configuration (Recommended)
  USE_AZURE_OPENAI=true
  AZURE_OPENAI_API_KEY=your_azure_openai_api_key_here
  AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
  AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4-deployment
  AZURE_OPENAI_API_VERSION=2024-02-15-preview

  # Alternative: Regular OpenAI (if not using Azure)
  # USE_AZURE_OPENAI=false
  # OPENAI_API_KEY=sk-your_actual_openai_key_here

  # Email Sending Configuration (Optional but Recommended)
  RESEND_API_KEY=re_your_resend_api_key_here
  SENDER_EMAIL=your-name@yourdomain.com
  SENDER_NAME=Your Full Name
  REPLY_TO_EMAIL=your-name@yourdomain.com

  # Enhanced AI Features
  ENABLE_AI_PARSING=true
  ENABLE_PRODUCT_ANALYSIS=true
  ENHANCED_PERSONALIZATION=true

  # Processing Configuration
  SCRAPING_DELAY=2.0
  HUNTER_REQUESTS_PER_MINUTE=10
  RESEND_REQUESTS_PER_MINUTE=100
  MAX_PRODUCTS_PER_RUN=50
  MAX_PROSPECTS_PER_COMPANY=10

  # Email Configuration
  EMAIL_TEMPLATE_TYPE=professional
  PERSONALIZATION_LEVEL=high
  MAX_EMAIL_LENGTH=500

  # Workflow Configuration
  ENABLE_ENHANCED_WORKFLOW=true
  AUTO_SEND_EMAILS=false
  EMAIL_REVIEW_REQUIRED=true
- Secure the file:
  # On Unix systems
  chmod 600 .env
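The .env format itself is just KEY=value lines with # comments. As an illustration of how such a file is interpreted (the project presumably uses a proper loader such as python-dotenv; this parser is only a sketch):

```python
def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = "# keys\nNOTION_TOKEN=secret_abc\nSCRAPING_DELAY=2.0\n"
print(parse_env(sample))
# {'NOTION_TOKEN': 'secret_abc', 'SCRAPING_DELAY': '2.0'}
```

Note that real loaders can be stricter — which is why the troubleshooting list below warns about extra spaces around the = sign.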
- Generate a template:
  python cli.py init-config config.yaml
- Edit the configuration:

  # API Keys (Required)
  NOTION_TOKEN: "secret_your_actual_notion_token_here"
  HUNTER_API_KEY: "your_actual_hunter_api_key_here"
  OPENAI_API_KEY: "sk-your_actual_openai_key_here"

  # Optional: Pre-existing Notion database ID
  NOTION_DATABASE_ID: ""

  # Rate Limiting Settings
  SCRAPING_DELAY: 2.0
  HUNTER_REQUESTS_PER_MINUTE: 10

  # Processing Limits
  MAX_PRODUCTS_PER_RUN: 50
  MAX_PROSPECTS_PER_COMPANY: 10

  # Email Generation Settings
  EMAIL_TEMPLATE_TYPE: "professional"
  PERSONALIZATION_LEVEL: "medium"

  # Logging Configuration
  LOG_LEVEL: "INFO"
- Use with the CLI:
  python cli.py --config config.yaml discover
- Basic Configuration Test:
python cli.py --dry-run status
Expected output:
Running in DRY-RUN mode - no actual API calls will be made
DRY-RUN: Would show workflow status

- Set Up Progress Dashboard:
python scripts/setup_dashboard.py
This creates Notion databases for campaign tracking, processing logs, and system status monitoring.
- API Keys Validation:
python cli.py --dry-run discover --limit 1
This should show what would be discovered without making actual API calls.
- Full System Test:
python cli.py --verbose --dry-run discover --limit 5
This provides detailed logging of what would happen.
- Test the OpenAI Client Manager:

python -c "
from services.openai_client_manager import get_client_manager, configure_default_client
from utils.config import Config

config = Config.from_env()
configure_default_client(config)
manager = get_client_manager()
print('OpenAI Client Manager Status:')
client_info = manager.get_client_info()
print(f'  Client Type: {client_info[\"client_type\"]}')
print(f'  Model: {client_info[\"model_name\"]}')
print(f'  Configured: {client_info[\"configured\"]}')
print('✅ OpenAI Client Manager working correctly')
"
Solutions:
- Check that the .env file exists in the project root
- Verify there are no typos in variable names
- Ensure there are no extra spaces around the = sign
- Check file permissions (the file should be readable)
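A quick way to narrow down missing-key problems is to inspect the environment directly. A hypothetical helper (the key names come from this guide; it is not part of the project):

```python
import os

REQUIRED = ["NOTION_TOKEN", "HUNTER_API_KEY"]

def missing_keys(env) -> list:
    """Return required keys that are unset or blank."""
    return [k for k in REQUIRED if not str(env.get(k, "")).strip()]

# Against a sample environment with one blank key:
print(missing_keys({"NOTION_TOKEN": "secret_x", "HUNTER_API_KEY": ""}))
# ['HUNTER_API_KEY']
# Against the real environment: missing_keys(os.environ)
```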
Solutions:
- Verify the token starts with secret_
- Check the token hasn't expired
- Ensure integration has proper permissions
- Try creating a new integration token
Solutions:
- Verify account is activated (check email)
- Ensure API key is copied correctly
- Check if you've exceeded free tier limits
- Try generating a new API key
Solutions:
- Verify the API key starts with sk-
- Check that billing is set up (required for API access)
- Ensure you have available credits
- Try creating a new API key
If you want to use an existing Notion database:
- Create a database in Notion with these properties:
- Name (Title)
- Role (Text)
- Company (Text)
- LinkedIn (URL)
- Email (Email)
- Contacted (Checkbox)
- Status (Select: Not Contacted, Contacted, Responded, Rejected)
- Notes (Rich Text)
- Source (Text)
- Added Date (Date)
- Share the database with your integration
- Get the database ID from the URL:
  https://notion.so/your-workspace/DATABASE_ID?v=...
- Add it to your configuration:
NOTION_DATABASE_ID=your_database_id_here
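The database ID in the URL is a 32-character hex string (sometimes displayed with hyphens). A small helper to pull it out of a copied URL — illustrative only, not part of the project:

```python
import re

def extract_database_id(url: str) -> str:
    """Extract the 32-char hex Notion database ID from a URL.
    Hyphens are stripped first, since Notion sometimes shows
    the ID in hyphenated UUID form."""
    match = re.search(r"[0-9a-f]{32}", url.replace("-", "").lower())
    if not match:
        raise ValueError(f"no database ID found in {url!r}")
    return match.group(0)

url = "https://notion.so/my-workspace/1429989fe8ac4effbc8f57f56486db54?v=abc123"
print(extract_database_id(url))
# 1429989fe8ac4effbc8f57f56486db54
```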
Adjust these settings based on your API limits and needs:
# Scraping delays (seconds between requests)
SCRAPING_DELAY=2.0 # ProductHunt/LinkedIn scraping
linkedin_scraping_delay=3.0 # LinkedIn-specific delay
# Hunter.io limits
HUNTER_REQUESTS_PER_MINUTE=10 # Free tier: 10/min
HUNTER_MONTHLY_LIMIT=25 # Free tier: 25/month
# Processing limits
MAX_PRODUCTS_PER_RUN=50 # Products per discovery run
MAX_PROSPECTS_PER_COMPANY=10 # Max prospects per company

Customize email generation:
# Template types: professional, casual, formal
EMAIL_TEMPLATE_TYPE=professional
# Personalization levels: low, medium, high
PERSONALIZATION_LEVEL=medium
# OpenAI model settings
OPENAI_MODEL=gpt-3.5-turbo # or gpt-4 for better quality
OPENAI_MAX_TOKENS=500 # Max tokens per email
OPENAI_TEMPERATURE=0.7 # Creativity level (0-1)

Configure logging levels and output:
# Log levels: DEBUG, INFO, WARNING, ERROR
LOG_LEVEL=INFO
# Log file settings
LOG_FILE_MAX_SIZE=10MB # Max size before rotation
LOG_FILE_BACKUP_COUNT=5 # Number of backup files
LOG_TO_CONSOLE=true # Also log to console

For containerized deployment:
- Create a Dockerfile:

  FROM python:3.9-slim
  WORKDIR /app
  COPY requirements.txt .
  RUN pip install -r requirements.txt
  COPY . .
  CMD ["python", "cli.py", "discover"]
- Build the image:
  docker build -t job-prospect-automation .
- Run with an environment file:
docker run --env-file .env job-prospect-automation discover --limit 10
Run these commands to verify everything works:
# 1. Test CLI help
python cli.py --help
# 2. Test configuration
python cli.py --dry-run status
# 3. Test discovery (dry-run)
python cli.py --dry-run discover --limit 3
# 4. Test company processing (dry-run)
python cli.py --dry-run process-company "Test Company"
# 5. Test email generation (dry-run)
python cli.py --dry-run generate-emails --prospect-ids "1,2,3"
# 6. Run example scripts
python examples/cli_usage_examples.py

For a complete test with actual API calls (uses your quotas):
# Small-scale real test
python cli.py discover --limit 1 --batch-size 1
# Check results
python cli.py status

After setup, your directory should look like:
job-prospect-automation/
├── .env # Your API keys (keep private!)
├── config.yaml # Optional config file
├── logs/ # Generated log files
│ ├── prospect_automation_YYYYMMDD.log
│ ├── error_monitoring.json
│ └── error_notifications.json
├── controllers/ # Main logic controllers
├── services/ # API service integrations
├── models/ # Data models
├── utils/ # Utility functions
├── tests/ # Test files
├── examples/ # Usage examples
├── docs/ # Documentation
├── requirements.txt # Python dependencies
├── cli.py # Command-line interface
├── main.py # Main entry point
└── README.md # Main documentation
- Never commit API keys to version control
- Use environment variables or secure config files
- Set proper file permissions:
chmod 600 .env config.yaml
- Rotate keys regularly (especially if compromised)
- Limit integration permissions to only what's needed
- Use dedicated workspace for automation
- Regularly review integration access
- Monitor database access logs
- Use HTTPS for all API calls (default in the system)
- Monitor API usage for unusual patterns
- Set up rate limiting to avoid being blocked
- Use VPN if required by your organization
After successful installation:
- Read the CLI Usage Guide for detailed command information
- Review examples/ for usage patterns
- Start with small tests using --dry-run mode
- Monitor your API usage to stay within limits
- Set up monitoring for production use
If you encounter issues during setup:
- Check the Troubleshooting section
- Review error logs in the logs/ directory
- Use verbose mode for detailed error information:
python cli.py --verbose --dry-run [command]
- Test individual components to isolate issues
- Open an issue on GitHub with detailed error information
Congratulations! 🎉 You've successfully set up the Job Prospect Automation system. You're now ready to start automating your job prospecting workflow!