AI-Powered Career Analysis Tool with Personalized Learning Paths
Job Market Analyzer is an intelligent desktop application that helps job seekers identify skill gaps and provides personalized learning recommendations. Upload your resume, and within 60 seconds, receive:
- ✅ Detailed skill gap analysis against 50+ job postings
- ✅ AI-powered career path matching using K-Means clustering
- ✅ Personalized 4-week learning plan with real course links
- ✅ Interactive visualizations of your skill profile
- ✅ Job-specific matching percentages
- K-Means Clustering: Groups similar career paths to find your best fit
- NLP Skill Extraction: Identifies 100+ technical skills using OpenNLP
- Smart Matching: Goes beyond simple keyword matching
- PDF files (text-based and scanned)
- Image formats (PNG, JPG, JPEG, BMP, TIFF)
- Advanced OCR with Tesseract (85-95% accuracy)
- 4-week structured learning plans
- 40+ curated courses from Udemy, Coursera, YouTube
- Domain-specific recommendations (Web Dev, Data Science, DevOps, etc.)
- Progress milestones and project suggestions
- Skill match percentage charts
- Top missing skills visualization
- Job-by-job analysis with color-coded results
- Export results to text files
- Fetches 50+ job postings via Adzuna API
- Domain-specific job filtering
- Intelligent fallback with sample jobs
Upload your resume and select your target job domain.
View comprehensive skill analysis with interactive tabs:
- Summary: Matched vs. missing skills
- Job Postings: 50+ analyzed jobs with match percentages
- Learning Path: Your personalized 4-week plan
- Charts: Visual skill gap analysis
- Java JDK 21 or higher
- Maven 3.x
- Tesseract OCR (for image processing)
git clone https://github.com/rakshanrk/Job_Market_Analyzer_Java.git
cd Job_Market_Analyzer_Java

Windows:
# Download installer from: https://github.com/UB-Mannheim/tesseract/wiki
# Install to: C:\Program Files\Tesseract-OCR
# Download tessdata: https://github.com/tesseract-ocr/tessdata
# Place eng.traineddata in C:\tessdata

macOS:
brew install tesseract

Linux:
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-eng

Build the project:
mvn clean install

Run the application:
mvn javafx:run

Or run the JAR:
java -jar target/JobMarketAnalyzer-1.0-SNAPSHOT.jar
- Launch Application
  - Run the application using Maven or the JAR file
- Select Job Domain
  - Choose from: Software Developer, Data Scientist, Web Developer, etc.
- Upload Resume
  - Click "Upload Resume" and select your PDF or image file
  - Maximum file size: 10MB
- Wait for Analysis (30-60 seconds)
  - Text extraction
  - Skill identification using NLP
  - Job market analysis
  - AI-powered matching
  - Learning path generation
- View Results
  - Explore 4 interactive tabs
  - View matched and missing skills
  - Check job-specific match percentages
  - Get your personalized 4-week learning plan
- Export Results
  - Save analysis to text file for future reference
├── models/ # Data structures (Skill, Job, Resume, etc.)
├── services/ # Business logic
│ ├── ResumeParser # PDF & OCR processing
│ ├── SkillExtractor # NLP-based skill identification
│ ├── JobFetcher # API integration
│ ├── SkillAnalyzer # K-Means clustering
│ └── LearningPathGenerator # Personalized recommendations
├── utils/ # Helper utilities
│ ├── FileValidator # Input validation
│ └── ChartGenerator # JFreeChart visualization
└── Main.java # JavaFX GUI + Controller
ResumeParser
- Extracts text from PDFs using Apache PDFBox
- OCR processing for images using Tesseract
- Text normalization and cleanup
SkillExtractor
- Dictionary-based matching (100+ technical skills)
- NLP tokenization and POS tagging with OpenNLP
- Filters out common words to prevent false positives
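The dictionary lookup and common-word filter described above can be sketched roughly as follows. This is a minimal illustration, not the project's SkillExtractor: the class name `SkillMatchSketch`, the tiny `SKILLS` and `AMBIGUOUS` sets, and the capitalization rule for ambiguous words are all assumptions, and a plain regex split stands in for OpenNLP tokenization and POS tagging.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of dictionary-based skill matching. */
final class SkillMatchSketch {
    // Stand-in dictionary; the tool itself is described as knowing 100+ skills.
    static final Set<String> SKILLS =
            Set.of("java", "python", "sql", "docker", "aws", "react", "go", "swift");

    // Assumed list of skills that are also everyday English words.
    static final Set<String> AMBIGUOUS = Set.of("go", "swift");

    /** Returns the lower-cased skills found in the text, in order of first appearance. */
    static Set<String> extract(String text) {
        Set<String> found = new LinkedHashSet<>();
        // Split on anything that cannot be part of a skill name (keeps + # . for C++, C#, Node.js).
        for (String raw : text.split("[^A-Za-z0-9+#.]+")) {
            String token = raw.replaceAll("\\.+$", ""); // drop sentence-ending periods
            if (token.isEmpty()) {
                continue;
            }
            String key = token.toLowerCase(Locale.ROOT);
            if (!SKILLS.contains(key)) {
                continue;
            }
            // Assumed rule: an ambiguous word counts only when written capitalized ("Go", not "go").
            if (AMBIGUOUS.contains(key) && !Character.isUpperCase(token.charAt(0))) {
                continue;
            }
            found.add(key);
        }
        return found;
    }
}
```

Calling `SkillMatchSketch.extract(resumeText)` on the text produced by the parsing step would give the skill set that the later matching steps work with. A POS tagger could replace the capitalization rule where casing is unreliable, for example in OCR output.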
JobFetcher
- Integrates with Adzuna Job Search API
- Intelligent fallback to domain-specific sample jobs
- Parses job descriptions to extract required skills
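The fallback behaviour listed above could look something like the sketch below. It assumes the live fetch is wrapped in a `Supplier` and that jobs are represented as plain strings; `JobFallbackSketch` and `jobsOrSamples` are hypothetical names, and no real Adzuna call is made here.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of falling back to sample jobs when the live fetch fails. */
final class JobFallbackSketch {
    /**
     * Returns the live results when the fetch succeeds and is non-empty,
     * otherwise the provided sample jobs.
     */
    static List<String> jobsOrSamples(Supplier<List<String>> liveFetch, List<String> sampleJobs) {
        try {
            List<String> live = liveFetch.get();
            if (live != null && !live.isEmpty()) {
                return live;
            }
        } catch (RuntimeException e) {
            // Treat any fetch failure (no network, missing credentials) as "no live data".
        }
        return sampleJobs;
    }
}
```

Keeping the fallback in one place means the rest of the pipeline does not need to know whether the jobs came from the API or from the bundled samples.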
SkillAnalyzer
- K-Means Clustering: Groups similar skill profiles
- Creates n-dimensional feature vectors for resumes and jobs
- Calculates match percentage based on cluster similarity
- Identifies skill gaps with priority ranking
LearningPathGenerator
- Prioritizes missing skills by job market demand
- Queries SQLite database for relevant courses
- Creates structured 4-week learning plan
- Includes milestones and project suggestions
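One way to picture the prioritization and the 4-week split is the sketch below. It assumes demand is measured as the number of job postings that require a missing skill, and that skills are assigned to weeks in priority order; both are illustrative choices, the names `LearningPlanSketch`, `rankByDemand`, and `fourWeekPlan` are hypothetical, and the SQLite course lookup is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of demand-based prioritization and a 4-week split. */
final class LearningPlanSketch {
    /** Missing skills ordered by how many jobs ask for them (ties stay alphabetical). */
    static List<String> rankByDemand(Set<String> resumeSkills, List<Set<String>> jobSkills) {
        Map<String, Integer> demand = new TreeMap<>(); // TreeMap keeps keys alphabetical
        for (Set<String> job : jobSkills) {
            for (String skill : job) {
                if (!resumeSkills.contains(skill)) {
                    demand.merge(skill, 1, Integer::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(demand.keySet());
        // List.sort is stable, so equally demanded skills keep their alphabetical order.
        ranked.sort(Comparator.comparing((String s) -> demand.get(s)).reversed());
        return ranked;
    }

    /** Splits the ranked skills into four consecutive weekly groups, highest priority first. */
    static List<List<String>> fourWeekPlan(List<String> ranked) {
        List<List<String>> weeks = new ArrayList<>();
        for (int w = 0; w < 4; w++) {
            weeks.add(new ArrayList<>());
        }
        int perWeek = Math.max(1, (ranked.size() + 3) / 4); // ceiling of size / 4
        for (int i = 0; i < ranked.size(); i++) {
            weeks.get(i / perWeek).add(ranked.get(i));
        }
        return weeks;
    }
}
```

For a resume with Java and SQL and three jobs that mention Docker three times, AWS twice, and Python once, `rankByDemand` would return Docker, AWS, Python, and `fourWeekPlan` would place one skill in each of the first three weeks.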
- Java 21+: Core programming language
- JavaFX 23: Modern GUI framework
- Maven: Dependency management and build tool
- Weka 3.8+: K-Means clustering algorithm
- Apache OpenNLP 2.3+: NLP processing (tokenization, POS tagging)
- Apache PDFBox 3.0+: PDF text extraction
- Tesseract 5.x: OCR engine for images
- SQLite 3.46+: Embedded database
- Adzuna API: Real-time job market data
- Apache HttpClient 5.x: HTTP requests
- Gson 2.11+: JSON parsing
- JFreeChart 1.5+: Chart generation
- Feature Space Creation
  - Collects all unique skills from resume and jobs
  - Example: [Java, Python, SQL, Docker, AWS, React]
- Vector Representation
  - Resume: [1, 1, 1, 0, 0, 0] (has Java, Python, SQL)
  - Job 1: [1, 0, 1, 1, 0, 0] (needs Java, SQL, Docker)
  - Job 2: [0, 1, 0, 0, 1, 1] (needs Python, AWS, React)
- Clustering
  - Groups similar skill profiles into 3 clusters
  - Cluster 0: Backend Developers
  - Cluster 1: Data Scientists
  - Cluster 2: DevOps Engineers
- Match Calculation
  - Identifies resume's cluster
  - Counts jobs in same cluster
  - Match % = (same cluster jobs / total jobs) × 100
Advantage: Considers overall skill profile, not just individual skill overlap
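The feature-space, vector, and match-percentage steps above can be sketched in a few lines of Java. This is an illustration under assumptions rather than the project's SkillAnalyzer: the class and method names are hypothetical, and the clustering step itself (Weka appears in the technology list) is not shown, so cluster ids are simply passed in.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of skill vectors and the cluster-based match percentage. */
final class SkillVectorSketch {
    /** Collects every unique skill across all profiles, in order of first appearance. */
    static List<String> featureSpace(List<List<String>> profiles) {
        Set<String> all = new LinkedHashSet<>();
        for (List<String> profile : profiles) {
            all.addAll(profile);
        }
        return new ArrayList<>(all);
    }

    /** Builds a 0/1 vector: 1 where the profile has the feature's skill, 0 otherwise. */
    static int[] toVector(List<String> features, List<String> skills) {
        int[] vector = new int[features.size()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = skills.contains(features.get(i)) ? 1 : 0;
        }
        return vector;
    }

    /** Match % = (jobs in the resume's cluster / total jobs) x 100; 0 when there are no jobs. */
    static double matchPercent(int resumeCluster, int[] jobClusters) {
        if (jobClusters.length == 0) {
            return 0.0;
        }
        int same = 0;
        for (int cluster : jobClusters) {
            if (cluster == resumeCluster) {
                same++;
            }
        }
        return 100.0 * same / jobClusters.length;
    }
}
```

With the resume and the two jobs from the example, `featureSpace` yields [Java, Python, SQL, Docker, AWS, React] and `toVector` reproduces the three vectors listed under Vector Representation. The vectors would then be handed to the clustering library, and the resulting cluster ids fed into `matchPercent`.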
Stores 40+ curated courses mapped to skills
- skill_name (Java, Python, React, etc.)
- resource_title (Course name)
- resource_url (Link to Udemy, Coursera, YouTube)
- platform (Udemy, Coursera, YouTube)
- duration_weeks (Time to complete)
- difficulty_level (Beginner, Intermediate, Advanced)

Tracks past analyses for progress monitoring
- resume_filename
- extracted_skills
- missing_skills
- match_percentage
- jobs_analyzed
- analysis_date

Stores generated 4-week plans
- analysis_id (Foreign key)
- week_number (1-4)
- skill_focus (Skills for the week)
- resources (Course links)

To use real job data from Adzuna:
- Sign up at Adzuna Developer Portal
- Get your API credentials (App ID and App Key)
- Update JobFetcher.java:
private static final String APP_ID = "your_app_id";
private static final String APP_KEY = "your_app_key";

Note: The application falls back to domain-specific sample jobs if the API is not configured.

Update ResumeParser.java if Tesseract is installed in a custom location:
tesseract.setDatapath("path/to/your/tessdata");
Processing Time: 30-60 seconds per resume
- Text extraction: 5-20 seconds (PDF) or 10-30 seconds (Image OCR)
- Skill extraction: 2-5 seconds
- Job analysis: 5-10 seconds
- AI clustering: 5-10 seconds
- Path generation: 1-2 seconds
Accuracy:
- PDF text extraction: ~99%
- OCR (images): 85-95% (depends on image quality)
- Skill detection: 85-95%
Supported Files: PDF, PNG, JPG, JPEG, BMP, TIFF (max 10MB)
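The file-type and size limits above translate into a small input check. The sketch below is a guess at what such a check could look like, not the project's FileValidator: the class name, the method signature, and reading "10MB" as 10 × 1024 × 1024 bytes are assumptions.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of an upload check for extension and size. */
final class FileValidatorSketch {
    // Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 10L * 1024 * 1024;

    static final Set<String> EXTENSIONS = Set.of("pdf", "png", "jpg", "jpeg", "bmp", "tiff");

    /** Accepts non-empty files with a supported extension that fit within the size limit. */
    static boolean isAccepted(String fileName, long sizeBytes) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension) && sizeBytes > 0 && sizeBytes <= MAX_BYTES;
    }
}
```

In a desktop app this would typically be called with `file.getName()` and `file.length()` right after the file chooser returns, so that unsupported files are rejected before any parsing or OCR starts.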
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Add more learning resources to database
- Improve OCR accuracy
- Support additional file formats
- Enhance UI/UX
- Add more job domains
- Implement additional ML algorithms
- Resume builder module
- LinkedIn profile import
- Job application tracker
- Skill verification quizzes
- Interview preparation questions
- Salary estimator based on skills
- Multiple career path suggestions
- Skill trend analysis dashboard
- Peer comparison feature
- Mobile app (iOS/Android)
Rakshan RK
- GitHub: @rakshanrk
- LinkedIn: https://www.linkedin.com/in/rakshanrk/
- Apache Software Foundation for OpenNLP, PDFBox, and HttpClient
- Weka Team for the machine learning library
- Tesseract Team for OCR engine
- Adzuna for job market API
- JFreeChart Team for visualization library
- Course providers: Udemy, Coursera, YouTube
If you encounter any issues or have questions:
- Check the Issues page
- Create a new issue with detailed description
- Reach out via email