University of Caldas course repository for Intelligent Systems II.
Class taken with Prof. Jorge Alberto Jaramillo Garzón
Live Demo: Bank Marketing ML Analysis
Intelligent Systems II - Advanced machine learning and artificial intelligence techniques
Institution: Universidad de Caldas | Computer Engineering
Professor: Jorge Alberto Jaramillo Garzón
Academic Period: 2024-2025
- 50% Partial Exams (Zero, First, Second, Third)
- 50% Final Project (Cybersecurity Incident Prediction)
Machine Learning · Data Science · Support Vector Machines · Neural Networks
PCA Analysis · Bayesian Inference · Ensemble Methods · Deep Learning
Dimensionality Reduction · Cross-Validation · Hyperparameter Tuning
Kernel Tricks · Gradient Boosting · Graph Neural Networks · Transformers
Technologies: python streamlit scikit-learn pytorch xgboost pandas numpy matplotlib seaborn plotly solara
.jorge/
├── partials/          # Partial Exams (50%)
│   ├── zero/          # Demo exam (practice)
│   ├── first/         # Bayesian & K-NN classifiers
│   ├── second/        # SVM, ANN, PCA ✅ COMPLETE
│   └── third/         # TBD
│
├── project/           # Final Project (50%)
│   ├── Cybersecurity Incident Predictor
│   └── Microsoft GUIDE Dataset Analysis
│
└── notebooks/         # Class Materials
    ├── Theory: Perceptron, SVM, Kernels, ANN
    └── Weekly notes
Location: partials/zero/
Topics: Validation, Bayesian classifiers, K-NN
Dataset: Iris
Exercises:
- Cross-Validation (10-fold) - Compare Bayesian vs Geometric classifiers on Iris
- Bootstrapping K-NN - Investigate performance vs number of neighbors
- Classifier Comparison - Contrast assumptions, requirements, dimensionality impact
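The first two exercises can be sketched with scikit-learn as follows. This is a minimal illustration, not the exam solution: `GaussianNB` stands in for the Bayesian classifier and K-NN for the geometric one, and the seeds, `k=5`, and the helper name `bootstrap_knn_accuracy` are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils import resample

X, y = load_iris(return_X_y=True)

# Exercise 1: 10-fold stratified cross-validation (seed chosen arbitrarily).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
models = {
    "GaussianNB": GaussianNB(),                        # Bayesian stand-in
    "KNN(k=5)": KNeighborsClassifier(n_neighbors=5),   # geometric stand-in
}
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    cv_scores[name] = scores
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")


# Exercise 2: bootstrap estimate of K-NN accuracy for a given k.
def bootstrap_knn_accuracy(k, n_rounds=30, seed=0):
    """Train on a bootstrap sample, score on the out-of-bag rows, average."""
    rng = np.random.RandomState(seed)
    idx = np.arange(len(X))
    accuracies = []
    for _ in range(n_rounds):
        train_idx = resample(idx, replace=True, random_state=rng)
        oob_idx = np.setdiff1d(idx, train_idx)  # rows not drawn this round
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X[train_idx], y[train_idx])
        accuracies.append(model.score(X[oob_idx], y[oob_idx]))
    return float(np.mean(accuracies))


for k in (1, 3, 5, 9, 15):
    print(f"k={k}: out-of-bag accuracy {bootstrap_knn_accuracy(k):.3f}")
```

Scoring on out-of-bag rows keeps the evaluation data disjoint from each bootstrap training sample, which is one common way to run this kind of experiment.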
Location: partials/first/
Topics: Data preprocessing, model training, evaluation
Dataset: Iris classification
Completed:
- ✅ Data preprocessing pipeline
- ✅ Multiple classifier training
- ✅ Performance evaluation metrics
Location: partials/second/
Status: ✅ COMPLETE (2.5/2.5 points)
Live: intel-ii-exam-ii.streamlit.app
Tasks Completed:
- 4 kernel types: Linear, RBF, Polynomial, Sigmoid
- Hyperparameter tuning: C, gamma, degree
- Cross-validation with K-Fold and Train/Test
- Experiment history tracking and comparison
- Confusion matrix visualization
- Best model auto-identification
- 12 architectures: 1-3 layers (10-100 neurons)
- 3 activation functions: ReLU, Tanh, Logistic
- 3 solvers: Adam, SGD, L-BFGS
- Learning curves visualization
- Performance comparison charts
- Best model saver
- Feature analysis: correlation, distributions, Q-Q plots
- Data exploration: 6 plots (3×2D + 3×3D interactive)
- PCA transformation with variance thresholds
- Model retraining on PCA data
- BEFORE vs AFTER comparison
- Automated insights and recommendations
- Answer: "What can you conclude for YOUR dataset?"
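The SVM tuning and the BEFORE vs AFTER PCA comparison listed above can be approximated in a few lines. This is a sketch under stated assumptions, not the dashboard's code: scikit-learn's bundled breast cancer dataset stands in for Bank Marketing (which is not shipped with the library), and the grid values, the 95% variance threshold, and the helper name `tune` are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset (assumption): swap in your own features and labels.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One sub-grid per kernel family so only relevant parameters are searched.
param_grid = [
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10]},
    {"svc__kernel": ["rbf", "sigmoid"], "svc__C": [0.1, 1, 10],
     "svc__gamma": ["scale", 0.01]},
    {"svc__kernel": ["poly"], "svc__C": [0.1, 1, 10], "svc__degree": [2, 3]},
]


def tune(with_pca):
    """Grid-search an SVM pipeline, optionally with PCA after scaling."""
    steps = [("scale", StandardScaler())]
    if with_pca:
        # A float n_components keeps enough components for 95% of the variance.
        steps.append(("pca", PCA(n_components=0.95)))
    steps.append(("svc", SVC()))
    search = GridSearchCV(Pipeline(steps), param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    return search


results = {}
for label, with_pca in [("before PCA", False), ("after PCA", True)]:
    search = tune(with_pca)
    results[label] = search.score(X_test, y_test)
    print(label, search.best_params_, round(results[label], 3))
```

Putting the scaler and PCA inside the pipeline means they are refit on each training fold, so the cross-validation scores are not contaminated by the held-out fold.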
Features:
- Interactive Streamlit dashboard
- Real-time experiment tracking
- Persistent experiment history
- Automated performance insights
- Professional visualizations
- Smart recommendations (USE/AVOID PCA)
- CSV export functionality
Tech Stack:
Python | Streamlit | scikit-learn | pandas | matplotlib | seaborn | plotly
Documentation: See .jorge/partials/second/README.md
Location: partials/third/
Status: Pending
Location: .jorge/project/
Advanced ensemble ML platform for predicting cybersecurity incidents before they occur. Transforms reactive cybersecurity into proactive prevention for Security Operations Centers (SOCs).
- Predictive (not reactive): Predicts incidents 1-24 hours in advance
- Hybrid ensemble: LSTM + GNN + XGBoost + Transformer
- Meta-learning: Adaptive model weighting by context
- Production-ready: Professional Solara dashboard
Level 0: Specialized Base Models
- LSTM/GRU - Temporal pattern recognition
  - Learns incident sequences over time
  - Captures long-term dependencies
- Graph Neural Networks - Entity relationship modeling
  - Models risk propagation through network
  - 33 entity types (users, IPs, domains, etc.)
- XGBoost - Alert pattern classification
  - Complex decision rules
  - 9,100+ detector patterns
- Transformers - Evidence sequence analysis
  - Self-attention over evidence chains
  - MITRE ATT&CK technique mapping
Level 1: Meta-Ensemble
- Adaptive weight learning by context
- Organization-specific optimization
- Online learning for drift adaptation
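One simple way to picture context-dependent weighting is to give each base model a weight per context derived from its validation error in that context. The sketch below is only an illustration of the idea, not the project's meta-learner: the softmax-over-negative-error rule, the `temperature` parameter, the context names, and every number are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def learn_context_weights(val_errors, temperature=0.1):
    """Map {context: {model: validation error}} to {context: {model: weight}}.

    Lower error yields a higher weight; a smaller temperature sharpens the gap.
    """
    weights = {}
    for context, errors in val_errors.items():
        names = sorted(errors)
        w = softmax([-errors[name] / temperature for name in names])
        weights[context] = dict(zip(names, w))
    return weights


def ensemble_predict(probs, context, weights):
    """Weighted average of base-model incident probabilities for one context."""
    w = weights[context]
    return sum(w[name] * p for name, p in probs.items())


# Hypothetical validation errors for two base models in two contexts.
val_errors = {
    "business_hours": {"lstm": 0.10, "xgboost": 0.20},
    "off_hours": {"lstm": 0.30, "xgboost": 0.15},
}
weights = learn_context_weights(val_errors)
p = ensemble_predict({"lstm": 0.8, "xgboost": 0.4}, "off_hours", weights)
print(weights["off_hours"], round(p, 3))
```

Because the weights in each context sum to 1, the combined output stays a valid probability that lies between the base-model predictions.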
- 13M+ evidences from real cybersecurity incidents
- 1.6M alerts from 9,100+ unique detectors
- 1M incidents with expert triage labels
- 6,100+ organizations across industries
- 441 MITRE ATT&CK techniques mapped
- 2-week period with temporal resolution
Prediction Accuracy (4h): 94.2%
Early Warning Score: 0.89
Cost-Weighted Recall: 0.91
Alert Fatigue Score: 0.85
MTTD Reduction: 4+ hours
- ✅ Prevents incidents before escalation
- ✅ Reduces MTTD by 4+ hours
- ✅ Optimizes analyst workload with intelligent prioritization
- ✅ Scales across organizations with adaptive learning
Python 3.10+ | Solara | PyTorch | XGBoost | scikit-learn | Microsoft GUIDE Dataset
Documentation:
- Perceptron fundamentals
- Linear classification
- Support Vector Machine theory
- Margin maximization
- Kernel trick explained
- Mapping φ to higher dimensions
- RBF, Polynomial, Sigmoid kernels
- Computational advantages: O(n²) vs O(n²d)
- Parameter selection (σ, degree)
Key Concepts:
- Kernel function: K(x, y) = φ(x)ᵀφ(y), computed without explicit mapping
- RBF kernel: Maps to infinite dimensions
- Polynomial kernel: (xᵀy + c)ᵈ captures interactions
- Binomial theorem: Connects products in original/transformed space
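The kernel identity above can be checked numerically for a small case. This sketch takes the polynomial kernel with d = 2 on 2-D inputs and compares it with the dot product of an explicit 6-dimensional feature map. The function names and sample points are illustrative. The map itself follows from expanding (xᵀy + c)².

```python
import math


def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel (x.y + c)^d, computed in the original space."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** d


def phi_degree2(x, c=1.0):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs."""
    x1, x2 = x
    return [
        x1 * x1,
        x2 * x2,
        math.sqrt(2.0) * x1 * x2,   # cross term: the feature interaction
        math.sqrt(2.0 * c) * x1,
        math.sqrt(2.0 * c) * x2,
        c,
    ]


x, y = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly_kernel(x, y)
k_explicit = sum(a * b for a, b in zip(phi_degree2(x), phi_degree2(y)))
print(k_implicit, k_explicit)  # both equal 4 up to floating-point rounding
```

The implicit form needs one 2-D dot product, while the explicit form builds two 6-D vectors first. That gap widens quickly as the input dimension and degree grow, which is the practical appeal of the kernel trick.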
- Neural network architectures
- Backpropagation algorithm
- Activation functions (ReLU, Tanh, Sigmoid)
- Training strategies
cd .jorge/partials/second
# Install dependencies
pip install streamlit pandas numpy scikit-learn matplotlib seaborn plotly
# Launch app
streamlit run app.py
Visit: http://localhost:8501
cd .jorge/project
# Install with UV
uv sync
# Download Microsoft GUIDE dataset from Kaggle
# Extract to data/microsoft_guide/
# Run dashboard
uv run cybersec-dashboard
Visit: http://localhost:8765
cd .jorge/notebooks/docs
jupyter notebook

| Component | Status | Description | Grade |
|---|---|---|---|
| Demo Exam | ✅ Complete | Bayesian, K-NN, Validation | Practice |
| First Exam | ✅ Complete | Fundamentals, Iris dataset | TBD |
| Second Exam | ✅ Complete | SVM + ANN + PCA Dashboard | 2.5/2.5 |
| Third Exam | Pending | TBD | - |
| Final Project | ✅ Complete | Cybersecurity Incident Predictor | TBD |
Overall Progress: 80% Complete
By the end of this course, you will master:
- ✅ Support Vector Machines with kernel methods
- ✅ Bayesian classifiers and probabilistic inference
- ✅ K-Nearest Neighbors algorithms
- ✅ Cross-validation and bootstrapping
- ✅ Hyperparameter optimization
- ✅ Artificial Neural Networks (feedforward)
- ✅ LSTM/GRU for temporal sequences
- ✅ Graph Neural Networks for relationships
- ✅ Transformers and attention mechanisms
- ✅ Principal Component Analysis (PCA)
- ✅ Feature selection and engineering
- ✅ Variance analysis and scree plots
- ✅ Component interpretation
- ✅ Random Forest, XGBoost, LightGBM
- ✅ Gradient boosting techniques
- ✅ Meta-ensemble with adaptive weighting
- ✅ Model stacking strategies
- ✅ Deploy ML models in production
- ✅ Build interactive dashboards (Streamlit, Solara)
- ✅ Handle imbalanced datasets (SMOTE, SMOTE-ENN)
- ✅ Evaluate with business-focused metrics
- ✅ Make data-driven conclusions
- ✅ Communicate technical results effectively
Interactive dashboard comparing SVM, ANN, and PCA on UCI Bank Marketing dataset
Highlights:
- 3 ML algorithms with comprehensive tuning
- Automated experiment tracking
- PCA impact analysis with insights
- Smart recommendations based on results
- Professional production deployment
Live Demo: https://intel-ii-exam-ii.streamlit.app/
Enterprise ML platform for SOC teams with 4-hour incident predictions
Innovation:
- Hybrid ensemble (LSTM + GNN + XGBoost + Transformer)
- Meta-learning with context adaptation
- Microsoft GUIDE dataset (13M+ evidences)
- Professional Solara dashboard
- 94.2% prediction accuracy
Impact: Prevents incidents before escalation, saves millions in damages
- Second Exam: .jorge/partials/second/README.md
  - SVM Guide: .jorge/partials/second/ui/pages/svm/README.md
  - ANN Guide: .jorge/partials/second/ui/pages/ann/README.md
  - PCA Guide: .jorge/partials/second/ui/pages/pca/README.md
  - Deployment: .jorge/partials/second/DEPLOYMENT.md
- Overview: .jorge/project/README.md
- Project Vision: .jorge/project/project_overview.md
- Architecture: .jorge/project/architecture_design.md
- Dataset: .jorge/project/dataset_guide.md
- Metrics: .jorge/project/evaluation_metrics.md
- SVM Theory: .jorge/project/clase-03.md
- Notebooks: .jorge/notebooks/docs/
- ✅ Deployed Production ML App - Streamlit Cloud
- ✅ Built Professional SOC Dashboard - Solara
- ✅ Implemented Ensemble Learning - 4 specialized models
- ✅ Achieved 94%+ Accuracy - Real-world dataset
- ✅ Created Comprehensive Documentation - Theory + Practice
- ✅ Applied Advanced ML Techniques - Kernels, PCA, Meta-learning
Jorge Alberto Jaramillo Garzón
Computer Engineering Student
Universidad de Caldas
- Professor: Jorge Alberto Jaramillo Garzón
- Institution: Universidad de Caldas
- Course: Sistemas Inteligentes II (Intelligent Systems II)
- Datasets:
- UCI Machine Learning Repository (Bank Marketing, Iris)
- Microsoft GUIDE (Cybersecurity Incidents)
- CIC-IDS2017, UNSW-NB15 (Network intrusion)
Academic project for Universidad de Caldas coursework.