🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
This repository contains the source code for “Thompson sampling efficient multiobjective optimization” (TSEMO).
Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
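Several of the libraries listed here implement Thompson sampling for Bernoulli bandits. A minimal sketch of the idea (function and variable names are illustrative, not from any listed library): sample from each arm's Beta posterior and pull the arm with the highest sample.

```python
import random

def thompson_sampling(successes, failures):
    """Choose an arm for a Bernoulli bandit via Thompson sampling.

    successes[i] / failures[i] count observed rewards for arm i.
    Each arm's reward probability has a Beta(s+1, f+1) posterior
    (uniform prior); we draw one sample per arm and pick the argmax.
    """
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)
```

With clearly separated arms (e.g. 90/10 vs. 10/90 successes/failures), the better arm is selected almost every round, while arms with little data are still explored occasionally.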
Thompson Sampling Tutorial
Python library for Multi-Armed Bandits
Code created and optimized for best results, from the SuperDataScience course
This repository contains machine learning methods ranging from simple to complex, written as template-style code.
Bandit algorithms
Stop overpaying to run your agents. Kalibr routes every request to lower-cost model and tool paths without degrading performance.
pyrff: Python implementation of random Fourier feature approximations for Gaussian processes
Study of the paper 'Neural Thompson Sampling' published in October 2020
Offline evaluation of multi-armed bandit algorithms
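Offline evaluation of bandit algorithms is commonly done with the replay method: step through logged interactions and count only the rounds where the candidate policy's choice matches the logged action. A minimal sketch (function names and the assumption of a uniformly random logging policy are illustrative, not tied to the repository above):

```python
def replay_evaluate(policy, logged):
    """Estimate a bandit policy's average reward from logged data.

    logged: iterable of (context, chosen_arm, reward) tuples collected
    under a uniformly random logging policy (required for unbiasedness).
    policy(context, history) returns the arm the candidate policy
    would pull given the matched interactions seen so far.
    """
    history, rewards = [], []
    for ctx, arm, reward in logged:
        # Keep the round only if the candidate policy agrees with the log.
        if policy(ctx, history) == arm:
            history.append((ctx, arm, reward))
            rewards.append(reward)
    # Average reward over matched rounds (0.0 if nothing matched).
    return sum(rewards) / max(len(rewards), 1)
```

Because mismatched rounds are discarded, the estimate uses only a fraction of the log, so long logs are needed for low-variance estimates.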
Bayesian Optimization for Categorical and Continuous Inputs
A Julia package providing multi-armed bandit experiments
The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025
A curated list on papers about combinatorial multi-armed bandit problems.
Self-improving agent governance: 👍/👎 → Pre-Action Checks that block repeat AI mistakes. Stop paying for the same mistake twice.
Implementations of basic concepts under the reinforcement learning umbrella. This project is a collection of assignments from CS747: Foundations of Intelligent and Learning Agents (Autumn 2017) at IIT Bombay.