poli 🧪: a library of discrete objective functions#

poli is a library of discrete objective functions for benchmarking optimization algorithms. It offers

  • Isolation of black box function calls inside conda environments. Don’t worry about clashes with black box requirements; poli will create the relevant conda environments for you.

  • Logging of each black box call using observers.

  • A numpy interface. Inputs are np.arrays of strings, outputs are np.arrays of floats.
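
To make the calling convention above concrete, here is a minimal sketch in plain numpy. It does not use poli itself: `toy_black_box` and `ListObserver` are hypothetical stand-ins, and the shapes (a batch of sequences in, one float per sequence out) are an assumption about the convention, not a statement of poli's API.

```python
import numpy as np


class ListObserver:
    """Hypothetical observer that records every black-box call in memory."""

    def __init__(self):
        self.records = []

    def observe(self, x, y):
        # Store copies so later mutation of the arrays does not alter the log.
        self.records.append((x.copy(), y.copy()))


def toy_black_box(x, observer=None, target="ALOHA"):
    """Score each row of `x` by the number of positions matching `target`.

    x: array of single-character strings with shape (batch, len(target)).
    Returns an array of floats with shape (batch, 1).
    """
    target_arr = np.array(list(target))
    # Elementwise string comparison, broadcast over the batch dimension.
    y = (x == target_arr).sum(axis=1, keepdims=True).astype(float)
    if observer is not None:
        observer.observe(x, y)
    return y


observer = ListObserver()
x = np.array([list("MIGUE"), list("ALOOF"), list("ALOHA")])
y = toy_black_box(x, observer=observer)
print(y.ravel())  # → [0. 3. 5.]
```

The observer is passed in explicitly here only to keep the sketch short; how observers are attached to a real poli black box is described in the documentation of each component.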

We also provide poli-baselines, a collection of optimizers of these discrete black box functions.

Getting started#

A good place to start is the next chapter! Go to Getting Started.

To install poli and poli-baselines, we recommend creating a fresh conda environment:

conda create -n poli-base python=3.10
conda activate poli-base
pip install poli-core
pip install git+https://github.com/MachineLearningLifeScience/poli-baselines.git@main

poli also runs on Google Colab. Here is a small example of how to run one of the objective functions.

Black-box objective functions#

For a full list, click here.

Toy problems#

White noise

White noise drawn from a unit Gaussian

./using_poli/objective_repository/white_noise.html
Aloha

A toy example about optimizing 5-letter words to spell “ALOHA”

./using_poli/objective_repository/aloha.html
Toy continuous problems

The usual benchmark functions for continuous optimization (e.g. easom, or ackley_function_01)

./using_poli/objective_repository/toy_continuous_problems.html
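
For readers unfamiliar with the benchmark functions mentioned in the toy continuous problems entry, the sketch below writes out the textbook Easom and Ackley functions using only the standard library. These are the usual minimization forms with the common default constants; poli's own versions may differ in sign convention or parameterization, so treat this as an illustration and not as the library's implementation.

```python
import math


def easom(x, y):
    """Textbook Easom function; global minimum of -1 at (pi, pi)."""
    return (
        -math.cos(x)
        * math.cos(y)
        * math.exp(-((x - math.pi) ** 2 + (y - math.pi) ** 2))
    )


def ackley(xs, a=20.0, b=0.2, c=2 * math.pi):
    """Textbook Ackley function; global minimum of 0 at the origin."""
    d = len(xs)
    mean_sq = sum(x * x for x in xs) / d
    mean_cos = sum(math.cos(c * x) for x in xs) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


# Evaluate both functions at their known optima.
print(easom(math.pi, math.pi))
print(ackley([0.0, 0.0]))
```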

Small molecules#

Albuterol Similarity (using tdc)

The Therapeutics Data Commons’ implementation of the Albuterol similarity oracle of GuacaMol.

./using_poli/objective_repository/albuterol_similarity.html
Amlodipine MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Amlodipine MPO oracle of GuacaMol.

./using_poli/objective_repository/amlodipine_mpo.html
Celecoxib rediscovery (using tdc)

The Therapeutics Data Commons’ implementation of the Celecoxib rediscovery oracle of GuacaMol.

./using_poli/objective_repository/celecoxib_rediscovery.html
Decorator Hop (using tdc)

The Therapeutics Data Commons’ implementation of the “deco Hop” oracle of GuacaMol.

./using_poli/objective_repository/deco_hop.html
dockstring for ligand design

Using dockstring to assess the docking score of a small molecule.

./using_poli/objective_repository/dockstring.html
DRD2 docking (using tdc)

The Therapeutics Data Commons’ implementation of the DRD2 docking oracle.

./using_poli/objective_repository/albuterol_similarity.html
DRD3 (or 3pbl) docking (using tdc)

A wrapper around the Therapeutics Data Commons implementation of 3pbl docking.

./using_poli/objective_repository/drd3_docking.html
Ehrlich Functions

A closed-form objective for discrete sequences

./using_poli/objective_repository/ehrlich_functions.html
Fexofenadine MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Fexofenadine MPO oracle of GuacaMol.

./using_poli/objective_repository/fexofenadine_mpo.html
GSK3β (using tdc)

The Therapeutics Data Commons’ implementation of the GSK3β oracle.

./using_poli/objective_repository/gsk3_beta.html
Isomer C7H8N2O2 (using tdc)

The Therapeutics Data Commons’ implementation of the first isomer oracle of GuacaMol.

./using_poli/objective_repository/isomer_c7h8n2o2.html
Isomer C9H10N2O2PF2Cl (using tdc)

The Therapeutics Data Commons’ implementation of the second isomer oracle of GuacaMol.

./using_poli/objective_repository/isomer_c9h10n2o2pf2cl.html
JNK3 (using tdc)

The Therapeutics Data Commons’ implementation of the JNK3 oracle.

./using_poli/objective_repository/jnk3.html
Log-solubility (LogP)

Computing the log-quotient of solubilities using RDKit.

./using_poli/objective_repository/rdkit_logp.html
Median 1 (using tdc)

The Therapeutics Data Commons’ implementation of the “median 1” oracle of GuacaMol.

./using_poli/objective_repository/median_1.html
Median 2 (using tdc)

The Therapeutics Data Commons’ implementation of the “median 2” oracle of GuacaMol.

./using_poli/objective_repository/median_2.html
Mestranol Similarity (using tdc)

The Therapeutics Data Commons’ implementation of the Mestranol similarity oracle of GuacaMol.

./using_poli/objective_repository/albuterol_similarity.html
Osimertinib MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Osimertinib MPO oracle of GuacaMol.

./using_poli/objective_repository/osimetrinib_mpo.html
Penalized Log-solubility (LogP, using lambo)

Computing the penalized log-quotient of solubilities using lambo’s implementation.

./using_poli/objective_repository/penalized_logp_lambo.html
Quantitative Estimate of Druglikeness (QED)

Computing the QED using RDKit.

./using_poli/objective_repository/rdkit_qed.html
Ranolazine MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Ranolazine MPO oracle of GuacaMol.

./using_poli/objective_repository/ranolazine_mpo.html
Scaffold Hop (using tdc)

The Therapeutics Data Commons’ implementation of the “scaffold Hop” oracle of GuacaMol.

./using_poli/objective_repository/deco_hop.html
Sitagliptin MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Sitagliptin MPO oracle of GuacaMol.

./using_poli/objective_repository/sitagliptin_mpo.html
Synthetic Accessibility (SA, using tdc)

A wrapper around the Therapeutics Data Commons implementation of the synthetic accessibility oracle.

./using_poli/objective_repository/sa_tdc.html
Thiothixene rediscovery (using tdc)

The Therapeutics Data Commons’ implementation of the Thiothixene rediscovery oracle of GuacaMol.

./using_poli/objective_repository/thiothixene_rediscovery.html
Troglitazone rediscovery (using tdc)

The Therapeutics Data Commons’ implementation of the Troglitazone rediscovery oracle of GuacaMol.

./using_poli/objective_repository/troglitazone_rediscovery.html
Valsartan SMARTS (using tdc)

The Therapeutics Data Commons’ implementation of the Valsartan SMARTS oracle of GuacaMol.

./using_poli/objective_repository/valsartan_smarts.html
Zaleplon MPO (using tdc)

The Therapeutics Data Commons’ implementation of the Zaleplon MPO oracle of GuacaMol.

./using_poli/objective_repository/zaleplon_mpo.html

Proteins#

Protein Stability (using foldx)

Stability of mutations of a wildtype using foldx

./using_poli/objective_repository/foldx_stability.html
Protein SASA score (using foldx)

Solvent accessibility of mutations of a wildtype using foldx

./using_poli/objective_repository/foldx_sasa.html
Protein Stability (using RaSP)

Rapid Stability Predictions of single mutations from a wildtype.

./using_poli/objective_repository/RaSP.html
Protein Stability (using PyRosetta)

Stability predictions of variants from a wildtype.

./using_poli/objective_repository/Rosetta_energy.html
RFP Fluorescence Protein Stability (using lambo)

The red fluorescent protein (RFP) task from LaMBO, scored by stability and solvent-accessible surface area.

./using_poli/objective_repository/foldx_rfp_lambo.html

Black-box optimization algorithms#

On top of poli, we provide poli-baselines, a collection of black-box optimization algorithms (focusing especially on discrete sequences). Examples include:

Discrete#

Random Mutations

Optimizing a discrete sequence by performing random mutations

./using_poli_baselines/random_mutations.html
LaMBO2

Optimizing protein sequences using guided discrete diffusion

./using_poli_baselines/lambo2.html
Increasingly high-dimensional combinatorial and continuous embeddings (Bounce)

Papenmeier et al.’s Bounce, using their official implementation.

./using_poli_baselines/bounce.html
Bayesian optimization with probabilistic reparametrization (ProbRep)

Daulton et al.’s PR, using their official implementation.

./using_poli_baselines/probrep.html
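
The idea behind the Random Mutations entry above can be sketched in a few lines of pure Python. This is a simple greedy variant written for illustration, not the poli-baselines implementation: the objective `count_matches`, the alphabet, and the acceptance rule are all assumptions chosen to keep the example self-contained.

```python
import random


def count_matches(seq, target="ALOHA"):
    """Hypothetical objective: number of positions where seq matches target."""
    return float(sum(a == b for a, b in zip(seq, target)))


def random_mutation_search(x0, objective, alphabet, n_iterations=2000, seed=0):
    """Greedy random-mutation search (maximization).

    At each step, mutate one random position of the best sequence so far
    and keep the mutant only if it scores at least as well.
    """
    rng = random.Random(seed)
    best, best_y = list(x0), objective(x0)
    for _ in range(n_iterations):
        candidate = best.copy()
        position = rng.randrange(len(candidate))
        candidate[position] = rng.choice(alphabet)
        y = objective(candidate)
        if y >= best_y:
            best, best_y = candidate, y
    return "".join(best), best_y


alphabet = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
best, best_y = random_mutation_search("MIGUE", count_matches, alphabet)
print(best, best_y)
```

Because a mutant is accepted only when it scores at least as well as the incumbent, the best score never decreases over the run. Accepting ties lets the search drift across sequences of equal score instead of stalling.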

Continuous#

CMA-ES

An evolutionary strategy for continuous problems

./using_poli_baselines/cma_es.html
Line Bayesian Optimization

A version of Bayesian Optimization where the acquisition is optimized over a line.

./using_poli_baselines/latent_space_bo.html
Hvarfner’s Vanilla Bayesian Optimization

Bayesian Optimization with log-expected improvement and a dimensionality-dependent prior over the lengthscales.

./using_poli_baselines/hvarfners_vanilla_bo.html
Sparse Axis-Aligned Subspace Bayesian Optimization (SAASBO)

Eriksson and Jankowiak’s SAASBO, using Ax.

./using_poli_baselines/saasbo.html
Adaptive expanding subspaces (BAxUS)

Papenmeier et al.’s BAxUS, using their official implementation.

./using_poli_baselines/baxus.html

Cite us and other relevant work#

If you use certain black boxes, we expect you to cite the relevant work. Check inside the documentation of each black box for the relevant references.

Contribute problems or solvers#

The following guides explain how to contribute a new problem factory (i.e. a black-box objective function) or a new optimization algorithm.

Contribute a new problem

A guide to contributing a new problem to the repository.

./contributing/a_new_problem.html
Contribute a new solver

How to contribute a new black-box optimization algorithm.

./contributing/a_new_solver.html