What is poli?#
poli is a library for creating and calling black box objective functions, with a special focus on discrete sequence optimization. It stands for Protein Objectives Library, since much of the work in drug design and protein engineering represents proteins and small molecules as discrete sequences.
We also build poli-baselines on top, allowing you to define black box optimization algorithms for discrete sequences.
The next chapters walk through a basic example of how to use poli and poli-baselines.
The rest of this intro describes the usual development loop we assume you’ll follow when using poli and poli-baselines.
The usual development loop#
Black-box optimization algorithms inside poli-baselines are treated as solvers, and the discrete objective functions of poli are described as problems.
We propose the following process for using the optimizers in poli-baselines, or for developing your own:
Identify the objective function#
Start by identifying the black-box objective function you want to optimize, and check whether it is already registered in poli, or available in poli’s objective repository.
This can be done by running
from poli import get_problems
print(get_problems())
['albuterol_similarity', 'aloha', 'amlodipine_mpo', 'celecoxib_rediscovery', 'deco_hop', 'dockstring', 'drd2_docking', 'drd3_docking', 'ehrlich', 'ehrlich_holo', 'fexofenadine_mpo', 'foldx_rfp_lambo', 'foldx_sasa', 'foldx_stability', 'foldx_stability_and_sasa', 'gfp_cbas', 'gfp_select', 'gsk3_beta', 'isomer_c7h8n2o2', 'isomer_c9h10n2o2pf2cl', 'jnk3', 'median_1', 'median_2', 'mestranol_similarity', 'osimetrinib_mpo', 'penalized_logp_lambo', 'perindopril_mpo', 'ranolazine_mpo', 'rasp', 'rdkit_logp', 'rdkit_qed', 'rfp_foldx_stability', 'rfp_foldx_stability_and_sasa', 'rfp_rasp', 'rmf_landscape', 'rosetta_energy', 'sa_tdc', 'scaffold_hop', 'sitagliptin_mpo', 'super_mario_bros', 'thiothixene_rediscovery', 'toy_continuous_problem', 'troglitazone_rediscovery', 'valsartan_smarts', 'white_noise', 'zaleplon_mpo']
The output is a list of problems you may be able to run.
Note
Most black box functions run out-of-the-box, but some have more requirements, like installing external dependencies. Check the page on all objective functions and click on the objective function you are interested in to get a detailed set of instructions on how to install and run it.
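The list of registered names is long, so it can help to search it rather than read it. The following is a minimal sketch that works on a hand-copied subset of the names printed above; find_problems is a hypothetical helper introduced here for illustration and is not part of poli. In practice the list would come from get_problems().

```python
# A hand-copied subset of the names printed above; in practice this list
# would be the return value of get_problems().
registered = [
    "foldx_sasa",
    "foldx_stability",
    "rdkit_logp",
    "rdkit_qed",
    "white_noise",
]


def find_problems(names, keyword):
    """Return the names that contain `keyword` (hypothetical helper)."""
    return [name for name in names if keyword in name]


# Exact-name check: is the objective we want registered?
print("white_noise" in registered)

# Substring search: which objectives in the subset mention FoldX?
print(find_problems(registered, "foldx"))
```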
In what follows, we will use the white_noise objective function. You could drop in another function if desired.
# One way to create a white noise problem/black box
from poli import objective_factory
problem = objective_factory.create(name="white_noise")
f, x0 = problem.black_box, problem.x0
# Another way
from poli.objective_repository import WhiteNoiseBlackBox
f = WhiteNoiseBlackBox()
At this point, you can call f on arrays of shape [b, L]. In the specific case of white_noise, L can be any positive integer.
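To make the [b, L] shape concrete, here is a minimal sketch that uses nested Python lists in place of the arrays f expects, only to stay dependency-free. It assumes that b counts the sequences in a batch and L the tokens per sequence; batch_shape is a hypothetical helper, not part of poli.

```python
def batch_shape(batch):
    """Return (b, L) for a rectangular batch of token sequences."""
    b = len(batch)
    L = len(batch[0]) if b > 0 else 0
    # A [b, L] array is rectangular, so every row must have the same length.
    if any(len(row) != L for row in batch):
        raise ValueError("all sequences in a batch must have the same length L")
    return b, L


# Two sequences (b = 2) with three tokens each (L = 3).
batch = [list("145"), list("902")]
print(batch_shape(batch))
```

Converting the same nested list with NumPy's np.array would give an array whose .shape is the same pair.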
Using a solver, or creating your own#
poli-baselines also comes with black-box optimizers out-of-the-box. You can find them inside the library.
For example, let’s use the RandomMutation solver, which takes the initial x0 and randomly mutates it according to the alphabet provided in the black box information (f.info).
from poli_baselines.solvers.simple.random_mutation import RandomMutation
y0 = f(x0)
solver = RandomMutation(
black_box=f,
x0=x0,
y0=y0,
)
print(f"x0: {x0}")
print(f"y0: {y0}")
Note
If this cell fails at the import with TypeError: unsupported operand type(s) for |: 'types.GenericAlias' and 'NoneType', the interpreter is Python 3.9. The error comes from the list[str] | None annotations in the RandomMutation signature (random_mutation.py, line 32); this union syntax requires Python 3.10 or newer.
Solvers implement a solve(max_iter: int) method, which runs the optimization for the provided budget.
In the specific example of RandomMutation, each step proposes a new candidate by picking a random position and replacing it with an element chosen at random from the alphabet. This alphabet is part of the black box information:
print(f.info)
print(f.info.alphabet)
BlackBoxInformation(name=white_noise, max_sequence_length=inf, aligned=False, fixed_length=False, discrete=True)
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
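The mutation step described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the RandomMutation implementation: mutate_once is a hypothetical function, sequences are plain lists of tokens, and the alphabet is the white_noise alphabet printed above.

```python
import random

# The white_noise alphabet printed above: the digits '0' to '9'.
ALPHABET = [str(d) for d in range(10)]


def mutate_once(sequence, alphabet, rng):
    """Return a copy of `sequence` with one random position set to a random token."""
    candidate = list(sequence)  # copy, so the input is left untouched
    position = rng.randrange(len(candidate))
    candidate[position] = rng.choice(alphabet)
    return candidate


rng = random.Random(0)  # seeded only to make the sketch reproducible
x = ["1", "2", "3"]
print(mutate_once(x, ALPHABET, rng))
```

Because the replacement token is drawn from the whole alphabet, a proposal can occasionally be identical to the sequence it started from.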
If you are interested in building your own solver, check out the chapter detailing how RandomMutation is implemented or our guide for contributing a new solver.
Optimizing#
Once you have a black box objective function f and a solver on top of it, running the optimization is straightforward:
solver.solve(max_iter=100)
print(solver.get_best_solution())
[['1' '4' '5']]
Of course, this example is trivial. We dive deeper in the next chapters.
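To show what a call to solve(max_iter) amounts to, here is a self-contained sketch of a greedy random-mutation loop. Everything in it is an assumption made for illustration: ToyRandomMutationSolver and toy_objective are hypothetical stand-ins (not the poli-baselines solver or poli's white_noise), the objective is maximized, and a candidate is kept only when it strictly improves on the best value seen so far.

```python
import random


def toy_objective(sequence):
    """Stand-in black box: the sum of the digits (not poli's white_noise)."""
    return sum(int(token) for token in sequence)


class ToyRandomMutationSolver:
    """Hypothetical greedy solver; not the poli-baselines implementation."""

    def __init__(self, black_box, x0, alphabet, seed=0):
        self.black_box = black_box
        self.alphabet = alphabet
        self.rng = random.Random(seed)
        self.best_x = list(x0)
        self.best_y = black_box(self.best_x)

    def step(self):
        # Propose a candidate by mutating one random position of the best sequence.
        candidate = list(self.best_x)
        position = self.rng.randrange(len(candidate))
        candidate[position] = self.rng.choice(self.alphabet)
        y = self.black_box(candidate)
        # Greedy acceptance: keep the candidate only if it improves the objective.
        if y > self.best_y:
            self.best_x, self.best_y = candidate, y

    def solve(self, max_iter):
        # The budget is the number of proposals evaluated after x0.
        for _ in range(max_iter):
            self.step()

    def get_best_solution(self):
        return self.best_x


alphabet = [str(d) for d in range(10)]
solver = ToyRandomMutationSolver(toy_objective, ["1", "4", "5"], alphabet)
solver.solve(max_iter=100)
print(solver.get_best_solution(), solver.best_y)
```

A deterministic objective is used here so that the effect of the loop is visible; with a noise objective, the best value found would reflect chance rather than structure in the sequence.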
Conclusion#
This chapter discussed the usual development loop using poli and poli-baselines:
Start by identifying/building your objective function,
continue by creating/using a solver in poli_baselines, and
use the solve method to run a number of iterations of the solver.
The next three chapters walk through another trivial example, diving deeper into the process of defining your own objective functions and solvers. You can continue there, or check the currently implemented repository of objective functions inside poli.