Using poli, poli-baselines, and mlflow for logging

We’ll use poli’s get_problems() method to pick the problems that interest us.

from poli import get_problems
get_problems()

['albuterol_similarity',
 'aloha',
 'amlodipine_mpo',
 'celecoxib_rediscovery',
 'deco_hop',
 'dockstring',
 'drd2_docking',
 'drd3_docking',
 'ehrlich',
 'ehrlich_holo',
 'fexofenadine_mpo',
 'foldx_rfp_lambo',
 'foldx_sasa',
 'foldx_stability',
 'foldx_stability_and_sasa',
 'gfp_cbas',
 'gfp_select',
 'gsk3_beta',
 'isomer_c7h8n2o2',
 'isomer_c9h10n2o2pf2cl',
 'jnk3',
 'median_1',
 'median_2',
 'mestranol_similarity',
 'osimetrinib_mpo',
 'penalized_logp_lambo',
 'perindopril_mpo',
 'ranolazine_mpo',
 'rasp',
 'rdkit_logp',
 'rdkit_qed',
 'rfp_foldx_stability',
 'rfp_foldx_stability_and_sasa',
 'rfp_rasp',
 'rmf_landscape',
 'rosetta_energy',
 'sa_tdc',
 'scaffold_hop',
 'sitagliptin_mpo',
 'super_mario_bros',
 'thiothixene_rediscovery',
 'toy_continuous_problem',
 'troglitazone_rediscovery',
 'valsartan_smarts',
 'white_noise',
 'zaleplon_mpo']

problems = ["white_noise", "aloha"]
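Before launching anything, it can help to verify that every selected name is actually registered. The following is a minimal sketch, assuming `available` holds the list returned by `get_problems()`; a short excerpt is hard-coded here so that the snippet runs without poli. `check_selection` is a hypothetical helper, not part of poli.

```python
def check_selection(selected, available):
    """Return the selected names, raising if any of them is not available."""
    missing = [name for name in selected if name not in available]
    if missing:
        raise ValueError(f"Unknown problem name(s): {missing}")
    return list(selected)


# Excerpt of the names listed above; in practice, pass get_problems() instead.
available = ["aloha", "ehrlich", "rdkit_qed", "white_noise"]
problems = check_selection(["white_noise", "aloha"], available)
print(problems)
```

Failing early on a misspelled name is cheaper than discovering it after several runs have already been scheduled.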

Selecting solvers

from poli_baselines.solvers.simple.random_mutation import RandomMutation
from poli_baselines.solvers.simple.genetic_algorithm import FixedLengthGeneticAlgorithm

For each solver, we also record whether it needs the sequences to be aligned.

solvers = [(FixedLengthGeneticAlgorithm, True), (RandomMutation, False)]
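Bare tuples work, but named fields can make the benchmark loop easier to read. The sketch below shows one way to do this with a `NamedTuple`. `SolverSpec` is a hypothetical name, and the two solver classes are replaced by empty stand-ins so that the snippet runs without poli-baselines.

```python
from typing import NamedTuple


class SolverSpec(NamedTuple):
    """Hypothetical record pairing a solver class with its requirements."""

    solver_class: type
    needs_alignment: bool


# Stand-ins so the snippet runs on its own; in the tutorial these are
# the solver classes imported from poli_baselines.
class FixedLengthGeneticAlgorithm:
    pass


class RandomMutation:
    pass


solvers = [
    SolverSpec(FixedLengthGeneticAlgorithm, needs_alignment=True),
    SolverSpec(RandomMutation, needs_alignment=False),
]

# A NamedTuple still unpacks like a plain tuple.
for solver_class, needs_alignment in solvers:
    print(solver_class.__name__, needs_alignment)
```

Because a `NamedTuple` unpacks like an ordinary tuple, a loop written as `for solver_class, needs_alignment in solvers` keeps working unchanged.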

(Advanced) Observer registration

The problem, the solver, and the observer may not all be able to run in the same environment. For this reason, in this example we register the observer and leave its instantiation to poli.

from poli.core.registry import register_observer
from poli.core.util.observers.mlflow_observer import MLFlowObserver

register_observer(
    observer=MLFlowObserver(),
    # conda_environment_location="poli",  # when not providing the environment, we use the current one
    observer_name="mlflow_observer",
    set_as_default_observer=False,  # this is True by default!
)

We recommend that you write your own observer.
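As a rough idea of what a custom observer might look like, here is a minimal in-memory sketch. It deliberately does not import poli: the method names `initialize_observer`, `observe`, and `finish` are assumed to mirror poli's abstract observer interface, so check them against the installed version before subclassing. `InMemoryObserver` and its attributes are hypothetical.

```python
class InMemoryObserver:
    """Hypothetical observer that keeps every evaluation in a list."""

    def __init__(self):
        self.setup = None
        self.history = []
        self.finished = False

    def initialize_observer(self, problem_setup_info, caller_info, seed):
        # Store whatever metadata is handed over when the run starts.
        self.setup = {
            "problem": problem_setup_info,
            "caller": caller_info,
            "seed": seed,
        }

    def observe(self, x, y, context=None):
        # Record one black-box evaluation: inputs, outputs, optional context.
        self.history.append((x, y, context))

    def finish(self):
        # Nothing to flush for an in-memory log; just mark the run as closed.
        self.finished = True


observer = InMemoryObserver()
observer.initialize_observer("white_noise", {"solver": "RandomMutation"}, seed=0)
observer.observe(x=["A", "B"], y=[0.5])
observer.finish()
print(len(observer.history))
```

A real observer would typically replace the list with calls to a logging backend of your choice, while keeping the same three entry points.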

Run benchmark

Define a tracking-URI for mlflow.

import os
from pathlib import Path
tracking_uri = os.path.join(Path(os.getcwd()).resolve(), "mlruns")
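The same location can also be built with `pathlib` alone. The sketch below produces both a plain absolute path and the equivalent `file://` URI. MLflow's documentation describes local tracking locations in both forms; whether the observer forwards either form unchanged is an assumption. The variable names are illustrative.

```python
from pathlib import Path

# Absolute path to an "mlruns" directory next to the current working directory.
mlruns_dir = Path.cwd().resolve() / "mlruns"

tracking_path = str(mlruns_dir)           # plain absolute path
tracking_file_uri = mlruns_dir.as_uri()   # same location as a file:// URI

print(tracking_path)
print(tracking_file_uri)
```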

Run the (mock-)benchmark.

from poli import create

for solver_class, needs_alignment in solvers:
    for name in problems:
        # ideally this part becomes a cluster job...
        problem = create(name, observer_name="mlflow_observer", 
                         observer_init_info=dict(solver=solver_class.__name__,
                                                tracking_uri=tracking_uri))
        if needs_alignment and not problem.info.sequences_are_aligned():
            continue
        f, x0 = problem.black_box, problem.x0
        y0 = f(x0)
        solver = solver_class(black_box=f, x0=x0, y0=y0, alphabet=problem.info.get_alphabet())
        solver.solve(max_iter=3)

poli 🧪: initializing the observer.
poli 🧪: attempting isolated observer instantiation.

poli 🧪: initializing the observer.
poli 🧪: attempting isolated observer instantiation.

poli 🧪: initializing the observer.
poli 🧪: attempting isolated observer instantiation.

poli 🧪: initializing the observer.
poli 🧪: attempting isolated observer instantiation.
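The comment in the loop above notes that each iteration would ideally become a cluster job. One way to prepare for that is to enumerate the (solver, problem) pairs first, as independent job descriptions. This is a minimal sketch under stated assumptions: `build_jobs` and the `aligned` lookup are hypothetical, and the alignment flags are illustrative values rather than facts taken from poli. In the real loop this information comes from `problem.info.sequences_are_aligned()`.

```python
from itertools import product


def build_jobs(solvers, problems, aligned):
    """List (solver_name, problem_name) pairs, skipping incompatible ones."""
    jobs = []
    for (solver_name, needs_alignment), problem_name in product(solvers, problems):
        if needs_alignment and not aligned[problem_name]:
            continue  # same skip rule as in the benchmark loop
        jobs.append((solver_name, problem_name))
    return jobs


solvers = [("FixedLengthGeneticAlgorithm", True), ("RandomMutation", False)]
problems = ["white_noise", "aloha"]
# Illustrative flags only; query the problem itself in real code.
aligned = {"white_noise": True, "aloha": True}

for job in build_jobs(solvers, problems, aligned):
    print(job)
```

Each resulting pair could then be handed to whatever scheduler is available, with the body of the loop above acting as the job script.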

Checking results

To check the results, you can run mlflow ui in the terminal:

!mlflow ui

[2024-12-22 16:30:14 -0500] [81459] [INFO] Starting gunicorn 23.0.0
[2024-12-22 16:30:14 -0500] [81459] [INFO] Listening at: http://127.0.0.1:5000 (81459)
[2024-12-22 16:30:14 -0500] [81459] [INFO] Using worker: sync
[2024-12-22 16:30:14 -0500] [81464] [INFO] Booting worker with pid: 81464
[2024-12-22 16:30:14 -0500] [81465] [INFO] Booting worker with pid: 81465
[2024-12-22 16:30:14 -0500] [81466] [INFO] Booting worker with pid: 81466
[2024-12-22 16:30:14 -0500] [81467] [INFO] Booting worker with pid: 81467

Traceback (most recent call last):
  File "/Users/sjt972/anaconda3/envs/poli-docs3/lib/python3.10/site-packages/poli/core/util/observer_wrapper.py", line 98, in <module>
    start_observer_process(args.objective_name, args.port, args.password)
  File "/Users/sjt972/anaconda3/envs/poli-docs3/lib/python3.10/site-packages/poli/core/util/observer_wrapper.py", line 57, in start_observer_process
    msg_type, *msg = conn.recv()
  File "/Users/sjt972/anaconda3/envs/poli-docs3/lib/python3.10/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/Users/sjt972/anaconda3/envs/poli-docs3/lib/python3.10/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/Users/sjt972/anaconda3/envs/poli-docs3/lib/python3.10/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

^C
[2024-12-22 16:30:43 -0500] [81459] [INFO] Handling signal: int
[2024-12-22 16:30:43 -0500] [81466] [INFO] Worker exiting (pid: 81466)
[2024-12-22 16:30:43 -0500] [81467] [INFO] Worker exiting (pid: 81467)
[2024-12-22 16:30:43 -0500] [81464] [INFO] Worker exiting (pid: 81464)
[2024-12-22 16:30:43 -0500] [81465] [INFO] Worker exiting (pid: 81465)

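While the server is running, open http://127.0.0.1:5000 (the address it reports when it starts listening) in a browser to check out the results.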