Defining an observer#
In the previous chapter we discussed how to create and optimize objective functions using poli and poli-baselines. These objective functions are usually part of an optimization experiment, in which logging is essential. For example, you might be comparing the sample efficiency or the solution quality of two different black box optimization algorithms, and that comparison is only possible if the results of each run are logged.
poli provides logging via observers. An observer can be attached to a black box function, such that every single call to the function gets logged. This chapter explains how to write observers, and provides a simple example using an observer that writes a local json file with the experiment's metadata and appends each evaluation to a local text file.
Want a more complex example?
If you are interested in more complex observers (using e.g. MLFlow or Weights and Biases), check the examples folder in poli. There you will find how to define and use simple observers using these two logging libraries.
An abstract observer#
All observers inherit from an AbstractObserver (which you can find in poli/core/util/abstract_observer.py). The abstract methods you need to override are:
- initialize_observer(problem_setup_info: BlackBoxInformation, caller_info: object, seed: int) -> object, which gets called as part of the set-up of the objective function (when objective_factory.create is called).
- observe(x: np.ndarray, y: np.ndarray, context: object) -> None, which gets called every time your optimization algorithm queries the objective function.
- finish(), which gets called either by the user, or by the object deletion at the end of the script.
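The order in which these three methods fire can be illustrated without poli at all. The following is a minimal sketch, not poli's implementation: ToyObserver, RecordingObserver and toy_black_box are hypothetical stand-ins that only mirror the method names listed above.

```python
from abc import ABC, abstractmethod


class ToyObserver(ABC):
    """Hypothetical stand-in that mirrors the three methods listed above."""

    @abstractmethod
    def initialize_observer(self, problem_setup_info, caller_info, seed): ...

    @abstractmethod
    def observe(self, x, y, context=None): ...

    @abstractmethod
    def finish(self): ...


class RecordingObserver(ToyObserver):
    """Keeps every call in memory so the call order can be inspected."""

    def __init__(self):
        self.events = []

    def initialize_observer(self, problem_setup_info, caller_info, seed):
        self.events.append(("initialize", seed))

    def observe(self, x, y, context=None):
        self.events.append(("observe", x, y))

    def finish(self):
        self.events.append(("finish",))


def toy_black_box(x, observer):
    # Hypothetical objective: counts how often the letter "A" appears in x.
    y = x.count("A")
    # Every query is reported to the observer before returning.
    observer.observe(x, y)
    return y


observer = RecordingObserver()
observer.initialize_observer(problem_setup_info={"name": "toy"}, caller_info={}, seed=0)
toy_black_box("ALOHA", observer)
toy_black_box("FLEAS", observer)
observer.finish()
print(observer.events)
```

The recorded events show one initialization, one observe call per query, and a final finish call, which is the same lifecycle the real observer below follows.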
An instance: a simple observer#
Let’s define a simple observer that saves or updates a local json file every time the objective function is called. We can start by defining the skeleton of our observer:
import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver


class SimpleObserver(AbstractObserver):
    def __init__(self):
        ...

    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int,
    ) -> object:
        ...

    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        ...

    def finish(self) -> None:
        ...
Initializing the observer#
Usually, at init time we might need to create some folders (or log into services like wandb). In this example, let’s use the __init__ to create a folder adjacent to this file called results, and a unique identifier for this experiment using uuid4. Let’s use the initialize_observer to log the metadata of this individual experiment.
from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()


class SimpleObserver(AbstractObserver):
    def __init__(self):
        # Creating a unique id for this experiment in
        # particular:
        experiment_id = str(uuid4())
        self.experiment_id = experiment_id

        # Creating a local directory for the results
        experiment_path = THIS_DIR / "results" / experiment_id
        experiment_path.mkdir(exist_ok=True, parents=True)
        self.experiment_path = experiment_path

    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int,
    ) -> object:
        # Saving the metadata for this experiment
        metadata = problem_setup_info.as_dict()

        # Adding the information the user wanted to provide
        # (Recall that this caller info gets forwarded
        # from the objective_factory.create function)
        metadata["caller_info"] = caller_info

        # Saving the seed
        metadata["seed"] = seed

        # Saving the metadata
        with open(self.experiment_path / "metadata.json", "w") as f:
            json.dump(metadata, f)

    # The rest of the class
    ...
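One caveat with dumping user-provided caller_info directly: json.dump raises a TypeError for values that are not JSON-native. The sketch below shows one possible workaround using the standard default argument of the json module. The metadata dictionary and its output_dir key are made up for illustration and are not part of poli.

```python
import json
from pathlib import Path

# Hypothetical metadata; "output_dir" is an invented key whose value
# (a Path) is not JSON-serializable out of the box.
metadata = {
    "name": "aloha",
    "caller_info": {"output_dir": Path("results")},
    "seed": None,
}

try:
    json.dumps(metadata)
    serializable = True
except TypeError:
    serializable = False

# default=str is called for every object json cannot encode itself,
# so the Path is stored as its string representation.
encoded = json.dumps(metadata, default=str)
decoded = json.loads(encoded)
print(serializable, decoded["caller_info"])
```

Falling back to str is lossy (the Path comes back as a plain string), so whether it is acceptable depends on what you plan to do with the metadata afterwards.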
The core of the logging: observe#
The observe method will be called every time the user/algorithm queries the objective function (if you are curious, you can check the AbstractBlackBox in poli).
In our case, we will simply append the x and y to a file called results.txt.
Warning
Remember that this is a simple example! We are essentially re-inventing the wheel. You should write more complex logic for logging, or use libraries like tensorboard, mlflow or wandb.
from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()


class SimpleObserver(AbstractObserver):
    # The init and initialize_observer methods
    ...

    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        # Appending these results to the results file,
        # one tab-separated line per query.
        with open(self.experiment_path / "results.txt", "a") as fp:
            fp.write(f"{x.tolist()}\t{y.tolist()}\n")
Putting it all together#
In this next snippet, we put everything together into the final version. Notice how this simple example doesn’t require any complex logic for finish. In other scenarios, you might want to finish the experiment by terminating your active run on mlflow or wandb.
from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()


class SimpleObserver(AbstractObserver):
    def __init__(self):
        # Creating a unique id for this experiment in
        # particular:
        experiment_id = str(uuid4())
        self.experiment_id = experiment_id

        # Creating a local directory for the results
        experiment_path = THIS_DIR / "results" / experiment_id
        experiment_path.mkdir(exist_ok=True, parents=True)
        self.experiment_path = experiment_path

    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int,
    ) -> object:
        # Saving the metadata for this experiment
        metadata = problem_setup_info.as_dict()

        # Adding the information the user wanted to provide
        # (Recall that this caller info gets forwarded
        # from the objective_factory.create function)
        metadata["caller_info"] = caller_info

        # Saving the seed
        metadata["seed"] = seed

        # Saving the metadata
        with open(self.experiment_path / "metadata.json", "w") as f:
            json.dump(metadata, f)

    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        # Appending these results to the results file,
        # one tab-separated line per query.
        with open(self.experiment_path / "results.txt", "a") as fp:
            fp.write(f"{x.tolist()}\t{y.tolist()}\n")

    def finish(self) -> None:
        # Nothing to clean up: each observe call opens
        # and closes the results file on its own.
        pass
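The finish hook matters more once an observer holds onto a resource between calls. As a rough sketch, the class below keeps a single file handle open across queries and releases it in finish. KeepOpenFileLogger is a hypothetical stand-in written without poli so that it runs on its own; a real observer would subclass AbstractObserver instead.

```python
import tempfile
from pathlib import Path


class KeepOpenFileLogger:
    """Hypothetical stand-in; a real observer would subclass AbstractObserver."""

    def __init__(self, path: Path):
        # The handle stays open between queries instead of
        # being reopened on every observe call.
        self._fp = open(path, "a")

    def observe(self, x, y, context=None) -> None:
        self._fp.write(f"{x}\t{y}\n")

    def finish(self) -> None:
        # Safe to call more than once, e.g. by the user
        # and again during object deletion.
        if not self._fp.closed:
            self._fp.close()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "results.txt"
    logger = KeepOpenFileLogger(path)
    logger.observe([["A", "L", "O", "H", "A"]], [[5]])
    logger.finish()
    logger.finish()  # second call is a no-op
    contents = path.read_text()

print(repr(contents))
```

Making finish idempotent is a design choice worth keeping, since the text above notes that it may be triggered both by the user and by object deletion.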
Logging a couple queries of aloha#
Using the aloha toy problem, let’s check that our observer logic works as expected:
from poli import objective_factory

# We create an instance of the observer
observer = SimpleObserver()

# We instantiate the objective function
problem = objective_factory.create(
    name="aloha",
    observer=observer,
)
f, x0 = problem.black_box, problem.x0

# We initialize the observer
observer.initialize_observer(
    problem_setup_info=f.info,
    caller_info={},
    seed=None,
)

# We set the observer to track f.
f.set_observer(observer)
At this point, the observer __init__ call created a folder called results right next to this file, and we can load up the metadata just to be sure:
with open(observer.experiment_path / "metadata.json") as fp:
    print(json.load(fp))
{'name': 'aloha', 'max_sequence_length': 5, 'aligned': True, 'fixed_length': True, 'deterministic': True, 'discrete': True, 'fidelity': None, 'alphabet': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'], 'log_transform_recommended': False, 'padding_token': '', 'caller_info': {}, 'seed': None}
Let’s query the objective function at three points, and check whether the results were saved accordingly:
print(f(np.array([list("MIGUE")])))
print(f(np.array([list("FLEAS")])))
print(f(np.array([list("ALOHA")])))
[[0]]
[[1]]
[[5]]
We can verify by loading up and printing the results.txt file:
with open(observer.experiment_path / "results.txt") as fp:
    print(fp.read())
[['M', 'I', 'G', 'U', 'E']] [[0]]
[['F', 'L', 'E', 'A', 'S']] [[1]]
[['A', 'L', 'O', 'H', 'A']] [[5]]
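Since each line stores the Python list representation of x and y separated by a tab, the file can be read back with the standard library. The snippet below is a minimal sketch that assumes exactly the line format written by observe above. It parses an inline string rather than the actual file so that it is self-contained.

```python
import ast

# Two lines in the same format that observe() writes:
# str(x.tolist()) <TAB> str(y.tolist())
raw = "[['F', 'L', 'E', 'A', 'S']]\t[[1]]\n[['A', 'L', 'O', 'H', 'A']]\t[[5]]\n"

records = []
for line in raw.splitlines():
    x_text, y_text = line.split("\t")
    # literal_eval safely turns the printed lists back into Python lists.
    records.append((ast.literal_eval(x_text), ast.literal_eval(y_text)))

# Picking the query with the highest objective value.
best_x, best_y = max(records, key=lambda record: record[1][0][0])
print("".join(best_x[0]), best_y[0][0])
```

To use this on a real run, replace the inline string with the contents of results.txt inside observer.experiment_path.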
Conclusion#
This small tutorial showcases the logic behind observers, which are the main way in which poli logs results. We saw
- the structure of an AbstractObserver, and which abstract methods need to be overwritten,
- how initialize_observer is called, and
- how each query to the objective function is observed.
Tip
If you are interested in using more complex logic for your logging, you can check the examples folder in poli, as they include two observers using mlflow and wandb.
poli is also able to isolate observers, and the examples folder also includes a description of how.