Defining an observer

In the previous chapter we discussed how to create and optimize objective functions using poli and poli-baselines. These objective functions are usually part of an optimization experiment, in which logging is essential. For example, to compare the sample efficiency or the solution quality of two black box optimization algorithms, you need a record of every evaluation each of them made.

poli provides logging via observers. An observer can be attached to a black box function, such that every single call to the function gets logged. This chapter explains how to write observers, and provides a simple example: an observer that writes the experiment's metadata to a local JSON file and appends every evaluation to a local text file.

Want a more complex example?

If you are interested in more complex observers (using e.g. MLFlow or Weights and Biases), check the examples folder in poli. There you will find how to define and use simple observers using these two logging libraries.

An abstract observer

All observers inherit from AbstractObserver (which you can find in poli/core/util/abstract_observer.py). The abstract methods you need to override are:

  • initialize_observer(problem_setup_info: BlackBoxInformation, caller_info: object, seed: int) -> object, which gets called as part of the set-up of the objective function (when objective_factory.create is called).

  • observe(x: np.ndarray, y: np.ndarray, context: object) -> None, which gets called every time your optimization algorithms query the objective function.

  • finish(), which gets called either by the user, or by the object deletion at the end of the script.
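To make the call order concrete, here is a minimal sketch that runs without poli. RecordingObserver and ToyBlackBox are hypothetical stand-ins, not poli classes, and plain lists stand in for arrays. They only mimic the three hooks listed above and the way a black box forwards each query to its observer.

```python
class RecordingObserver:
    """Hypothetical observer that records which hooks were called."""

    def __init__(self):
        self.calls = []

    def initialize_observer(self, problem_setup_info, caller_info, seed):
        self.calls.append("initialize_observer")

    def observe(self, x, y, context=None):
        self.calls.append("observe")

    def finish(self):
        self.calls.append("finish")


class ToyBlackBox:
    """Hypothetical objective: counts the 'A's in each sequence."""

    def __init__(self):
        self.observer = None

    def set_observer(self, observer):
        self.observer = observer

    def __call__(self, x, context=None):
        y = [[sequence.count("A")] for sequence in x]
        # Every query is forwarded to the attached observer, if any.
        if self.observer is not None:
            self.observer.observe(x, y, context)
        return y


observer = RecordingObserver()
observer.initialize_observer(problem_setup_info={"name": "toy"}, caller_info={}, seed=0)

f = ToyBlackBox()
f.set_observer(observer)
f([list("ALOHA")])
f([list("FLEAS")])
observer.finish()

print(observer.calls)
```

The recorded list shows one initialization, one observe per query, and a final finish, which is the lifecycle the rest of this chapter implements.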

An instance: a simple observer

Let’s define a simple observer that saves the experiment’s metadata to a local json file, and appends to a local results file every time the objective function is called. We can start by defining the skeleton of our observer:

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

class SimpleObserver(AbstractObserver):
    def __init__(self):
        ...

    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int
    ) -> object:
        ...
    
    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        ...

    def finish(self) -> None:
        ...

Initializing the observer

Usually, at init time we might need to create some folders (or log into services like wandb). In this example, let’s use __init__ to create a unique identifier for this experiment using uuid4, and a results folder for it in the current working directory (THIS_DIR in the code below). Let’s use initialize_observer to log the metadata of this individual experiment.

from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()

class SimpleObserver(AbstractObserver):
    def __init__(self):
        # Creating a unique id for this experiment in
        # particular:
        experiment_id = str(uuid4())
        self.experiment_id = experiment_id

        # Creating a local directory for the results
        experiment_path = THIS_DIR / "results" / experiment_id
        experiment_path.mkdir(exist_ok=True, parents=True)
        
        self.experiment_path = experiment_path
    
    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int
    ) -> object:

        # Saving the metadata for this experiment
        metadata = problem_setup_info.as_dict()

        # Adding the information the user wanted to provide
        # (Recall that this caller info gets forwarded
        # from the objective_factory.create function)
        metadata["caller_info"] = caller_info

        # Saving the seed
        metadata["seed"] = seed

        # Saving the metadata
        with open(self.experiment_path / "metadata.json", "w") as f:
            json.dump(metadata, f)
    
    # The rest of the class
    ...

The core of the logging: observe

The observe method will be called every time the user/algorithm queries the objective function (if you are curious, you can check the AbstractBlackBox in poli).

In our case, we will simply append the x and y to a file called results.txt.

Warning

Remember that this is a simple example! We are essentially re-inventing the wheel. You should write more complex logic for logging, or use libraries like tensorboard, mlflow or wandb.

from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()

class SimpleObserver(AbstractObserver):
    # The init and initialize_observer methods
    ...
    
    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        # Appending these results to the results file.
        with open(self.experiment_path / "results.txt", "a") as fp:
            fp.write(f"{x.tolist()}\t{y.tolist()}\n")
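
One small step up from tab-separated text, still without extra dependencies, is to write one JSON object per line so that the file can be parsed back reliably. The sketch below is not part of poli: append_jsonl, to_jsonable and the file name results.jsonl are hypothetical, and the helper accepts anything with a tolist() method (such as NumPy arrays) as well as plain lists.

```python
import json
import tempfile
from pathlib import Path


def to_jsonable(values):
    """Convert array-likes exposing `tolist()` into lists; pass lists through."""
    return values.tolist() if hasattr(values, "tolist") else values


def append_jsonl(path, x, y):
    """Append one evaluation as a single JSON object on its own line."""
    record = {"x": to_jsonable(x), "y": to_jsonable(y)}
    with open(path, "a") as fp:
        fp.write(json.dumps(record) + "\n")


# Usage with plain lists standing in for the arrays an observer receives.
results_path = Path(tempfile.mkdtemp()) / "results.jsonl"
append_jsonl(results_path, [["M", "I", "G", "U", "E"]], [[0]])
append_jsonl(results_path, [["A", "L", "O", "H", "A"]], [[5]])

with open(results_path) as fp:
    records = [json.loads(line) for line in fp]

print(records[1]["y"])
```

Inside observe, the equivalent call would be append_jsonl(self.experiment_path / "results.jsonl", x, y).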

Putting it all together

In this next snippet, we put everything together into the final version. Notice how this simple example doesn’t require any complex logic for finish. In other scenarios, you might want to finish the experiment by terminating your active run on mlflow or wandb.

from pathlib import Path
from uuid import uuid4
import json

import numpy as np

from poli.core.black_box_information import BlackBoxInformation
from poli.core.util.abstract_observer import AbstractObserver

THIS_DIR = Path().resolve()

class SimpleObserver(AbstractObserver):
    def __init__(self):
        # Creating a unique id for this experiment in
        # particular:
        experiment_id = str(uuid4())
        self.experiment_id = experiment_id

        # Creating a local directory for the results
        experiment_path = THIS_DIR / "results" / experiment_id
        experiment_path.mkdir(exist_ok=True, parents=True)

        self.experiment_path = experiment_path

    def initialize_observer(
        self,
        problem_setup_info: BlackBoxInformation,
        caller_info: object,
        seed: int
    ) -> object:

        # Saving the metadata for this experiment
        metadata = problem_setup_info.as_dict()

        # Adding the information the user wanted to provide
        # (Recall that this caller info gets forwarded
        # from the objective_factory.create function)
        metadata["caller_info"] = caller_info

        # Saving the seed
        metadata["seed"] = seed

        # Saving the metadata
        with open(self.experiment_path / "metadata.json", "w") as f:
            json.dump(metadata, f)
    
    def observe(self, x: np.ndarray, y: np.ndarray, context=None) -> None:
        # Appending these results to the results file.
        with open(self.experiment_path / "results.txt", "a") as fp:
            fp.write(f"{x.tolist()}\t{y.tolist()}\n")
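
For observers that do hold a resource, finish is the natural place to release it. As a minimal sketch under stated assumptions, the hypothetical FileHandleObserver below keeps a single file open for the whole run and closes it in finish. It deliberately does not inherit from AbstractObserver so that it runs without poli, and plain lists stand in for arrays.

```python
import tempfile
from pathlib import Path


class FileHandleObserver:
    """Hypothetical observer that keeps one file open for the whole run."""

    def __init__(self, results_path):
        self.results_path = Path(results_path)
        self._fp = open(self.results_path, "a")

    def observe(self, x, y, context=None):
        self._fp.write(f"{x}\t{y}\n")

    def finish(self):
        # Closing flushes pending writes; the guard makes repeated calls harmless.
        if not self._fp.closed:
            self._fp.close()


observer = FileHandleObserver(Path(tempfile.mkdtemp()) / "results.txt")
observer.observe([["A", "L", "O", "H", "A"]], [[5]])
observer.finish()
observer.finish()  # A second call does nothing.

print(observer.results_path.read_text())
```

The guard matters because finish may be called both by the user and again when the object is deleted at the end of the script.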

Logging a couple of queries to aloha

Using the aloha toy problem, let’s check that our observer logic works as expected:

from poli import objective_factory

# We create an instance of the observer
observer = SimpleObserver()

# We instantiate the objective function
problem = objective_factory.create(
    name="aloha",
    observer=observer,
)
f, x0 = problem.black_box, problem.x0

# We initialize the observer
observer.initialize_observer(
    problem_setup_info=f.info,
    caller_info={},
    seed=None,
)

# We set the observer to track f.
f.set_observer(observer)

At this point, the observer’s __init__ has created a folder called results in the current working directory, and initialize_observer has written the metadata into it. We can load up the metadata just to be sure:

with open(observer.experiment_path / "metadata.json") as fp:
    print(json.load(fp))
{'name': 'aloha', 'max_sequence_length': 5, 'aligned': True, 'fixed_length': True, 'deterministic': True, 'discrete': True, 'fidelity': None, 'alphabet': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'], 'log_transform_recommended': False, 'padding_token': '', 'caller_info': {}, 'seed': None}

Let’s query the objective function at three points, and check whether the results were saved accordingly:

print(f(np.array([list("MIGUE")])))
print(f(np.array([list("FLEAS")])))
print(f(np.array([list("ALOHA")])))
[[0]]
[[1]]
[[5]]

We can verify by loading up and printing the results.txt file:

with open(observer.experiment_path / "results.txt") as fp:
    print(fp.read())
[['M', 'I', 'G', 'U', 'E']]	[[0]]
[['F', 'L', 'E', 'A', 'S']]	[[1]]
[['A', 'L', 'O', 'H', 'A']]	[[5]]
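
Since each line holds the Python list representations of x and y separated by a tab, the file can be read back for analysis with the standard library alone. The sketch below is one possible way to do it. load_results is a hypothetical helper, and the example writes its own small file with the same layout so that it runs independently of the experiment above.

```python
import ast
import tempfile
from pathlib import Path


def load_results(path):
    """Parse tab-separated lines of list literals into (x, y) pairs."""
    pairs = []
    with open(path) as fp:
        for line in fp:
            line = line.rstrip("\n")
            if not line:
                continue
            x_text, y_text = line.split("\t")
            # literal_eval safely parses Python literals such as nested lists.
            pairs.append((ast.literal_eval(x_text), ast.literal_eval(y_text)))
    return pairs


# A file with the same layout as the output above.
example_path = Path(tempfile.mkdtemp()) / "results.txt"
example_path.write_text(
    "[['M', 'I', 'G', 'U', 'E']]\t[[0]]\n"
    "[['A', 'L', 'O', 'H', 'A']]\t[[5]]\n"
)

pairs = load_results(example_path)
# Pick the evaluation with the highest objective value.
best_x, best_y = max(pairs, key=lambda pair: pair[1][0][0])
print("".join(best_x[0]), best_y[0][0])
```

ast.literal_eval is used instead of json.loads because the list representation uses single quotes, which are not valid JSON.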

Conclusion

This small tutorial showcases the logic behind observers, which are the main way in which poli logs results. We saw

  1. the structure of an AbstractObserver, and which abstract methods need to be overridden,

  2. how initialize_observer is called, and

  3. how each query to the objective function is observed.

Tip

If you are interested in using more complex logic for your logging, you can check the examples folder in poli, as they include two observers using mlflow and wandb.

poli is also able to isolate observers, and the examples folder also includes a description of how.