poli.objective_repository.rdkit_qed.register.QEDBlackBox
- class poli.objective_repository.rdkit_qed.register.QEDBlackBox(string_representation: Literal['SMILES', 'SELFIES'] = 'SMILES', alphabet: list[str] | None = None, max_sequence_length: int = inf, batch_size: int = None, parallelize: bool = False, num_workers: int = None, evaluation_budget: int = inf)
Quantitative estimate of druglikeness (QED) black box.
A simple black box that returns the QED of a molecule. By default, the result of concatenating the input tokens is assumed to be a SMILES string. To indicate that the input is a SELFIES string instead, pass string_representation="SELFIES", which is reflected in the from_selfies flag.
RDKit’s Chem.MolFromSmiles and qed functions are known to fail silently, so the black box returns NaN if the molecule cannot be parsed or if qed returns something other than a float.
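The NaN fallback described above can be sketched as follows. This is a minimal illustration, not the library's implementation: score_tokens, toy_parse and toy_score are hypothetical names, and the parser and scorer are passed in as callables so the sketch runs without RDKit. With RDKit installed, Chem.MolFromSmiles and rdkit.Chem.QED.qed could presumably be passed in their place.

```python
import math
from typing import Callable, Optional, Sequence


def score_tokens(
    tokens: Sequence[str],
    parse: Callable[[str], Optional[object]],
    score: Callable[[object], object],
) -> float:
    """Join tokens into one string and score it, returning NaN on any failure."""
    string = "".join(tokens)
    try:
        molecule = parse(string)
    except Exception:
        return math.nan
    # A silent parse failure shows up as None rather than as an exception.
    if molecule is None:
        return math.nan
    try:
        value = score(molecule)
    except Exception:
        return math.nan
    # Anything other than a float is treated as a failed evaluation.
    if not isinstance(value, float):
        return math.nan
    return value


# Hypothetical stand-ins for a real parser and scorer.
def toy_parse(string: str) -> Optional[str]:
    """Accept purely alphabetic strings, return None otherwise."""
    return string if string and string.isalpha() else None


def toy_score(molecule: str) -> float:
    """Return a made-up score derived from the string length."""
    return len(molecule) / 10.0
```

Calling score_tokens(["C", "("], toy_parse, toy_score) returns NaN because the toy parser rejects the joined string, which mirrors how an unparsable molecule is reported.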
- Parameters
string_representation (Literal["SMILES", "SELFIES"], optional) – The string representation to use, by default “SMILES”.
alphabet (list[str] | None, optional) – The alphabet to be used for the SMILES or SELFIES representation. It is common that the alphabet depends on the dataset used, so it is recommended to pass it as an argument. Default is None.
max_sequence_length (int, optional) – The maximum length of the sequence. Default is infinity.
batch_size (int, optional) – The batch size for processing multiple inputs simultaneously, by default None.
parallelize (bool, optional) – Flag indicating whether to parallelize the computation, by default False.
num_workers (int, optional) – The number of workers to use for parallel computation, by default None.
evaluation_budget (int, optional) – The maximum number of function evaluations. Default is infinity.
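A hedged sketch of how these parameters might be passed. The import path and keyword names are taken from the signature above; build_qed_black_box is a hypothetical helper, and the budget of 100 is an arbitrary illustrative value. The import is done inside the function so the snippet can be loaded even where poli is not installed.

```python
def build_qed_black_box(alphabet=None, evaluation_budget=100):
    """Construct a QEDBlackBox for SELFIES input using the documented keywords."""
    # Imported lazily: this line requires poli (and its RDKit dependency).
    from poli.objective_repository.rdkit_qed.register import QEDBlackBox

    return QEDBlackBox(
        string_representation="SELFIES",
        alphabet=alphabet,
        evaluation_budget=evaluation_budget,
    )
```

Passing an explicit alphabet is recommended in the parameter description above, since the alphabet usually depends on the dataset.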
- from_selfies
Flag indicating whether the input is a SELFIES string.
- Type
bool
- from_smiles
Flag indicating whether the input is a SMILES string.
- Type
bool
- _black_box(x, context=None)
The main black box method that performs the computation: it computes the QED of the molecule in x.
- __init__(string_representation: Literal['SMILES', 'SELFIES'] = 'SMILES', alphabet: list[str] | None = None, max_sequence_length: int = inf, batch_size: int = None, parallelize: bool = False, num_workers: int = None, evaluation_budget: int = inf)
Initialize the QEDBlackBox.
- Parameters
string_representation (Literal["SMILES", "SELFIES"], optional) – The string representation to use, by default “SMILES”.
alphabet (list[str] | None, optional) – The alphabet to be used for the SMILES or SELFIES representation. It is common that the alphabet depends on the dataset used, so it is recommended to pass it as an argument. Default is None.
max_sequence_length (int, optional) – The maximum length of the sequence. Default is infinity.
batch_size (int, optional) – The batch size for parallel evaluation, by default None.
parallelize (bool, optional) – Flag indicating whether to parallelize the evaluation, by default False.
num_workers (int, optional) – The number of workers for parallel evaluation, by default None.
evaluation_budget (int, optional) – The maximum number of evaluations, by default float("inf").
Methods
__init__([string_representation, alphabet, ...]) – Initialize the QEDBlackBox.
reset_evaluation_budget() – Resets the evaluation budget by setting the number of evaluations made to 0.
set_observer(observer) – Set the observer object for recording observations during evaluation.
terminate() – Terminate the black box optimization problem.
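To illustrate what resetting the evaluation budget means, here is a minimal sketch of a budgeted callable. BudgetedFunction and BudgetExceeded are hypothetical names, and raising an exception once the budget is used up is an assumption of this sketch rather than documented behavior of QEDBlackBox.

```python
class BudgetExceeded(Exception):
    """Hypothetical error raised when the evaluation budget is used up."""


class BudgetedFunction:
    """Wrap a function and count how many inputs it has evaluated."""

    def __init__(self, function, evaluation_budget=float("inf")):
        self.function = function
        self.evaluation_budget = evaluation_budget
        self.num_evaluations = 0

    def __call__(self, inputs):
        # Refuse the whole batch if it would go over the budget (an assumption).
        if self.num_evaluations + len(inputs) > self.evaluation_budget:
            raise BudgetExceeded("evaluation budget exhausted")
        self.num_evaluations += len(inputs)
        return [self.function(item) for item in inputs]

    def reset_evaluation_budget(self):
        # Mirrors the description above: the evaluation count goes back to 0.
        self.num_evaluations = 0
```

After reset_evaluation_budget() the wrapped function accepts inputs again, because only the counter is cleared and the budget itself is unchanged.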