Rapid Stability Predictions#
Rapid Stability Predictions (RaSP) [Blaabjerg et al., 2023] predicts the stability of a protein using a supervised learning approach. Starting from features learned using self supervision [Boomsma and Frellsen, 2017], RaSP learns to predict Rosetta scores using neural networks. The drawback being that only additive mutations could be computed simultaneously. We limit the edit distance to 1 (i.e. we only consider mutations that are one-away from the wildtype).
This objective function is quite similar to foldx_stability
, and can be considered a drop-in replacement for single mutations. Be aware that the scales are different.
Since our black boxes are meant to be maximized, we return the negative of what RaSP predicts.
Prerequisites#
A collection of
pdb
files you’re interested in mutating.
However, your life would be easier if you run this black box objective function inside the poli__rasp
environment. See here for an environment.yml
file.
How to run#
Assuming you have 3ned.pdb
in the same directory as this script:
from pathlib import Path
from poli.objective_repository import RaspBlackBox, RaspProblemFactory
wildtype_pdb_path = Path(__file__).parent / "3ned.pdb"
# You can either
# (i) Create a black box
f = RaspBlackBox(
wildtype_pdb_path=[wildtype_pdb_path],
chains_to_keep=["A"], # <-- The chain in the pdb.
additive=False, # <-- Whether to treat multiple mutations additively.
)
# or
# (ii) creating a problem
problem = RaspProblemFactory().create(
wildtype_pdb_path=[wildtype_pdb_path]
chains_to_keep=["A"], # <-- The chain in the pdb.
additive=False, # <-- Whether to treat multiple mutations additively.
)
f, x0 = problem.black_box, problem.x0
# Querying:
print(f(x0))
Warnings#
Warning
This objective function requires clang
and cmake
, which will be included in the poli__rasp
conda environment. Make sure you are okay with this.
When the objective function is created, we will clone the reduce
GitHub repository inside ~/.poli_objectives/rasp
, and we will compile it. We pin the version to the commit hash bd23a0bf627ae9b08842102a5c2e9404b4a81924
.
This objective function also downloads several models (as .pt
files) from the RaSP repository.
Namely, we download cavity_model_15.pt
and all ds_models
and store them in ~/.poli_objectives/rasp
. These models are taken from the commit hash 3ccebe87e017b6bd737f88e1943557d128c85616
, and the files are checked against a pre-computed md5
checksum.
How to cite#
If you use this black box, we expect you to cite the following resources:
[1] Blaabjerg, Lasse M, Maher M Kassem, Lydia L Good, Nicolas Jonsson, Matteo Cagiada, Kristoffer E Johansson, Wouter Boomsma, Amelie Stein, and Kresten Lindorff-Larsen. “Rapid Protein Stability Prediction Using Deep Learning Representations.” Edited by José D Faraldo-Gómez, Detlef Weigel, Nir Ben-Tal, and Julian Echave. eLife 12 (May 2023): e82593. https://doi.org/10.7554/eLife.82593.
[2] González-Duque, M., Bartels, S., & Michael, R. (2024). poli: a libary of discrete sequence objectives [Computer software]. MachineLearningLifeScience/poli
@article{Blaabjerg:RaSP:2023,
title = {Rapid protein stability prediction using deep learning representations},
volume = {12},
issn = {2050-084X},
doi = {10.7554/eLife.82593},
journal = {eLife},
publisher = {eLife Sciences Publications, Ltd},
author = {Blaabjerg, Lasse M and Kassem, Maher M and Good, Lydia L and Jonsson, Nicolas and Cagiada, Matteo and Johansson, Kristoffer E and Boomsma, Wouter and Stein, Amelie and Lindorff-Larsen, Kresten},
editor = {Faraldo-Gómez, José D and Weigel, Detlef and Ben-Tal, Nir and Echave, Julian},
year = {2023},
month = may,
pages = {e82593}
}
@software{Gonzalez-Duque:poli:2024,
author = {González-Duque, Miguel and Bartels, Simon and Michael, Richard},
month = jan,
title = {{poli: a libary of discrete sequence objectives}},
url = {https://github.com/MachineLearningLifeScience/poli},
version = {0.0.1},
year = {2024}
}
Further reading#
In the examples of poli
you can find how to compute the saturation mutagenesis for a given protein at a given position.