Protein Therapeutics

Computational De Novo Design of Antibodies and Nanobodies

In "Atomically accurate de novo design of antibodies with RFdiffusion", authors from the IPD's Baker Lab propose a pipeline that computationally generates antibody VHHs and scFvs from a target structure and a desired epitope. The authors use cryo-EM to demonstrate that the generated designs fold into Ig structures and bind at the intended site.

The authors show a validated, influenza-targeting VHH, along with scFvs (for both light and heavy chains) to TcdB and a Phox2b peptide-MHC complex. With this, de novo design of antibodies targeting specific epitopes becomes possible, although the initial round of generated binders still needs some affinity optimization.

Try RFantibody on Tamarind Bio today: https://app.tamarind.bio/tools/rfantibody.

Practical Considerations

Scale

The smallest design set in which the authors find hits is a set of 95 VHHs. They also note that a more realistic number of designs to test may be in the 10k range. This is primarily because RF2 does not discriminate strongly between binders and non-binders, and finding a stricter filter remains a challenge. For the time being, yeast display or other high-throughput screening methods seem to be the most reliable path to identifying de novo binders.

Target Site and Hotspot Selection

A candidate binding site should have at least 3 hydrophobic residues, as binding to polar sites and to sites with nearby glycans remains a challenge. See here for strategies on how to generate binders to unstructured peptides, which should be applicable to unstructured loops as well.
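The "at least 3 hydrophobic residues" rule of thumb can be checked quickly before launching any designs. The sketch below is only illustrative: the function names are hypothetical, and the set of residues counted as hydrophobic is an assumption (one common convention), not a definition taken from the authors.

```python
# Assumption: this hydrophobic set is one common convention,
# not a definition given by the RFantibody authors.
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "TYR", "PRO"}


def count_hydrophobic(site_residues):
    """Count residues (three-letter names) that fall in the HYDROPHOBIC set."""
    return sum(1 for name in site_residues if name.upper() in HYDROPHOBIC)


def is_candidate_site(site_residues, min_hydrophobic=3):
    """Apply the 'at least 3 hydrophobic residues' rule of thumb to a site."""
    return count_hydrophobic(site_residues) >= min_hydrophobic


# Hypothetical site: the residue names are made up for illustration.
example_site = ["LEU", "SER", "PHE", "ASP", "ILE", "THR"]
print(count_hydrophobic(example_site), is_candidate_site(example_site))
```

A check like this only screens for composition. It says nothing about glycans near the site or about whether the residues are actually surface exposed.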

RFantibody is highly sensitive to the specified epitope/hotspots, as it generates both the binder and the docked pose. As with most de novo design tools, we recommend trying a smaller number of designs first to see if the generated poses match your expectations. Once you have a set of settings you are happy with, Tamarind can support tens of thousands of in silico designs.

Filtering Methods

From the authors directly:

We recommend the following minimal filtering criteria:

RF2 pAE < 10
RMSD (design versus RF2 predicted) < 2Å

These serve as minimums, and there is no strong guarantee that sequences meeting them are binders. The most significant bottleneck to this pipeline being adopted is the quality of the filtration step. I hope to see improvements as structure prediction tooling improves, especially for antibodies!
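The two thresholds quoted above can be applied to a table of scored designs with a few lines of Python. This is a minimal sketch, not part of RFantibody itself: the dictionary keys (`name`, `pae`, `rmsd`) and the example values are made up for illustration, and the scores are assumed to have already been extracted from the RF2 outputs.

```python
def passes_minimal_filter(design, max_pae=10.0, max_rmsd=2.0):
    """Return True if a design meets both minimal criteria.

    `design` is a dict with hypothetical keys:
      "pae"  - RF2 pAE for the design
      "rmsd" - RMSD in angstroms between the design and the RF2 prediction
    """
    return design["pae"] < max_pae and design["rmsd"] < max_rmsd


# Made-up scores, for illustration only.
designs = [
    {"name": "design_001", "pae": 7.2, "rmsd": 1.1},
    {"name": "design_002", "pae": 12.5, "rmsd": 0.9},
    {"name": "design_003", "pae": 8.9, "rmsd": 3.4},
]

kept = [d["name"] for d in designs if passes_minimal_filter(d)]
print(kept)
```

Because the criteria are minimums, designs that pass are candidates for experimental screening rather than predicted binders.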

Methods

The RFantibody pipeline comprises three steps: structure generation with fine-tuned RFdiffusion, design of CDR sequences with ProteinMPNN, and design validation with fine-tuned RoseTTAFold2.

The authors fine-tune their previous work, RFdiffusion, which showed success in miniproteins and enzymes, on antibody complex structures. The resulting RFdiffusion-antibody model is able to generate Ig-like structures that bind to a user-specified epitope. Note that this is still just a structure, without amino acids assigned to the newly designed CDR loops. To find feasible residues that fold into the RFdiffusion-generated structure, the authors use ProteinMPNN.

Now that there is a list of sequences that RFdiffusion+ProteinMPNN has designed to dock to the target epitope, we need to evaluate whether these are valid binder candidates. For this, the authors predict the structure of each sequence against the target (while constraining the binding site to the epitope) and evaluate the similarity of the predicted structure to the RFdiffusion-generated structure, along with the error metrics from the fine-tuned RoseTTAFold2. I mentioned the guidelines the authors set out previously. These serve as minimums, and there is no strong guarantee that sequences meeting them are binders. The most significant bottleneck to this pipeline being adopted is the quality of the filtration, as the binders that the authors do validate tend to come from high-throughput screens, e.g. yeast display.
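To make the "similarity of the predicted structure to the design" concrete, here is a minimal sketch of an RMSD calculation over matched atoms. It assumes the two coordinate sets are already superimposed and listed in the same atom order. How the authors align structures and which atoms they include is not specified here, so treat those choices as open.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumption: both sets are already superimposed and in the same atom order.
    No alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms: every atom is shifted by 1 along z.
designed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(designed, predicted))
```

In practice the structures would first be superimposed, for example with a Kabsch alignment, before computing this value and comparing it to the 2 Å threshold.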