# De Novo Design in Engine Python Client

### Getting started 

[Install the Engine Python Client](https://levitate.bio/api/api-python-library/)

In a terminal, type:
- conda activate engine 
  - Recommended Python version: 3.7 <= version <= 3.10
- jupyter notebook

In [None]:
# import the engine client after installation and authorization
from engine import EngineClient
import os
import pandas as pd

client = EngineClient()

<br>

## Step 1: Clean my target PDB

In [None]:
# Submit the job
inputPdb = 'input.pdb' #change this to your target pdb

cleanPdbJob = client.submit_clean_pdb(pdb_path = inputPdb) # submit a clean pdb job from filepath

# Check the status of the job
print(client.get_status(cleanPdbJob)) # print the status of the job

In [None]:
# Check the status of the job
print(client.get_status(cleanPdbJob)) # print the status of the job
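
Instead of re-running the status cell by hand, the check can be wrapped in a small polling loop. The sketch below is only an illustration: `wait_for_job` is a hypothetical helper, not part of the Engine client, and the terminal status names `COMPLETED` and `FAILED` are assumptions. Print a real status once and adjust `done_states` to match what your client returns.

```python
import time


def wait_for_job(get_status, job, done_states=("COMPLETED", "FAILED"),
                 poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call get_status(job) repeatedly until it reports a terminal state.

    done_states holds ASSUMED status names; replace them with the values
    your client actually returns.
    """
    status = None
    for _ in range(max_polls):
        status = str(get_status(job))
        if status.upper() in done_states:
            return status
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"job still in state {status!r} after {max_polls} polls")


# Hypothetical usage with the objects defined earlier in this notebook:
# final_status = wait_for_job(client.get_status, cleanPdbJob)
```

The status function is passed in as an argument, so the same helper can be reused for every job type in this notebook.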

<br>
Tip: Once the job is done, you can retrieve the results
<br>
<br>

In [None]:
# Get the results of the job
cleanPdbOutput = client.get_results(cleanPdbJob) # get the results as a dictionary
print(cleanPdbOutput.items()) #see what the output key is called to dump the results

cleanPdbOutput['models'].dump('clean_pdb/') # dump the models stored under the 'models' key to a local directory
!ls 'clean_pdb/' #show the contents of the output directory

In [None]:
#preview the output file
with open('clean_pdb/full_structure.pdb') as f: 
    for i in range(5):
        print(f.readline())
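
Before writing the contig string in Step 2, it helps to know which chains and residue numbers the cleaned file contains. The following is a minimal sketch that reads the chain ID and residue number from the fixed columns of standard PDB `ATOM` records. `chain_residue_ranges` is a hypothetical helper name, and the sketch assumes the cleaned file follows the standard PDB column layout.

```python
def chain_residue_ranges(pdb_lines):
    """Return {chain_id: (lowest_resnum, highest_resnum)} from ATOM records.

    Uses the standard PDB fixed columns: chain ID in column 22 and the
    residue sequence number in columns 23-26 (1-based).
    """
    ranges = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue  # skip headers, HETATM records, and truncated lines
        chain = line[21]
        resnum = int(line[22:26])
        low, high = ranges.get(chain, (resnum, resnum))
        ranges[chain] = (min(low, resnum), max(high, resnum))
    return ranges


# Hypothetical usage on the file dumped above:
# with open('clean_pdb/full_structure.pdb') as f:
#     print(chain_residue_ranges(f))
```

The reported range only gives the lowest and highest residue number per chain. Gaps inside a chain are not detected.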

<br>

## Step 2: RFDiffusion for de novo binder to my target 
GitHub: https://github.com/RosettaCommons/RFdiffusion

In [None]:
# Submit an RFDiffusion job
rfDiffusionJob = client.submit_rf_diffusion(
    template_pdb_path = 'clean_pdb/full_structure.pdb',                   # clean target pdb
    n_rfdiffusion_designs = 10,                                           # number of designs to generate
    rfdiffusion_contigs = '[A1-58/0 40-70]',                              # what to generate: keep target residues A1-58, then build a new 40-70 residue binder chain (contig syntax is documented at the GitHub link above)
    custom_args = ['ppi.hotspot_res=[A40,A43,A50]']                       # interaction hotspot positions on my target
)

# Check the status of the job
print(client.get_status(rfDiffusionJob)) # print the status of the job
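
The contig and hotspot strings in the submission above are easy to mistype. As a sketch, they can be assembled from plain numbers. Following the RFdiffusion README, `A1-58` keeps residues 1 to 58 of target chain A, `/0 ` marks a chain break, and `40-70` is the length range sampled for the new binder. `binder_contig` and `hotspot_arg` are hypothetical helper names, not Engine or RFdiffusion functions.

```python
def binder_contig(target_chain, first_res, last_res, min_len, max_len):
    """Build a binder-design contig such as '[A1-58/0 40-70]'."""
    if first_res > last_res or min_len > max_len:
        raise ValueError("ranges must be given as low, high")
    return f"[{target_chain}{first_res}-{last_res}/0 {min_len}-{max_len}]"


def hotspot_arg(target_chain, residues):
    """Build a hotspot argument such as 'ppi.hotspot_res=[A40,A43,A50]'."""
    joined = ",".join(f"{target_chain}{res}" for res in residues)
    return f"ppi.hotspot_res=[{joined}]"


print(binder_contig("A", 1, 58, 40, 70))
print(hotspot_arg("A", [40, 43, 50]))
```

The two printed strings match the `rfdiffusion_contigs` and `custom_args` values used in the submission cell.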

In [None]:
# Check the status of the job
print(client.get_status(rfDiffusionJob)) # print the status of the job

In [None]:
# Get the results of the job
rfDiffusionOutput = client.get_results(rfDiffusionJob)
print(rfDiffusionOutput.items()) #see what the output key is called to dump the results

rfDiffusionOutput["results"].dump('rf_diffusion/') #dump the results to a local directory

<br>

## Step 3: Filter RFDiffusion outputs by Radius of Gyration

In [None]:
# Path to pdbs
path = 'rf_diffusion/all_outputs/'

# Submit radius of gyration job
radiusOfGyrationJob = client.submit_radius_of_gyration(
    pdb_paths = [path+pdb for pdb in os.listdir(path) if pdb.endswith('.pdb')], # submit all the RFDiffusion pdbs
)

In [None]:
# Check the status of the job
print(client.get_status(radiusOfGyrationJob)) # print the status of the job

In [None]:
# Get the results of the job
radiusOfGyrationOutput = client.get_results(radiusOfGyrationJob)
print(radiusOfGyrationOutput.items()) #see what the output key is called to dump the results

radiusOfGyrationOutput["scores"].dump('radius_of_gyration')
!ls 'radius_of_gyration'

In [None]:
# Filter using Pandas
pd.set_option('display.max_colwidth', 100) # Set maximum column width to 100 characters

# Show data as a dataframe
rgScores = pd.read_csv('radius_of_gyration/rg_score.sc', sep=r'\s+') # the first line of the file was removed by hand in a text editor beforehand
display(rgScores)

# Filter rows where rg is less than a chosen cutoff, e.g. 15
rgFiltered = rgScores[rgScores['rg'] < 15]
display(rgFiltered)
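
If you would rather not edit the score file by hand, `pd.read_csv` can skip the leading line itself via `skiprows=1`. The sketch below runs on a made-up, in-memory score file. Its layout (one extra first line, then a whitespace-separated header containing `rg` and `description`) is an assumption based on the columns used above, and the numbers are toy values, not real results.

```python
import io

import pandas as pd

# Toy stand-in for rg_score.sc; layout and values are ASSUMED for illustration
toy_score_file = """SEQUENCE:
SCORE: total_score rg description
SCORE: 0.0 12.4 design_0_0001
SCORE: 0.0 17.9 design_1_0001
SCORE: 0.0 14.1 design_2_0001
"""

# skiprows=1 drops the first line, so the second line becomes the header
rg_scores = pd.read_csv(io.StringIO(toy_score_file), sep=r"\s+", skiprows=1)

# Same cutoff as in the cell above
rg_filtered = rg_scores[rg_scores["rg"] < 15]
print(rg_filtered[["rg", "description"]])
```

For the real file, replace `io.StringIO(toy_score_file)` with the path `'radius_of_gyration/rg_score.sc'` and check the first few lines of the file before relying on `skiprows=1`.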

In [None]:
filteredBackbones = []
for name in rgFiltered.description:
    filteredBackbones.append(name.split('_0001')[0]+'.pdb')

print(filteredBackbones)

<br>

## Step 4: ProteinMPNN to design binder sequences

In [None]:
# Path to pdbs
path = 'rf_diffusion/all_outputs/'

# Submit job
proteinmpnnJob = client.submit_protein_mpnn(
    pdb_path = path+filteredBackbones[0], # first backbone that passed the radius of gyration filter
    sampling_temperature = 0.1, # ranges 0-1, higher gives more diversity. Recommended 0.1-0.3. Default is 0.1.
    sequences_per_target = 5 # number of sequences to generate per input pdb
)
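
The cell above submits only the first filtered backbone. To design sequences for every backbone that passed the filter, one job per file can be submitted in a loop. This is a sketch: `submit_for_each` is a hypothetical helper, and it assumes the submit function accepts `pdb_path` plus the same keyword arguments used above.

```python
import os


def submit_for_each(submit_fn, directory, filenames, **job_kwargs):
    """Submit one job per file and return {filename: job}.

    submit_fn is called as submit_fn(pdb_path=..., **job_kwargs).
    """
    jobs = {}
    for name in filenames:
        pdb_path = os.path.join(directory, name)
        jobs[name] = submit_fn(pdb_path=pdb_path, **job_kwargs)
    return jobs


# Hypothetical usage with the objects defined in this notebook:
# mpnnJobs = submit_for_each(
#     client.submit_protein_mpnn, 'rf_diffusion/all_outputs/', filteredBackbones,
#     sampling_temperature=0.1, sequences_per_target=5,
# )
```

Keeping the jobs in a dictionary keyed by file name makes it straightforward to check each status and to dump each result into its own directory later.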


In [None]:
# Check the status of the job
print(client.get_status(proteinmpnnJob)) # print the status of the job

In [None]:
# Get the results of the job
proteinmpnnOutput = client.get_results(proteinmpnnJob)
print(proteinmpnnOutput.items()) #see what the output key is called to dump the results
proteinmpnnOutput["output"].dump('protein_mpnn/') 
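
Step 5 reads a file named `monomer_binder.fasta`, which has to be prepared from the ProteinMPNN output first. The sketch below shows one way to read FASTA records and write a single chosen sequence back out. `read_fasta` and `to_fasta` are hypothetical helpers. The file names inside `protein_mpnn/` and the choice of which record (and which chain segment) is the binder are not specified here, so inspect the dumped files before adapting this.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def to_fasta(name, sequence, width=60):
    """Format one sequence as FASTA text, wrapped at `width` characters."""
    body = "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))
    return f">{name}\n{body}\n"


# Toy input for illustration only; these are not real designed sequences
toy_text = ">sample_1\nMKT\nAYI\n>sample_2\nGSS\n"
records = read_fasta(toy_text)
print(to_fasta("monomer_binder", records[0][1]))
```

Writing the string returned by `to_fasta` to `monomer_binder.fasta` produces the input used in the next step.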

<br>

## Step 5: Boltz to predict the structure of the binder

In [None]:
# Submit the job
boltzJob = client.submit_boltz1(
    fasta_paths = ["monomer_binder.fasta"],          
    single_seq_list = ["monomer_binder.fasta"],      
)

In [None]:
# Check the status of the job
print(client.get_status(boltzJob)) # print the status of the job

In [None]:
# Get the results of the job
boltzOutput = client.get_results(boltzJob)
print(boltzOutput.items()) #see what the output key is called to dump the results
boltzOutput["output"].dump('boltz/') 

Next, you can filter monomers using [rmsd](https://levitate.bio/api/api-rmsd/), [relax](https://levitate.bio/api/api-relax/), [solubility score](https://levitate.bio/api/api-solubility-scoring/), and [epitope scan](https://levitate.bio/api/api-epitope-scan/) to assess the fidelity, stability, solubility, and immunogenicity of the designed structures. Also see [interface analyzer](https://levitate.bio/api/api-interface-analyzer/) to assess binding energy.