AMBER

Setting up SIRAH

Note

SIRAH Force Field is distributed with AMBER and AmberTools official releases.

Required Software

AMBER

Amber and AmberTools (version 16 or later) versions properly installed and running in your computer.

Tip

Download the latest version of Amber and AmberTools and refer to the How to obtain Amber23 section for more details on getting Amber23. In addition, go to the Install Amber page for specific instructions based on your operating system. If you have Amber already installed, it is unnecessary to install AmberTools separately.

VMD

The molecular visualization program VMD, version 1.9.3 or later (freely available download). Check VMD’s Installation guide for more instructions.

Prior knowledge

How to perform a standard atomistic molecular dynamics simulations with AMBER and basic usage of VMD.

Tip

AMBER

If you are not familiar with atomistic molecular dynamics simulations, we strongly recommend you to perform the basic usage tutorials of Amber and AmberTools: 1. Building Systems, 3. Creating Stable Systems and Running MD and 5. Case Studies. For more details and documentation, check Amber23 Manual.

VMD

If you are not familiar with VMD, we strongly recommend you to perform the basic usage tutorial of VMD (Using VMD). For more details and documentation, check VMD User Guide and the official VMD documentation.

Download and Setting up SIRAH

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

Download SIRAH for Amber and uncompress it into your working directory.

tar -xzvf sirah_x2.3_24-07.amber.tar.gz

If you are using a different version of SIRAH, type:

tar -xzvf sirah_[version].amber.tgz

You will get the folder sirah_[version].amber/ containing the force field definition, SIRAH Tools in the sirah_[version].amber/tools/ folder, molecular structures to build up systems in sirah_[version].amber/PDB/ and the required material to perform the tutorials in sirah_[version].amber/tutorial/X.X/

Caution

Remember to change X.X to the number that corresponds to the tutorial you are going to do.

Make a new folder for this tutorial in your working directory:

mkdir tutorialX.X; cd tutorialX.X

Create the following symbolic link in the folder tutorialX.X:

ln -s ../sirah_x2.3_24-07.amber sirah.amber

Note

SIRAH Force Field is also distributed with AMBER and AmberTools official releases. To use the native AMBER version of SIRAH, create a symbolic link located in $AMBERHOME:

ln -s $AMBERHOME/dat/SIRAH sirah.amber

Check the AMBER Manual section 3.11.2 for more details.

However, if you want the latest parameters and implementations strongly advise you to use the developers version of SIRAH from GitHub.

1. DNA molecule in implicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to perform a Coarse Grained (CG) simulation of a double stranded DNA using the Generalized Born model for implicit solvent (GB) and the SIRAH force field. The main references for this tutorial are: Dans et al. (latest parameters are those reported here) and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 1, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/1/

Note

If you are not familiar with DNA stuff, we strongly recommend you to first perform the AMBER tutorial on DNA.

1.1. Build CG representations

Map the atomistic structure of a 20-mer DNA to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/1/dna.pdb  -o dna_cg.pdb

The input file -i dna.pdb contains all the heavy atoms composing the DNA molecule, while the output -o dna_cg.pdb preserves a few of them. Please check both PDB structures using VMD:

vmd -m ./sirah.amber/tutorial/1/dna.pdb dna_cg.pdb

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

From now on it is just normal AMBER stuff!

1.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
dna = loadpdb dna_cg.pdb

# Save Parms
saveAmberParmNetcdf dna dna_cg.prmtop dna_cg.ncrst

# EXIT
quit

1.3. Run LEaP

Run the LEAP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology.

This should create a topology file dna_cg.prmtop and a coordinate file dna_cg.ncrst.

Use VMD to check how the CG model looks like:

vmd dna_cg.prmtop dna_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

1.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/1/PMEMD.GPU/ contains typical input files for energy minimization (em_GB.in), equilibration (eq_GB.in) and production (md_GB.in) runs. Please check carefully the input flags therein.

Tip

Some commonly used flags in AMBER

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

You can find example input files for CPU versions of sander and pmemd at folders SANDER/ and PMEMD.CPU/, within sirah.amber/tutorial/1/

Energy Minimization:

pmemd.cuda -O -i ../sirah.amber/tutorial/1/PMEMD.GPU/em_GB.in -p ../dna_cg.prmtop -c ../dna_cg.ncrst -o dna_cg_em.out -r dna_cg_em.ncrst &

Important

In the course of long MD simulations the capping residues may eventually separate, this effect is called helix fraying. To avoid such behavior is necessary to set Watson-Crick restraints for the capping base pairs of this CG DNA at the end of eq_GB.in and md_GB.in files. Check the files lines that start with &rst.

Warning

If you are using sander, to avoid the helix fraying, you must create a symbolic link to the file dna_cg.RST, which contains the definition of Watson-Crick restraints for the capping base pairs of this CG DNA:

ln -s ../sirah.amber/tutorial/1/SANDER/dna_cg.RST

The file dna_cg.RST can only be read by sander, pmemd reads a different restrain format.

Equilibration:

pmemd.cuda -O -i ../sirah.amber/tutorial/1/PMEMD.GPU/eq_GB.in -p ../dna_cg.prmtop -c dna_cg_em.ncrst -o dna_cg_eq.out -r dna_cg_eq.ncrst -x dna_cg_eq.nc &

Production (100ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/1/PMEMD.GPU/md_GB.in -p ../dna_cg.prmtop -c dna_cg_eq.ncrst -o dna_cg_md.out -r dna_cg_md.ncrst  -x dna_cg_md.nc &

1.5. Visualizing the simulation

Now you can load, visualize and analize the trajectory in VMD:

vmd ../dna_cg.prmtop ../dna_cg.ncrst dna_cg_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

2. DNA molecule in explicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a double stranded DNA embedded in explicit solvent (called WatFour, WT4). The main references for this tutorial are: Dans et al. (latest parameters are those reported here) and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 2, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/2/

2.1. Build CG representations

Map the atomistic structure of a 20-mer DNA to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/2/dna.pdb -o dna_cg.pdb

The input file -i dna.pdb contains all the heavy atoms composing the DNA molecule, while the output -o dna_cg.pdb preserves a few of them. Please check both PDB structures using VMD:

vmd -m ./sirah.amber/tutorial/2/dna.pdb dna_cg.pdb

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

From now on it is just normal AMBER stuff!

2.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
dna = loadpdb dna_cg.pdb

# Info on system charge
charge dna

# Add solvent, counterions and 0.15M NaCl
# Tuned solute-solvent closeness for best hydration
solvateoct dna WT4BOX 20 0.7
addionsrand dna NaW 88 ClW 50

# Save Parms
saveAmberParmNetcdf dna dna_cg.prmtop dna_cg.ncrst

# EXIT
quit

See also

The available electrolyte species in SIRAH force field are: Na⁺ (NaW), K⁺ (KW) and Cl⁻ (ClW) which represent solvated ions in solution. One ion pair (e.g., NaW-ClW) each 34 WT4 molecules results in a salt concentration of ~0.15M (see Appendix for details). Counterions were added according to Machado et al..

2.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology of some residues.

This should create a topology file dna_cg.prmtop and a coordinate file dna_cg.ncrst.

Use VMD to check how the CG model looks like:

vmd dna_cg.prmtop dna_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

2.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/2/PMEMD/ contains typical input files for energy minimization (em_WT4.in), equilibration (eq_WT4.in) and production (md_WT4.in) runs. Please check carefully the input flags therein, in particular the definition of flag chngmask=0 at &ewald section is mandatory.

Tip

Some commonly used flags in AMBER

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

You can find example input files for CPU versions of sander and pmemd at folders SANDER/ and PMEMD/, within sirah.amber/tutorial/2/

Energy Minimization:

pmemd.cuda -O -i ../sirah.amber/tutorial/2/PMEMD/em_WT4.in -p ../dna_cg.prmtop -c ../dna_cg.ncrst -o dna_cg_em.out -r dna_cg_em.ncrst &

Important

In the course of long MD simulations the capping residues may eventually separate, this effect is called helix fraying. To avoid such behavior is necessary to set Watson-Crick restraints for the capping base pairs of this CG DNA at the end of eq_GB.in and md_GB.in files. Check the files lines that start with &rst.

Warning

If you are using sander, to avoid the helix fraying, you must create a symbolic link to the file dna_cg.RST, which contains the definition of Watson-Crick restraints for the capping base pairs of this CG DNA:

ln -s ../sirah.amber/tutorial/1/SANDER/dna_cg.RST

The file dna_cg.RST can only be read by sander, pmemd reads a different restrain format.

Equilibration (NPT):

pmemd.cuda -O -i ../sirah.amber/tutorial/2/PMEMD/eq_WT4.in -p ../dna_cg.prmtop -c dna_cg_em.ncrst -ref dna_cg_em.ncrst -o dna_cg_eq.out -r dna_cg_eq.ncrst -x dna_cg_eq.nc &

Production (100ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/2/PMEMD/md_WT4.in -p ../dna_cg.prmtop -c dna_cg_eq.ncrst -o dna_cg_md.out -r dna_cg_md.ncrst -x dna_cg_md.nc &

2.5. Visualizing the simulation

That’s it! Now you can analyze the trajectory. First, process the output trajectory to account for the Periodic Boundary Conditions (PBC):

echo -e "autoimage\ngo\nquit\n" | cpptraj -p ../dna_cg.prmtop -y dna_cg_md.nc -x dna_cg_md_pbc.nc --interactive

Then, load the processed trajectory in VMD:

vmd ../dna_cg.prmtop ../dna_cg.ncrst dna_cg_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

3. Multiscale simulation in implicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to apply a multiscale representation of DNA, which is available in the SIRAH force field, to study protein-DNA interactions. In this approach, the protein and its binding region are treated with an atomistic force field, while the contextual DNA is represented at coarse-grained (CG) level. The simulated system is composed of the human TATA binding protein (TBP PDB: 1C9B) bounded to the TATA box at the promoter region (-64 to +13) of the Human Immunodeficiency Virus type 1 (HIV-1, GenBank: K03455, Figure 1). Solvent effects are considered implicitly using the Generalized Born model (GB).

The main references for this tutorial are: Machado et al. and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

map to buried treasure

Figure 1. Promoter region of HIV-1. The TATA box is highlighted in orange, while base pairs at 0.7nm of TBP are colored in yellow.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 3, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/3/

3.1. Build CG representations

Map only the atomistic structure of nucleotides outside the TATA box of the HIV-1 promoter to CG:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/3/tbphiv.pdb -R 1-32,45-110,123-154 -o dna_hyb.pdb

The input file -i tbphiv.pdb contains all the heavy atoms composing the protein-DNA system. Mapped residues are selected through option -R. The selection considers a “buffer region” of two base pairs at each side of the TATA box (Figure 1). The atomistic/CG interface must always be a step of B form DNA. The resulting coordinates are saved in the output -o dna_hyb.pdb.

Please check both PDB structures in VMD:

vmd -m ./sirah.amber/tutorial/3/tbphiv.pdb dna_hyb.pdb

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

From now on it is just normal AMBER stuff!

3.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load AMBER force field (parm14SB/parmbsc0)
source oldff/leaprc.ff14SB

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
dna = loadpdb dna_hyb.pdb

# Save Parms
saveAmberParmNetcdf dna dna_hyb.prmtop dna_hyb.ncrst

# EXIT
quit

Note

  • According to AMBER version 10, 11 or 12 (14) the source file for parm99SB/bsc0 is leaprc.ff99bsc0, leaprc.ff10 or leaprc.ff12SB respectively.

  • The newest force field version for DNA, bsc1, works equally well with SIRAH.

3.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology of some residues.

This should create a topology file dna_hyb.prmtop and a coordinate file dna_hyb.ncrst.

Use VMD to check how the multiscale model looks like:

vmd dna_hyb.prmtop dna_hyb.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

3.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/3/PMEMD.GPU/ contains typical input files for energy minimization (em_HYB.in), equilibration (eq_HYB.in) and production (md_HYB.in) runs. Please check carefully the input flags therein.

Tip

Some commonly used flags in AMBER

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

You can find example input files for CPU versions of sander and pmemd at folders SANDER/ and PMEMD.CPU/, within sirah.amber/tutorial/3/

However, this simulation is time consuming owing to the system’s size. A parallel or CUDA implementation of Amber is advised.

Energy Minimization:

pmemd.cuda -O -i ../sirah.amber/tutorial/3/PMEMD.GPU/em_HYB.in -p ../dna_hyb.prmtop -c ../dna_hyb.ncrst -o dna_hyb_em.out -r dna_hyb_em.ncrst &

Important

In the course of long MD simulations the capping residues may eventually separate, this effect is called helix fraying. To avoid such behavior is necessary to set Watson-Crick restraints for the capping base pairs of this CG DNA at the end of eq_GB.in and md_GB.in files. Check the files lines that start with &rst.

Warning

If you are using sander, to avoid the helix fraying, you must create a symbolic link to the file dna_cg.RST, which contains the definition of Watson-Crick restraints for the capping base pairs of this CG DNA:

ln -s ../sirah.amber/tutorial/3/SANDER/dna_hyb.RST

The file dna_cg.RST can only be read by sander, pmemd reads a different restrain format.

Equilibration:

pmemd.cuda -O -i ../sirah.amber/tutorial/3/PMEMD.GPU/eq_HYB.in -p ../dna_hyb.prmtop -c dna_hyb_em.ncrst -o dna_hyb_eq.out -r dna_hyb_eq.ncrst -x dna_hyb_eq.nc &

Production (10ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/3/PMEMD.GPU/md_HYB.in -p ../dna_hyb.prmtop -c dna_hyb_eq.ncrst -o dna_hyb_md.out -r dna_hyb_md.ncrst -x dna_hyb_md.nc &

3.5. Visualizing the simulation

That’s it! Now you can load, visualize and analize the trajectory file in VMD:

vmd ../dna_hyb.prmtop ../dna_hyb.ncrst dna_hyb_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

4. Closed circular DNA in implicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a closed circular DNA using the Generalized Born model (GB) for implicit solvent.

The main references for this tutorial are: Dans et al. (latest parameters are those reported here) and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 4, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/4/

4.1. Build CG representations

Map the atomistic structure of the closed circular DNA to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/4/ccdna.pdb | sed -e 's/DCX A   1/CW5 A   1/'  -e 's/DCX B   1/CW5 B   1/' -e 's/DGX A 100/GW3 A 100/' -e 's/DGX B 100/GW3 B 100/' > ccdna_cg.pdb

Note

The 5’ and 3’ end residues must be renamed to AW5, TW5, GW5 or CW5 and AW3, TW3, GW3 or CW3 to represent the corresponding Adenine, Thymine, Guanine or Cytosine extremes in a closed circular DNA.

The input file -i ccdna.pdb contains all the heavy atoms composing the DNA molecule, while the output -o ccdna_cg.pdb preserves a few of them.

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

Please check both PDB structures using VMD:

vmd -m ./sirah.amber/tutorial/4/ccdna.pdb ccdna_cg.pdb

From now on it is just normal AMBER stuff!

4.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
dna = loadpdb ccdna_cg.pdb

# Make a covalently closed circular DNA
bond dna.1.PX dna.100.C1X
bond dna.101.PX dna.200.C1X

# Save Parms
saveAmberParmNetcdf dna ccdna_cg.prmtop ccdna_cg.ncrst

# EXIT
quit

4.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology of some residues.

This should create a topology file ccdna_cg.prmtop and a coordinate file ccdna_cg.ncrst.

Use VMD to check how the CG model looks like:

vmd ccdna_cg.prmtop ccdna_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

4.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/4/GPU contains typical input files for energy minimization (em_GBSA.in), equilibration (eq_GBSA.in) and production (md_GBSA.in) runs. Please check carefully the input flags therein.

Tip

Some commonly used flags in AMBER

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

You can find example input files for CPU versions of sander and pmemd at folder CPU/, within sirah.amber/tutorial/4/

Energy Minimization:

pmemd.cuda -O -i ../sirah.amber/tutorial/4/GPU/em_GBSA.in -p ../ccdna_cg.prmtop -c ../ccdna_cg.ncrst -o ccdna_cg_em.out -r ccdna_cg_em.ncrst &

Equilibration:

pmemd.cuda -O -i ../sirah.amber/tutorial/4/GPU/eq_GBSA.in -p ../ccdna_cg.prmtop -c ccdna_cg_em.ncrst -o ccdna_cg_eq.out -r ccdna_cg_eq.ncrst -x ccdna_cg_eq.nc &

Production (100ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/4/GPU/md_GBSA.in -p ../ccdna_cg.prmtop -c ccdna_cg_eq.ncrst -o ccdna_cg_md.out -r ccdna_cg_md.ncrst -x ccdna_cg_md.nc &

4.5. Visualizing the simulation

That’s it! Now you can load, visualize and analize the trajectory file in VMD:

vmd ../ccdna_cg.prmtop ../ccdna_cg.ncrst ccdna_cg_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

5. Proteins in explicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a protein in explicit solvent (called WatFour, WT4). The main references for this tutorial are: Darré et al., Machado et al. and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 5, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/5/

5.1. Build CG representations

Caution

The mapping to CG requires the correct protonation state of each residue at a given pH. We recommend using the CHARMM-GUI server and use the PDB Reader & Manipulator to prepare your system. An account is required to access any of the CHARMM-GUI Input Generator modules, and it can take up to 24 hours to obtain one.

Other option is the PDB2PQR server and choosing the output naming scheme of AMBER for best compatibility. This server was utilized to generate the PQR file featured in this tutorial. Be aware that modified residues lacking parameters such as: MSE (seleno MET), TPO (phosphorylated TYR), SEP (phosphorylated SER) or others are deleted from the PQR file by the server. In that case, mutate the residues to their unmodified form before submitting the structure to the server.

Map the protonated atomistic structure of protein 1CRN to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/5/1CRN.pqr -o 1CRN_cg.pdb

The input file -i 1CRN.pqr contains the atomistic representation of 1CRN structure at pH 7.0, while the output -o 1CRN_cg.pdb is its SIRAH CG representation.

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

Note

Pay attention to residue names when mapping structures from other atomistic force fields or experimental structures. Although we provide compatibility for naming schemes in PDB, GMX, GROMOS, CHARMM and OPLS, there might always be some ambiguity in the residue naming, specially regarding protonation states, that may lead to a wrong mapping. For example, SIRAH Tools always maps the residue name “HIS” to a Histidine protonated at the epsilon nitrogen (\(N_{\epsilon}\)) regardless the actual proton placement. Similarly, protonated Glutamic and Aspartic acid residues must be named “GLH” and “ASH”, otherwise they will be treated as negative charged residues. In addition, protonated and disulfide bonded Cysteines must be named “CYS” and “CYX” respectively. These kind of situations need to be carefully checked by the users. In all cases the residues preserve their identity when mapping and back-mapping the structures. Hence, the total charge of the protein should be the same at atomistic and SIRAH levels. You can check the following mapping file to be sure of the compatibility: sirah.amber/tools/CGCONV/maps/sirah_prot.map.

Important

By default, charged termini are used, but it is possible to set them neutral by renaming the residues from s[code] to a[code] (Nt-acetylated) or m[code] (Ct-amidated) after mapping to CG, where [code] is the root residue name in SIRAH. For example, to set a neutral N-terminal Histidine protonated at epsilon nitrogen (\(N_{\epsilon}\)) rename it from “sHe” to “aHe”.

Please check both PDB and PQR structures using VMD:

vmd -m sirah.amber/tutorial/5/1CRN.pqr 1CRN_cg.pdb

From now on it is just normal Amber stuff!

5.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
protein = loadpdb 1CRN_cg.pdb

# Info on system charge
charge protein

# Set S-S bridges
bond protein.3.BSG protein.40.BSG
bond protein.4.BSG protein.32.BSG
bond protein.16.BSG protein.26.BSG

# Add solvent, counterions and 0.15M NaCl
# Tuned solute-solvent closeness for best hydration
solvateOct protein WT4BOX 20 0.7
addIonsRand protein NaW 22 ClW 22

# Save Parms
saveAmberParmNetcdf protein 1CRN_cg.prmtop 1CRN_cg.ncrst

# EXIT
quit

Caution

Each disulfide bond must be defined explicitly in LEaP using the command bond, e.g.: “bond unit.ri.BSG unit.rj.BSG”. Where ri and rj correspond to the residue index in the topology file starting from 1, which may differ from the biological sequence in the PDB file. You can try the command pdb4amber to get those indexes from the atomistic structure, but be aware that it may not work if the Cysteine residues are too far away:

pdb4amber -i sirah.amber/tutorial/5/1CRN.pqr -o 1CRN_aa.pdb && cat 1CRN_aa_sslink

See also

The available electrolyte species in SIRAH force field are: Na⁺ (NaW), K⁺ (KW) and Cl⁻ (ClW) which represent solvated ions in solution. One ion pair (e.g., NaW-ClW) each 34 WT4 molecules results in a salt concentration of ~0.15M (see Appendix for details). Counterions were added according to Machado et al..

5.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology of some residues.

This should create a topology file 1CRN_cg.prmtop and a coordinate file 1CRN_cg.ncrst.

Use VMD to check how the CG model looks like and particularly the presence of disulfide bonds:

vmd 1CRN_cg.prmtop 1CRN_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

5.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/5/ contains typical input files for energy minimization (em1_WT4.in and em2_WT4.in), equilibration (eq1_WT4.in and eq2_WT4.in) and production (md_WT4.in) runs. Please check carefully the input flags therein, in particular the definition of flag chngmask=0 at &ewald section is mandatory.

Tip

Some commonly used flags in Amber

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

  • -ref: Reference file

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

The same input files can be used to run on CPU with the modules pmemd or sander.

Energy Minimization of side chains and solvent by restraining the backbone:

pmemd.cuda -O -i ../sirah.amber/tutorial/5/em1_WT4.in -p ../1CRN_cg.prmtop -c ../1CRN_cg.ncrst -ref ../1CRN_cg.ncrst -o 1CRN_cg_em1.out -r 1CRN_cg_em1.ncrst &

Energy Minimization of whole system:

pmemd.cuda -O -i ../sirah.amber/tutorial/5/em2_WT4.in -p ../1CRN_cg.prmtop -c 1CRN_cg_em1.ncrst -o 1CRN_cg_em2.out -r 1CRN_cg_em2.ncrst &

Solvent Equilibration (NPT):

pmemd.cuda -O -i ../sirah.amber/tutorial/5/eq1_WT4.in -p ../1CRN_cg.prmtop -c 1CRN_cg_em2.ncrst -ref 1CRN_cg_em2.ncrst -o 1CRN_cg_eq1.out -r 1CRN_cg_eq1.ncrst -x 1CRN_cg_eq1.nc &

Caution

Option restraintmask=:’1-46’ in input file eq1_WT4.in must be set specifically for each system to embrace all protein’s residues.

Soft equilibration to improve side chain solvation (NPT):

pmemd.cuda -O -i ../sirah.amber/tutorial/5/eq2_WT4.in -p ../1CRN_cg.prmtop -c 1CRN_cg_eq1.ncrst -ref 1CRN_cg_eq1.ncrst -o 1CRN_cg_eq2.out -r 1CRN_cg_eq2.ncrst -x 1CRN_cg_eq2.nc &

Production (1000ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/5/md_WT4.in -p ../1CRN_cg.prmtop -c 1CRN_cg_eq2.ncrst -o 1CRN_cg_md.out -r 1CRN_cg_md.ncrst -x 1CRN_cg_md.nc &

5.5. Visualizing the simulation

That’s it! Now you can analyze the trajectory. Process the output trajectory to account for the Periodic Boundary Conditions (PBC):

echo -e "autoimage\ngo\nquit\n" | cpptraj -p ../1CRN_cg.prmtop -y 1CRN_cg_md.nc -x 1CRN_cg_md_pbc.nc --interactive

Load the processed trajectory in VMD:

vmd ../1CRN_cg.prmtop ../1CRN_cg.ncrst 1CRN_cg_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

6. Lipid bilayers in explicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a DMPC bilayer in explicit solvent (called WatFour, WT4). The main references for this tutorial are: Barrera et al. and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial. You may also find interesting this book chapter.

Note

If you are not familiar with membrane stuff we strongly recommend you to first perform the Amber tutorial on lipids.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 6, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/6/

6.1. Build CG representations

Map the atomistic structure of the preassembled DMPC bilayer to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/6/DMPC64.pdb -o DMPC64_cg.pdb -a sirah.amber/tools/CGCONV/maps/tieleman_lipid.map

The input file -i DMPC64.pdb contains the atomistic representation of the DMPC bilayer, while the output -o DMPC64_cg.pdb is its SIRAH CG representation. The flag -a is a mapping scheme for lipids.

Important

By default, no mapping is applied to lipids, as there is no standard naming convention for them. So users are requested to append a MAP file from the list in Table 1, by setting the flag -a in cgconv.pl. We recommend using PACKMOL for building the system. Reference building-block structures are provided at folder sirah.amber/PDB/, which agree with the mapping scheme in sirah.amber/tools/CGCONV/maps/tieleman_lipid.map. The provided DMPC bilayer contains 64 lipid molecules per leaflet distributed in a 6.4 * 6.4 nm surface, taking into account an approximate area per lipid of 0.64 nm2 at 333 K. The starting configuration was created with the input file sirah.amber/tutorial/6/DMPC_bilayer.pkm. See FAQs for cautions on mapping lipids to SIRAH and tips on using fragment-based topologies. If you do not find your issue please start a discussion in our github discussion page F&Q.

Tip

This an advanced usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

The input file DMPC64.pdb contains all the heavy atoms composing the lipids, while the output DMPC64_cg.pdb preserves a few of them. Please check both PDB structures using VMD:

vmd -m sirah.amber/tutorial/6/DMPC64.pdb DMPC64_cg.pdb

From now on it is just normal Amber stuff!

6.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
Lipid = loadpdb DMPC64_cg.pdb

# Add solvent, counterions and 0.15M NaCl
# Tuned solute-solvent closeness for best hydration
solvateBox Lipid WT4BOX {0 0 40} 0.7
addIonsRand Lipid NaW 33 ClW 33

# Save Parms
saveAmberParmNetcdf Lipid DMPC64_cg.prmtop DMPC64_cg.ncrst

# EXIT
quit

See also

The available electrolyte species in SIRAH force field are: Na⁺ (NaW), K⁺ (KW) and Cl⁻ (ClW) which represent solvated ions in solution. One ion pair (e.g., NaW-ClW) each 34 WT4 molecules results in a salt concentration of ~0.15M (see Appendix for details). Counterions were added according to Machado et al..

6.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology.

This should create a topology file DMPC64_cg.prmtop and a coordinate file DMPC64_cg.ncrst.

Use VMD to check how the CG model looks:

vmd DMPC64_cg.prmtop DMPC64_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

By selecting +X, +Y and +Z periodic images from the Periodic tab in the Graphical Representations window you will see small vacuum slices at box boundaries. In the following step we will fix this issue by reducing the box dimensions a few angstroms. See FAQs for issues on membrane systems in Amber.

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

6.4. Resize the box with CPPTRAJ

Note

As PACKMOL does not consider periodicity while building up the system, increasing the XY box sides a few Angstroms may be required to avoid bad contacts between images.

Use a text editor to create the file resize_box.cpptraj including the following lines:

# New box dimensions
box x 66 y 66 z 132

# Amber NetCDF Restart generation
trajout DMPC64_cg_nb.ncrst

# Do it!
go

# EXIT
quit

Run the CPPTRAJ application to adjust the size of the simulation box:

cpptraj -p DMPC64_cg.prmtop -y DMPC64_cg.ncrst -i resize_box.cpptraj

Once again, use VMD to check the PBC images in the new box of the system:

vmd DMPC64_cg.prmtop DMPC64_cg_nb.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

6.5. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/6/ contains typical input files for energy minimization (em_Lipid.in), heating (heat_Lipid.in), equilibration (eq_Lipid.in) and production (md_Lipid.in) runs. Please check carefully the input flags.

Tip

Some commonly used flags in AMBER

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

  • -ref: Reference file

Warning

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

However, this simulation is time consuming owing to the system’s size. A parallel or CUDA implementation of Amber is advised.

Energy Minimization:

pmemd.cuda -O -i ../sirah.amber/tutorial/6/em_Lipid.in -p ../DMPC64_cg.prmtop -c ../DMPC64_cg_nb.ncrst -o DMPC64_cg_em.out -r DMPC64_cg_em.ncrst &

Heating:

pmemd.cuda -O -i ../sirah.amber/tutorial/6/heat_Lipid.in -p ../DMPC64_cg.prmtop -c DMPC64_cg_em.ncrst -o DMPC64_cg_eq_0.out -r DMPC64_cg_eq_0.ncrst -x DMPC64_cg_eq_0.nc &

Important

To avoid “skinnb errors” on GPU due to large box size fluctuations, the system must be equilibrated by several “short” runs using a large skinnb value. The number and length of the runs may vary according to the characteristic stabilization times of the system. For more information visit the Amber tutorial on lipids.

Periodic box equilibration in GPU code (500 ps x 9):

for i in $(seq 1 9)
do
        echo "running equilibration $i"
        pmemd.cuda -O \
        -i ../sirah.amber/tutorial/6/eq_Lipid.in \
        -p ../DMPC64_cg.prmtop \
        -c DMPC64_cg_eq_$(($i -1)).ncrst \
        -o DMPC64_cg_eq_$i.out \
        -r DMPC64_cg_eq_$i.ncrst \
        -x DMPC64_cg_eq_$i.nc
done &

Production (1000ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/6/md_Lipid.in -p ../DMPC64_cg.prmtop -c DMPC64_cg_eq_9.ncrst -o DMPC64_cg_md.out -r DMPC64_cg_md.ncrst -x DMPC64_cg_md.nc &

6.6. Visualizing the simulation

That’s it! Now you can analyze the trajectory.

Process the output trajectory to account for the Periodic Boundary Conditions (PBC):

echo -e "autoimage\ngo\nquit\n" | cpptraj -p ../DMPC64_cg.prmtop -y DMPC64_cg_md.nc -x DMPC64_cg_md_pbc.nc --interactive

Now you can check the simulation using VMD:

vmd ../DMPC64_cg.prmtop DMPC64_cg_md_pbc.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

You can also use CPPTRAJ to calculate the area per lipid:

cpptraj -p ../DMPC64_cg.prmtop -i ../sirah.amber/tutorial/6/area_lipid.cpptraj

Use Grace to plot the results:

xmgrace apl_DMPC64_310K.dat

Note

To calculate the area per lipid, divide the membrane’s area by the DMPC molecules per leaflet:

\[\frac{Area}{Lipid} = \frac{Box(x) * Box(y)}{64}\]

And density profiles and bilayer thickness:

cpptraj -p ../DMPC64_cg.prmtop -i ../sirah.amber/tutorial/6/dens_profile.cpptraj

Use Grace to plot the results:

xmgrace -nxy dens_profile_DMPC64_310K.dat

Note

The thickness of the bilayer is the distance between the two peaks corresponding to the position of phosphate beads (BFO) along the z-axis.

7. Membrane proteins in explicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a protein embedded in a lipid bilayer using explicit solvent (called WatFour, WT4). The main references for this tutorial are: Machado et al., Barrera et al., and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Note

We strongly advise you to read and complete Tutorial 5 and Tutorial 6 before starting.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 7, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/7/

7.1. Build CG representations

Caution

The mapping to CG requires the correct protonation state of each residue at a given pH. We recommend using the CHARMM-GUI server and use the PDB Reader & Manipulator to prepare your system. An account is required to access any of the CHARMM-GUI Input Generator modules, and it can take up to 24 hours to obtain one.

See Tutorial 5 for cautions while preparing and mapping atomistic proteins to SIRAH.

Map the atomistic structure of protein 2KYV to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i sirah.amber/tutorial/7/2kyv.pqr -o 2kyv_cg.pdb

The input file -i 2kyv.pqr contains the atomistic representation of 2KYV structure at pH 7.0, while the output -o 2kyv_cg.pdb is its SIRAH CG representation.

Important

If you already have an atomistic protein within a membrane, then you can simply map the entire system to SIRAH (this is highly recommended) and skip the step 2, however clipping the membrane patch may be required to set a correct solvation box (see bellow). By default, no mapping is applied to lipids, as there is no standard naming convention for them. So users are requested to append a MAP file from the list in Table 1, by setting the flag -a in cgconv.pl. We recommend using PACKMOL for building the system. Reference building-block structures are provided at folder sirah.amber/PDB/, which agree with the mapping scheme sirah.amber/tools/CGCONV/maps/tieleman_lipid.map. See FAQs for cautions on mapping lipids to SIRAH and tips on using fragment-based topologies.

Tip

This an advanced usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

7.1.1. Embed the protein in a lipid bilayer

We will show one possible way to do it by starting from a pre-equilibrated CG membrane patch.

In sirah_[version].amber/tutorial/7/ you will find a pre-stabilized CG DMPC bilayer patch, concatenate it with the previously generated CG representation of the phospholamban (PLN) pentamer (PDB code: 2KYV):

head -qn -1 2kyv_cg.pdb ./sirah.amber/tutorial/7/DMPC_cg.pdb > 2kyv_DMPC_cg_init.pdb

Luckily, we already oriented the protein inside the membrane. For setting up your own system you can go to Orientations of Proteins in Membranes (OPM) database and (if your structure is available) use the dummy atoms provided there to make them match with your membrane model (see Figure 1).

_images/Tuto7.png

Figure 1. Protein oriented inside the membrane from the OPM database with dummy atoms represented as orange spheres.

7.1.2. Delete close contact lipid molecules

We need to use VMD to to delete lipid molecules in close contact with the protein. For a proper treatment and visualization of the system in VMD you must first generate the molecular topology and initial coordinate files.

Use a text editor to create the file geninit.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
ProtMem = loadpdb 2kyv_DMPC_cg_init.pdb

# Save Parms
saveAmberParmNetcdf ProtMem 2kyv_DMPC_cg_init.prmtop 2kyv_DMPC_cg_init.ncrst

# EXIT
quit

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f geninit.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology.

Now, open the files on VMD:

vmd 2kyv_DMPC_cg_init.prmtop 2kyv_DMPC_cg_init.ncrst -e sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

In the VMD main window, select Graphics > Representations. In the Selected Atoms box, type:

not (same residue as (sirah_membrane within 3.5 of sirah_protein) or (sirah_membrane and x < 5 or x > 142 or y < 2 or y > 140))

Important

In the first part of the selection, lipid molecules in close contact with the protein are removed. The second one is made to “trim” the membrane patch, deleting lipids with acyl chains located in the periodic boundary images. This is frequent when using pre-equilibrated membrane patches and is necessary to avoid clashes in the following steps.

To save the refined protein-membrane system, in the VMD main window click on 2kyv_DMPC_cg_init.prmtop, then select File > Save Coordinates. In the Selected atoms option choose the selection you have just created and Save as 2kyv_DMPC_cg.pdb.

From now on it is just normal AMBER stuff!

7.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
ProtMem = loadpdb 2kyv_DMPC_cg.pdb

# Info on system charge
charge ProtMem

# Prevent adding solvent molecules beyond the membrane boundaries
setbox ProtMem centers 0

# Add solvent, counterions and 0.15M NaCl
# Tuned solute-solvent closeness for best hydration
solvateBox ProtMem WT4BOX {0,0,27} 0.7
addIonsRand ProtMem NaW 109 ClW 124

# Save Parms
saveAmberParmNetcdf ProtMem 2kyv_DMPC_cg.prmtop 2kyv_DMPC_cg.ncrst

# EXIT
quit

See also

The available electrolyte species in SIRAH force field are: Na⁺ (NaW), K⁺ (KW) and Cl⁻ (ClW) which represent solvated ions in solution. One ion pair (e.g., NaW-ClW) each 34 WT4 molecules results in a salt concentration of ~0.15M (see Appendix for details). Counterions were added according to Machado et al..

7.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Caution

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology.

This should create a topology file 2kyv_DMPC_cg.prmtop and a coordinate file 2kyv_DMPC_cg.ncrst.

Use VMD to check how the CG model looks:

vmd 2kyv_DMPC_cg.prmtop 2kyv_DMPC_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

By selecting +X, +Y and +Z periodic images from the Periodic tab in the Graphical Representations window you will see unwanted water near the hydrophobic region of the membrane and small vacuum slices at box boundaries. In the following step we will fix these issues by deleting those water molecules and reducing the box dimensions a few angstroms. See FAQs for issues on membrane systems in Amber. If you do not find your issue please start a discussion in our github discussion page F&Q.

7.4. Resize the box with CPPTRAJ

Note

As PACKMOL does not consider periodicity while building up the system, increasing the XY box sides a few Angstroms may be required to avoid bad contacts between images.

Use a text editor to create the file resize_box.cpptraj including the following lines:

# Set reference coordinate file
reference 2kyv_DMPC_cg.ncrst

# Remove water using distance-based masks
strip :WT4&(@BCT1,BCT2,BC13,BC23<:12.0) parmout 2kyv_DMPC_cg.prmtop

# New box dimensions
box x 128 y 128 z 114

# Amber NetCDF Restart generation
trajout 2kyv_DMPC_cg_nb.ncrst

# Do it!
 go

# EXIT
 quit

Caution

This is a critical step when preparing membrane systems to simulate with Amber. In this case, the new box dimensions were set after some trial and error tests to allow for limited overlap between periodic box images. An excessive overlap may lead to important atom clashes an eventual system explosion during minimization/simulation, while insufficient overlap may impact the membrane cohesivity at PBC boundaries leading to pore formations or other issues.

Run the CPPTRAJ application application to adjust the size of the simulation box:

cpptraj -p 2kyv_DMPC_cg.prmtop -y 2kyv_DMPC_cg.ncrst -i resize_box.cpptraj

Once again, use VMD to check the PBC images in the new box of the system:

vmd 2kyv_DMPC_cg.prmtop 2kyv_DMPC_cg_nb.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

7.5. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/7/ contains typical input files for energy minimization (em1_Prot-Lip.in and em2_Prot-Lip.in), heating (heat_Prot-Lip.in), equilibration (eq_Prot-Lip.in) and production (md_Prot-Lip.in) runs. Please check carefully the input flags.

Tip

Some commonly used flags in Amber

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

  • -ref: Reference file

Warning

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

However, this simulation is time consuming owing to the system’s size. A parallel or CUDA implementation of Amber is advised.

Energy Minimization of side chains by restraining the backbone:

pmemd.cuda -O -i ../sirah.amber/tutorial/7/em1_Prot-Lip.in -p ../2kyv_DMPC_cg.prmtop -c ../2kyv_DMPC_cg_nb.ncrst -ref ../2kyv_DMPC_cg_nb.ncrst -o 2kyv_DMPC_cg_em_1.out -r 2kyv_DMPC_cg_em_1.ncrst &

Energy Minimization of of the whole system:

pmemd.cuda -O -i ../sirah.amber/tutorial/7/em2_Prot-Lip.in -p ../2kyv_DMPC_cg.prmtop -c 2kyv_DMPC_cg_em_1.ncrst -ref 2kyv_DMPC_cg_em_1.ncrst -o 2kyv_DMPC_cg_em_2.out -r 2kyv_DMPC_cg_em_2.ncrst &

Heating:

pmemd.cuda -O -i ../sirah.amber/tutorial/7/heat_Prot-Lip.in -p ../2kyv_DMPC_cg.prmtop -c 2kyv_DMPC_cg_em_2.ncrst -ref 2kyv_DMPC_cg_em_2.ncrst -o 2kyv_DMPC_cg_eq_0.out -r 2kyv_DMPC_cg_eq_0.ncrst -x 2kyv_DMPC_cg_eq_0.nc &

Important

To avoid “skinnb errors” on GPU due to large box size fluctuations, the system must be equilibrated by several “short” runs using a large skinnb value. The number and length of the runs may vary according to the characteristic stabilization times of the system. For more information visit the Amber tutorial on lipids.

Periodic box equilibration in GPU code (500 ps x 9):

for i in $(seq 1 9)
do
        echo "running equilibration $i"
        pmemd.cuda -O \
        -i ../sirah.amber/tutorial/7/heat_Prot-Lip.in \
        -p ../2kyv_DMPC_cg.prmtop \
        -c 2kyv_DMPC_cg_eq_$(($i -1)).ncrst \
        -ref 2kyv_DMPC_cg_eq_$(($i -1)).ncrst \
        -o 2kyv_DMPC_cg_eq_$i.out \
        -r 2kyv_DMPC_cg_eq_$i.ncrst \
        -x 2kyv_DMPC_cg_eq_$i.nc
done &

Production (1000ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/7/md_Prot-Lip.in -p ../2kyv_DMPC_cg.prmtop -c 2kyv_DMPC_cg_eq_9.ncrst -o 2kyv_DMPC_cg_md.out -r 2kyv_DMPC_cg_md.ncrst -x 2kyv_DMPC_cg_md.nc &

7.6. Visualizing the simulation

That’s it! Now you can analyze the trajectory.

Process the output trajectory to account for the Periodic Boundary Conditions (PBC):

echo -e "autoimage\ngo\nquit\n" | cpptraj -p ../2kyv_DMPC_cg.prmtop -y 2kyv_DMPC_cg_md.nc -x 2kyv_DMPC_cg_md_pbc.nc --interactive

Now you can check the simulation using VMD:

vmd ../2kyv_DMPC_cg.prmtop 2kyv_DMPC_cg_md_pbc.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

8. Glycoprotein in explicit solvent

Note

Please report bugs, errors or enhancement requests through Issue Tracker or if you have a question about SIRAH open a New Discussion.

This tutorial shows how to use the SIRAH force field to perform a coarse grained (CG) simulation of a glycoprotein in explicit solvent (called WatFour, WT4). The main references for this tutorial are: Darré et al., Machado et al. and Machado & Pantano. We strongly advise you to read these articles before starting the tutorial.

Important

Check the Setting up SIRAH section for download and set up details before starting this tutorial. Since this is Tutorial 8, remember to replace X.X and the files corresponding to this tutorial can be found in: sirah_[version].amber/tutorial/8/

8.1. Build CG representations

Caution

The mapping to CG requires the correct protonation state of each residue at a given pH. We recommend using the CHARMM-GUI server with the Glycan Reader & Modeler to prepare your system, choosing the output naming scheme of AMBER for best compatibility. An account is required to access any of the CHARMM-GUI Input Generator modules, and it can take up to 24 hours to obtain one.

This example use the structure of glycoprotein 1GYA, using CHARMM-GUI server are obtained the files parm7 and rst7, these files are converted to pdb format with the naming scheme of AMBER/GLYCAM using AmberTool as follow:

ambpdb -p ./sirah.amber/tutorial/8/step3_input.parm7 -c ./sirah.amber/tutorial/8/step3_input.rst7 > 1GYA_glycam.pdb

From the file 1GYA_glycam.pdb generated delete the solvent, rename to 1GYA_glycam_NoW.pdb and then map the protonated atomistic structure to its CG representation:

./sirah.amber/tools/CGCONV/cgconv.pl -i ./sirah.amber/tutorial/8/1GYA_glycam_NoW.pdb -o 1GYA_cg.pdb

The input file -i 1GYA_glycam_NoW.pdb contains the atomistic representation of 1GYA structure at pH 7.0, while the output -o 1GYA_cg.pdb is its SIRAH CG representation.

Tip

This is the basic usage of the script cgconv.pl, you can learn other capabilities from its help by typing:

./sirah.amber/tools/CGCONV/cgconv.pl -h

Note

Pay attention to residue names when mapping structures from other atomistic force fields or experimental structures. Although we provide compatibility for naming schemes in PDB, GMX, GROMOS, CHARMM and OPLS, there might always be some ambiguity in the residue naming, specially regarding protonation states, that may lead to a wrong mapping. For example, SIRAH Tools always maps the residue name “HIS” to a Histidine protonated at the epsilon nitrogen (\(N_{\epsilon}\)) regardless the actual proton placement. Similarly, protonated Glutamic and Aspartic acid residues must be named “GLH” and “ASH”, otherwise they will be treated as negative charged residues. In addition, protonated and disulfide bonded Cysteines must be named “CYS” and “CYX” respectively. These kind of situations need to be carefully checked by the users. In all cases the residues preserve their identity when mapping and back-mapping the structures. Hence, the total charge of the protein should be the same at atomistic and SIRAH levels. You can check the following mapping file to be sure of the compatibility: sirah.amber/tools/CGCONV/maps/sirah_prot.map and sirah_glycans.map.

Important

By default, charged termini are used. However, it is possible to set them neutral by renaming the residues from s[code] to a[code] (Nt-acetylated) or m[code] (Ct-amidated) after mapping to CG, where [code] is the root residue name in SIRAH. For example, to set a neutral N-terminal Histidine protonated at epsilon nitrogen (\(N_{\epsilon}\)) rename it from “sHe” to “aHe”.

Please check both PDBs structures using VMD:

vmd -m sirah.amber/tutorial/8/1GYA_glycam_NoW.pdb 1GYA_cg.pdb

From now on it is just normal Amber stuff!

8.2. Prepare LEaP input

Use a text editor to create the file gensystem.leap including the following lines:

# Load SIRAH force field
addPath ./sirah.amber
source leaprc.sirah

# Load model
glycoprot = loadpdb 1GYA_cg.pdb

charge glycoprot

# N-Glycosilation
bond glycoprot.111.GO2  glycoprot.110.GO2
bond glycoprot.110.GO2  glycoprot.109.GO6

bond glycoprot.112.GO2  glycoprot.109.GO3
bond glycoprot.109.GO2  glycoprot.108.GO6

bond glycoprot.114.GO2 glycoprot.113.GO2
bond glycoprot.113.GO2 glycoprot.108.GO3

bond glycoprot.108.GO2  glycoprot.107.GO4
bond glycoprot.107.GNac glycoprot.106.GO4
# ASN 65
bond glycoprot.106.GNac glycoprot.65.BND


# Add solvent, counterions and 0.15M NaCl
# Tuned solute-solvent closeness for best hydration
solvateOct glycoprot WT4BOX 20 0.7
addIonsRand glycoprot NaW 40 ClW 41

# Save topology
saveAmberParmNetcdf glycoprot 1GYA_cg.prmtop 1GYA_cg.ncrst

# EXIT
quit

Caution

Each glycosidic bond must be defined explicitly in LEaP using the command bond, e.g.: “bond unit.ri.beadi unit.rj.beadj”. Where ri and rj correspond to the residue index in the topology file starting from 1, which may differ from the biological sequence in the PDB file. And beadi and beadj are the names of the beads involved in the glycosidic bond of the corresponding residues.

The above also applies to each disulfide bond, e.g.: “bond unit.ri.BSG unit.rj.BSG”. You can try the command pdb4amber to get those indexes from the atomistic structure, but be aware that it may not work if the Cysteine residues are too far away (in this case result in an empty file):

pdb4amber -i sirah.amber/tutorial/8/1GYA_glycam_NoW.pdb -o 1GYA_aa.pdb && cat 1GYA_aa_sslink

See also

The available electrolyte species in SIRAH force field are: Na⁺ (NaW), K⁺ (KW) and Cl⁻ (ClW) which represent solvated ions in solution. One ion pair (e.g., NaW-ClW) each 34 WT4 molecules results in a salt concentration of ~0.15M (see Appendix for details). Counterions were added according to Machado et al..

8.3. Run LEaP

Run the LEaP application to generate the molecular topology and initial coordinate files:

tleap -f gensystem.leap

Note

Warning messages about long, triangular or square bonds in leap.log file are fine and expected due to the CG topology of some residues.

This should create a topology file 1GYA_cg.prmtop and a coordinate file 1GYA_cg.ncrst.

Use VMD to check how the CG model looks like and particularly the presence of glycosidic bonds:

vmd 1GYA_cg.prmtop 1GYA_cg.ncrst -e ./sirah.amber/tools/sirah_vmdtk.tcl

Tip

VMD assigns default radius to unknown atom types, the script sirah_vmdtk.tcl sets the right ones, according to the CG representation. It also provides a kit of useful selection macros, coloring methods and backmapping utilities. Use the command sirah_help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

8.4. Run the simulation

Make a new folder for the run:

mkdir -p run; cd run

The folder sirah.amber/tutorial/8/ contains typical input files for energy minimization (em1_WT4.in and em2_WT4.in), relaxation (or equilibration) (eq1_WT4.in and eq2_WT4.in) and production (md_WT4.in) runs. Please check carefully the input flags therein, in particular the definition of flag chngmask=0 at &ewald section is mandatory.

Tip

Some commonly used flags in Amber

  • -i: Input file.

  • -o: Output file.

  • -p: Parameter/topology file.

  • -c: Coordinate file.

  • -r: Restart file.

  • -x: Trajectory file.

  • -ref: Reference file

Caution

These input files are executed by the GPU implementation of pmemd.cuda. Other available modules are sander or pmemd, which are both CPU implementations of Amber.

Note

The same input files can be used to run on CPU with the modules pmemd or sander.

Energy Minimization of side chains and solvent by restraining the protein backbone and glycan rings:

pmemd.cuda -O -i ../sirah.amber/tutorial/8/em1_WT4.in -p ../1GYA_cg.prmtop -c ../1GYA_cg.ncrst -ref ../1GYA_cg.ncrst -o 1GYA_cg_em1.out -r 1GYA_cg_em1.ncrst &

Energy Minimization of whole system:

pmemd.cuda -O -i ../sirah.amber/tutorial/8/em2_WT4.in -p ../1GYA_cg.prmtop -c ../1GYA_cg_em1.ncrst -o 1GYA_cg_em2.out -r 1GYA_cg_em2.ncrst &

Solvent Relaxation (or equlibration) in NPT:

pmemd.cuda -O -i ../sirah.amber/tutorial/8/eq1_WT4.in -p ../1GYA_cg.prmtop -c 1GYA_cg_em2.ncrst -ref 1GYA_cg_em2.ncrst -o 1GYA_cg_eq1.out -r 1GYA_cg_eq1.ncrst -x 1GYA_cg_eq1.nc &

Caution

Option restraintmask=:’1-114’ in input file eq1_WT4.in must be set specifically for each system to restrain all glycoprotein’s residues.

Soft Relaxation to improve side chain and glycan solvation (NPT):

pmemd.cuda -O -i ../sirah.amber/tutorial/8/eq2_WT4.in -p ../1GYA_cg.prmtop -c 1GYA_cg_eq1.ncrst -ref 1GYA_cg_eq1.ncrst -o 1GYA_cg_eq2.out -r 1GYA_cg_eq2.ncrst -x 1GYA_cg_eq2.nc &

Production (1000ns):

pmemd.cuda -O -i ../sirah.amber/tutorial/8/md_WT4.in -p ../1GYA_cg.prmtop -c 1GYA_cg_eq2.ncrst -o 1GYA_cg_md.out -r 1GYA_cg_md.ncrst -x 1GYA_cg_md.nc &

8.5. Visualizing the simulation

That’s it! Now you can analyze the trajectory. Process the output trajectory to account for the Periodic Boundary Conditions (PBC):

echo -e "autoimage\ngo\nquit\n" | cpptraj -p ../1GYA_cg.prmtop -y 1GYA_cg_md.nc -x 1GYA_cg_md_pbc.nc --interactive

Load the processed trajectory in VMD:

vmd ../1GYA_cg.prmtop ../1GYA_cg.ncrst 1GYA_cg_md.nc -e ../sirah.amber/tools/sirah_vmdtk.tcl

Note

The file sirah_vmdtk.tcl is a Tcl script that is part of SIRAH Tools and contains the macros to properly visualize the coarse-grained structures in VMD. Use the command sirah-help in the Tcl/Tk console of VMD to access the manual pages. To learn about SIRAH Tools’ capabilities, you can also go to the SIRAH Tools tutorial.

Appendix

Calculating ionic concentrations

\[\]
\[\rho_{\text{WT4}} = \rho_{\text{H}_2\text{O}} = 1000 \, \text{g/L} \] \[MW_{\text{H}_2\text{O}} = 18 \, \text{g/mol} \] \[1 \text{WT4} \sim 11 \text{H}_2\text{O} \] \[\text{M}=\frac{\text{mol}}{V} \quad ; \quad \text{n}={\text{N}_A} \quad ; \quad \rho = \frac{\text{m}}{V} \quad ; \quad \text{m} = \text{mol} \cdot \text{M} \cdot \text{W}\] \[\text{V} = \frac{\text{m}}{\rho} = \frac{\text{mol} \cdot \text{MW}_{\text{H}_2\text{O}}}{\rho} = \frac{\text{n}_{\text{H}_2\text{O}} \cdot \text{MW}_{\text{H}_2\text{O}}}{{\text{N}_A} \cdot \rho}\] \[\text{M}=\frac{\text{mol}}{V} = \frac{\text{n}_{\text{ion}}}{\text{N}_{\text{A}} \cdot \text{V}} = \frac{\text{n}_{\text{ion}}}{\text{N}_{\text{A}}} \frac{\text{N}_{\text{A}} \cdot \rho}{\text{n}_{\text{H}_2\text{O}} \cdot \text{MW}_{\text{H}_2\text{O}}} = \frac{\text{n}_{\text{ion}} \cdot 1000}{\text{n}_{\text{WT}_4} \cdot \text{(11)} \cdot \text{(18)}} \sim 5 \frac{\text{n}_{\text{ion}}}{\text{n}_{\text{WT}_4}} \]

Number of WT4 molecules per ion at 0.15M: \(\text{n}_{WT4} = 5 {{\text{n}_{\text{ion}}} \over \text{M}} = {5(1) \over 0.15} \sim 34 \)

Available mapping files

Table 1. Available mapping files (MAPs) at folder sirah.amber/tools/CGCONV/maps/ for converting atomistic lipid structures to SIRAH models. Important! MAPs can not inter-convert different name conventions, e.g. amber_lipid.map won’t generate fragment-based residues from residue-based force fields. Due to possible nomenclature conflicts, users are advised to check and modify the MAPs as required.

Map

Type*

Compatibility

Source

amber_lipid.map

F

AMBER Lipid11-17 force fields

GAFF_lipid.map

R

AMBER GAFF force field

LipidBook

charmm_lipid.map

R

CHARMM 27/36 force field, and “CHARMM
compatible” GAFF nomenclature

slipids.map

R

Stockholm lipids force field

OPLSA-AA_2014_lipid.map

R

All-atoms lipids for OPLS force field

OPLSA-UA_lipid.map

R

United-atom lipids for OPLS force field

LipidBook

GROMOS43a1_lipid.map

R

United-atom lipids for GROMOS 43a1 and
CKP force fields

GROMOS43a1-s3_lipid.map

R

United-atom lipids for GROMOS 43a1-s3
force field

GROMOS53a6_lipid.map

R

United-atom lipids for GROMOS 53a6
force field

tieleman_lipid.map

R

Berger lipids as implemented by
Tieleman et al. for GROMOS
force fields.

* Fragment-based (F) or Residue-based (R) topology.