MolSys Utils

get_dihedral_rotation_matrix(theta: float, v: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]

Build a matrix that will rotate coordinates about a vector, v, by theta in radians.

Parameters:
  • theta (float) – Rotation angle in radians.

  • v ((3,) ArrayLike) – Three dimensional vector to rotate about.

Returns:

rotation_matrix – Matrix that will rotate coordinates about the vector, V by angle theta.

Return type:

np.ndarray

get_dihedral(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) float

Calculates dihedral of a given set of atoms, p . Returns value in radians.

                3
     ------>  /
    1-------2
  /
0
Parameters:

p ((4, 3) ArrayLike) – Matrix containing coordinates to be used to calculate dihedral.

Returns:

dihedral – Dihedral angle in radians.

Return type:

float

get_dihedrals(p1: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p2: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p3: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p4: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]

Vectorized version of get_dihedral

Parameters:
  • p1 (ArrayLike) – Array containing coordinates of the first point in the dihedral.

  • p2 (ArrayLike) – Array containing coordinates of the second point in the dihedral

  • p3 (ArrayLike) – Array containing coordinates of the third point in the dihedral

  • p4 (ArrayLike) – Array containing coordinates of the fourth point in the dihedral

Returns:

dihedrals – Dihedral angles in radians.

Return type:

ArrayLike

get_angle(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) float

Calculate the angle created by 3 points.

     p2
   / θ \
p1      p3
Parameters:

p (ArrayLike :) – Array of three points to calculate the angle between.

Returns:

angle – Angle created by the three points.

Return type:

float

get_angles(p1: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p2: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p3: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]

Vectorized version of get_angle.

Parameters:
  • p1 (ArrayLike :) – Array of first points in the angles.

  • p2 (ArrayLike :) – Array of second points in the angles.

  • p3 (ArrayLike :) – Array of third points in angle.

Returns:

angles – Array of anlges.

Return type:

float

set_dihedral(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], angle: float, mobile: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]

Sets the dihedral angle by rotating all mobile atoms from their current position about the dihedral bond defined by the four atoms in p . Dihedral will be set to the value of angle in degrees.

Parameters:
  • p (ArrayLike) – Coordinates of atoms that define dihedral to rotate about.

  • angle (float) – New angle to set the dihedral to (degrees).

  • mobile (np.ndarray) – Atom coordinates to move by setting dihedral.

Returns:

new_mobile – New positions for the mobile atoms

Return type:

np.ndarray

guess_mobile_dihedrals(ICs, aln_atoms=None)

Guess the flexible, uniqe, side chain dihedrals of a MolSysIC object.

Parameters:
  • ICs (MolSysIC) – Internal coordinates object you wish to guess dihedrals for.

  • aln_atoms (List[str]) – List of atom names corresponding to the “alignment atoms” of the molecule. These are usually the core backbone atoms, e.g. N CA C for a protein.

Returns:

dihedral_defs – List of unique mobile dihedral definitions

Return type:

List[List[str]]

class FreeAtom(name: str, atype: str, index: int, resn: str, resi: int, coords: ndarray)

Atom class for atoms in cartesian space that do not belong to a MolSys.

name

Atom name

Type:

str

atype

Atom type

Type:

str

index

Atom number

Type:

int

resn

Name of the residue that the atom belongs to

Type:

str

resi

The residue index/number that the atom belongs to

Type:

int

coords

The cartesian coordinates of the Atom

Type:

np.ndarray

save_ensemble(name: str, atoms: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], coords: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | None = None) None

Save a rotamer ensemble as multiple states of the same molecule.

Parameters:
  • name (str) – file name to save rotamer ensemble to

  • atoms (ArrayLike) – list of Atom objects

  • coords (ArrayLike) – Array of atom coordinates corresponding to Atom objects

save_pdb(name: str | Path, atoms: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], coords: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], mode: str = 'w') None

Save a single state pdb structure of the provided atoms and coords.

Parameters:
  • name (str, Path) – Name or Path object of file to save

  • atoms (ArrayLike) – List of Atom objects to be saved

  • coords (ArrayLike) – Array of atom coordinates corresponding to atoms

  • mode (str) – File open mode. Usually used to specify append (“a”) when you want to add structures to a PDB rather than overwrite that pdb.

get_missing_residues(protein: Universe | AtomGroup, ignore: Set[int] | None = None, use_H: bool = False) List

Get a list of RotamerEnsemble objects corresponding to the residues of the provided protein that are missing heavy atoms.

Parameters:
  • protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys to search for residues with missing atoms.

  • ignore (set) – List of residue numbers to ignore. Usually sites you plan to label or mutate.

  • use_H (bool) – Whether the new side chain should have hydrogen atoms.

Returns:

missing_residues – A list of RotamerEnsemble objects corresponding to residues with missing heavy atoms.

Return type:

list

mutate(protein: Universe, *ensembles: RotamerEnsemble, add_missing_atoms: bool = True, random_rotamers: bool = False) Universe

Create a new Universe where the native residue is replaced with the highest probability rotamer from a RotamerEnsemble or SpinLabel object.

Parameters:
  • protein (MDAnalysis.Universe) – An MDA Universe object containing protein to be spin labeled

  • ensembles (RotamerEnsemble, SpinLabel) – Precomputed RotamerEnsemble or SpinLabel object to use for selecting and replacing the spin native amino acid

  • random_rotamers (bool) – Randomize rotamer conformations

  • add_missing_atoms (bool) – Model side chains missing atoms if they are not present in the provided structure.

Returns:

U – New Universe with a copy of the spin labeled protein with the highest probability rotamer

Return type:

MDAnalysis.Universe

randomize_rotamers(protein: Universe | AtomGroup, rotamer_libraries: List[RotamerEnsemble], **kwargs) None

Modify a protein object in place to randomize side chain conformations.

Parameters:
  • protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys object to modify.

  • rotamer_libraries (list) – RotamerEnsemble objects attached to the protein corresponding to the residues to be repacked/randomized.

  • **kwargs (dict) – Additional Arguments to pass to sample method. See sample .

get_sas_res(protein: Universe | AtomGroup, cutoff: float = 30, forcefield='charmm') Set[Tuple[int, str]]

Run FreeSASA to get solvent accessible surface residues in the provided protein

Parameters:
  • protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys object to measure Solvent Accessible Surfaces (SAS) area of and report the SAS residues.

  • cutoff (float) – Exclude residues from list with SASA below cutoff in angstroms squared.

  • forcefield (Union[str, chilife.ForceField]) – Forcefiled to use defining atom radii for calculating solvent accessibility.

Returns:

SAResi – A set of solvent accessible surface residues.

Return type:

set

pose2mda(pose) Universe

Create an MDAnalysis universe from a pyrosetta pose

Parameters:

pose (pyrosetta.rosetta.core.Pose) – pyrosetta pose object.

Returns:

mda_protein – Copy of the input pose as an MDAnalysis Universe object

Return type:

MDAnalysis.Universe

xplor2mda(xplor_sim) Universe

Converts an Xplor-NIH xplor.simulation object to an MDAnalysis Universe object

Parameters:

xplor_sim (xplor.simulation) – an Xplor-NIH simulation object. (See https://nmr.cit.nih.gov/xplor-nih/doc/current/python/ref/simulation.html)

make_mda_uni(anames: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], atypes: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resnames: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resindices: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resnums: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], segindices: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], segids: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | None = None) Universe

Create an MDAnalysis universe from numpy arrays of atom information.

Parameters:
  • anames (ArrayLike) – Array of atom names. Length should be equal to the number of atoms.

  • atypes (ArrayLike) – Array of atom elements or types. Length should be equal to the number of atoms.

  • resnames (ArrayLike) – Array of residue names. Length should be equal to the number of residues.

  • resindices (ArrayLike) – Array of residue indices. Length should be equal to the number of atoms. Elements of resindices should map to resnames and resnums of the atoms they represent.

  • resnums (ArrayLike) – Array of residue numbers. Length should be equal to the number of residues.

  • segindices (ArrayLike) – Array of segment/chain indices. Length should be equal to the number of residues.

  • segids (ArrayLike, None) – Array of segment/chain IDs. Length should be equal to the number of segs/chains.

Returns:

mda_uni – The Universe created by the function.

Return type:

MDAnalysis.Universe

template_ICs(template)

Get the internal coordinate parameters (bond angle and dihedral angle) of the protein backbone atoms of the template.

Parameters:

template (Union[str, Path, MolSys]) – The PDB file or MolSys object to serve as the template.

Returns:

phi, psi, omega, bond_angles – Arrays containing backbone dihedral and bond angles as calculated from the template structure.

Return type:

ArrayLike

make_peptide(sequence: str, phi=None, psi=None, omega=None, bond_angles=None, template=None) MolSys

Create a peptide from a string of amino acids. chilife NCAA rotamer libraries and smiles can be inserted by using square brackets and angle brackets respectively , e.g. [ACE]AA<C1=CC2=C(C(=C1)F)C(=CN2)CC(C(=O)O)N>AA[NME] where ACE and NME are peptide caps and <C1=CC2=C(C(=C1)F)C(=CN2)CC(C(=O)O)N> is a smiles for a NCAA. Backbone torsion and bond angles can be set using the phi, psi, omega, and bond_angel keyword argument. All angles should be passed in degrees. Alternativly backbone angles can be set with a protein template that can either be a MolSys object, or a sting/Path object pointing to a PDB file.

Parameters:
  • sequence (str) – The Amino acid sequence.

  • phi (ArrayLike) – N-1 length array specifying the peptide backbone phi angles (degrees).

  • psi (ArrayLike) – N-1 length array specifying the peptide backbone psi angles (degrees).

  • omega (ArrayLike) – N-1 length array specifying the peptide backbone omega angles (degrees).

  • bond_angles (ArrayLike) –

    Nx4 shaped array specifying the 4 peptide backbone bond angles (degrees). Note that, for the first residue any value for the first and second bond angles will be ignored because they are undefined.

    1 - CA(i-1)-C(i-1)-N 2 - C(i-1)_N_CA 3 - N-CA-C 4 - CA-C-O

  • template (str, Path, MolSys)

Returns:

mol – A chiLife MolSys object

Return type:

MolSys

parse_sequence(sequence: str) List[str]

Input a string of amino acids with mized 1-letter codes, square brackted [] 3-letter codes or angle <> bracketed smiles and return a list of 3-letter codes and smiles.

Parameters:
  • sequence (str)

  • sequence. (The Amino acid)

Returns:

  • parsed_sequence (List[str])

  • A list of the amino acid sequences.

smiles2residue(smiles: str, **kwargs) MolSys

Create a protein residue from a smiles string. Smiles string must contain an N-C-C-O dihedral to identify the protein backbone. If no dihedral is found or there are too many N-C-C-O dihedrals that are indistinguishable this function will fail.

Parameters:
  • smiles (str) – SMILES string to convert to 3d and a residue

  • kwargs (dict) – Keyword arguments to pass to rdkit.AllChem.EmbedMolecule (usually randomSeed)

Returns:

msys – chiLife molecular system object containing the smiles string as a 3D molecule and single residue.

Return type:

MolSys

append_cap(mol: MolSys, cap: str, dihedral: float | None = None) MolSys

Append a peptide cap to a molecular system.

Parameters:
  • mol (MolSys) – The molecular system to be capped.

  • cap (str) – The name (3-letter identifier) of the cap to be attached.

Returns:

mol – The molecular system with the cap.

Return type:

MolSys

store_cap(name, mol, term)

Save the cap attached to a molecule. A pdb file will be saved in the chilife/data/rotamer_libraries/cap_pdbs folder with the provided name and the cap will be registered in either the ncaps.txt or ccaps.txt file in the chilife/data/rotamer_libraries directory.

Parameters:
  • name (str) – Name (3-letter identifier) of the cap to be attached.

  • mol (str, Path, Universe, AtomGroup, MolSys) – Path, MDA.AtomGroup or MolSys object containing the cap. The cap must be its own residue number, i.e. not merged with the reside before or after it.

  • term (str) – The terminus the cap is attached to. Only two values are accepted N and C

get_grid_mx(N, CA, C)

Get the rotation matrix and translation vector to orient a grid of lennard-jones spheres into a new coordinate frame. It is expected that the grid is defined as it is in screen_site_volume(), i.e. the xy plane of the grid is centered at the origin and the z-axis is positive.

Parameters:
  • N (np.ndarray) – First coordinate of the site.

  • CA (np.ndarray) – Second coordinate of the site (The origin).

  • C (np.ndarray) – Third coordinate of the site.

Returns:

  • rotation matrix (np.ndarray) – Rotation matrix for the grid. Should be applied before translation.

  • origin (np.ndarray) – Translation vector for the grid. Should be applied after rotation.

write_grid(grid, name='grid.pdb', atype='Se')

Given an array of 3-dimensional points (a grid usually), write a PDB so the grid can be visualized.

Parameters:
  • grid (ArrayLike) – The 3D points to write

  • name (str) – The name of the file to write the grid to.

  • atype (str) – Atom type to use for the grid.

get_site_volume(site: int, mol: Universe | AtomGroup | MolecularSystemBase, grid_size: float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = 10, offset: float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = -0.5, spacing: float = 1.0, vdw_r=2.8, write: str = False, return_grid: bool = False) float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]

Calculate the (approximate) accessible volume of a single site. The volume is calculated by superimposing a grid of lennard-jones spheres at the site, eliminating those with clashes and calculating the volume from the remaining spheres. Note that this is a different “accessible volume” than the “accessible volume” method made popular by MtsslWizard.

Parameters:
  • site (int) – Site of the mol to calculate the accessible volume of.

  • mol (MDAnalysis.Universe, MDAnalysis.AtomGroup, chilife.MolecularSystemBase) – The molecule/set of atoms containing the site to calculate the accessible volume of.

  • grid_size (float, ArrayLike) – Dimensions of grid to use to calculate the accessible volume. Can be a float, specifying a cube with equal dimensions, or an array specifying a unique value for each dimension. Uses units of Angstroms.

  • offset (float, ArrayLike) – The offset of the initial grid with respect to the superimposition site. If offset is a float, the grid will be offset in the z-dimension (along the CA-CB bond). If offset is an array, the grid will be offset in each x, y, z dimension by the corresponding value of the array. Uses units of Angstroms

  • spacing (float) – Spacing between grid points. Defaults to 1.0 Angstrom

  • vdw_r (float) – van der Waals radius for clash detection. Defaults to 2.8 Angstroms.

  • write (bool, str, default False) – Switch to make chiLife write the grid to a PDB file. If True the grid will be written to grid.pdb. If given a string chiLife will use the string as the file name.

  • return_grid (bool, default False) – If return_grid=True, the volume will be returned as a numpy array containing the grid points that do not clash with the neighboring atoms.

Returns:

volume – The accessible volume at the given site. If return_grid=True the volume will be returned as the set of grid points that do not clash with neighboring atoms, otherwise volume will be the estimated accessible volume of the site in cubic angstroms (Å 3).

Return type:

float, np.ndarray