MolSys Utils
- get_dihedral_rotation_matrix(theta: float, v: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]
Build a matrix that will rotate coordinates about a vector, v, by theta in radians.
- Parameters:
theta (float) – Rotation angle in radians.
v ((3,) ArrayLike) – Three dimensional vector to rotate about.
- Returns:
rotation_matrix – Matrix that will rotate coordinates about the vector, V by angle theta.
- Return type:
np.ndarray
- get_dihedral(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) float
Calculates dihedral of a given set of atoms,
p
. Returns value in radians.3 ------> / 1-------2 / 0
- Parameters:
p ((4, 3) ArrayLike) – Matrix containing coordinates to be used to calculate dihedral.
- Returns:
dihedral – Dihedral angle in radians.
- Return type:
float
- get_dihedrals(p1: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p2: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p3: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p4: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]
Vectorized version of get_dihedral
- Parameters:
p1 (ArrayLike) – Array containing coordinates of the first point in the dihedral.
p2 (ArrayLike) – Array containing coordinates of the second point in the dihedral
p3 (ArrayLike) – Array containing coordinates of the third point in the dihedral
p4 (ArrayLike) – Array containing coordinates of the fourth point in the dihedral
- Returns:
dihedrals – Dihedral angles in radians.
- Return type:
ArrayLike
- get_angle(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) float
Calculate the angle created by 3 points.
p2 / θ \ p1 p3
- Parameters:
p (ArrayLike :) – Array of three points to calculate the angle between.
- Returns:
angle – Angle created by the three points.
- Return type:
float
- get_angles(p1: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p2: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], p3: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]
Vectorized version of get_angle.
- Parameters:
p1 (ArrayLike :) – Array of first points in the angles.
p2 (ArrayLike :) – Array of second points in the angles.
p3 (ArrayLike :) – Array of third points in angle.
- Returns:
angles – Array of anlges.
- Return type:
float
- set_dihedral(p: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], angle: float, mobile: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]
Sets the dihedral angle by rotating all
mobile
atoms from their current position about the dihedral bond defined by the four atoms inp
. Dihedral will be set to the value ofangle
in degrees.- Parameters:
p (ArrayLike) – Coordinates of atoms that define dihedral to rotate about.
angle (float) – New angle to set the dihedral to (degrees).
mobile (np.ndarray) – Atom coordinates to move by setting dihedral.
- Returns:
new_mobile – New positions for the mobile atoms
- Return type:
np.ndarray
- guess_mobile_dihedrals(ICs, aln_atoms=None)
Guess the flexible, uniqe, side chain dihedrals of a MolSysIC object.
- Parameters:
ICs (MolSysIC) – Internal coordinates object you wish to guess dihedrals for.
aln_atoms (List[str]) – List of atom names corresponding to the “alignment atoms” of the molecule. These are usually the core backbone atoms, e.g. N CA C for a protein.
- Returns:
dihedral_defs – List of unique mobile dihedral definitions
- Return type:
List[List[str]]
- class FreeAtom(name: str, atype: str, index: int, resn: str, resi: int, coords: ndarray)
Atom class for atoms in cartesian space that do not belong to a MolSys.
- name
Atom name
- Type:
str
- atype
Atom type
- Type:
str
- index
Atom number
- Type:
int
- resn
Name of the residue that the atom belongs to
- Type:
str
- resi
The residue index/number that the atom belongs to
- Type:
int
- coords
The cartesian coordinates of the Atom
- Type:
np.ndarray
- save_ensemble(name: str, atoms: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], coords: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | None = None) None
Save a rotamer ensemble as multiple states of the same molecule.
- Parameters:
name (str) – file name to save rotamer ensemble to
atoms (ArrayLike) – list of Atom objects
coords (ArrayLike) – Array of atom coordinates corresponding to Atom objects
- save_pdb(name: str | Path, atoms: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], coords: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], mode: str = 'w') None
Save a single state pdb structure of the provided atoms and coords.
- Parameters:
name (str, Path) – Name or Path object of file to save
atoms (ArrayLike) – List of Atom objects to be saved
coords (ArrayLike) – Array of atom coordinates corresponding to atoms
mode (str) – File open mode. Usually used to specify append (“a”) when you want to add structures to a PDB rather than overwrite that pdb.
- get_missing_residues(protein: Universe | AtomGroup, ignore: Set[int] | None = None, use_H: bool = False) List
Get a list of RotamerEnsemble objects corresponding to the residues of the provided protein that are missing heavy atoms.
- Parameters:
protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys to search for residues with missing atoms.
ignore (set) – List of residue numbers to ignore. Usually sites you plan to label or mutate.
use_H (bool) – Whether the new side chain should have hydrogen atoms.
- Returns:
missing_residues – A list of RotamerEnsemble objects corresponding to residues with missing heavy atoms.
- Return type:
list
- mutate(protein: Universe, *ensembles: RotamerEnsemble, add_missing_atoms: bool = True, rotamer_index: int | str | None = None) Universe
Create a new Universe where the native residue is replaced with the highest probability rotamer from a RotamerEnsemble or SpinLabel object.
- Parameters:
protein (MDAnalysis.Universe) – An MDA Universe object containing protein to be spin labeled
ensembles (RotamerEnsemble, SpinLabel) – Precomputed RotamerEnsemble or SpinLabel object to use for selecting and replacing the spin native amino acid
rotamer_index (int, str, None) – Index of the rotamer to be used for the mutation. If None, the most probable rotamer will be used. If ‘all’, mutate will return an ensemble with each rotamer, if random, mutate will return an ensemble with a random rotamer.
add_missing_atoms (bool) – Model side chains missing atoms if they are not present in the provided structure.
- Returns:
U – New Universe with a copy of the spin labeled protein with the highest probability rotamer
- Return type:
MDAnalysis.Universe
- randomize_rotamers(protein: Universe | AtomGroup, rotamer_libraries: List[RotamerEnsemble], **kwargs) None
Modify a protein object in place to randomize side chain conformations.
- Parameters:
protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys object to modify.
rotamer_libraries (list) – RotamerEnsemble objects attached to the protein corresponding to the residues to be repacked/randomized.
**kwargs (dict) – Additional Arguments to pass to
sample
method. Seesample
.
- get_sas_res(protein: Universe | AtomGroup, cutoff: float = 30, forcefield='charmm') Set[Tuple[int, str]]
Run FreeSASA to get solvent accessible surface residues in the provided protein
- Parameters:
protein (MDAnalysis.Universe, MDAnalysis.AtomGroup) – MolSys object to measure Solvent Accessible Surfaces (SAS) area of and report the SAS residues.
cutoff (float) – Exclude residues from list with SASA below cutoff in angstroms squared.
forcefield (Union[str, chilife.ljParams]) – Forcefiled to use defining atom radii for calculating solvent accessibility.
- Returns:
SAResi – A set of solvent accessible surface residues.
- Return type:
set
- pose2mda(pose) Universe
Create an MDAnalysis universe from a pyrosetta pose
- Parameters:
pose (pyrosetta.rosetta.core.Pose) – pyrosetta pose object.
- Returns:
mda_protein – Copy of the input pose as an MDAnalysis Universe object
- Return type:
MDAnalysis.Universe
- xplor2mda(xplor_sim) Universe
Converts an Xplor-NIH xplor.simulation object to an MDAnalysis Universe object
- Parameters:
xplor_sim (xplor.simulation) – an Xplor-NIH simulation object. (See https://nmr.cit.nih.gov/xplor-nih/doc/current/python/ref/simulation.html)
- make_mda_uni(anames: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], atypes: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resnames: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resindices: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], resnums: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], segindices: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes], segids: _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] | None = None) Universe
Create an MDAnalysis universe from numpy arrays of atom information.
- Parameters:
anames (ArrayLike) – Array of atom names. Length should be equal to the number of atoms.
atypes (ArrayLike) – Array of atom elements or types. Length should be equal to the number of atoms.
resnames (ArrayLike) – Array of residue names. Length should be equal to the number of residues.
resindices (ArrayLike) – Array of residue indices. Length should be equal to the number of atoms. Elements of resindices should map to resnames and resnums of the atoms they represent.
resnums (ArrayLike) – Array of residue numbers. Length should be equal to the number of residues.
segindices (ArrayLike) – Array of segment/chain indices. Length should be equal to the number of residues.
segids (ArrayLike, None) – Array of segment/chain IDs. Length should be equal to the number of segs/chains.
- Returns:
mda_uni – The Universe created by the function.
- Return type:
MDAnalysis.Universe
- template_ICs(template)
Get the internal coordinate parameters (bond angle and dihedral angle) of the protein backbone atoms of the template.
- Parameters:
template (Union[str, Path, MolSys]) – The PDB file or MolSys object to serve as the template.
- Returns:
phi, psi, omega, bond_angles – Arrays containing backbone dihedral and bond angles as calculated from the template structure.
- Return type:
ArrayLike
- make_peptide(sequence: str, phi=None, psi=None, omega=None, bond_angles=None, template=None) MolSys
Create a peptide from a string of amino acids. chilife NCAA rotamer libraries and smiles can be inserted by using square brackets and angle brackets respectively , e.g.
[ACE]AA<C1=CC2=C(C(=C1)F)C(=CN2)CC(C(=O)O)N>AA[NME]
where ACE and NME are peptide caps and <C1=CC2=C(C(=C1)F)C(=CN2)CC(C(=O)O)N> is a smiles for a NCAA. Backbone torsion and bond angles can be set using the phi, psi, omega, and bond_angel keyword argument. All angles should be passed in degrees. Alternativly backbone angles can be set with a protein template that can either be a MolSys object, or a sting/Path object pointing to a PDB file.- Parameters:
sequence (str) – The Amino acid sequence.
phi (ArrayLike) – N-1 length array specifying the peptide backbone phi angles (degrees).
psi (ArrayLike) – N-1 length array specifying the peptide backbone psi angles (degrees).
omega (ArrayLike) – N-1 length array specifying the peptide backbone omega angles (degrees).
bond_angles (ArrayLike) –
Nx4 shaped array specifying the 4 peptide backbone bond angles (degrees). Note that, for the first residue any value for the first and second bond angles will be ignored because they are undefined.
1 - CA(i-1)-C(i-1)-N 2 - C(i-1)_N_CA 3 - N-CA-C 4 - CA-C-O
template (str, Path, MolSys)
- Returns:
mol – A chiLife MolSys object
- Return type:
- parse_sequence(sequence: str) List[str]
Input a string of amino acids with mized 1-letter codes, square brackted
[]
3-letter codes or angle<>
bracketed smiles and return a list of 3-letter codes and smiles.- Parameters:
sequence (str)
sequence. (The Amino acid)
- Returns:
parsed_sequence (List[str])
A list of the amino acid sequences.
- smiles2residue(smiles: str, **kwargs) MolSys
Create a protein residue from a smiles string. Smiles string must contain an N-C-C-O dihedral to identify the protein backbone. If no dihedral is found or there are too many N-C-C-O dihedrals that are indistinguishable this function will fail.
- Parameters:
smiles (str) – SMILES string to convert to 3d and a residue
kwargs (dict) – Keyword arguments to pass to rdkit.AllChem.EmbedMolecule (usually randomSeed)
- Returns:
msys – chiLife molecular system object containing the smiles string as a 3D molecule and single residue.
- Return type:
- append_cap(mol: MolSys, cap: str, dihedral: float | None = None) MolSys
Append a peptide cap to a molecular system.
- store_cap(name, mol, term)
Save the cap attached to a molecule. A pdb file will be saved in the chilife/data/rotamer_libraries/cap_pdbs folder with the provided name and the cap will be registered in either the ncaps.txt or ccaps.txt file in the chilife/data/rotamer_libraries directory.
- Parameters:
name (str) – Name (3-letter identifier) of the cap to be attached.
mol (str, Path, Universe, AtomGroup, MolSys) – Path, MDA.AtomGroup or MolSys object containing the cap. The cap must be its own residue number, i.e. not merged with the reside before or after it.
term (str) – The terminus the cap is attached to. Only two values are accepted
N
andC
- get_grid_mx(N, CA, C)
Get the rotation matrix and translation vector to orient a grid of lennard-jones spheres into a new coordinate frame. It is expected that the grid is defined as it is in
screen_site_volume()
, i.e. the xy plane of the grid is centered at the origin and the z-axis is positive.- Parameters:
N (np.ndarray) – First coordinate of the site.
CA (np.ndarray) – Second coordinate of the site (The origin).
C (np.ndarray) – Third coordinate of the site.
- Returns:
rotation matrix (np.ndarray) – Rotation matrix for the grid. Should be applied before translation.
origin (np.ndarray) – Translation vector for the grid. Should be applied after rotation.
- write_grid(grid, name='grid.pdb', atype='Se')
Given an array of 3-dimensional points (a grid usually), write a PDB so the grid can be visualized.
- Parameters:
grid (ArrayLike) – The 3D points to write
name (str) – The name of the file to write the grid to.
atype (str) – Atom type to use for the grid.
- get_site_volume(site: int, mol: Universe | AtomGroup | MolecularSystemBase, grid_size: float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = 10, offset: float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = -0.5, spacing: float = 1.0, vdw_r=2.8, write: str = False, return_grid: bool = False) float | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]
Calculate the (approximate) accessible volume of a single site. The volume is calculated by superimposing a grid of lennard-jones spheres at the site, eliminating those with clashes and calculating the volume from the remaining spheres. Note that this is a different “accessible volume” than the “accessible volume” method made popular by MtsslWizard.
- Parameters:
site (int) – Site of the mol to calculate the accessible volume of.
mol (MDAnalysis.Universe, MDAnalysis.AtomGroup, chilife.MolecularSystemBase) – The molecule/set of atoms containing the site to calculate the accessible volume of.
grid_size (float, ArrayLike) – Dimensions of grid to use to calculate the accessible volume. Can be a float, specifying a cube with equal dimensions, or an array specifying a unique value for each dimension. Uses units of Angstroms.
offset (float, ArrayLike) – The offset of the initial grid with respect to the superimposition site. If
offset
is a float, the grid will be offset in the z-dimension (along the CA-CB bond). Ifoffset
is an array, the grid will be offset in each x, y, z dimension by the corresponding value of the array. Uses units of Angstromsspacing (float) – Spacing between grid points. Defaults to 1.0 Angstrom
vdw_r (float) – van der Waals radius for clash detection. Defaults to 2.8 Angstroms.
write (bool, str, default False) – Switch to make chiLife write the grid to a PDB file. If
True
the grid will be written to grid.pdb. If given a string chiLife will use the string as the file name.return_grid (bool, default False) – If
return_grid=True
, the volume will be returned as a numpy array containing the grid points that do not clash with the neighboring atoms.
- Returns:
volume – The accessible volume at the given site. If
return_grid=True
the volume will be returned as the set of grid points that do not clash with neighboring atoms, otherwise volume will be the estimated accessible volume of the site in cubic angstroms (Å 3).- Return type:
float, np.ndarray