List of molecular descriptors by dimension

2019/1/5

What is a molecular descriptor?

A molecular descriptor is a numerical value that expresses a characteristic of a molecule based on its chemical structure. Descriptors are classified as 0D to 4D according to the dimensionality of the structural representation used to calculate them.

Descriptor list by number of dimensions

| Dimension | Descriptor type | Examples |
|---|---|---|
| 0D | Constitutional descriptors, count descriptors | Molecular weight, number of bonds, number of atoms of each element (C, H, O, N, etc.) |
| 1D | Fragment counts, fingerprints | Count, or presence/absence (0 or 1), of specific substructures: -CH3, -OH, -NH2, -COOH, -CH2-, -CH2-CH2-… etc. |
| 2D | Topological descriptors (topological index, connectivity index) | Balaban J index, Zagreb index, Wiener index, Chi connectivity index, kappa shape index, BCUT |
| 3D | Geometrical descriptors | 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors, quantum-chemical descriptors, size, steric, surface and volume descriptors, etc. |
| 4D | Interaction energy (3D coordinates + conformational sampling) | Grid, CoMFA, Volsurf |

Illustration of 0D to 3D descriptors

The following slide makes the classification easy to grasp. It is from the Chemometrics and QSAR Research Group, in course material hosted by the University of Strasbourg, France.

Source: http://infochim.u-strasbg.fr/CS3/program/material/Todeschini.pdf

0D descriptor

0D molecular descriptors are also called constitutional descriptors or count descriptors.

They include the molecular weight, the counts of particular atoms in the molecule (C, H, O, N, halogens), the total number of heavy atoms, the number of rings, the number of rotatable bonds, the number of double (or triple) bonds, and so on. For the most part, these are values that can be read off the molecular formula or obtained by simple counting.
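
As a rough illustration of how little information a 0D descriptor needs, the sketch below derives atom counts, the heavy-atom count, and the molecular weight from a molecular formula alone. The function names (`parse_formula`, `zero_d_descriptors`) and the small atomic-weight table are assumptions made for this example, and the parser only handles simple formulas without parentheses or charges.

```python
import re

# Rounded standard atomic weights for a few common elements (illustrative subset).
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
    "F": 18.998, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C2H6O'."""
    counts = {}
    # Each match is an element symbol followed by an optional number.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def zero_d_descriptors(formula):
    """Compute a few 0D (constitutional) descriptors from the formula only."""
    counts = parse_formula(formula)
    return {
        "atom_counts": counts,
        "n_atoms": sum(counts.values()),
        "n_heavy_atoms": sum(n for el, n in counts.items() if el != "H"),
        "molecular_weight": sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items()),
    }


# Ethanol, C2H6O
print(zero_d_descriptors("C2H6O"))
```

Counts such as the number of rings or rotatable bonds need the bond list as well, so they are not covered by this formula-only sketch.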

1D descriptor

1D descriptors either count specific functional groups and substructures (fragment counts) or express their presence or absence as 1 or 0 (fingerprints).
The functional groups and substructures considered include primary, secondary and tertiary carbons, terminal and internal carbons, hydroxy groups, amino groups, amide groups, imino groups, carboxylic acids, thiols, benzene rings, aromatic rings, etc.
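
The difference between a fragment count and a fingerprint bit can be sketched as follows. This is only an illustration: the fragment list and the per-molecule counts are entered by hand here (a real tool would find them by substructure matching), and `to_fingerprint` is a hypothetical helper name.

```python
# Fragment order defines the position of each count / bit in the vector.
FRAGMENTS = ["-CH3", "-CH2-", "-OH", "-NH2", "-COOH"]


def to_fingerprint(fragment_counts, fragments=FRAGMENTS):
    """Turn fragment counts into presence/absence bits (1 or 0)."""
    return [1 if fragment_counts.get(f, 0) > 0 else 0 for f in fragments]


# Hand-entered counts for propane (CH3-CH2-CH3).
propane = {"-CH3": 2, "-CH2-": 1}

count_vector = [propane.get(f, 0) for f in FRAGMENTS]
print(count_vector)             # [2, 1, 0, 0, 0]
print(to_fingerprint(propane))  # [1, 1, 0, 0, 0]
```

The count vector keeps how many times each fragment occurs, while the fingerprint only records whether it occurs at all.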

The numbers of hydrogen bond donor and acceptor atoms, and physical property values such as the various LogP estimates (AlogP, ClogP, SlogP, XlogP, etc.), are also counted among the 1D descriptors.

2D descriptor

2D descriptors include the topological descriptors, which are also called topological indices or connectivity indices. Haruo Hosoya, professor emeritus at Ochanomizu University, is known as the originator of the topological index.

A topological descriptor is a value computed as a graph invariant of the molecular graph, that is, of the compound represented as a graph structure.

Example:
Wiener index: The sum of the shortest-path distances (in bonds) between all pairs of heavy atoms in a molecule.
Topological Polar Surface Area (TPSA): The area of the polar part of the molecular surface. It is a fast approximation, computed from 2D information, of the PSA, which itself requires a three-dimensional structure.
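
To make the Wiener index concrete, here is a minimal sketch that computes it by breadth-first search over a hydrogen-suppressed molecular graph. The adjacency-list input format and the function name `wiener_index` are choices made for this example, not a particular library's API.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all pairs of atoms.

    `adjacency` maps each heavy-atom index to the indices of its bonded
    neighbours (hydrogens are left out of the graph).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    # Each pair was counted twice (once from each end).
    return total // 2


# n-butane: C0-C1-C2-C3 (a chain)
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# isobutane: central C0 bonded to C1, C2, C3
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

The two isomers have the same 0D descriptors but different Wiener indices, which shows what the 2D graph adds: the more branched, more compact molecule gets the smaller value.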

Descriptors such as TPSA, which approximate 3D information from 2D information, are sometimes called 2.5D descriptors. Some descriptors usually classed as 3D also fall into this category.

3D descriptor

The three-dimensional descriptor is a value calculated based on the three-dimensional structure of the compound. An accurate 3D structure is required to calculate the 3D descriptor.

Examples include values obtained from quantum chemistry calculations (HOMO/LUMO energy levels, etc.) and eigenvalues computed from a molecular matrix that is built from the x, y, z coordinates of the atoms, weighted according to the properties of each atom.

4D descriptor

A 4D descriptor is defined through the interaction of the molecule with other molecules, for example an interaction energy. Such descriptors are obtained with methods such as Grid, CoMFA, and Volsurf.

Dimensional classification of descriptors

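The way descriptors are assigned to dimensions varies. The English Wikipedia uses 0-4 dimensions, other sources use 0-3 dimensions, and RDKit and PaDEL-Descriptor divide descriptors into 1D & 2D and 3D (some sources also treat Grid, CoMFA, and Volsurf as 3D). The classification differs by source and software, but in practice I think it is enough to distinguish two broad groups: descriptors of 0-2 dimensions, which can be calculated from SMILES alone, and descriptors of 3 or more dimensions, which require three-dimensional structure information.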

References
・ English wiki https://en.wikipedia.org/wiki/Molecular_descriptor
・ ScienceDirect Topic https://www.sciencedirect.com/topics/medicine-and-dentistry/molecular-descriptor
・ Slide source http://infochim.u-strasbg.fr/CS3/program/material/Todeschini.pdf