R.I. Nugmanov a*, T.I. Madzhidov a**, I.S. Antipin a***, A.A. Varnek a,b****
a Kazan Federal University, Kazan, 420008 Russia
b University of Strasbourg, Strasbourg, 67084 France
E-mail: *stsouko@live.ru, **tmadzhid@gmail.com, ***iantipin54@ya.ru, ****varnek@unistra.fr
Received November 16, 2017
Abstract
A common problem in managing chemical reaction databases is that most of the stored reactions are stoichiometrically unbalanced, i.e., they lack information about one or more reagents or products. As a result, some reactions cannot be found by structural searches. A solution based on the Condensed Graph of Reaction (CGR) approach is proposed. A CGR is a compact representation of a reaction transformation in which the atoms of the reagents and products are aligned. The resulting pseudomolecular graph contains, along with the usual chemical bonds, so-called dynamic bonds, which correspond to changes in bond order during the chemical transformation. The key observation is that a Condensed Graph contains all the structural information required to describe a chemical reaction and yet can be constructed for an unbalanced reaction with missing reagents or products. When the Condensed Graph is transformed back into a chemical reaction, the information on the missing reagents and products can be restored. The approach has a limitation: stoichiometric coefficients missing from the reaction cannot be recovered in this way. To address this, an algorithm based on isomorphic embedding has been developed. The algorithm detects similar motifs in the Condensed Graph and, if necessary, adds a coefficient before the reagent and/or product, or copies them the required number of times in the reaction equation.
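The idea of restoring missing species through a Condensed Graph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data structures, the function names (`build_cgr`, `components`) and the rule that a bond whose atoms are both absent from one side is treated as unchanged are assumptions made for illustration only, and charges and hydrogens are ignored. The example reaction is CH3Br + OH -> CH3OH, recorded without the bromide product.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Atoms = Dict[int, str]              # atom-map number -> element symbol
Bonds = Dict[FrozenSet[int], int]   # pair of map numbers -> bond order
CGR = Dict[FrozenSet[int], Tuple[Optional[int], Optional[int]]]


def build_cgr(r_atoms: Atoms, r_bonds: Bonds,
              p_atoms: Atoms, p_bonds: Bonds) -> Tuple[Atoms, CGR]:
    """Superimpose both sides; each bond gets a (before, after) order pair."""
    atoms = {**r_atoms, **p_atoms}
    cgr: CGR = {}
    for bond in r_bonds.keys() | p_bonds.keys():
        before, after = r_bonds.get(bond), p_bonds.get(bond)
        # Assumption: if neither atom of the bond is recorded on a side,
        # the bond is taken to be unchanged there.
        if before is None and bond.isdisjoint(r_atoms):
            before = after
        if after is None and bond.isdisjoint(p_atoms):
            after = before
        cgr[bond] = (before, after)
    return atoms, cgr


def components(atoms: Atoms, cgr: CGR, side: int) -> List[Set[int]]:
    """Split one side of the CGR (0 = reagents, 1 = products) into molecules."""
    neighbours: Dict[int, Set[int]] = {a: set() for a in atoms}
    for bond, orders in cgr.items():
        if orders[side] is not None:
            i, j = tuple(bond)
            neighbours[i].add(j)
            neighbours[j].add(i)
    seen: Set[int] = set()
    result: List[Set[int]] = []
    for start in sorted(atoms):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            a = stack.pop()
            if a not in comp:
                comp.add(a)
                stack.extend(neighbours[a] - comp)
        seen |= comp
        result.append(comp)
    return result


# Unbalanced record: the bromide product (atom 2) is missing.
r_atoms = {1: "C", 2: "Br", 3: "O"}
r_bonds = {frozenset({1, 2}): 1}
p_atoms = {1: "C", 3: "O"}
p_bonds = {frozenset({1, 3}): 1}

atoms, cgr = build_cgr(r_atoms, r_bonds, p_atoms, p_bonds)
dynamic = {b: o for b, o in cgr.items() if o[0] != o[1]}  # changed bonds
products = components(atoms, cgr, side=1)
restored = set(atoms) - set(p_atoms)  # atoms recovered on the product side
```

In this sketch the C–Br bond is marked as cleaved and the C–O bond as formed, and decomposing the product side of the graph yields the bromine atom as a separate species that the original record omitted. Missing coefficients are outside the scope of this sketch.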
Keywords: chemoinformatics, chemical reactions, stoichiometric reaction balancing, Condensed Graph of Reaction
Acknowledgments. The study was supported by the Russian Science Foundation (project no. 14-43-00024).
Figure Captions
Fig. 1. An example of unbalanced (a) and balanced (b) reactions. If the user is interested in reactions involving molecule Q, the unbalanced reaction (a) will not be retrieved by the search, whereas the balanced reaction (b) will be.
Fig. 2. A chemical reaction (a) and its Condensed Graph (b). The numbers next to the atoms correspond to the atom-to-atom mapping. In the CGR, a crossed bond denotes a cleaved bond, and a circle marks the newly formed single C–O bond. Converting the CGR back into a reaction (c) allows the missing reagents and products to be identified.
Fig. 3. a) An unbalanced reaction with missing coefficients and products (a1), its Condensed Graph (a2), and the final balanced reaction (a3). Common fragments are identified using the CGR. The restoration of the lost information about the leaving group is shown. b) An unbalanced reaction (b1), the incorrectly balanced reaction (b2), and the reaction restored (b3) on the basis of the empirical rule (b4), which provides information about the missing reagents and the number of products.
For citation: Nugmanov R.I., Madzhidov T.I., Antipin I.S., Varnek A.A. An approach for automated detection of missing reagents and products in chemical reaction equations. Uchenye Zapiski Kazanskogo Universiteta. Seriya Estestvennye Nauki, 2018, vol. 160, no. 1, pp. 32–39. (In Russian)
The content is available under the Creative Commons Attribution 4.0 License.