34 Matching Annotations
  1. Feb 2017
    1. aromatic bond type

      How would an aromatic bond be represented if it had to be used to bypass the multiple possible configurations?

    1. 3D (x,y,z) coordinates can also be stored for each atom and used to display the conformation of a molecule. These coordinates may be determined experimentally (typically via x-ray crystallography), or calculated (using force-fields, quantum chemistry, molecular dynamics or composite models such as docking). Understanding a molecule's actual shape, whether it be in solution, in a vacuum, or in the binding site of a protein, opens up a whole new domain of computational chemistry. Most molecules have some flexibility, and even if a given conformation is the most stable, there are often a number of competing shapes to consider. Knowing how a particular set of coordinates was determined is crucial to making intelligent use of it for cheminformatics purposes.

      This would be a very good project if the 3D data for all the different conformations were indexed and compared based on the type of influence and how it changed a molecule's shape.

    1. A substructure search query can be matched against a connection table atom-by-atom but the so-called subgraph isomorphism algorithm that is used in substructure search to compare one graph against another is slow and complex and it is likely that there may be many mismatches before a hit is found. A substructure search can be carried out faster if an initial screening stage is carried out to filter out quickly structures that could not possibly be matches. A common method is to use substructure fragments as the filter.

      It's very interesting that breaking a structure apart into fragments, performing a search, and then rebuilding the structure as part of a screening process with a complex search would be more accurate than just using a connection table to compare each atom.
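
A minimal sketch of the screening stage described in the quoted passage may help here. It assumes each structure has already been reduced to a set of fragment keys; the fragment names and the three-entry database are hypothetical and chosen only for illustration. The screen makes the search faster rather than more accurate: it discards structures that cannot match, and the survivors would still need the atom-by-atom comparison.

```python
from typing import Dict, FrozenSet, List

# Hypothetical fragment keys; a real system would derive them from each structure.
DATABASE: Dict[str, FrozenSet[str]] = {
    "ethanol": frozenset({"C-C", "C-O", "O-H"}),
    "acetic acid": frozenset({"C-C", "C-O", "C=O", "O-H"}),
    "ethylamine": frozenset({"C-C", "C-N", "N-H"}),
}


def screen(query: FrozenSet[str], database: Dict[str, FrozenSet[str]]) -> List[str]:
    """Keep only structures that contain every fragment present in the query."""
    return [name for name, fragments in database.items() if query <= fragments]


# Structures lacking C=O or O-H cannot contain the query, so they are dropped
# before any (slow) atom-by-atom subgraph matching is attempted.
candidates = screen(frozenset({"C=O", "O-H"}), DATABASE)
print(candidates)
```

A structure that passes the screen is only a candidate: it has the right fragments, but they may not be connected the way the query requires.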

    2. The Molecular Structure Encoding System, MOSES, from Molecular Networks (Erlangen, Germany) is a later development from Gasteiger’s team.

      What features does MOSES have, and how does it address the issue of delocalization of electrons between more than two atoms?

    3. Of course, chemical structures and topological graphs are not entirely equivalent: a connection table is akin to a description of a single valence bond structure and does not take account, for example, of delocalized bonds.

      The connection table cannot account for delocalized bonds, but most organic compounds are made up of delocalized bonds, so how can we understand these bonds through databases? And how can we say that the connection table is a good method?

    4. WIREs Computational Molecular Science, Representation of chemical structures. FIGURE 3 | A redundant connection table

      Just a comment here. This table is much easier for me to follow than the examples in module 2 part 2.

    5. Of course, chemical structures and topological graphs are not entirely equivalent: a connection table is akin to a description of a single valence bond structure and does not take account, for example, of delocalized bonds. Alternative approaches have been suggested.

      Why hasn’t there been a bigger deal made about the fact that connection tables do not take into account delocalized bonds?

    6. The Morgan algorithm identifies atoms based on an extended connectivity value. The atom with the highest value becomes the first atom in the name, and its neighbors are then listed in descending order. Ties are resolved based on additional parameters, for example, bond order and atomic number. The original Morgan algorithm did not handle stereochemistry; the stereochemically unique naming algorithm [stereochemical extension of Morgan algorithm (SEMA)] was developed to handle stereoisomers

      If the Figure 3 connection table was created with the Morgan algorithm, why is the carbon at the far left numbered as the first atom rather than the nitrogen?
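
The extended connectivity idea in the quoted passage can be sketched as follows. This is a simplified illustration, not the full Morgan algorithm: it computes only the connectivity values and leaves out tie-breaking by bond order or atomic number. The molecule is a hypothetical example (the heavy atoms of butylamine, numbered N1-C2-C3-C4-C5), not the structure in Figure 3. It does suggest one possible reason a carbon can come first: the first atom is chosen by connectivity value, and the element type only matters when values tie.

```python
from typing import Dict, List


def extended_connectivity(adjacency: Dict[int, List[int]]) -> Dict[int, int]:
    """Return extended connectivity values for each atom.

    Start from the number of neighbors, then repeatedly replace each value
    with the sum of the neighbors' values. Stop when an iteration no longer
    increases the number of distinct values, and keep the previous values.
    """
    values = {atom: len(neighbors) for atom, neighbors in adjacency.items()}
    n_classes = len(set(values.values()))
    while True:
        new_values = {
            atom: sum(values[n] for n in neighbors)
            for atom, neighbors in adjacency.items()
        }
        new_classes = len(set(new_values.values()))
        if new_classes <= n_classes:
            return values
        values, n_classes = new_values, new_classes


# Hypothetical example: heavy atoms of butylamine, N1-C2-C3-C4-C5.
BUTYLAMINE = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ELEMENTS = {1: "N", 2: "C", 3: "C", 4: "C", 5: "C"}

ec = extended_connectivity(BUTYLAMINE)
first_atom = max(ec, key=ec.get)
print(ec, "first atom:", first_atom, ELEMENTS[first_atom])
```

In this example the central carbon receives the highest value, so it would be numbered first even though the chain contains a nitrogen.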

    7. Another notation, called representation of structure diagram arranged linearly (ROSDAL),52,53 was written to transfer structures quickly in a compact form over a network to enable searching of the Beilstein database online.54 ROSDAL is still supported by InfoChem, and by Elsevier (Amsterdam, The Netherlands) in Reaxys (vide infra) and the Beilstein CrossFire structure editor.

      I noticed that the ROSDAL notation is still supported by Elsevier. Doesn't Elsevier use SMILES, as most other information systems do today?

    1. Chirality

      How can we write the MOL file for chirality between two different atoms, such as chirality between C-CH3 or C-OH?

    2. Bonds Block

      I find it confusing how this bond block is built up. I am confused by the 4-5, 6-1, and 5-7 entries in the bond block.
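
As a rough aid for reading a bond block, the sketch below parses V2000 bond lines, in which each field is three characters wide: first atom number, second atom number, bond type, and bond stereo. The three lines in `BOND_BLOCK` are hypothetical and are not taken from the MOL file under discussion; the names `Bond`, `parse_bond_line`, and `STEREO_NAMES` are likewise illustrative.

```python
from typing import NamedTuple


class Bond(NamedTuple):
    first_atom: int
    second_atom: int
    bond_type: int   # 1 = single, 2 = double, 3 = triple, 4 = aromatic
    stereo: int      # for single bonds: 0 = none, 1 = up, 4 = either, 6 = down


STEREO_NAMES = {0: "none", 1: "up (wedge)", 4: "either", 6: "down (hash)"}


def parse_bond_line(line: str) -> Bond:
    """Read the first four 3-character fields of a V2000 bond line."""
    fields = [line[i:i + 3].strip() for i in range(0, 12, 3)]
    first, second, bond_type, stereo = (int(f) if f else 0 for f in fields)
    return Bond(first, second, bond_type, stereo)


# Hypothetical bond lines, for illustration only.
BOND_BLOCK = [
    "  4  5  1  0",
    "  6  1  2  0",
    "  5  7  1  1",
]

bonds = [parse_bond_line(line) for line in BOND_BLOCK]
for bond in bonds:
    print(bond, "stereo:", STEREO_NAMES[bond.stereo])
```

Read this way, a line such as `  5  7  1  1` says that atoms 5 and 7 are joined by a single bond drawn as a wedge, which is how a drawn stereocenter is recorded in the bond block.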

    3. Resonance: Run-of-the-mill delocalization presents some of the same problems as aromaticity, but there is no conventional label for (non-aromatic) delocalized electrons, such as the delocalized negative charge and pi system in benzoate (VII and VIII). The connection tables will simply represent one resonance structure or another.

      In the figures for MOL VII and MOL VIII, a 5 is inserted in the V2000 (file format). What is the significance of this value, and how can we know which value should be inserted for other resonance structures?

    4. To make things even more complicated, software may account for the chirality of a stereocenter atom when generating a MOL file but ignore it when rendering a MOL file!

      Is there a way to catch this if it happens?

    5. Each of the two Kekulé structures for the benzene ring shows up as a different set of single and double bonds (MOL I, MOL IV). The Bond Tables are different: The MOL file format uses the number 4 to indicate bonds that are explicitly labeled as aromatic (MOL V). This has the advantage of differentiating aromatic bonds from single and double bonds without requiring the chemist to write a script to identify and label the alternating single and double bonds of a Kekulé structure.

      When would you use the aromatic structure instead of the Kekulé structures?

    6. Are connection tables not used for inorganic compounds? Is there a different program used to analyze these?

    7. Our atom table will consist of two fields: one an index number identifying the atom we’re talking about, one indicating atom type (i.e. C, H, O, N, etc.). Our bond table will consist of three fields: two indicating the two atoms that the bond connects, and one indicating the bond order (1=single, 2=double, 3=triple).

      Are the atoms numbered by a set of governing rules or does it matter which atom is numbered what?

      Even given the following examples, I still do not understand how to read or create a bond table. I would like to see an example worked in steps so that I can follow the process.
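
One way to work through an atom table and bond table step by step is sketched below. The molecule (formaldehyde, H2C=O) and its atom numbering are chosen here for illustration and are not one of the SCT examples from the text. The tables use exactly the fields described in the quoted passage: an index and atom type for each atom, and two atom indices plus a bond order for each bond.

```python
from typing import Dict, List, Tuple

# Atom table: (index, atom type). The numbering is arbitrary.
ATOMS: List[Tuple[int, str]] = [(1, "C"), (2, "O"), (3, "H"), (4, "H")]

# Bond table: (first atom, second atom, bond order).
BONDS: List[Tuple[int, int, int]] = [(1, 2, 2), (1, 3, 1), (1, 4, 1)]

ORDER_NAMES = {1: "single", 2: "double", 3: "triple"}


def describe_bonds(atoms, bonds) -> List[str]:
    """Translate each bond-table row into a readable sentence."""
    symbol = dict(atoms)
    return [
        f"{symbol[a]}{a} has a {ORDER_NAMES[order]} bond to {symbol[b]}{b}"
        for a, b, order in bonds
    ]


def bond_order_sums(atoms, bonds) -> Dict[int, int]:
    """Add up the bond orders at each atom as a quick sanity check."""
    totals = {index: 0 for index, _ in atoms}
    for a, b, order in bonds:
        totals[a] += order
        totals[b] += order
    return totals


descriptions = describe_bonds(ATOMS, BONDS)
totals = bond_order_sums(ATOMS, BONDS)
print("\n".join(descriptions))
print(totals)
```

Reading the table is then a matter of taking one bond row at a time: look up both indices in the atom table, note the bond order, and check at the end that each atom has its usual number of bonds (four for carbon, two for oxygen, one for hydrogen).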

    8. As an example, take isopropyl alcohol. SCT I is a connection table representing this compound – or, more specifically, representing this structural formula. Connection tables are not necessarily unique. We could draw up other tables of atoms and bonds that represent this compound as well: for example, SCT II and SCT III.

      I'm just wondering how this connection table is formed. Is it from the Morgan algorithm, or are the numbers just assigned to the atoms randomly?

    9. As a starting point, this section will introduce a simplified form of connection table, which we’ll call an “SCT”. This SCT does not correspond directly to any existing file format (at least as far as we know!). Rather, it is a convenient model that we will use just for the purpose of this demonstration.

      Is an SCT table similar to or the same as an MDL table?

    10. The MOL file format uses the number 4 to indicate bonds that are explicitly labeled as aromatic (MOL V). This has the advantage of differentiating aromatic bonds from single and double bonds without requiring the chemist to write a script to identify and label the alternating single and double bonds of a Kekulé structure. However, some software may not be built to handle this convention. (You might even run into cases in which it’s interpreted as a quadruple bond!)

      The Kekulé A and Kekulé B connection tables for the bond types are very confusing when I try to switch from one to the other. The introduction of the third type, using just the number four, is great.
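
The difference between the two Kekulé bond tables and the aromatic labeling can be sketched in a few lines. The ring numbering below is hypothetical and is not copied from MOL I, MOL IV, or MOL V. The sketch also assumes we already know which atoms are aromatic; it does not try to detect aromaticity.

```python
from typing import List, Set, Tuple

BondTable = List[Tuple[int, int, int]]  # (first atom, second atom, bond order)

# Two Kekulé drawings of the same benzene ring, atoms numbered 1 to 6.
KEKULE_A: BondTable = [(1, 2, 2), (2, 3, 1), (3, 4, 2), (4, 5, 1), (5, 6, 2), (6, 1, 1)]
KEKULE_B: BondTable = [(1, 2, 1), (2, 3, 2), (3, 4, 1), (4, 5, 2), (5, 6, 1), (6, 1, 2)]


def label_aromatic(bonds: BondTable, aromatic_atoms: Set[int]) -> BondTable:
    """Replace the order of every bond between two aromatic atoms with 4.

    Simplification: this would also relabel a single bond joining two
    separate aromatic rings, so it is only safe for a lone ring like this.
    """
    return [
        (a, b, 4 if a in aromatic_atoms and b in aromatic_atoms else order)
        for a, b, order in bonds
    ]


ring = {1, 2, 3, 4, 5, 6}
aromatic_a = label_aromatic(KEKULE_A, ring)
aromatic_b = label_aromatic(KEKULE_B, ring)
print(KEKULE_A == KEKULE_B)      # the Kekulé tables differ
print(aromatic_a == aromatic_b)  # the aromatic tables agree
```

This is the practical appeal of bond type 4: both drawings collapse to the same table, provided the software reading the file understands the convention.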

    1. these additional data, coupled with the additional metadata recorded in the SDF format, makes this file format ideal

      In what instances is MDL ideal and/or when would you prefer MDL over SDF?

    2. a key advantage of the Molfile and SDF formats is the inclusion of geometric information regarding the spatial arrangement of atoms in three-dimensional space.

      You talk about the advantages of using both the MDL and SDF formats. Are there any disadvantages that would make you use other formats?

    1. The chemical formula is straightforward enough, but the connectivity and hydrogen sections require some explanation. The connectivity layer describes chains and branches – for example in the above Propene example, atom 1 is bonded to atom 3, which is bonded to atom 2. In the final example, we have branching, as represented by paren

      This explanation further down makes sense, considering that "different implementations interpret SMILES differently", especially for compounds that have the same chemical formula. I was wondering how you would get to the right InChI key, starting from a chemical formula.

    2. the full stop (“.”) which overrides the implicit single bond between adjacent atoms we can make some exotic variants on SMILES: C1C.CC1 Butane; C1CC1.C2CC2 Cyclohexane

      This section explains the notation for molecules with an even number of carbon atoms, like butane, but not for molecules with an odd number, such as heptane, cycloheptane, or cycloheptene. How can we write the notation for these compounds?

    3. An InChI Key can also be generated for a compound. This is completely separate from the InChI linear notation, and is used to provide an identifier for a compound that is particularly suitable for use in Web search engine

      Does this mean that a regular InChI cannot be used in Web search engines?

    4. Systematic name

      Is there a way to update this definition on wikipedia or somehow make the definition more accessible via a search engine? I struggled to find a legitimate, consistent definition for a related assignment in my cheminformatics class. I wish I had seen this sooner.

    5. An InChI Key can also be generated for a compound. This is completely separate from the InChI linear notation, and is used to provide an identifier for a compound that is particularly suitable for use in Web search engines. It is an ASCII character string based on a hashing of the InChI linear notation, but is of fixed length and uses only characters not normally conside

      I was wondering when you would use the InChI Key instead of regular InChI?
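
The idea of a fixed-length key produced by hashing can be illustrated with a toy function. To be clear about the assumptions: `toy_key` is a hypothetical helper, it is not the InChIKey algorithm, and its output is not a valid InChIKey. It only demonstrates the two properties mentioned in the quoted passage: the key has the same length whatever the length of the input, and it uses a restricted character set (uppercase letters here).

```python
import hashlib


def toy_key(text: str, length: int = 14) -> str:
    """Map any string to a fixed-length string of uppercase letters.

    Illustration only: real InChIKeys are produced by the InChI software.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 bytes
    return "".join(chr(ord("A") + byte % 26) for byte in digest[:length])


propene = "InChI=1S/C3H6/c1-3-2/h3H,1H2,2H3"
ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

key_propene = toy_key(propene)
key_ethanol = toy_key(ethanol)
print(len(propene), len(key_propene), key_propene)
print(len(ethanol), len(key_ethanol), key_ethanol)
```

A short, fixed-length string with no punctuation is easier to index and to type into a search box than a long InChI. The trade-off is that a hash cannot be converted back into the structure, whereas the InChI itself can.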

    6. Fortunately, this can be done with the Morgan Algorithm18. In this algorithm, each atom is given a "connectivity value" reflecting how many atoms it is connecte

      So you can create as many connection tables as you like, as in the exercise on the OLCC website, but the correct or "canonical" table is the one generated with the Morgan algorithm. Does that sound right?

    7. All InChIs currently are prefixed with “INCHI=”. Following this, a designator of “1/” or “1S/” indicates whether the InChI is non-standard or standard (i.e. with fixed standardized options in the software)

      When using InChI, when exactly does it matter whether the standard or the non-standard option is used?
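
A small sketch of how the prefix and the standard flag could be read from an InChI string follows. The function name `parse_inchi` and the returned dictionary keys are hypothetical. Strings written by the InChI software use the mixed-case prefix `InChI=`, so the check below ignores case. The layer handling is deliberately simplified: it assumes a formula is present and that each layer letter appears only once, which is not true of every InChI.

```python
from typing import Dict


def parse_inchi(inchi: str) -> Dict[str, object]:
    """Split an InChI into version, standard flag, formula, and layers."""
    prefix, _, body = inchi.partition("=")
    if prefix.lower() != "inchi" or not body:
        raise ValueError("not an InChI string")
    parts = body.split("/")
    version = parts[0]  # "1S" for standard, "1" for non-standard
    return {
        "version": version.rstrip("S"),
        "standard": version.endswith("S"),
        "formula": parts[1] if len(parts) > 1 else "",
        # Each remaining layer starts with a one-letter tag, e.g. c or h.
        "layers": {p[0]: p[1:] for p in parts[2:] if p},
    }


standard = parse_inchi("InChI=1S/C3H6/c1-3-2/h3H,1H2,2H3")
non_standard = parse_inchi("InChI=1/C3H6/c1-3-2/h3H,1H2,2H3")
print(standard)
print(non_standard["standard"])
```

For propene the connectivity layer reads `1-3-2`, matching the description in the course text that atom 1 is bonded to atom 3, which is bonded to atom 2.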

    1. We would need to add an additional field to the atom and/or bond table to handle chirality (SCT VI, VII). We could do so either in a chemically sophisticated way, annotating the atom property, in a chemically-naive translation of a diagram feature, annotating the bond configuration, or both.

      In this section, SCT VI is used for the R stereoisomer and SCT VII for the S isomer, but they look similar, so how can we recognize which one is which? The authors say we can distinguish them in a chemically sophisticated way, but they have not fully explained it.

  2. Jan 2017
    1. Molecular similarity methods can be broadly classified into two-dimensional (2-D) and three-dimensional (3-D) similarity methods.  Typically, 2-D similarity methods use so-called molecular fingerprints, which encode structural information of a molecule into a binary string (that is, a string of 0’s and 1’s).  The position of each number in this string corresponds to a particular fragment.  If the molecule has a particular fragment, t

      Using the Tanimoto coefficient, compare the two molecules on your list.
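
For the comparison asked for above, a minimal sketch of the Tanimoto coefficient on binary fingerprints is given below. The two 8-bit fingerprints are made up for illustration; real fingerprints are much longer and are generated by cheminformatics software. With a bits set in the first fingerprint, b in the second, and c set in both, the coefficient is c / (a + b - c).

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient of two equal-length strings of 0s and 1s."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = fp_a.count("1")
    b = fp_b.count("1")
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")
    if a + b - c == 0:
        return 0.0  # convention chosen here for two all-zero fingerprints
    return c / (a + b - c)


# Hypothetical fingerprints: each position stands for one fragment.
molecule_1 = "11010010"
molecule_2 = "10010110"

similarity = tanimoto(molecule_1, molecule_2)
print(similarity)  # 3 shared bits / (4 + 4 - 3) = 0.6
```

A value of 1.0 means the two fingerprints are identical, and a value of 0.0 means they share no fragments.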

    1. Open Data in Chemistry. One can obtain all scientific data in the public domain when wanted and reuse it for whatever purpose. • Open Standards in Chemistry. One can find visible community mechanisms for protocols and communicating information. The mechanisms for creating and maintaining these standards cover a wide spectrum of human organisations, including various degrees of consent. • Open Source in Chemistry. One can use other people's code without further permission, including changing it for one's own use and distributing it again.

      Assignment 2: Relate ODOSOS to Open Notebook science in 500 words or less

      https://youtu.be/zrE_NEXmHk0