29 Matching Annotations
  1. Mar 2017
    1. However, molecular formula search implemented in some databases, including PubChem Chemical Structure Search, has an option to allow other elements in returned hits (e.g., C6H6O or C6H6N2O for the “C6H6” query).

      Why does it have an option to allow other elements in returned hits when we type C6H6 or any other related molecule? Amita
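
One way to picture the option is as a looser comparison of element counts. The sketch below is only an illustration, not PubChem's implementation: `parse_formula` and `matches` are hypothetical helpers, and it assumes that "allow other elements" means the query's elements must keep the same counts while extra elements are tolerated, which is consistent with the C6H6O and C6H6N2O examples in the quoted passage.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H6N2O' into element counts.

    Brackets, charges and isotopes are not handled in this sketch.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def matches(query, hit, allow_other_elements=False):
    """Return True if `hit` satisfies the formula `query`."""
    q, h = parse_formula(query), parse_formula(hit)
    if allow_other_elements:
        # Assumption: query elements keep their counts, extra elements are allowed.
        return all(h[element] == n for element, n in q.items())
    # Exact search: the two formulas must be identical.
    return q == h


print(matches("C6H6", "C6H6O"))                             # exact search
print(matches("C6H6", "C6H6O", allow_other_elements=True))  # relaxed search
```

With the option switched on, a search for C6H6 can also surface phenol-like hits that an exact formula match would miss.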

    2. The most common types of molecular fingerprints are structural keys, which encode structural information of a molecule into a binary string (that is, a string of 0’s and 1’s).  The position of each number in this string corresponds to a particular fragment.  If the molecule has a particular fragment, the corresponding bit position is set to 1, and otherwise to 0.  Note that there are many different ways to design molecular fingerprints, depending on what fragments are included in the fingerprint definition.  PubChem uses its own fingerprint called PubChem subgraph fingerprints.

      I am confused about binary strings and fingerprints. How do they work to recognize molecules? Amita
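
A toy example may help with the idea of a structural key. The fragment list below is made up for illustration and is not the PubChem subgraph fingerprint definition. The sketch also assumes the fragments present in each molecule are already known, since detecting them requires real substructure matching.

```python
# Hypothetical fragment dictionary: the list index is the bit position.
FRAGMENT_KEYS = [
    "benzene ring",
    "hydroxyl group",
    "carboxylic acid",
    "amine",
    "chlorine atom",
]


def structural_key(fragments_present):
    """Build a binary string with one bit per fragment in FRAGMENT_KEYS."""
    return "".join(
        "1" if fragment in fragments_present else "0"
        for fragment in FRAGMENT_KEYS
    )


phenol = structural_key({"benzene ring", "hydroxyl group"})
aniline = structural_key({"benzene ring", "amine"})
print(phenol)   # 11000
print(aniline)  # 10010
```

Two molecules can then be compared bit by bit: phenol and aniline share the first bit (the ring) and differ in the bits for their functional groups.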

    3. PubChem provides two web-based tools that allow users to perform a cluster analysis of PubChem data:  the Structure Clustering tool and Structure-Activity Relationship (SAR) Analysis tool.

      I am confused about the Structure Clustering tool. How can we use it? I practiced using this tool but did not get any results. Amita

    4. The Structure-Activity Relationship (SAR) Analysis tool

      I am confused about how to use this tool. Amita

    5. On the contrary, superstructure search returns molecules that comprise or make up the provided chemical structure query (that is, substructures that are contained in the query superstructure).  It should be noted that substructure search does not give you substructures of the query and that superstructure search does not return superstructures of the query.

      How do we know which one is the substructure and which is the superstructure? Amita

    1. FLink tool

      The FLink tool is also useful for 3D structure and macromolecular structure searches. https://goo.gl/tgS4QB Amita

    2. However, some special filters, such as the "lipinski rule of 5" filter, or the “all” filter, are not link-based.

      What is the "lipinski rule of 5" filter, and why is it not link-based? Amita

    3. The Entrez Search and Retrieval System

      The link provided here gives details about Entrez and how it works. http://www.ncbi.nlm.nih.gov/books/NBK184582/ Amita

    4. aspirin[completesynonym] (1 hit, as of Feb. 26, 2017) https://www.ncbi.nlm.nih.gov/pccompound/?term=aspirin%5Bcompletesynonym%5D; aspirin[synonym] (98 hits) https://www.ncbi.nlm.nih.gov/pccompound/?term=aspirin%5Bsynonym%5D; aspirin (103 hits)

      I wonder why we get different numbers of compounds when we run different types of queries. Amita
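
The three queries differ only in the bracketed field tag, which tells Entrez which indexed field the term should be matched against, so each one searches a different slice of the records. The sketch below rebuilds the URLs from the quoted passage. `pccompound_url` is a hypothetical helper, and the base URL is copied from the links above.

```python
from urllib.parse import quote

# Base URL taken from the links in the quoted passage.
BASE = "https://www.ncbi.nlm.nih.gov/pccompound/?term="


def pccompound_url(term):
    """Percent-encode an Entrez term, so '[' and ']' become %5B and %5D."""
    return BASE + quote(term)


for term in ("aspirin[completesynonym]", "aspirin[synonym]", "aspirin"):
    print(pccompound_url(term))
```

The first two printed URLs are the ones quoted above. The third has no field tag, so the search is not restricted to the synonym fields.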

    1. Alternatively, PubChemRDF data can also be loaded into RDF-aware graph databases such as Neo4j, and the graph traversal algorithms can be used to query the PubChem knowledge graphs.

      Does anyone know the website for the Neo4j graph database? Amita

    2. In 2-D similarity search, the similarity between chemical structures is quantified using the Tanimoto equation (22–24) in conjunction with the PubChem substructure fingerprint

      What is the Tanimoto equation, and what is its significance? Amita
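
For binary fingerprints, the Tanimoto coefficient is commonly written as c / (a + b - c), where a and b are the numbers of bits set in each fingerprint and c is the number of bits set in both. The sketch below uses short made-up bit strings rather than real PubChem fingerprints. Returning 0.0 when both fingerprints are empty is an assumption of this sketch, since conventions differ.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two equal-length strings of '0' and '1'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = fp_a.count("1")  # bits set in the first fingerprint
    b = fp_b.count("1")  # bits set in the second fingerprint
    c = sum(x == "1" and y == "1" for x, y in zip(fp_a, fp_b))  # bits set in both
    denominator = a + b - c
    # Assumption: two empty fingerprints are scored as 0.0.
    return c / denominator if denominator else 0.0


print(tanimoto("11000", "10010"))  # one shared bit out of three distinct set bits
print(tanimoto("1010", "1010"))    # identical fingerprints
```

The score runs from 0 (no set bits in common) to 1 (identical fingerprints), which is why it can be used to rank the hits of a similarity search.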

    1. Therefore, some records in PubChem can persist with outdated (or incorrect) data.  To help identify such cases, we are introducing a “legacy” indication for contributors and their records.  Please note that this does not mean that data identified as “legacy” is without value.  Quite to the contrary, some legacy collections successfully collected valuable scientific data for the research community, and are simply no longer updating the information.

      How can we determine whether data designated as "legacy" are valuable or not? Amita

    1. The distinction is important as PubChem is organized in three separate databases: Compound, Substance, and BioAssay. 

      What is the connection between BioAssay and Compound and Substance? Amita

    1. TOXNET (http://toxnet.nlm.nih.gov/)22-25, maintained by the National Library of Medicine (NLM) at NIH, is a group of databases covering toxicology, hazardous chemicals, toxic releases, environmental and occupational health, risk assessment. 

      Does TOXNET also deal with nanomaterials and environmental pollution? Amita

    2. 2.6. DrugBank: comprehensive information on drug molecules

      More details about DrugBank can be found at this link (http://www.drugbank.ca/about). Amita

  2. Feb 2017
    1. all data on a computer must be represented in binary notation.

      I am quite confused about binary notation. How is it related to chemistry?
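
As a small illustration, even a molecular formula typed as text is stored as a sequence of bits. The sketch below assumes plain ASCII encoding and uses a hypothetical helper, `to_binary`, to show the bits behind the string "C6H6".

```python
def to_binary(text):
    """Return the ASCII bytes of `text` as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))


print(to_binary("C6H6"))  # 01000011 00110110 01001000 00110110
```

Chemical file formats, line notations, and fingerprints are all eventually stored this way, which is why chemical information has to be put into a form a computer can encode.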

    2. SPARQL which is the interestingly recursive acronym that stands for ‘SPARQL Protocol and RDF Query Language’.

      What is the significance of this?

    1. Mass Spectrum

      I did not find the mass spectrum of benzene in MassBank. Can you tell me how to find it?

    1. Safe Drinking Water Act: Consumer Confidence Reports

      How can we get any data from this link?

    1. websites where you can obtain reliable spectral data, and software for viewing/simulating spectra.

      Is there any free software or website where we can analyze SEM and TEM images and TGA spectra?

    1. Another extension of SMILES is SMIRKS28,29, which is a line notation for generic reactions. 

      Can you provide more details on SMIRKS and SMARTS, with examples? How can we generate a reaction using this extension?

    2. Actually, it is very common that there are a lot of SMILES strings that represent the same structure, whether it has a ring or not, because one can start with any atom in a molecule to derive a SMILES string.  Therefore, it is necessary to select a “unique SMILES” for a molecule among many possibilities.  Because this is done through a process called “canonicalization”, this unique SMILES string is also called the “canonical SMILES”.

      How can we perform canonicalization to get a unique SMILES?

    1. All InChIs currently are prefixed with “InChI=”. Following this, a designator of “1/” or “1S/” indicates whether the InChI is non-standard or standard (i.e. with fixed standardized options in the software)

      Can one compound or molecule have both standard and non-standard InChIs?
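
The designator only records which options were used to generate the string, so in principle the same compound can be described by either kind of InChI. The sketch below tells the two apart by prefix, assuming the prefix is written `InChI=` as the InChI software emits it. `inchi_kind` is a hypothetical helper, and the non-standard benzene string is only illustrative.

```python
def inchi_kind(inchi):
    """Classify a version 1 InChI string by its version designator."""
    if inchi.startswith("InChI=1S/"):
        return "standard"
    if inchi.startswith("InChI=1/"):
        return "non-standard"
    raise ValueError("not a version 1 InChI string")


# Standard InChI for benzene.
print(inchi_kind("InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"))
# Illustrative non-standard string: same layers, different designator.
print(inchi_kind("InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"))
```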

    2. the full stop (“.”) which overrides the implicit single bond between adjacent atoms we can make some exotic variants on SMILES: C1C.CC1 (butane), C1CC1.C2CC2 (cyclohexane)

      This section explains the notation for molecules with an even number of carbon atoms, like butane, but not for molecules with an odd number, such as heptane, cycloheptane, or cycloheptene. How can we write the notation for these compounds?
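
The ordinary SMILES rules do not depend on whether the carbon count is odd or even. A chain is just a run of `C` atoms, and a ring is closed by giving the first and last atom the same ring-closure digit. The sketch below writes those patterns for any size. The helper names are hypothetical, and the strings are valid but not necessarily canonical SMILES.

```python
def n_alkane(n):
    """SMILES for an unbranched alkane with n carbons, e.g. heptane for n=7."""
    if n < 1:
        raise ValueError("need at least one carbon")
    return "C" * n


def cycloalkane(n):
    """SMILES for an unsubstituted ring of n carbons (n >= 3)."""
    if n < 3:
        raise ValueError("a ring needs at least three atoms")
    return "C1" + "C" * (n - 2) + "C1"


def cycloalkene(n):
    """SMILES for a ring of n carbons with one double bond (n >= 3)."""
    if n < 3:
        raise ValueError("a ring needs at least three atoms")
    return "C1=C" + "C" * (n - 3) + "C1"


print(n_alkane(7))     # CCCCCCC     (heptane)
print(cycloalkane(7))  # C1CCCCCC1   (cycloheptane)
print(cycloalkene(7))  # C1=CCCCCC1  (cycloheptene)
```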

    1. Chirality

      How can we write the Mol file for chirality between two different atoms, such as between C-CH3 or C-OH?

    2. Bonds Block

      It is confusing to me how they build up this bonds block. I am confused by the 4-5, 6-1, and 5-7 entries in the bonds block.
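
In a V2000 Molfile, each line of the bonds block names two atoms by their row number in the atom block, followed by a bond type code (1 single, 2 double, 3 triple, 4 aromatic), each in a three-character column. The sketch below reads such lines. The three sample lines are made up to mirror the 4-5, 6-1, and 5-7 entries, so their bond types are assumptions and not values copied from the figure.

```python
BOND_TYPES = {1: "single", 2: "double", 3: "triple", 4: "aromatic"}


def parse_bond_line(line):
    """Read the first three 3-character fields of a V2000 bond line."""
    first_atom = int(line[0:3])   # row number of the first atom
    second_atom = int(line[3:6])  # row number of the second atom
    bond_code = int(line[6:9])    # bond type code
    return first_atom, second_atom, BOND_TYPES.get(bond_code, "other")


# Hypothetical lines; the bond types are chosen only for illustration.
sample_lines = ["  4  5  1  0", "  6  1  2  0", "  5  7  1  0"]
for line in sample_lines:
    print(parse_bond_line(line))
```

Read this way, an entry such as 6-1 simply says that atom 6 is bonded to atom 1, which is how a ring is closed in the table.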

    3. Resonance. Run-of-the-mill delocalization presents some of the same problems as aromaticity, but there is no conventional label for (non-aromatic) delocalized electrons, such as the delocalized negative charge and pi system in benzoate (VII and VIII). The connection tables will simply represent one resonance structure or another.

      In the figures MOL VII and MOL VIII, a 5 is inserted in the V2000 file. What is the significance of this value, and how can we know which value should be inserted for other resonance structures?
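
One possible reading is that the 5 sits in the charge field of an oxygen's atom line. The figure is not reproduced here, so this is a guess. In the V2000 atom block the charge field is a code rather than the charge itself, and the sketch below decodes it. `decode_charge_code` is a hypothetical helper.

```python
# V2000 atom-block charge codes mapped to formal charges.
CHARGE_CODES = {0: 0, 1: +3, 2: +2, 3: +1, 5: -1, 6: -2, 7: -3}


def decode_charge_code(code):
    """Translate a V2000 charge code into a readable description."""
    if code == 4:
        # Code 4 marks a doublet radical rather than a charge.
        return "doublet radical"
    if code not in CHARGE_CODES:
        raise ValueError("unknown V2000 charge code")
    return "formal charge {:+d}".format(CHARGE_CODES[code])


print(decode_charge_code(5))  # formal charge -1
print(decode_charge_code(3))  # formal charge +1
```

Under that reading, each resonance structure of benzoate would carry the 5 on whichever oxygen holds the negative charge in that particular drawing.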

    1. We would need to add an additional field to the atom and/or bond table to handle chirality (SCT VI, VII). We could do so either in a chemically sophisticated way, annotating the atom property, in a chemically-naive translation of a diagram feature, annotating the bond configuration, or both.

      In this section, SCT VI is used for the R stereoisomer and SCT VII for the S stereoisomer, but they look similar, so how can we recognize which one is which? The authors said we can recognize them in a chemically sophisticated way, but they have not fully explained it.

    1. Of course, chemical structures and topological graphs are not entirely equivalent: a connection table is akin to a description of a single valence bond structure and does not take account, for example, of delocalized bonds.

      The connection table cannot account for delocalized bonds, but most organic compounds contain delocalized bonds, so how can we understand these bonds through databases? And how can we say that the connection table is a good method?