31 Matching Annotations
  1. May 2017
    1. Two-dimensional (2-D) similarity methods

      In the 2D structure similarity section below, I wonder how the fingerprint indicates the number of bonds. For instance, with a C-C bond, how do you tell two molecules apart when both of them have a C-C bond? - Phuc

  2. Apr 2017
    1. e textile industry is one of the oldest and largest industries on a global level, and while dyes and pigments of natural origin have been used for coloring clothing and other purposes since 3000 BC, it is only within the past 200 years that a wide variety of synthetic dyes have begun to be produced in massive quantities.[20, 21] Textiles are classified based on the material they are made of, which in turn dictates the type of dye which is used to impart them with color; for instance, reactive, direct, and indigo dyes are used for cellulose fibers, while acid dyes are used for protein fibers and both basic and dispersed dyes are used for synthetic fibers.[22] The structures of several common dyes used for textiles, paper, foo

      Don't forget to post.

    1. PubChem subgraph fingerprints,

      Why are these called "subgraph" fingerprints? I see this term on Wikipedia. Is it that each bit of the fingerprint represents a subgraph of the molecule, which is itself the graph? That is, does each bit record the existence or nonexistence of that subgraph within the whole molecule?

  3. Mar 2017
    1. In 2-D similarity methods, structural similarity between two molecules is estimated by comparing their molecular fingerprints. 

      So, will each structure on PubChem have a fingerprint and a molfile? lyndsie
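One way to picture "comparing molecular fingerprints" is the Tanimoto coefficient, a score commonly used for 2-D similarity. The sketch below is only an illustration: the two 8-bit strings are made-up toy fingerprints, not real PubChem fingerprints, and `tanimoto` is a hypothetical helper name.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient between two equal-length bit strings of '0'/'1'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints (shared fragments).
    both = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    # Bits set in at least one fingerprint (all fragments seen).
    either = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    # Convention chosen for this sketch: two empty fingerprints score 0.0.
    return both / either if either else 0.0


# Toy fingerprints (assumed values, for illustration only).
molecule_a = "11010010"
molecule_b = "11000011"
similarity = tanimoto(molecule_a, molecule_b)
```

A score of 1.0 means the two bit strings are identical and 0.0 means they share no set bits; a similarity search keeps the hits whose score is at or above a chosen threshold.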

    2. targets

      What is a target? lyndsie

    3. In substructure search, one provides an input substructure as a query to find molecules that contain the query substructure

      My question is: if I were trying to find a large superstructure with more than one substructure, could I input more than one substructure to search? For example, if I want to find compounds that contain an aldehyde and an alcohol. Lyndise

    4. Although each compound has up to 500 conformers (depending on the molecular size and flexibility),

      Is this true? (nwume)

    5. PubChem homepage (http://pubchem.ncbi.nlm.nih.gov) PubChem Chemical Structure Search (https://pubchem.ncbi.nlm.nih.gov/search/search.cgi) PubChem Search (https://pubchem.ncbi.nlm.nih.gov/search/).

      I noticed that anything you search for using these three different search interfaces gives you the same result. (nwume)

    6. Shape-Tanimoto (ST): quantifies steric shape similarity between two conformers. Color-Tanimoto (CT): quantifies the overlap of functional groups between two conformers, such as hydrogen bond donors and acceptors, cations, anions, rings, and hydrophobes. Combo-Tanimoto (ComboT): the sum of ST and CT scores between two conformers.  It takes into account the shape similarity (ST) and functional group similarity (CT) simultaneously. 

      Which one is the best among these three metrics? (nwume)
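To see how the three scores relate, here is a minimal sketch. It assumes the usual Tanimoto form for an overlap measure, overlap(A,B) / (overlap(A,A) + overlap(B,B) - overlap(A,B)), for both shape and feature ("color") overlap. The function name and all numbers are hypothetical; real values come from aligning two conformers.

```python
def tanimoto_overlap(self_a: float, self_b: float, overlap_ab: float) -> float:
    """Tanimoto score from two self-overlaps and one mutual overlap."""
    return overlap_ab / (self_a + self_b - overlap_ab)


# Assumed shape overlap volumes for conformers A and B (arbitrary units).
st = tanimoto_overlap(self_a=100.0, self_b=120.0, overlap_ab=80.0)

# Assumed feature ("color") overlaps for the same aligned pair.
ct = tanimoto_overlap(self_a=5.0, self_b=6.0, overlap_ab=3.0)

# ComboT is the sum of ST and CT, so it ranges from 0 to 2.
combo_t = st + ct
```

Because ST and CT each run from 0 to 1, none of the three is "best" in general: ST looks only at shape, CT only at functional-group overlap, and ComboT considers both at once.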

    7. The Structure Clustering tool

      Please, I don't understand how to make use of this Structure Clustering tool. (nwume)

    8. ? [Note that a compound may have multiple EC50 values determined from different experiments (likely under different experimental conditions)

      There are two ways to get rid of duplicates. You can simply paste the CIDs into the PubChem search and run the search; I think this requires Chrome.

      Alternatively, you can go to Data > Data Tools in Excel and choose the "Remove Duplicates" tool.
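A third option, if the CIDs are already in a script, is to drop duplicates in code. This is a minimal sketch; the function name `unique_cids` and the CID list are made up for illustration.

```python
def unique_cids(cids):
    """Return the CIDs with duplicates removed, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(cids))


# Hypothetical list in which some compounds appear more than once,
# for example because they have several EC50 values.
cids_with_repeats = [60835, 2244, 60835, 3672, 2244]
deduplicated = unique_cids(cids_with_repeats)
```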

    9. To perform an identity search for Cymbalta (CID 60835), go to the Chemical Structure Search pag

      So if many CIDs relate to specific isotopes, do the normal molar masses use average values, like the periodic table?

    10. heat map-style layout,

      Is there an article that explains how to read the data shown below? Emily

    11. the Structure Clustering tool and Structure-Activity Relationship (SAR) Analysis tool.

      Where would these tools be used the most? Meaning, in what kind of research would these tools be the most helpful? Emily

    12. 1.3. Substructure and superstructure search

      Could this search be used to find reactions to build superstructures from substructures? Emily

    13. As an alternative to 2-D similarity search, 3-D similarity search can also be performed using the “3D conformer” tab in PubChem Chemical Structure Search.  3-D similarity methods use the 3-D structures (that is, conformations) of molecules.  PubChem’s 3-D similarity method is based on the atom-centered Gaussian-shape comparison method by Grant and coworkers,9-12 implemented in the Rapid Overlay of Chemical Structures (ROCS).13,14  While the underlying mathematics of this approach is beyond the scope of this module, what this method essentially does is to find the “best” alignment of the 3-D structures of two molecules, which gives the maximized overlap between them.  The 3-D similarity method quantifies the 3-D molecular similarity using three metrics

      What are the differences between 2D and 3D structures?


    14. Two-dimensional (2-D) similarity methods

      I don't really understand this 2D similarity method.


    15. However, molecular formula search implemented in some databases, including PubChem Chemical Structure Search, has an option to allow other elements in returned hits (e.g., C6H6O or C6H6N2O for the “C6H6” query).

      Why is there an option to allow other elements in returned hits when we type C6H6 or any other formula? Amita
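The two options can be pictured with a small sketch. This is only a reading of the quoted examples (C6H6O and C6H6N2O as hits for a "C6H6" query), not PubChem's implementation; `parse_formula` and `formula_matches` are hypothetical helpers, and the parser handles only simple formulas without parentheses or charges.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as 'C6H6N2O'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def formula_matches(query: str, candidate: str, allow_other_elements: bool = False) -> bool:
    """Check a candidate formula against a query formula."""
    wanted, found = parse_formula(query), parse_formula(candidate)
    if not allow_other_elements:
        # Exact stoichiometry: same elements with the same counts.
        return wanted == found
    # Query elements must match exactly; extra elements are tolerated.
    return all(found[element] == count for element, count in wanted.items())


exact_hit = formula_matches("C6H6", "C6H6O")
relaxed_hit = formula_matches("C6H6", "C6H6O", allow_other_elements=True)
```

With exact stoichiometry only benzene-like formulas match; with the relaxed option, a compound such as C6H6O is also returned, which helps when you know part of the composition but not all of it.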

    16. The most common types of molecular fingerprints are structural keys, which encode structural information of a molecule into a binary string (that is, a string of 0’s and 1’s).  The position of each number in this string corresponds to a particular fragment.  If the molecule has a particular fragment, the corresponding bit position is set to 1, and otherwise to 0.  Note that there are many different ways to design molecular fingerprints, depending on what fragments are included in the fingerprint definition.  PubChem uses its own fingerprint called PubChem subgraph fingerprints.

      I am confused about the binary string and the fingerprint. How do they work to recognize molecules? Amita
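A toy example may help here. The sketch below builds a structural-key fingerprint from a made-up list of five fragments; the real PubChem subgraph fingerprint uses its own, much longer fragment list. The fragment names and the hand-assigned fragment sets for each molecule are assumptions for illustration only.

```python
# Hypothetical fragment dictionary: one bit position per fragment.
FRAGMENT_KEYS = [
    "C-C single bond",
    "C=O carbonyl",
    "O-H hydroxyl",
    "aromatic ring",
    "N-H amine",
]


def encode(fragments_present: set) -> str:
    """Set a bit to '1' if the molecule contains that fragment, else '0'."""
    return "".join("1" if key in fragments_present else "0" for key in FRAGMENT_KEYS)


# Fragments assigned by hand for two small molecules.
ethanol = {"C-C single bond", "O-H hydroxyl"}
acetic_acid = {"C-C single bond", "C=O carbonyl", "O-H hydroxyl"}

ethanol_fp = encode(ethanol)          # "10100"
acetic_acid_fp = encode(acetic_acid)  # "11100"
```

Both molecules set the C-C bit, so that bit alone cannot tell them apart; they differ at the carbonyl bit. A structural key of this kind records whether a fragment is present, not where it is or how many times it occurs.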

    17. PubChem provides two web-based tools that allow users to perform a cluster analysis of PubChem data:  the Structure Clustering tool and Structure-Activity Relationship (SAR) Analysis tool.

      I am confused about the Structure Clustering tool. How can we use it? I practiced using this tool but did not get any results. Amita

    18. The Structure-Activity Relationship (SAR) Analysis tool

      I am confused about how to use this tool. Amita

    19. MMFF94s

      In case you are curious about MMFF94s: this might answer my previous question about the limit on supported elements. -Phuc

    20. Consist of only supported elements (H, C, N, O, F, Si, P, S, Cl, Br, and I).

      So organometallic compounds such as cisplatin don't have a 3D structure? And why do we have to set this limit? -Phuc

    1. Document Version History: V1.3 – 2009May01 – Updated introduction to describe how to identify the PubChem Substructure Fingerprint property in a PubChem Compound record. V1.2 – 2007Aug30 – Added section on decoding PubChem fingerprints. V1.1 – 2007Aug06 – Corrected and expanded documentation of bits with SMARTS patterns used. V1.0 – 2005Dec02 – Initial release.

      Looking at the document version history here, it was last updated on May 1, 2009. Is this still the current version, or is there a newer update? (nwume)

    1. ELISA

      The PubChem BioAssay Classification Tree is really an amazing feature. Within a few minutes I was able to go through the classification of bacteria to a species I used to work with in microbiology, combine the sources with a few keywords, and find the exact AID needed to run confirmation tests. Having this coupled to the PubMed record for the publication is a great feature for scientists to have access to.

    1. The method is also used to quantify the degree of chirality of asymmetric molecules and to investigate the chirality of biphenyl and the amino acids.

      This article was cited in the OLCC Cheminformatics Class Module 6 for the atom-centered Gaussian-shape comparison method. I will see if I can access the full article from campus tomorrow, but it's very difficult for me to understand how applying a Gaussian adaptation to a molecule would help to provide information about the degree of chirality of certain molecules. For example, it would be difficult to understand the shape of a miniature model sailboat inside a glass bottle if the only shape you see is the outer bottle and not what's contained inside.


    1. ROCS is a fast shape comparison application, based on the idea that molecules have similar shape if their volumes overlay well and any volume mismatch is a measure of dissimilarity. It uses a smooth Gaussian function to represent the molecular volume [5], so it is possible to routinely minimize to the best global match.

      I assume that the ROCS description here of volume overlays is what the OLCC Module 6 page means by finding the best alignment for a 3D structure overlay. It seems that, because of the Gaussian blur around the chemical being matched against others, many of the individual bonds would not be compared in the matching process.


    1. Options Identical Structures Similar Compounds, score >= 95% Similar Compounds, score >= 90% Similar Compounds, score >= 80% Identical Structures Similar Structures with same connectivity any tautomer same stereoisomer same isotopical labels same stereochemistry and isotopes non-conflicting stereochemistry same isotopes and non-conflicting stereoisomers threshold >= 99% 98% 97% 96% 95% 93% 90% 85% 80% 70% 60% Substructure Superstructure Substructure Superstructure match stereochemistry Ignore Exact Relative Nonconflicting and Match isotopes Match charges Match tautomers Ringsystems not embedded Single/double bonds match aromatic bonds Chain bonds match ring bonds Strip hydrogen Molecular Formula with exact stoichiometry allow other element Sort results by: Shape-then-feature, Feature-then-shape, Shape-and-feature, Conformer Id Output to: PubChem 3D Alignment Viewer, NCBI Entrez, Table Summary Time Limit (seconds): unlimited 30 60 90 300 600 3,600 Result Limit: 10 50 100 500 1,000 10,000 100,000 2,000,000 Filters

      I don't know what I am doing wrong; I really don't understand how to work on question 1.


    1. The method provides good accuracy across a range of organic and drug-like molecules. The core parameterization was provided by high-quality quantum calculations, rather than experimental data, across ~500 test molecular systems. The method includes parameters for a wide range of atom types including the following common organic elements: C, H, N, O, F, Si, P, S, Cl, Br, and I. It also supports the following common ions: Fe+2, Fe+3, F-, Cl-, Br-, Li+, Na+, K+, Zn+2, Ca+2, Cu+1, Cu+2, and Mg+2. The Open Babel implementation should automatically perform atom typing and recognize these elements.

      For the shape Tanimoto, can we calculate it manually? Since it is based on overlap, is it related to the X-Y-Z coordinates in the molfile? -Phuc