238 Matching Annotations
  1. Feb 2017
    1. CAS RN

      Please, I want to understand more about the CAS RN. What is it really used for?

    2. To automate functions on chemical data, the data structure needs to be systematically defined and consistently applied.

      Who is responsible for the systematic definitions of data structure and their consistent application?

  2. olcc.ccce.divched.org
    1. NOTATIONS. Line notations represent structures as a linear string of alphanumeric symbols. Their compactness was an advantage in the early days of cheminformatics when storage space was at a premium, and even nowadays, it can be faster to enter a structure as a notation instead of using a chemical structure drawing program. Several notations [20–22] were proposed in the 1950s and 1960s, but only one, the Wiswesser Line-Formula Notation (WLN; see Figure 1) [23–28] became widely used [29–47], despite the fact that the Dyson notation was formally adopted by IUPAC [20,21]. WLN started to fall out of use in the early 1980s.

      What are some examples of line notations, and can I get a clearer explanation of them?

    2. To the practicing chemist, the language of chemistry is the two-dimensional (2D) structure diagram and most chemical information systems feature graphical input and output of chemical structures; the machine-held representation need not be meaningful to the synthetic chemist. In the ideal (unique) representation there is only one ‘code’ for a given structure and any one code can be interpreted to give only one structure. A unique representation is essential for chemical registration systems in which the novelty of a structure is determined before it is recorded in a database. Some representations, for example, molecular formulas, are not unique; one molecular formula will generate more than one full structure. Some nonunique representations (e.g., molecular formulas and fragment codes) do, however, play a part in certain chemical information systems, even though they do not represent the full topology of a structure

      In this context, I don't understand what they mean by "any one code can be interpreted to give only one structure". What does it really mean?

    3. Efforts have been made to establish an agreed standard format but they have not been generally unsuccessful.

      I was attempting to find a CML file for aspirin and convert it to different formats. I found the link below on the CACTUS toolkit site: https://cactus.nci.nih.gov/blog/?p=68

      The number of different structure formats seems almost overwhelming!

    4. Hydrogen atoms are not necessarily included explicitly in a connection table: they may be implicit.

      If connection tables are concerned with canonicalization, why wouldn't H atoms always be explicitly specified?
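
One way to see why hydrogens can be left implicit is that, under a simple valence model, they can be recomputed from the heavy-atom skeleton. The sketch below is only an illustration under stated assumptions: the `DEFAULT_VALENCE` table, the function name `implicit_hydrogens`, and the choice of isopropyl alcohol (heavy atoms only) as the example are all hypothetical, and the model ignores charges, radicals, and elements with several common valences.

```python
# Assumed "normal" valences for a few neutral organic atoms (a simplification).
DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}

# Hypothetical heavy-atom connection table for isopropyl alcohol, CH3-CH(OH)-CH3.
atoms = {1: "C", 2: "C", 3: "C", 4: "O"}
bonds = [(1, 2, 1), (2, 3, 1), (2, 4, 1)]  # (atom, atom, bond order)


def implicit_hydrogens(atoms, bonds):
    """Return the number of hydrogens implied on each heavy atom."""
    used = {i: 0 for i in atoms}
    for a, b, order in bonds:
        # Each bond consumes valence on both of the atoms it joins.
        used[a] += order
        used[b] += order
    return {i: DEFAULT_VALENCE[atoms[i]] - used[i] for i in atoms}


h_counts = implicit_hydrogens(atoms, bonds)
print(h_counts)
print(sum(h_counts.values()))
```

Here the two methyl carbons each get three hydrogens, and the central carbon and the oxygen get one each, which recovers the eight hydrogens of C3H8O without storing any of them in the table.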

    5. The Molecular Structure Encoding System, MOSES, from Molecular Networks (Erlangen, Germany) is a later development from Gasteiger’s team.

      What features does MOSES have, and how does it address the issue of delocalization of electrons between more than two atoms?

    6. Of course, chemical structures and topological graphs are not entirely equivalent: a connection table is akin to a description of a single valence bond structure and does not take account, for example, of delocalized bonds.

      The connection table cannot describe delocalized bonds, but most organic compounds contain delocalized bonds, so how can we understand these bonds through databases? And how can we say that the connection table is a good method?

    7. It is termed redundant because each connection is described twice. The redundancy is removed when a unique version of the table is stored.

      Forgive me, but I am confused as to how each connection is described twice? It appears as though there are three columns with the same heading (Bond order and Attached atom number). Would anyone mind clarifying?
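
A small sketch may help with the "described twice" point. In a redundant table every atom row lists all of its neighbors, so a bond between atoms 1 and 2 shows up once in the row for atom 1 and again in the row for atom 2. The molecule (ethanol heavy atoms, hydrogens implicit), the atom numbering, and the function names below are assumptions for illustration only, not the table from Figure 3.

```python
from collections import defaultdict

# Hypothetical example: ethanol heavy atoms C-C-O, hydrogens left implicit.
atoms = {1: "C", 2: "C", 3: "O"}
bonds = [(1, 2, 1), (2, 3, 1)]  # (atom, atom, bond order)


def redundant_table(atoms, bonds):
    """One row per atom, listing (attached atom number, bond order) pairs."""
    rows = defaultdict(list)
    for a, b, order in bonds:
        rows[a].append((b, order))  # bond seen from atom a
        rows[b].append((a, order))  # the same bond seen from atom b
    return {i: sorted(rows[i]) for i in atoms}


def nonredundant_bonds(table):
    """Collapse the redundant rows so each bond is listed only once."""
    unique = {
        (min(a, b), max(a, b), order)
        for a, neighbors in table.items()
        for b, order in neighbors
    }
    return sorted(unique)


table = redundant_table(atoms, bonds)
for number, neighbors in table.items():
    print(number, atoms[number], neighbors)
print(nonredundant_bonds(table))
```

The row for atom 2 lists both atom 1 and atom 3, while the rows for atoms 1 and 3 each list atom 2 again; that repetition is the redundancy, and collapsing it gives back the two original bonds.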

    8. WIREs Computational Molecular Science, Representation of chemical structures. FIGURE 3 | A redundant connection table

      Just a comment here. This table is much easier for me to follow than the examples in module 2 part 2.

    9. Of course, chemical structures and topological graphs are not entirely equivalent: a connection table is akin to a description of a single valence bond structure and does not take account, for example, of delocalized bonds. Alternative approaches have been suggested.

      Why hasn’t there been a bigger deal made about the fact that connection tables do not take into account delocalized bonds?

    10. The Morgan algorithm identifies atoms based on an extended connectivity value. The atom with the highest value becomes the first atom in the name, and its neighbors are then listed in descending order. Ties are resolved based on additional parameters, for example, bond order and atomic number. The original Morgan algorithm did not handle stereochemistry; the stereochemically unique naming algorithm [stereochemical extension of Morgan algorithm (SEMA)] was developed to handle stereoisomers

      If the connection table in Figure 3 was created using the Morgan algorithm, why is the carbon on the far left numbered as the first atom rather than the nitrogen?

    11. Another notation, called representation of structure diagram arranged linearly (ROSDAL) [52,53], was written to transfer structures quickly in a compact form over a network to enable searching of the Beilstein database online [54]. ROSDAL is still supported by InfoChem, and by Elsevier (Amsterdam, The Netherlands) in Reaxys (vide infra) and the Beilstein CrossFire structure editor.

      I noticed that the ROSDAL notation is still supported by Elsevier. Doesn't Elsevier use SMILES, as most other information systems do today?

    12. Morgan algorithm initially developed by Gluck at DuPont (Wilmington, DE, USA) and adapted by CAS [55].

      How was the Morgan algorithm developed, and are there alternatives?

    1. The chemical formula is straightforward enough, but the connectivity and hydrogen sections require some explanation. The connectivity layer describes chains and branches – for example, in the above Propene example, atom 1 is bonded to atom 3, which is bonded to atom 2. In the final example, we have branching, as represented by paren

      This explanation further down makes sense, considering that "different implementations interpret SMILES differently", especially for compounds that have the same chemical formula. I was wondering how you would get to the right InChI Key, starting from a chemical formula.

    2. the full stop (“.”) which overrides the implicit single bond between adjacent atoms we can make some exotic variants on SMILES: C1C.CC1 (Butane), C1CC1.C2CC2 (Cyclohexane)

      This section explains the notation for molecules with an even number of atoms, like butane, but not for molecules with an odd number of atoms, such as heptane, cycloheptane, or cycloheptene. How can we write the notation for these compounds?

    3. An InChI Key can also be generated for a compound. This is completely separate from the InChI linear notation, and is used to provide an identifier for a compound that is particularly suitable for use in Web search engine

      Does this mean that a regular InChI cannot be used in Web search engines?

    4. Systematic name

      Is there a way to update this definition on wikipedia or somehow make the definition more accessible via a search engine? I struggled to find a legitimate, consistent definition for a related assignment in my cheminformatics class. I wish I had seen this sooner.

    5. An InChI Key can also be generated for a compound. This is completely separate from the InChI linear notation, and is used to provide an identifier for a compound that is particularly suitable for use in Web search engines. It is an ASCII character string based on a hashing of the InChI linear notation, but is of fixed length and uses only characters not normally conside

      I was wondering when you would use the InChI Key instead of regular InChI?

    6. Fortunately, this can be done with the Morgan Algorithm [18]. In this algorithm, each atom is given a "connectivity value" reflecting how many atoms it is connecte

      So you can create many different connection tables, as in the exercise on the OLCC website, but the correct or "canonical" table is the one generated with the Morgan algorithm. Does that sound right?
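
The ranking step of the Morgan algorithm described above can be sketched in a few lines. This is a simplified illustration, not the full canonical numbering: it only computes extended connectivity values and stops when another round no longer increases the number of distinct values. The tie-breaking by bond order and atomic number and the actual renumbering are left out. The function name and the example molecule (the carbon skeleton of 2-methylpentane, with an arbitrary starting numbering) are assumptions.

```python
def extended_connectivity(n_atoms, bonds):
    """Return Morgan-style extended connectivity values for atoms 1..n_atoms."""
    neighbors = {i: [] for i in range(1, n_atoms + 1)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Start from the plain connectivity: the number of attached heavy atoms.
    values = {i: len(nbrs) for i, nbrs in neighbors.items()}
    while True:
        # Each new value is the sum of the neighbors' current values.
        new_values = {
            i: sum(values[j] for j in nbrs) for i, nbrs in neighbors.items()
        }
        # Stop when a further round does not separate the atoms any better,
        # and keep the values from the previous round.
        if len(set(new_values.values())) <= len(set(values.values())):
            return values
        values = new_values


# Hypothetical input: 2-methylpentane skeleton, numbered arbitrarily.
bonds = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]
ec = extended_connectivity(6, bonds)
print(ec)
print(max(ec, key=ec.get))
```

In this example the values settle after one round of summing, and atom 3 ends up with the highest value, so a Morgan-style numbering would start there regardless of how the atoms were numbered in the input table.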

    7. All InChIs currently are prefixed with “InChI=”. Following this, a designator of “1/” or “1S/” indicates whether the InChI is non-standard or standard (i.e. with fixed standardized options in the software)

      When using InChI, when exactly does it matter whether you choose the standard or non-standard option?
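
To make the prefix and layers concrete, here is a minimal sketch that splits an InChI string into its version designator, formula, and lettered layers. Real InChI strings begin with `InChI=`. The function name `parse_inchi` and the returned dictionary layout are hypothetical, and the sketch assumes each layer letter appears only once, which does not hold for every non-standard InChI.

```python
def parse_inchi(inchi):
    """Split an InChI string into version info, formula, and lettered layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")

    version, *layers = inchi[len(prefix):].split("/")
    result = {
        "standard": version.endswith("S"),  # "1S" = standard, "1" = non-standard
        "version": version.rstrip("S"),
        "formula": layers[0] if layers else "",
    }
    for layer in layers[1:]:
        if layer:
            # The first character names the layer, e.g. "c" or "h".
            result[layer[0]] = layer[1:]
    return result


# Propene, matching the connectivity described in the quoted text (1-3-2).
parsed = parse_inchi("InChI=1S/C3H6/c1-3-2/h3H,1H2,2H3")
print(parsed)
```

For propene the `c` layer is `1-3-2` (atom 1 bonded to atom 3, which is bonded to atom 2) and the `h` layer records which atoms carry hydrogens.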

    1. We would need to add an additional field to the atom and/or bond table to handle chirality (SCT VI, VII). We could do so either in a chemically sophisticated way, annotating the atom property, in a chemically-naive translation of a diagram feature, annotating the bond configuration, or both.

      In this section, SCT VI is used for the R stereoisomer and SCT VII for the S stereoisomer, but they look similar, so how can we recognize which one is which? The authors say we can distinguish them in a chemically sophisticated way, but they have not fully explained it.

    1. Are connection tables not used for inorganic compounds? Is there a different program used to analyze these?

    2. Our atom table will consist of two fields: one an index number identifying the atom we’re talking about, one indicating atom type (i.e. C, H, O, N, etc.). Our bond table will consist of three fields: two indicating the two atoms that the bond connects, and one indicating the bond order (1=single, 2=double, 3=triple).

      Are the atoms numbered by a set of governing rules or does it matter which atom is numbered what?

      Even given the following examples, I still do not understand how to read or create a bond table. I would like to see an example worked in steps so that I can follow the process.
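
As a worked sketch of the two tables described above, the code below builds an atom table and a bond table for isopropyl alcohol and then renumbers the atoms to show that a different numbering gives a different, equally valid table. Several things here are assumptions for illustration: hydrogens are left out, the numbering is arbitrary, and the helper names `renumber` and `element_bonds` are hypothetical. The tables are not meant to reproduce SCT I, II, or III exactly.

```python
# Step 1: atom table, one (index, element) row per heavy atom.
atom_table = [(1, "C"), (2, "C"), (3, "C"), (4, "O")]

# Step 2: bond table, one (atom, atom, bond order) row per bond.
# Atom 2 is the central carbon: bonded to both methyl carbons and the oxygen.
bond_table = [(1, 2, 1), (2, 3, 1), (2, 4, 1)]


def renumber(atom_table, bond_table, mapping):
    """Apply an old-index -> new-index mapping to both tables."""
    new_atoms = sorted((mapping[i], element) for i, element in atom_table)
    new_bonds = sorted(
        (min(mapping[a], mapping[b]), max(mapping[a], mapping[b]), order)
        for a, b, order in bond_table
    )
    return new_atoms, new_bonds


def element_bonds(atom_table, bond_table):
    """Describe each bond by the elements it joins, ignoring atom numbers."""
    element = dict(atom_table)
    return sorted(
        (tuple(sorted((element[a], element[b]))), order)
        for a, b, order in bond_table
    )


# Step 3: a second numbering of the same structural formula.
alt_atoms, alt_bonds = renumber(atom_table, bond_table, {1: 2, 2: 1, 3: 4, 4: 3})
print(alt_atoms)
print(alt_bonds)
print(element_bonds(atom_table, bond_table) == element_bonds(alt_atoms, alt_bonds))
```

Both tables describe two C-C single bonds and one C-O single bond, so the numbering itself carries no chemical meaning. Choosing one numbering as the stored, unique one is the job of a canonicalization step such as the Morgan algorithm.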

    3. As an example, take isopropyl alcohol. SCT I is a connection table representing this compound – or, more specifically, representing this structural formula. Connection tables are not necessarily unique. We could draw up other tables of atoms and bonds that represent this compound as well: for example, SCT II and SCT III

      I'm just wondering how this connection table is formed. Is it from the Morgan algorithm, or are the atom numbers just assigned randomly?

    4. As a starting point, this section will introduce a simplified form of connection table, which we’ll call an “SCT”. This SCT does not correspond directly to any existing file format (at least as far as we know!). Rather, it is a convenient model that we will use just for the purpose of this demonstration.

      Is an SCT table similar or the same as an MDL table?

    5. The MOL file format uses the number 4 to indicate bonds that are explicitly labeled as aromatic (MOL V). This has the advantage of differentiating aromatic bonds from single and double bonds without requiring the chemist to write a script to identify and label the alternating single and double bonds of a Kekulé structure. However, some software may not be built to handle this convention. (You might even run into cases in which it’s interpreted as a quadruple bond!)

      The Kekulé A and Kekulé B connection tables for the bond types were very confusing when I tried to switch from one to the other. The introduction of the third type, using just the number four, is great.

  3. Jan 2017
    1. Quick search in Reaxys allows you to combine structure and text searches. Try doing the searc

      How do you turn on Quick search?

  4. olcc.ccce.divched.org
    1. Issues with Intelligent Search Engines

      Damon, I am not sure you are familiar with Hypothes.is, but we will be using this, and I think the first iteration of a learning assignment should be from the expert, which can then get dumbed down to multiple levels, depending on student abilities. So we can expect an iterative process.

    1. that is, a string of 0’s and 1’s).  The position of each number in this string corresponds to a particular fragment.  If the molecule has a particular fragment,

      Do XYZ using the Tanimoto coefficient.

    2. Molecular similarity methods can be broadly classified into two-dimensional (2-D) and three-dimensional (3-D) similarity methods.  Typically, 2-D similarity methods use so-called molecular fingerprints, which encode structural information of a molecule into a binary string (that is, a string of 0’s and 1’s).  The position of each number in this string corresponds to a particular fragment.  If the molecule has a particular fragment, t

      Using the Tanimoto coefficient, compare the two molecules on your list.
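
For binary fingerprints like those described above, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in either one, T = c / (a + b - c). The sketch below works on plain bit strings. The two fingerprints are made up for illustration and do not correspond to real molecules or to any particular fragment dictionary, and returning 0.0 when neither fingerprint has any bits set is an assumed convention.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings such as '1101001'."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")

    a = fp_a.count("1")  # bits set in the first fingerprint
    b = fp_b.count("1")  # bits set in the second fingerprint
    c = sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")  # shared bits

    if a + b - c == 0:
        return 0.0  # assumed convention when no bits are set at all
    return c / (a + b - c)


# Made-up 7-bit fingerprints; each position stands for one fragment.
fp_1 = "1101001"
fp_2 = "1001011"
print(tanimoto(fp_1, fp_2))
```

Each fingerprint has four bits set and they share three, so the coefficient is 3 / (4 + 4 - 3) = 0.6. Identical fingerprints give 1.0 and fingerprints with no fragments in common give 0.0.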

    1. Open Data in Chemistry. One can obtain all scientific data in the public domain when wanted and reuse it for whatever purpose. • Open Standards in Chemistry. One can find visible community mechanisms for protocols and communicating information. The mechanisms for creating and maintaining these standards cover a wide spectrum of human organisations, including various degrees of consent. • Open Source in Chemistry. One can use other people's code without further permission, including changing it for one's own use and distributing it again.

      Assignment 2: Relate ODOSOS to Open Notebook science in 500 words or less


  5. Dec 2016
    1. Builder

      When I click "All Fields" a lot of options come up, ranging from ActiveAidCount to XlogP. Are all of these Entrez indices (some being text and some being numerical)?


    1. Data submission using PubChem upload

      2017 Cheminformatics OLCC students are responsible for reading this section on PubChem Upload. Further information is available at the Upload Help Page, https://pubchem.ncbi.nlm.nih.gov/upload/docs/upload_help_complete.html