Hypothesis

26 Matching Annotations

Feb 2017
olcc.ccce.divched.org

2.3 Chemical Representations on Computer: Part III | DivCHED CCCE: Cheminformatics OLCC

16
1. olccs11 18 Feb 2017
  
  in Public
  
  GENSAL (GENeric Structure LAnguage)
  
  Why GENSAL and not GENSLA?
  
  2017OLCCModule2P3
2. olccs11 18 Feb 2017
  
  in Public
  
  Note that aromaticity is not a measurable physical quantity, but a concept without a unanimous mathematical definition. As a result, different aromaticity detection algorithms often disagree with each other on whether a given molecule is aromatic or not, making it difficult to interchange information between databases that use different aromaticity detection algorithms for SMILES generation
  
  Do any of the aromaticity detection algorithms employ Hückel's rule (the 4n+2 π-electron rule) for predicting aromaticity?
  
  2017OLCCModule2P3
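To make the 4n+2 rule in the question above concrete, here is a minimal Python sketch of the electron-counting test alone. It assumes the π-electron count of a planar, fully conjugated monocycle is already known; the function name `satisfies_huckel` and the example table are illustrative, not taken from any toolkit, and real aromaticity models involve more than this count.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Pi-electron counts for a few textbook monocycles (illustrative input).
examples = {
    "benzene": 6,
    "cyclobutadiene": 4,
    "cyclooctatetraene": 8,
    "cyclopentadienyl anion": 6,
}

for name, count in examples.items():
    verdict = "4n+2" if satisfies_huckel(count) else "not 4n+2"
    print(f"{name}: {count} pi electrons, {verdict}")
```

The check says nothing about planarity or ring fusion, which is one reason different detection algorithms can disagree on the same molecule.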
3. apcornell 15 Feb 2017
  
  in Public
  
  Hashing is a one-way mathematical transformation typically used to calculate a compact fixed length digital representation of a much longer string of arbitrary length.
  
  This is very similar to how I used to protect passwords on my server before SSH keys became the standard. We used similar protocols under the MD5 standard, so it's very interesting to see the same thing used to make something easier to find with search engines as was used to keep something from being found as a security measure.
  
  2017OLCCModule2P3
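The "compact fixed length" property described in the quoted passage can be seen with Python's standard `hashlib` module. This is only a sketch: SHA-256 is used here as a generic example of a hash function, the helper name `digest` is made up for illustration, and the code is not the InChIKey generation procedure.

```python
import hashlib


def digest(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_text = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"  # standard InChI of ethanol
long_text = short_text * 50                        # a much longer input

# Inputs of very different length give digests of identical length.
print(len(short_text), len(digest(short_text)))
print(len(long_text), len(digest(long_text)))
```

Because the transformation is one-way, the digest can be compared or indexed but not turned back into the original string.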
4. axnakarmi 14 Feb 2017
  
  in Public
  
  Another extension of SMILES is SMIRKS [28,29], which is a line notation for generic reactions.
  
  Can you provide more details of SMIRKS and SMARTS with examples? How can we generate any reaction using this extension?
  
  2017OLCCModule2P3
5. axnakarmi 14 Feb 2017
  
  in Public
  
  Actually, it is very common that there are a lot of SMILES strings that represent the same structure, whether it has a ring or not, because one can start with any atom in a molecule to derive a SMILES string. Therefore, it is necessary to select a “unique SMILES” for a molecule among many possibilities. Because this is done through a process called “canonicalization”, this unique SMILES string is also called the “canonical SMILES”.
  
  How can we do canonicalization to get unique SMILES?
  
  2017OLCCModule2P3
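As a toy illustration of the idea behind canonicalization raised in the question above, the Python sketch below picks one spelling out of the two possible reading directions of an unbranched, acyclic SMILES string. It is a deliberately simplified assumption-laden example: it supports only unbracketed organic-subset atoms and the bond symbols `-`, `=`, `#`, the names `tokenize` and `toy_canonical` are hypothetical, and production toolkits use graph-based atom ranking rather than this string trick.

```python
import re

# Two-letter atoms are listed first so "Cl" is not read as "C" plus "l".
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#]")


def tokenize(smiles: str) -> list:
    """Split a simple chain SMILES into atom and bond tokens."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES for this toy example: {smiles!r}")
    return tokens


def toy_canonical(smiles: str) -> str:
    """Choose the lexicographically smaller of the two reading directions."""
    forward = tokenize(smiles)
    backward = forward[::-1]
    return min("".join(forward), "".join(backward))


# Both spellings of ethanol map to the same string.
print(toy_canonical("OCC"), toy_canonical("CCO"))
```

The point is only that a deterministic rule applied to every equivalent spelling yields one agreed string; branches, rings, and stereochemistry need a far more general procedure.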
6. OLCCS17 14 Feb 2017
  
  in Public
  
  In SMILES, atoms are represented by their atomic symbols. The second letter of two-character atomic symbols must be entered in lower case. Each non-hydrogen atom is specified independently by its atomic symbol enclosed in square brackets, [ ] (for example, [Au] or [Fe]). Square brackets may be omitted for elements in the “organic subset” (B, C, N, O, P, S, F, Cl, Br, and I) if the proper number of “implicit” hydrogen atoms is assumed. “Explicitly” attached hydrogens and formal charges are always specified inside brackets. A formal charge is represented by one of the symbols + or -. Single, double, triple, and aromatic bonds are represented by the symbols, -, =, #, and :, respectively. Single and aromatic bonds may be, and usually are, omitted. Here are some examples of SMILES strings
  
  According to the SMILES specification rules, atoms with two-character symbols are enclosed in square brackets. Why are Cl and Br not included?
  
  2017OLCCModule2P3
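The quoted rules list Cl and Br in the organic subset, so they may appear without brackets, while other two-letter symbols such as Au need them. A small tokenizer sketch in Python shows how a reader of SMILES might tell these cases apart. The regular expression and the name `tokenize` are illustrative assumptions covering only the features mentioned in the quoted passage, not a complete SMILES grammar.

```python
import re

SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"    # bracket atom such as [Au], [Fe] or [NH4+]
    r"|Cl|Br"        # two-letter organic-subset atoms, tried before C and B
    r"|[BCNOPSFI]"   # one-letter organic-subset atoms
    r"|[bcnops]"     # aromatic atoms written in lower case
    r"|[-=#:()]"     # bond symbols and branch parentheses
    r"|\d"           # single-digit ring-closure labels
)


def tokenize(smiles: str) -> list:
    """Split a SMILES string into atom, bond, branch and ring-closure tokens."""
    return SMILES_TOKEN.findall(smiles)


print(tokenize("ClCCBr"))     # Cl and Br are read as single atoms
print(tokenize("[Au]"))       # gold needs brackets
print(tokenize("c1ccccc1"))   # benzene with aromatic atoms
```

Because "Cl" and "Br" are tried before "C" and "B", the string `ClCCBr` is never misread as carbon followed by a stray letter.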
7. OLCCS17 14 Feb 2017
  
  in Public
  
  Line notations represent structures as a linear string of characters. They are widely used in Cheminformatics because computers can more easily process linear strings of data. Examples of line notations include the Wiswesser Line-Formula Notation (WLN) [1], Sybyl Line Notation (SLN) [2,3] and Representation of structure diagram arranged linearly (ROSDAL) [4,5]. Currently, the most widely used linear notations are the Simplified Molecular-Input Line-Entry System (SMILES) [6-9] and the IUPAC Chemical Identifier (InChI) [10-13], which are described below.
  
  In this context, does it mean that the WLN, SLN and ROSDAL line notations no longer exist, since SMILES and InChI are widely used now?
  
  2017OLCCModule2P3
8. OLCCS17 14 Feb 2017
  
  in Public
  
  The Simplified Molecular-Input Line-Entry System (SMILES) [6-9] is a line notation for describing chemical structures using short ASCII strings.
  
  What is the meaning of ASCII strings?
  
  2017OLCCModule2P3
9. olcc197 14 Feb 2017
  
  in Public
  
  Many databases such as PubChem [17], ChemSpider [18], ChEBI [19], and NIST Chemistry Webbook [20] accept InChI and InChIKey strings as queries to search for chemical structures. InChIs and InChIKeys can also be used as queries in UniChem [21] to produce cross-references between chemical structure identifiers from different databases.
  
  When these databases (PubChem, ChemSpider, ChEBI, and NIST) are searched with the InChI or InChIKey of a particular structure, will they all give the same result?
  
  2017OLCCModule2P3
10. olcc197 14 Feb 2017
  
  in Public
  
  the standard InChI
  
  Before OpenSMILES and the standard InChI existed, what did scientists do to avoid getting different results in their projects when they used different software?
  
  2017OLCCModule2P3
11. olcc197 14 Feb 2017
  
  in Public
  
  They are widely used in Cheminformatics because computers can more easily process linear strings of data. Examples of line notations include the Wiswesser Line-Formula Notation (WLN) [1], Sybyl Line Notation (SLN) [2,3] and Representation of structure diagram arranged linearly (ROSDAL) [4,5]. Currently, the most widely used linear notations are the Simplified Molecular-Input Line-Entry System (SMILES) [6-9] and the IUPAC Chemical Identifier (InChI) [10-13], which are described below.
  
  Since SMILES and InChI are currently the most widely used linear notations, does it mean that WLN, SLN and ROSDAL can yield wrong results when used now, since they no longer exist?
  
  2017OLCCModule2P3
12. olccs16 14 Feb 2017
  
  in Public
  
  Generic structures are commonly used in chemistry texts as well as in chemical patents in which the inventor claims a whole class of related compounds. Generic structures are more often called “Markush” structures after Dr. Eugene A. Markush, who was involved in a legal case which set a precedent in the USA for generic chemical structure patent filing.
  
  This might sound a bit dumb! Can an InChI be generated from a Markush structure? I know generic structures and InChI are quite at odds with each other: Markush structures can be quite proprietary, whereas InChI is open science.
  
  2017OLCCModule2P3
13. olccs16 14 Feb 2017
  
  in Public
  
  c1ccccc1 Benzene (C6H6)
  
  If there are different substituent groups on a benzene ring, will SMILES indicate their positions, such as ortho, meta, or para?
  
  2017OLCCModule2P3
14. apcornell 13 Feb 2017
  
  in Public
  
  As a result, different aromaticity detection algorithms often disagree with each other on whether a given molecule is aromatic or not, making it difficult to interchange information between databases that use different aromaticity detection algorithms for SMILES generation.
  
  What process or services would be needed for the databases to perform this interchange?
  
  2017OLCCModule2P3
15. OLCCS10 13 Feb 2017
  
  in Public
  
  There are currently six InChI layer types, each representing a different class of structural information: the main layer, a charge layer, a stereochemical layer, an isotopic layer, a fixed-H layer and a reconnected layer.
  
  I understand the first four layers of InChI, but what do the last two, the fixed-H layer and the reconnected layer, mean?
  
  2017OLCCModule2P3
16. OLCCS10 13 Feb 2017
  
  in Public
  
  SMARTS is useful for substructure searching, which finds a particular pattern (subgraph) in a molecule.
  
  Can you give some examples of the substructures that are searched for?
  
  2017OLCCModule2P3

Tags

2017OLCCModule2P3

Annotators

apcornell

OLCCS10

olcc197

OLCCS17

olccs11

olccs16

axnakarmi

URL

olcc.ccce.divched.org/2017OLCCModule2P3
olcc.ccce.divched.org

InChI, the IUPAC International Chemical Identifier

1
1. olccs16 15 Feb 2017
  
  in Public
  
  The first block of 14 (out of total 27) characters for an InChIKey encodes core molecular constitution, as described by formula, connectivity, hydrogen positions and charge sublayers of the InChI main layer
  
  Can you search the web with the first block of the InChIKey and find all isomers of the compound?
  
  2017OLCCModule2P3
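The block structure described in the quoted passage can be sketched in Python by splitting keys at the hyphens and grouping them by their first 14 characters. The helper names `split_inchikey` and `group_by_first_block` are hypothetical, and the three standard InChIKeys (L-alanine, D-alanine, ethanol) are given as commonly listed in public databases; they should be checked against a database before being relied on.

```python
def split_inchikey(key: str) -> tuple:
    """Split a 27-character InChIKey into its three hyphen-separated blocks."""
    first, second, protonation = key.split("-")
    if (len(first), len(second), len(protonation)) != (14, 10, 1):
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return first, second, protonation


def group_by_first_block(keys: list) -> dict:
    """Group InChIKeys that share the same first (constitution) block."""
    groups = {}
    for key in keys:
        groups.setdefault(split_inchikey(key)[0], []).append(key)
    return groups


keys = [
    "QNAYBMKLOCPYGJ-REOHAHDZSA-N",  # L-alanine
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N",  # D-alanine
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",  # ethanol
]

for block, members in group_by_first_block(keys).items():
    print(block, members)
```

Since the first block reflects formula and connectivity, a first-block match would be expected to collect compounds with the same skeleton, such as the two alanine enantiomers here, and not constitutional isomers whose connectivity differs.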

Tags

2017OLCCModule2P3

Annotators

olccs16

URL

olcc.ccce.divched.org/sites/olcc.ccce.divched.org/files/InChI-The IUPAC Chemical Identifier.pdf
olcc.ccce.divched.org

Microsoft Word - Introducing Cheminformatics V2.0 - Sample.docx

3
1. apcornell 15 Feb 2017
  
  in Public
  
  Line notations are not the only way of communicating structure: also popular are file-based formats such as MDL's MOL File [19] (and its variant, the SD File), and Chemical Markup Language [20] (CML, a variant of XML
  
  One of the big drawbacks that I often hear when using XML with databasing is that it quickly starts making very large files in terms of storage. Would the same hold true for using CML with large chemical databases? If so, what size reduction could be expected for the same size collection of chemicals stored using connection tables?
  
  2017OLCCModule2P3
2. axnakarmi 14 Feb 2017
  
  in Public
  
  All InChIs currently are prefixed with “INCHI=”. Following this, a designator of “1/” or “1S/” indicates whether the InChI is non-standard or standard (i.e. with fixed standardized options in the software)
  
  Can one compound or molecule have both standard and non-standard InChIs?
  
  2017OLCCModule2P3
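The “1/” versus “1S/” designator from the quoted passage can be read off programmatically. The Python sketch below is a minimal illustration: the function name `inchi_flavor` is made up, and it assumes the prefix is spelled `InChI=` as it appears in actual identifier strings. The non-standard string in the demo was built by hand from the standard ethanol InChI purely to exercise the check; it is not the output of any InChI software run.

```python
def inchi_flavor(inchi: str) -> str:
    """Classify an InChI string as 'standard' or 'non-standard' by its designator."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError(f"missing {prefix!r} prefix: {inchi!r}")
    version = inchi[len(prefix):].split("/", 1)[0]
    if version == "1S":
        return "standard"
    if version == "1":
        return "non-standard"
    raise ValueError(f"unrecognised version designator: {version!r}")


standard_ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
handmade_nonstandard = "InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3"  # illustrative only

print(inchi_flavor(standard_ethanol))
print(inchi_flavor(handmade_nonstandard))
```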
3. OLCCS15 12 Feb 2017
  
  in Public
  
  Ring aromaticity is handled in SMILES at the atomic level, not at the bond level (i.e. an atom is considered aromatic rather than a
  
  In SMILES, why is aromaticity not considered at the bond level, given that the bonds in a ring structure account for its aromaticity?
  
  2017OLCCModule2P3

Tags

2017OLCCModule2P3

Annotators

apcornell

axnakarmi

OLCCS15

URL

olcc.ccce.divched.org/sites/olcc.ccce.divched.org/files/Introducing Cheminformatics - Wild.pdf
olcc.ccce.divched.org

In Silico Medicinal Chemistry_1.pdf

5
1. apcornell 15 Feb 2017
  
  in Public
  
  ● Main layer
    ○ Chemical formula, no prefix
    ○ atom connections, prefix ‘c’
    ○ hydrogen atoms, ‘h’
  ● Charge layer
    ○ proton sublayer, ‘p’
    ○ Charge sublayer, ‘q’
  ● Stereochemical layer
    ○ Double bonds and cumulenes, ‘b’
    ○ tetrahedral stereochemistry of atoms and allenes, ‘t’ or ‘m’
    ○ Stereochemistry information type, ‘s’
  ● Isotope layer, ‘i’, ‘h’, and ‘b’, ‘t’ and ‘m’ for stereochemistry of isotopes
  ● Fixed-H layer, ‘f’
  
  Will a future release version of InChI include a layer to include inorganic molecules that can be standardized? I have come across things in the past saying that it really only works efficiently for organic molecules. I have the understanding that some inorganic things have been included, but not the more complex structures.
  
  2017OLCCModule2P3
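The layer prefixes listed in the quoted passage can be seen by splitting an InChI string at its slashes. The Python sketch below is a simplified reading aid, not a validating parser: the function name `inchi_layers` is hypothetical, and it only labels each segment by its leading letter. A list of pairs is returned rather than a dictionary because prefixes such as ‘h’, ‘b’, ‘t’ and ‘m’ can recur in later layers.

```python
def inchi_layers(inchi: str) -> list:
    """Return (prefix, content) pairs for each slash-separated InChI segment."""
    body = inchi.split("=", 1)[1]
    parts = body.split("/")
    layers = [("version", parts[0])]
    if len(parts) > 1:
        layers.append(("formula", parts[1]))  # the formula has no prefix letter
    for part in parts[2:]:
        if part:
            layers.append((part[0], part[1:]))
    return layers


# Standard InChI of L-alanine, as commonly listed in public databases.
l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"

for prefix, content in inchi_layers(l_alanine):
    print(f"{prefix:8} {content}")
```

For this molecule the output shows the formula, the ‘c’ and ‘h’ sublayers of the main layer, and the ‘t’, ‘m’ and ‘s’ sublayers of the stereochemical layer.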
2. OLCCS199 14 Feb 2017
  
  in Public
  
  While the InChI representation is normally too complex for a human to decode, it is impossible for even a computer to extract the chemical structure from the InChIKey. Therefore, it is important that the InChI representation is also included in any database.
  
  In instances where chemical structures can't be determined from the provided InChI or InChIKey, are there any tips for searching with an InChI?
  
  2017OLCCModule2P3
3. olccs16 14 Feb 2017
  
  in Public
  
  WLN: Wiswesser Line Notation
  
  I haven't encountered this line notation at all. I wonder whether any database systems still use it, perhaps as a historical showcase? (I know that it is already out of favor.)
  
  2017OLCCModule2P3
4. OLCCS10 13 Feb 2017
  
  in Public
  
  aromatic bonds are implied between aromatic atoms, but may be explicitly defined using the ‘:’ symbol.
  
  When would you use the colon instead of lowercase letters for aromatic bonds?
  
  2017OLCCModule2P3
5. OLCCS15 12 Feb 2017
  
  in Public
  
  Another language, based on conventions in SMILES, has also been developed for rapid substructure searching, called SMiles ARbitrary Target Specification (SMARTS). Similarly, SMIRKS has also been defined as a subset of SMILES that encodes reaction transforms. SMIRKS does not have a definition, but plays on the SMILES acronym. SMARTS and SMIRKS will be considered in more detail in later chapters.
  
  SMARTS used for rapid substructure searching is noted as another language based on SMILES conventions. What is the meaning of SMIRKS and is it used in a similar form as SMARTS?
  
  2017OLCCModule2P3

Tags

2017OLCCModule2P3

Annotators

apcornell

OLCCS15

OLCCS10

olccs16

OLCCS199

URL

olcc.ccce.divched.org/sites/olcc.ccce.divched.org/files/In Silico Medicinal Chemistry_1.pdf
Local file

OLCC-2017_mod-2_part-3.pdf

1
1. OLCCS15 12 Feb 2017
  
  in Public
  
  Rings are represented by breaking one single or aromatic bond in each ring, and designating this ring-closure point with a digit immediately following the atoms connected through the broken bond. Atoms in aromatic rings are specified by lower case letters. Therefore, cyclohexane and benzene can be represented by the following SMILES
  
  Given the complexity of aromatic compounds and ring structures, is it better to use Kekulé structures or lower case letters in SMILES?
  
  2017OLCCModule2P3
Tags

2017OLCCModule2P3

Annotators

OLCCS15
