19 Matching Annotations
  1. Apr 2017
    1. identity search.
    2. Getting molecular properties of a set of compounds

      It seems most of what we can retrieve comes from the advanced search (Entrez?) options. Can we get other items, such as boiling points?
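
A minimal sketch of how a property request for several compounds might be assembled. The helper name `property_url` is hypothetical; the property names used (`MolecularFormula`, `MolecularWeight`) are taken from the PUG-REST computed-property table. Whether an experimental value such as a boiling point can be requested this way is not confirmed here, so the example sticks to computed properties.

```python
# Base address of the PUG-REST service (same host as the URLs quoted in these annotations).
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(cids, properties, output="CSV"):
    """Build a PUG-REST URL asking for the given properties of several CIDs.

    Hypothetical helper: it only assembles the URL string and does not
    contact PubChem.
    """
    cid_part = ",".join(str(cid) for cid in cids)
    prop_part = ",".join(properties)
    return f"{BASE}/compound/cid/{cid_part}/property/{prop_part}/{output}"


if __name__ == "__main__":
    print(property_url([2244, 3672], ["MolecularFormula", "MolecularWeight"]))
```

Pasting the printed URL into a browser should return one row per CID, which is usually easier to load into a spreadsheet than XML.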

    3. https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/inchikey/CSCPPACGZOOCGX-UHFFFAOYSA-N/record/XML?record_type=3d
    4. One can perform partial synonym matching, by setting this option to “word”. 

      When using the word option, does PubChem limit the number of results to a specific number, such as 10,000? I have seen other systems limit results unless they are accessed by a verified user with a token or key that proves the user is not a robot.

    5. When a list of compounds are specified as the input, the image of only the first compound on the list will be returned.

      Is there a way to get images for a whole list of submitted chemicals, or would each one need to be submitted as a separate search? I notice that a quick way to do this is shown in the spreadsheet below, but that appears to submit separate searches.
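
Since the quoted passage says only the first compound's image is returned for a list, one workaround is to generate one image URL per compound. The sketch below does only that; `image_urls` is a hypothetical helper, and the example CIDs are arbitrary.

```python
# Base address of the PUG-REST service.
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def image_urls(cids):
    """Return one PNG image URL per CID.

    Hypothetical helper: each URL is a separate request, which mirrors
    the one-search-per-row approach used in a spreadsheet.
    """
    return [f"{BASE}/compound/cid/{cid}/PNG" for cid in cids]


if __name__ == "__main__":
    for url in image_urls([180, 2244, 3672]):
        print(url)
```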

    6. Getting a list of CIDs for compounds with a given substructure

      Could SMILES be used here instead of the CIDs? Emily

    7. The input identifiers can also be specified by SMILES or InChI strings, although special care needs to be taken because these identifiers contain special characters (such as “/”) that cause conflicts with the URL syntax.

      Why use these identifiers if they can cause conflicts? Emily
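
To illustrate the "special care" the passage mentions, here is a small sketch that passes a SMILES string as a URL argument and lets Python's standard library percent-encode it. The helper name `smiles_argument_url` is hypothetical, and passing the structure as a `smiles=` argument is assumed here to be an accepted alternative to embedding it in the URL path.

```python
from urllib.parse import urlencode

# Base address of the PUG-REST service.
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_argument_url(smiles, output="TXT"):
    """Build a CID-lookup URL with the SMILES passed as an encoded URL argument.

    urlencode percent-encodes characters such as "/" and "=" so they are
    not mistaken for URL path separators or argument delimiters.
    """
    return f"{BASE}/compound/smiles/cids/{output}?" + urlencode({"smiles": smiles})


if __name__ == "__main__":
    # trans-2-butene: the "/" characters encode double-bond geometry.
    print(smiles_argument_url("C/C=C/C"))
```

The "/" in this SMILES carries stereochemical information that a CID lookup by name would not express, which is one reason to use such identifiers despite the encoding step.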

    8. Getting a list of CIDs for compounds identical to a query compound

      This only shows the structures that are identical to the CID provided. How would one find only those that are similar? Emily


      What is the difference between these two, and in what situations would one use PUG-SOAP instead of PUG-REST? Emily

    10. In PUG-REST, these three pieces of information are encoded into an URL in the following format:

      Is there a way to make sure the queries work, other than just trying them and getting nothing back? Emily

    11. In PUG-REST, these three pieces of information are encoded into an URL in the following format:

      I'm guessing these basic pieces of information would be similar to those in the CACTUS chemical resolver URL. -Phuc
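
The three pieces of information (input, operation, output) can be sketched as a simple string join. `pug_rest_url` is a hypothetical helper, not part of any PubChem library; it reproduces two of the URLs quoted elsewhere in these annotations.

```python
# Base address of the PUG-REST service.
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pug_rest_url(input_spec, operation, output, options=""):
    """Join the input, operation, and output parts into one PUG-REST URL.

    `options` is an optional, already-encoded query string such as
    "record_type=3d".
    """
    url = "/".join([BASE, input_spec, operation, output])
    return f"{url}?{options}" if options else url


if __name__ == "__main__":
    # Input: compound with CID 180; operation: record; output: PNG.
    print(pug_rest_url("compound/cid/180", "record", "PNG"))
    # Input: compound by InChIKey; operation: record; output: XML, 3D record.
    print(pug_rest_url(
        "compound/inchikey/CSCPPACGZOOCGX-UHFFFAOYSA-N",
        "record",
        "XML",
        options="record_type=3d",
    ))
```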

    12. https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/180/record/PNG

      I know it comes through PUG-REST, but how are these generated? Same question for the links that follow. -Phuc

    13. Entrez Utilities (also called E-Utilities or E-Utils) Power User Gateway (PUG) PUG-SOAP PUG-REST

      Are these suitable for any kind of programming platform? -Phuc

    1. Currently PubChem contains more than 180 million depositor-provided substance descriptions, 60 million unique chemical structures and 225 million biological test results from 1 million assays, covering more than 9000 unique protein target sequences (as of January 2015). Programmatic access to this vast amount of data by the chemical biology and biomedical communities presents new opportunities for data-driven research in a ‘big data’ era.

      I'm curious what type of database runs all of this, given how many different queries and data sets are available. Since the data is relational, I would think a database with those features would be needed, such as MySQL, Oracle, or perhaps Microsoft SQL Server? My second question: does this all run on cloud servers or on traditional servers? The deeper we get into these modules, the more amazed I am at PubChem's capabilities.


    2. However, it should be noted that recent re-engineering of many of PubChem's search services has made them fast enough to work synchronously. These operations have a ‘fast’ prefix on them in PUG-REST, in order to preserve backwards compatibility of the asynchronous variants as described above. For example, the ‘fastsubstructure’ input can be used to retrieve a list of CIDs from a chemical substructure search; this input can then be used with any other PUG-REST operation on CIDs, in a single request.

      I'm curious which individual pieces of the search happen synchronously, as implied by the "fast" prefix. Does this refer to something like multiple results being processed simultaneously, instead of one at a time, based on a query result?
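
As a rough illustration of the "single request" idea in the quoted passage, the sketch below builds URLs that use `fastsubstructure` as the input and then apply an ordinary operation to the matching CIDs. `fast_substructure_url` is a hypothetical helper; the query SMILES (cyclohexane) is an arbitrary example chosen because it contains no characters that need URL encoding.

```python
# Base address of the PUG-REST service.
BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def fast_substructure_url(smiles, operation="cids", output="TXT"):
    """Build a synchronous substructure-search URL.

    The search itself is the *input* part of the URL; `operation` is
    whatever should be done with the CIDs it finds. This helper does not
    encode special characters, so it is only safe for simple SMILES.
    """
    return f"{BASE}/compound/fastsubstructure/smiles/{smiles}/{operation}/{output}"


if __name__ == "__main__":
    # Just the list of matching CIDs.
    print(fast_substructure_url("C1CCCCC1"))
    # Search and property lookup combined in one request.
    print(fast_substructure_url("C1CCCCC1", "property/MolecularWeight", "CSV"))
```

With the asynchronous variant described in the passage, the same task would take at least two requests: one to start the search and one to collect the result.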


    1. The CGI interprets your incoming request, initiates the appropriate action, then returns results (also) in XML format.

      Is this service mostly limited to XML output? I notice some services offer output in ASN.1, SDF, CSV, etc., but can this be extended to all services so that a particular format can be requested, or is XML the limit?

      The reason I ask is that large amounts of data can quickly produce excessively large files in XML, compared with something like CSV, when the file only needs to contain one type of result.


    1. Learn to code interactively, for free.

      This is a really good site and I am glad it was mentioned. I use this site a lot because I am not a programmer but I often have research meetings with people who do lots of programming and it quickly lets me learn the basics of what they do.


    1. PubChem’s PUG (Power User Gateway), documented elsewhere, is an XML-based interface suitable for low-level programmatic access to PubChem services, wherein data is exchanged through a relatively complex XML schema that is powerful but requires some expertise to use. PUG SOAP contains much of the same functionality, but broken down into simpler functions defined in a WSDL (http://www.w3.org/TR/wsdl), using the SOAP protocol (http://www.w3.org/TR/soap) for information exchange. This WSDL/SOAP layer is most suitable for SOAP-aware GUI workflow applications (Taverna, Pipeline Pilot) and programming languages (C#/.NET, Perl, Python, Java, etc.). See the Tips & Tricks section at the end of this document for more information on specific clients.

      Does this mean that PUG-SOAP is harder to use for someone with no programming background, and is meant more for professionals who have one? Emily