This is an official
University of California, Santa Barbara Library web page.
Chem 184/284
Lecture 15
Chemical Abstracts Registry File
Part 2: Structure Searching
Structure Searching on STN
- One of the most powerful features of the chemical substance files on STN is the ability to search by chemical structure.
- A large number of STN files contain searchable chemical structures of various types.
- The REGISTRY, BEILSTEIN, GMELIN and Derwent Drug Files all contain records for individual compounds.
| Input | | Output |
|---|---|---|
| Specific structure | ------> | Single compound |
| Generic structure | ------> | Set of compounds |
- The CASREACT, CHEMINFORMRX, "Derwent Journal of Synthetic Methods" and CHEMREACT files all contain information on organic chemical reactions. The reactants and products are structure searchable.
| Input | | Output |
|---|---|---|
| Specific or generic structures | ------> | Set of reactions with desired features |
- The MARPAT and MARPATpreviews files contain the generic Markush structures appearing in chemical patents, in structure-searchable form.
| Input | | Output |
|---|---|---|
| Specific or generic structures | ------> | Patents containing appropriate Markush structures |
How Structures are Stored
- In STN files, structures are stored as connection tables -- a list of each atom in the structure, which atoms each is linked to, and by what kind of bond.
- Structures with stereochemistry have additional information about the spatial arrangement of the bonds.
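As an illustration of the idea of a connection table, here is a minimal Python sketch. It is not the storage format STN actually uses; the class name `ConnectionTable`, its methods, and the bond labels are all hypothetical, chosen only to show "a list of atoms, what each is linked to, and by what kind of bond". Stereochemistry is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionTable:
    """Toy connection table (hypothetical, not the STN storage format)."""
    atoms: dict = field(default_factory=dict)  # node number -> element symbol
    bonds: dict = field(default_factory=dict)  # frozenset({a, b}) -> bond type

    def add_atom(self, node, element="C"):
        self.atoms[node] = element

    def add_bond(self, a, b, kind="single"):
        if a == b:
            raise ValueError("an atom cannot be bonded to itself")
        if a not in self.atoms or b not in self.atoms:
            raise KeyError("both atoms must exist before they are bonded")
        self.bonds[frozenset((a, b))] = kind

    def neighbors(self, node):
        """Return the node numbers directly bonded to `node`, in order."""
        result = []
        for pair in self.bonds:
            if node in pair:
                (other,) = pair - {node}
                result.append(other)
        return sorted(result)


# Acetone skeleton without hydrogens: C-C(=O)-C
acetone = ConnectionTable()
for node, element in [(1, "C"), (2, "C"), (3, "C"), (4, "O")]:
    acetone.add_atom(node, element)
acetone.add_bond(1, 2)
acetone.add_bond(2, 3)
acetone.add_bond(2, 4, "double")

print(acetone.neighbors(2))
```

Hydrogens are omitted in the example, in the same way that they are usually left implicit when a query structure is drawn.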
How Structures are Searched
- The Messenger software searches structure information in two steps: screening and atom-by-atom match.
- Screening quickly eliminates structures that cannot match by checking for certain common features, leaving a smaller set of likely matches.
- Atom-by-atom match then compares the whole of the connection table of the query structure with that of the possible matches.
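The two-step idea can be sketched in Python as follows. This is a toy model under stated assumptions, not the Messenger implementation: the screen used here (element counts) merely stands in for whatever features the real system checks, bond types are ignored, and the function names `passes_screen`, `atom_by_atom`, and `search` are hypothetical.

```python
from collections import Counter

# A structure is a pair (atoms, bonds): atoms maps node -> element symbol,
# bonds is a set of frozenset({node, node}) pairs. Bond types are ignored.


def passes_screen(query, candidate):
    """Cheap necessary test: candidate has at least the query's element counts."""
    need = Counter(query[0].values())
    have = Counter(candidate[0].values())
    return all(have[el] >= n for el, n in need.items())


def atom_by_atom(query, candidate):
    """Backtracking search for a mapping of query atoms onto candidate atoms."""
    q_atoms, q_bonds = query
    c_atoms, c_bonds = candidate
    order = list(q_atoms)

    def extend(mapping):
        if len(mapping) == len(order):
            return True
        q = order[len(mapping)]
        for c in c_atoms:
            if c in mapping.values() or c_atoms[c] != q_atoms[q]:
                continue
            # Every query bond to an already-mapped atom must exist in the candidate.
            ok = all(
                frozenset((c, mapping[p])) in c_bonds
                for p in mapping
                if frozenset((q, p)) in q_bonds
            )
            if ok:
                mapping[q] = c
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


def search(query, file):
    """Screen first, then run the expensive match only on the survivors."""
    survivors = {name: s for name, s in file.items() if passes_screen(query, s)}
    return sorted(name for name, s in survivors.items() if atom_by_atom(query, s))


def chain(elements):
    """Helper: build an unbranched chain from a string of one-letter elements."""
    atoms = {i + 1: el for i, el in enumerate(elements)}
    bonds = {frozenset((i, i + 1)) for i in range(1, len(elements))}
    return atoms, bonds


query = chain("CCO")  # C-C-O fragment
file = {
    "ethanol": chain("CCO"),
    "propanol": chain("CCCO"),
    "dimethyl ether": chain("COC"),
    "propane": chain("CCC"),
}
print(search(query, file))
```

In this sketch propane is rejected by the screen alone, while dimethyl ether passes the screen (it has two carbons and one oxygen) and is only rejected by the atom-by-atom step, which is why both steps are needed.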
Building Query Structures
- Messenger has a whole set of commands for "drawing" chemical structures within the system itself.
- STN Express is a specialized software package which includes structure drawing software and the ability to upload these structures to the online system. For a manual on building structures using STN Express, see Structure Searching in the CAS Registry File at http://www.cas.org/ACAD/casreg.pdf. Note: this is a large PDF file requiring a current version of the Adobe Acrobat Reader for viewing.
- In addition to Messenger commands, STNWeb can use free plugin software to allow graphic structure drawing.
- SciFinder and SciFinder Scholar use graphic structure drawing modules for their substance and reaction searching.
STN Online and STNWeb
Structure Building Commands - STRUCTURE
- The command STRUCTURE initiates the structure building process.
- The system responds by prompting for a structure to recall.
- You may respond with the name of a template, a Registry Number, a previously-built structure L# or NONE.
- When in structure building mode, the arrow prompt is replaced by a colon.
GRAPH -- Creating the Pieces
- The GRAPH command (abbreviated GRA) tells the system to create atoms or sets of atoms, as either chains or rings. Note that, in general, you do not have to draw hydrogen atoms as part of the structure - they are assumed to be present by the system.
- The default atom is carbon; the default bond is unspecified.
- When structure building with GRA commands, the system automatically assigns a number to each node in the order constructed.
: gra c3
creates a three carbon chain, while
: gra r6
creates a six-carbon ring (i.e., the beginnings of cyclohexane or benzene).
: gra r66
creates two six-membered rings fused along one side (i.e., the beginnings of naphthalene).
- You can also attach chains to specific atoms:
: gra 2 c4
attaches a 4 carbon chain to atom 2
- You can create bonds between existing atoms:
: gra 1 2
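Because later commands refer to atoms by node number, it can help to track the numbering on paper before going online. The Python sketch below is a toy model of the "numbered in the order constructed" rule only; it is not the Messenger software, and the class `Sketch` and its methods are hypothetical. Fused ring systems such as `gra r66` are deliberately left out.

```python
class Sketch:
    """Toy model of node numbering only; this is not the Messenger software."""

    def __init__(self):
        self.nodes = 0      # highest node number assigned so far
        self.bonds = set()  # frozenset({a, b}) pairs

    def _new(self, count):
        """Hand out the next `count` node numbers."""
        first = self.nodes + 1
        self.nodes += count
        return list(range(first, self.nodes + 1))

    def _bond(self, a, b):
        self.bonds.add(frozenset((a, b)))

    def chain(self, length, attach_to=None):
        """Model of `gra cN`, or of `gra <node> cN` when attach_to is given."""
        new = self._new(length)
        for a, b in zip(new, new[1:]):
            self._bond(a, b)
        if attach_to is not None:
            self._bond(attach_to, new[0])
        return new

    def ring(self, size):
        """Model of `gra rN`: a chain closed back onto its first node."""
        new = self.chain(size)
        self._bond(new[-1], new[0])
        return new

    def connect(self, a, b):
        """Model of `gra <node> <node>`: bond two existing nodes."""
        self._bond(a, b)


s = Sketch()
first = s.chain(3)                # like: gra c3
branch = s.chain(4, attach_to=2)  # like: gra 2 c4
ring = s.ring(6)                  # like: gra r6
print(first, branch, ring)
```

In this model the three-carbon chain takes nodes 1 to 3, the four-carbon branch takes 4 to 7 with node 4 bonded to node 2, and the ring takes 8 to 13.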
DELETE - Removing Atoms and Bonds
- DELETE can be used to remove atoms or groups of atoms:
: del 1 5 8
- Or it can be used to remove bonds:
: del 1-2 7-9
NODE - Transmuting Elements
VARIABLE - Defining Your Own Generic Groups
BOND -- Specifying Bond Types
DISPLAY -- Seeing What You've Built
- The DISPLAY command in structure building may be used at any step along the way to see what the current structure looks like. It is frequently added to a structure building command to save time.
: gra c3, nod 2 o, dis
- DIS SIA displays both the diagram and the attributes (see below) of the structure.
Attribute Commands
END -- Going from Building to Searching
- When all your structure building is complete, the END command creates an L# for the structure and returns you to the normal search mode.
- You must END a structure before you can search it.
Types of Structure Searches
- Messenger allows four types of structure search:
- EXACT: Looks for the compound exactly as drawn; only possible variations are isotopic (and stereochemical if unspecified)
- FAMILY: Same as above, but will also pick up salts (of acids) or polymers (of monomers).
- CSS: Stands for Closed Substructure Search. This type of search will only allow substitution where you have specifically allowed it, as with a CONNECT attribute or the use of a variable or generic group.
- SSS: Stands for Substructure Search. Will allow any substitution at any atom except as you have specifically restricted it.
Ranges of Searching
- You can also specify how much of the database you wish to search:
- SAMPLE: This is a fixed, randomly selected, 5% of the database. Always search this before doing a substructure search to see if the search will work. Sample searches are always free!
- FULL: Self-explanatory
- RANGE: You may specify a range of Registry Numbers to search; useful for update searches or to continue searches which were too big to complete in one step. Ranges of less than 100K RNs are cheaper than a full search.
- SUBSET: Lets you use a previously created L# (by name, molecular formula, ring data, structure or combinations) as the defined set to search on. Can be a very powerful tool.
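Since SAMPLE covers a fixed 5% of the database, a sample result can be scaled up to get a rough feel for the size of a FULL search. The Python sketch below is simple arithmetic under the assumption that the sample is representative; the function name `project_full_answers` is hypothetical, and any projection the system itself reports should be preferred.

```python
SAMPLE_PERCENT = 5  # from the notes: SAMPLE is a fixed 5% of the database


def project_full_answers(sample_answers):
    """Naive point estimate of FULL answers from a SAMPLE answer count."""
    if sample_answers < 0:
        raise ValueError("answer count cannot be negative")
    return sample_answers * 100 // SAMPLE_PERCENT


print(project_full_answers(7))
```

The estimate is crude for small counts: a sample search with no answers does not prove that the full file contains none.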
Structure Search Hints
- When doing a structure search, always use SEARCH, not S. This way, the system will prompt you for type of search and range of search.
- SAMPLE searches aren't necessary for EXACT or FAMILY searches, but are strongly recommended for substructure searches.
- If a structure is unsearchable (exceeds system limits), consider whether you can create a suitable subset with name fragments or molecular formula or ring information which would bring a subset search within system limits. Alternatively, modify the structure to make it more specific. Note that changing HCO or CON attributes does not affect the search at the screening level, so these limitations do not generally keep a search within system limits.
Structure Building Example:
Feropolone
First, build the rings:
:gra r6
:gra r66, dis
Then, build the chain connecting the two:
:gra 1 c6
:gra 22 7
Now, build the side chains:
:gra 2 c1, 2 c1, 3 c1, 5 c1, 5 c1, 11 c1, 19 c1, 19 c1, dis
Then, use the NOD command to change atoms as necessary:
:nod 10 22 25 28 o, 23 24 29 me, 27 30 oh, dis
Now apply the BON command:
:bon all se, dis
:bon 3-25 11-28 12-13 de, dis
:bon 7-8 8-9 9-14 14-15 15-16 16-7 n; dis
Then apply attribute commands as necessary.
When the structure is completed, use the END command to complete the structure and return to the regular search mode.
You may display completed structures online with "display query L#"
Search the query with "search L# [search type] [search range]"
=> search L1 exact full
=> search L1 sss sample
EXAMPLE:
=> search l3
ENTER TYPE OF SEARCH (SSS), CSS, FAMILY, OR EXACT:sss
ENTER SCOPE OF SEARCH (SAMPLE), FULL, RANGE, OR SUBSET:full
FULL SEARCH INITIATED 23:13:37
FULL SCREEN SEARCH COMPLETED - 71 TO ITERATE
100.0% PROCESSED 71 ITERATIONS 1 ANSWERS
SEARCH TIME: 00.00.02
SciFinder Scholar
From the Explore screen, select "Chemical Substance or Reaction". Then, select "Chemical Structure".
Structure Drawing Tools: Tool Bar
- Draw Tool - Use for atom-by-atom, bond-by-bond drawing. Note that the default atom is carbon and the default bond is a single bond.
- Chain Tool - allows drawing of chains by clicking on a starting node, and "drawing out" the chain to the desired length.
- Atom Tool - allows selection of elements from a periodic table. Note that the tool stays set to the selected element until you change it.
- Shortcut Tool - allows selection from the list of "shortcut" groups.
- Variable Tool -- allows selection of variable groups, e.g., X for halogens.
- R-Group Tool -- allows building and insertion of groups containing a mix of atoms, shortcuts or variables.
- Ring Tools -- create cyclopentane, cyclopentadiene, cyclohexane or benzene rings, or saturated rings of any specified size. Rings may be attached to existing structures in fused fashion by clicking on a bond, or in spiro fashion by clicking on an atom.
- Template Tool -- Refers to the Template Menu for pre-drawn structures.
- Eraser Tool -- used to erase atoms or bonds
- Lasso Tool -- used to select or move atoms or groups of atoms freehand
- Select Arrow -- used to select atoms or groups of atoms individually or in a box.
- Rotate Tool -- used to rotate structure around a given atom.
- Charge Tools -- used to add positive or negative charge at a site
- Lock Atom Tool -- excludes further substitution at a site.
- Lock Ring Tool -- excludes further ring fusion on a ring system.
- Reaction Tools -- used to designate reactants or products and to specify bonds, atoms, etc.
- Note that SciFinder Scholar does not (yet) allow for use of structure fragments in variables, or specifying stereochemistry.
- To use a known compound as a model, find it by a substance search or from a reference, then cut and paste the Registry Number into the structure building window.
Searching Structures
- When your structure is ready, click on "Get Substances". Clicking on Preview does the equivalent of a sample search. This can be useful, but is not as necessary, since you can't "waste" a search in SciFinder Scholar.
- Then select "Exact Match" or "Substructure". Exact match is really the equivalent of a "family search" using Messenger. Substructure is an open substructure search.
- If your structure is too general, the system will respond with a message informing you of that. You can use the "Autofix" option, which allows the system to try making your structure more specific, or you can go back and modify it yourself.
- After the set of substances has been retrieved, you may proceed as with any set of substances: refining the set, or getting references.
This page created by Chuck Huber (huber@library.ucsb.edu). Last modified: February 29, 2000.