SMILES

SMILES, which stands for "Simplified Molecular Input Line Entry System", can be used to describe the structure of chemical substances as a text string.
Many chemical drawing programs can produce and understand SMILES, including the Marvin chemical drawing tool used in WebAssign content.
You may also find SMILES in a figure caption that describes a chemical structure, alongside a condensed chemical formula. Both forms are presented as additional information to clarify what the chemical drawing represents. This is especially helpful if you have a color-vision deficiency or are blind, but SMILES are useful to everyone. Read on!

Understanding SMILES

SMILES notation is relatively easy to learn: "C" means carbon, "H" means hydrogen, "O" means oxygen, and so on, as defined in the periodic table. Single bonds are implied (common in condensed chemical formulas), double bonds are represented by "=", and triple bonds are represented by "#". Thus, CO2 is "O=C=O". There are a few more nuances, as we'll show below.

Simple Atoms

Elements of the so-called organic subset (B, C, N, O, P, S, F, Cl, Br, and I) can be written as they appear on the periodic table. All other elements, including hydrogen when it is written as its own atom, should be enclosed in square brackets.

Simple Bonds

Single bonds are implied. Double and triple bonds are explicit.
SMILES Structure
CC
two C's connected with a straight line
C=C
two C's connected with a double parallel line
C#C
two C's connected with a triple parallel line
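
As a rough illustration of the bond rules above, the following Python sketch walks an unbranched SMILES made of one-letter atoms and reports each bond. The function name describe_bonds is hypothetical, and the sketch deliberately ignores brackets, branches, rings, and two-letter elements.

```python
# Minimal sketch: bond symbols in an unbranched chain of one-letter atoms.
# Assumption: no brackets, branches, rings, or two-letter elements.
BOND_NAMES = {"-": "single", "=": "double", "#": "triple"}


def describe_bonds(smiles):
    """Return (atom, bond type, atom) triples, reading left to right."""
    bonds = []
    previous = None
    pending = "-"  # a single bond is implied when no symbol is written
    for ch in smiles:
        if ch in BOND_NAMES:
            pending = ch
        elif ch.isalpha():
            if previous is not None:
                bonds.append((previous, BOND_NAMES[pending], ch))
            previous = ch
            pending = "-"  # reset to the implied single bond
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return bonds


print(describe_bonds("O=C=O"))
# [('O', 'double', 'C'), ('C', 'double', 'O')]
```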

Advanced Atom Annotation Techniques

Atomic Charges

For largely historical reasons, atomic charges take a somewhat odd form: the charge is written after the atom symbol, inside square brackets, either as repeated signs or as a sign followed by a number.
SMILES Structure
[Na+] Na+
[Ca++] Ca2+
[Sc+3] Sc3+
[F-] F−
[S--] S2−
[P-3] P3−
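
The two charge spellings in the table (repeated signs, or a sign followed by a number) can be read with a small regular expression. This is only a sketch: the name bracket_charge is hypothetical, and the pattern covers a bare element symbol plus an optional charge, nothing else that may appear inside brackets.

```python
import re

# Assumption: the bracket holds only an element symbol and an optional charge.
CHARGED_ATOM = re.compile(r"\[([A-Z][a-z]?)(?:(\++|-+)(\d+)?)?\]")


def bracket_charge(atom):
    """Return (symbol, charge) for strings such as '[Ca++]' or '[P-3]'."""
    match = CHARGED_ATOM.fullmatch(atom)
    if match is None:
        raise ValueError(f"not a simple bracket atom: {atom!r}")
    symbol, signs, digits = match.groups()
    if signs is None:
        return symbol, 0  # no charge written
    # '++' counts the signs; '+3' uses the number that follows the sign.
    magnitude = int(digits) if digits else len(signs)
    charge = magnitude if signs[0] == "+" else -magnitude
    return symbol, charge


for text in ["[Na+]", "[Ca++]", "[Sc+3]", "[F-]", "[S--]", "[P-3]"]:
    print(text, bracket_charge(text))
```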

Atom Labels

In some cases, it may be beneficial to label (a.k.a. map or mark) one or more atoms in a structure for emphasis. In SMILES this is accomplished by adding a colon and an integer to the atom specification.
SMILES Structure
[CH2:1]=[CH:2]CC
a linear sequence: a C marked [1] connected with a double bond to a C marked [2], further connected to two C's, each with a single-bond

Isotopes

Occasionally you will encounter structures that are isotopically labeled. Isotopes are written inside square brackets in AX format, where the mass number A comes immediately before the element symbol X.
SMILES Structure
[3H]Br
a linear sequence: an H atom marked 3H, connected to a Br atom by a single bond
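
Atom labels and isotopes both live inside the square brackets, so one pattern can pick them apart. The sketch below is an illustration under stated assumptions: parse_bracket_atom is a hypothetical name, and the pattern handles only an optional mass number, the element symbol, an optional hydrogen count, and an optional colon label (no charges or @ marks).

```python
import re

# Assumption: the bracket holds [mass number] symbol [H count] [:label] only.
BRACKET_ATOM = re.compile(r"\[(\d+)?([A-Z][a-z]?)(?:H(\d*))?(?::(\d+))?\]")


def parse_bracket_atom(atom):
    """Split a bracket atom into isotope, symbol, hydrogen count, and label."""
    match = BRACKET_ATOM.fullmatch(atom)
    if match is None:
        raise ValueError(f"unsupported bracket atom: {atom!r}")
    isotope, symbol, hcount, label = match.groups()
    if hcount is None:
        hydrogens = 0            # no H written inside the brackets
    elif hcount == "":
        hydrogens = 1            # a bare H means one hydrogen
    else:
        hydrogens = int(hcount)  # H2, H3, ...
    return {
        "isotope": int(isotope) if isotope else None,
        "symbol": symbol,
        "hydrogens": hydrogens,
        "label": int(label) if label else None,
    }


print(parse_bracket_atom("[CH2:1]"))
print(parse_bracket_atom("[3H]"))
```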

Advanced Bond Annotation Techniques

Branching

Not all chemical structures consist of simple chains of atoms. So-called branching (using parentheses) is used to define groups or substitution patterns, connected to the main chain.
SMILES Structure
CC(C)CC
a branched sequence containing only single bonds: a C atom connected to a branching C atom; the branching atom contains 1 C atom in the first branch and 2 sequential C atoms in the second branch
CC(=O)OC
a branched sequence: a C atom connected to a branching C atom; the branching atom contains 1 double-bonded O atom in the first branch and a single-bonded O atom connected to a further single-bonded C atom in the second branch
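
Because every branch is wrapped in a matched pair of parentheses, the branches of a SMILES can be listed with a simple stack. The helper below (the name branches is hypothetical) is a sketch that only pairs parentheses; it does not check that the contents are chemically sensible.

```python
def branches(smiles):
    """Return the text of each parenthesized branch, innermost first."""
    open_positions = []  # indexes of '(' that have not been closed yet
    found = []
    for index, ch in enumerate(smiles):
        if ch == "(":
            open_positions.append(index)
        elif ch == ")":
            if not open_positions:
                raise ValueError("')' without a matching '('")
            start = open_positions.pop()
            found.append(smiles[start + 1:index])
    if open_positions:
        raise ValueError("'(' without a matching ')'")
    return found


print(branches("CC(C)CC"))   # ['C']
print(branches("CC(=O)OC"))  # ['=O']
```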

Cyclic Structures

Cyclic structures are represented by breaking one bond in each ring and marking the two atoms of that ring-opening (or ring-closure) bond with the same digit, placed immediately after each atomic symbol.
SMILES Structure
C1CCCCC1
a six-membered cyclic structure containing only single-bonded C atoms
C1=CC=CC=C1
a six-membered cyclic structure containing alternating single-bonded and double-bonded C atoms
c1ccccc1
a six-membered cyclic structure containing only aromatically-bonded C atoms
C1CCC(C1)C2CCCC2
Two five-membered cyclic structures are connected by a single bond. Each cycle contains only single-bonded C atoms.
C1CCC2(C1)CCCC2
Two five-membered cyclic structures are connected to each other through a shared atom. Each cycle contains only single-bonded C atoms.
C1CC2CCCC2C1
Two five-membered cyclic structures are connected to each other through a shared bond. Each cycle contains only single-bonded C atoms.
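
To see which atoms a pair of ring digits joins, the sketch below numbers the atoms from the left (starting at 0) and pairs each digit with its earlier occurrence. The name ring_closures and the token pattern are assumptions made for this illustration; only single-digit ring labels and organic-subset or bracket atoms are handled.

```python
import re

# Assumption: single-digit ring labels; atoms are bracket atoms or organic subset.
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|\d|[-=#/\\().]")


def ring_closures(smiles):
    """Return (atom index, atom index) pairs joined by ring-closure digits."""
    open_rings = {}   # digit -> index of the atom that opened the ring
    pairs = []
    atom_index = -1
    position = 0
    while position < len(smiles):
        match = TOKEN.match(smiles, position)
        if match is None:
            raise ValueError(f"unsupported text at position {position}")
        token, position = match.group(), match.end()
        if token.isdigit():
            if token in open_rings:
                pairs.append((open_rings.pop(token), atom_index))
            else:
                open_rings[token] = atom_index
        elif token[0].isalpha() or token[0] == "[":
            atom_index += 1  # every atom token gets the next index
    if open_rings:
        raise ValueError("ring digit opened but never closed")
    return pairs


print(ring_closures("C1CCCCC1"))         # [(0, 5)]
print(ring_closures("C1CCC2(C1)CCCC2"))  # [(0, 4), (3, 8)]
```

In the second example atom 3 takes part in both rings, which matches the shared-atom description in the table.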

Explicit, Implicit, and Implied Hydrogen atoms

Chemists have three principal ways to indicate how many hydrogen atoms are connected to any one carbon atom: explicitly, implicitly, or by implication. These shortcuts exist because hydrogen atoms are abundant wherever carbon atoms are present.
This concept is often extended to non-carbon atoms. For instance, the hydroxy group may be written as O[H] (explicit), [OH] (implicit), or simply as O (implied), depending on the context.
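
The idea of implied hydrogen atoms can be made concrete by subtracting the bonds an atom already has from its usual valence. The sketch below is a simplification, not how any particular drawing tool does it: implied_hydrogens is a hypothetical name, only B, C, N, O, and the halogens are given a valence, aromatic (lowercase) atoms are not supported, and bracket atoms are assumed to carry no implied hydrogens.

```python
import re

# Assumptions: one usual valence per element; bracket atoms get no implied H.
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOFI]|\d|[-=#()]")
VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1, "Br": 1, "I": 1}
ORDER = {"-": 1, "=": 2, "#": 3}


def implied_hydrogens(smiles):
    """Return (atom, implied H count) pairs in left-to-right order."""
    atoms, used = [], []          # atom tokens and their bond-order sums
    branch_stack, open_rings = [], {}
    previous, order, position = None, 1, 0
    while position < len(smiles):
        match = TOKEN.match(smiles, position)
        if match is None:
            raise ValueError(f"unsupported text at position {position}")
        token, position = match.group(), match.end()
        if token in ORDER:
            order = ORDER[token]
        elif token == "(":
            branch_stack.append(previous)   # remember the branching atom
        elif token == ")":
            previous = branch_stack.pop()   # return to the branching atom
        elif token.isdigit():
            if token in open_rings:
                partner, opening_order = open_rings.pop(token)
                ring_order = max(order, opening_order)
                used[partner] += ring_order
                used[previous] += ring_order
            else:
                open_rings[token] = (previous, order)
            order = 1
        else:
            atoms.append(token)
            used.append(0)
            current = len(atoms) - 1
            if previous is not None:
                used[previous] += order
                used[current] += order
            previous, order = current, 1
    return [(sym, max(VALENCE[sym] - n, 0) if sym in VALENCE else 0)
            for sym, n in zip(atoms, used)]


print(implied_hydrogens("CCO"))   # [('C', 3), ('C', 2), ('O', 1)]
print(implied_hydrogens("O[H]"))  # [('O', 1), ('[H]', 0)]
```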

Stereochemistry

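Stereochemistry is indicated with slashes (/ and \) for E/Z (cis/trans) double bonds and with @ symbols for R/S stereocenters.
The relative orientation of the two slashes indicates the arrangement around the double bond: in the examples below, matching slashes place the end groups on opposite sides, and opposing slashes place them on the same side.
The stereochemistry around chiral atoms is relative: @ and @@ do not mean R or S by themselves, but give the counterclockwise (@) or clockwise (@@) order of the remaining neighbors, as written in the SMILES, when viewed from the first neighbor.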
SMILES Structure
C\C=C\C
a linear sequence of 4 C atoms: the 2 central C atoms are joined by a double bond, and each carries 1 peripheral C atom; the peripheral atoms are positioned on opposite sides of the double bond
C\C=C/C
a linear sequence of 4 C atoms: the 2 central C atoms are joined by a double bond, and each carries 1 peripheral C atom; the peripheral atoms are positioned on the same side of the double bond
[H][C@](F)(Cl)Br
4 atoms arranged around a central atom with implied directionality: the central atom is a C atom; a Br atom left of center, and a Cl atom below center form a plane with the central atom; an H atom lies behind the plane, and an F atom lies in front of the plane
[H][C@@](F)(Cl)Br
4 atoms arranged around a central atom with implied directionality: the central atom is a C atom; a Br atom left of center, and a Cl atom below center form a plane with the central atom; an H atom lies in front of the plane, and an F atom lies behind the plane
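
For the simple chain form used in the two double-bond rows of the table (substituent, slash, C=C, slash, substituent), comparing the two slashes is enough to tell the isomers apart. The sketch below covers only that layout; double_bond_geometry is a hypothetical name, and SMILES that place a substituent in a branch before the double bond need more careful handling.

```python
import re

# Assumption: the double bond is written as  X <slash> A=A <slash> Y  in a chain.
SLASHED_DOUBLE_BOND = re.compile(r"([/\\])[A-Z][a-z]?=[A-Z][a-z]?([/\\])")


def double_bond_geometry(smiles):
    """Return 'opposite' or 'same' for the first slashed double bond, else None."""
    match = SLASHED_DOUBLE_BOND.search(smiles)
    if match is None:
        return None  # no stereo-marked double bond in the supported form
    first, second = match.groups()
    # Matching slashes put the end groups on opposite sides (trans).
    return "opposite" if first == second else "same"


print(double_bond_geometry(r"C\C=C\C"))  # opposite
print(double_bond_geometry(r"C\C=C/C"))  # same
```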

Multiple Structures

When multiple structures are present in one drawing, the SMILES are separated from each other with a period. It is common to find this in salts, isomeric drawings, and reactions.
SMILES Structure
[Na+].[F-] Na+ F−
C\C=C\C.C\C=C/C
a composite structure: 2 isomers of the CC=CC structure are shown; 1 with the ends on opposite sides of the double bond, and 1 with both ends on the same side

Reactions

Reactions are by definition composed of reactants and products. In SMILES, reactants appear to the left of a >> and products appear to the right of it. There is no choice of arrows. On rare occasions (such as when a catalyst is present), the catalyst is placed between the > symbols.
SMILES Structure
C\C=C\C.[H][H]>>CCCC
a reaction: C\C=C\C reacts with HH to form CCCC
[H][H].CC(C)=C>[Pt]>CC(C)C
a reaction: CC(C)=C reacts with HH in the presence of Pt to form CC(C)C
Note that in some drawing modes, H atoms appear on the drawing canvas that are not present in the SMILES code. This is normal and anticipated behavior for many chemical drawing tools. These H atoms are often called "implied hydrogen atoms" because the SMILES code implies that they should be there to fulfill valence rules.
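
The reaction layout described above (reactants, then whatever sits between the two > symbols, then products, with periods between individual structures) can be unpacked with plain string splitting. This is a minimal sketch; parse_reaction and the dictionary keys are names chosen for this illustration.

```python
def parse_reaction(reaction):
    """Split a reaction SMILES into reactants, agents, and products."""
    parts = reaction.split(">")
    if len(parts) != 3:
        raise ValueError("a reaction SMILES needs exactly two '>' symbols")

    def structures(text):
        # A period separates the individual structures in each part.
        return [piece for piece in text.split(".") if piece]

    reactants, agents, products = (structures(part) for part in parts)
    return {"reactants": reactants, "agents": agents, "products": products}


print(parse_reaction("[H][H].CC(C)=C>[Pt]>CC(C)C"))
# {'reactants': ['[H][H]', 'CC(C)=C'], 'agents': ['[Pt]'], 'products': ['CC(C)C']}
```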

Extended Properties

Decorations (such as lone pairs, radical electrons, and a few other items) are not intrinsic parts of chemical structures, but they are used to highlight certain features. These features cannot be captured using standard SMILES, but can be captured in so-called extended SMILES, where they are placed between vertical bars. Atoms are counted from the left of the SMILES starting at position 0.
SMILES Structure
[Li] |^1:0|
a Li atom with a single dot placed next to it
[Be] |^2:0|
a Be atom with two single dots placed next to it at opposite ends
[B] |^7:0|
a B atom with three single dots placed next to it at cardinal positions
Unfortunately, the tetravalent state (a C atom with four single dots placed next to it at cardinal positions) cannot be described in SMILES.
[H]N([H])[H] |lp:1:1|
The molecule NH3 is shown with a double-dot placed next to the N atom. All atoms are connected by single bonds.
[H]O[H] |lp:1:2|
The molecule H2O is shown with two double-dots placed next to the O atom. All atoms are connected by single bonds.
C#N |lp:0:1,1:1|
The CN ion is shown with a single double-dot placed next to each atom. The atoms are connected by a triple bond.
[N]=O |lp:0:1,1:2,^1:0|
The NO molecule is shown with a single-dot and a double-dot placed next to the N atom and 2 double-dots placed next to the O atom. The atoms are connected by a double-bond.
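
The extension block between the vertical bars can be read field by field. The sketch below infers its layout only from the examples in the table: an lp: field lists atom:count pairs, and a ^ field gives a radical code followed by atom positions. The name parse_extended is hypothetical, the radical code is returned as written (the table shows codes 1, 2, and 7 for one, two, and three dots), and other extension fields are not handled.

```python
import re

# Assumption: layout inferred from the table rows, e.g. '[N]=O |lp:0:1,1:2,^1:0|'.
EXTENDED = re.compile(r"(\S+)\s+\|(.*)\|")


def parse_extended(text):
    """Return (smiles, lone pairs by atom, radical code by atom)."""
    match = EXTENDED.fullmatch(text.strip())
    if match is None:
        return text.strip(), {}, {}  # plain SMILES without an extension block
    smiles, body = match.groups()
    lone_pairs, radicals = {}, {}
    mode, radical_code = None, None
    for field in body.split(","):
        if field.startswith("lp:"):
            mode, field = "lp", field[3:]
        elif field.startswith("^"):
            mode = "radical"
            code, _, field = field[1:].partition(":")
            radical_code = int(code)
        if mode == "lp":
            atom, count = field.split(":")
            lone_pairs[int(atom)] = int(count)   # atom position -> lone pairs
        elif mode == "radical":
            radicals[int(field)] = radical_code  # atom position -> code
        else:
            raise ValueError(f"unsupported field: {field!r}")
    return smiles, lone_pairs, radicals


print(parse_extended("[N]=O |lp:0:1,1:2,^1:0|"))
# ('[N]=O', {0: 1, 1: 2}, {0: 1})
```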

Mechanism Illustrations

Drawings that show the mechanistic movement of electrons during a chemical transformation are so-called process functions rather than state functions. While SMILES are adept at capturing states, they are intrinsically unsuitable for capturing processes. The beginning and end states may be presented as SMILES, but electron-flow arrows can only be described in words (in figure captions) or in Chemical Markup Language (when interacting with the Marvin drawing tool).

Further Reading

While this document describes features that are fairly common in general chemistry, biochemistry, and organic chemistry, it is by no means comprehensive. More information may be obtained from the following sources.