IUPAC Provisional Recommendations: IR-4 Formulae (Draft March 2004)
CONTENTS
IR-4.1 Introduction
IR-4.2 Definitions of types of formula
IR-4.2.1 Empirical formulae
IR-4.2.2 Molecular formulae
IR-4.2.3 Structural formulae and the use of enclosing marks in formulae
IR-4.2.4 Formulae of addition compounds
IR-4.2.5 Solid state structural information
IR-4.3 Indication of ionic charge
IR-4.4 Sequence of citation of symbols in formulae
IR-4.4.1 Introduction
IR-4.4.2 Ordering principles
IR-4.4.2.1 Electronegativity
IR-4.4.2.2 Alphabetical order
IR-4.4.3 Formulae for specific classes of compounds
IR-4.1 INTRODUCTION
Formulae (empirical, molecular, and structural formulae as described below) provide a simple
and clear method of designating compounds. They are of particular importance in chemical
equations and in descriptions of chemical procedures. In order to avoid ambiguity and for
many other purposes, e.g. in databases, indexing, etc., standardisation is recommended.
IR-4.2.1 Empirical formulae
The empirical formula of a compound is formed by juxtaposition of the atomic symbols with
appropriate (integer) subscripts to give the simplest possible formula expressing the
composition. For the order of citation of symbols in formulae, see Section IR-4.4, but, in the
absence of any other ordering criterion (for example, if little structural information is
available), the alphabetical order of atomic symbols should be used in an empirical formula,
except that in carbon-containing compounds, C and H are usually cited first and second,
respectively (ref. 1).
Examples:
1. BrClH3N2NaO2Pt
2. C10H10ClFe
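The ordering rule above can be sketched in a few lines of Python. This is only an illustration, assuming the composition is supplied as a dictionary of element symbols and integer counts; the function name `empirical_formula` is hypothetical and not part of these recommendations.

```python
from functools import reduce
from math import gcd


def empirical_formula(composition):
    """Return the simplest formula for a {symbol: count} composition.

    Symbols are cited alphabetically, except that C and H come first and
    second when carbon is present (assumed input: positive integer counts).
    """
    divisor = reduce(gcd, composition.values())
    counts = {symbol: n // divisor for symbol, n in composition.items()}
    symbols = sorted(counts)  # plain string sort already puts "N" before "Na"
    if "C" in counts:
        lead = [s for s in ("C", "H") if s in counts]
        symbols = lead + [s for s in symbols if s not in ("C", "H")]
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in symbols)


print(empirical_formula({"Fe": 1, "Cl": 1, "H": 10, "C": 10}))  # C10H10ClFe
```

A subscript of 1 is omitted, and the counts are divided by their greatest common divisor so that the result is the simplest possible formula.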
IR-4.2.2 Molecular formulae
For compounds consisting of discrete molecules, the molecular formula, as opposed to the
empirical formula, may be used to indicate the actual composition of the molecules. For the
order of citation of symbols in molecular formulae, see Section IR-4.4.
The choice of formula depends on the chemical context. In some cases, the empirical formula
may also correspond to a molecular composition, in which case the only possible difference
between the two formulae is the ordering of the atomic symbols. If it is not desirable or
possible to specify the composition, e.g. in the case of polymers, a letter subscript such as n
may be used.
Examples:
    Molecular formula    Empirical formula
1.  S8                   S
2.  Sn                   S
3.  SF6                  F6S
4.  S2Cl2                ClS
5.  H4P2O6               H2O3P
6.  Hg2Cl2               ClHg
7.  N2O4                 NO2
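The relation between the two kinds of formula in the examples above is a common integer factor in the subscripts. The following sketch, which assumes a simple formula string without enclosing marks or letter subscripts (so the polymer notation with subscript n is not handled), recovers that factor; `parse_counts` and `formula_units` are hypothetical helper names.

```python
import re
from functools import reduce
from math import gcd

# One element symbol followed by an optional integer subscript.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_counts(formula):
    """Count atoms in a simple formula string such as 'H4P2O6'."""
    counts = {}
    for symbol, digits in TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + int(digits or 1)
    return counts


def formula_units(formula):
    """Return (n, reduced counts), where the molecule is n empirical units."""
    counts = parse_counts(formula)
    n = reduce(gcd, counts.values())
    return n, {symbol: c // n for symbol, c in counts.items()}


print(formula_units("H4P2O6"))  # (2, {'H': 2, 'P': 1, 'O': 3})
```

The reduced counts still have to be cited in the order required for an empirical formula (Section IR-4.2.1); this sketch deliberately leaves ordering aside.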
IR-4.2.3 Structural formulae and the use of enclosing marks in formulae
A structural formula gives partial or complete information about the way in which the atoms
in a molecule are connected and arranged in space. In simple cases, a line formula that is just
a sequence of atomic symbols gives structural information provided the reader knows that the
formula represents the order of the atoms in the linear structure.
Examples:
1. HOCN (empirical formula CHNO)
2. HNCO (empirical formula also CHNO)
3. HOOH (empirical formula HO)
As soon as the compound has even a slightly more complex structure, it becomes necessary
to use enclosing marks in line formulae to separate subgroups of atoms. Different enclosing
marks must be used for repeating units and sidechains in order to avoid ambiguity.
The basic rules for applying enclosing marks in structural formulae are as follows:
(ii) Side groups to a main chain and groups (ligands) attached to a central atom [...]
[...] regarding their attachment in the structure, e.g. hydrogen in hydrides with a chain
structure).
(v) In the case of polymers, if the bonds between repeating units are to be shown, [...]
(vii) Atoms or groups of atoms which are represented together with a prefixed
symbol, e.g. a structural modifier such as 'µ', are placed within enclosing marks, using
the nesting order ( ), {( )}, ({( )}), {({( )})}, etc.
The use of enclosing marks for the specification of isotopic substitution is described in
Section IR-4.5.
Compared to line formulae, displayed formulae (Example 13 below) give more (or full)
information about the structure.
(The rules needed for ordering the symbols in some of the example formulae below are given
in Section IR-4.4.3.)
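Examples:
4. SiH3[SiH2]8SiH3 [rule (i)]
5. SiH3[SiH2]6SiH(SiH3)SiH3 [rules (i) and (ii)]
12. ([PdCl2])n, or (displayed formula of the repeating unit, Pd with two Cl, subscript n; not reproduced here)
13. (displayed formula showing Ni bonded to two Cl and two PPh3; not reproduced here)
14. NaCl
15. [NaCl]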
In Examples 14 and 15, the formula [NaCl] may be used to distinguish the molecular
compound consisting of one sodium atom and one chlorine atom from the solid with the
composition NaCl.
IR-4.2.4 Formulae of addition compounds
In the formulae of addition compounds, including multiple salts and solvates (particularly
hydrates), a special format is used. The proportions of constituents are indicated by arabic
numerals preceding the formulae of the constituents, and the formulae of the constituents are
separated by a centre dot. The rules for ordering the constituent formulae are described in
Section IR-4.4.3.5.
Examples:
1. Na2CO3·10H2O
2. 8H2S·46H2O
3. BMe3·NH3
IR-4.2.5 Solid state structural information
[...] molecular formula (see Section IR-11.7.2). For example, polymorphs may be indicated by
adding in parentheses an abbreviated expression for the crystal system. Structures may also
be designated by adding the name of a type-compound in italics in parentheses, but such
usage may not be unambiguous. There are at least ten varieties of ZnS(h). Where several
polymorphs crystallise in the same crystal system they may be differentiated by the Pearson
symbol (see Sections IR-3.5.3 and IR-11.5.2). Greek letters are frequently employed to
designate polymorphs, but their use is often confused and contradictory and is not
recommended.
[...]
For the formulae of solid solutions and non-stoichiometric phases, see Chapter IR-11.
IR-4.3 INDICATION OF IONIC CHARGE
Ionic charge is indicated by means of a right upper index, as in $\mathrm{A}^{n+}$ or
$\mathrm{A}^{n-}$ (not $\mathrm{A}^{+n}$ or $\mathrm{A}^{-n}$).
If the formula is placed in enclosing marks, the right upper index is placed outside the
enclosing marks. For polymeric ions, either the charge of a single repeating unit should be
placed inside the parentheses that comprise the polymeric structure, or the total charge of the
polymeric species should be placed outside the polymer parentheses. (The rules needed for
ordering the symbols in some of the example formulae below are given in Section IR-4.4.3.)
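Examples:
1. $\mathrm{Cu^{+}}$
2. $\mathrm{Cu^{2+}}$
3. $\mathrm{NO^{+}}$
4. $\mathrm{[Al(H_2O)_6]^{3+}}$
5. $\mathrm{H_2NO_3^{+}}$
6. $\mathrm{[PCl_4]^{+}}$
7. $\mathrm{As^{3-}}$
8. $\mathrm{HF_2^{-}}$
9. $\mathrm{CN^{-}}$
10. $\mathrm{S_2O_7^{2-}}$
11. $\mathrm{[Fe(CN)_6]^{4-}}$
12. $\mathrm{[PW_{12}O_{40}]^{3-}}$
13. $\mathrm{[P_3O_{10}]^{5-}}$, or $\mathrm{[O_3POP(O)_2OPO_3]^{5-}}$, or (displayed formula of the P-O-P-O-P chain with overall charge 5−; not reproduced here)
14. $(\mathrm{[CuCl_3]^{-}})_n$, or (displayed formula of the repeating unit with overall charge n−; not reproduced here)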
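As a small illustration of the right-upper-index convention of Section IR-4.3, the sketch below builds the index with the magnitude before the sign and appends it after any enclosing marks. The caret-and-braces output and the ASCII hyphen for the minus sign are plain-text assumptions made for this example, and the function names are hypothetical.

```python
def charge_index(charge):
    """Return the right upper index for an integer charge, e.g. '2+' or '3-'."""
    if charge == 0:
        return ""
    magnitude = abs(charge)
    sign = "+" if charge > 0 else "-"
    # The magnitude precedes the sign (n+, never +n); a magnitude of 1 is omitted.
    return (str(magnitude) if magnitude > 1 else "") + sign


def ion(formula, charge):
    """Append the charge index after the formula, outside any enclosing marks."""
    index = charge_index(charge)
    return f"{formula}^{{{index}}}" if index else formula


print(ion("[Fe(CN)6]", -4))  # [Fe(CN)6]^{4-}
```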
IR-4.4.1 Introduction
Atomic symbols in formulae may be ordered in various ways. Section IR-4.4.3 describes the
conventions usually adopted for some important classes of compounds. As a prerequisite,
Section IR-4.4.2 explains what is meant by the two ordering principles 'electronegativity' and
'alphabetical ordering'.
IR-4.4.2 Ordering principles
IR-4.4.2.1 Electronegativity
If electronegativity is taken as the ordering principle in a formula or a part of a formula, the
atomic symbols are cited according to relative electronegativities, the least electronegative
element being cited first. For this purpose, Table VI* is used as a guide, except that oxygen is
placed between chlorine and fluorine.
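A minimal sketch of electronegativity ordering follows. The sequence used here is NOT Table VI: it is a short illustrative subset, assumed for this example and chosen only so that it agrees with binary formulae given in this chapter (NH3, H2S, OF2, Cl2O, RbBr), with oxygen placed between chlorine and fluorine. For real use the full table must be supplied through the `sequence` argument.

```python
# Illustrative subset only, least electronegative first (an assumption for
# this sketch, not a reproduction of Table VI).
ILLUSTRATIVE_SEQUENCE = ["Rb", "P", "N", "H", "S", "Br", "Cl", "O", "F"]


def order_by_electronegativity(counts, sequence=ILLUSTRATIVE_SEQUENCE):
    """Cite the symbols of a {symbol: count} mapping, least electronegative first."""
    rank = {symbol: position for position, symbol in enumerate(sequence)}
    ordered = sorted(counts, key=lambda symbol: rank[symbol])
    return "".join(s + (str(counts[s]) if counts[s] > 1 else "") for s in ordered)


print(order_by_electronegativity({"O": 1, "Cl": 2}))  # Cl2O
```

An element missing from the supplied sequence raises a `KeyError`, which is intentional: the function should not guess a position.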
IR-4.4.2.2 Alphabetical order
A single letter symbol always precedes a two-letter symbol with the same initial letter, e.g. B
before Be, O before OH. The group NH4 is treated as a single symbol and so is listed after
Na, for example.
Where the entities to be arranged in a formula are polyatomic, the order of citation is decided
by selecting a particular atomic symbol to characterise the entity. The first symbol in the
formula of a polyatomic group, as written according to the appropriate rule in Section
IR-4.4.3, determines the alphabetical order. For example, C5H5, SCN, UO2, NO3, OH, and
[Zn(H2O)6]2+ are ordered under C, S, U, N, O, and Zn, respectively. If the first symbols are
the same, the symbol with the lesser right index is cited first, e.g. NO2 precedes N2O2. If this
still does not discriminate, the subsequent symbols are used alphabetically and numerically to
define the order, e.g. NH2 precedes NO2 which precedes NO3.
[...] is:
$\mathrm{N^{3-}}$, $\mathrm{NH_2^{-}}$, $\mathrm{NH_3}$, $\mathrm{NO_2^{-}}$, $\mathrm{NO_3^{-}}$, $\mathrm{N_2O_2^{2-}}$, $\mathrm{N_3^{-}}$.
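The alphabetical rules of this section can be expressed as a sort key. The sketch below is one possible reading, not a normative algorithm: it assumes formula strings written without charges, ignores enclosing marks and multipliers after them, and treats NH4 as a single symbol. The names `TOKEN` and `citation_key` are hypothetical.

```python
import re

# NH4 is matched first so that it counts as a single symbol; otherwise one
# element symbol, each with an optional right index.
TOKEN = re.compile(r"(NH4|[A-Z][a-z]?)(\d*)")


def citation_key(formula):
    """Sort key: (symbol, right index) pairs compared from left to right."""
    return [(symbol.lower(), int(index or 1)) for symbol, index in TOKEN.findall(formula)]


groups = ["N3", "NO3", "N2O2", "NH2", "N", "NO2", "NH3"]
print(sorted(groups, key=citation_key))
# ['N', 'NH2', 'NH3', 'NO2', 'NO3', 'N2O2', 'N3']
```

Lower-casing the symbols makes a single-letter symbol sort before a two-letter symbol with the same initial and places NH4 after Na. Comparing the right index before moving to the next symbol gives NO2 before N2O2.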
Examples:
1. $\mathrm{NH_3}$
2. $\mathrm{H_2S}$
3. $\mathrm{OF_2}$
4. $\mathrm{Cl_2O}$
5. $\mathrm{ClO^{-}}$
6. $\mathrm{PH_4^{+}}$
7. $\mathrm{P_2O_7^{4-}}$
8. $\mathrm{[SiAs_4]^{8-}}$
9. $\mathrm{RbBr}$
10. $\mathrm{[Re_2Cl_9]^{-}}$
11. $\mathrm{HO^{-}}$ or $\mathrm{OH^{-}}$
Note that the formula for the hydroxide ion should be $\mathrm{HO^{-}}$ to be consistent with the above
convention.
12. $\mathrm{Rb_{15}Hg_{16}}$
13. $\mathrm{Cu_5Zn_8}$ and $\mathrm{Cu_5Cd_8}$
* Tables numbered with a Roman numeral are collected together at the end of this
book.
[...] criteria for ordering the element symbols in the formula are more often used (see Sections
IR-4.4.3.2 to IR-4.4.3.4).
In the formula of a coordination entity, the symbol of the central atom(s) is/are placed first,
followed by the symbols or formulae of the ligands, unless additional structural information
can be presented by changing the order (see, for example, Section IR-4.4.3.3).
The order of citation of central atoms is based on electronegativity as described in Section
IR-4.4.2.1. Ligands are cited alphabetically (Section IR-4.4.2.2) according to the first symbol of
the ligand formula or ligand abbreviation (see Section IR-4.4.4) as written. Where possible,
the ligand formula should be written in such a way that a/the donor atom is closest to the
central atom to which it is attached.
Square brackets may be used to enclose the whole coordination entity whether charged or not.
Established practice is always to use square brackets for coordination entities with a transition
metal as the central atom.
Examples:
1. PBrCl2
2. SbCl2F or [SbCl2F]
3. [Mo6O18]2−
4. [CuSb2]5−
5. [UO2]2+
6. [SiW12O40]4−
7. [BH4]−
8. [ICl4]−
9. [PtCl2{P(OEt)3}2]
10. [Al(OH)(OH2)5]2+
11. [PtBrCl(NH3)(NO2)]−
12. [PtCl2(NH3)(py)]
In a few cases, a moiety which comprises different atoms and which occurs in a series of
compounds is considered as an entity that acts as a central atom and is cited as such, even if
this violates the alphabetical order of ligands. For example, PO and UO2 are regarded as [...]
Examples:
15. POBr3 (alphabetically, PBr3O)
For derivatives of parent hydrides, the alphabetical order of ligands is traditionally disobeyed
in that remaining hydrogen atoms are listed first among the ligands in the formula.
Examples:
17. GeH2F2
18. SiH2BrCl
19. B2H5Cl
For carbaboranes, there has previously been some uncertainty over the order of B and C (ref. 3).
The order 'B before C' recommended here conforms to both electronegativity and
alphabetical order (i.e. it is an exception to the Hill order (ref. 1) in Section IR-4.2.1). In addition,
carbon atoms that replace skeletal boron atoms are cited immediately after boron, regardless
of what other elements are present. (See also Section IR-6.2.4.4.)
Examples:
20. B3C2H5 (recommended)
21. B3C2H4Br (recommended)
For inorganic oxoacids, there is a traditional ordering of formulae in which the 'acid' or
'replaceable' hydrogen atoms (hydrogen atoms bound to oxygen) are listed first, followed by
the central atom, then 'non-replaceable' hydrogen atoms (hydrogen atoms bound directly to
the central atom), and finally oxygen. This format is an alternative to writing the formulae as
coordination compound formulae (see Section IR-8.3).
Examples:
22. HNO3 (traditional) or [NO2(OH)] (coordination)
23. H2PHO3 (traditional) or [PHO(OH)2] (coordination)
24. H2PO4− (traditional) or [PO2(OH)2]− (coordination)
25. H5P3O10 (traditional) or [(HO)2P(O)OP(O)(OH)OP(O)(OH)2] (coordination)
26. (HBO2)n (traditional) or (B(OH)O)n (coordination)
For chain compounds containing three or more different elements, the sequence of atomic
symbols should generally be in accord with the order in which the atoms are bound in the
molecule or ion, rather than using alphabetical order or order based on electronegativity.
Examples:
1. SCN− (not CNS−) = [C(N)S]−, nitridosulfidocarbonate(1−)
2. BrSCN (not BrCNS)
If the formula of a compound containing three or more elements is not naturally assigned
using the preceding two sections, the compound can be treated as a generalised salt. This
term is taken to mean any compound in which it is possible to identify at least one
constituent which is a positive ion or can be classified as electropositive or more
electropositive than the other constituents, and at least one constituent which is a negative ion
or can be classified as electronegative or more electronegative than the rest of the
constituents. The ordering principle is then:
(i) all electropositive constituents precede all electronegative constituents;
(ii) within each of the two groups of constituents, alphabetical order is used.
Examples:
1. KMgF3
2. MgCl(OH)
3. FeO(OH)
4. NaTl(NO3)2
5. Li[H2PO4]
6. NaNH4[HPO4]
7. Na[HPHO3]
8. CuK5Sb2 or K5CuSb2
9. K5[CuSb2]
10. H[AuCl4]
11. Na(UO2)3[Zn(H2O)6](O2CMe)9
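Rules (i) and (ii) for generalised salts can be sketched as follows. The classification of each constituent as electropositive or electronegative is a chemical judgement and is taken here as given input. The compact sort key is an assumption modelled on Section IR-4.4.2.2, and `generalised_salt` is a hypothetical name.

```python
import re

# NH4 counts as a single symbol; otherwise one element symbol plus right index.
TOKEN = re.compile(r"(NH4|[A-Z][a-z]?)(\d*)")


def citation_key(formula):
    """Alphabetical sort key built from (symbol, right index) pairs."""
    return [(symbol.lower(), int(index or 1)) for symbol, index in TOKEN.findall(formula)]


def generalised_salt(electropositive, electronegative):
    """Cite all electropositive constituents first; alphabetical within each class."""
    ordered = sorted(electropositive, key=citation_key)
    ordered += sorted(electronegative, key=citation_key)
    return "".join(ordered)


print(generalised_salt(["NH4", "Na"], ["[HPO4]"]))  # NaNH4[HPO4]
```

Each constituent is passed as it should appear in the final formula, including its enclosing marks and multiplier, for example `"(NO3)2"`.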
[...] formulae. The formula in Example 9, on the other hand, implies the presence of a molecular
entity or coordination entity [CuSb2]5−.
Deviation from alphabetical order of constituents in the same class is allowed to emphasise [...]
Some generalised salts may also be treated as addition compounds, see Section IR-4.4.3.5.
In the formulae of addition compounds, multiple salts and solvates, the component molecules
or entities are cited in order of increasing number; if they occur in equal numbers, they are
cited in alphabetical order of the first symbols. In addition compounds containing water, the
water remains conventionally cited last. However, boron compounds are no longer treated as
exceptions.
Examples:
1. 3CdSO4·8H2O
2. Na2CO3·10H2O
3. Al2(SO4)3·K2SO4·24H2O
4. AlCl3·4EtOH
5. 8H2S·46H2O
6. C6H6·NH3·Ni(CN)2
7. BF3·2H2O
8. BF3·2MeOH
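The ordering and formatting rules for addition compounds can be sketched together: components are sorted by increasing number, ties are broken alphabetically by their symbols, water is cited last, and the parts are joined by a centre dot. The input representation (a list of `(number, formula)` pairs) and the name `addition_compound` are assumptions made for this illustration.

```python
import re

SYMBOL = re.compile(r"[A-Z][a-z]?")


def addition_compound(components):
    """Format (number, formula) pairs as an addition-compound formula."""

    def key(item):
        number, formula = item
        # Water last, then increasing number, then alphabetical by symbols.
        return (formula == "H2O", number, SYMBOL.findall(formula))

    parts = sorted(components, key=key)
    return "·".join((str(n) if n > 1 else "") + formula for n, formula in parts)


print(addition_compound([(24, "H2O"), (1, "K2SO4"), (1, "Al2(SO4)3")]))
# Al2(SO4)3·K2SO4·24H2O
```

A multiplier of 1 is omitted, in line with the examples above.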
Since abbreviations are widely used in the chemical literature, agreement on their use and
meaning is desirable. This Section provides guidelines for the selection of ligand
abbreviations for application in the formulae of coordination compounds (Section
IR-9.2.3.4). Some commonly used ligand abbreviations are listed in Table VII with diagrams of [...]
An abbreviation for an organic ligand should be derived from a name consistent with the
current rules for the systematic nomenclature of organic compounds (ref. 4). (For some ligands a
non-systematic name is included in Table VII if it was the source of the abbreviation and if
that abbreviation is still commonly used.) However, new abbreviations should be constructed
according to the following recommendations:
(ii) New meanings should not be suggested for abbreviations or acronyms that have
generally accepted meanings, e.g. DNA, NMR, ESR, HPLC, Me (for methyl), Et (for ethyl),
etc.
(iii) An abbreviation should readily suggest the ligand name, e.g. ida for iminodiacetato.
(iv) Abbreviations should be as short as practicable, but should contain more than one
letter or symbol.
(v) The use of non-systematic names (and their abbreviations) is discouraged.
(vi) Abbreviations should normally use only lower-case letters, with several well
established exceptions:
(a) abbreviations for alkyl, aryl and similar groups, which have the first letter
capitalised with the remaining letters in lower case, e.g. Me (for methyl), Ac (for
acetyl), Cp (for cyclopentadienyl), etc.;
(b) abbreviations containing atomic symbols, e.g. [12]aneS4;
(c) abbreviations containing Roman numerals, e.g. H2ppIX for protoporphyrin IX;
(d) abbreviations for ligands containing readily removable hydrons (see vii).
(N.B. Abbreviations for solvents that behave as ligands should also be in lower case letters
[e.g. dmso for dimethyl sulfoxide {(methylsulfinyl)methane}, thf for tetrahydrofuran]; the
practice of capitalising the abbreviation of a solvent when it does not behave as a ligand is
strongly discouraged as an unnecessary distinction.)
(vii) Hydronation of anionic ligands, e.g. ida, leads to acids which may be abbreviated by
the addition of H, e.g. Hida, H2ida.
(viii) Ligands which are normally neutral, but which continue to behave as ligands on
losing one or more hydrons, are abbreviated by adding −1H, −2H, etc. as subscripts
(including the numeral 1) after the usual abbreviation of the ligand. For example, if
Ph2PCH2PPh2 (dppm) loses one hydron to give [Ph2PCHPPh2]− its abbreviation is [...]
The mass number of any specific nuclide can be indicated in the usual way with a left
superscript preceding the appropriate atomic symbol (see Section IR-3.2).
When it is necessary to cite different nuclides at the same position in a formula, the nuclide
symbols are written in alphabetical order; when their atomic symbols are identical the order is
that of increasing mass number. Isotopically modified compounds may be classified as
isotopically substituted compounds and isotopically labelled compounds.
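The citation order for nuclides at the same position amounts to a two-level sort. The sketch below assumes each nuclide is given as a `(symbol, mass_number)` pair; the function name `order_nuclides` is hypothetical.

```python
def order_nuclides(nuclides):
    """Sort (symbol, mass number) pairs alphabetically, then by increasing mass number."""
    return sorted(nuclides, key=lambda nuclide: (nuclide[0].lower(), nuclide[1]))


print(order_nuclides([("H", 3), ("C", 14), ("H", 2), ("C", 13)]))
# [('C', 13), ('C', 14), ('H', 2), ('H', 3)]
```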
IR-4.5.2 Isotopically substituted compounds
An isotopically substituted compound has a composition such that all the molecules of the
compound have only the indicated nuclide(s) at each designated position. The substituted
nuclides are indicated by insertion of the mass numbers as left superscripts preceding the
appropriate atom symbols in the normal formula.
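Examples:
1. $\mathrm{H^{3}HO}$
2. $\mathrm{H^{36}Cl}$
3. $\mathrm{^{235}UF_6}$
4. $\mathrm{^{42}KNa^{14}CO_3}$
5. $\mathrm{^{32}PCl_3}$
6. $\mathrm{K[^{32}PF_6]}$
7. $\mathrm{K_3{}^{42}K[Fe(^{14}CN)_6]}$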
[...] compounds. They may be divided into several different types. Specifically labelled [...]
[...] appropriate nuclide symbol(s) and multiplying subscript (if any) in square brackets.
Examples:
1. H[36Cl]
2. [32P]Cl3
3. [15N]H2[2H]
4. [13C]O[17O]
5. [32P]O[18F3]
6. Ge[2H2]F2
[...] any necessary locant(s) (but without multiplying subscripts) enclosed in square brackets.
Examples:
1. [36Cl]SOCl2
2. [2H]PH3
3. [10B]B2H5Cl
The number of possible labels for a given position may be indicated by subscripts separated
by semicolons added to the atomic symbol(s) in the isotopic descriptor.
Example:
4. [1-2H1;2]SiH3OSiH2OSiH3
[...] represented by the numeral 0 but is not usually shown. If an element occurs with more than
one oxidation state in the same formula, the element symbol is repeated, each symbol being
assigned a number cited in sequence in increasing magnitude and from negative to positive.
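Examples:
1. $\mathrm{[P^{V}_{2}Mo_{18}O_{62}]^{6-}}$
2. $\mathrm{K[Os^{VIII}(N)O_3]}$
3. $\mathrm{[Mo^{V}_{2}Mo^{VI}_{4}O_{18}]^{2-}}$
4. $\mathrm{Pb^{II}_{2}Pb^{IV}O_4}$
5. $\mathrm{Na_2O^{-I}_{2}}$
6. $\mathrm{[Os^{0}(CO)_5]}$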
Where it is not feasible or reasonable to define an oxidation state for each individual member
of a group (or cluster), the overall oxidation level of the group should be defined by a formal
ionic charge, indicated as in Section IR-4.3. This avoids the use of fractional oxidation states.
Examples:
7. $\mathrm{O_2^{-}}$
8. $\mathrm{Fe_4S_4^{3+}}$
A radical is an atom or molecule with one or more unpaired electrons. It may have positive,
negative or zero charge. An unpaired electron may be indicated in a formula by a superscript
dot. The dot is placed as a right upper index to the chemical symbol, so as not to interfere
with indications of mass number, atomic number or composition. In the case of diradicals,
etc., the superscript dot is preceded by the appropriate superscript multiplier. The radical dot
with its multiplier, if any, precedes any charge. To avoid confusion, the multiplier and the
radical dot can be placed within parentheses.
Metals and their ions or complexes often possess unpaired electrons but, by convention, they
are not considered to be radicals, and radical dots are not used in their formulae. However,
there may be occasions when a radical ligand is bound to a metal or metal ion where it is
desirable to use a radical dot.
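Examples:
1. $\mathrm{H^{\bullet}}$
2. $\mathrm{HO^{\bullet}}$
3. $\mathrm{NO_2^{\bullet}}$
4. $\mathrm{O_2^{2\bullet}}$
5. $\mathrm{O_2^{\bullet -}}$
6. $\mathrm{BH_3^{\bullet +}}$
7. $\mathrm{PO_3^{\bullet 2-}}$
8. $\mathrm{NO^{(2\bullet)-}}$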
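The placement rules for the radical dot can be summarised in a short sketch: multiplier, then dot, then charge, with optional parentheses around the multiplier and dot. The function builds only the content of the right upper index. Its name, its arguments, and the use of an ASCII hyphen for the minus sign are assumptions made for this example.

```python
DOT = "\u2022"  # the radical dot


def radical_index(unpaired, charge=0, parenthesise=False):
    """Return the right-upper-index text for a radical, e.g. '2•' or '•2-'."""
    if unpaired < 1:
        raise ValueError("a radical has at least one unpaired electron")
    # The multiplier (omitted when 1) precedes the dot.
    dot = (str(unpaired) if unpaired > 1 else "") + DOT
    if parenthesise:
        dot = "(" + dot + ")"
    if charge == 0:
        return dot
    magnitude = abs(charge)
    sign = "+" if charge > 0 else "-"
    # The dot, with its multiplier, precedes the charge.
    return dot + (str(magnitude) if magnitude > 1 else "") + sign


print(radical_index(2, -1, parenthesise=True))  # (2•)-
```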
The sign of optical rotation is placed in parentheses, the wavelength (in nm) being indicated
as a right subscript. The whole symbol is placed before the formula and refers to the sodium [...]
Examples:
1. (+)589[Co(en)3]Cl3
2. (−)589[Co{(−)NH2CH(CH3)CH2NH2}3]Cl3
Excited electronic states may be indicated by an asterisk as right superscript. This practice
does not differentiate between different excited states.
Examples:
1. He*
2. NO*
Structural descriptors such as cis-, trans-, etc., are listed in Table V. Usually such descriptors
are used as italicised prefixes and are connected to the formula by a hyphen.
Examples:
1. cis-[PtCl2(NH3)2]
2. trans-[PtCl4(NH3)2]
The descriptor µ designates an atom or group bridging coordination centres.
Example:
3. [(H3N)5Cr(µ-OH)Cr(NH3)5]5+
IR-4.8 REFERENCES
1. This is the so-called Hill order. See E.A. Hill, J. Amer. Chem. Soc., 22, 479 (1900).
2. For intermetallic compounds, earlier recommendations prescribed alphabetical ordering
rather than by electronegativity (see Section I-4.6.6 of Nomenclature of Inorganic
Chemistry, Recommendations 1990, Blackwell Scientific Publications, Oxford, 1990).
3. For example, the ordering of B and C in formulae was inconsistent in Nomenclature of [...]