
The Law of Anomalous Numbers

Author(s): Frank Benford


Reviewed work(s):
Source: Proceedings of the American Philosophical Society, Vol. 78, No. 4 (Mar. 31, 1938), pp.
551-572
Published by: American Philosophical Society
Stable URL: http://www.jstor.org/stable/984802 .
Accessed: 16/12/2012 16:19

This content downloaded on Sun, 16 Dec 2012 16:19:52 PM


All use subject to JSTOR Terms and Conditions
THE LAW OF ANOMALOUS NUMBERS
FRANK BENFORD
Physicist, Research Laboratory, General Electric Company,
Schenectady, New York
(Introduced by Irving Langmuir)
(Read April 22, 1937)

ABSTRACT

It has been observed that the first pages of a table of common logarithms show more wear than do the last pages, indicating that more used numbers begin with the digit 1 than with the digit 9. A compilation of some 20,000 first digits taken from widely divergent sources shows that there is a logarithmic distribution of first digits when the numbers are composed of four or more digits. An analysis of the numbers from different sources shows that the numbers taken from unrelated subjects, such as a group of newspaper items, show a much better agreement with a logarithmic distribution than do numbers from mathematical tabulations or other formal data. There is here the peculiar fact that numbers that individually are without relationship are, when considered in large groups, in good agreement with a distribution law; hence the name "Anomalous Numbers."
A further analysis of the data shows a strong tendency for bodies of numerical data to fall into geometric series. If the series is made up of numbers containing three or more digits the first digits form a logarithmic series. If the numbers contain only single digits the geometric relation still holds but the simple logarithmic relation no longer applies.
An equation is given showing the frequencies of first digits in the different orders of numbers 1 to 10, 10 to 100, etc.
The equation also gives the frequency of digits in the second, third, ... place of a multi-digit number, and it is shown that the same law applies to reciprocals.
There are many instances showing that the geometric series, or the logarithmic law, has long been recognized as a common phenomenon in factual literature and in the ordinary affairs of life. The wire gauge and drill gauge of the mechanic, the magnitude scale of the astronomer and the sensory response curves of the psychologist are all particular examples of a relationship that seems to extend to all human affairs. The Law of Anomalous Numbers is thus a general probability law of widespread application.

PART I: STATISTICAL DERIVATION OF THE LAW


It has been observed that the pages of a much used table of common logarithms show evidences of a selective use of the natural numbers. The pages containing the logarithms of the low numbers 1 and 2 are apt to be more stained and frayed by use than those of the higher numbers 8 and 9. Of course, no one could be expected to be greatly interested in the condition of a table of logarithms, but the matter may be considered more worthy of study when we recall that the table is used in the building up of our scientific, engineering, and general factual literature. There may be, in the relative cleanliness of the pages of a logarithm table, data on how we think and how we react when dealing with things that can be described by means of numbers.

PROCEEDINGS OF THE AMERICAN PHILOSOPHICAL SOCIETY, VOL. 78, NO. 4, MARCH 1938
Methods and Terms
Before presenting the data collected while investigating the possible existence of a distribution law that applies to numerical data in general, and to random data in particular, it may be well to define a few terms and outline the method of attack.
First, a distinction is made between a digit, which is one of the nine natural numbers 1, 2, 3, ... 9, and a number, which is composed of one or more digits, and which may contain a 0 as a digit in any position after the first. The method of study consists of selecting any tabulation of data that is not too restricted in numerical range, or conditioned in some way too sharply, and making a count of the number of times the natural numbers 1, 2, 3, ... 9 occur as first digits. If a decimal point or zero occurs before the first natural number it is ignored, for no attention is to be paid to magnitude other than that indicated by the first digit.
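The counting rule just described can be sketched in a few lines of Python. This is only an illustration of the stated method, not a procedure taken from the paper: the function names and the sample entries are hypothetical, and the numbers are assumed to arrive as text so that leading zeros, decimal points and thousands separators can simply be skipped.

```python
from collections import Counter
from typing import Iterable, List, Optional


def first_digit(value: str) -> Optional[int]:
    """Return the first nonzero digit of a number written as text, or None."""
    for ch in value:
        if ch in "123456789":  # zeros, signs, points and commas are ignored
            return int(ch)
    return None


def first_digit_counts(values: Iterable[str]) -> List[int]:
    """Count how often each of the digits 1..9 occurs as a first digit."""
    counts = Counter(d for d in map(first_digit, values) if d is not None)
    return [counts.get(d, 0) for d in range(1, 10)]


# Hypothetical entries standing in for one tabulation of data.
sample = ["0.00314", "27", "1,250", "905.5", "0.16", "1938"]
print(first_digit_counts(sample))
```

Dividing each count by the total number of entries gives the percentages tabulated for each group.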
The Law of Large Numbers
An effort was made to collect data from as many fields as possible and to include a variety of widely different types. The types range from purely random numbers that have no relation other than appearing within the covers of the same magazine, to formal mathematical tabulations that admit of no variation from fixed laws. Between these limits one will recognize various degrees of randomness, and in general the title of each line of data in Table I will suggest the nature of the source. In every group the count was continuous from the beginning to the end, or in the case of long tabulations, to a sufficient number of observations to insure a fair average. The number counted in each group is given in the last column of Table I.
TABLE I
PERCENTAGE OF TIMES THE NATURAL NUMBERS 1 TO 9 ARE USED AS FIRST
DIGITS IN NUMBERS, AS DETERMINED BY 20,229 OBSERVATIONS

                                        First Digit
   Title                  1     2     3     4     5     6     7     8     9  Count

A  Rivers, Area        31.0  16.4  10.7  11.3   7.2   8.6   5.5   4.2   5.1    335
B  Population          33.9  20.4  14.2   8.1   7.2   6.2   4.1   3.7   2.2   3259
C  Constants           41.3  14.4   4.8   8.6  10.6   5.8   1.0   2.9  10.6    104
D  Newspapers          30.0  18.0  12.0  10.0   8.0   6.0   6.0   5.0   5.0    100
E  Spec. Heat          24.0  18.4  16.2  14.6  10.6   4.1   3.2   4.8   4.1   1389
F  Pressure            29.6  18.3  12.8   9.8   8.3   6.4   5.7   4.4   4.7    703
G  H.P. Lost           30.0  18.4  11.9  10.8   8.1   7.0   5.1   5.1   3.6    690
H  Mol. Wgt.           26.7  25.2  15.4  10.8   6.7   5.1   4.1   2.8   3.2   1800
I  Drainage            27.1  23.9  13.8  12.6   8.2   5.0   5.0   2.5   1.9    159
J  Atomic Wgt.         47.2  18.7   5.5   4.4   6.6   4.4   3.3   4.4   5.5     91
K  n⁻¹, √n, ...        25.7  20.3   9.7   6.8   6.6   6.8   7.2   8.0   8.9   5000
L  Design              26.8  14.8  14.3   7.5   8.3   8.4   7.0   7.3   5.6    560
M  Digest              33.4  18.5  12.4   7.5   7.1   6.5   5.5   4.9   4.2    308
N  Cost Data           32.4  18.8  10.1  10.1   9.8   5.5   4.7   5.5   3.1    741
O  X-Ray Volts         27.9  17.5  14.4   9.0   8.1   7.4   5.1   5.8   4.8    707
P  Am. League          32.7  17.6  12.6   9.8   7.4   6.4   4.9   5.6   3.0   1458
Q  Black Body          31.0  17.3  14.1   8.7   6.6   7.0   5.2   4.7   5.4   1165
R  Addresses           28.9  19.2  12.6   8.8   8.5   6.4   5.6   5.0   5.0    342
S  n¹, n² ... n!       25.3  16.0  12.0  10.0   8.5   8.8   6.8   7.1   5.5    900
T  Death Rate          27.0  18.6  15.7   9.4   6.7   6.5   7.2   4.8   4.1    418

   Average             30.6  18.5  12.4   9.4   8.0   6.4   5.1   4.9   4.7   1011
   Probable Error      ±0.8  ±0.4  ±0.4  ±0.3  ±0.2  ±0.2  ±0.2  ±0.2  ±0.3

At the foot of each column of Table I the average percentage is given for each first digit, and also the probable error of the average. These averages can be better studied if the decimal point is moved two places to the left, making the sum of all the averages unity. The frequency of first 1's is then seen to be 0.306, which is about equal to the common logarithm of 2. The frequency of first 2's is 0.185, which is slightly greater than the logarithm of 3/2. The difference here, log 3 - log 2, is called the logarithmic interval. These resemblances persist throughout, and finally there is 0.047 to be compared with log 10/9, or 0.046.


The frequency of first digits thus follows closely the logarithmic relation

$$F_a = \log\left(\frac{a+1}{a}\right), \tag{1}$$

where $F_a$ is the frequency of the digit a in the first place of used numbers.
TABLE II
OBSERVED AND COMPUTED FREQUENCIES

Natural   Number     Observed    Logarithm   Observed     Prob. Error
Number    Interval   Frequency   Interval    - Computed   of Mean

1         1 to 2     0.306       0.301       +0.005       ±0.008
2         2 to 3     0.185       0.176       +0.009       ±0.004
3         3 to 4     0.124       0.125       -0.001       ±0.004
4         4 to 5     0.094       0.097       -0.003       ±0.003
5         5 to 6     0.080       0.079       +0.001       ±0.002
6         6 to 7     0.064       0.067       -0.003       ±0.002
7         7 to 8     0.051       0.058       -0.007       ±0.002
8         8 to 9     0.049       0.051       -0.002       ±0.002
9         9 to 10    0.047       0.046       +0.001       ±0.003
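The comparison in Table II can be reproduced with a short sketch that evaluates Eq. (1) with common logarithms and sets the result beside the observed averages from Table I. The helper name `benford` is an illustrative label, not the paper's notation.

```python
import math

# Observed average frequencies of first digits 1..9 (Table I, as fractions).
observed = [0.306, 0.185, 0.124, 0.094, 0.080, 0.064, 0.051, 0.049, 0.047]


def benford(a: int) -> float:
    """Eq. (1): logarithmic interval between a and a + 1, base 10."""
    return math.log10((a + 1) / a)


computed = [benford(a) for a in range(1, 10)]

for a, (obs, comp) in enumerate(zip(observed, computed), start=1):
    # Columns: digit, observed, computed, observed minus computed.
    print(f"{a}  {obs:.3f}  {comp:.3f}  {obs - comp:+.3f}")
```

Because the nine logarithmic intervals together span log 1 to log 10, the computed frequencies sum to unity.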

There is a qualification to be noted immediately, for Table I was compiled from numbers composed in general of four, five and six digits. It will be shown later that Eq. (1) is a distribution law for large numbers, and there is a more general equation that applies when considering numbers of one, two, ... significant digits.
If we may assume the accuracy of Eq. (1), we then have a probability law of the most general nature, for it is a probability derived from "events" through the medium of their descriptive numbers; it is not a law of numbers in themselves. The range of subjects studied and tabulated was as wide as time and energy permitted; and as no definite exceptions have ever been observed among true variables, the logarithmic law for large numbers evidently goes deeper among the roots of primal causes than our number system unaided can explain.
Frequency of Digits in the qth Position
The second-place digits are ten in number, for here we must take 0 into account. Also, in considering the frequency $F_b$ of a second-place digit b we must take into account the digit a that preceded it. The logarithmic interval between two digits is now to be divided into ten parts corresponding to the ten digits 0, 1, 2, ... 9. Let a be the first digit of a number and b be the second digit; then using the customary meaning of position and order in our decimal system a two-digit number is written ab, and the next greater number is written ab + 1.
The logarithmic interval between ab and ab + 1 is log (ab + 1) - log ab, while the interval covered by the ten possible second-place digits is log (a + 1) - log a. Therefore the frequency $F_b$ of a second-place digit b following a first-place digit a is

$$F_b = \log\left(\frac{ab+1}{ab}\right) \Big/ \log\left(\frac{a+1}{a}\right). \tag{2}$$

As an example, the probability $F_b$ of a 0 following a first-place 5 in a random number is the quotient

$$F_b = \log\frac{51}{50} \Big/ \log\frac{6}{5}.$$

It follows that the probability for a digit in the qth position is

$$F_q = \log\left(\frac{abc \ldots p(q+1)}{abc \ldots pq}\right) \Big/ \log\left(\frac{abc \ldots o(p+1)}{abc \ldots op}\right). \tag{3}$$

Here the frequency of q depends upon all the digits that precede it, but when all possible combinations of these digits are taken into account $F_q$ approaches equality for all the digits 0, 1, 2, ... 9, or

$$F_q \approx 0.1. \tag{4}$$

As a result of this approach to uniformity in the qth place the distribution of digits in all places in an extensive tabulation of multi-digit numbers will be also nearly uniform.


TABLE III
FREQUENCY OF DIGITS IN FIRST AND SECOND PLACES

Digit    First Place    Second Place

0        0.000          0.120
1        0.301          0.114
2        0.176          0.108
3        0.125          0.104
4        0.097          0.100
5        0.079          0.097
6        0.067          0.093
7        0.058          0.090
8        0.051          0.088
9        0.046          0.085
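The second-place column of Table III can be approximated with the sketch below. It rests on an assumption about how the column was obtained: the conditional frequency of Eq. (2) is weighted by the first-digit frequency of Eq. (1) and summed over the nine possible first digits, which reduces to a sum of logarithmic intervals log (ab + 1) - log ab. The function names are illustrative only.

```python
import math


def conditional_frequency(a: int, b: int) -> float:
    """Eq. (2): frequency of second digit b given first digit a."""
    ab = 10 * a + b  # the two-digit number written "ab"
    return math.log10((ab + 1) / ab) / math.log10((a + 1) / a)


def second_place_frequency(b: int) -> float:
    """Frequency of b in the second place, summed over first digits 1..9."""
    return sum(math.log10(1 + 1 / (10 * a + b)) for a in range(1, 10))


print(f"0 after 5: {conditional_frequency(5, 0):.4f}")
for b in range(10):
    print(b, f"{second_place_frequency(b):.3f}")
```

The spread between 0 and 9 in the second place is already much narrower than in the first place, in line with the approach to uniformity stated in Eq. (4).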

Reciprocals
Some tabulations of engineering and scientific data are given in reciprocal form, such as candles per watt, and watts per candle. If one form of tabulation follows a logarithmic distribution, then the reciprocal tabulation will also have the same distribution. A little consideration will show that this must follow, for dividing unity by a given set of numbers by means of logarithms leads to identical logarithms with merely a negative sign prefixed.
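A small numerical check of the reciprocal argument is sketched here. The test data are an assumption made for illustration: a geometric series of 10,000 terms spanning one decade, so that the logarithms are evenly spaced. Taking reciprocals only changes the sign of each logarithm, so the spacing stays even and the first-digit frequencies are unchanged.

```python
import math


def first_digit(x: float) -> int:
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])


def digit_frequencies(values):
    """Fraction of values whose first digit is 1, 2, ... 9."""
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    return [c / len(values) for c in counts]


n = 10_000  # illustrative sample size
series = [10 ** ((k + 0.5) / n) for k in range(n)]  # evenly spaced logarithms
reciprocals = [1 / v for v in series]               # same logarithms, sign reversed

law = [math.log10(1 + 1 / a) for a in range(1, 10)]
for a, (f, g, l) in enumerate(
    zip(digit_frequencies(series), digit_frequencies(reciprocals), law), start=1
):
    print(a, f"{f:.4f}", f"{g:.4f}", f"{l:.4f}")
```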
The Law of Anomalous Numbers
A study of the items of Table I shows a distinct tendency for those of a random nature to agree better with the logarithmic law than those of a formal or mathematical nature. The best agreement was found in the arabic numbers (not spelled out) of consecutive front page news items of a newspaper. Dates were barred as not being variable, and the omission of spelled-out numbers restricted the counted digits to numbers 10 and over. The first 342 street addresses given in the current American Men of Science (Item R, Table IV) gave excellent agreement, and a complete count (except for dates and page numbers) of an issue of the Readers' Digest was also in agreement.
On the other hand, the greatest variations from the logarithmic relation were found in the first digits of mathematical tables from engineering handbooks, and in tabulations of such closely knit data as Molecular Weights, Specific Heats, Physical Constants and Atomic Weights.
TABLE IV
SUMMATION OF DIFFERENCES BETWEEN OBSERVED AND THEORETICAL
FREQUENCIES

Rank  Item  Nature                       Summation

 1    D     Newspaper Items                 2.8
 2    F     Pressure Lost, Air Flow         3.2
 3    G     H.P. Lost in Air Flow           4.8
 4    R     Street Addresses, A.M.S.        5.4
 5    P     Am. League, 1936                6.6
 6    Q     Black Body Radiation            7.2
 7    O     X-Ray Voltage                   7.4
 8    M     Readers' Digest                 8.4
 9    A     Area Rivers                     9.8
10    T     Death Rates                    11.2
11    N     Cost Data, Concrete            12.4
12    S     n¹ ... n⁸, n!                  13.8
13    L     Design Data Generators         16.6
14    B     Population, U. S. A.           16.6
15    I     Drainage Rate of Rivers        21.6
16    K     n⁻¹, √n, ...                   22.8
17    H     Molecular Wgts.                23.2
18    E     Specific Heats                 24.2
19    C     Physical Constants             34.9
20    J     Atomic Weights                 35.4

These facts lead to the conclusion that the logarithmic law applies particularly to those outlaw numbers that are without known relationship rather than to those that individually follow an orderly course; and therefore the logarithmic relation is essentially a Law of Anomalous Numbers.
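The ranking measure of Table IV is not spelled out in the text. One plausible reading, assumed here, is the sum over the nine digits of the absolute difference between the observed percentage and the theoretical percentage rounded to one decimal. Under that assumption the sketch below reproduces the entries for items D, F and G from the Table I rows; it is not checked against every row.

```python
import math

# Theoretical percentages from Eq. (1), rounded to one decimal as in the tables.
theoretical = [round(100 * math.log10(1 + 1 / a), 1) for a in range(1, 10)]

# Observed percentages copied from Table I.
rows = {
    "D Newspapers": [30.0, 18.0, 12.0, 10.0, 8.0, 6.0, 6.0, 5.0, 5.0],
    "F Pressure": [29.6, 18.3, 12.8, 9.8, 8.3, 6.4, 5.7, 4.4, 4.7],
    "G H.P. Lost": [30.0, 18.4, 11.9, 10.8, 8.1, 7.0, 5.1, 5.1, 3.6],
}


def summed_difference(observed):
    """Assumed measure: sum of absolute differences, in percentage points."""
    return round(sum(abs(o - t) for o, t in zip(observed, theoretical)), 1)


for name, observed in rows.items():
    print(name, summed_difference(observed))
```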
PART II: GEOMETRIC BASIS OF THE LAW

The data so far considered have been composed entirely of used numbers; that is, numbers as they are used in everyday affairs. There must be some underlying causes that distort what we call the "natural" number system into a logarithmic distribution, and perhaps we can best get at these causes by first examining briefly the frequency of the natural numbers themselves when arranged in the infinite arithmetic series 1, 2, 3, ... n, where n is as large as any number encountered in use.
Let us assume that each individual number in the natural number system up to n is used exactly as often as every other individual number. Starting with 1, and counting up to 10,000, for example, 1 would have been used 1,112 times, or 11.12 per cent of all uses. If the count is extended to 19,999 there are 9,999 1's added, and first 1's occur in 55.55 per cent of the 19,999 numbers. When number 20,000 is reached there is a temporary stopping of the addition of first 1's and 90,000 of the other digits are added to the series before 1's are again brought into the series, at 100,000. At this point the percentage of 1's is again reduced to 11.112 per cent as illustrated in curve A of Fig. 2. This curve is $F_n$ and log n plotted to a semi-logarithmic scale. If the equations for A are written for the three discontinuous but connected sections 10,000-20,000, 20,000-99,999 and 99,999-100,000 the area under the curve will be very closely 0.30103, where the entire area of the frame of coordinates has an area 1. But an integration by the methods of the calculus is merely a quick way of adding up an infinite number of equally spaced ordinates to the curve and from this addition finding the average height of the ordinates and hence the area under the curve. But if we are satisfied with a result somewhat short of the perfection of the integral calculus we may take a finite number of equally spaced ordinates and by plain arithmetic come to practically the same answer. By definition each point of A represents the frequency of first 1's from 1 up to that point, and an integration (by calculus or arithmetic) under curve A gives the average frequency of first 1's up to 100,000. The finite number corresponding to equally spaced ordinates now represents a geometric series of numbers from 10,000 to 100,000, and it is substantially this series of numbers, in this and other orders of the natural number scale that lead to the numerical frequencies already presented.

FIG. 1. Comparison of observed and computed frequencies for multi-digit numbers.

FIG. 2. Linear frequencies of the natural number system between 10,000 and 100,000.
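The "plain arithmetic" route described above can be sketched as follows. The code counts how many of the natural numbers 1 to n begin with a given digit, then averages that running frequency over ordinates equally spaced on the logarithmic scale between 10,000 and 100,000. The function names and the choice of 2,000 ordinates are illustrative assumptions.

```python
def count_leading(digit: int, n: int) -> int:
    """Count the integers in 1..n whose first digit is `digit`."""
    total = 0
    power = 1
    while power * digit <= n:
        low = digit * power                       # e.g. 1000 for digit 1
        high = min((digit + 1) * power - 1, n)    # e.g. 1999, clipped at n
        total += high - low + 1
        power *= 10
    return total


def semilog_average(digit: int, low: int, high: int, steps: int = 2000) -> float:
    """Average running frequency over ordinates equally spaced in log n."""
    ratio = (high / low) ** (1 / steps)  # constant factor of the geometric series
    total = 0.0
    for k in range(steps):
        n = int(low * ratio ** (k + 0.5))
        total += count_leading(digit, n) / n
    return total / steps


print(count_leading(1, 10_000), count_leading(1, 19_999), count_leading(1, 100_000))
print(f"{semilog_average(1, 10_000, 100_000):.4f}")  # compare with log 2
print(f"{semilog_average(9, 10_000, 100_000):.4f}")  # compare with log 10/9
```

The ordinates used here are themselves a geometric series of numbers, which is the point the paragraph makes about the origin of the logarithmic frequencies.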
Curve B of Fig. 2 is for 9 as a first digit. The frequency of 9's decreases in the number range from 10,000 to 89,999 and then increases as 9's are added from 90,000 to 99,999, and an integration under curve B leads to a good numerical approximation to the logarithmic interval log 10 - log 9, as called for by the previous statistical study.

Geometric and Logarithmic Series
The close relationship of a geometric series and a logarithmic series is easily seen and hardly needs formal demonstration. The uniformly spaced ordinates of Fig. 2 form a geometric series of numbers, for these numbers have a constant factor between adjacent terms, and this constant factor is determined in size by the constant logarithmic increment.
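As an illustration of the link between a constant factor and the logarithmic law, the sketch below tallies the first digits of a long geometric series. The factor 2 and the length of 1,000 terms are arbitrary choices made for this example, not values from the paper.

```python
import math
from collections import Counter

# Geometric series 1, 2, 4, 8, ... with constant factor 2 (illustrative choice).
terms = [2 ** k for k in range(1000)]

# Python integers are exact, so the first digit can be read from the decimal string.
counts = Counter(int(str(t)[0]) for t in terms)
observed = [counts[d] / len(terms) for d in range(1, 10)]
expected = [math.log10(1 + 1 / d) for d in range(1, 10)]

for d, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(d, f"{obs:.3f}", f"{exp:.3f}")
```

Each term adds the same increment, log 2, to the logarithm of the one before it, so the fractional parts of the logarithms spread almost evenly over the interval from 0 to 1.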

Semi-Log Curves
A geometric series of numbers plotted to a semi-logarithmic scale gives a straight line. In the original tabulation of observed numbers the line of data marked "R" is designated simply as "street addresses." These are the street addresses of the first 342 people mentioned in the current American Men of Science. The randomness of such a list is hardly to be disputed, and it should therefore be useful for illustrative purposes.
In Fig. 3 these addresses are first indicated by the height of the lines at the base of the diagram. The height of a line, measured on the scale at the left, indicates the number of addresses at, or near, that street number. Thus there were five addresses at No. 29 on various streets. In order to make the trend clearer, the heights of these lines were summed, beginning at the left and proceeding across to the right. It was found that four straight lines could be drawn among these summation points with fair fidelity of trend, and these four lines represent four geometric series, each with a different factor between terms. Each line will give the observed frequency over the numerical range it covers, and hence satisfies the logarithmic relationship.

FIG. 3. Distribution and summation of first 342 street addresses, American Men of Science.

The Natural Numbers and Nature's Numbers
In natural events and in events of which man considers himself an originator there are plenty of examples of geometric or logarithmic progressions. We are so accustomed to labeling things 1, 2, 3, 4, ... and then saying they are in natural order that the idea of 1, 2, 4, 8, ... being a more natural arrangement is not easily accepted. Yet it is in this latter manner that a surprisingly large number of phenomena occur, and the evidence for this is available to everyone.
First, let us consider the physiological and psychological reaction to external stimuli.
The growth of the sensation of brightness with increasing illumination is a logarithmic function, as illustrated by Fechner's Law. The growth of sensation is slow at first while the rods of the retina are alone responsive, and a straight line on semi-logarithmic paper (the stimulus being on the logarithmic scale) can represent the intensity-brightness function in this region. When the cones come into action there is a sharp change in the rate of growth, and another straight line represents our working range of vision. When over-excitation and fatigue set in, a third line is needed; and thus three geometric series could be used to state the relation between illumination and the sensation of brightness. If the literature contained sufficient numerical references, the brightness function should give an extremely close approximation of the logarithmic law of distribution.
The sense of loudness follows the same rules, as does the sense of weight; and perhaps the same laws operate to make the sense of elapsed time seem so different at ages ten and fifty.
Our music scales are irregular geometric series that repeat rigidly every octave.
In the field of medicine, the response of the body to medicine or radiation is often logarithmic, as are the killing curves under toxins and radiation.


In the mechanical arts, where standard sizes have arisen from years of practical experience, the final results are often geometric series, as witness our standards of wire diameters and drill sizes, and the issued lists of "preferred numbers."
The astronomer lists stars on a geometric brightness scale that multiplies by 100 every five steps and the illuminating engineer adopts the same type of series in choosing the wattage of incandescent lamps.
In the field of experimental atomic physics, where the results represent what occurs among groups of the building units of nature, and where the unit itself is known only by mass action, the test data are statistical averages. The action of a single atom or electron is a random and unpredictable event; and a statistical average of a group of such events would show a statistical relationship to the results and laws here presented. That this is so is evidenced by the frequent use made of semi-log paper in plotting the test data, and the test points often fall on one or more straight lines. The analogy is complete, and one is tempted to think that the 1, 2, 3, ... scale is not the natural scale; but that, invoking the base e of the natural logarithms, Nature counts

$$e^{0},\ e^{x},\ e^{2x},\ e^{3x},\ \ldots$$

and builds and functions accordingly.

PART III: DIGITAL ORDERS OF NUMBERS

The natural number system is an array of numbers in simple arithmetic series, but on top of this we have imposed an idea taken from a geometric series. Numbers composed of many digits are ordinarily separated into groups of three digits by interposing commas, and here we unknowingly give evidence of the use of these numbers on a geometric scale.
For convenience of description the natural numbers 1 to 10 are called the first digital order numbers, those from 10 to 100 the second digital order, etc. It will be noted that 10 is both the last number of the first order and the first number of the second order, and when an integration is carried out, as will be done later, 10 appears as both an upper and a lower limit, and it is thus used in this case as a boundary line rather than a unit zone in the natural number system.
In Fig. 4 the curves show the frequency with which the natural numbers occur in the Natural Number System. Beginning at the left edge, where 1 is the only number, its frequency is 1; that is, until a second number is added 1 is the entire number system. When 2 is reached the frequency is 0.50 for 1 and 0.50 for 2. At 5, for example, the frequency for each of the first 5 digits is 0.20, and the equal division continues until 9 is reached. At 10, the digit 1 has appeared twice and has a frequency of 0.20 against 0.10 for each of the other eight digits that have appeared but once.
It will be observed that the curve rising from 9 on the scale of abscissae is for only the digit 1, while the curve continuing downward from 9 is for the digits 2 to 9 inclusive. At 19 the frequency curve for 2 rises to join the curve for 1 at 29 and 1 and 2 have a common curve until 99 is reached and a third first 1 is about to be added to the series. At any ordinate the curves therefore tell the frequency of the total number of natural numbers up to that point.

FIG. 4. Linear frequencies of the natural numbers in the first three orders.

The curves are drawn as if we were dealing with continuous functions in place of a discontinuous number system. The justification for using a continuous form is that the things we use the number system to represent are nearly always perfectly continuous functions, and the number, say 9, given to any phenomenon will be used in some degree for all the infinite sizes of phenomena between 8 and 10 when we confine ourselves to single digit numbers.

FIG. 5. Continuous and discontinuous functions in the neighborhood of the digit 9.


An enlarged sketch of the linear frequency curves at the junction of the first and second orders is given in Fig. 5. The lines h-b and b-j are the computed ratios of 1 in this region, while the line 8-b for the ratio of 9 begins at 8, for as soon as size 8 is passed there is a possibility of our using a 9, while for size 8½ the chances are about equal for calling it either 8 or 9. The summation of area under the curve 8-b-c is taken as the probability of using a 9 for phenomena in this region. This is about equivalent to knowing accurately the size of all phenomena in this region and deciding to call everything between 8.5 and 9.5 by the number 9. Once 9 is passed the curve for 1, b-j, begins to rise in anticipation of the phenomena between 9 and 10 that will be called 10.

FIG. 6. Theoretical and observed frequencies of single digits. (Observed: frequency of footnotes in 10 books each having at least one page with ten footnotes; 2,968 observed.)
It has been noted that for high orders of numbers the areas under the curves of Fig. 2 are proportional to the frequency of use of the first digit. The same demonstration will now be made with the aid of the calculus in regions that are markedly discontinuous.
Selecting the third digital order, Fig. 4, the area under the 1-curve can be written

$$A_1''' = \int_{100}^{199} y_1\,dx + \int_{199}^{999} y_2\,dx + \int_{999}^{1000} y_3\,dx, \tag{5}$$

where the ordinates of the first rising section of the curve are

$$y_1 = \frac{a - 88}{a}. \tag{6}$$

The descending section of the curve has ordinates

$$y_2 = \frac{111}{a}, \tag{7}$$

and the last rising section between 999 and 1,000 has ordinates

$$y_3 = \frac{a - 888}{a}. \tag{8}$$

The curves are plotted to semi-logarithmic coordinates and

$$x = \log_e a, \tag{9}$$

$$dx = \frac{da}{a}. \tag{10}$$

The integrals after making these substitutions give the value

$$A_1''' = \log_e \frac{1990}{999} + \frac{8}{1000}.$$

A similar operation yields for the 1-curve in the second digital order

$$A_1'' = \log_e \frac{190}{99} + \frac{8}{100},$$

and in the first order

$$A_1' = \log_e \frac{10}{9} + \frac{8}{10}.$$

From the symmetry running through these solutions and from the solutions for the eight other first digits, we can write the general equation for the Law of Anomalous Numbers

$$F_1^{r} = \left[\log_e \frac{10\,(2 \cdot 10^{r-1} - 1)}{10^{r} - 1} + \frac{8}{10^{r}}\right] \frac{1}{N}, \qquad F_a^{r} = \left[\log_e \frac{(a+1)\,10^{r-1} - 1}{a \cdot 10^{r-1} - 1} - \frac{1}{10^{r}}\right] \frac{1}{N} \quad (a \neq 1), \tag{11}$$

where $N = \log_e 10$ is the factor to convert the expressions from the natural logarithm system, base e, to the common logarithm system, base 10.
If high orders of r are considered, as was unwittingly done in the original statistical work, these expressions simplify by dropping the terms - 1 in both numerator and denominator, and the numerical terms having $10^r$ in the denominator become negligible. Hence the general equations become

$$F_1^{r} = \log_{10} \frac{2}{1}, \tag{12}$$

$$F_a^{r} = \log_{10} \frac{a+1}{a}, \qquad a \neq 1, \tag{13}$$

but these two expressions no longer have a difference in form, and they may be merged into

$$F_a^{r} = \log_{10} \frac{a+1}{a}, \tag{14}$$

which was the relationship originally observed for multi-digit numbers.
In Table V numerical values are given for the theoretical frequencies of used numbers for the first, second, third and limiting digital orders.


TABLE V
THEORETICAL FREQUENCIES IN VARIOUS DIGITAL ORDERS

First    First Order   Second Order   Third Order    Limiting
Digit    1 to 10       10 to 100      100 to 1000    Order

1        0.39319       0.31786        0.30276        0.30103
2        0.25760       0.17930        0.17638        0.17609
3        0.13266       0.12432        0.12487        0.12494
4        0.08152       0.09479        0.09669        0.09691
5        0.05348       0.07631        0.07889        0.07918
6        0.03575       0.06366        0.06662        0.06695
7        0.02352       0.05444        0.05764        0.05799
8        0.01456       0.04742        0.05078        0.05115
9        0.00772       0.04190        0.04537        0.04576
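The entries of Table V can be recomputed from the general equation for the digital orders. The sketch below assumes the following reading of that equation: for the digit 1 in order r, the logarithmic term is log_e of 10(2·10^(r-1) - 1)/(10^r - 1) with 8/10^r added; for every other digit a it is log_e of ((a+1)·10^(r-1) - 1)/(a·10^(r-1) - 1) with 1/10^r subtracted; both are divided by N = log_e 10. The function name is illustrative.

```python
import math

N = math.log(10)  # converts natural logarithms to common logarithms


def order_frequency(a: int, r: int) -> float:
    """Frequency of first digit a in digital order r (assumed reading of Eq. 11)."""
    scale = 10 ** (r - 1)
    if a == 1:
        log_term = math.log(10 * (2 * scale - 1) / (10 * scale - 1))
        return (log_term + 8 / 10 ** r) / N
    log_term = math.log(((a + 1) * scale - 1) / (a * scale - 1))
    return (log_term - 1 / 10 ** r) / N


for a in range(1, 10):
    row = [order_frequency(a, r) for r in (1, 2, 3)]
    limit = math.log10((a + 1) / a)  # limiting order
    print(a, " ".join(f"{v:.5f}" for v in row), f"{limit:.5f}")
```

In each order the nine frequencies sum to unity, since the logarithmic terms telescope to log_e 10 and the added 8/10^r cancels the eight subtracted terms 1/10^r.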

The frequencies of the single digits 1 to 9 vary enough from the frequencies of the limiting order to allow a statistical test if a source of digits used singly can be found. The footnotes so commonly used in technical literature are an excellent source, consisting of units that are indicated by numbers, letters or symbols.
The procedure of collecting data for the first-order numbers was to make a cursory examination of a volume to see if it contained as many as 10 footnotes to a page, for obviously no test of the range 1 to 9 could be made if the maximum number fell short of the full range. The numbers here recorded in Table VI are the number of footnotes observed on consecutive pages, beginning on page 1 and continuing to the end of the book, or until it seemed that a fair sample of the book had been obtained. The books used were the Standard Handbook for Electrical Engineers, Smithsonian Physical Tables, Handbuch der Physik and Glazebrook's Dictionary of Applied Physics.
In Table VI the observed percentages of single digits 1 to 9 are given along with the number of pages used in each volume and the number of footnotes observed. The frequency for 1 is seen to be 43.2 per cent as against the theoretical frequency of 39.3 per cent, and for the digit 9 the observations agree with theory with $F_9' = 0.8$ per cent.


In general the agreement with theory is as good as the computed probable errors of the observation.
TABLE VI
COUNT OF FOOTNOTES

                       Pages          Frequencies, in Per Cent                       Total
    Volume             Used      1     2     3     4     5     6     7     8     9   Count

 1. S. H. E. E.        All    55.1  22.7  12.3   5.0   2.4   1.7   0.3   0.3   0.3     586
 2. Sm. Phy. Ta.       All    56.3  22.1   6.6   6.1   5.0   2.2   1.1   0.6   0.0     181
 3. H. der Phy.        360    52.8  23.6   8.5   5.5   4.0   3.2   0.8   0.0   1.6     127
 4. H. der Phy.        360    37.2  25.7  12.1   9.5   4.8   5.2   2.6   2.2   0.9     230
 5. H. der Phy.        365    29.7  26.6  14.6  11.0   8.0   5.9   1.8   1.0   1.4     287
 6. H. der Phy.        361    19.5  17.4  17.7  11.9  11.3   9.2   6.1   5.8   1.1     293
 7. H. der Phy.        360    33.0  27.5  11.8  10.7   4.3   5.9   2.8   2.4   1.6     254
 8. H. der Phy.        360    56.8  23.2   6.7   7.6   2.4   1.4   0.5   1.4   0.0     211
 9. Glazebrook I       All    49.5  22.3  13.7   6.9   2.3   1.5   1.5   1.5   0.8     394
10. Glazebrook V       All    41.7  25.2  13.4   9.1   4.7   3.2   1.7   0.5   0.5     405

    Observed Ave.             43.2  23.6  11.8   8.3   4.9   3.9   1.9   1.6   0.8    2968
    Predicted Ave.            39.3  25.7  13.3   8.1   5.3   3.6   2.4   1.5   0.8
    Difference                +3.9  -2.1  -1.5  +0.2  -0.4  +0.3  -0.5  +0.1   0.0
    Probable Error            ±3.0  ±0.6  ±0.7  ±0.5  ±0.6  ±0.5  ±0.4  ±0.4  ±0.4

Summation of Frequencies
One of the conditions that must be met by these expressions for the frequencies of the integers is that, in any one order, the sum of the frequencies must equal unity; that is, the sum of their probabilities must equal certainty.
Selecting the first-order digits, Eq. 11, and remembering the logarithmic rule that the sum of the logarithms of a group of numbers is equal to the logarithm of their combined products, we have the probability P'

$$P' = \log_{10} \frac{10 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8 \cdot 9}{9 \cdot 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8} + \left[\frac{8}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10}\right] \frac{1}{N},$$

which reduces to

$$P' = \log_{10} 10 + 0 = 1.$$

In a similar manner from the complete set of equations indicated by Eq. 11 we have

$$P'' = \log_{10} \frac{190 \cdot 29 \cdot 39 \cdot 49 \cdot 59 \cdot 69 \cdot 79 \cdot 89 \cdot 99}{99 \cdot 19 \cdot 29 \cdot 39 \cdot 49 \cdot 59 \cdot 69 \cdot 79 \cdot 89} + \left[\frac{8}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100}\right] \frac{1}{N}$$

$$= \log_{10} 10 + 0 = 1,$$

and similar proof can be worked out for the other orders.

Summary of Part III
Single digits, regardless of their relation to the decimal point and also regardless of preceding or following zeros, have a specific natural frequency that varies sharply from the logarithmic ratios. The second digital order, which is composed of two adjacent significant digits, has a specific frequency approximating the logarithmic frequency; and for three or more associated digits the variation from the latter frequency would be extremely difficult to find statistically.
The basic operation

F=f
F fda
a
or
F_ _a
a
in converting from the linear frequency of the natural numbers to the logarithmic frequency of natural phenomena and human events can be interpreted as meaning that, on the average, these things proceed on a logarithmic or geometric scale. Another way of interpreting this relation is to say that small things are more numerous than large things, and there is a tendency for the step between sizes to be equal to a fixed fraction of the last preceding phenomenon or event. There is no necessity or implication of limits at either the upper or the lower regions of the series.
If the view is accepted that phenomena fall into geometric series, then it follows that the observed logarithmic relationship is not a result of the particular numerical system, with its base, 10, that we have elected to use. Any other base, such as 8, or 12, or 20, to select some of the numbers that have been suggested at various times, would lead to similar relationships; for the logarithmic scales of the new numerical system would be covered by equally spaced steps by the march of natural events. As has been pointed out before, the theory of anomalous numbers is really the theory of phenomena and events, and the numbers but play the poor part of lifeless symbols for living things.
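The remark about other bases can be illustrated with a sketch that carries the limiting-order relation over to an arbitrary base. The formula used, the logarithm to base b of (d + 1)/d for the first digits d = 1 ... b - 1, is an assumption about what "similar relationships" would look like; the paper itself works only in base 10.

```python
import math


def first_digit_law(base: int):
    """Assumed limiting first-digit frequencies for digits 1..base-1 in `base`."""
    return [math.log((d + 1) / d, base) for d in range(1, base)]


for base in (8, 10, 12, 20):
    law = first_digit_law(base)
    # The intervals from 1 to base telescope, so the frequencies sum to unity.
    print(base, f"first digit 1: {law[0]:.4f}", f"sum: {sum(law):.4f}")
```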
