# Number 1 and Benford’s Law – Numberphile

STEVE MOULD: This is
Benford’s Law. And it’s about numbers, but it’s
about the leading digit. For example, you could look at
the populations of all the countries in the world and
look at the leading digits of all those. So for example, if it was 1,269,
then the leading digit in that case is the one. Benford’s Law works on a
distribution of numbers if that distribution spans quite
a few orders of magnitude. And the brilliant thing about
populations of countries is that it actually goes from
tens up to billions. If you were to think about
that, OK, what is the distribution of leading
digits? So some of the populations will
start with the one, some will start with two, three,
four, five, six, seven, eight, or nine. And so there are nine possible
leading digits. And you might imagine that each
one of those possible leading digits is equally
likely to appear. So that’s one in nine– 11%. And if I was to plot that on a
graph, you might expect that to fluctuate around 11%. So it’s going to go like that. So what actually happens is
that a third of the time, that’s up here. A third of the time the
number you choose will start with a one. And it will hardly ever
start with a nine. So nine is down here– tiny number. And then you get this
brilliant curve that goes up like that. Isn’t that crazy? BRADY HARAN: I know you talk
about this sometimes in talks and things you do. What’s the reaction to
that normally when you tell people this? STEVE MOULD: The reaction? The noise is sort
of like this– ohh. And there’s a certain
amount of disbelief sometimes as well. And the way we do it actually
in the show is that we get people to tweet numbers to us. So we’re collecting numbers, and
I try to give them ideas. So maybe like, take the distance
from the venue to where they live and convert that
into some strange units. Or something like that. The interesting thing is, like I
was saying, it works so long as the distribution you’re
choosing from spans loads of orders of magnitude. But if you’re picking numbers
from lots of different distributions, the individual
distributions don’t have to span lots of orders
of magnitude. The meta-distribution of
individual things picked from different distributions follows
Benford’s Law anyway. So it works brilliantly well. BRADY HARAN: What clump
of numbers will this not work for? STEVE MOULD: Human
height in meters. So humans are between one
meter and three meters. So it doesn’t work for that. You get a massive load
around there. And no one’s nine meters tall. Anything that has that short
distribution, it doesn’t work for. But it does work for several
distributions put together that don’t necessarily
individually follow the rule. So I did it for populations. I did it for areas of countries in kilometers squared. If you take one number and
convert it to loads of different units, that
will tend to follow Benford’s Law as well. You can do it for the
Financial Times. Look at all the numbers
on the front cover of the Financial Times. They will tend to follow
Benford’s Law as well. BRADY HARAN: Just a quick
interjection– you can also apply this to the
number of times you watch Numberphile videos or leave
comments underneath. More information at the
end of the video. STEVE MOULD: So the explanation
is to do with scale invariance, which
I’m just getting my head around now. But there are a couple
of intuitive ways of understanding it. One of them is to use the
idea of a raffle. To begin with, it’s a
very small raffle. So there are only two tickets
in this raffle. What are the chances of the
winning ticket in this raffle having a leading digit of one? Well, that’s this one. So it’s one in two. It’s 50%. But then if you increase the
size of the raffle, so there are now three tickets in the
raffle, the chances now are one in three or about 33%. If you add a fourth ticket, then
the probability of the leading digit of the winning
ticket being a one is now 25%, and then 20%, and so on and so
on until you have a raffle with nine tickets in it. And then the probability of the
winning ticket having a leading digit of one
is one in nine. It’s 11%, which was the
intuitive thing that you might think. But then you add your
tenth ticket. And now there are two tickets
that start with a one. So now the probability
is 2 in 10 or 1 in 5. So it would go back up to 20%. The probability will go up, and
up, and up as you add more tickets that start with a one. And once you have a raffle with
19 tickets in it, you’re up to something like 58%. And then you add the
20th ticket. And so the probability
goes down again. So the probability of the
winning ticket having a leading digit of one will go
down, and down, and down through the 20s. It will go down through the
40s, down through the 50s, 60s, 70s, 80s, 90s, until you
add the hundredth ticket. And then the probability will
start to go up again. And then the probability will go
up, and up, and up, all the way through the 100s. And then you get to the 200s,
and it goes down, and down, and down through all the 200s,
300s, 400s, 500s, 600s, 700s, 800s, 900s. And you’ll be back to
11% again then. Then you add the thousandth
ticket. And the probability will
start to go up again. So the probability goes up,
and up, and up through the thousands and then down through
the 2000s, 3000s, blah, blah, blah. And then you get to 10,000
and it goes up. And so basically the probability
of the winning ticket having a leading digit of one
fluctuates as the size of the raffle increases. And so this is a log scale of
the raffle increasing in size. So you might have a 10, 100,
1,000, 10,000, and so on. And then this is the probability
here of having a leading digit of one. It goes like that. What Frank Benford realized was
that if you pick a number from a distribution that spans
loads of orders of magnitude, or if you pick a number from
the world and you don’t necessarily know what the
distribution is in advance, then it’s like picking a ticket
from a raffle when you don’t know the size
of the raffle. So you have to take the average
of this wiggly line, which is what he did. So that’s the average there. And it’s around 30%. There’s a formula for it, which
is the probability of picking a number with a
particular leading digit of d is equal to log to base 10
of 1 plus 1/d, like that. And so that’s how you do it. And if you plug one into
there, then it’s log base 10 of two. And it ends up being
about 30%. The beauty is that you can
do it in any base. So this doesn’t have
to be base 10. It could be base five, base 16,
whatever you want to do. You can apply Benford’s Law
to different bases. This is a formula that a
forensic accountant would use as a tax formula or something
like that. If you’re making up numbers in
your accounts and the numbers you make up don’t follow
Benford’s Law, then that’s a clue that you might
be cheating. So this is a formula you need to
remember if you’re going to cheat on your tax return. BRADY HARAN: A lot of things
that mathematically inclined people like yourself tell me
when I hear about them seem counter-intuitive. And then you cleverly explain
why it works the way it works. This is one of the few things
that when I’ve heard about it, this just seems logical to me. When someone says one will come
up more often, to me that just seems like, of course
that would happen. STEVE MOULD: Yes. Funny isn’t it? Some people are like you. I would say you’re in the
minority of people that go, well, yeah. And I wonder if there is another
intuitive way of looking at it that you’ve tapped
into, which is that if you imagine something like
the NASDAQ index or something like that– and I don’t know what the NASDAQ
index is size-wise– but imagine that the NASDAQ
index is at 1,000. To change that to 2,000, you’d
have to double it. So the NASDAQ index would have
to increase by 100% to get from something that starts with
a one to something that starts with a two. So that’s quite a big change. But if the NASDAQ index was
on 9,000 and you wanted to increase it to 10,000, then
that’s an 11% increase. So it’s hardly anything. So basically, you don’t really
hang around the nines. As things are growing and
shrinking, you don’t hang around, whereas you do
hang around the ones. And maybe that’s intuitive
to you. So you’re like, yeah
obviously. BRADY HARAN: If you’d like to
see even more about Benford’s Law, we’ve done a bit of a
statistical analysis to find out whether or not your viewing
habits and the number of times you comment on
Numberphile videos are following the Benford curve. The link is below this video
or here on the screen. So why don’t you check it out?
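
The formula and the raffle argument from the video can be checked numerically. The sketch below is only an illustration and is not taken from the video: the function names `benford_probability` and `raffle_fraction` are hypothetical, and it assumes raffle tickets are numbered 1 to n with every ticket equally likely to win.

```python
import math
from fractions import Fraction


def benford_probability(digit: int, base: int = 10) -> float:
    """Benford's Law: probability of a leading digit is log_base(1 + 1/digit)."""
    if not 1 <= digit < base:
        raise ValueError("digit must be between 1 and base - 1")
    return math.log(1 + 1 / digit, base)


def raffle_fraction(tickets: int) -> Fraction:
    """Exact fraction of tickets numbered 1..tickets that start with a 1."""
    ones = sum(1 for n in range(1, tickets + 1) if str(n)[0] == "1")
    return Fraction(ones, tickets)


if __name__ == "__main__":
    # Benford probabilities for leading digits 1..9 in base 10.
    for d in range(1, 10):
        print(d, f"{benford_probability(d):.1%}")
    # The raffle sizes mentioned in the video, plus a few larger ones.
    for size in (2, 3, 9, 10, 19, 99, 199, 999):
        print(size, f"{float(raffle_fraction(size)):.1%}")
```

The raffle fractions match the figures quoted above: 1/2, 1/3 and 1/9 for two, three and nine tickets, then 2/10 for ten tickets and 11/19 (about 58%) for nineteen. Passing a different `base` to `benford_probability` gives the same law in another base.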

## 100 thoughts on “Number 1 and Benford’s Law – Numberphile”

• ### Niscate Post author

I want Steve to play a new character in Game of Thrones.

• ### Steve Frandsen Post author

Great video. I had to comment so I don't follow Benford's law anymore in commenting.

• ### Dj Sushi Post author

2:22 – Notice "Mahematical constants"

Zipf…

• ### Cacyo Mattos Nunes Post author

Well, Benford's law certainly doesn't apply for the first digit of people's heights… but it must apply for the last one!

• ### Zakatos Post author

To me it's intuitive that, in any growth process of units, once you hit a 9-leading, it's tending soon to overflow to the next decimal position, so a 9 "attracts" a 1. I know this doesn't prove anything strictly, but that's my impression.
Another way of thinking it: If you have some arbitrary amount of units whatsoever, and the amount is represented by x decimal (or any other base's) positions, it's most likely you got from x-1 to x by reaching a new "1", and then all the process along this new order of magnitude tends to be closer and closer to the next "1" of the next position.

It's the 1aw

• ### monkerud2108 Post author

understood it at 3:00 the 10-19 numbers are a larger fraction of the total sum of all the numbers themselves, than say a 2 diggit number starting with 7 so 70-79.

• ### Hei Post author

0, 1, 2, 6, 15, 40, 104, 273, 714, 1870, 4895, 12816, 33552, 87841, 229970

• ### Zakaria Azami Post author

Why can't we apply the raffle demonstration example to any digit in {2, …, 9}?

• ### Steven McKeating Post author

https://oeis.org/webcam Will probably show that the terms of random sequences on average will follow Benford's law.

• ### Clifford Thompson Post author

Numberphile is the best!

• ### Ty O'Brien Post author

It also works if you spell out the number. Crazy!

• ### Bag Man Post author

zipf's mystery!

• ### Amoura Bidi Post author

3:20 that ancient mac

• ### Simon Chan Post author

This is very interesting but I have a question – if I was to buy a raffle ticket, would it be wiser for me to buy one that starts with a one given Benford's law if I assume that approximately 150 people in the office buy the raffle tickets?

• ### Carbon Scythe Post author

So what does it mean if something doesn't follow Benford's Law? I know it can be used to find fraud but what else? Can you somehow use it to figure out the possibility that it is actually fraud that is going on?

• ### Nexus Clarum Post author

8:05… that was my instinctual way of looking at it.

• ### Austin Liu Post author

Does Benford's Law remain true for different number bases? If you took data that conformed to Benford's law in Base 10, and converted it to Base 7, or Base 9, would it still conform to the law? What about higher bases?

• ### Zack Post author

I feel that it happens Because count Zero goes notice , first round & the 1 Take the credit on every Cycle of Increment. Example : 0,1,2,3- 10,11,12,13
just a humble opinion

• ### Garrett Van Cleef Post author

County-by-county votes by state in the 2016 US election? What should this show? Please do a Numberphile video on this relative to Benford's Law! Great stuff!

• ### AbstrctIgwana84 Post author

Benford's Law makes sense, because when you go up a place, it starts with 1, so it gets the boost in percentage from the new digit first.

• ### The Real Flenuan Post author

It makes sense once you visualise a logarithmic scale, where the 1s take up a third of the space and the other numbers gradually compress until the 9 is practically a tenth the size of the next 1 that will follow, etc.

• ### Atlas WalkedAway Post author

This seems like a painfully obvious principle.

• ### Daniel Arrizza Post author

The way I reason about it is that when you stop counting, you're chopping off the later numbers, leaving you with more early numbers, especially 1.

• ### Анатолий Петровский Post author

Hi! Here ( https://youtu.be/XXjlR2OK1kM?t=357 ) you try to describe Benford's law, but we can see the same image for 2, 3, 4, 5, 6…. and we can do the same calculations for other digits probabilities in number sequence and get the same conclusion, so I don't understand why this law works.

Sorry for my english, this is not my native language 🙂

• ### Mazahir Mammadli Post author

1098 comments

• ### Danilego Post author

2:00 He made the pi symbol with his forehead

• ### Hamzah Husain Post author

that's crazy that bank people use this

• ### mydogiscalledoscar Post author

Was it a nice wedding?

• ### rbapf Post author

I tried the formula, and I noticed that, in base 2^n, the probability that a number will start with a 1 is 1/n. I guess it's just the reciprocal of logs in base 2.
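
The 1/n pattern follows directly from the formula quoted in the video with d = 1, using the change-of-base rule for logarithms. This is a short derivation for checking the observation, with P(1) used here as shorthand for the probability of a leading 1:

```latex
$$
P(1) = \log_{2^n}\!\left(1 + \tfrac{1}{1}\right)
     = \log_{2^n} 2
     = \frac{\ln 2}{\ln 2^n}
     = \frac{\ln 2}{n \ln 2}
     = \frac{1}{n}
$$
```

For n = 1 (binary) this gives 1, which matches the fact that every nonzero binary number starts with a 1.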

• ### ATMunn Post author

We Are Number 1 but it's Benford's Law

• ### dwaltrip77 Post author

Another way of looking at it: Each of the numerals only have an equal shot at being the leading digit if the distribution stops RIGHT before the next power of 10. Some examples of this would be if the distribution was from 1 to 9, or from 1 to 99, or 1 to 999, etc.

If there is any extra "spare change" added to the size of the distribution, such as 1 to 99 being increased so that it now goes from 1 to 135, then that means there are 36 extra spots in the distribution that start with 1 (i.e. 100, 101, 102… 134, 135). These new options now make the number 1 a much more likely choice to be the leading digit.

Distributions never end nicely right before the next order of magnitude, as the point right between orders of magnitudes is just an arbitrary spot on the number line [1]. This means there is almost always some "spare change".

As 1 is the first number, it is the most likely to be part of the "spare change". When we go from one order of magnitude to the next, we go from the 9's to the 1's (99 –> 100, 999 –> 1000, etc). So, if the distribution crosses multiple orders of magnitudes, 1 always has the best shot at being the leading digit.

2 is next in line after we leave the 1's — 199 goes to 200, 1999 goes to 2000, etc. Thus, 2 has the 2nd best odds of being the leading digit. And then of course the same logic applies for 3, all the way down to 9 being the least likely.

This gives us the shape of the graph in the video!

[1] If we switch from base 10 to some other base, the points on the number line that mark the crossover from one order of magnitude to the next will all switch! But the number line itself hasn't changed — we are simply relabeling the positions.

• ### Micayah Ritchie Post author

this reminds me of VSauce's video on zipf law

• ### Astfresser Post author

I'm 9m tall and i find this offensive.

• ### Taylor Sabbag Post author

Why can't the same principle be applied to other numbers through 9? For example, 2s, 20s, 200s…etc, why wouldn't that be just as applicable?

• ### AlxndrJG Post author

this doesn't work with memes…..

"choose a number"
(☞ﾟ∀ﾟ)☞

"ITS OVER 9000 !!!"
( ಠ ͜ʖರೃ)

"…"
༼ つ ಥ_ಥ ༽つ

• ### Northfan42 Post author

Of course 1 is more common. The simple fact of its closer proximity to the origin than any other single-digit integer means it will naturally occur more frequently as a leading digit than others. Every other leading digit is dependent on all 1-lead numbers preceding them before they can occur, regardless of the order of magnitude. Likewise, all numbers with the leading digit of 3 are dependent on being preceded by all 1-lead and 2-lead numbers and so on.

That said, this is all dependent on the number set beginning at the origin and working its way up. What if an arbitrarily large number with all digits being 9s was the starting point and the number set counted down? At what point would Benford's Law cease to be inverted and take effect in normal fashion as explained here?

• ### Alex Post author

OMG I got even more impressive results! I wanted to put this to the test, so I went around asking everyone I knew what year they were born, and, shockingly, 100% of the answers started with a 1!! I mean, wow! What could be causing that?

• ### GamerZone Post author

lol 1111th comment

• ### Diego Rodríguez Post author

Human height in meters are between 1 meter and 3 meters… Okay, one meter is fine but 3 meters?

• ### jklw10 Post author

random number on my screen = 75

• ### planksunit Post author

I noticed this a long time ago, I wonder how many other people figured this out and never bothered to write it all out let alone publish.

Zipf!

• ### M K Post author

This video from the '90s.

• ### coderatchet Post author

Interested in the general case for the first digit of a base N number system.

• ### abu3qab Post author

"No one is nine meters". It made me laugh too much.

• ### Android jackson lee Post author

He looks like harry potter

• ### Pooja Sonar Post author

What about hexadecimal-based numbers? Does it have a bias towards base 10?

• ### Kai Na Post author

Do all numbers start with zero? Or do none of the numbers start with zero?

• ### antiantiderivative Post author

Has anyone thought about looking at patterns in different bases? Maybe looking at the numbers in a different base will show some more interesting results.

• ### Justin Hill Post author

in binary it's a lot easier to guess the leading digit

• ### physics physics Post author

6:45: By the log properties, it is equal to log(d+1)-log(d).
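
Written this way, the nine probabilities form a telescoping sum, which is one way to confirm that they add up to 1. A short worked check, using P(d) as shorthand for the probability of leading digit d:

```latex
$$
\sum_{d=1}^{9} P(d)
  = \sum_{d=1}^{9} \bigl[\log_{10}(d+1) - \log_{10} d\bigr]
  = \log_{10} 10 - \log_{10} 1
  = 1
$$
```

All the intermediate terms cancel in pairs, leaving only the first and last.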

• ### Jim DeCamp Post author

I once read that someone noticed that in tables of logarithms in libraries, pages with lower numbers exhibited more wear than pages with higher numbers. This would explain it.

• ### Yogesh YADAV Post author

Winter soldier
Teaching maths

• ### Zaph Hood Post author

I've left between 1 and 0 comments. This does not span magnitudes.

• ### Clemens Schlage Post author

It's even a hundred percent in binary numbers😂

• ### David Wilkie Post author

Time is prime.., because the connection of information has probability one, infinity/infinity = *, so all identity is some selection of characteristic proportion in reciprocal proportion, which-when the selection process is exponential, leads to an intersection of natural logarithm (including every and all numerical bases in "e", continuous).
An actual Mathematician could deduce the probability of the natural occurrence of identities in other number bases, but I'm only committed to comments.
Professor John D Barrow has presented a very impressive lecture on this topic.
___

In a manner of (QM-Time mathematical Intuition) speaking.., if this aspect of here-now is the metastable tip of the topological iceberg, the incident cause-effect of e-Pi rationalisation i-reflection modulation, then the natural probability occurrence of potential possibilities in the Universal Quantum Computation is a continuous expression of the number proportions perceived as Benford's Law…

• ### Some One Post author

Does Benford's Law apply to the digits after the first digit?

• ### Rob Mckennie Post author

I don't understand why the exact same logic couldn't be used for any other digit

• ### That One Guy Post author

Just looking at view counts that I find it seems to like 2, 4, and 6

• ### That One Guy Post author

It more or less depends on the caps of the numbers

• ### Simon Krahnke Post author

Any numbers leading digit is a zero. We just don't write it. So it's the first nonzero leading digit.

• ### I need no channel youtube! Post author

What happens if you do the same probability distribution for 2? You end up with the same graph. And it only takes the same 10 change to get from 1k down to 9k.
I still don't understand this. Do numbers have a natural tendency to increase more often than decrease?

• ### Tom Ganks Post author

He looks and sounds like Taliesin

• ### Youra Moron Post author

Uhhh. No. Doesn't work for me.

• ### Jonadab the Unsightly One Post author

This only works if the numbers are chosen in a way that makes lower-magnitude numbers more common than would be expected in an even distribution.

Powers of two (or of any number) are a very nice demonstration, because they increase in magnitude at a fairly smooth rate as they go.

If the space of available numbers is finite and all numbers in that space are equally likely, you instead get something more like the sort of leading-digit probabilities people would tend to expect. So for example, in a sampling of genuinely random numbers between 1 and a billion, about 90% of them will (in base 10) be between 100 million and 1 billion; roughly 9% will be between 10 million and 100 million; and so on. With this sort of distribution, all leading digits are just about equally likely. You can make this more obvious by including the leading zeros, at which point it is straightforward that each leading digit, including 0, occurs 10% of the time.
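
Both halves of this comment can be tried out directly. The following is a rough sketch, not a definitive test: the function names are hypothetical, the choice of the first 1,000 powers of two is arbitrary, and the uniform range is assumed to stop at 999,999,999 (just short of a billion) so that the counts come out exactly equal.

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def power_of_two_frequencies(count: int = 1000) -> dict[int, float]:
    """Share of each leading digit among 2**0, 2**1, ..., 2**(count - 1)."""
    tally = Counter(leading_digit(2 ** k) for k in range(count))
    return {d: tally[d] / count for d in range(1, 10)}


def uniform_count(limit: int, digit: int) -> int:
    """How many integers in 1..limit have the given leading digit."""
    total = 0
    power = 1
    while digit * power <= limit:
        low = digit * power                          # e.g. 300 for digit 3
        high = min(limit, (digit + 1) * power - 1)   # e.g. 399, capped at limit
        total += high - low + 1
        power *= 10
    return total


if __name__ == "__main__":
    freqs = power_of_two_frequencies()
    for d in range(1, 10):
        benford = math.log10(1 + 1 / d)
        print(d, f"powers of two: {freqs[d]:.3f}", f"Benford: {benford:.3f}",
              f"uniform count: {uniform_count(999_999_999, d)}")
```

With the range ending right before a power of ten, every leading digit gets the same count in the uniform case, while the powers of two stay close to the Benford values.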

• ### Роман Бойцов Post author

Is Benford's law valid for other number systems?

• ### Unknown Entity Post author

Now do it in base 2: MIND BLOWN!

• ### Venkatesh babu Post author

Leading number is i. then -1 then -i and then 1. similarly 1 1/2 -1/2 2 ..

• ### Twat Waffle Post author

That thumbnail lol

• ### Kaczankuku Post author

If we first start from one, the frequency of appearing one is really as it was shown in the video, but we can also start from nine and go to lower and lower numbers; then the frequency will be equal to the frequency of one in the first case.

lol

• ### Snowman 雪人 Post author

Please let me know if your country's population really starts with 9

• ### MrSpeedweasel Post author

Of particular importance in forensic accounting.

• ### Projeto Trebuchet Post author

That's awesome!

1÷2+ …=?

• ### Tim Lewis Post author

So would it be the mean proportion of leading values in lists of natural numbers starting from 1. e.g. the average leading value in 1; 1,2; 1,2,3; ………… 1,2,3,4,5,……,1000 etc. This would give you 1+1/2+1/3+…..+1/9+2/10+3/11+4/12+…….+11/19 and then divide that sum by the number of sets e.g. 19 in this case.
Or would it also include all possible sets that don't start with one? Maybe you don't even have to increase 1 at a time, it can be random sets of any natural numbers.
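
The first suggestion, averaging the share of leading 1s over every list 1..N up to some largest size, is easy to try. This is a minimal sketch under that reading of the comment; the function name `mean_leading_one_fraction` is hypothetical.

```python
def mean_leading_one_fraction(max_size: int) -> float:
    """Average, over N = 1..max_size, of the share of 1..N that starts with a 1."""
    ones_so_far = 0
    running_total = 0.0
    for size in range(1, max_size + 1):
        if str(size)[0] == "1":
            ones_so_far += 1
        # ones_so_far / size is the raffle probability for a raffle of this size.
        running_total += ones_so_far / size
    return running_total / max_size


if __name__ == "__main__":
    for stop in (19, 999, 1999, 9999, 19999):
        print(stop, round(mean_leading_one_fraction(stop), 4))
```

In this sketch the plain average still depends noticeably on where you stop (it is higher at 1,999 than at 999, and again at 19,999 than at 9,999), which suggests one round of equal-weight averaging is not enough on its own. The roughly 30% figure in the video comes from averaging along the log scale that Steve draws.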

• ### Yuri Post author

Operação antifraude by Brasil Paralelo brought me here.

Open your eyes HUEHUEHUE

br

1
1
1
2
3
4
5

• ### Paul Shin Post author

Wait, isn’t it because of some connection between population growth? Like, the time it takes to go from 1 to 2 is greater than the time it takes to go from 9 to 10, and 10-20 vs 90-100, and so on? I think someone here made a connection to rivers as well… it’s easier for a river to go from 900 to 1000 meters long than it is for it to go from 100 to 200.

I think that if the system at hand is a growing dynamic system, it will show this kind of behavior. But if if we’re dealing with random numbers, this behavior will not hold.
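
The growth idea, which is also the NASDAQ argument in the video, can be sketched with a quantity that grows by a fixed percentage each step. The 1% rate, the 5,000 steps and the function names below are all assumptions chosen for illustration.

```python
import math


def leading_digit(value: float) -> int:
    """First significant digit of a positive float."""
    # Scientific notation puts the leading digit first, e.g. "3.141590e+02".
    return int(f"{value:e}"[0])


def growth_digit_shares(rate: float = 0.01, steps: int = 5000) -> dict[int, float]:
    """Share of steps spent at each leading digit while growing by `rate` per step."""
    counts = {d: 0 for d in range(1, 10)}
    value = 1.0
    for _ in range(steps):
        counts[leading_digit(value)] += 1
        value *= 1 + rate
    return {d: c / steps for d, c in counts.items()}


if __name__ == "__main__":
    shares = growth_digit_shares()
    for d in range(1, 10):
        print(d, f"simulated: {shares[d]:.3f}", f"Benford: {math.log10(1 + 1 / d):.3f}")
```

Because the value has to double to get from a leading 1 to a leading 2 but only grow by about 11% to get from a leading 9 to the next leading 1, it spends far more steps at 1 than at 9.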

• ### Frank Carr Post author

You got to go through 1 before you get to 9. Easy.

• ### Shane Clough Post author

How I thought about it was, using the raffle analogy, The probability of leading digit being 1 when there is only 1 ticket (tickets starting at 1) is 1, then with 2 it's 1/2, with 3 1/3 etc up till when you have 1, 2, 3, 4, 5, 6, 7, 8 and 9, at which point you have a 1/8 chance. If you add 1/n from n=1 to n=8, then divide by the number of different raffles you get (2.717…/9) = 0.3019

Not sure if it's just coincidental though.

• ### nomen nominandum Post author

not pouring anything out of a beaker…boring

• ### Sun Shine Post author

This is silly – this only works in a placeholder number system!

• ### Julian Barber Post author

You're adding 1 more often than anything; for instance, there are 9 two-digit numbers that lead with one, 999 four-digit numbers, 9999 five-digit numbers, etc. Makes perfect sense.

• ### purelitenite Post author

yeah, I made that noise

• ### Drew Berry Post author

The number on this video

(5) 72624 views
(9) 600 likes
(1) 36 dislikes
(1) 238 comments

The likes will tick over to 1 soon enough so this video is a fitting example.

• ### Tom Kerruish Post author

If you're old enough, picture a slide rule. This is equivalent to a uniform distribution on one. It's especially clear if you picture a circular one; a rotation preserves the probabilities.

• ### Ivan Mirisola Post author

Would this work for forgery detection on electronic voting? After an election you would have the number of candidate votes per region or ballot, etc. Would it be possible to figure out if, in the universe of total votes, some have been tampered with?

• ### Braden Sorensen Post author

In about a hundred years 100% of ALL people will have been born on a year with a leading digit of 2.

• ### Guo Wei Yan Post author

I've yet to see someone three meters tall

• ### WMTeWu Post author

Benford's 1aw

• ### John B Post author

In any base, the probability of the leading digit being 0 is 100%

• ### Jack Nasty Post author

Michael the Arch Angle

3:39

• ### Reginald Carey Post author

Isn’t Benford’s law a special case of Zipf’s law?

• ### Chris Morrow Post author

Why does the graph look so quadratic?

• ### zooblestyx Post author

Leading digits of distribution percentages of Benford's Law don't follow Benford's Law. Checkmate, Mould!

• ### glitch gamer Post author

Check the prices of things and 9 will blow it up not just starting but whole lot of 9s