Will YouTube Ever Run Out Of Video IDs?


Every YouTube video has a unique ID. It’s up in the URL: a string of eleven characters that uniquely identifies which video you want. Now, YouTube has millions and millions of videos. The last stats that they released said they have 400 hours of video being uploaded every minute. So: are they ever going to run out of those IDs?

Well, to find out, let’s talk about counting systems. People count in base 10. 0 to 9. That’ll be, hopefully, familiar to you. Computers count in base 2, in binary, but that’s difficult for humans to read, it gets too long to write really, really quickly, so often computers will display it in base 16, hexadecimal. You have 0 to 9, and then A to F, and then you start adding to the next column. Humans can’t understand that easily, but it’s efficient if we have to type it in somewhere, and 16 (2 to the power of 4) is also easy for computers to deal with.

So how about base 64? That’d be a ridiculous counting system, right?
Except. 64 is another one of those easy numbers for computers, it is 2 to the power of 6. And humans can get to 64 very easily: 0 to 9, then capital letters A to Z, then small letters a to z, and two other characters. Most base 64 uses slash and plus, but they don’t work so well in URLs, so YouTube uses hyphen and underscore. That YouTube URL, that unique ID, is really just a random number in base 64. They could have picked base 10 or base 16, but they didn’t: they went with 64, because it will let you cram a huge number into a small space and still make it vaguely human readable.

Author and programmer Sam Hughes, by the way, pushed this to the limit, and invented Base 65,536, which includes basically every character from every language. It is ridiculous and unnecessary, but when has that ever stopped programmers?

So why didn’t YouTube just start counting
at 1 and work up? Well, first, they would have to synchronise their counting between all the servers handling the video uploads, or they’d have to assign each server a block of numbers. Either way, there’s a lot of tracking to do, a lot of making sure that it’s never duplicated. Instead, they just generate a random number for each video, see if it’s already taken, and if not, use it.

And secondly, it is a really, really bad idea to just count 1, 2, 3 and so on in URLs. Incremental counters, as they’re called, can be a big security flaw: if you see video 283 up there, then you might wonder: what’s video 284? Or video 282? It’s easy to enumerate, as it’s called, to run through the entire list. YouTube Unlisted videos, the ones that don’t appear publicly but that you can send the link to people, those wouldn’t work.

And by the way? Lots of badly designed sites do use incremental counters. And it is a terrible idea. It might tell your competitors exactly how many customers you have, ’cos they can just count them. It might let people download all your records easily, ’cos they can just run through them. And in one site that someone in Florida emailed me about this week, it lets you look at other people’s personal details. Don’t use incremental counters if you’re building a web site. Use a random number.

Which brings me to the question: just how big are the numbers that YouTube
uses? Well, let’s work it out. One character of base 64 lets you have 64 ID numbers. Two characters? That’s 64 by 64, or 4,096. Three characters? 64 times 64 times 64, or 64 to the power of 3. That is already more than a quarter of a million. And if we go to four? Well, now we’re above 16 million. If you use base 64, then you can assign an ID number to everyone who lives in London down there twice over, and you’ll only need four characters. This gets big fast. We can keep on doing this, and by seven characters we’re already at four trillion.

Now, I assume that YouTube checks through a dictionary, and doesn’t allow any actual words to appear up there, particularly anything rude. But that is going to be a tiny minority of the URLs, so for our purposes, we can pretty much just ignore that.

At YouTube’s 11 characters, we are at 73 quintillion 786 quadrillion 976 trillion 294 billion 838 million 206 thousand and 464 videos. That’s enough for every single human on planet Earth to upload a video every minute for around 18,000 years.

YouTube planned ahead. Can they run out of URLs? Technically, yes. Practically? No. And if they did? They could just add one more character.

Ha! One take! One take! Yes!
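
The arithmetic in the transcript, and the "generate a random number, see if it's already taken" approach it describes, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not YouTube's actual code: the names `ALPHABET`, `id_space` and `new_video_id` are made up for this example, the order of the characters in the alphabet is a guess, and a plain in-memory set stands in for whatever shared store would really track which IDs are in use.

```python
import secrets
import string

# 64 URL-safe characters: digits, capitals, small letters, hyphen, underscore.
# The ordering is an assumption made for this sketch.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "-_"
ID_LENGTH = 11


def id_space(length: int, base: int = len(ALPHABET)) -> int:
    """Number of distinct IDs of a given length (characters may repeat)."""
    return base ** length


def new_video_id(taken: set[str], length: int = ID_LENGTH) -> str:
    """Draw random IDs until one is not in `taken`, then reserve and return it."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in taken:
            # Hypothetical stand-in for an atomic "reserve this ID" operation.
            taken.add(candidate)
            return candidate


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 7, 11):
        print(f"{n:>2} characters: {id_space(n):,} IDs")
    used: set[str] = set()
    print("example ID:", new_video_id(used))
```

Because a character can appear more than once in an ID, the count is 64 raised to the power of the length: 4,398,046,511,104 for seven characters and 73,786,976,294,838,206,464 for eleven. With several upload servers, the check and the reservation would have to be a single atomic step in a shared store, which the set in this sketch does not model.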

100 thoughts on “Will YouTube Ever Run Out Of Video IDs?”

  1. That rare occasion when a URL coincidentally actually DOES contain a word that could conceivably be a reference to something in the video.
    It's orgasmic.

  2. "And by the way? Lots of badly designed sites do use incremental counters. And it is a terrible idea."

    roblox is becoming bad confirmed

  3. Amazing video.. there is so much to learn in it about numbers, urls, planning ahead!
    And the fact that it was done in one take is just mind blowing.. 🤯

  4. Something else interesting to take into account would be the videos removed from YouTube (for copyright, TOS violations, etc.) Could the urls for these videos be reused? If so the longevity of this system is even more expansive than it seems.

  5. about the incremental numbers
    it really depends on the use whether or not you should use it if you ask me. for example, I got a leaderboards webpage and I save the entries in those numbers A: because it's much easier to generate leaderboards with that, B: because it's way more efficient to use for profile pages for example. It's just a really bad idea if you put personal data or hidden data on those numbers AND without any verification process

  6. I just watched your Royal Institution lecture on algorithms and I think it would be a great idea for everyone to subscribe to everyone on Youtube. That might mess up the system.

  7. Like with IPv6. If I remember correctly from my calculations (8 years ago now) you can assign about 56M addresses to each gram of the entire planet earth's mass or let every lump of 420B atoms share 1 address.

  8. Using increasing numbers as codes like that is simple but has some disadvantages, which the below solution fixes:

    If I'm asked to encode the same long URL several times, it will get several entries. That wastes codes and memory.
    People can find out how many URLs have already been encoded. Not sure I want them to know.
    People might try to get special numbers by spamming me with repeated requests shortly before their desired number comes up.
    Only using digits means the codes can grow unnecessarily large. Only offers a million codes with length 6 (or smaller). Using six digits or lower or upper case letters would offer (10+26*2)^6 = 56,800,235,584 codes with length 6.

  9. Youtube needs 64^11 digits to handle video content.
    Tom Scott needs just one binary digit to count the takes needed to make a video about it.

    I vote we now call you "One Take Tom"

  10. 4:07
    That is actually not true at all. There are numerous instances of words in video IDs, and there are no restrictions on what words can appear, as lots of bad words have appeared in YouTube video IDs.

  11. Want to talk about a possible new video service. YouTube is killing itself, I am working on a new site idea. But I don't have programming or site building skills.

  12. Do you know what else used incremental counting? United States Social Security Numbers.

    Yup, if you add 1 to your own SSN, you'll probably get the number of someone who was born in your hospital, minutes or hours after you.

    Such security!

  13. Wouldn't it be about 30 quintillion, rather than 74 quintillion? Correct me if I'm wrong, but if there are 64 possible characters and 11 are being used, surely you would use a permutation function (because order matters) and calculate 64P11, which is roughly 30 quintillion?
