Why a kilobyte is 1000 and not 1024 bytes

I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.

This one is about why it was a mistake to call 1024 bytes a kilobyte. It’s about a 20min read so thank you very much in advance if you find the time to read it.

Feedback is very much welcome. Thank you.

unreasonabro ,

i mean, you can’t get to 1000 by doubling twos, so, no?

Reality doesn’t care what you prefer my dude

smo ,

This has been my pet rant for a long time, but I usually explain it … almost exactly the other way around to you.

You can essentially start off with nothing using binary prefixes. IBM’s first magnetic hard drive (the IBM 350 - you’ve probably seen it in the famous “forklifting it into a plane” photo) stored 5 million characters. Not 5×1024×1024 characters, 5,000,000 characters. This isn’t some consumer-era marketing trick - this is 1956, when companies were paying half a million dollars a year (2023 inflation-adjusted) to lease a computer. I keep getting told this is some modern trick - doesn’t it blow your mind to realise HDD manufacturers have been using base10 for nearly 70 years? Line-speed was always base 10, where 1200 baud laughs at your 2^n fetish (and for that matter, baud comes from telegraphs, and was defined before computers existed), 100Mbit ethernet runs on a 25MHz clock, and speaking of clocks - kHz, MHz, MT/s, GT/s etc are always specified in base 10. For some reason no-one asks how we got 3GHz in between 2 & 4GHz CPUs.

As you say, memory is the trouble-maker. RAM has two interesting properties for this discussion. One is that it heavily favours binary-prefixed “round numbers”, traditionally because no-one wanted RAM with un-used addresses because it made address decoding nightmarish (tl;dr; when 8k of RAM was usually 8x1k chips, you’d use the first 3 bits of the address to select the chip, and the other 10 bits as the address on the chip - if chips didn’t use their entire address space you’d need to actually calculate the address map, and this calculation would have to run multiples of times faster than the cpu itself) . The second, is that RAM was the first place non-CSy types saw numbers big enough for k to start becoming useful. So for the entire generation that started on microcomputers rather than big iron, memory-flavoured-k were the first k they ever tasted.
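The chip-select trick described above can be sketched in a few lines. This is only an illustration of the 8x1k layout from the paragraph, not any particular machine's decoder; the names `CHIP_BITS`, `OFFSET_BITS` and `decode` are made up for the example.

```python
# Hypothetical layout: 8 KiB of RAM built from eight 1 KiB chips,
# so every address is 13 bits wide (3 + 10).
CHIP_BITS = 3      # top 3 bits select one of the 8 chips
OFFSET_BITS = 10   # low 10 bits address a byte inside a 1 KiB chip

def decode(address):
    """Split a 13-bit address into (chip index, offset within chip)."""
    if not 0 <= address < (1 << (CHIP_BITS + OFFSET_BITS)):
        raise ValueError("address out of range")
    chip = address >> OFFSET_BITS                 # drop the low 10 bits
    offset = address & ((1 << OFFSET_BITS) - 1)   # keep only the low 10 bits
    return chip, offset

print(decode(0x03FF))  # last byte of chip 0
print(decode(0x0400))  # first byte of chip 1
```

Because each chip uses its whole address space, decoding is just a shift and a mask. With chips of, say, 1000 bytes, the split would need a division and a remainder on every access.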

I mean, hands up who had a computer with 8-64k of RAM and a cassette deck. You didn’t measure the size of your stored program in kB, but in seconds of tape.

This shortcut then leaked into filesystems purely as an implementation detail - reading disk blocks into memory is much easier if you’re putting square pegs into square holes. So disk sectors are specified in binary sizes to enable them to fit efficiently into memory regions/pages. For example, CP/M has a 128-byte disk buffer between 0x080 and 0x100 - and its filesystem uses 128-byte sectors. Not a coincidence.

This is where we start getting into stuff like floppy disk sizes being utter madness. 360k & 720k were 720 and 1440 512-byte sectors. When they doubled up again, 2880 512-byte sectors gave us 1440k - and because nothing is ever allowed to make sense (or because 1.40625M looks stupid), we used base10 to call this 1.44M.
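The floppy arithmetic is easy to check. A minimal sketch, assuming 512-byte sectors throughout and using the sector counts 720, 1440 and 2880 (the last being the high-density format); `floppy_sizes` and the `marketing_M` key are names invented for this example.

```python
SECTOR_BYTES = 512

def floppy_sizes(sectors):
    """Express a floppy's capacity in the competing units."""
    total = sectors * SECTOR_BYTES
    kib = total / 1024
    return {
        "bytes": total,
        "KiB": kib,                  # binary kilo
        "MiB": kib / 1024,           # binary all the way
        "MB": total / 1_000_000,     # decimal all the way
        "marketing_M": kib / 1000,   # divide by 1024, then by 1000
    }

for sectors in (720, 1440, 2880):
    print(sectors, floppy_sizes(sectors))
```

For 2880 sectors this gives 1,474,560 bytes, which is 1.40625 in pure binary units and 1.47456 in pure decimal units. Only the mixed division lands on 1.44.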

So it’s never been that computers used 1024-shaped-k’s. It should be a simple story of “everything uses 1,000s, except memory because reasons”. But once we started dividing base10-flavoured storage devices into base2-flavoured sectors, we lost any hope of this ever looking logical.

smo ,

aside: the little-k thing. SI has a beautifully simple rule, capital letters for prefixes >1, small letters for prefixes <1. So this disambiguates between millivolts (mV) and megavolts (MV).

But, and there’s always a but. The kilogram was the first SI unit, before they’d really thought it through. So we got both a lower-case k breaking such a beautifully simple rule, and the kilogram as a base unit instead of a gram. The Kilogram is metric’s “screw it, we’ll do it live”.

Luckily this is almost a non-issue in computing as a fraction of a bit never shows up in practice. But! If you had a system that took 1000 seconds to transfer one bit, you could call that a millibit per second, or mbps, and really mess things up.

unreasonabro ,

2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. It’s pretty fucking logical m8. You know what’s not logical? Base 10

SmartmanApps ,

Yeah, base ten really screws around with programming. You specifically have to use a decimal type if you really want to use it (for like finance or something), but it’s much slower.
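A quick sketch of what that looks like in practice, using Python's standard `decimal` module as one example of such a type. It only shows the rounding difference; it makes no attempt to measure how much slower decimal arithmetic is.

```python
from decimal import Decimal

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_sum = 0.1 + 0.2

# Decimal stores base-10 digits, so the same sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)
print(decimal_sum == Decimal("0.3"))
```

The first comparison is false and the second is true, which is why money is usually kept in a decimal or integer type rather than a float.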

gens ,

The mistake is thinking that a 1000 byte file takes up a 1000 bytes on any storage medium. The mistake is thinking that it even matters if a kB means 1000 or 1024 bytes. It only matters for some programmers, and to those 1024 is the number that matters.

Disregarding reality in favor of pedantics is the real mistake.

Hamartiogonic ,

Here’s my favorite part.

“In addition, the conversions were sometimes not even self-consistent and applied completely arbitrary. The 3½-inch floppy disk for example, which was marketed as “1.44 MB”, was actually not 1.44 MB and also not 1.44 MiB. The size of the double-sided, high-density 3½-inch floppy was 512 bytes per sector, 18 sectors per track, 160 tracks, that’s 512×18×160 = 1’474’560 bytes. To get to “1.44” you must first divide 1’474’560 by 1024 (“bEcAuSE BiNaRY obviously”) to get 1440 and then divide by 1000 for perfect inconsistency, because dividing by 1024 again would get you an ugly number and we definitely don’t want that. We finally end up with “1.44”. Now let’s add “MB” because why the heck not. We already abused those units so much it’s not like they still mean anything and it’s “close enough” anyways. By the way, that “close enough” excuse never worked when I was in school but what would I know compared to the computer “scientists” back then.”

When things get that messy, numbers don’t even mean anything any more. Might as well just label the products using entirely qualitative terms like “big” or “bigger”.

wischi OP ,

❤️ Thank you for taking the time to read it.

GenderNeutralBro ,

I suggest considering this from a linguistic perspective rather than a technical perspective.

For years (decades, even), KB, MB, GB, etc. were broadly used to mean 2^10, 2^20, 2^30, etc. Throughout the 80s and 90s, the only place you would likely see base-10 units was in marketing materials, such as those for storage media and modems. Mac OS exclusively used base-2 definitions well into the 21st century. Windows, as noted in the article, still does. Many Unix/POSIX tools do, as well, and this is unlikely to change.

I will spare you my full rant on the evils of linguistic prescriptivism. Suffice it to say that I am a born-again descriptivist, fully recovered from my past affliction.

From a descriptivist perspective, the only accurate way to define kilobyte, megabyte, etc. is to say that there are two common usages. This is what you will see if you look up the words in any decent dictionary. e.g.:

I don’t recall ever seeing KiB/MiB/etc. in the 90s, although Wikipedia tells me they “were defined in 1999 by the International Electrotechnical Commission (IEC), in the IEC 60027-2 standard”.

While I wholeheartedly agree with the goal of eliminating ambiguity, I am frustrated with the half-measure of introducing unambiguous terms on one side (KiB, MiB, etc.) while failing to do the same on the other. The introduction of new terms has no bearing on the common usage of old terms. The correct thing to have done would have been to introduce two new unambiguous terms, with the goal of retiring KB/MB/etc. from common usage entirely. If we had KiB and KeB, there’d be no ambiguity. KB will always have ambiguity because that’s language, baby! regardless of any prescriptivist’s opinion on the matter.

Sadly, even that would do nothing to solve the use of common single-letter abbreviations. For example, Linux’s ls -l -h command will return sizes like 1K, 1M, 1G, referring to the base-2 definitions. Only if you specify the non-default --si flag will you receive base-10 values (again with just the first letter!). Many other standard tools have no such options and will exclusively use base-2 numbers.
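To make the single-letter ambiguity concrete, here is a rough sketch of a size formatter with a base-2 default and a base-10 switch. It is not how ls itself rounds or formats; `human` and its `si` parameter are hypothetical names that only mimic the two modes.

```python
def human(size, si=False):
    """Format a byte count with single-letter suffixes.

    si=False divides by 1024 (like the -h default described above),
    si=True divides by 1000 (like the --si flag).
    """
    base = 1000 if si else 1024
    units = ["B", "K", "M", "G", "T", "P"]
    value = float(size)
    for unit in units:
        # Stop once the value fits under the base, or we run out of units.
        if value < base or unit == units[-1]:
            return f"{value:.1f}{unit}"
        value /= base

print(human(1474560))           # base 2
print(human(1474560, si=True))  # base 10
```

The same floppy-sized file prints as 1.4M in one mode and 1.5M in the other, with nothing in the output to say which "M" was meant.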

FluffyPotato ,

The only place where kilobyte is 1000 bytes has been Google and everywhere else it’s 1024 so even if it’s precise I don’t see the advantage of changing usage. It would just cause more confusion at my work than make anything clearer.

forrgott ,

Based on your other replies, no, I absolutely will not waste my time reading your opinion piece.

And, a blog post is just another way of saying this is your opinion. That’s all it is.

Cornpop ,

This is the stupid af.

KinNectar ,

Nice to learn about the SI standard notation KiB, MiB, etc. I had no idea.

HubertManne ,

I was confused when I just read the headline. Should be "Why I (that would be you not me) think a kilobyte should be 1000 instead of 1024". Unpopular opinion would be a better sub for it.

wischi OP ,

You should read the blog post. It’s not a matter of option.

HubertManne ,

I know there is no option as 1024 is what the standard is now. I’m not reading that any more than someone saying how a red light really means go.

meekah ,

It totally is a matter of opinion. These are arbitrary rules, made up by us. We can make up whatever rules we want to.

I agree that it’s weird that only in CS kilo means 1024. It would be logical to change that, to keep consistency across different fields of science. But that does not make it any less a matter of opinion.

thisisnotgoingwell ,

Just because you wrote about a topic doesn’t mean you’re suddenly the authority figure lol.

DrPop ,

I know it’s already been explained but here is a visualization of why.

1 2 4 8 16 32 64 128 256 512 1024

wischi OP ,

Did you read the blog post? If you don’t find the time you should at least read “(Un)lucky coincidence” to see why it’s not (and never was) a bright idea to call 1024 “a kilo”.

pirrrrrrrr ,

Dude you’re pretty condescending for a new author on an old topic.

Yeah I read it and it’s very over worded.

1024 was the closest binary approximation of 1000 so that became the standard measurement. Then drive manufacturers decided to start using decimal for capacity because it was a great way to make numbers look better.

Then the IEC decided “enough of this confusion” and created binary naming standards (kibi gibi etc…) and enforced the standard decimal quantity values for standard names like kilo-.

It’s not ground breaking news and your constant arguing with people in the thread paints you as quite immature. Especially when plenty of us remember the whole story BECAUSE WE LIVED IT AS IT PROFESSIONALS.

We lacked a standard, a system was created. It was later changed to match global standard values.

You portray it with emotive language making decisions out to be stupid, or malicious. A decision was made that was perfectly sensible at the time. It was then improved. Some people have trouble with change.

Your writing and engagement styles scream of someone raised on clickbait news. Focus on facts, not emotion and sensationalism if you want to be taken seriously in tech writing.

Focus on emotion and bullshit if you want to work for BuzzFeed.

And if you just want an argument go use bloody twitter.

billwashere ,

Well it’s because computer science has been around for 60+ years and computers are binary machines. It was natural for everything to be base 2. The most infuriating part is why drive manufacturers arbitrarily started calling 1000 bytes a kilobyte, 1000 kilobytes a megabyte, and 1000 megabytes a gigabyte, and a 1000 gigabytes a terabyte when until then a 1 TB was 1099511627776 bytes. They did this simply because it made their drives appear 10% bigger. So good ol’ shrinkflation. You could make drives 10% smaller and sell them for the same price.

wischi OP ,

Pretty obvious that you didn’t read the article. If you find the time I’d like to encourage you to read it. I hope it clears up some misconceptions and makes it clearer why even in those 60+ years it was always intellectually dishonest to call 1024 bytes a kilobyte.

You should at least read “(Un)lucky coincidence”

lambda ,

kilobit = 1000 bits. Kilobyte = 1000 bytes.

How is anything about that intellectually dishonest??

The only ones being dishonest are the drive manufacturers, like the person above said. They sell storage drives by advertising them in the byte quantity but they’re actually in the bit quantity.

locuester ,

They sell storage drives by advertising them in the byte quantity but they’re actually in the bit quantity.

No, they absolutely don’t. That’d be off by 8x.

The subject at hand has nothing to do with bits. Please, read what OP posted. It’s about 1024 vs 1000

billwashere ,

Ok so I did read the article. For one I can’t take an article seriously that is using memes. Thing the second yes drive manufacturers are at fault because I’ve been in IT a very very long time and I remember when HD manufacturers actually changed. And the reason was greed (shrinkflation). I mean why change, why inject confusion where there wasn’t any before. Find the simplest least complex reason and that is likely true (Occam’s razor). Or follow the money usually works too.

It was never intellectually dishonest to call it a kilobyte, it was convenient and was close enough. It’s what I would have done and it was obviously accepted by lots of really smart people back then so it stuck. If there was ever any confusion it’s by people who created the confusion by creating the alternative (see above).

If you wanna be upset you should be upset at the gibi, kibi, tebi nonsense that we have to deal with now because of said confusion (see above). I can tell you for a fact that no one in my professional IT career of over 30 years has ever used any of the **bi words.

You can be upset if you want but it is never really a problem for folks like me.

Hopefully this helps…

https://lemmy.world/pictrs/image/50f959aa-ed61-4f20-b4aa-66b3729ad1c4.jpeg

CallumWells ,

I just think that kilobyte should have been 1000 (in binary, so 8 in decimal) bytes and so on. Just keep everything relating to the binary storage in binary. That couldn’t ever become confusing, right?

rottingleaf ,

Because your byte is 10 decimal bits, right? EDIT: Bit is actually an abbreviation, BIT, initially, so it would be what, DIT?.. Dits?..

shotgun_crab , (edited )

A kilobyte (kB) is 1000 bytes, that’s what the prefix kilo means. A kibibyte (KiB) is 1024 bytes (the “bi” in the prefix means base 2 or binary). People often confuse them, but they’re similar enough for smaller units, 10^3 ~ 2^10.

Oh and at first, kilobyte was used for both amounts, which is why kibibytes were introduced to fix the confusion, which perhaps was a bit late anyway.
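"Similar enough for smaller units" can be put into numbers. A small sketch of how far each binary prefix drifts from its decimal counterpart; `gap_percent` is a name made up for this example.

```python
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]

def gap_percent(n):
    """Percent by which 1024**n exceeds 1000**n."""
    return (1024**n / 1000**n - 1) * 100

for n, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {gap_percent(n):.1f}%")
```

The gap starts at 2.4% for kilo and grows with every step, reaching roughly 10% at tera, so the approximation gets worse exactly as storage gets bigger.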

wischi OP ,

True and that’s what the article is about. You should check out the interactive diagram in the “(Un)lucky coincidence” section.

rockSlayer , (edited )

I genuinely don’t understand your disdain for using base 2 on something that calculates in base 2. Do you know how counting works in binary? Every byte is made up of 8 bits, and goes from 0000 0000 to 1111 1111, or 0-255. When converted to larger scales, 1024 bytes is a clean mathematical derivation in base 2, 1000 is a fractional number. Your pedantry seems to hinge on the use of the prefix, right? I think 1024 is a better representation of kilo- in base 2, because a kilo- can be directly translated up to exabytes and down to nybbles while “1000” in base 2 is extremely difficult. The point of metric is specifically to facilitate easy measuring, right? So measuring in the units that the computer uses makes perfect sense. It’s like me saying that a kilogram should be measured in base 60, because that was the original number system.

wischi OP ,

Did you read the post? The problem I have is redefining the kilo because of a mathematical fluke.

You certainly can write a mass in base 60 and kg, there is nothing wrong with that. What would be wrong is calling 3600 grams a “kilogram” just because you find it convenient that 3600 (60^2) is “close to” 1000, and that’s exactly what’s happening with binary and 1024.

If you find the time you should read the post and if not at least the section “(Un)lucky coincidence”.

rockSlayer ,

I started reading it, but the disdain towards measuring in base 2 turned me off. Ultimately though this is all nerd rage bait. I’m annoyed that kilobytes aren’t measured as 1024 anymore, but it’s also not a big deal because we still have standardized units in base 2. Those alternative units are also fun to say, which immediately removes any annoyance as soon as I say gibibyte. All I ask is that I’m not pedantically corrected if the discussion is about something else involving amounts of data.

I do think there is a problem with marketing, because even the most know-nothing users are primed to know that a kilobyte is measured differently from a kilogram, so people feel a little screwed when their drive reads 931GiB instead of 1TB.

bigredgiraffe ,

Yeah I’m with you, I read most of it but I just don’t know where the disdain comes from. At most scales of infrastructure anymore you can use them interchangeably because the difference is immaterial in practical applications.

Like if I am going to provision 2TB I don’t really care if it’s 2000 or 2048GB, I’ll be resizing it when it gets to 1800 either way, and if I needed to actually store 2TB I would create a 3TB volume, storage is cheap and my time calculating the difference is not.

Wait until you learn about how different fields use different precision levels of pi.

wewbull ,

It’s not 2000 Vs 2048. It’s 1,862 Vs 2048

The GB get smaller too.
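The numbers quoted in this sub-thread (931 for a 1TB drive, 1,862 for 2TB) fall out of one conversion. A minimal sketch, assuming the label means decimal terabytes and the operating system reports in units of 2^30 bytes; `decimal_tb_to_gib` is a hypothetical helper name.

```python
def decimal_tb_to_gib(terabytes):
    """Convert decimal terabytes (10**12 bytes) to units of 2**30 bytes."""
    return terabytes * 10**12 / 2**30

print(f"{decimal_tb_to_gib(1):.1f}")
print(f"{decimal_tb_to_gib(2):.1f}")
```

So a volume labelled 2TB shows up as about 1,862 binary gigabytes, not 2000 and not 2048.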

onlinepersona ,

Stop blaming drive manufacturers

The most scientific shill I’ve seen in a while

CC BY-NC-SA 4.0

wischi OP ,

I’m not sure if I’m too stupid, but how so?

onlinepersona ,

Hard drive manufacturers know exactly what they’re doing. It’s like selling something that’s 1 fluid ounce, but not saying “this is an imperial fluid ounce” --> ~2ml less than what a US food labeling ounce is. Sell 1k, 1M, 2G fluid ounces and you’re delivering less liquid than people would expect.

The same goes for any other unit that can be ambiguous. See the imperial vs US measurement systems.

Your entire argument seems to be based on kilo = 1000, kibi = 1024, which is technically correct (inb4 “best kind of correct”), but when you format a 256GB drive and find out that you don’t actually have 256GB available (even including filesystem headers etc.) it benefits the manufacturer.

You probably don’t work for a HD manufacturer, which is why I’m jokingly calling you a shill.

CC BY-NC-SA 4.0

wikibot Bot ,

Here’s the summary for the wikipedia article you mentioned in your comment:

Both the British imperial measurement system and United States customary systems of measurement derive from earlier English unit systems used prior to 1824 that were the result of a combination of the local Anglo-Saxon units inherited from Germanic tribes and Roman units. Having this shared heritage, the two systems are quite similar, but there are differences. The US customary system is based on English systems of the 18th century, while the imperial system was defined in 1824, almost a half-century after American independence.

^article^ ^|^ ^about^

wischi OP ,

So why don’t they just label drives in terabits instead of terabytes? The number would be even bigger. Why don’t Europeans also use Fahrenheit, with the bigger numbers the temperature for sure would instantly feel warmer 🤣

Jokes aside. Even if HDD manufacturers benefit from “the bigger numbers”, using the 1000 conversion is the only objectively correct answer here, because there is nothing intrinsically base 2 about hard drives. You should give the blog post a read 😉

meekah ,

there is nothing intrinsically base 2 about hard drives

did you miss the part where those devices store binary data?

wischi OP ,

Binary prefixes (the ones with 1024 conversions) are used to simplify numbers that are exact powers of two - for example RAM and similar types of memory. Hard drive sizes are never exact powers of two. The fact that a disk stores bits doesn’t have anything to do with the size of the disk.

meekah ,

sure, but one of the intrinsic properties of binary data is that it is in binary sized chunks. you won’t find a hard drive that stores 1000 bits of data per chunk.

wewbull ,

there is nothing intrinsically base 2 about hard drives

Yes there is. The addressing protocol. Sectors are 512 (2⁹) bytes, and there’s an integer number of them on a drive.

wischi OP ,

That’s true, but the entire disk size is not an exact power of two, which is why binary prefixes (1024 conversion) don’t have any benefit whatsoever when it comes to hard drives. With memory it’s a bit different because, unlike with storage devices, RAM size is always exactly a power of two.
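Both points in this exchange can hold at once, as a small check shows. The capacities below are hypothetical round figures picked for illustration (an 8 GiB RAM module and a drive holding exactly 10^12 bytes), not real product specs.

```python
SECTOR_BYTES = 512  # 2**9, the sector size mentioned above

def is_power_of_two(n):
    """True if n is an exact power of two."""
    return n > 0 and (n & (n - 1)) == 0

ram_bytes = 8 * 2**30   # hypothetical 8 GiB RAM module
disk_bytes = 10**12     # hypothetical drive of exactly 1 TB (decimal)

print(is_power_of_two(ram_bytes))      # RAM size is a power of two
print(is_power_of_two(disk_bytes))     # the disk size is not
print(disk_bytes % SECTOR_BYTES == 0)  # yet it is a whole number of sectors
```

The drive is made of binary-sized sectors, but its total capacity is still not a power of two.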
