My guess is that all the individual characters of Hello World are found inside the 0xC894 number. Every 4 bits of x shows where in this number we can find the characters for Hello World.
You can read x right to left. (Skip the rightmost 0, as it’s immediately bit-shifted away in the first iteration.)
3 becomes H
2 becomes e
1 becomes l
5 becomes o
etc.
I guess when we’ve exhausted all bits of x only 0 will be remaining for one final iteration, which translates to !
Too readable. You’ve gotta encode the characters as the solutions of a polynomial over a finite field, implemented with linear feedback on the bit shifts. /s
Might be wrong on a few things here as I haven’t done C++ in a while, but my understanding is this. I’m sure you can guess that this is just a very cheekily written while loop to print the characters of “Hello, World!”, but how does it work? First off, all ASCII characters have an integer value. That 32 there is the value for the space character. So depending on what ((0xC894A7B75116601 >> ((x >>= 4) & 15) * 7) & 0x7F) evaluates to, you’ll get different characters. The value for “H”, for example, is 72, so in that first iteration we know the term somehow evaluated to 40, as 72 - 32 = 40.
So how do we get there? That big number, 0xC894A7B75116601, is getting shifted right some number of bits. Let’s start evaluating the parentheses. (x >>= 4) means set x to itself shifted right by 4 bits; whatever that number is, we then bitwise AND it with 15, or 1111 in binary. This essentially means that each iteration we discard the rightmost hex digit of 0x7165498511230, then pull out the new rightmost digit. So the ((x >>= 4) & 15) term will evaluate to 3 in the first iteration, then 2, then 1, then 1, etc., until we run out of digits and the loop ends, since effectively we’re just waiting for x to be 0.
Next we take that number and multiply it by 7 (the multiplication binds tighter than the shift). Simple enough: for that first iteration we have 21. So we shift 0xC894A7B75116601 right 21 bits, then bitwise AND that against 0x7F, or 0111 1111 in binary. Just like last time, this means we’re pulling out the last 7 bits of whatever that ends up being. So our final value for that expression is gonna be some number between 0 and 127, which is finally added to 32 to tell us which character to print.
There are only 10 unique characters in “Hello, World!”, so they just assigned each one a digit 0-9, making 0x7165498511230 essentially “0xdlroW ,olleH!”. The first assignment happens before the first read, and the loop has a final iteration with x = 0 before it terminates, which is how the “!” gets from one end to the other. So they took the decimal values for all those ASCII characters, subtracted 32, then smushed them all together in 7-bit chunks to make 0xC894A7B75116601. The space is kinda hidden in the encoding: it was assigned 9, putting it right at the top end, and with the expression being 32 + stuff its chunk is 0, so it disappears into the infinitely assumed parade of 0s to the left of the C.
32 is ASCII space, the highest number you need is 114 for r (or 122 for z if you want to be generic), that’s a range of 82 or 90 values.
The target string has 13 characters, a long long has 8 bytes or 16 nibbles – 13 fits into 16 so nibbles (the (x >>= 4) & 15) it is. Also the initial x happens to have 13 nibbles in it so that makes sense. But a nibble only has 16 values, not 82, so you need some kind of compression and that’s the rest of the math, no idea how it was derived.
If I were to write that thing I’d throw PAQ at it; it can probably spit out an arithmetic coding that works, and look even more arcane as you wouldn’t have the obvious nibble steps. Or, wait, throw NEAT at it: train it to, given a specific initial seed, produce a second seed and a character, and score by edit distance. The problem space is small enough for the approach to be feasible even though it’s actually a terrible use of the technique, but using evolution will produce something that’s utterly, utterly inscrutable.
Yeah, but as far as I understand that’s not a C vulnerability. It wasn’t added. C just exposes how the underlying CPU works.
If you could avoid exposing dangerous memory quirks but still retain the same power… well, you’d have invented Rust. Rust is a better language than C, I agree with that.
Edit: Yep, just double checked. Buffers live in physical memory and have to be finite, so if you advance outside of them you’ll go somewhere else. Scanf’s not special, this is just another inherent pointer issue.
Okay, but how do you code on a CPU without directly interfacing the CPU at some point? Python and JavaScript both rely on things written in mid-level languages. There’s a difference between a bad tool and one that just has limitations inherent to the technology.
Like, to echo the meme a bit, it’s not a totally straight comparison. They have different roles.
Yes, also Rust. It wasn’t an option until recently though.
C or C++ definitely isn’t always worth it, but I’m not sure I’d class much of OS programming and all embedded and high-performance computing as small. If you have actual hard data about how big those applications are relative to others, I’d be interested.
Also, it’s a nitpick, but I’d personally say a footgun has to be unforeseeable, like literal shoe guns being added to a video game where guns were previously always visible. Once you understand pointers C is reasonably consistent, just hard and human-error-prone. The quirks follow from the general concepts the language is built on.
once you understand C++ the pitfalls of C++ are reasonably consistent
All of C++? That’s unreasonable, it’s even in the name that it’s very expansive. Yes, if you already know a thing, you won’t be surprised by it, that’s a tautology.
C is more than just pointers, obviously, but the vast majority of the difficulty there is pointers.
there are like what, 3 operating systems these days? assume those are all written entirely in c and combine them and compare that to all code ever written
Plus all previous operating systems, all supercomputer climate, physics and other science simulations, all the toaster and car and so on chips using bespoke operating systems because Linux won’t fit, every computer solving practical engineering or logistics problems numerically, renderers…
Basically, if your computational resources don’t vastly exceed the task to be done, C, Rust and friends are a good choice. If they do, use whatever is easy to not fuck up, so maybe Python or Haskell.
All of C++? That's unreasonable, it's even in the name that it's very expansive.
similarly, "all of pointers" is unreasonable
"all of pointers" can have a lot of unexpected results
that's literally why java exists as a language, and is so popular
Plus all previous operating systems, all supercomputer climate, physics and other science simulations, all the toaster and car and so on chips using bespoke operating systems because Linux won't fit, every computer solving practical engineering or logistics problems numerically, renderers...
sure, and the quantity of code where true low-level access is actually required is still absolutely minuscule compared to that where it isn't
“all of pointers” can have a lot of unexpected results
How? They go where they point, or to NULL, and can be moved by arithmetic. If you move them where they shouldn’t go, bad things happen. If you dereference NULL, bad things happen. That’s it.
sure, and the quantity of code where true low-level access is actually required is still absolutely minuscule compared to that where it isn’t
If you need to address physical memory or something, that’s a small subset of this for sure. It also just lacks the overhead other languages introduce, though. Climate simulations could be in Java or Haskell, but usually aren’t AFAIK.
How? They go where they point, or to NULL, and can be moved by arithmetic. If you move them where they shouldn't go, bad things happen. If you dereference NULL, bad things happen. That's it.
I suppose if you treat scanf as a black box, then yeah, that would be confusing. If you know that it’s copying information into the buffer you gave it, obviously you can’t fit more data into it than it’s sized for, and so the pointer must be wandering out of range.
Maybe C would be better without stdlib, in that sense. Like, obviously it would be harder to use, but you couldn’t possibly be surprised by a library function’s lack of safeness if there were none.
yeah i mean if you grok the underlying workings of scanf then there's no problem
i'd just argue that understanding what you need to understand is the problem with straight c, and with any language like c++ where you're liable to shoot thineself in thy foot
I’m wondering now how much you could add without introducing any footguns. I’d guess quite a bit, but I can’t really prove it. Smart pointers, at least, seem like the kind of thing that inevitably will have a catch, but you could read in and process text from a file more safely than that, just by implementing some kind of error handling.
True, but AFAIK they all sucked really bad. If you needed to make something that performed back then you wrote in assembly.
FORTRAN might be a good counterexample. It’s pretty fast, and I’m not actually sure if it’s memory safe; it might be. But, it’s definitely very painful to work with, having had the displeasure.
That’s pure assumption and, as far as I can tell, not actually true. PASCAL was a strong contender. No language was competitive with handwritten assembly for several decades after C’s invention, and there’s no fundamental reason why PASCAL couldn’t benefit from intense compiler optimizations just as C has.
Here are some papers from before C “won”, a more recent article about how PASCAL “lost”, and a forum thread about what using PASCAL was actually like. None of them indicate a strong performance advantage for C.
Hmm, that’s really interesting. I went down a bit of a rabbit hole.
One thing you might not know is that the Soviets had their own, actually older version of C, the Address (Адресный) programming language, which also had pointers and higher-order pointers, and probably was memory-unsafe as a result (though even with some Russian, I can’t find anything conclusive). The thing I eventually ran into is that Pascal itself has pointer arithmetic, and so is vulnerable to the same kinds of errors. Maybe it was better than C, which is fascinating, but not that much better.
Off-topic, that Springer paper was also pretty neat, just because it sheds light on how people thought about programming in 1979. For example:
In the following, we shall
1. compare how “convenient” the languages are to code our favourite solution to a programming problem,
2. play the devil’s advocate, and try to list all possible things that can go wrong in a program expressed in a language.
Some of us, including myself, have reservations about the validity of the second technique for comparison, the most persuasive argument being that even though some of the features are potentially dangerous, people rarely use them in those contexts. There is certainly some truth in this, but until we have experimentally collected data convincingly demonstrating this, it is wiser to disbelieve it. Take note of the observed fact of increased difficulty in formally proving the properties of programs that use these potentially hazardous features in a safe way. This is one of the reasons behind the increased redundancy (and restrictions) of the newer languages like Alphard
I don’t see a lot of people denying that 2 is a good metric today. In fact, in the rare exceptions where someone has come right out and said it, I’ve suspected JS Stockholm syndrome was involved. Murphy’s law is very real when you not only have to write code, but debug and maintain it for decades as a large team, possibly with significant turnover. Early on they were still innocent of that, and so this almost reads like something a non-CS academic would write about programming.
Indeed, I had no idea there are multiple languages referred to as “APL”.
I feel like most people defending C++ resort to “people shouldn’t use those features that way”. 😅
As far as I can tell, pointer arithmetic was not originally part of PASCAL; it’s just included as an extension in many implementations, but not all. Delphi, the most common modern dialect, only has optional pointer arithmetic, and only in certain regions of the code, kind of like unsafe in Rust. There are also optional bounds checks in many (possibly most) dialects. And in any case, there are other ways in which C is unsafe.
I feel like most people defending C++ resort to “people shouldn’t use those features that way”. 😅
And yeah, I’m with you, that’s a shit argument. A language is a tool, it exists to make the task easier. If it makes it harder by leading you into situations that introduce subtle bugs, that’s not a good tool. Or at least, worse than an otherwise similar one that wouldn’t.
Without a super-detailed knowledge of the history and the alternative languages to go off of, my suspicion is that being unsafe is intrinsic to making a powerful mid-level language. Rust itself doesn’t solve the problem exactly, but does control flow analysis to prove memory safety in (restricted cases of) an otherwise unsafe situation. Every other language I’m aware of either has some form of a garbage collector at runtime or potential memory issues.
I was looking up lambda functions for Rust because I needed them for something and didn’t know how, what, etc. But searching anything lambda now only shows results for fucking Amazon Lambda bullshit! Really pisses me off… it’s fucked 😠
Seriously though, spring configurations are written in XML and you create variables, call functions, and have control flow. Effectively turning XML into a horrible twisted shadow of a programming language.
All in the name of “configurability” through dependency injection.
I’m fond of saying that all great code earns its right to become good code by starting as trash…
But I still think we should all quietly and politely let Spring die a simple dignified death, as soon as possible.
Out of wildly morbid curiosity, do Maven and Ant still shit all over each other to make sure no one has any real idea what the build inputs and outputs are?
I shouldn’t ask things I don’t really want to know, though. My inbox is gonna be full of Java apologists.
It was a markup language until someone decided to parse and execute it as a programming language. This person should be watched for other deranged behavior.
I use XML as a markup language; what kind of deranged person thought to turn it into a programming language? My problems with the Lua API led me down the rabbit hole of making my own VM and implementation, not to looking at a markup language and going “what if I used this for scripting?”.
When they make XML do these things (or the way Github Actions does it with YAML), they’re essentially creating a representation of the AST that the compiler would make internally from a mini language. So there’s a few possibilities:
They don’t know how compilers work and reach for a tool they do know
They know, but figure the problem at hand doesn’t need the complexity of a mini language and start the project the quick and dirty way, and it gets out of hand as they add features
They may or may not know, but they do get caught up in the hype of some other tool (likely what happened with XSLT)
Was looking for the Pantheon reference in this thread! Just finished that show and loved it. Of course it takes plenty of liberties for the sake of the storytelling, but still, at least it explores these interesting topics!
Anyone reading this thread, do yourself a favor and check out Pantheon!
Right? Like what if as cells die or degrade instead of being replaced by the body naturally they are replaced by nanites/cybernetics/tech magic. If the process of fully converting took place over the course of 10 years, then I don’t see how the subject would even notice.
IDE: Oh! You mean farfignewton, right? I found that in some completely unrelated library you didn’t write. Allow me to complete that for you while you’re not paying attention.
I try my best to make my IDEs follow the principle that I should be able to type without looking at the screen, but apparently IDEs are really invested in Return accepting completions, to the point it’s often not configurable even when every other key is.
Indeed, God help whoever NASA puts in charge of date and time conversion.
If we do a lot of space travel we’ll have to get used to this, though. And even worse, there’s no consistent way of defining a frame of reference not subject to gravity, so there’s a chance any standard one will fall into a black hole, which is funny because it’s a tangible thing destroying a concept.
I go out of my way to find components that don’t have RGB lighting on them. When I use my computer, I want to be looking at the screens (the two-monitor part is true,) not the case.
I’ve got a piece of black tape over the power line on my computer, because it is too bright. And I have masking tape over the caps/num/scroll-lock lights on my keyboard, because they are also too bright. (The light is much gentler through the masking tape.)