Dark_Arc ,
@Dark_Arc@social.packetloss.gg

I actually work on a C++ compiler… I think I should weigh in. The general consensus here that things are lossy is correct but perhaps non-obvious if you’re not familiar with the domain.

When you compile a program you’re taking the source, turning it into a graph that represents every aspect of the program, and then generating some kind of intermediate representation (IR) that then gets turned into machine code.

You lose things like code comments right off the bat, because the machine doesn’t care about them.

Then you lose local variable and function parameter names because the machine doesn’t care about those things.
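To make the name loss concrete, here is a toy sketch (not how any real compiler is structured). The `Node` tree, the `Lowering` helper, and the variable names `playerSpeed` and `bonus` are all hypothetical, made up for illustration. The expression `(playerSpeed + bonus) * playerSpeed` is lowered to flat IR-like lines, and the names only survive in a compile-time table that is never emitted.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy AST: a node is either a named variable or a binary operation.
struct Node {
    std::string name;                // set for variables, e.g. "playerSpeed"
    char op = 0;                     // set for binary operations, e.g. '+'
    std::unique_ptr<Node> lhs, rhs;  // operands of a binary operation
};

std::unique_ptr<Node> var(std::string name) {
    auto n = std::make_unique<Node>();
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> bin(char op, std::unique_ptr<Node> lhs, std::unique_ptr<Node> rhs) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->lhs = std::move(lhs);
    n->rhs = std::move(rhs);
    return n;
}

struct Lowering {
    std::map<std::string, int> slots;  // name -> slot number, exists at compile time only
    std::vector<std::string> ir;       // the emitted IR-like lines
    int nextTemp = 0;

    // Returns the slot or temporary that holds the value of `n`.
    std::string lower(const Node& n) {
        if (n.op == 0) {
            // A variable: hand out a numbered slot the first time it is seen.
            auto it = slots.try_emplace(n.name, static_cast<int>(slots.size())).first;
            return "s" + std::to_string(it->second);
        }
        std::string a = lower(*n.lhs);
        std::string b = lower(*n.rhs);
        std::string t = "t" + std::to_string(nextTemp++);
        ir.push_back(t + " = " + a + " " + n.op + " " + b);
        return t;
    }
};

// Lowers (playerSpeed + bonus) * playerSpeed and returns only the emitted lines.
std::vector<std::string> lowerExample() {
    auto tree = bin('*', bin('+', var("playerSpeed"), var("bonus")), var("playerSpeed"));
    Lowering l;
    l.lower(*tree);
    return l.ir;
}
```

The emitted lines refer only to numbered slots and temporaries. Anyone working backwards from them has to invent their own names.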

Then you lose your class structure … because the machine really just cares about the total size of the thing it’s passing around. You can recover some of this information by looking at the functions, but it’s not always going to be straightforward, because not every constructor initializes everything, things like unions add further complexity, and not every memory allocation uses a constructor. You won’t get the names of any data members/fields though, because … again, the machine doesn’t care.
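A small sketch of what that looks like, assuming a made-up `Player` struct (the type and its fields are hypothetical). The first function is what the source says. The second is closer to what survives compilation: a read of some bytes at a fixed offset from a pointer, with no field name attached.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical class, for illustration only.
struct Player {
    int   health;
    float x;
    float y;
};

// What the source says: read the field called "y".
float readYNamed(const Player& p) { return p.y; }

// The compiler bakes the field's position in as a plain number.
// (Commonly 8 on mainstream ABIs, but that is an assumption, so it is computed here.)
constexpr std::size_t kOffsetOfY = offsetof(Player, y);

// Roughly what a decompiler sees: "load a float from (pointer + constant)".
float readYRaw(const void* object) {
    float value;
    std::memcpy(&value, static_cast<const char*>(object) + kOffsetOfY, sizeof value);
    return value;
}
```

Both functions return the same value for the same object. Only the first one tells you that the value is a coordinate that belongs to a player.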

So what you’re left with is basically the mangled names of functions and what you can derive from how instructions access memory.

The mangled names normally tell you a lot: the namespace, the class (if any), and the argument count and types. Of course, that’s not guaranteed either; it’s just how we come up with unique, stable names for the various things in your program. It could just as well function with a bunch of UUIDs if you set up a table on the compiler’s side to associate everything.
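As an illustration, here is a sketch that decodes a mangled name using `abi::__cxa_demangle` from `<cxxabi.h>`. This assumes GCC or Clang with the Itanium C++ ABI (it does not apply to MSVC, which mangles differently). The symbol `game::Player::move(int, int)` is a made-up example, not taken from any real program.

```cpp
#include <cxxabi.h>  // GCC/Clang only (Itanium C++ ABI)
#include <cstdlib>
#include <string>

// Returns the human-readable form of an Itanium-mangled symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the result is malloc'ed and must be freed by the caller.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}

// Example: "_ZN4game6Player4moveEii" encodes the namespace "game", the class
// "Player", the function "move", and two int parameters ("ii").
```

Note what the mangled name does not contain: the parameter names, the return type of an ordinary member function like this one, or anything about the function body.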

But wait! There’s more! The optimizer can do some really wild things in the name of speed… Including combining functions. Those constructors? Gone, now they’re just some more operations in the function bodies. That function you wrote to help improve readability of your code? Gone. That function you wrote to deduplicate code? Gone. That eloquent recursive logic you wrote? Gone, now it’s the moral equivalent of a giant mess of goto statements. That template code that makes use of dozens of instantiated functions? Those functions are gone now too; instead it’s all the instantiated logic puked out into one giant function. That piece of logic computing a value? Well the compiler figured out it’s always 27, so the logic to compute it? Gone.
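Two tiny sketches of the kind of code this applies to (the function names are hypothetical). What the optimizer actually does depends on the compiler, version, and flags, so the comments describe typical behavior at something like `-O2` on GCC or Clang, not a guarantee.

```cpp
// A helper written for readability. Optimizers commonly inline calls to
// something this small, so it may not exist as a separate function afterwards.
int cube(int x) { return x * x * x; }

// All inputs are known at compile time, so after inlining and constant folding
// this typically becomes the equivalent of "return 27;".
int computeValue() {
    int base = 3;
    return cube(base);
}

// Tail-recursive sum of 1..n. Optimizers commonly turn this into a plain loop,
// so the recursion is not visible in the machine code.
int sumTo(int n, int acc = 0) {
    return n == 0 ? acc : sumTo(n - 1, acc + n);
}
```

A decompiler looking at the optimized output of `computeValue` would see a constant being returned, with no trace of `cube` or `base`.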

Now, not all of that happens every time; those optimizations aren’t always possible, and they aren’t always worthwhile … But you can see how incredibly difficult it is to reconstruct a program once it’s been compiled and optimized. Even if you do reconstruct it, there’s a very low chance it will look anything like what you started with.

kirkmoodey ,
@kirkmoodey@universeodon.com

@Squizzy
Lots of other people have addressed this, so I won't repeat the whole thing. You can absolutely do disassembly work, it's just a pain in the rear.
But it's actually been done for Mario, since you brought it up:
https://github.com/IsoFrieze/SMWDisX
And also Pokemon.

schnurrito ,

Others have explained that decompiling is a thing.

I mainly work in Java where (due to the way Java bytecode works) decompiled code is actually very close to the original source code.

Most games are written in lower-level languages like C++ where that is not the case: variable and function names are lost during compilation.

spudwart ,

Well, actually it can be. It just takes a lot more effort to decompile code than to compile it, depending on how accurate the result needs to be.

Example: the Super Mario 64 Decompilation project. This was a project that used various debug data that was left in the ROM to decompile the game back to source code that compiles to a byte-accurate version of the ROM. This took about 3 years and a lot of skilled developers to accomplish.

Side note: Super Mario Bros wasn’t built using a compiled language, but rather assembly. So technically that would be a disassembly, not a decompilation.
