I have been thinking that maybe modern programming languages should move away from supporting IEEE 754 all within one data type.
Like, we’ve figured out that having a null value for everything always is a terrible idea. Instead, we’ve started encoding potential absence into our type system with Option or Result types, which also encourages dealing with such absence at the edges of our program, where it should be done.
Well, NaN is null all over again. Instead, we could make the division operator an associated function which returns a Result<f64> and disallow f64 from ever being NaN.
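To make that concrete, here is a minimal Rust sketch of what such a division function could look like. The names `checked_div` and `DivError` are made up for illustration and are not an existing stdlib API. Note that in this sketch only NaN counts as an error, so `1.0 / 0.0` still yields infinity.

```rust
/// Why a checked division failed (hypothetical error type for this sketch).
#[derive(Debug, PartialEq)]
pub enum DivError {
    /// The quotient was NaN, e.g. 0.0 / 0.0 or inf / inf.
    NotANumber,
}

/// Divides `a` by `b` and reports a NaN quotient as an error
/// instead of handing it back to the caller.
pub fn checked_div(a: f64, b: f64) -> Result<f64, DivError> {
    let quotient = a / b;
    if quotient.is_nan() {
        Err(DivError::NotANumber)
    } else {
        Ok(quotient)
    }
}
```

A caller would then have to decide on the spot what a failed division means, for example with `checked_div(x, y)?` or a `match`.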
My main concern is interop with the outside world. So, I guess, there would still need to be an IEEE 754-compliant data type. But we could call it ieee_754_f64 to really get on the nerves of anyone wanting to use it when it’s not strictly necessary.
My secondary concern is that AI models would still want to just calculate with tons of floats, without error handling at every intermediate step, even if it sometimes means that the end result is a shitty vector of NaNs. That would be supported by the ieee_754_f64 type, too.
Well, that is what I meant. That NaN is effectively an error state. It’s only like null in that any float can be in this error state, because you can’t rule out this error state via the type system.
Why do you feel like it’s not a great solution to make NaN an explicit error?
There’s plenty of cases where I would like to do some large calculation that can potentially give a NaN at many intermediate steps. I prefer to check for the NaN at the end of the calculation, rather than have a bunch of checks in every intermediate step.
How I handle the failed calculation is rarely dependent on which intermediate step gave a NaN.
This feels like people want to take away a tool that makes development in the engineering world a whole lot easier because “null bad”, or because they can’t see the use of multiplying 1e27 with 1e-30.
Well, I’m not saying that I want to take tools away. I’m explicitly saying that an ieee_754_f64 type could exist. I just want it to be named annoyingly, so anyone who doesn’t know why they should use it will avoid it.
If you chain a whole bunch of calculations where you don’t care for NaN, that’s also perfectly unproblematic. I just think it would be helpful to:
1. Nudge people towards doing a NaN check after such a chain of calculations, because it can be a real pain if you don’t do it.
2. Document in the type system that this check has already taken place. If you know that a float can’t be NaN (and isn’t infinite either), then you have guarantees that, for example, addition will never produce a NaN. It allows you to remove some of the defensive checks you might have felt the need to perform on parameters.
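A rough Rust sketch of both points, assuming a hypothetical `NonNan` wrapper (the type and all function names here are invented for illustration): the chain of calculations runs on plain `f64` without intermediate checks, a single check happens at the end, and everything downstream takes the wrapper type.

```rust
/// A float that has been checked to not be NaN (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct NonNan(f64);

impl NonNan {
    /// The only way to build a `NonNan`: returns `None` if `x` is NaN.
    pub fn new(x: f64) -> Option<NonNan> {
        if x.is_nan() { None } else { Some(NonNan(x)) }
    }

    /// Gives back the wrapped float.
    pub fn get(self) -> f64 {
        self.0
    }
}

/// A chain of plain f64 arithmetic with no intermediate checks.
/// It yields NaN for 0.0 / 0.0 or for the square root of a negative number.
fn normalized_offset(reading: f64, baseline: f64, scale: f64) -> f64 {
    ((reading - baseline) / scale).sqrt()
}

/// One check at the end of the chain; the return type records that it happened.
pub fn checked_offset(reading: f64, baseline: f64, scale: f64) -> Option<NonNan> {
    NonNan::new(normalized_offset(reading, baseline, scale))
}

/// Downstream code takes `NonNan`, so it needs no defensive `is_nan()` check.
pub fn store(value: NonNan) -> String {
    format!("{:.3}", value.get())
}
```

Because `store` cannot be called with a raw `f64`, a forgotten check shows up as a compile error instead of a NaN in the database.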
Special cases are allowed to exist and shouldn’t be made noticeably more annoying. I just want it to not be the default, because it’s more dangerous and in the average applications, lots of floats are just passed through, so it would make sense to block NaNs right away.
What do you do about a dataset which contains 11999 fine numbers, but one of them is NaN because George called in sick that week? Throw away the whole dataset because it doesn’t fit the data type?
but in statistics, you deal with NaNs all the time. Data is absent from the data set. If it would be an error every time, you wouldn’t get anything done.
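One possible answer to the missing-data case, sketched in Rust: keep the absent value as absence (`Option<f64>`) instead of as NaN, and let the statistics skip it. This is only an assumption about how a NaN-free float type might be combined with incomplete datasets; `from_raw` and `mean_present` are hypothetical helper names.

```rust
/// Converts NaN-bearing data from the outside world into explicit absence.
pub fn from_raw(raw: &[f64]) -> Vec<Option<f64>> {
    raw.iter()
        .map(|&x| if x.is_nan() { None } else { Some(x) })
        .collect()
}

/// Mean of the readings that are present; missing entries are skipped.
/// Returns `None` if no reading is present at all.
pub fn mean_present(readings: &[Option<f64>]) -> Option<f64> {
    let mut sum = 0.0;
    let mut count = 0usize;
    for value in readings.iter().flatten() {
        sum += value;
        count += 1;
    }
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

The week George called in sick becomes a `None`, and the other 11999 numbers are still usable.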
My thinking is that a call to the safe division method would check after the division, whether the result is a NaN. And if it is, then it returns an Error-value, which you can handle.
Obviously, you could do the same with a NaN by just throwing an if-else after any division statement, but I would like to enforce it in the type system that this check is done.
I agree with moving away from floats but I have a far simpler proposal… just use a struct of two integers - a value and an offset. If you want to make it an IEEE standard where the offset is a four bit signed value and the value is just a 28 or 60 bit regular old integer then sure - but I can count the number of times I used floats on one hand and I can count the number of times I wouldn’t have been better off just using two integers on -0 hands.
Floats specifically solve the issue of how to store an absurdly large range of values in an extremely modest amount of space - that’s not a problem we need to generalize a solution for. In most cases having values up to the million magnitude with three decimals of precision is good enough. Generally speaking when you do float arithmetic your numbers will be within an order of magnitude or two… most people aren’t adding the length of the universe in seconds to the width of an atom in meters… and if they are, floats don’t work anyways.
I think the concept of having a fractionally defined value with a magnitude offset was just deeply flawed from the get-go - we need some way to deal with decimal values on computers but expressing those values as fractions is needlessly imprecise.
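For illustration, a small Rust sketch of the two-integer idea. It assumes the offset is a power-of-ten exponent and uses an `i64` plus an `i8` instead of the 60-bit and 4-bit fields mentioned above; the `Scaled` type and its methods are hypothetical.

```rust
/// A decimal number stored as `value * 10^exponent` (hypothetical layout).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Scaled {
    pub value: i64,
    pub exponent: i8,
}

impl Scaled {
    /// Rewrites the number with a smaller (or equal) exponent.
    /// Returns `None` on overflow or if the target exponent is larger.
    fn rescale(self, exponent: i8) -> Option<Scaled> {
        let shift = i32::from(self.exponent) - i32::from(exponent);
        if shift < 0 {
            return None;
        }
        let factor = 10i64.checked_pow(shift as u32)?;
        Some(Scaled {
            value: self.value.checked_mul(factor)?,
            exponent,
        })
    }

    /// Exact addition: align both operands to the smaller exponent, then add.
    /// Returns `None` if the aligned values do not fit into an i64.
    pub fn checked_add(self, other: Scaled) -> Option<Scaled> {
        let exponent = self.exponent.min(other.exponent);
        let a = self.rescale(exponent)?;
        let b = other.rescale(exponent)?;
        Some(Scaled {
            value: a.value.checked_add(b.value)?,
            exponent,
        })
    }
}
```

With this representation 0.1 + 0.2 is exactly 0.3, while adding values that are 40 orders of magnitude apart fails loudly with `None` instead of silently dropping the smaller operand.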
While I get your proposal, I’d think this would make dealing with float hell. Do you really want to .unwrap() every time you deal with it? Surely not.
One thing that would be great, is that the / operator could work between Result and f64, as well as between Result and Result. Would be like doing a .map(|left| left / right) operation.
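Roughly, that could look like the Rust sketch below. Since Rust’s orphan rule does not allow implementing the `Div` operator trait directly on `Result` outside the standard library, the sketch wraps the result in a newtype; `Checked` and `NanError` are hypothetical names.

```rust
use std::ops::Div;

/// Marker error: a value or quotient was NaN (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NanError;

/// Newtype around `Result<f64, NanError>` so that `/` can be implemented for it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Checked(pub Result<f64, NanError>);

impl Checked {
    /// Wraps a raw float, turning NaN into an error.
    pub fn new(x: f64) -> Checked {
        if x.is_nan() {
            Checked(Err(NanError))
        } else {
            Checked(Ok(x))
        }
    }
}

/// `Checked / f64`: like `.map(|left| left / right)`, plus a NaN check on the quotient.
impl Div<f64> for Checked {
    type Output = Checked;

    fn div(self, right: f64) -> Checked {
        match self.0 {
            Ok(left) => Checked::new(left / right),
            Err(e) => Checked(Err(e)),
        }
    }
}

/// `Checked / Checked`: an error on either side propagates.
impl Div<Checked> for Checked {
    type Output = Checked;

    fn div(self, right: Checked) -> Checked {
        match right.0 {
            Ok(r) => self / r,
            Err(e) => Checked(Err(e)),
        }
    }
}
```

Several divisions can then be chained, as in `Checked::new(a) / b / c`, and the error only has to be handled once at the end of the chain.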
Well, not every time. Only if I do a division or get an ieee_754_f64 from the outside world. That doesn’t happen terribly often in the applications I’ve worked on.
And if it does go wrong, I do want it to explode right then and there. Worst case would be, if it writes random NaNs into some database and no one knows where they came from.
As for your suggestion with the slash accepting Results, yeah, that could resolve some pain, but I’ve rarely seen multiple divisions being necessary back-to-back and I don’t want people passing around a Result<f64> in the codebase. Then you can’t see where it went wrong anymore either.
So, personally, I wouldn’t put that division operator into the stdlib, but having it available as a library, if someone needs it, would be cool, yeah.
I just store mine in memory (meat memory, not the computer stuff). If someone wants the source code I just tell them. Version control by oral tradition.
Omg someone please help how did I get this far they’re going to realize I’m stupid when they fire me everything will collapse because it’s all in a single excel file I need to figure out how to live in a tent in the woods and hunt and forage
For an open-source project like the above, which has so many constantly moving parts, a discord is probably a good idea to ensure the author of the issue can provide more details about their problem and respond to follow-ups immediately.
Because I can absolutely see a breaking change involving something outside of the open-source project itself.
I say that as a person who hates discord. But I’m also part of the older generation, so waiting 3-9 months for a reply is kinda normal. And in the projects I support, it’s pretty common to make a merge request that finally gets approved two years later.
to ensure the author of the issue can provide more details about their problem and respond to follow up immediately.
if you actually visit that Discord (like I reluctantly do, from time to time), you’ll find that all issues are being discussed in a handful of general channels with multiple people discussing multiple issues at the same time in one never-ending stream of messages. if you miraculously find a proper keyword that brings up someone else having the same issue as you do, the only way to find out if someone else replied to it is by scrolling through all that noise.
I don’t think that will have the impact people think it will, maybe at first, but eventually it’ll just start treating “wrong” code as a negative and reference it as a “how NOT to do things” lmao
For sure, but just like that whole “poison our pictures” from artists thing, the people building these models (be it a company or researchers or even hobbyists) are going to start modifying the training process so that the AI model can recognize bad code. And that’s assuming it doesn’t already, I think without that capability from the getgo the current models would be a lot worse at what they generate than they are as is lmao
Are they based out of the PNW? Now that I think about it, I may actually have interviewed with them at one point.
ETA: Yeah, pretty sure it was them, they’re PT and have a 425 DID for sales, and the company name is wholly unrelated to the product. Had forgotten about them entirely, and would have had the same reaction as OP to getting that email now.
And it probably is the sw product the email was referencing, since Bartender is capitalized.
Didn’t ChatGPT become very bad recently? It used to give code that really worked, but now it gets things wrong and doesn’t follow context. It gives code, but when you ask it to improve it by giving more context, it ignores the previous answer and gives wrong code.
It even sometimes answers by saying it does not have the answer for questions that it answered a few months ago.
How is that a niche api question? That’s a public api that is scraped up.
It’s also a terrible way to ask the question. It’s how a clueless newb asks questions. Anyone hoping to help needs to at least know: what are you attempting to use the endpoint for, and what results are you receiving vs. expecting?