struct PizzaOrder {
    size: PizzaSize,
    toppings: Vec<Topping>,
    crust_type: CrustType,
    ordered_at: SystemTime,
}
The problem they want to address is partial equality: comparing orders while ignoring the ordered_at timestamp. To me, the real problem is throwing too many unrelated concerns into one struct. Ideally, instead of using destructuring to compare only the specific fields you care about, you'd decompose this into two structs:
#[derive(PartialEq, Eq)]
struct PizzaDetails {
    size: PizzaSize,
    toppings: Vec<Topping>,
    crust_type: CrustType,
    … // additional fields
}
#[derive(Eq)]
struct PizzaOrder {
    details: PizzaDetails,
    ordered_at: SystemTime,
}
impl PartialEq for PizzaOrder {
    fn eq(&self, rhs: &Self) -> bool {
        self.details == rhs.details
    }
}
I get that this is a toy example meant to illustrate the point; there are certainly more complex cases where there's no clean boundary to split your struct across. But this should be the first tool you reach for.
PartialEq and Eq for PizzaDetails is good. If there is a business function that computes whether or not someone orders the same thing, then that should start by projecting the details.
It's not difficult to write the predicate same_details_as() and then it's obvious to reviewers if that's what we meant and discourages weird ad-hoc code which might stop working when the PizzaDetails is redefined.
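A minimal sketch of what such a predicate could look like, assuming the PizzaDetails/PizzaOrder split from earlier in the thread. The enum variants here are made up for illustration, and the struct is trimmed to two fields:

```rust
use std::time::SystemTime;

// Simplified stand-ins for the thread's types; the variants are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
enum PizzaSize {
    Small,
    Large,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum Topping {
    Cheese,
    Mushroom,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct PizzaDetails {
    size: PizzaSize,
    toppings: Vec<Topping>,
}

#[derive(Debug, Clone)]
struct PizzaOrder {
    details: PizzaDetails,
    ordered_at: SystemTime,
}

impl PizzaOrder {
    /// True when both orders describe the same pizza, whatever their timestamps.
    /// The name says what is being compared, so reviewers don't have to guess.
    fn same_details_as(&self, other: &Self) -> bool {
        self.details == other.details
    }
}
```

Note that PizzaOrder deliberately has no PartialEq here, so `order_a == order_b` doesn't compile and callers have to say which comparison they mean.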
How would you decompose a character string so that you could have a case-insensitive versus sensitive comparison?
:)
With a capitalization bit mask of course!
And you can speed up full equality comparisons with a quick cap equality check first.
(That is the how. The when is probably "never". :)
If your function gets ownership of, or an exclusive reference to an object, then you know for sure that this reference, for as long as it exists, is the only one in the entire program that can access this object (across all threads, 3rd party libraries, recursion, async, whatever).
References can't be null. Smart pointers can't be null. Not merely "can't" meaning not allowed and may throw or have a dummy value, but just can't. Wherever such type exists, it's already checked (often by construction) that it's valid and can't be null.
If your object's getter lends an immutable reference to its field, then you know the field won't be mutated by the caller (unless you've intentionally allowed mutable "holes" in specific places by explicitly wrapping them in a type that grants such access in a controlled way).
If your object's getter lends a reference, then you know the caller won't keep the reference for longer than the object's lifetime. If the type is not copyable/cloneable, then you know it won't even get copied.
If you make a method that takes ownership of `self`, then you know for sure that the caller won't be able to call any more methods on this object (e.g. `connection.close(); connection.send()` won't compile, `future.then(next)` only needs to support one listener, not an arbitrary number).
If you have a type marked as non-thread safe, then its instances won't be allowed in any thread-spawning functions, and won't be possible to send through channels that cross threads, etc. This is verified globally, across all code including 3rd party libraries and dynamic callbacks, at compile time.
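To make the "method that takes ownership of `self`" point concrete, here is a small sketch. Connection and its methods are hypothetical, not any real library's API:

```rust
/// Hypothetical connection type, for illustration only.
struct Connection {
    sent: Vec<String>,
}

impl Connection {
    fn open() -> Self {
        Connection { sent: Vec::new() }
    }

    /// Borrows the connection mutably, so it stays usable afterwards.
    fn send(&mut self, msg: &str) {
        self.sent.push(msg.to_owned());
    }

    /// Takes `self` by value: the connection is moved in and dropped here.
    /// Returns the number of messages sent, just to have something to observe.
    fn close(self) -> usize {
        self.sent.len()
    }
}

fn send_then_close() -> usize {
    let mut connection = Connection::open();
    connection.send("hello");
    let count = connection.close();
    // Uncommenting the next line is a compile error (E0382, use of moved value):
    // connection.send("again");
    count
}
```

The "can't call it after close" rule lives in the signature of `close`, so there is no runtime "is it closed?" flag to check or forget.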
Question: how do you encourage such patterns within a team? I often find it difficult to do during code reviews, where it leads to unproductive arguments about "code style" and "preferences".
Funnily, these arguments do not happen when a linter pops a warning instead...
The same day Cloudflare had its unwrap fiasco, I found a bug in my code because of a slice that in certain cases went past the end of a vector. Switched it to use iterators and will definitely be more careful with slices and array indexes in the future.
Was it a fiasco? Really? The rust unwrap call is the equivalent to C code like this:
int result = foo(…);
assert(result >= 0);
If that assert tripped, would you blame the assert? Of course not. Or blame C? No. If that assert tripped, it’s doing its job by telling you there’s a problem in the call to foo().
You can write buggy code in rust just like you can in any other language.
If you read the postmortem, they talk in depth about what the issue really was - which from memory is that their software statically allocated room for 20 rules or something. And their database query unexpectedly returned more than 20 items. Oops!
I can see the argument for renaming unwrap to unwrap_or_panic. But no alternate spelling of .unwrap() would have saved cloudflare from their buggy database code.
You can go ahead and grep your codebase for this today, instead of waiting for an incident.
I'm a fairly new migrant from Java to C#, and when I do some kind of collection lookup, I still need to check whether the method will return a null, throw an exception, expect an out variable, or worst of all, make up some kind of default. C#'s equivalent to unwrap seems to be '!' (or maybe .Val() or something?)
"The fromJust function extracts the element out of a Just and throws an error if its argument is Nothing."
Nope. Rust never makes any guarantees that code is panic-free. Quite the opposite. Rust crashes in more circumstances than C code does. For example, indexing past the end of an array is undefined behaviour in C. But if you try that in rust, your program will detect it and crash immediately.
More broadly, safe rust exists to prevent undefined behaviour. Most of the work goes to stopping you from making common memory related bugs, like use-after-free, misaligned reads and data races. The full list of guarantees is pretty interesting[1]. In debug mode, rust programs also crash on integer overflow and underflow. (Thanks for the correction!). But panic is well defined behaviour, so that's allowed. Surprisingly, you're also allowed to leak memory in safe rust if you want to. Why not? Leaks don't cause UB.
You can tell at a glance that unwrap doesn't violate safe rust's rules because you can call it from safe rust without an unsafe block.
[1] https://doc.rust-lang.org/reference/behavior-considered-unde...
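For anyone who wants to see the indexing behaviour described above, a short sketch using only the standard library. The function names are made up; `get` is the non-panicking counterpart to `[]`, and it also accepts ranges, which covers the slice-past-the-end case:

```rust
/// Returns the element at `i`, or `None` when `i` is out of range.
fn element_at(values: &[i32], i: usize) -> Option<i32> {
    values.get(i).copied()
}

/// Returns the sub-slice `start..end`, or `None` when the range does not fit.
fn window(values: &[i32], start: usize, end: usize) -> Option<&[i32]> {
    values.get(start..end)
}

/// Indexing with `[]` is bounds-checked: out of range means a panic, not UB.
/// `catch_unwind` is used here only to observe the panic (assumes the default
/// panic = "unwind" setting).
fn index_panics(values: &[i32], i: usize) -> bool {
    std::panic::catch_unwind(|| values[i]).is_err()
}
```

Both forms are safe rust. The only difference is whether the out-of-range case shows up as a value the caller must handle or as a panic.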
Also, when I say safety guarantees, I'm not talking about safe rust. I'm talking about Rust features that prevent bugs, like the borrow checker, types like Result and many others.
All integer overflow, not just unsigned. Similarly, in release mode (by default) all integer overflow is fully defined as two's complement wrap.
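If you don't want the answer to depend on the build profile, the standard integer methods let you pick the overflow behaviour explicitly. A small sketch (the wrapper function names are made up; `checked_add` and `wrapping_add` are the real std methods):

```rust
/// Overflow becomes `None` in every build profile.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Two's complement wrap, requested explicitly, in every build profile.
fn add_wrapping(a: i8, b: i8) -> i8 {
    a.wrapping_add(b)
}
```

So `add_checked(250, 6)` is `None`, and `add_wrapping(i8::MAX, 1)` is `i8::MIN`, for signed and unsigned types alike.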
You might think that the Haskell behavior is “safer” in some sense, but there’s a huge gotcha: exceptions in pure code are the mortal enemy of lazy evaluation. Lazy evaluation means that an exception can occur after the catch block that surrounded the code in question has exited, so the exception isn’t guaranteed to get caught.
Exceptions can be ok in a monad like IO, which is what they’re intended for - the monad enforces an evaluation order. But if you use a partial function like fromJust in pure code, you have to be very careful about forcing evaluation if you want to be able to catch the exception it might generate. That’s antithetical to the goal of using exceptions - now you have to write code carefully to make sure exceptions are catchable.
The bottom line is that for reliable code, you need to avoid fromJust and friends in Haskell as much as you do in Rust.
The solution in both languages is to use a linter to warn about the use of partial functions: HLint for Haskell, Clippy for Rust. If Cloudflare had done that - and paid attention to the warning! - they would have caught that unwrap error of theirs at linting time. This is basically a learning curve issue.
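On the Rust side, the relevant Clippy lints are `clippy::unwrap_used` and `clippy::expect_used`. They are in the restriction group, so they are off unless you opt in. A sketch of opting in for one function; the function and the config map are hypothetical, and plain rustc ignores the attribute, only `cargo clippy` enforces it:

```rust
use std::collections::HashMap;

/// Looks up a hypothetical rule limit. A missing key is reported to the
/// caller instead of panicking.
#[deny(clippy::unwrap_used, clippy::expect_used)]
fn rule_limit(config: &HashMap<String, usize>, key: &str) -> Result<usize, String> {
    // Writing `config.get(key).copied().unwrap()` here would be rejected by Clippy.
    config
        .get(key)
        .copied()
        .ok_or_else(|| format!("missing config key: {key}"))
}
```

The same attribute can go at the crate root as `#![deny(...)]` if you want it everywhere, with `#[allow(...)]` on the few call sites where a panic really is the intended behaviour.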
https://docs.rs/itertools/latest/itertools/trait.Itertools.h...
Like whenever I read posts like this, they're always fairly anecdotal. Sometimes there will even be posts about how large refactor x unlocked new capability y. But the rationale always reads somewhat retconned (or again, anecdotal*). It seems to me that maybe such continuous meta-analysis of one's own codebases would have great potential utility?
I'd imagine automated code smell checking tools can only cover so much at least.
* I hammer on about anecdotes, but I do recognize that sentiment matters. For example, if you're planning work, if something just sounds like a lot of work, that's already going to be impactful, even if that judgement is incorrect (since that misjudgment may never come to light).
We do the work that’s too large in scope for other teams to handle, and clearly documenting and enforcing best practices is one component of that. Part of that is maintaining a comprehensive linting suite, and the other part is writing documentation and educating developers. We also maintain core libraries and APIs, so if we notice many teams are doing the same thing in different ways, we’ll sit down and figure out what we can build that’ll accommodate most use cases.
https://www.moderndescartes.com/essays/readability/
(I have not read this article closely, but it is about the right concept, so I provide it as a starting point since "readability" writ large can be an ambiguous term.)
These roles don’t really have standard titles in the industry, as far as I’m aware. At Google we were part of the larger language/library/toolchain infrastructure org.
Much of what we did was quasi-political … basically coaxing and convincing people to adopt best practices, after first deciding what those practices are. Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Speaking to the original question, no, there were no teams just manually reading code and looking for mistakes. If buggy code could be detected in an automated way, then we’d do that and attempt to fix it everywhere. Otherwise we’d attempt to educate and get everyone to level up their code review skills.
> Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Are you aware how those engineers established their recommendations? Did they maybe perform case studies? Or was it more just a distillation of lived experience type of deal?
One question about avoiding boolean parameters, I’ve just been using structs wrapping bools. But you can’t treat them like bools… you have to index into them like wrapper.0.
Is there a way to treat the enum-style replacement for bools like normal bools, or is it just done with matches! or match statements?
It’s probably not too important but if we could treat them like normal bools it’d feel nicer.
enum MyType {
    ...
}
impl MyType {
    pub fn is_useable_in_this_way(&self) -> bool {
        // possibly ...
        match self {...}
    }
}
and later:
pub fn use_in_that_way(e: MyType) {
    if e.is_useable_in_this_way() {...}
}
Or if you hate all that there's always:
if let MyType::Member(x) = e {
    ...
}
For ints you can implement the deref trait on structs. So you can treat YourType(u64) as a u64 without destructuring. I couldn’t figure out a way to do that with YourType(bool).
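For the bool case specifically, one option is a two-variant enum with a small predicate method, so call sites still read like a bool check. A sketch with made-up names:

```rust
/// Hypothetical two-state flag standing in for a `bool` parameter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Caching {
    Enabled,
    Disabled,
}

impl Caching {
    /// Lets call sites write `if caching.is_enabled()` much like a plain bool.
    fn is_enabled(self) -> bool {
        matches!(self, Caching::Enabled)
    }
}

/// The parameter type documents itself: `describe(Caching::Disabled)` reads
/// better at the call site than `describe(false)`.
fn describe(caching: Caching) -> &'static str {
    if caching.is_enabled() {
        "cached"
    } else {
        "uncached"
    }
}
```

You still can't write a bare `if caching { ... }`, but the one extra method call is usually the whole cost.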
Actually the From trait documentation is now extremely clear about when to implement it (https://doc.rust-lang.org/std/convert/trait.From.html#when-t...)
The 'defensive' nature refers to the mindset of the programmer (like when guilty people are defensive when being asked a simple question), that he isn't sure of anything in the code at any point, so he needs to constantly check every invariant.
Enterprise code is full of it, and it can quickly lead to the program becoming like 50% error handling by volume, many of the errors being impossible to trigger because the app logic is validating a condition already checked in the validation layer.
Its presence usually betrays a lack of understanding of the code structure, or even worse, a faulty or often bypassed validation layer, which makes error checking in multiple places actually necessary.
One example is validating every parameter in every call layer, as if the act of passing things around has the ability to degrade information.
A function must check its arguments. It cannot assume that the arguments are already checked (against its own requirements). This is regardless of what called it, or where the values came from.
Using ..Default::default() means “whatever additional fields are added later, I don’t care”. Which is great until someone needs to add a field to the struct, and they rely on the compiler to tell them all the places that don’t have a value for the field (so they can pass the right value depending on the situation.) Then the callers with Default are missed, and bugs can result.
Any time you say “I don’t care what happens in the future here”, you better have a good reason for that to be the case, IMO.
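A sketch of the difference, using a made-up settings struct:

```rust
/// Hypothetical settings struct, for illustration only.
#[derive(Debug, Default, PartialEq)]
struct RetryPolicy {
    max_attempts: u32,
    backoff_ms: u64,
    jitter: bool,
}

/// Every field is named, so adding a field to `RetryPolicy` makes this
/// function fail to compile until someone decides on a value.
fn explicit_policy() -> RetryPolicy {
    RetryPolicy { max_attempts: 3, backoff_ms: 200, jitter: true }
}

/// Any field not listed silently takes its `Default` value, including
/// fields added later. Here `jitter` ends up `false` without being mentioned.
fn defaulted_policy() -> RetryPolicy {
    RetryPolicy { max_attempts: 3, backoff_ms: 200, ..Default::default() }
}
```

Both compile today. Only the first one will get the compiler's attention when the struct grows.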
It's not too uncommon in other languages (sometimes under the name "immediately invoked function expression"), though depending on the language you may see lambdas involved. For example, here's one of the examples from the article ported to C++:
auto data = []() {
    auto data = get_vector();
    auto temp = compute_something();
    data.insert_range(data.end(), temp);
    std::ranges::sort(data);
    return data;
}();
"Defensive programming" has multiple meanings. To the extent it means "avoid using _ as a catch-all pattern so that the compiler nags you if someone adds an enum arm you need to care about", "defensive" programming is good.
That said, I wouldn't use the word "defensive" to describe it. The term lacks precision. The above good practice ends up getting mixed up with the bad "defensive" practices of converting contract violations to runtime errors or just ignoring them entirely: the infamous pattern in Java codebases of scrawling the following kind of graffiti all over the clean lines of your codebase:
if (someArgument == null) {
    throw new NullPointerException("someArgument cannot be null");
}
That's just noise. If someArgument can't be null, let the program crash.
Needed file not found? Just return ""; instead.
Negative number where input must be contractually not negative? Clamp to zero.
Program crashing because a method doesn't exist? if not hasattr(self, "blah"): return None
People use the term "defensive" to refer to code like the above. These are programs that "defend" against crashes by misbehaving. They end up being flakier and harder to debug than programs that are "defensive" in that they continually validate their assumptions and crash if they detect a situation that should be impossible.
The term "defensive programming" has been buzzing around social media the past few weeks and it's essential that we be precise that
1) constraint verification (preferably at compile time) is good; and
2) avoidance of crashes at runtime at all costs after an error has occurred is harmful.
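Point 1 is what a match without a `_` arm gives you in Rust. A sketch, with hypothetical variants and prices:

```rust
/// Variants are made up for illustration.
#[derive(Debug, Clone, Copy)]
enum CrustType {
    Thin,
    Thick,
    Stuffed,
}

/// Hypothetical surcharge in cents. There is no `_` arm, so adding a variant
/// to `CrustType` turns this match into a compile error (E0004,
/// non-exhaustive patterns) until someone handles it.
fn surcharge_cents(crust: CrustType) -> u32 {
    match crust {
        CrustType::Thin => 0,
        CrustType::Thick => 100,
        CrustType::Stuffed => 250,
    }
}
```

With a `_ => 0` arm the new variant would compile fine and quietly get the wrong price, which is the runtime-papering-over flavour from point 2.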
Yes. Defensively handle all the failure modes you know how to handle, but nothing else. If you're writing a service daemon and the user passes in a config filename that doesn't exist, crash and say why. Don't try to guess, or offer up a default config, or otherwise try to paper over the idea that the user asked you to do something impossible. Pretty much anything you try other than just crashing is guaranteed to be wrong.
And for the love of Knuth, don't freaking clamp to zero or otherwise convert inputs into semantically different value than specified. (Like, it's fine to load a string representation of a float into an IEEE754 datatype if you're not working with money or other exact values. But don't parse 256 as 255 and call it good enough. It isn't.)
So much end user software tries to be "friendly" by just saying "An error occurred" regardless of what's wrong or whether you can do anything about it. Rust does better and it's a reminder that you can too.
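A sketch of "fail and say why" in Rust terms. The function names and the `max_rules` setting are made up; the point is that neither function invents a fallback value:

```rust
use std::fs;
use std::path::Path;

/// Reads the config file, or explains which path failed and why.
/// No fallback to a default config.
fn load_config(path: &Path) -> Result<String, String> {
    fs::read_to_string(path)
        .map_err(|e| format!("cannot read config {}: {e}", path.display()))
}

/// Parses a hypothetical `max_rules` setting strictly: "256" is an error
/// for a `u8`, not a silent 255, and "-1" is an error, not a silent 0.
fn parse_max_rules(raw: &str) -> Result<u8, String> {
    raw.trim()
        .parse::<u8>()
        .map_err(|e| format!("invalid max_rules {raw:?}: {e}"))
}
```

Whether the caller then exits or surfaces the error some other way is its business. What matters is that the message names the input that was rejected.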
In PHP a pattern I often employ is:
match ($value) {
    static::VALUE_1 => ...,
    static::VALUE_2 => ...,
    default => static::unreachable()
}
Where unreachable is literally just:
static function unreachable() {
    throw new Exception('Unreachable');
}
Now, we don't actually need the default match arm. If we just leave it off entirely, and someone passes in something we can't match, it'll throw a PHP error about unmatched cases.
But what I've found is that if I do that, then other programmers go in later and just add in the case to the match statement so it runs. Which, of course, breaks other stuff down stream, because it's not a valid value we can actually use. Or worse: they add a default match arm that doesn't work! Just so the PHP interpreter doesn't complain.
But with this, now the reader knows "the person who wrote this considered what happens when something bad is passed in, and decided we can't handle it. There's probably a good reason for that". So they don't touch it.
Now, PHP has unique challenges because it's so dynamic. If someone passes in the wrong thing we might end up coercing null to zero and messing up calculations, or we might end up truncating a float or something. Ideally we prevent this with enums, but enums are a pain in the ass to write because of autoloading semantics (I don't want to write a whole new file for just a few cases)
So many rust articles are focused on people doing dark sorcery with "unsafe", and this is just normal every day api design, which is far more practical for most people.
I would have guessed linters would have complained about what's being suggested there. Is there something special about the var: _ thing that avoids it?
> Underscore expressions, denoted with the symbol _, are used to signify a placeholder in a destructuring assignment.
[0]: https://doc.rust-lang.org/reference/expressions/underscore-e...
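To spell out why nothing complains: in a pattern, `field: _` matches the field without binding a variable, so there is nothing for the unused-variable lint to flag. A sketch of the destructuring style under discussion, with simplified stand-in field types:

```rust
use std::time::SystemTime;

// Simplified stand-in for the thread's PizzaOrder; the field types are assumptions.
struct PizzaOrder {
    size: u8,
    toppings: Vec<String>,
    ordered_at: SystemTime,
}

impl PartialEq for PizzaOrder {
    fn eq(&self, other: &Self) -> bool {
        // Every field is named. `ordered_at: _` records that skipping it is
        // deliberate; a newly added field breaks these patterns at compile time.
        let Self { size, toppings, ordered_at: _ } = self;
        let Self { size: other_size, toppings: other_toppings, ordered_at: _ } = other;
        size == other_size && toppings == other_toppings
    }
}
```

Compare that with `Self { size, toppings, .. }`, which also compiles without warnings but keeps compiling when a field is added.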