Error handling in Rust
Some error handling strategies are more equal than others.
Type-erased errors
Box<dyn std::error::Error>[1] certainly has its uses. It's very convenient if
the API consumer[2] genuinely does not care what an error was, only that
there was an error. If the action the API consumer needs to perform in response
to an error is exactly the same regardless of what the error actually
was, then Box<dyn Error> is perfectly fine. It's easy to reach for
Box<dyn Error> because getting the ? operator to work on all error types in
your function for free is very attractive from a convenience standpoint.
However, once you need to do something specific when a specific error occurs,
you should no longer be using Box<dyn Error>.
In order to handle errors inside Box<dyn Error>, you must know the exact
concrete type of the error you intend to handle. With generic code, this can
sometimes be difficult, especially if you're new to Rust. The compiler can
provide almost no useful diagnostics about whether you're trying to downcast to
the right type. This also means you have to keep track of which concrete errors
are inside the Box<dyn Error> yourself so that you don't accidentally try to
handle an error that will never occur there. The loss of (potential for)
exhaustive error handling makes it difficult to have confidence in a program's
robustness: Box<dyn Error> does not encode a function's failure modes in the
type system, which makes it extremely easy to overlook easily-handled errors
and turn them into fatal problems for program functionality.
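To make the downcasting problem concrete, here is a minimal sketch using only the standard library. The names QuotaExceeded, parse_and_check, and describe are hypothetical and exist only for this illustration.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical application error, used only for this illustration.
#[derive(Debug)]
struct QuotaExceeded;

impl fmt::Display for QuotaExceeded {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "quota exceeded")
    }
}

impl Error for QuotaExceeded {}

/// Nothing in this signature says which errors can come out.
fn parse_and_check(input: &str) -> Result<u32, Box<dyn Error>> {
    // `?` boxes the `ParseIntError` for us.
    let n: u32 = input.trim().parse()?;
    if n > 100 {
        return Err(Box::new(QuotaExceeded));
    }
    Ok(n)
}

/// The caller has to guess the concrete types. A wrong guess still compiles;
/// it just falls through to the last branch at runtime.
fn describe(err: &(dyn Error + 'static)) -> &'static str {
    if err.downcast_ref::<ParseIntError>().is_some() {
        "not a number"
    } else if err.downcast_ref::<QuotaExceeded>().is_some() {
        "over quota"
    } else {
        "unknown"
    }
}
```

If parse_and_check later grows a third failure mode, describe keeps compiling and silently reports "unknown" for it, which is the loss of exhaustiveness described above.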
Something else that came to my attention was the question of what to do in the
situation where you're writing a library whose errors are caused by errors
defined in your dependencies/crates you don't control. One school of thought is
to never expose the concrete type of your dependency's error type, instead
preferring to return an enum variant containing simply Box<dyn Error>. If your
users care about the specifics of that downstream error, they can add your
dependency to their dependencies, then downcast to the concrete type in the code
using your library. This way, if you ever make a semver-incompatible upgrade to
that package, you don't break downstream compilations.
I think this is an incredibly bad idea, because while it doesn't break
at compile time, it does break at runtime. The downcast will no longer work,
since types from two versions of the same crate are not the same type. This sort
of behavior seems antithetical to Rust; we have a borrow checker for a reason.
Specifically, if your error handling path was important, and needs to be run
whenever that error occurs, this could be extremely costly in terms of data
corruption or lost capital or just time spent trying to debug why the heck your
code stopped working when you changed nothing (other than running cargo update,
which may even be done automatically by your CI pipeline). (The
alternative I propose is to simply not do this, and instead just expose the
concrete type directly.)
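The runtime breakage can be simulated without any real crates. In this sketch, the modules dep_v1 and dep_v2 are hypothetical stand-ins for two semver-incompatible versions of one dependency; their ParseError types have the same name and shape but are distinct types, just as they would be across crate versions.

```rust
use std::error::Error;

// Stand-in for the old version of a hypothetical dependency.
mod dep_v1 {
    use std::{error::Error, fmt};

    #[derive(Debug)]
    pub struct ParseError;

    impl fmt::Display for ParseError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "parse error")
        }
    }

    impl Error for ParseError {}
}

// Stand-in for the new version: identical source, different type.
mod dep_v2 {
    use std::{error::Error, fmt};

    #[derive(Debug)]
    pub struct ParseError;

    impl fmt::Display for ParseError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "parse error")
        }
    }

    impl Error for ParseError {}
}

/// Library code after upgrading to v2. The public signature did not change.
fn library_call() -> Result<(), Box<dyn Error>> {
    Err(Box::new(dep_v2::ParseError))
}

/// Consumer code still written against v1. It compiles without warnings,
/// but the downcast now always fails, so the recovery path never runs.
fn consumer_recovers() -> bool {
    match library_call() {
        Ok(()) => true,
        Err(e) => e.downcast_ref::<dep_v1::ParseError>().is_some(),
    }
}
```

Had the library exposed the concrete type in its signature instead, the same upgrade would have produced a type mismatch at compile time rather than a silently dead branch.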
Strongly typed errors
The[3] alternative to Box<dyn Error> is to create a custom enum with
a variant for each error type. For example, you might have variants like
Deserialize(serde_json::Error), Http(reqwest::Error), and maybe an "unknown"
variant[4] if exhaustiveness is infeasible. If you're an API consumer, the bar
for "infeasible" is as low as "I don't need to handle this anywhere so I'm not
going to make a variant for it". But, as soon as you do need to handle it, you
need to make a variant for it. If you're not an API consumer, you should aim
to be exhaustive. There are some cases where this is unreasonable, but those
situations are rare and, as such, this exception likely does not apply to you.
With error enums, knowing which errors are possible is now trivial: all you need to do is look at the variants of the enum. The compiler can also give vastly more helpful diagnostics this way, since it can follow the type system around to ensure that you're handling all cases, and that you're not inventing cases that will never actually happen. No longer do you need to rely on possibly-stale manual documentation or read the entire call tree to determine what the failure modes are[5].
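As a rough illustration, here is a small error enum built on standard library error types only. ReadPortError, read_port, parse_port, and describe are hypothetical names chosen for this sketch.

```rust
use std::fs;
use std::io;
use std::num::ParseIntError;

/// Hypothetical error type: one variant per failure mode of `read_port`.
#[derive(Debug)]
enum ReadPortError {
    Io(io::Error),
    Parse(ParseIntError),
}

// The `From` impls are what let `?` convert the inner errors automatically.
impl From<io::Error> for ReadPortError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<ParseIntError> for ReadPortError {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

fn parse_port(text: &str) -> Result<u16, ReadPortError> {
    let port: u16 = text.trim().parse()?;
    Ok(port)
}

fn read_port(path: &str) -> Result<u16, ReadPortError> {
    let text = fs::read_to_string(path)?;
    parse_port(&text)
}

/// No wildcard arm: adding a variant to `ReadPortError` makes this function
/// fail to compile until the new case is handled.
fn describe(err: &ReadPortError) -> &'static str {
    match err {
        ReadPortError::Io(_) => "could not read the port file",
        ReadPortError::Parse(_) => "port file did not contain a valid port",
    }
}
```

The failure modes of read_port are readable straight off the enum definition, and the match in describe is checked for exhaustiveness by the compiler.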
Another advantage of using enums is allowing multiple errors of the same type to
have different semantics. Maybe you need to load two files (std::io::Error),
but you need to do something different based on which file failed to load. With
enums, you can simply create two variants, one for each behavior. With
Box<dyn Error>, this is not possible[6].
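A minimal sketch of the two-files case follows. LoadError, load, and is_recoverable are hypothetical names, and the policy that one failure is recoverable while the other is not is an assumption made up for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Both variants wrap the same inner type but mean different things.
#[derive(Debug)]
enum LoadError {
    /// The user's config could not be read; the caller can use defaults.
    UserConfig(io::Error),
    /// The bundled assets could not be read; nothing sensible can be done.
    Assets(io::Error),
}

fn load(user_config: &Path, assets: &Path) -> Result<(String, String), LoadError> {
    // `map_err` picks the variant explicitly, since a single `From<io::Error>`
    // impl could not tell the two call sites apart.
    let config = fs::read_to_string(user_config).map_err(LoadError::UserConfig)?;
    let assets = fs::read_to_string(assets).map_err(LoadError::Assets)?;
    Ok((config, assets))
}

fn is_recoverable(err: &LoadError) -> bool {
    match err {
        LoadError::UserConfig(_) => true,
        LoadError::Assets(_) => false,
    }
}
```

Because the variant is chosen at the call site, the information about which file failed survives all the way up to whoever handles the error.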
Now, the downside: you must manually implement std::error::Error for your new
custom error enum. This means Display, Debug, and all the From impls so ? is
still ergonomic. Luckily, the thiserror crate provides a procedural macro that
lets you derive Error, Display, and the From impls (Debug still comes from the
standard #[derive(Debug)]). When you use the #[from] attribute to generate From
impls, it even correctly implements std::error::Error::source() for you! This
makes acquiring detailed error messages (e.g. for logging) using nothing but
(ostensibly) the standard library very easy.
Custom error types
There are some rules about creating custom error types that you should follow in order to create the best possible error messages for your users and fellow developers.
- If your error has an inner error, or your error is caused by another error, you must pick exactly one of the following options for each inner error:
  1. Return the inner error when Error::source() is called. With thiserror, this means using the #[from] or #[source] attributes. Generally, reach for #[from] first unless it fails to compile, and in that situation, switch to #[source]. Without a From impl, you can use Result<T, E>::map_err() to convert the inner error into your custom error type.
  2. Include the inner error's message as part of your own error's. With thiserror, this means using {0} in your #[error("...")], assuming the variant is a tuple variant with the inner error stored in the zeroth tuple item.
  This prevents an error message from containing needlessly duplicated information. These options are also listed in the order that you should prefer to do them, the first one being much more common. If you're not sure which to do, just pick number 1.
- The human-readable part of your error message should not include the : character. What I mean by "human-readable part" is that if, for example, your error message happens to include JSON, then don't worry about it. Just don't use : in #[error("...")] strings, basically. The reason for this is that : is commonly used to indicate causality on a single line, for example:
  failed to create user: failed to execute SQL statement: invalid SQL
  All three of these would be a separate concrete error type, each wrapped inside an enum variant of the error that produced the preceding message.
- Display impls for std::error::Error implementors should not start with a capital letter, except for special cases like if it begins with an initialism. For example, this looks inconsistent and gross:
  failed to create user: Failed to execute SQL statement: Invalid SQL
- Don't use sentence-ending punctuation in error messages. Your error may not be the last one in the chain.
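The rules above can be sketched by hand with the standard library alone, which is roughly what the thiserror attributes expand to. CreateUserError and its variants are hypothetical, and the parenthesized format used for the inlined message is just one assumed way to avoid a colon.

```rust
use std::error::Error;
use std::fmt;
use std::io;
use std::num::ParseIntError;

#[derive(Debug)]
enum CreateUserError {
    /// Option 1: the inner error is returned from `source()` and is NOT
    /// repeated in this variant's own message.
    InvalidAge(ParseIntError),
    /// Option 2: the inner error's message is inlined, so `source()`
    /// returns `None` for this variant.
    Storage(io::Error),
}

impl fmt::Display for CreateUserError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Lowercase start, no colon, no trailing punctuation.
        match self {
            Self::InvalidAge(_) => write!(f, "failed to parse age"),
            Self::Storage(inner) => write!(f, "failed to store user ({inner})"),
        }
    }
}

impl Error for CreateUserError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::InvalidAge(inner) => Some(inner),
            Self::Storage(_) => None,
        }
    }
}
```

Each inner error's text shows up exactly once when the chain is printed: either through source() or inside the outer message, never both.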
Displaying error messages
Stick the following code into src/error.rs in your project and add
thiserror = "1" to your dev dependencies:
```rust
use std::{
    error::Error,
    fmt::{self, Display, Formatter},
    iter,
};

/// Wraps any [`Error`][e] type so that [`Display`][d] includes its sources
///
/// # Examples
///
/// If `Foo` has a source of `Bar`, and `Bar` has a source of `Baz`, then the
/// formatted output of `Chain(&Foo)` will look like this:
///
/// ```
/// # use crate::error::Chain; // YOU WILL NEED TO CHANGE THIS
/// # use thiserror::Error;
/// # #[derive(Debug, Error)]
/// # #[error("foo")]
/// # struct Foo(#[from] Bar);
/// # #[derive(Debug, Error)]
/// # #[error("bar")]
/// # struct Bar(#[from] Baz);
/// # #[derive(Debug, Error)]
/// # #[error("baz")]
/// # struct Baz;
/// # fn try_foo() -> Result<(), Foo> { Err(Foo(Bar(Baz))) }
/// match try_foo() {
///     Ok(foo) => {
///         // Do something with foo
///         # drop(foo);
///         # unreachable!()
///     }
///     Err(e) => {
///         assert_eq!(
///             format!("foo error: {}", Chain(&e)),
///             "foo error: foo: bar: baz"
///         );
///     }
/// }
/// ```
///
/// [e]: Error
/// [d]: Display
#[derive(Debug)]
pub(crate) struct Chain<'a>(pub &'a dyn Error);

impl<'a> Display for Chain<'a> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // The outermost error comes first, without a separator.
        write!(f, "{}", self.0)?;

        let mut source = self.0.source();

        // Yield the first source, then keep following `source()` until it
        // returns `None`, writing each one after a ": " separator.
        source
            .into_iter()
            .chain(iter::from_fn(|| {
                source = source.and_then(Error::source);
                source
            }))
            .try_for_each(|source| write!(f, ": {}", source))
    }
}
```
This representation is most useful for logging, but could be easily adapted for other uses as well. Once the result of this tracking issue lands in stable, this should be even more ergonomic, especially if they implement this person's suggestion, which is almost exactly what's written above.
A note about panicking
Panicking is not an error handling strategy; panicking is panicking. You should
only resort to panicking when an illegal state has been reached, or for
convenience when it's not possible to prove with the type system that this state
will never occur. If there is no possible way to recover and further program
execution wouldn't make any sense, is dangerous, or would invoke undefined
behavior, then you can panic. If you have proven out-of-band that this state is
unreachable, then you can panic. This also applies to working with Option,
Result, and other "maybe a value" types: you should only unwrap (aka get the
value you want and panic if it's not there) if the alternative is an illegal
state. I could probably fill another blog post about specific examples of when
to or when not to unwrap maybe types, so for now I'm just going to leave it at
that.
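As one small, hedged illustration of that distinction (the function names per_item and default_batch_size are hypothetical): a value that depends on caller input gets a fallible return type, while a value proven out-of-band to be valid can be unwrapped.

```rust
use std::num::NonZeroU32;

/// `count` comes from the caller and may legitimately be zero, so this is a
/// failure mode to report, not a reason to panic.
fn per_item(total: u32, count: u32) -> Option<u32> {
    total.checked_div(count)
}

/// The literal 64 is known to be non-zero by inspection, but the type system
/// cannot see that here. A `None` would mean this function itself is wrong,
/// so panicking is appropriate.
fn default_batch_size() -> NonZeroU32 {
    NonZeroU32::new(64).expect("64 is non-zero")
}
```

The expect message states the invariant being relied on, so if it is ever violated the panic points directly at the broken assumption.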
Further reading
- Rust API Guidelines' C-GOOD-ERR
- The docs for std::error::Error
- Guidelines for implementing Display and Error::source for library errors
Footnotes
[1] Henceforth Box<dyn Error>. I'm eliding extra constraints like + Send + Sync + 'static since they're not strictly relevant. Also, you can/should substitute in its other analogs such as the anyhow crate.
[2] The "API consumer" is the person who will need to handle the error. If you are writing an application, you are an API consumer. If you are writing a library (at least, the public interface), you are not an API consumer. If your codebase is large and you work with other people, you might also want to consider the people you're working with as API consumers when writing new fallible code.
[3] An? I genuinely can't think of a third option.
[4] If you're an API consumer, using an Unknown(Box<dyn Error>) variant straight up is fine as long as you've taken the rest of this blog post into consideration. If you're not an API consumer, this is somewhat more complicated. If your code is the one creating the error and an API consumer is expected to handle it, you may need either Unknown, Unknown(Box<dyn Error>), or some other such variant with an opaque inner type. If you're defining the contract for a function (say, with a trait, or something to do with closures), you should use either T directly (no enum) or an Other(T) variant (where T is a generic parameter) so that your API consumer can decide what to do, instead of having their hand be forced.
[5] Like you would with a language with an explicit concept of exceptions that either don't require type annotations or cannot be type-annotated (*cough* Python), or with poorly written Rust.
[6] Unless you abuse the newtype pattern. But isn't the whole point of using Box<dyn Error> to not care about errors? Having to create a new type sounds like you care about errors. Also, as a new Rust user, good luck figuring out why you need to downcast to two different types that aren't the type you want to get the type you want.