~ August 3, 2025
wow, big rust buzz words
Here’s a little design problem for you to ponder about:
I need a vector where I can push new elements, and keep track of the index at which the newly pushed element lives. Later on this index may be used to access the element that was just pushed.
We aim for the design to both absolutely memory-safe while being as performant as a naive implementation. Have you thought of a good design? Don’t worry if you haven’t, here’s that naive implementation to get the ball rolling:
struct VecWrapper<T> {
inner: Vec<T>
}
impl<T> VecWrapper<T> {
fn new() -> Self {
VecWrapper { inner: Vec::new() }
}
fn push(&mut self, x: T) -> usize {
self.inner.push(x);
self.inner.len() - 1
}
fn get(&self, index: usize) -> Option<&T> {
self.inner.get(index)
}
}
let mut v = VecWrapper::new();
let i1 = v.push(123);
println!("{}", v.get(i1).unwrap()); // 123
A slight inconvenience is the get
method returning an Option<T>
. This is because Vec::get
returns an Option
in case the index is out-of-bounds. But semantically, we should only access elements using indexes produced by the push
method. These indexes are guaranteed to be in range (there is no functionality to pop elements), so it doesn’t make sense to ever return a None
. Not to mention the (small but non-zero) extra bounds checking computation we’ll have to perform on every get
, this can add up over time.
Furthermore, the user will have to call unwrap
after every get
, which I believe is quite unecessary. So why not have this functionality be built into the interface of VecWrapper
.
impl<T> VecWrapper<T> {
// ...
fn get(&self, index: usize) -> &T {
// unwrap is safe because indexes are generated from `push`
self.inner.get(index).unwrap()
}
// ...
}
However, this code is no longer safe. Since get
takes any usize
, there’s no stopping someone from sending an arbitrary index to this method – an index that is possibly out of bounds and will cause the code to panic:
let mut v = VecWrapper::new();
let i1 = v.push(123);
println!("{}", v.get(99999)); // panic on unwrap
A solution to this would be to somehow tie the indexes to the VecWrapper
type – only indexes generated by a VecWrapper
are allowed to be used in indexing.
Let’s define a new type, SafeIndex
, within the same module as VecWrapper
. Pushing to a VecWrapper
will now return a SafeIndex
, and accessing a VecWrapper
will only be possible through a SafeIndex
.
#[derive(Clone, Copy)]
struct SafeIndex {
index: usize
}
We purposefully make the index
field private, so that this struct can not be constructed anywhere outside of the module. That is, we abstract away the underlying implementation of this type – from the user’s point of view, this type should be nothing more than something used to index a VecWrapper
. Here’s the updated VecWrapper
implementation:
impl<T> VecWrapper<T> {
// ...
fn push(&mut self, x: T) -> SafeIndex {
self.inner.push(x);
SafeIndex {
index: self.inner.len() - 1
}
}
fn get(&self, index: SafeIndex) -> &T {
// unwrap is safe because indexes are generated from `push`.
self.inner.get(index.index).unwrap()
}
// ...
}
(A smart you may point out that we can use get_unchecked
instead of get
then unwrap
. I want to avoid using unsafe blocks until the very end when we have a much better solution.)
The SafeIndex
type is associated to the type of VecWrapper
, not a specific instance of one. That means we can do something like the following to break it:
let mut v = VecWrapper::new();
let i1 = v.push(123);
let other_v = VecWrapper::new(); // length 0
println!("{}", other_v.get(i1)); // panic on unwrap
We need to find a way to tie SafeIndex
to a particular instance of VecWrapper
. For example, in the above snippet, we need to restrict i1
to only be able to access v
and not any other instance of VecWrapper
like other_v
. An easy way to distinguish the instances is to attach to each out an ID, and generate SafeIndex
s that also store the ID. During accessing, we will assert that the VecWrapper
and index’s IDs match.
struct SafeIndex {
index: usize,
id: u64
}
struct VecWrapper<T> {
inner: Vec<T>,
id: u64
}
impl<T> VecWrapper<T> {
fn new() -> Self {
VecWrapper {
inner: Vec::new(),
id: generate_unique_id()
}
}
fn push(&mut self, x: T) -> SafeIndex {
self.inner.push(x);
SafeIndex {
index: self.inner.len() - 1,
id: self.id
}
}
fn get(&self, index: SafeIndex) -> &T {
assert_eq!(self.id, index.id);
// unwrap is safe because indexes are generated from `push`.
self.inner.get(index.index).unwrap()
}
}
let mut v1 = VecWrapper::new();
let mut v2 = VecWrapper::new();
let i1 = v1.push(123);
let i2 = v1.push(456);
println!("{}", v1.get(i1)); // 123
println!("{}", v2.get(i2)); // 456
println!("{}", v1.get(i2)); // assert panic
Hmm… this is still not perfect. There are two problems with the above approach:
get
call.Wouldn’t it be nice if we could solve both these problems at once, to get a solution that defers instance checking to compile-time? We will get proven safe code while cutting down on performance and memory overhead!
Let’s keep expanding on the previous approach that ties instances to IDs. But ideally we could offload the work of ID checking to the compiler. While the compiler can’t do this directly, something that it can do that is awfully similar to ID checking is type-checking. So what if we tie IDs to types, therefore different VecWrapper
s with different IDs would also have different types, and make use of the type-checker. We can achieve this by treating the ID as a const generic:
struct SafeIndex<const ID: u64> {
index: usize
}
struct VecWrapper<const ID: u64, T> {
inner: Vec<T>
}
impl<const ID: u64, T> VecWrapper<ID, T> {
fn new() -> Self {
VecWrapper { inner: Vec::new() }
}
fn push(&mut self, x: T) -> SafeIndex<ID> {
self.inner.push(x);
SafeIndex {
index: self.inner.len() - 1
}
}
fn get(&self, index: SafeIndex<ID>) -> &T {
self.inner.get(index.index).unwrap()
}
}
let mut v1 = VecWrapper::<1, _>::new();
let mut v2 = VecWrapper::<2, _>::new();
let i1 = v1.push(123); // SafeIndex<1>
let i2 = v2.push(456); // SafeIndex<2>
println!("{}", v1.get(i1)); // 123
println!("{}", v2.get(i2)); // 456
println!("{}", v1.get(i2)); // error!
As expected, here’s the compiler error from the last line:
| println!("{}", v1.get(i2)); //
| --- ^^ expected `1`, found `2`
| |
| arguments to this method are incorrect
|
= note: expected struct `SafeIndex<1>`
found struct `SafeIndex<2>`
If we trust that the user of VecWrapper
will diligently increment the ID used for each new instance, this code works just fine! But we can’t trust human developers to keep track of small things like this, it’s just too easy to forget what ID we need next, or if an ID has previously been used. Not to mention it’s probably annoying to have to manually type out the ID for every new VecWrapper
that is made.
So what if we somehow automatically generated a unique ID for each new instance of VecWrapper
? This could be as simple as keeping a ID counter, and incrementing it on every new instance. Somehow, this counter would also have to be evaulated entirely at compile-time, because the ID
generic must take a const value. Also, this const number would be zero-cost, and not actually exist in the compiled program. I would like to think that this is impossible given the current state of the Rust compiler… that or I’m just too lazy to figure out a way. Nonetheless, we can mimic this behaviour by creating an entirely new type (instead of number) to use as the ID for each new VecWrapper
. Wrapping this type in a PhantomData
makes it guaranteed zero-cost. Here’s what I’m thinking:
use std::marker::PhantomData;
struct SafeIndex<IDType> {
index: usize,
_id_phantom: PhantomData<IDType>
}
struct VecWrapper<IDType, T> {
inner: Vec<T>,
_id_phantom: PhantomData<IDType>
}
impl<IDType, T> VecWrapper<IDType, T> {
fn new() -> Self {
VecWrapper {
inner: Vec::new(),
_id_phantom: PhantomData
}
}
fn push(&mut self, x: T) -> SafeIndex<IDType> {
self.inner.push(x);
SafeIndex {
index: self.inner.len() - 1,
_id_phantom: PhantomData
}
}
fn get(&self, index: SafeIndex<IDType>) -> &T {
self.inner.get(index.index).unwrap()
}
}
struct ID1;
let mut v1 = VecWrapper::<ID1, _>::new();
struct ID2;
let mut v2 = VecWrapper::<ID2, _>::new();
let i1 = v1.push(123);
let i2 = v2.push(456);
println!("{}", v1.get(i1)); // 123
println!("{}", v2.get(i2)); // 456
println!("{}", v1.get(i2)); // error!
We would get a similar compiler error when trying to compile the last line. So we’ve solved the problem of having to use const numbers for IDs, but the problems of manually having to create new IDs and using them still remains. Not to fret, the reason why we switched to using types for IDs is that it is much easier to create unique types than numbers… using macros!
// ...
macro_rules! new_vec_wrapper {
() => {{
struct ID; // unique per expansion
VecWrapper::<ID, _>::new()
}};
}
let mut v1 = new_vec_wrapper!();
let mut v2 = new_vec_wrapper!();
let i1 = v1.push(123);
let i2 = v2.push(456);
println!("{}", v1.get(i1)); // 123
println!("{}", v2.get(i2)); // 456
println!("{}", v1.get(i2)); // error!
And to prove to you this indeed works as advertised, here’s the compiler error when trying to compile the last line:
| println!("{}", v1.get(i2)); // error!
| --- ^^ expected `main::ID`, found a different `main::ID`
| |
| arguments to this method are incorrect
|
= note: expected struct `SafeIndex<main::ID>`
found struct `SafeIndex<main::ID>`
Yes, I know this error message is quite abominal for debugging. But hey, at least the code is super safe, and super performant! And since we know that there is absolutely no way we would ever get an out-of-bounds panic with this code, we can squeeze out a bit more performance by changing get
to be unsafe:
impl<IDType, T> VecWrapper<IDType, T> {
// ...
fn get(&self, index: SafeIndex<IDType>) -> &T {
// SAFETY: index can only be generated by the current
// instance of `VecWrapper`, and is guaranteed to
// point to a valid index
unsafe {
self.inner.get_unchecked(index.index)
}
}
// ...
}
use std::marker::PhantomData;
struct SafeIndex<IDType> {
index: usize,
_id_phantom: PhantomData<IDType>
}
struct VecWrapper<IDType, T> {
inner: Vec<T>,
_id_phantom: PhantomData<IDType>
}
impl<IDType, T> VecWrapper<IDType, T> {
fn new() -> Self {
VecWrapper {
inner: Vec::new(),
_id_phantom: PhantomData
}
}
fn push(&mut self, x: T) -> SafeIndex<IDType> {
self.inner.push(x);
SafeIndex {
index: self.inner.len() - 1,
_id_phantom: PhantomData
}
}
fn get(&self, index: SafeIndex<IDType>) -> &T {
// SAFETY: index can only be generated by the current
// instance of `VecWrapper`, and is guaranteed to
// point to a valid index
unsafe {
self.inner.get_unchecked(index.index)
}
}
}
macro_rules! new_vec_wrapper {
() => {{
struct ID; // unique per expansion
VecWrapper::<ID, _>::new()
}};
}
That’s it. We’ve achieved zero-cost and compile-time instance checking. We can take a look at the assembly output for the original version (very first code snippet) with our improved version (above). I’ve compiled both with -opt-level=3
with rustc
version 1.88.
We can see that the compiled outputs are identical:
$ diff unchecked.rs safe.rs
$ echo $?
> 0
I think writing about the language is within the natural progression of any Rust developer. I’ve been using it extensively during my internship and have learned to fall in love with it. There are a lot of powerful safety features that the compiler can help you guarantee that would otherwise be a headache in another language. Granted, the VecWrapper
example is quite extreme and there’s probably little point in doing such checks in a real codebase. It’s all just for fun!