Providing a stable memory address

published at 19.03.2024 18:18 by Jens Weller

Some APIs allow you to store a pointer to your data element. This is used to access additional information from your types to display them in Model/View Architecture.

A while ago I showed how you can implement a tree with shared_ptr and enable_shared_from_this and then display it in a QTreeView. When working on my current project I knew this problem would come around again. Maybe not for a tree and a tree view, but I'll clearly need some way to have UI panels display and edit my data classes, and to store a stable memory address as a pointer in Qt models. Back in 2015 the Qt5 example still used a pointer allocated with raw new for this, in Qt6 the example uses unique_ptr. Using shared_ptr for this back in 2015 was a good decision, and the code works very well. For the moment I don't see that my current project would need to make use of enable_shared_from_this, so using unique_ptr would be a good option.

Except that this would force unique_ptr or shared_ptr on every allocation that needs to provide a stable pointer. Lots of these classes are also stored in containers, and vector<unique_ptr<...>> is then the obvious solution. But what if a vector of your client type would do? Storing pointers to its elements is fine as long as the vector does not grow and no elements are removed from it. Any future code change that adds or removes elements will break your code, though. And a vector which just stores smart pointers to heap allocations is not what I'd like to refactor my code to either.
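To make the problem concrete, here is a minimal sketch of how the address of the same logical element changes once an earlier element is erased. The Item type and the function name are made up for illustration.

```cpp
#include <vector>

// Hypothetical client type, stored by value in a vector.
struct Item { int id = 0; };

// Returns true if the element with id 3 lives at a different address
// after the first element has been erased.
bool addressChangesAfterErase()
{
    std::vector<Item> items{{1}, {2}, {3}};
    const Item* before = &items.back();  // address a model or view might have stored
    items.erase(items.begin());          // later elements are shifted one slot down
    const Item* after = &items.back();   // same logical element, id == 3
    return after->id == 3 && before != after;
}
```

Any pointer stored before the erase now refers to a slot that no longer holds that element, which is exactly what a stable address has to avoid.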

So when I started a new C++ project recently, I tried to find a way to provide a stable memory address of my types without having them live inside a shared or unique_ptr.

This got me thinking, could there be a better way?

My design goal is simple: keep my type within its normal storage, and not allocate it with a smart pointer. Just because some external code needs a stable memory address is not a good reason for me to refactor my code to use smart pointers. The other clients of my code should not be affected by this requirement the way they would be if the type itself were heap allocated. So I'd like to keep my own types out of the allocation for the stable memory address. My types will still need a stable memory address and the interface for it, which is the only disadvantage of my way to solve this: it's a bit intrusive.

The idea for this design came when, in a different place of my code, I changed the way I implement a search result. Instead of copying the result by using vector, I decided to simply use vector<reference_wrapper<Record>> to store the result. I can do this because the Record type is loaded to memory and at the moment will not change for the lifetime of the search. Prior to that I had rarely used std::reference_wrapper. But when thinking about the mechanics this class allows, it suddenly came to mind: what if std::reference_wrapper was the stable address? Of course by itself reference_wrapper is not stable, but when you allocate it on the heap it suddenly is. A class that gets moved in a vector because one of the prior elements got removed could then just update the reference_wrapper with its new memory location. Did I mention this is a bit intrusive?
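A search result that stores references instead of copies could look roughly like the following sketch. The Record layout and the findByValue function are assumptions for illustration, as the real Record type is not shown in this post.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type; the real Record is not shown in this post.
struct Record { std::string name; int value = 0; };

// Collects references to all matching records instead of copying them.
// The result is only valid while `records` is neither modified nor destroyed.
std::vector<std::reference_wrapper<const Record>>
findByValue(const std::vector<Record>& records, int value)
{
    std::vector<std::reference_wrapper<const Record>> result;
    for (const Record& r : records)
        if (r.value == value)
            result.push_back(std::cref(r));
    return result;
}
```

Each entry in the result is a thin wrapper around the address of an existing Record, so building the result never copies a Record.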

Stable Memory Address implementation

While one can simply add a unique_ptr to the corresponding type T, you'd be forced to do this again and again. So a little helper class that provides the stable memory address as a service is needed: template<class T> StableMemoryAddress:

template< class T>
class StableMemoryAddress
{
  std::unique_ptr< std::reference_wrapper< T>> ref_ptr;
  friend T;
  void setReference(std::reference_wrapper< T>& ref){*ref_ptr = ref;}
  void setReference(T& t){*ref_ptr = t;}
public:
  using ptrType = std::reference_wrapper< T>;
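  // make_unique cannot deduce the type to allocate, so it is named explicitly
  StableMemoryAddress(T& t):ref_ptr(std::make_unique< std::reference_wrapper< T>>(t)){}
  StableMemoryAddress(std::reference_wrapper< T>& ref):ref_ptr(std::make_unique< std::reference_wrapper< T>>(ref)){}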
  StableMemoryAddress(StableMemoryAddress&& sma):ref_ptr(std::move(sma.ref_ptr)){}
  StableMemoryAddress& operator=(StableMemoryAddress&& sma)
  {
    ref_ptr=std::move(sma.ref_ptr);
    return *this;
  }
  ptrType* getReferencePtr(){return ref_ptr.get();}
  const ptrType* getReferencePtr()const{return ref_ptr.get();}
};

For some time this was a proof of concept implementation written a few weeks ago, and this weekend I've started to take the next step: write a test for it. Inside the test there is a first class which implements the code needed to utilize the functionality of an externally stored reference_wrapper providing a stable memory address:

struct RSMATest
{
  int i = 0;
  RegisteredStableMemoryAddress<RSMATest> stable_address;
  RSMATest():stable_address(*this){}
  RSMATest(RSMATest&& m):i(m.i+1),stable_address(std::move(m.stable_address))
  {
    // the object now lives at a new address, so the wrapper has to follow
    stable_address.setReference(*this);
  }
  RSMATest& operator=(RSMATest&& m)
  {
    i = m.i+1;
    stable_address = std::move(m.stable_address);
    stable_address.setReference(*this);
    return *this;
  }
  RegisteredStableMemoryAddress<RSMATest>& getStableAddress(){return stable_address;}
  int getI()const{return i;}
};

And the test shows that, unfortunately, one has to implement the various operations that move the instance of the class to a new memory position in order to update the reference wrapper. The test then proceeds by creating a vector of (R)SMATest instances. It retrieves a stable address from the last element of the vector, removes the first instance from the vector, and then checks that the pointer retrieved first is equal to one newly retrieved from the last element:

  std::vector<SMATest> vec;
  vec.push_back(SMATest{});
  vec.push_back(SMATest{});
  vec.push_back(SMATest{});
  vec.push_back(SMATest{});
  StableMemoryAddress<SMATest>::ptrType* sma = vec.back().getStableAddress().getReferencePtr();
  vec.erase(vec.begin());
  StableMemoryAddress<SMATest>::ptrType* sma2 = vec.back().getStableAddress().getReferencePtr();
  REQUIRE( sma == sma2);

And the test passes. Great.
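For readers who want to try this without the rest of the project, here is a condensed, self-contained sketch of the same check. The SMATest class below is an assumption modeled on RSMATest, since its original is not shown here, and the StableMemoryAddress in it is trimmed down to the members the check needs.

```cpp
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Trimmed-down variant of StableMemoryAddress for this sketch.
template<class T>
class StableMemoryAddress
{
    std::unique_ptr<std::reference_wrapper<T>> ref_ptr;
    friend T;
    void setReference(T& t) { *ref_ptr = std::ref(t); }
public:
    using ptrType = std::reference_wrapper<T>;
    explicit StableMemoryAddress(T& t) : ref_ptr(std::make_unique<ptrType>(t)) {}
    StableMemoryAddress(StableMemoryAddress&&) = default;
    StableMemoryAddress& operator=(StableMemoryAddress&&) = default;
    ptrType* getReferencePtr() { return ref_ptr.get(); }
};

// Assumed test class: updates the wrapper whenever it is moved.
struct SMATest
{
    int i = 0;
    StableMemoryAddress<SMATest> stable_address;
    SMATest() : stable_address(*this) {}
    SMATest(SMATest&& m) noexcept
        : i(m.i), stable_address(std::move(m.stable_address))
    {
        stable_address.setReference(*this);
    }
    SMATest& operator=(SMATest&& m) noexcept
    {
        i = m.i;
        stable_address = std::move(m.stable_address);
        stable_address.setReference(*this);
        return *this;
    }
    StableMemoryAddress<SMATest>& getStableAddress() { return stable_address; }
};

bool pointerSurvivesErase()
{
    std::vector<SMATest> vec;
    for (int n = 0; n < 4; ++n)
    {
        vec.emplace_back();
        vec.back().i = n;
    }
    auto* before = vec.back().getStableAddress().getReferencePtr();
    vec.erase(vec.begin());
    auto* after = vec.back().getStableAddress().getReferencePtr();
    // Same heap-allocated wrapper, and it refers to the element's current slot.
    return before == after && &before->get() == &vec.back() && before->get().i == 3;
}
```

The wrapper itself never moves, only the reference stored inside it is updated, which is why the pointer handed out before the erase stays usable.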

Did you notice that the implementation class from the test I showed you actually used a different StableMemoryAddress class? That's because StableMemoryAddress by itself has a little problem which the other class tries to solve.

The first implementation allows one to create a stable pointer in memory, and somewhere this pointer is stored. This is the basic use case, and a proof of concept. But it does not allow for checking if that stored pointer is still valid. It might just point to an object that has been removed. For this implementation the code has to provide the guarantee that this pointer stays valid - RegisteredStableMemoryAddress works with a registry that you can query to see if the stored pointer is still valid:

template< class T, class Registry = PointerRegistry< T>>
class RegisteredStableMemoryAddress
{
  std::function< void(std::reference_wrapper< T>*ref)> del = []( std::reference_wrapper< T>*ref)
  {
    Registry::getRegistry().removePointer(ref);
    delete ref;
  };
  std::unique_ptr<std::reference_wrapper< T>,decltype(del)> ref_ptr;
  friend T;
  void setReference(std::reference_wrapper< T>& ref){*ref_ptr = ref;}
  void setReference(T& t){*ref_ptr = t;}
  void registerPtr(){ Registry::getRegistry().addPointer(ref_ptr.get()); }
public:
  using ptrType = std::reference_wrapper< T>;
  RegisteredStableMemoryAddress(T& t):ref_ptr(std::unique_ptr<std::reference_wrapper< T>,decltype(del)>(new std::reference_wrapper< T>(t),del)){registerPtr();}
  RegisteredStableMemoryAddress(std::reference_wrapper< T>& ref):ref_ptr(std::unique_ptr<std::reference_wrapper< T>,decltype(del)>(new std::reference_wrapper< T>(ref),del)){registerPtr();}
  RegisteredStableMemoryAddress(RegisteredStableMemoryAddress&& sma):ref_ptr(std::move(sma.ref_ptr)){}
  RegisteredStableMemoryAddress& operator=(RegisteredStableMemoryAddress&& sma)
  {
    ref_ptr=std::move(sma.ref_ptr);
    return *this;
  }
  ptrType* getReferencePtr(){return ref_ptr.get();}
  const ptrType* getReferencePtr()const{return ref_ptr.get();}
};

This class started out as a copy of StableMemoryAddress, but then goes along and implements a custom deleter for the unique_ptr, which removes the pointer from the registry. And the constructor calls a member function to add the pointer to said registry.
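The interplay of registry and custom deleter can be tried in isolation with a sketch like the one below. SimpleRegistry, UnregisteringDeleter and makeRegistered are hypothetical names, and the registry is only a guess at what a minimal version might look like, since PointerRegistry itself is not shown in this post.

```cpp
#include <memory>
#include <unordered_set>

// Hypothetical minimal registry; the PointerRegistry used above is not shown.
class SimpleRegistry
{
    std::unordered_set<const void*> pointers;
    SimpleRegistry() = default;
public:
    static SimpleRegistry& getRegistry()
    {
        static SimpleRegistry registry; // created on first use
        return registry;
    }
    void addPointer(const void* p) { pointers.insert(p); }
    void removePointer(const void* p) { pointers.erase(p); }
    bool checkPointer(const void* p) const { return pointers.count(p) != 0; }
};

// Deleter that unregisters the pointer before freeing it.
struct UnregisteringDeleter
{
    void operator()(int* p) const
    {
        SimpleRegistry::getRegistry().removePointer(p);
        delete p;
    }
};

using RegisteredInt = std::unique_ptr<int, UnregisteringDeleter>;

// Allocates the value and registers its address in one step.
RegisteredInt makeRegistered(int value)
{
    RegisteredInt ptr(new int(value));
    SimpleRegistry::getRegistry().addPointer(ptr.get());
    return ptr;
}
```

This sketch uses a stateless function object as the deleter instead of std::function. That is a swapped-in choice, not what the class above does, and it keeps the unique_ptr as small as a raw pointer.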

This registry is a template parameter, because so far there are two different implementations for it: a simple registry, which lets one check if a pointer is valid, and one which also allows for registering a callback that is executed when the pointer becomes invalid:

template<class T>
class ObserverPointerRegistry
{
public:
  using ptr = std::reference_wrapper<T>*;
private:
  std::unordered_set< ptr> refset;
  std::unordered_map< ptr,std::unordered_map<size_t,std::function<void(ptr)>>> obs_map;
  size_t counter = 1;
  ObserverPointerRegistry(){}
public:
  static ObserverPointerRegistry& getRegistry()
  {
    static ObserverPointerRegistry registry;
    return registry;
  }
  void addPointer(ptr ref){refset.insert(ref);}
  void removePointer(ptr ref)
  {
    // notify all observers of this pointer before it is forgotten
    if(obs_map.contains(ref))
    {
      for(auto& p:obs_map[ref])
        p.second(ref);
    }
    obs_map.erase(ref);
    refset.erase(ref);
  }
  // the set stores non-const pointers, so the const is cast away for the lookup only
  bool checkPointer(const std::reference_wrapper< T>*ref)const{return refset.contains(const_cast<ptr>(ref));}
  [[nodiscard]] size_t addObserver(ptr p,std::function< void(ptr)> obs)
  {
    // operator[] creates the inner map on first use, and the id used is the id returned
    const size_t index = counter++;
    obs_map[p].insert({index,std::move(obs)});
    return index;
  }
  void removeObserver(ptr p,size_t index)
  {
    if(obs_map.contains(p))
    {
      auto& map = obs_map[p];
      map.erase(index);
    }
  }
  size_t size()const {return refset.size();}
};

In order to construct a registry I've opted for a Meyers singleton: a static function that returns a reference to a function-local static instance of its class. That makes creating and accessing the registry easy, but at the cost of a singleton. While registering an observer itself is easy, removing it from a vector<function<...>> would not be possible, as std::function does not have an op==. So in this case I've decided to use a numeric id for each callback.
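The id-based bookkeeping can be shown on its own. The following is a minimal sketch with a hypothetical CallbackList class, not the registry from above: each callback gets a number when it is added, and that number is later the key for removing it.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-alone observer list keyed by a numeric id.
class CallbackList
{
    std::unordered_map<std::size_t, std::function<void()>> callbacks;
    std::size_t next_id = 1;
public:
    // Stores the callback and returns the id needed to remove it again.
    [[nodiscard]] std::size_t add(std::function<void()> cb)
    {
        const std::size_t id = next_id++;
        callbacks.emplace(id, std::move(cb));
        return id;
    }
    void remove(std::size_t id) { callbacks.erase(id); }
    void notify() const
    {
        for (const auto& entry : callbacks)
            entry.second();
    }
    std::size_t size() const { return callbacks.size(); }
};
```

Since the id is the only handle to a callback, the caller has to keep it around, which is why marking the function [[nodiscard]] makes sense.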

With this the basic feature set is done. All code so far covered was written before starting this blog post on the weekend, though writing and reflecting on the implementation tests gave some new ideas.

One of those ideas came from thinking about how shared_ptr has a weak_ptr type that allows for checking and retrieving a shared_ptr. Right now the interface for the client is minimalistic: you can get your pointer and then have to check it yourself. You can register a callback that lets you react to the destruction of the instance. This is also the correct behavior at the moment, as I'll need the raw pointer for Qt. But for other clients it would make sense to offer a wrapper that allows you to access and check the contained pointer, while the wrapper itself handles the callback when the pointer becomes invalid and sets it to nullptr:

  using ptr = std::reference_wrapper<T>*;
  class PointerWrapper
  {
    ptr pointer = nullptr;
    size_t cb_id=0;
    ObserverPointerRegistry& opr;
  public:
    PointerWrapper(ptr p,ObserverPointerRegistry& opreg):pointer(p),opr(opreg){cb_id = opreg.addObserver(p,[this](ptr){ pointer = nullptr;});}
    ~PointerWrapper(){opr.removeObserver(pointer,cb_id);}
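    // the registered callback captures this, so a moved-to wrapper registers its own observer
    PointerWrapper(PointerWrapper&& other):pointer(other.pointer),opr(other.opr)
    {
      if(pointer) cb_id = opr.addObserver(pointer,[this](ptr){ pointer = nullptr;});
    }
    PointerWrapper& operator=(PointerWrapper&&)=delete; // opr is a reference and cannot be rebound
    PointerWrapper(const PointerWrapper&)=delete;
    PointerWrapper& operator=(const PointerWrapper&)=delete;
    ptr get()const{return pointer;}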
    bool has_pointer()const{return pointer != nullptr;}
  };

This class is an inner class of ObserverPointerRegistry, and only this registry provides it. In RegisteredStableMemoryAddress the registry is the template parameter named Registry, so this would be the right place to offer a method that retrieves a PointerWrapper, but only if Registry is ObserverPointerRegistry. With concepts we can implement conditional member functions, so the method is only available when the class is instantiated with the right type:

  [[nodiscard]] typename ObserverPointerRegistry< T>::PointerWrapper getWrappedPointer()const
    requires std::is_same_v< Registry,ObserverPointerRegistry< T>>
  {
    return typename ObserverPointerRegistry< T>::PointerWrapper(ref_ptr.get(),ObserverPointerRegistry< T>::getRegistry());
  }

Making it less intrusive

At this point all tests are passing, and the features I'll need are implemented. But again, I'm wondering if there is a better way for this class to track the object it is part of, ideally one that gets rid of the need to implement move constructors and move assignment operators in the client classes. While it's not so difficult to write these, it's an overhead I'd like to avoid when doing the actual refactoring. With the current solution, moving all other members into a member struct could make the machinery for moving easier to implement. But the StableMemoryAddress classes already implement a move constructor and op=, and as such a class is a member of its client class, it already has a means of knowing when the object it provides the address for has moved.

This means that a pointer to the object the class is a member of, together with its own this pointer, should be enough to calculate the offset between the two at construction. When moved, the class simply has to apply that offset to its own this pointer to get the new location of the object.

To play around with this idea I've written a little class called snitch, which is a proof of concept for this simple idea.

#include <cassert>
#include <cstddef>
#include <iostream>

template< class T>
class snitch
{
  //these are only needed if a T* needs to be stored (e.g. the stable address)
  //and the offset when this class is not the first member of the client struct.
  T* outside_this;
  std::ptrdiff_t offset;
public:
  snitch(T* t):outside_this(t)
  {
    char* ct = (char*)t;
    offset = ct - (char*)this;
    assert(offset == 0);
    std::cout << "offset to type: "<< offset << "\n";
  }
  snitch(snitch&& m):offset(m.offset)
  {
    //this snitch now lives in a new client object, which starts at this (offset is 0)
    outside_this = reinterpret_cast<T*>(this);
    std::cout << "x of outside: "<< outside_this->x << "\n";
  }
  snitch& operator=(snitch&& )
  {
    //the target of an assignment does not move, so its client still starts at this
    outside_this = reinterpret_cast<T*>(this);
    std::cout << "x of outside: "<< outside_this->x << "\n";
    return *this;
  }
};

This experiment lives on godbolt, so the test class this time has a member x which the snitch accesses to show it's actually working. While the member of a StableMemoryAddress class could be (or not?) at any point in the class, at the moment the assert enforces that it is the first. For the moment, I'm not sure if this needs to be the first member.
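A variant of this experiment that can be checked with a vector might look like the sketch below. OwnerTracker and Widget are hypothetical names, the tracker is assumed to be the first member of a standard-layout class, and the open question of whether the cast needs std::launder is not addressed here.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

template<class T>
class OwnerTracker
{
    T* owner = nullptr;
    std::ptrdiff_t offset = 0; // bytes from this member to the start of its owner

    T* ownerFromThis()
    {
        return reinterpret_cast<T*>(reinterpret_cast<char*>(this) + offset);
    }
public:
    explicit OwnerTracker(T* t)
        : owner(t),
          offset(reinterpret_cast<char*>(t) - reinterpret_cast<char*>(this))
    {}
    // Move construction: this member now lives inside a different owner object.
    OwnerTracker(OwnerTracker&& other) noexcept : offset(other.offset)
    {
        owner = ownerFromThis();
    }
    // Move assignment: the destination keeps its address, so nothing changes.
    OwnerTracker& operator=(OwnerTracker&&) noexcept { return *this; }
    T* get() const { return owner; }
};

struct Widget
{
    OwnerTracker<Widget> tracker; // first member, so the offset is 0
    int x = 0;
    explicit Widget(int value) : tracker(this), x(value) {}
    // no hand-written move operations needed here
};
static_assert(std::is_standard_layout_v<Widget>, "layout assumption of this sketch");
static_assert(offsetof(Widget, tracker) == 0, "tracker must be the first member");

bool trackersFollowTheirOwners()
{
    std::vector<Widget> widgets;
    for (int n = 0; n < 8; ++n)
        widgets.emplace_back(n);     // growth relocates elements via move construction
    widgets.erase(widgets.begin());  // erase shifts elements via move assignment
    for (Widget& w : widgets)
        if (w.tracker.get() != &w)
            return false;
    return widgets.front().x == 1;
}
```

The point of the sketch is that Widget relies on its implicitly generated move operations, and the tracker alone keeps the owner pointer up to date.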

Feedback on Twitter by Yurko Prokopets brought up whether the reinterpret_cast should use std::launder, with mixed opinions on this currently. Their solution also makes use of std::is_pointer_interconvertible_with_class, which interestingly only passes if the snitch is the first member. Though this code would have to be on the client side I think, having a static_assert check if snitch is a member of T would be nice. At the moment I've opted to test for this in the client's constructor. One can define a trait to check for an existing member variable in a type, so providing such a trait and a macro for creating a member with that name could be a way to check for correctness at compile time. Though this is for the moment something I need to dig into a bit more.

Right now the basic functionality of snitch has been copied into the RegisteredStableMemoryAddress class. The next step would be to refactor my actual client classes, but this can wait until I have a library running with Qt code that actually needs this feature and can provide further input with tests.

Another thing that could be added is a thread safe version with a shared_mutex. As this code right now mostly uses Qt as its client, I think there is no concurrent access which would need this feature. But checking PointerWrapper::has_pointer and then accessing the pointer would need to be protected by a mutex inside the StableMemoryAddress class. A client would then be able to acquire a lock on the pointer.
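One possible shape for such a thread safe access is sketched below. GuardedPointer is a hypothetical class and not part of the code shown in this post. It assumes that the owner calls invalidate() before the pointee is destroyed.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical guarded pointer; not part of the classes shown in this post.
template<class T>
class GuardedPointer
{
    mutable std::shared_mutex mutex;
    T* pointer = nullptr;
public:
    explicit GuardedPointer(T* p) : pointer(p) {}

    // Called by the owner before the pointee goes away.
    void invalidate()
    {
        std::unique_lock<std::shared_mutex> lock(mutex);
        pointer = nullptr;
    }

    // Checks and uses the pointer under one shared lock, so it cannot be
    // invalidated between the check and the access.
    template<class F>
    bool withPointer(F&& f) const
    {
        std::shared_lock<std::shared_mutex> lock(mutex);
        if (pointer == nullptr)
            return false;
        f(*pointer);
        return true;
    }
};
```

Passing a callable instead of handing out the raw pointer keeps the check and the access inside the same lock. Several readers can hold the shared lock at once, while invalidate() waits until all of them are done.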
