Is it bad to have vector in a public interface?
published at 12.07.2015 23:31 by Jens Weller
After I finished my talk at NDC Oslo about encryption in C++, the last question I was asked by an attendee was about having std::vector as an argument in public interfaces, and whether that would be considered bad practice. So, is it good or bad to use std::vector in a public interface?
Let's create a simple interface and see:
void test(std::vector<T> vec);//1
void test(std::vector<T>& vec);//2
void test(const std::vector<T>& vec);//3
So, there are three options worth looking at IMHO: taking a vector by value, by reference and by const reference. You could also take a pointer to a vector as an argument, but this would behave similarly to a reference, except that the caller could pass a null pointer instead of a pointer to a vector. Forwarding references and rvalue references are special use cases that I will ignore in this post. You might want to read up on those; Scott Meyers' Effective Modern C++ has a very good chapter on them.
While I will also look at C++11, the person asking is still living in a C++98 code base. So, first let's see how things used to be before Modern C++ became a standard. Essentially, the question is about passing potentially big objects into interfaces.
Let's look at how the three options behave at run time:
- The first option copies the vector in C++98. In C++11 the caller can also move a vector into the function, which transfers its contents instead of copying them. But remember that std::move only results in an actual move if the argument's type is movable; otherwise it's a copy. std::vector itself can always be moved, as the move hands over the buffer and leaves the actual elements untouched. This version is only good to use if you want to force the copy, e.g. when the function is a sink for the parameter. In any other case, this is the worst option!
- When you take a parameter by reference, the vector is not copied, which yields better performance. The non-const reference hints that the function will actually change the vector. The STL has a similar interface in std::getline with std::string, which can be very efficient by reusing the memory already allocated in the referenced string parameter. So, this design is only good if the primary goal of the function is to make changes to the vector.
- The third and best option: const correctness + reference. It avoids an unnecessary copy, and is IMHO the correct one to choose if the function does not make any changes to the vector.
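To make the difference between the three options visible, here is a minimal C++11 sketch. Tracked, by_value, by_ref and by_const_ref are hypothetical names made up for this illustration; Tracked simply counts how often an element gets copied.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type (not from any library): counts element copies.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked(Tracked&&) noexcept = default;
    Tracked& operator=(Tracked&&) noexcept = default;
};
int Tracked::copies = 0;

// 1: by value. An lvalue argument is copied element by element,
//    an rvalue argument (e.g. std::move(v)) is moved in without element copies.
std::size_t by_value(std::vector<Tracked> vec) { return vec.size(); }

// 2: by non-const reference. No copy; the function changes the caller's vector.
std::size_t by_ref(std::vector<Tracked>& vec) {
    vec.emplace_back();  // appends one default-constructed element
    return vec.size();
}

// 3: by const reference. No copy, and the vector cannot be modified.
std::size_t by_const_ref(const std::vector<Tracked>& vec) { return vec.size(); }
```

Calling by_value(v) with a vector of three elements should raise Tracked::copies by three, while by_const_ref(v), by_ref(v) and by_value(std::move(v)) should leave the counter untouched.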
For more details on passing (and returning), look at the slides of Eric Niebler's keynote "C++11 and No-Compromise Library Design" at Meeting C++ 2013. This talk was recorded at C++Now a year later.
So, is it good?
It's clear that the best option is passing by const reference, or by reference if there is a need to make changes to the vector. At least that is the case if the object passed into a function is potentially big, which applies to vector. So, void print_options(const std::vector<std::string>& options); would be the correct way to pass a vector of strings to print_options. It is important that you avoid copies in interfaces when they are not needed. Taking a copy in a constructor and moving it into a member would be fine in C++11, while in C++98 a const reference would seem more natural for the same interface.
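The constructor case could look roughly like this in C++11. OptionStore is a hypothetical class invented for this sketch; it takes the vector by value and moves it into its member, so the caller decides whether a copy or a move happens.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical class for illustration: the constructor is a sink for the vector.
class OptionStore {
public:
    // Lvalue arguments are copied into 'options', rvalue arguments are moved in;
    // either way the parameter is then moved into the member.
    explicit OptionStore(std::vector<std::string> options)
        : options_(std::move(options)) {}

    // Read access goes through a const reference, so no copy is made here.
    const std::vector<std::string>& options() const { return options_; }

private:
    std::vector<std::string> options_;
};
```

With OptionStore a(opts); the caller keeps an intact opts, while OptionStore b(std::move(opts)); hands the existing buffer over. In C++98 the same constructor would take a const std::vector<std::string>& and copy it into the member.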
Yet, one thing has made me wonder ever since NDC Oslo: while we know how to pass objects like std::vector into interfaces correctly, the STL does not do so very often. The above-mentioned std::getline is an exception, whereas in Qt collections are often passed to interfaces such as methods and functions. The STL prefers to pass iterators rather than containers into functions. The common STL interface for algorithms is a begin and an end iterator, often accompanied by some other parameters. And the STL does so in a generic way.
This also reveals that working with containers is often about doing something with their elements, and not with the container itself. If that is the case, you should think about whether an iterator-based interface is the far better approach. Maybe you don't even need to write this function, because there is already an algorithm in the standard that does this for you. print_options, for example, could be replaced with a call to std::copy using an ostream_iterator.
But the STL's interface leads to a lot of code like algorithmX(vec.begin(), vec.end(), ...);, so it's not perfect. That is why libraries such as boost::range exist: to simplify this interface, especially when the whole container is meant to be passed in. But ranges go beyond this, and it's actually Eric Niebler who is now working on a standard version of ranges. At this year's C++Now he gave a very good keynote about his range library, which is already available.
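The basic idea of passing the whole container can be sketched by hand in C++11. count_in is a hypothetical helper written for this post; it is neither the boost::range API nor part of Eric Niebler's library, it only shows how such a wrapper forwards to the iterator-based algorithm.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical whole-container wrapper around std::count.
// Assumption: the container's difference type converts to std::ptrdiff_t,
// which holds for std::vector and built-in arrays.
template <typename Container, typename T>
std::ptrdiff_t count_in(const Container& c, const T& value) {
    return std::count(std::begin(c), std::end(c), value);
}
```

The caller writes count_in(vec, 42) instead of std::count(vec.begin(), vec.end(), 42). A real range library offers far more than this, for example composing and adapting ranges.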
Yet other libraries, such as wxWidgets or Qt, will often pass containers and objects into interfaces. Qt often uses copy-on-write for its own types, and hence passes most objects by value, as they are just handles to the reference-counted data object hidden by the implementation. Qt is also known to have very well designed interfaces and APIs...
So, in the end, the correct answer seems to be: it depends on which design you prefer.
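To illustrate why passing such a handle by value is cheap, here is a much simplified copy-on-write sketch in C++11. CowStrings is a hypothetical class built on std::shared_ptr; it is not Qt's implementation, and unlike Qt's containers it makes no attempt at thread safety.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical copy-on-write handle: copies share one reference-counted buffer.
class CowStrings {
public:
    CowStrings() : data_(std::make_shared<std::vector<std::string>>()) {}

    std::size_t size() const { return data_->size(); }

    // Writing detaches first, so other handles never see the change.
    void append(const std::string& s) {
        detach();
        data_->push_back(s);
    }

    // True while both handles still point at the same shared data.
    bool shares_with(const CowStrings& other) const { return data_ == other.data_; }

private:
    // Make a private deep copy only if the data is shared with another handle.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<std::string>>(*data_);
        }
    }

    std::shared_ptr<std::vector<std::string>> data_;
};
```

Copying a CowStrings only copies the shared pointer, so taking it by value costs a reference count update. The expensive deep copy is deferred until one of the handles is actually modified.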
Here is my opinion:
- C++ is also a generic language, so a generic interface might be the best option.
- Sometimes a non-generic interface is better, especially in public APIs. Still, such APIs can be built upon generic code.
- If your interface parameter is a sink parameter (i.e. it can be moved into the right place), passing by value (or as a forwarding/rvalue reference) is the correct choice.
- In any other case, passing by const reference should be your default. For containers, an iterator-based (generic) interface offers more flexibility for the caller.
- Eric Niebler's range library shows how a modern, range-based approach in C++11 and beyond could look, and as it's already available, you should take a look at his work.
- Some libraries prefer other interfaces. Qt, for example, prefers to expose non-generic interfaces to the end user, and often uses copy-on-write handle objects to avoid expensive copies.
- Also, using std::vector is often a very good decision; use it whenever you need a "dynamic array".