Experimenting with type erasure/external polymorphism in C++17

Speaker: Olivia Quinet

Audience level: [ Beginner ]

This talk showcases a concrete example, a SAS7BDAT file reader, built with type erasure, also known as the external polymorphism design pattern. The work was originally inspired by Klaus Iglberger's conference talk "Breaking Dependencies: Type Erasure - A Design Analysis".

  • Type erasure makes it possible to use various concrete types through a single generic interface. The type erasure pattern in C++ is the equivalent of duck typing in languages like Python.
  • A SAS7BDAT file is a database storage file created by the Statistical Analysis System (SAS) software. It contains a binary-encoded dataset used for advanced analytics, business intelligence, data management, predictive analytics, and more.
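
As a rough illustration of the pattern itself (not the code presented in the talk), here is a minimal C++17 sketch. All names (Circle, Square, Shape, describe, describe_all) are hypothetical and chosen only for this example. Two unrelated types with no common base class are stored behind one value type, and the behaviour is supplied by free functions, which is the "external" part of external polymorphism.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example types: unrelated, with no common base class.
struct Circle { double radius; };
struct Square { double side; };

// The behaviour lives outside the types, as free functions.
inline std::string describe(const Circle&) { return "circle"; }
inline std::string describe(const Square&) { return "square"; }

// Type-erased wrapper: holds any copyable T for which describe(const T&) exists.
class Shape {
public:
    template <typename T>
    Shape(T value) : self_(std::make_unique<Model<T>>(std::move(value))) {}

    // Deep copy through the virtual clone() of the hidden model.
    Shape(const Shape& other) : self_(other.self_->clone()) {}
    Shape& operator=(const Shape& other) { self_ = other.self_->clone(); return *this; }
    Shape(Shape&&) noexcept = default;
    Shape& operator=(Shape&&) noexcept = default;

    // Single generic interface, found through argument-dependent lookup.
    friend std::string describe(const Shape& s) { return s.self_->describe_(); }

private:
    // Internal interface; user types never see or inherit from it.
    struct Concept {
        virtual ~Concept() = default;
        virtual std::string describe_() const = 0;
        virtual std::unique_ptr<Concept> clone() const = 0;
    };

    // One Model<T> is instantiated per concrete type and forwards to the free function.
    template <typename T>
    struct Model final : Concept {
        explicit Model(T v) : value(std::move(v)) {}
        std::string describe_() const override { return describe(value); }
        std::unique_ptr<Concept> clone() const override { return std::make_unique<Model>(*this); }
        T value;
    };

    std::unique_ptr<Concept> self_;
};

// Usage: one container holds values of different concrete types.
inline std::string describe_all(const std::vector<Shape>& shapes) {
    std::string out;
    for (const auto& s : shapes) {
        out += describe(s);
        out += ' ';
    }
    return out;
}
```

Unlike duck typing in Python, a missing describe overload for a wrapped type is reported at compile time, when the corresponding Model<T> is instantiated.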

The type erasure pattern is applied here at different levels:

  • The input data stream. Any input stream can be specified: file, memory stream, async, ...
  • The data selection. Any rule for filtering the columns can be specified.
  • The resulting output data stream. Any schema can be specified: from an in-memory representation to conversion into another columnar format.
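
The three levels above could be sketched roughly as follows. This is a toy outline under stated assumptions, not the reader's actual API: DataSource, MemorySource, ColumnFilter, ColumnSink, read_all and emit_selected are hypothetical names, and no SAS7BDAT parsing is performed. The input is erased with a hand-written wrapper, while the filter and the output use std::function, the standard library's own type-erased callable wrapper.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Level 1: type-erased input. Wraps anything that provides
// std::size_t read(char*, std::size_t). Hypothetical interface.
class DataSource {
public:
    template <typename T>
    DataSource(T src) : self_(std::make_unique<Model<T>>(std::move(src))) {}

    std::size_t read(char* buf, std::size_t n) { return self_->read(buf, n); }

private:
    struct Concept {
        virtual ~Concept() = default;
        virtual std::size_t read(char*, std::size_t) = 0;
    };
    template <typename T>
    struct Model final : Concept {
        explicit Model(T s) : src(std::move(s)) {}
        std::size_t read(char* b, std::size_t n) override { return src.read(b, n); }
        T src;
    };
    std::unique_ptr<Concept> self_;  // move-only wrapper
};

// One possible concrete source: an in-memory buffer (hypothetical).
struct MemorySource {
    std::string bytes;
    std::size_t pos = 0;
    std::size_t read(char* buf, std::size_t n) {
        const std::size_t k = std::min(n, bytes.size() - pos);
        std::copy_n(bytes.data() + pos, k, buf);
        pos += k;
        return k;  // 0 signals end of data
    }
};

// Level 2: any rule that decides whether a column is kept.
using ColumnFilter = std::function<bool(const std::string&)>;

// Level 3: any consumer of the selected output (here, just column names).
using ColumnSink = std::function<void(const std::string&)>;

// Drains a source in small chunks, whatever its concrete type is.
inline std::string read_all(DataSource& src) {
    std::string out;
    char buf[4];
    for (std::size_t n; (n = src.read(buf, sizeof buf)) > 0;) {
        out.append(buf, n);
    }
    return out;
}

// Forwards the columns accepted by the filter to the sink.
inline void emit_selected(const std::vector<std::string>& columns,
                          const ColumnFilter& keep, const ColumnSink& sink) {
    for (const auto& c : columns) {
        if (keep(c)) sink(c);
    }
}
```

A file-backed or asynchronous source would plug into DataSource in the same way as MemorySource, provided it offers a compatible read member function.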

The performance of the resulting code has been compared to implementations in other languages (C, Python, R, Julia).