Sunday, October 29, 2017

Excessive GCC memory usage for large std::bitset arrays

The C++ slack channel had a discussion last week about the code
#include <array>
#include <bitset>
#include <cstddef>

constexpr std::size_t N = 100000;
std::array<std::bitset<N>, N> elems;

int main() {}
that makes GCC consume about 9 gigabytes of memory in the parsing phase of the compilation. This does not happen when using C-style arrays, so changing the definition of elems to
std::bitset<N> elems[N];
makes the code compile without needing an excessive amount of memory. So why does GCC consume all this memory while parsing, and only when using std::array?

The reason has to do with deficiencies in GCC’s implementation of constexpr. To see what is happening, we start by expanding the include files and removing everything not needed for the program:
namespace std
{
  typedef long unsigned int size_t;
}

namespace std __attribute__ ((__visibility__ ("default")))
{
  template<typename _Tp, std::size_t _Nm>
    struct __array_traits
    {
      typedef _Tp _Type[_Nm];
    };

  template<typename _Tp, std::size_t _Nm>
    struct array
    {
      typedef std::__array_traits<_Tp, _Nm> _AT_Type;
      typename _AT_Type::_Type _M_elems;
    };
}

namespace std __attribute__ ((__visibility__ ("default")))
{
  template<size_t _Nw>
    struct _Base_bitset
    {
      typedef unsigned long _WordT;

      _WordT _M_w[_Nw];

      constexpr _Base_bitset() noexcept
      : _M_w() { }
    };

  template<size_t _Nb>
    class bitset
    : private _Base_bitset<((_Nb) / (8 * 8) + ((_Nb) % (8 * 8) == 0 ? 0 : 1))>
    {
    };
}

constexpr std::size_t N = 100000;
std::array<std::bitset<N>, N> elems;

int main() {}
_Base_bitset has a constexpr constructor, and this makes GCC end up building the whole elems array in memory at compile time. The array is large (1,250,400,000 bytes) and GCC need to use even more memory as it represents the array by building AST nodes for the elements.

The constexpr keyword does not mean that the compiler must evaluate at compile time – it only means that it can be evaluated at compile time if the result is used where only constant expressions are allowed. I had assumed that compile-time evaluation is not needed in this case, but GCC seems to always evaluate constexpr at compile time when instantiating templates. Anyway, GCC could use a more efficient representation that does not need to keep the whole array expanded in memory...

The C-style array is not expanded in memory at compile time as it is not defined as a template. The bitset class is still expanded, but it is small, and the compiler only wastes 12,504 bytes of memory by expanding it.