Using a C++ user-defined literal to initialise an array


I have a bunch of test vectors, presented in the form of hexadecimal strings:

MSG: 6BC1BEE22E409F96E93D7E117393172A
MAC: 070A16B46B4D4144F79BDD9DD04A287C
MSG: 6BC1BEE22E409F96E93D7E117393172AAE2D8A57
MAC: 7D85449EA6EA19C823A7BF78837DFADE

etc. I need to get these into a C++ program somehow, without too much editing required. There are various options:

  • Edit the test vectors by hand into the form 0x6B,0xC1,0xBE,...
  • Edit the test vectors by hand into the form "6BC1BEE22E409F96E93D7E117393172A" and write a function to convert that into a byte array at run time.
  • Write a program to parse the test vectors and output C++ code.
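For comparison, the second option (converting at run time) might look roughly like the sketch below. The names hex_digit and bytes_from_hex are made up for illustration, and the digit arithmetic assumes an ASCII-compatible character set.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit (assumes ASCII-compatible encoding).
inline std::uint8_t hex_digit(char c)
{
    if ('0' <= c && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if ('A' <= c && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    if ('a' <= c && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical helper: converts a string such as "6BC1BEE2" into bytes at run time.
inline std::vector<std::uint8_t> bytes_from_hex(const std::string& s)
{
    if (s.size() % 2 != 0) throw std::invalid_argument("odd number of hex digits");

    std::vector<std::uint8_t> out;
    out.reserve(s.size() / 2);
    for (std::size_t i = 0; i < s.size(); i += 2)
        out.push_back(static_cast<std::uint8_t>((hex_digit(s[i]) << 4) | hex_digit(s[i + 1])));
    return out;
}
```

The cost is that malformed vectors are only detected when the program runs, not when it is compiled.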

But the one I ended up using was:

  • User-defined literals,

because fun. I defined a helper class HexByteArray and a user-defined literal operator HexByteArray operator "" _$ (const char* s) that parses a string of the form "0xXX...XX", where XX...XX is an even number of hex digits. HexByteArray includes conversion operators to const uint8_t* and std::vector<uint8_t>. So now I can write e.g.

struct {
  std::vector<uint8_t> MSG ;
  const uint8_t* MAC ;   // const, to match HexByteArray's conversion to const uint8_t*
  } Test1 = {
  0x6BC1BEE22E409F96E93D7E117393172A_$,
  0x070A16B46B4D4144F79BDD9DD04A287C_$
  } ;

Which works nicely. But now here is my question: Can I do this for arrays as well? For instance:

uint8_t MAC[16] = 0x070A16B46B4D4144F79BDD9DD04A287C_$ ; 

or even

uint8_t MAC[] = 0x070A16B46B4D4144F79BDD9DD04A287C_$ ; 

I can't see how to make this work. To initialise an array, I would seem to need an std::initializer_list. But as far as I can tell, only the compiler can instantiate such a thing. Any ideas?


Here is my code:

HexByteArray.h

#include <cstdint>
#include <vector>

class HexByteArray
  {
public:
  HexByteArray (const char* s) ;
  ~HexByteArray() { delete[] a ; }

  // Hands ownership of the buffer to the caller, who must delete[] it
  operator const uint8_t*() && { const uint8_t* t = a ; a = 0 ; return t ; }
  operator std::vector<uint8_t>() &&
    {
    // The vector holds its own copy, so a is left alone for the destructor to free
    std::vector<uint8_t> v ( a, a + len ) ;
    return v ;
    }

  class ErrorInvalidPrefix { } ;
  class ErrorHexDigit { } ;
  class ErrorOddLength { } ;

private:
  const uint8_t* a = 0 ;
  size_t len ;
  } ;

inline HexByteArray operator "" _$ (const char* s)
  {
  return HexByteArray (s) ;
  }

HexByteArray.cpp

#include "HexByteArray.h"

#include <cctype>
#include <cstdio>    // sscanf
#include <cstring>

HexByteArray::HexByteArray (const char* s)
  {
  if (s[0] != '0' || toupper (s[1]) != 'X') throw ErrorInvalidPrefix() ;
  s += 2 ;

  // Special case: 0x0_$ is an empty array (because 0x_$ is invalid C++ syntax)
  if (!strcmp (s, "0"))
    {
    a = nullptr ; len = 0 ;
    }
  else
    {
    for (len = 0 ; s[len] ; len++) if (!isxdigit (s[len])) throw ErrorHexDigit() ;
    if (len & 1) throw ErrorOddLength() ;
    len /= 2 ;
    uint8_t* t = new uint8_t[len] ;
    for (size_t i = 0 ; i < len ; i++, s += 2)
      sscanf (s, "%2hhx", &t[i]) ;
    a = t ;
    }
  }

 


Use a numeric literal operator template, with the signature:

template <char...> result_type operator "" _x(); 
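To see what this form of operator receives, here is a minimal sketch with a made-up suffix _len. The compiler passes every character of the literal's spelling, including the 0x prefix, as a template argument, and as far as I know the literal is never converted to an integer, so its width is not limited by any integer type.

```cpp
#include <cstddef>

// Hypothetical suffix _len: the number of characters the literal was spelled with.
template <char... cs>
constexpr std::size_t operator""_len()
{
    return sizeof...(cs);
}

// '0', 'x' and four hex digits: six characters in total.
static_assert(0x6BC1_len == 6, "prefix characters are included");

// 32 hex digits plus the prefix; no integer type needs to hold the value.
static_assert(0x6BC1BEE22E409F96E93D7E117393172A_len == 34, "wide literals are fine");
```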

Also, since the data is known at compile-time, we might as well make everything constexpr. Note that we use std::array instead of C-style arrays:

#include <cstdint>
#include <array>
#include <vector>

// Constexpr hex parsing algorithm follows:
struct InvalidHexDigit {};
struct InvalidPrefix {};
struct OddLength {};

constexpr std::uint8_t hex_value(char c)
{
    if ('0' <= c && c <= '9') return c - '0';
    // This assumes ASCII:
    if ('A' <= c && c <= 'F') return c - 'A' + 10;
    if ('a' <= c && c <= 'f') return c - 'a' + 10;
    // In constexpr-land, this is a compile-time error if execution reaches it:
    // The weird `if (c == c)` is to work around gcc 8.2 erroring out here even though
    // execution doesn't reach it.
    if (c == c) throw InvalidHexDigit{};
}

constexpr std::uint8_t parse_single(char a, char b)
{
    return (hex_value(a) << 4) | hex_value(b);
}

template <typename Iter, typename Out>
constexpr auto parse_hex(Iter begin, Iter end, Out out)
{
    if (end - begin <= 2) throw InvalidPrefix{};
    if (begin[0] != '0' || (begin[1] != 'x' && begin[1] != 'X')) throw InvalidPrefix{};
    if ((end - begin) % 2 != 0) throw OddLength{};

    begin += 2;

    while (begin != end)
    {
        *out = parse_single(*begin, *(begin + 1));
        begin += 2;
        ++out;
    }

    return out;
}

// Make this a template to defer evaluation until later
template <char... cs>
struct HexByteArray
{
    // cs... includes the "0x" prefix, which contributes no bytes to the result
    static constexpr auto to_array()
    {
        constexpr std::array<char, sizeof...(cs)> data{cs...};

        std::array<std::uint8_t, ((sizeof...(cs) - 2) / 2)> result{};

        parse_hex(data.begin(), data.end(), result.begin());

        return result;
    }

    constexpr operator std::array<std::uint8_t, ((sizeof...(cs) - 2) / 2)>() const
    {
        return to_array();
    }

    operator std::vector<std::uint8_t>() const
    {
        constexpr auto tmp = to_array();

        return std::vector<std::uint8_t>{tmp.begin(), tmp.end()};
    }
};

template <char... cs>
constexpr auto operator"" _$()
{
    static_assert(sizeof...(cs) % 2 == 0, "Must be an even number of chars");
    return HexByteArray<cs...>{};
}


Example usage:

auto data_array = 0x6BC1BEE22E409F96E93D7E117393172A_$ .to_array();
std::vector<std::uint8_t> data_vector = 0x6BC1BEE22E409F96E93D7E117393172A_$;

As a side note, $ in an identifier is a compiler extension (gcc accepts it) rather than standard C++, so the suffix _$ is not portable. Consider using a different UDL suffix.
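As a rough illustration of both points (a standard-conforming suffix, and std::array in place of a C-style array), here is a compact sketch that returns the std::array directly. It assumes C++17 and ASCII; the names _hex, hex_digit_value and all_hex_digits are invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: value of one hex digit (assumes ASCII), or -1 if c is not one.
constexpr int hex_digit_value(char c)
{
    if ('0' <= c && c <= '9') return c - '0';
    if ('A' <= c && c <= 'F') return c - 'A' + 10;
    if ('a' <= c && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: true if the n characters starting at p are all hex digits.
constexpr bool all_hex_digits(const char* p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (hex_digit_value(p[i]) < 0) return false;
    return true;
}

// Hypothetical suffix _hex: no '$', so only standard identifier characters are used.
template <char... cs>
constexpr auto operator""_hex()
{
    constexpr char text[] = {cs...};
    constexpr std::size_t n = sizeof...(cs);
    static_assert(n > 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X'),
                  "literal must start with 0x");
    static_assert(n % 2 == 0, "need an even number of hex digits");
    static_assert(all_hex_digits(text + 2, n - 2), "only hex digits may follow 0x");

    // One byte per pair of digits; the two prefix characters produce no bytes.
    std::array<std::uint8_t, (n - 2) / 2> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<std::uint8_t>(
            (hex_digit_value(text[2 + 2 * i]) << 4) | hex_digit_value(text[3 + 2 * i]));
    return out;
}

// The array size is deduced from the literal and the contents are checked at compile time.
constexpr auto mac = 0x070A16B46B4D4144F79BDD9DD04A287C_hex;
static_assert(mac.size() == 16, "one byte per pair of hex digits");
static_assert(mac[0] == 0x07 && mac[15] == 0x7C, "first and last byte");
```

A C-style array still cannot be initialised from the literal directly. If one is really needed, copying the std::array into it (for example with std::copy) is one workaround.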
