Author Topic: C++ How about this std::map based datastructure? (Read 4026 times)

Zeyneb · « **on:** November 21, 2021, 09:46:19 pm »

Hi there,

The little example program shown here compiles and runs fine in GCC with C++14 language support enabled.

The objective of this program is to simulate the behavior of microcontroller memory. Instead of declaring a giant array for the entire address range of the microcontroller memory. I'd like to declare just what is needed. So the idea is this datastructure:

Code: [Select]

struct addr_range
{
  uint8_t addr;
  uint8_t size;
};

typedef std::vector< uint8_t > vector_t;
typedef std::map< addr_range, vector_t, std::less<> > map_t;

with an initialization like this:

Code: [Select]

  map_t mem =
  {
   { {0xC0, 3}, { 1, 2, 3 } },
   { {0xD0, 8}, { 7, 6, 5, 4, 3, 2, 1, 0 } },
   { {0xB0, 4}, { 0xA, 0xB, 0xC, 0xD } },
   { {0x66, 2}, { 0xC, 0xD } },
   { {0x6D, 2}, { 0xF, 0xE } },
   { {0x70, 2}, { 0x3, 0x4 } }
  };

For now addresses and data values are just one byte. addr is the start address, size is the size to which data values are specified and the vector is just the data there. Obviously size should always be equal to the vector size.

With proper housekeeping I'll make sure that no address range will be overlapping another one. Also when a write occurs that will make an address range consecutive to another one I will assign the vector to the end of the vector in the lowest address and remove the vector in the higher address as well as the map entry. So for example a write to address 0x6F with the value 0x2 will have the memory look like:

Code: [Select]

  map_t mem =
  {
   { {0xC0, 3}, { 1, 2, 3 } },
   { {0xD0, 8}, { 7, 6, 5, 4, 3, 2, 1, 0 } },
   { {0xB0, 4}, { 0xA, 0xB, 0xC, 0xD } },
   { {0x66, 2}, { 0xC, 0xD } },
   { {0x6D, 5}, { 0xF, 0xE, 0x2, 0x3, 0x4 } }
  };

Now for the C++ code. As you can see I have a bunch of operator< overloads. The last two can compare a single uint8_t to an address range. And these are used in the map::find function call. Notice that for map::find I only have an uint8_t argument. For insertion the full keys are being compared and that is done in the first operator< overload.

I not quite sure if it is ok to have different compare function objects depending on which member function of map is called. Do you know if that is fine? Comments and caveats on this code? How would you program something like this?

Code: [Select]

#include <cstddef>
#include <cstdio>
#include <cstdint>
#include <vector>
#include <map>

struct addr_range
{
  uint8_t addr;
  uint8_t size;
};

typedef std::vector< uint8_t > vector_t;
typedef std::map< addr_range, vector_t, std::less<> > map_t;

bool operator<( const addr_range& a, const addr_range& b ){
  //printf( "try %2.2X < %2.2X: %u\n", a.addr, b.addr, a.addr < b.addr );
  return a.addr < b.addr;
}
bool operator<( const addr_range& a, const uint8_t& b )
{
  bool result = (a.addr + a.size - 1) < b;
  //printf( "1 try %2.2X+%2.2X=%2.2X < %2.2X: %u\n", 
  //  a.addr, a.size - 1, (a.addr + a.size - 1), b, result );
  return result;
}
bool operator<( const uint8_t& a, const addr_range& b )
{
  bool result = a < b.addr;
  //printf( "2 try %2.2X < %2.2X: %u\n", a, b.addr, result );
  return result;
}

int main()
{
  map_t mem =
  {
   { {0xC0, 3}, { 1, 2, 3 } },
   { {0xD0, 8}, { 7, 6, 5, 4, 3, 2, 1, 0 } },
   { {0xB0, 4}, { 0xA, 0xB, 0xC, 0xD } },
   { {0x66, 2}, { 0xC, 0xD } },
   { {0x6D, 2}, { 0xF, 0xE } },
   { {0x70, 2}, { 0x3, 0x4 } }
  };

  mem.insert( {{0xCC, 1}, {8}} );
  
  uint8_t i = 0;
  next_lookup:
  auto search = mem.find( i );
  if ( search != mem.end() )
  {
    uint8_t pos = i - search->first.addr;
    printf( "%2.2X: %2.2X\n", i, search->second[pos] );
  }
  if ( i == 255 ) { goto quit_loop; }
  i++;
  goto next_lookup;
  quit_loop:
  
  getchar();
  return 0;
}

Regards, Zeyneb

TheCalligrapher · « **Reply #1 on:** November 22, 2021, 05:04:20 am »

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm

As you can see I have a bunch of operator< overloads. The last two can compare a single uint8_t to an address range. And these are used in the map::find function call. Notice that for map::find I only have an uint8_t argument. For insertion the full keys are being compared and that is done in the first operator< overload.

I not quite sure if it is ok to have different compare function objects depending on which member function of map is called. Do you know if that is fine?

It is fine since C++14, provided your map's primary comparator is transparent, i.e. it contains a type-member `is_transparent`. You used `std::less` as the primary comparator. This is a transparent comparator, which enables search by a sub-key. Of course, all comparators must comply to the same ordering. In your case, if the ranges do not overlap, this requirement appears to be met.

It is not clear though, why you insist on passing `uint8_t` values by reference.

SiliconWizard · « **Reply #2 on:** November 22, 2021, 05:02:25 pm »

Another thought is that it can be implemented in a much simpler way. But hey, to each their own. And to each their concept of simple.
And that being said, even so, I don't see much point. "Instead of declaring a giant array for the entire address range of the microcontroller memory": as much as I like avoiding software bloat, your typical MCU will have insignificant memory size compared to even the smallest desktop computer available. Frankly no need to care here. Whatsoever.
And if you're implementing "sparse" arrays not for emulating memory per se, but for emulating the whole address space, then why not, but you're probably not implementing address decoding the "right" way. But always interesting how people's minds can work in such different ways.

Zeyneb · « **Reply #3 on:** November 22, 2021, 05:33:38 pm »

Quote from: SiliconWizard on November 22, 2021, 05:02:25 pm

Another thought is that it can be implemented in a much simpler way.

Alright, so what is your idea of a much simpler way? Please be specific.

Quote from: SiliconWizard on November 22, 2021, 05:02:25 pm

but you're probably not implementing address decoding the "right" way.

You mean generating a chip select signal? If so, at this time I don't need this level of simulation. It is more about the CPU perspective of reading program memory and doing RAM reads and writes.

magic · « **Reply #4 on:** November 22, 2021, 06:35:14 pm »

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm

With proper housekeeping I'll make sure that no address range will be overlapping another one.

At this point you should declare your own class with that map of vectors hidden inside as a private member and a simple read/write API. Saves sanity when you need to access the data structure in multiple places.

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm

Obviously size should always be equal to the vector size.

That's redundant (the vector already knows its size). If you make it so that every read access goes through a custom class method (see above), you won't need to have that information in the key, you will have the luxury to use ordinary integer keys and map::find with the default integer comparison and then verify if the vector is long enough as a second step.

Zeyneb · « **Reply #5 on:** November 22, 2021, 08:03:04 pm »

Quote from: magic on November 22, 2021, 06:35:14 pm

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm
With proper housekeeping I'll make sure that no address range will be overlapping another one.
At this point you should declare your own class with that map of vectors hidden inside as a private member and a simple read/write API. Saves sanity when you need to access the data structure in multiple places.

Tak absolutnie. Też tak myślałem. But my intention was to show a minimal example.

Quote from: magic on November 22, 2021, 06:35:14 pm

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm
Obviously size should always be equal to the vector size.
That's redundant (the vector already knows its size). If you make it so that every read access goes through a custom class method (see above), you won't need to have that information in the key, you will have the luxury to use ordinary integer keys and map::find with the default integer comparison and then verify if the vector is long enough as a second step.

I doubt if you fully understand my example code. In the 2nd operator< overload I do use the addr_range size to determine if b is greater or equal than the addr_range upper bound. Otherwise surprise me with an awesome piece of code that demonstrates that I can use the vector size for that.

magic · « **Reply #6 on:** November 22, 2021, 09:06:15 pm »

Like that, I suppose

Code: [Select]

<?php
uint8_t read(addr_t a) {
  map_t::iterator i = map.find(a);
  if ((!i) || (a >= i->first + i->second.size()))
    return 0xff; // address not present, do something about it
  return i->second[a - i->first];
}

edit
Some checks may potentially be necessary to prevent/control address arithmetic overflows under pathological conditions. Particularly if addr_t were signed, which it hopefully is not going to be.

bson · « **Reply #7 on:** November 23, 2021, 03:38:12 am »

Quote from: Zeyneb on November 21, 2021, 09:46:19 pm

Just add the comparison to the key type.

Code: [Select]

struct addr_range
{
  uint8_t addr;
  uint8_t size;

  bool operator<(const addr_range& rhs) const { return addr < rhs.addr; }
};

typedef std::vector< uint8_t > vector_t;
typedef std::map< addr_range, vector_t > map_t;

TheCalligrapher · « **Reply #8 on:** November 23, 2021, 09:20:42 am »

Quote from: bson on November 23, 2021, 03:38:12 am

Just add the comparison to the key type.

Firstly, to achieve what exactly? What is this supposed to solve or answer?

Secondly: no. Don't "add the comparison to the key type". A much more sensible approach is demonstrated by the original code: implementation by a freestanding function (possibly a `friend`), not by a member function.

bson · « **Reply #9 on:** November 25, 2021, 01:23:11 am »

Quote from: TheCalligrapher on November 23, 2021, 09:20:42 am

Firstly, to achieve what exactly? What is this supposed to solve or answer?

Encapsulation. This way you define the ordering right where you declare data used as a key. In the future, if you change the data you can also change its ordering in the same place without having to search your code for all uses of the underlying data. I'd also add other useful data-specific methods, like a predicate contains() to check if a single value is contained in a range, and a predicate intersect() to test if two ranges overlap. Even in the case where these aren't actually used, they're self-documenting in that they formally describe what it means to be contained inside a range, and how ranges order.

TheCalligrapher · « **Reply #10 on:** November 25, 2021, 05:44:06 am »

Quote from: bson on November 25, 2021, 01:23:11 am

Quote from: TheCalligrapher on November 23, 2021, 09:20:42 am
Firstly, to achieve what exactly? What is this supposed to solve or answer?
Encapsulation. This way you define the ordering right where you declare data used as a key. In the future, if you change the data you can also change its ordering in the same place without having to search your code for all uses of the underlying data.

Firstly, this is not "encapsulation". The term has a completely different meaning. What you are talking about is a mere proximity of the comparator definition to the datatype definition.

And nothing prevents you from achieving the same proximity with a freestanding function:

1. You can define the comparator (as `inline`, if necessary) immediately after the definition of your datatype. There's no material difference in proximity in this case (whether it is defined before of after the closing `}` of the datatype has no noticeable impact)

Code: [Select]

struct Datatype
{
  ...
};

inline bool less_comparator(const Datatype &lhs, const Datatype &rhs)
{
  ...
}

2. Or, if you really want to place the comparator definition into the datatype definition, you can define the comparation as a `friend`, which is often a more popular approach: an idiomatic side-usage of `friend` keyword

Code: [Select]

struct Datatype
{
  ...
  friend bool less_comparator(const Datatype &lhs, const Datatype &rhs)
  {
    ...
  }
};

In both cases you get the same degree of proximity. Everything is "in the same place without having to search your code for all uses of the underlying data", exactly as you wanted.

Basically, the proximity you are talking about is a good thing to keep in mind, but for some reason you insist on employing a bizzare design decision - defining the comparator as a member function instead of a freestanding function - to achieve that objective. There's really no need for that, no need to sacrifice design to achieve the desired proximity. You can still easily have it with freestanding functions, as shown above.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

EEVblog Electronics Community Forum

Author Topic: C++ How about this std::map based datastructure? (Read 4026 times)

Zeyneb

C++ How about this std::map based datastructure?

TheCalligrapher

Re: C++ How about this std::map based datastructure?

SiliconWizard

Re: C++ How about this std::map based datastructure?

Zeyneb

Re: C++ How about this std::map based datastructure?

magic

Re: C++ How about this std::map based datastructure?

Zeyneb

Re: C++ How about this std::map based datastructure?

magic

Re: C++ How about this std::map based datastructure?

bson

Re: C++ How about this std::map based datastructure?

TheCalligrapher

Re: C++ How about this std::map based datastructure?

bson

Re: C++ How about this std::map based datastructure?

TheCalligrapher

Re: C++ How about this std::map based datastructure?

Share me