Author Topic: looking for a Btree not recursive implementation, string keys  (Read 6163 times)

0 Members and 1 Guest are viewing this topic.

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
looking for a Btree not recursive implementation, string keys
« on: December 14, 2017, 12:36:57 pm »
btree is a complex algorithm, especially if implemented without recursion.
Anyway, I need to search string-keys, that return uint32_t

uint32_t ans=btree_find(string_t key);

let me know about books, papers, or chunks of good C code!  :popcorn:
« Last Edit: December 14, 2017, 12:41:42 pm by legacy »
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #1 on: December 14, 2017, 05:11:05 pm »
Why would you want to implement it without recursion?

By design, btrees are always balanced, and with typically large branching factors they are not very deep. Therefore there is almost no stack used, but it does make the algorithms simpler.

It's not hard to set some memory aside to use as a fake stack with an on the surface iterative algorithm. But it complicates the code. And, perhaps worse on small computers (and like the processor stack), makes that memory unavailable for other function to use when you're not manipulating the btree.
 
The following users thanked this post: nugglix

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #2 on: December 14, 2017, 05:47:01 pm »
Why would you want to implement it without recursion?

Design constraint.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #3 on: December 14, 2017, 05:55:43 pm »
Why would you want to implement it without recursion?

Design constraint.

Why?

There's no good reason, other than as a school exercise.

Which I guess this must be.
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3137
  • Country: ca
Re: looking for a Btree not recursive implementation, string keys
« Reply #4 on: December 14, 2017, 05:57:45 pm »
It's not hard to set some memory aside to use as a fake stack with an on the surface iterative algorithm. But it complicates the code. And, perhaps worse on small computers (and like the processor stack), makes that memory unavailable for other function to use when you're not manipulating the btree.

I don't think you need a "fake stack" for tree search. You don't need to walk the whole tree.

You create a variable which represents a node. Let's call it NodeVar. You assign the root to NodeVar. Then you loop. On each pass of the loop you select the node which would contain the element being searched and assign this node to NodeVar. As the next loop comes, it'll be searching within new NodeVar. And so on until the element is found (or not found). So, you only need one variable.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #5 on: December 14, 2017, 06:16:28 pm »
It's not hard to set some memory aside to use as a fake stack with an on the surface iterative algorithm. But it complicates the code. And, perhaps worse on small computers (and like the processor stack), makes that memory unavailable for other function to use when you're not manipulating the btree.

I don't think you need a "fake stack" for tree search. You don't need to walk the whole tree.

You create a variable which represents a node. Let's call it NodeVar. You assign the root to NodeVar. Then you loop. On each pass of the loop you select the node which would contain the element being searched and assign this node to NodeVar. As the next loop comes, it'll be searching within new NodeVar. And so on until the element is found (or not found). So, you only need one variable.

For simple search, sure. You can still write it as recursive, but as the recursive call will be a tail-call (you just return its result, without any extra work after the call) any decent compiler will turn that into a loop for you.

It's a different matter if you want to insert the key if you find it is missing.
 
The following users thanked this post: nugglix

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #6 on: December 14, 2017, 10:47:03 pm »
Why?

There's no good reason, other than as a school exercise.

Which I guess this must be.

Don't be arrogant. In avionic you can't have a stack, sometimes.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #7 on: December 14, 2017, 10:48:13 pm »
I don't think you need a "fake stack" for tree search. You don't need to walk the whole tree.

You create a variable which represents a node. Let's call it NodeVar. You assign the root to NodeVar. Then you loop. On each pass of the loop you select the node which would contain the element being searched and assign this node to NodeVar. As the next loop comes, it'll be searching within new NodeVar. And so on until the element is found (or not found). So, you only need one variable.

yes, good idea! Thanks!
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #8 on: December 15, 2017, 09:40:05 am »
Why?

There's no good reason, other than as a school exercise.

Which I guess this must be.

Don't be arrogant. In avionic you can't have a stack, sometimes.

If I have a btree I'd be a million times more worried about running out of dynamic memory (whether heap or disk) than about running out of stack.

More than that, if I know how much dynamic memory is available for the btree, I can calculate and PROVE the maximum stack usage.
 
The following users thanked this post: nugglix

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4196
  • Country: us
Re: looking for a Btree not recursive implementation, string keys
« Reply #9 on: December 17, 2017, 07:31:12 am »
Quote
I can calculate and PROVE the maximum stack usage.
That doesn't help if you work in an industry with "coding standards" the prohibit recursion :-(
You'd think there would be a repository somewhere of "here are a bunch popular C++/Java/whatever containers and algorithms implemented in a way that meets the requirements set forth by MISRA/NASA/ZYZZY"?  Or is that what the "government computing industry" charges the big bucks for?

Did you really mean BTree?  Most of the stuff I was reading online indicates that they're aimed mainly at disks, and optimized to minimize the number of separate disk accesses.  But I'll admit to being unclear on the subtleties of BTree vs Binary Tree vs Binary Search tree, RB Trees and etc...
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #10 on: December 17, 2017, 09:24:27 am »
Did you really mean BTree?  Most of the stuff I was reading online indicates that they're aimed mainly at disks, and optimized to minimize the number of separate disk accesses.  But I'll admit to being unclear on the subtleties of BTree vs Binary Tree vs Binary Search tree, RB Trees and etc...

This 'ere topic (and the first post) says "btree". I took him at his word. Maybe he's confused.

BTrees were originally designed for disk, it's true, but given the way modern CPUs are designed (at least higher end ones with caches and MMUs) they are also better for in-memory use than binary trees. Chasing pointers is cheaper than a disk seek, but it's still a *lot* of CPU cycles if the result has to come from RAM. And TLB misses are very costly, needing usually a number of reads from tables in RAM.

Most programmers have absolutely NO IDEA that even i7 CPUs like Nehalem or Skylake have only 64 or 128 TLB entries, meaning that only that number of 4 KB pages (256KB - 512KB) can be used before you get TLB misses. The ARM processors in most phones have fewer, though not all that much fewer these days ..  for example the Cortex A15 has 32 L1 data TLB entries.

So, once you get a 4 KB page mapped into the TLB it makes sense to use as much information as possible from it, not just a 16 byte (maybe) RB-tree node and then follow a pointer to a different 4 KB page.

The same applies to cache lines. Once you fetched a cache line, it makes sense to look at all of it. Most modern caches are 64 bytes per line, so you should at least make your tree nodes *that* big in some useful way. So, for example, on a 32 bit machine you might put 7 32-bit keys and 8 pointers in each tree node. With standard BTree balancing you should have at least 6 pointers used in every node, so a tree with a million items will be 8 levels deep instead of 20 levels minimum (about 25-28 in practice) for a balanced binary tree. That's a lot less pointer chasing.
 
The following users thanked this post: nugglix

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #11 on: December 17, 2017, 02:07:55 pm »
This 'ere topic (and the first post) says "btree". I took him at his word. Maybe he's confused.

btree means btree, and I mean btree with the meaning of btree
b+tree is a variant, that I may consider
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #12 on: December 17, 2017, 02:16:08 pm »
MISRA/NASA/ZYZZY"

must be DO178C/level_D compliant! (1)
dynamic memory can't be used
recursion can't be used
stack must be checked for its maximal usage in the worse case

reasons don't matter, we have to it that way if we want to get paid  :D


yes, someone has asked us to have a btree module, tested, and certified. Of course I still have to design it.

edit:
(1) level_D means not critical
« Last Edit: December 17, 2017, 03:02:42 pm by legacy »
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #13 on: December 17, 2017, 02:18:54 pm »
Most programmers have absolutely NO IDEA that even i7 CPUs like Nehalem or Skylake have only 64 or 128 TLB entries

There is no data-cache in the final hw.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #14 on: December 17, 2017, 02:57:32 pm »
I don't know really really what is the final purpose, my user-level is classified like if was a guest, in fact I have a pass-level in my badge that doesn't open the toilette's door of the same floor where someone must have done a mistake on allocating my ass to a seat for my desk *IF* I have to go downstairs and my badge only opens the door of the toilette used by visitors  :palm: :palm: :palm: :palm: :palm:

- Mistakes in desk-allocation are irrelevant - so I was said, and I can believe it since I also have tons of cameras and microphones pointed to my desk and I can't say a blasphemy without being heard by someone with badge. - keep your the language polite, mr P, please! - Thus I consider myself as the same level. And I am a consultant, anyway, not an employed.

I mean, I can't question, I can't do questions, the module I have to write is just a tiny piece of library to the application that runs on the top of the BSP, and details are so unknown that I look like the kind of man who has to look at the keyhole to understand what is in the room beyond to the door! In fact  I have no idea about the system integration, the only information I have to the next piece of software is what I have recently heard some rumors about the final target in the lab: there was a black-box on the table fully covered by resin, with some interfaces that, from the point of view of the documentation , are just a listing of data-types. But, if I got, and if i understand correctly, they want a sort of database to query details on tactical maps.

It's for sure for a flying board something, for an aircraft, but it may be for an Helicopters, or for a space ship. Who knows  :-// ?

Discretion is advised, and it's the promise you did - keep the language polite, and don't do questions if you want to keep your ass on the allocated desk - Not a good affair, but oh, I can stay in Norway for a long time and here Secretaries are all blond pretty pretty girls, and, the air is fresh, and the food is excellent  :D
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #15 on: December 17, 2017, 03:11:38 pm »
MISRA/NASA/ZYZZY"

must be DO178C/level_D compliant! (1)
dynamic memory can't be used
recursion can't be used
stack must be checked for its maximal usage in the worse case

reasons don't matter, we have to it that way if we want to get paid  :D


yes, someone has asked us to have a btree module, tested, and certified. Of course I still have to design it.

Good luck making a btree (or any kind of tree) without dynamic memory.

Sure, you could pre-allocate a bunch of nodes as global/static (or on disk), and then check for running out of them ... but that's no different than checking for malloc() running out of space and returning 0. You're exactly equally fucked in both cases.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #16 on: December 17, 2017, 03:32:44 pm »
oh, another interesting details that I have only recently got says that it must be a "disk-btree" algorithm!

It means you have to access 1Kbyte by some provided API's functions (e.g. disk_read(..), disk_write(..), ..) and you will get 1Kbyte of structure where your BTREE/B+TREE's data must reside.

(btree != b+tree, but I have one degree of freedom, I can change the request into b+tree if it's better)

seeks are assumed << 5mSec, but it down't matter, since now we know that it's a disk-btree! So, we have to consider: pointers to memory, vs disk access, and this changes a bit of things in my previous design draft  (at least the data structure ) :palm:

The good news is that disk is not an electro-mechanical device, and it's not pATA/sATA, it's a custom solid state device with it's custom protocol, and probably it's made with FeRAM chips, but chips may be flash-chips. Who knows ?  :-//

From a low-level requirement I read that there is also a bit super capacitor able to keep the device powered for enough time to flush data from RAM into the chips. There is a specific interrupt for that, and algorithms must keep flying-datas written onto chips as fast as possible.

Thanks god there is no (not yet?) request for "journaling". Not related to the btree.

But there is a request for a chunk of data that needs to be written into chips collecting information about failures during the missions, warm/cold start, and tons of other information. That is the easiest part, since it doesn't imply btree. It's just a circular buffer!

So, the module I have to prepare is composed by two sub-modules: btree, and circular buffer.

I will reserve the first blocks to the circular buffer, and the rest to the database.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #17 on: December 17, 2017, 03:35:59 pm »
Good luck

don't worry about that, in the worst case I will get fired  :-//
already sent my CV to a company in Norway looking for a guardian of polar bears :D

(edit: kidding)
« Last Edit: December 17, 2017, 04:01:22 pm by legacy »
 

Offline NorthGuy

  • Super Contributor
  • ***
  • Posts: 3137
  • Country: ca
Re: looking for a Btree not recursive implementation, string keys
« Reply #18 on: December 17, 2017, 03:42:10 pm »
Sure, you could pre-allocate a bunch of nodes as global/static (or on disk), and then check for running out of them ...

malloc() is not allowed, but custom-made dynamic allocators as you have just described are Ok? This is certainly a very strange set of rules.
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #19 on: December 17, 2017, 03:59:26 pm »
malloc() is not allowed, but custom-made dynamic allocators as you have just described are Ok? This is certainly a very strange set of rules.

good point!

in my draft I am using a custom-made set of allocators.
They need to be approved, anyway  :-//

Code: [Select]
private p_btree_node_t  node_make
(
    p_btree_t p_btree
)
{
    p_btree_node_t p_node_new;

    p_node_new = mem_alloc(sizeof(btree_node_t));
    if (p_node_new isEqualTo NULL)
    {
        panic(module, id, "p_node_new creation");
    }

    p_node_new->is.leaf = false;
    p_node_new->item.n  = 0;
    p_node_new->head    = NULL;
    p_node_new->next    = NULL;
    return p_node_new;
}

"panic" is for the draft version, it will removed/changed into the final version!
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #20 on: December 17, 2017, 04:12:06 pm »
Considering the B+tree structure ...

if it wasn't to be disk-B+tree, I would been tempted to do this way

Code: [Select]
typedef struct
{
    uint32_t key;
    map_t map_item; /* custom map's item */
    p_this_t p; /* to leaf or to internal b+tree_node */
} btree_node_item_t;

Code: [Select]
typedef struct
{
    btree_node_item_t it[btree_node_item_N];
    uint32_t          n; /* how many items have been used */
} btree_node_items_t;

the pointer "p" is useful to point to leaf, or to internal nodes
but now I have to consider that the b+tree needs to be loaded and saved from/to a storage device

I wonder if I can change "ram pointers" (uint32_t*) into "(block,sub-block) pointer" (uint32_t,uint32_t)

to do something like this:

disk_read (block_pointer) ---> 1Kbyte of data loaded into ram as "btree_node_items_t"
item = btree_node_item_t it[sub_block_pointer]

does it make sense?
 

Offline legacyTopic starter

  • Super Contributor
  • ***
  • !
  • Posts: 4415
  • Country: ch
Re: looking for a Btree not recursive implementation, string keys
« Reply #21 on: December 17, 2017, 04:16:40 pm »
I mean, leaves can also be "packed" that way, so iterating a tree (e.g. full scan or range scan) should be very efficient, because you read hundreds/thousands data-rows per single block  :-//
 

Offline Jr460

  • Regular Contributor
  • *
  • Posts: 142
Re: looking for a Btree not recursive implementation, string keys
« Reply #22 on: December 17, 2017, 04:43:52 pm »
It was over 30 years ago, but I had full blown set of Btree code in FORTRAN.   Want no recursion or dynamic memory....   Try FORTRAN-77.   I didn't even think I have a hard copy of it anywhere.  Of course it had some limits, depth of trees for example, based not he compile time size of some arrays.  Since it was used for disk access of files, it also did a bunch of caching of pages.
 

Offline asgard20032

  • Regular Contributor
  • *
  • Posts: 184
Re: looking for a Btree not recursive implementation, string keys
« Reply #23 on: December 17, 2017, 04:46:49 pm »
If you need to implement a recursive algorithm without using an hardware stack, thus avoiding recursion, then do a fake recursion implementation. A software stack. Most recursive algorithm can easily be converted to a non recursive algorithm using a software stack (a simple array with a limited length). And you can do check to ensure you never exceed the array length. Also, you can use a counter to make sure it dosen't jam in the algorithm.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4003
  • Country: nz
Re: looking for a Btree not recursive implementation, string keys
« Reply #24 on: December 17, 2017, 08:31:27 pm »
Sure, you could pre-allocate a bunch of nodes as global/static (or on disk), and then check for running out of them ...

malloc() is not allowed, but custom-made dynamic allocators as you have just described are Ok? This is certainly a very strange set of rules.

Exactly so.
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf