EEVblog Electronics Community Forum

Products => Computers => Programming => Topic started by: conducteur on September 16, 2020, 10:34:46 am

Title: best practice for implementation of graph with "conditional" adjecancy list
Post by: conducteur on September 16, 2020, 10:34:46 am
Hello,

I have a question about graphs. A common method to implement them is using the adjecancy list. I want to implement a graph, where the possible next nodes depend on the previous one.
What is best practice here to implement this in (C) code?
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: hamster_nz on September 16, 2020, 11:17:52 am
I don't think that there is enough of a description to give useful advice.

Maybe it would be best to duplicate nodes, so the become a simpler structure.

So say you can get to node C from nodes A and B, and from there you can only go to D if you came from A, or E if you come from B.

Split node C into nodes C_from_A and C_from_B, Then connect C_from_A to node D and C_from_B to node E.

That way working your way through the graph will not depend on the previous state any more.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: AntiProtonBoy on September 16, 2020, 03:16:49 pm
I want to implement a graph, where the possible next nodes depend on the previous one.
Sounds like you want a tree? Or a linked list in the simplest case?
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: Nominal Animal on September 16, 2020, 04:05:57 pm
The obvious extension to adjancency lists is to augment them with the restrictions on the edge, as usual.  (That is, if each edge has say a numerical cost, we associate the cost with the edge in the adjacency list exactly the same way we would this kind of restriction.)

In an initial implementation, I would split each edge from the current node to unique (required previous node):(next node) tuples,
Code: [Select]
struct node;

struct edge {
    struct node *reqd;  /* Required previous node */
    struct node *next;  /* Edge from current node to next node */
};
with the node having an array of such split edges:
Code: [Select]
struct node {
    size_t       edges;
    struct node *edge;
};
where the pointer to the node is the (permanent) identifier.  (This means you cannot reallocate a node or free it before you remove all edges to the node.) This allows a single thread to fully manipulate the graph between steps.

Because of the split nature of the forward edges, you'll need helper functions to calculate the actual number of edges from a given node (by calculating the number of unique ->edge[i].next values in the array),
Code: [Select]
size_t  unique_outgoing_edges(struct node *curr)
{
    if (!curr)
        return 0;

    const struct edge *edge = curr->edge;
    const size_t edges = curr->edges;
    size_t i, k, n = 0;

    /* Slow O(N²) implementation */
    for (i = 0; i < edges; i++) {
        /* Check for preceding duplicates of .next */
        for (k = 0; k < i; k++)
            if (edge[i].next == edge[k].next)
                break;
        /* Increment n by 1 if edge[i].next was the first occurrence of .next in the list. */       
        n += (k == i);
    }

    return n;
}
and how many edges are reachable from the current node given the previous visited node,
Code: [Select]
size_t  get_outgoing_edges(struct node *curr, struct node *prev, struct node next[], size_t count)
{
    size_t  n = 0;
    size_t  i;

    if (!curr) {
        return 0;
    }

    for (i = 0; i < curr->edges; i++) {
        if (curr->edge[i].reqd == prev) {
            if (n < count) {
                next[n] = curr->edge[i].next;
            }
            n++;
        }
    }

    return n;
}

Other than that, it should be straightforward.

When you have an initial implementation, you need to work with real-world data, and check if (and where) this simple implementation is inefficient.  If you find such cases, try to redesign the node and/or edge data structures so they'd behave better.  Often you find such simple initial implementations sufficient for further development, and can keep the simple implementation; just keep it in a self-contained unit, so you can refactor it with a better data structure and/or implementation, if necessary, later on.

It does not have to perform optimally, as long as it is sufficiently performant.  Maintainability and reliability of operation/robustness is much more important.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: SiliconWizard on September 16, 2020, 04:31:27 pm
If any node can have a number of adjacent nodes, then it looks like a tree. Basically, each node would be associated with a list of references to adjacent nodes. Depending on the language and implementation, those references may be pointers, or indices. (I would personally favor implementing this with dynamic tables, each item being a unique node, and the lists of adjacent nodes being lists of indices in this table.)

So in short:
* Graph = Table of nodes
* Node = Associated node data, Table of references to adjacent nodes
* Reference to adjacent node = Index of adjacent node (in the Table of nodes), Associated edge data (if any, like cost/weight...)

The benefit of the above structure is that it's usually more efficient memory-wise, more efficient cache-wise (for large graphs), and you can go through the graph in various ways, as you can also access nodes individually. Typical linked lists don't allow this, as you have to go through them one way or another to access any node.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: ledtester on September 16, 2020, 04:55:21 pm
We need more information about your use case, but it sounds like the "next" nodes will depend on more than just the current node. Usually this is done with (pseudo-)code like this:

Code: [Select]
     # given a node NODE
     for each possible next node NEXT of NODE:
         if NEXT is really reachable based on other criteria:
            ...do something with NEXT...

The nodes you loop over in the for-loop could be a limited set of nodes based on the current node or it could be all of the nodes in the graph. And when you "do something with NEXT" it might change the criteria used in future tests.

So the list of next nodes is never realized as a data structure in memory, but you virtually traverse it by using the for-loop and criteria test.

Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: hamster_nz on September 17, 2020, 01:12:22 am
This inspired me to make this directed graphs...

Both can spell Apes, Apples and Apropos, but the second is "where the possible next nodes depend on the previous one" to spell correctly.

Edit: Darn - I should have merged the E on Apes and Apples
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: Nominal Animal on September 17, 2020, 02:03:34 am
I do believe
Code: [Select]
digraph {
    rankdir = "LR";
    "A" -> "P";
    "P" -> "P" [ label="A" ];
    "P" -> "E" [ label="A" ];
    "P" -> "L" [ label="P" ];
    "P" -> "R" [ label="A" ];
    "P" -> "S" [ label="P" ];
    "P" -> "O" [ label="O" ];
    "L" -> "E" [ label="P" ];
    "E" -> "S" [ label="P" ];
    "E" -> "S" [ label="L" ];
    "R" -> "O" [ label="P" ];
    "O" -> "P" [ label="R" ];
    "O" -> "S" [ label="P" ];
}
implements APPS, APES, APPLES, and APROPOS, but no other strings starting at A.  The letter near the middle of an edge names the node required to have been the node previous to this edge.

Checking this I realized that the set of valid requirements (previous nodes) at a given node is exactly the set of nodes connecting to current node.  Yes, obvious.  But, in some problem solving cases where you are say pruning a graph in some way, and start with a dense graph, and wish to end up with a much sparser graph still fulfilling some criteria, this obvious fact can be useful.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: ledtester on September 17, 2020, 02:08:25 am
From webgraphviz.com:

[attachimg=1]

Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: Nominal Animal on September 17, 2020, 02:14:06 am
From webgraphviz.com:
Oh, I didn't know that site existed; thanks.  Note that I had to edit the Dot code, there was a bug in the "ROPO" cycle.  Should be fixed now.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: hamster_nz on September 17, 2020, 02:29:27 am
Of cause, ASSUME and ASSES break this...
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: Nominal Animal on September 17, 2020, 02:47:21 am
Here's brief C99 code that implements the APPS, APES, APPLES, APROPOS graph:
Code: [Select]
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

typedef  struct node  node;
typedef  struct edge  edge;

struct edge {
    const struct node *next;
    const struct node *reqd;
};

struct node {
    const char  *name;
    const edge   edge[]; /* NULL-terminated array */
};


static const node A, E, L, O, P, R, S;

static const node A = {
    .name = "A", .edge = { { .next = &P, .reqd = NULL }, { 0 } },
};

static const node E = {
    .name = "E", .edge = { { .next = &S, .reqd = &P },
                           { .next = &S, .reqd = &L }, { 0 } },
};

static const node L = {
    .name = "L", .edge = { { .next = &E, .reqd = &P }, { 0 } },
};

static const node O = {
    .name = "O", .edge = { { .next = &P, .reqd = &R },
                           { .next = &S, .reqd = &P }, { 0 } },
};

static const node P = {
    .name = "P", .edge = { { .next = &E, .reqd = &A },
                           { .next = &L, .reqd = &P },
                           { .next = &O, .reqd = &O },
                           { .next = &P, .reqd = &A },
                           { .next = &R, .reqd = &A },
                           { .next = &S, .reqd = &P }, { 0 } },
};

static const node R = {
    .name = "R", .edge = { { .next = &O, .reqd = &P }, { 0 } },
};

static const node S = {
    .name = "S", .edge = { { 0 } },
};

void check(const node *curr, const node *prev, char *const buf, size_t len, size_t const max)
{
    const size_t  namelen = (curr && curr->name) ? strlen(curr->name) : 0;
    size_t        traversed = 0;

    /* Add node name to the buffer. */
    if (namelen > 0) {
        if (len + namelen < max) {
            memcpy(buf + len, curr->name, namelen);
            len += namelen;
        } else
            len = max;
    }

    /* Traverse each possible edge */
    for (const edge *ce = curr->edge; ce->next != (void *)0; ce++)
        if (ce->reqd == prev) {
            traversed++;
            check(ce->next, curr, buf, len, max);
        }

    if (!traversed) {
        if (len < max) {
            buf[len] = '\0';
            fputs(buf, stdout);
            fputc('\n', stdout);
        } else {
            fprintf(stderr, "(end node reached, but buffer too small)\n");
        }
    }
}

int main(void)
{
    char buffer[1024];

    check(&A, NULL, buffer, 0, sizeof buffer);

    return EXIT_SUCCESS;
}

To add ASSUME and ASSES, or more than one ##@ pattern in general, the S (#) node has to be split.
Title: Re: best practice for implementation of graph with "conditional" adjecancy list
Post by: Nominal Animal on September 17, 2020, 03:21:00 am
Here's the C99 that adds ASSES and ASSUME to the list, by splitting S into two different nodes:
Code: [Select]
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <stdio.h>

typedef  struct node  node;
typedef  struct edge  edge;

struct edge {
    const struct node *next;
    const struct node *reqd;
};

struct node {
    const char  *name;
    const edge   edge[]; /* NULL-terminated array */
};


static const node A, E, L, M, O, P, R, S0, S1, U;

static const node A = {
    .name = "A", .edge = { { .next = &P, .reqd = NULL },
                           { .next = &S1, .reqd = NULL }, { 0 } },
};

static const node E = {
    .name = "E", .edge = { { .next = &S0, .reqd = &P },
                           { .next = &S0, .reqd = &L },
                           { .next = &S0, .reqd = &S1 }, { 0 } },
};

static const node L = {
    .name = "L", .edge = { { .next = &E, .reqd = &P }, { 0 } },
};

static const node M = {
    .name = "M", .edge = { { .next = &E, .reqd = &U }, { 0 } },
};

static const node O = {
    .name = "O", .edge = { { .next = &P, .reqd = &R },
                           { .next = &S0, .reqd = &P }, { 0 } },
};

static const node P = {
    .name = "P", .edge = { { .next = &E, .reqd = &A },
                           { .next = &L, .reqd = &P },
                           { .next = &O, .reqd = &O },
                           { .next = &P, .reqd = &A },
                           { .next = &R, .reqd = &A },
                           { .next = &S0, .reqd = &P }, { 0 } },
};

static const node R = {
    .name = "R", .edge = { { .next = &O, .reqd = &P }, { 0 } },
};

static const node S0 = {
    .name = "S", .edge = { { 0 } },
};

static const node S1 = {
    .name = "s", .edge = { { .next = &S1, .reqd = &A },
                            { .next = &E, .reqd = &S1 },
                            { .next = &U, .reqd = &S1 }, { 0 } },
};

static const node U = {
    .name = "U", .edge = { { .next = &M, .reqd = &S1 }, { 0 } },
};

void check(const node *curr, const node *prev, char *const buf, size_t len, size_t const max)
{
    const size_t  namelen = (curr && curr->name) ? strlen(curr->name) : 0;
    size_t        traversed = 0;

    /* Add node name to the buffer. */
    if (namelen > 0) {
        if (len + namelen < max) {
            memcpy(buf + len, curr->name, namelen);
            len += namelen;
        } else
            len = max;
    }

    /* Traverse each possible edge */
    for (const edge *ce = curr->edge; ce->next != (void *)0; ce++)
        if (ce->reqd == prev) {
            traversed++;
            check(ce->next, curr, buf, len, max);
        }

    if (!traversed) {
        if (len < max) {
            buf[len] = '\0';
            fputs(buf, stderr);
            fputc('\n', stderr);
        } else {
            fprintf(stderr, "(end node reached, but buffer too small)\n");
        }
    }
}

void dot_digraph(FILE *const out, ...)
{
    va_list     args;
    const node *curr;

    fprintf(out, "digraph {\n");
    fprintf(out, "    rankdir = \"LR\";\n");

    va_start(args, out);
    while (1) {
        curr = va_arg(args, const node *);
        if (!curr)
            break;

        fprintf(out, "    \"%p\" [ label=\"%s\" ];\n", (void *)curr, curr->name);
        for (const edge *ce = curr->edge; ce->next != (void *)0; ce++) {
            if (ce->reqd) {
                fprintf(out, "    \"%p\" -> \"%p\" [ label=\"%s\" ];\n", (void *)curr, (void *)(ce->next), ce->reqd->name);
            } else {
                fprintf(out, "    \"%p\" -> \"%p\";\n", (void *)curr, (void *)(ce->next));
            }
        }
    }
    va_end(args);

    fprintf(out, "}\n");
}

int main(void)
{
    char buffer[1024];

    check(&A, NULL, buffer, 0, sizeof buffer);

    dot_digraph(stdout, &A, &E, &L, &M, &O, &P, &R, &S0, &S1, &U, NULL);

    return EXIT_SUCCESS;
}

This one prints the strings to stderr, generating
    APES
    APPLES
    APPS
    APROPOS
    AssES
    AssUME
and a DOT language graph to standard output (but note that the function call takes the nodes to be included as parameters),
[attach=1]