[SOLVED] test if a point belongs to the output of the Bresenham's line drawing

solved

Are you simply not initializing the accumulator to the correct value? It seems like the two sides are offset into each other somehow and skipping a pixel.

For an iterated check, keep another accumulator, incremented according to the slope from the starting point to the target point. At the target's X coordinate (for a horizontal line), compare the accumulator values.

More precisely, you have a line (x1, y1)-(x2, y2) and a point (x3, y3), such that x1 <= x3 <= x2. The slope of the line is (y2 - y1) / (x2 - x1) and the slope from the start to the point is (y3 - y1) / (x3 - x1). So, each accumulator is incremented by the numerator and decremented by the denominator for the respective slope.

The math is to cross-multiply the slopes, i.e., numerator1 * denominator2 and vice versa, and compare their magnitudes.

More generally, we can take two points on a line, and one point not necessarily on the line, and do some geometry to solve for the nearest point, or projection, or sidedness, or all of those sorts of things. I have some code where I did this in 2D for a simple graphical engine (effectively the rudiments of the Build engine).

Tim

This is a very general problem in geometric computing (as discovered e.g. by the soild modelling community): dealing with very small angles, nearly tangential intersections, etc, Essentially, the problem is that the precision of the computation being performed is not sufficient to give a consistent answer in denegenrate cases.

Various answers have been proposed, but they are not great for FPGA implementation. For example, one such approach is that you compute a low precision answer together with a number that tells you if that answer is reliable. If it's not reliable, you redo the computation to higher precision.

See for example:
https://arxiv.org/pdf/math/9410209.pdf

Quote from: legacy on 05 Oct, 2018 22:18

Code: [Select]
|0000000000111111111122222222223333333333 |0123456789012345678901234567890123456789 00| | 01| B | 02| BBB | 03| BBBBBB | 04| GBBBBBBBB | 05| GGGBBBBBBBBB | 06| GGGGGGgBBBBBBBB | 07| GGGGGGGGGgbBBBBBBB | 08| GGGGGGGGGgbBBBBBBB | 09| GGGGGGGGg+bBBBBBB | 10| GGGGGGGggbbBBBBB | 11| GGGGGGGggbbBBBBB | 12| GGGGGGggbbbBBBB | 13| GGGGGgggbbbBBB | 14| GGGGGggbbbbBBB | 15| GGGGgggbbbbBB | 16| GGGgggrrbbbB | 17| GGGggrrrrbbB | 18| GGgrrrrrrrB | 19| GgrrrrrrRR | 20| GrrrrRRRR | 21| rRRRRRRRR | 22| RRRRRRRRR | 23| RRRRRRRR | 24| RRRRRRRR | 25| RRRRRRR | 26| RRRRRR | 27| RRRRRR | 28| RRRRR | 29| RRRR | 30| RRRR | 31| RRR | 32| RR | 33| RR | 34| R | 35| | <------------ defect !!! 36| R | 37| | 38| | 39| | |0000000000111111111122222222223333333333 |0123456789012345678901234567890123456789

here it is an example: see the hole on the vertex.

I think you are doing it wrong.... look down the right hand edge and how often you jump to the next column.

4,3,3,3,3,4,3,3,3,3,3,1

It should be more like
3,3,3,3,3,3,3,3,3,3,3,3

Code: [Select]

  |0000000000111111111122222222223333333333
  |0123456789012345678901234567890123456789
00|                                        |
01|                          B             |
02|                        BBB             |
03|                     BBBBBB             |
04|                  GBBBBBBBBB            |
05|                GGGBBBBBBBBB            |
06|             GGGGGGgBBBBBBBB            |
07|          GGGGGGGGGgbBBBBBBBB           |
08|           GGGGGGGGGgbBBBBBBB           |
09|            GGGGGGGGg+bBBBBBB           |
10|             GGGGGGGggbbBBBBBB          |
11|              GGGGGGGggbbBBBBB          |
12|               GGGGGGggbbbBBBB          |
13|                GGGGGgggbbbBBBB         |
14|                 GGGGGggbbbbBBB         |
15|                  GGGGgggbbbbBB         |
16|                   GGGgggrrbbbBB        |
17|                    GGGggrrrrbbB        |
18|                     GGgrrrrrrrB        |
19|                      GgrrrrrrRRR       |
20|                       GrrrrRRRRR       |
21|                        rRRRRRRRR       |
22|                        RRRRRRRRRR      |
23|                         RRRRRRRRR      |
24|                          RRRRRRRR      |
25|                           RRRRRRRR     |
26|                            RRRRRRR     |
27|                             RRRRRR     |
28|                              RRRRRR    |
29|                               RRRRR    |
30|                                RRRR    |
31|                                 RRRR   |
32|                                  RRR   |
33|                                   RR   |
34|                                    RR  |
35|                                     R  | <------------ defect !!!
36|                                     R  |
37|                                        |
38|                                        |
39|                                        |
  |0000000000111111111122222222223333333333
  |0123456789012345678901234567890123456789

It also looks like you have a defect on the left hand side too.

I suspect you have

Code: [Select]

 
   if(this_error > max_error) {
       y++;
       this_error -= max_error;
   }

Where you should have

Code: [Select]

 
   if(this_error >= max_error) {
       y++;
       this_error -= max_error;
   }

Or you have a fencepost error (a.k.a. off-by-one error) in your setup calculations.

Are you sure we are talking about the same algo?

The one I know is an unrolled version of:

Code: [Select]

for(x = x0; x <= x1; x++) {
    y  = y0 + (x-x0) * (y1-y0) / (x1-x0);  // Integer division
}

Usually with a refinement or two... or non-obvious definitions of (x0,y0) and (x1,y1)

For example, it is common to have half the error at the start, to move where the steps are:

Code: [Select]

    y  = y0 + ((x-x0) * (y1-y0)+(y1-y0)/2) / (x1-x0);

I think these days with 32-bit processors and high resolution displays, sane programmers simply use fixed-point arithmetic. Error accumulation probably made more sense in the 8-bit era...

Looking nice,

Did you ever get to grips why the error is there? Are you drawing a shape, or are you filling a polygon?

Prompted by this thread, on the weekend I looked at an old text book I had (https://www.goodreads.com/book/show/60564.3D_Computer_Graphics), and it was quite explicit that the choice of the point of sampling point vs the display coordinate system was responsible for these sorts of issues when filling areas, causing overlaps and unexpected gaps in renderings were objects share an edge.

I just found the text of the book online at https://archive.org/stream/3DComputerGraphicsAlanWatt3thEditionSelectedChapters/3D-Computer-Graphics-Alan-Watt-3th-Edition-Selected-Chapters_djvu.txt

Here is the passage, slightly reformatted from the ORC text:

Quote

One thing that has been slightly glossed over so far is the consideration of exactly where the borders of a polygon lie. This can manifest itself in adjacent polygons either by gaps appearing between them, or by them overlapping.

For example, in Figure 6.19, [a 4x4 grid showing a 3x3 square] the width of the polygon is 3 units, so it should have an area of 9 units, whereas it has been rendered with an area of 16 units. The traditional solution to this problem, and the one usually advocated in textbooks, is to consider the sample point of the pixel to lie in its centre, that is, at (x + 0,5, y + 0.5). {A pixel can be considered to be a rectangle of finite area with dimensions 1,0 x 1.0, and its sample point is the point within the pixel area where the scene is sampled in order to determine the value of the pixel.) So, for example, the intersection of an edge with a scan line is calculated for y + O.5, rather than for y, as we assumed above.

This is messy, and excludes the possibility of using integer-only arithmetic, A simpler solution is to assume that the sample point lies at one of the four corners of the pixel; we have chosen the top right-hand corner of the pixel. This has the consequence that the entire image is displaced half a pixel to the left and down, which in practice is insignificant. The upshot of this is that it provides the following simple rasterization rules;

(1) Horizontal edges are simply discarded.

(2) An edge which goes from scan line y0 to y1 should generate x values for scan lines y0 through to y1-1 (that is missing the top scan line), or if y0 = y1 then it generates no values.

(3) Similarly, horizontal segments should be filled from Xleft to Xright-1 (with no pixels generated if Xleft = Xright).

Incidentally, in rules (2) and (3), whether the first or last element is ignored is arbitrary, and the choice is based around programming convenience. The four possible permutations of these two rules define the sample point as one of the four corners of the pixel. The effect of these rules can be demonstrated in Figure 6.20. Here we have three adjacent polygons A, B and C, with edges a, b, c and d. The rounded a- values produced by these edges for the scan shown are 2, 4 ,4 and 7 respectively. Rule 3 then gives pixels 2 and 3 for polygon A, none for polygon B, and 4 to 6 for polygon C. Thus, overall, there are no gaps, and no overlapping. The reason why horizontal edges are discarded is because the edges adjacent to them will have already contributed the x values to make up the segment (for example, the base of the polygon in Figure 6.18; note also that, for the sake of simplicity, the scan conversion of this polygon was not done strictly in accordance with the rasterization rules mentioned above).

Under there rules, when you have adjacent triangles your problem should disappear (along with any very thin polygons

)

This still doesn't solve the problem of when the sampling point is exactly on the boundary, but as long as you use the same algorithm to generate the points and you always work in the same direction it does give a consistent result for which triangle a pixel is from.

Yes it does, the yellow is just brown because you don't have any gamma correction.

Intro:

Tim

I don't have the answer for you but it is possible to do a perfectly consistent implementation. The GPU guys have it solved for example. You can draw a triangle mesh with very tiny triangles, long thin triangles, etc and every pixel will be hit exactly once and there will be no holes. Here's a link to some stuff to think about and maybe give you some search terms to follow up with.

https://docs.microsoft.com/en-us/windows/desktop/direct3d11/d3d10-graphics-programming-guide-rasterizer-stage-rules

Quote from: desertgreg on 08 Oct, 2018 22:49

I don't have the answer for you but it is possible to do a perfectly consistent implementation. The GPU guys have it solved for example. You can draw a triangle mesh with very tiny triangles, long thin triangles, etc and every pixel will be hit exactly once and there will be no holes. Here's a link to some stuff to think about and maybe give you some search terms to follow up with.

https://docs.microsoft.com/en-us/windows/desktop/direct3d11/d3d10-graphics-programming-guide-rasterizer-stage-rules

That is a great page! I am sure I will find that useful in the future...

Legacy posted a link here from another thread here, so I decided to put my two cents to it. Simply put:

If the error delta derr calculation, initial error err₀ calculation, and the direction in which the line is drawn are known, then the problem is easily solved exactly. This is because (for an x-major line with positive y) we have
y = y₀ + ⌊ err₀ + derr·(x - x₀) ⌋
where ⌊ ⌋ denote the floor operation (rounding towards zero, if derr or err can be negative; usually they cannot).

For an y-major line with positive x we have correspondingly
x = x₀ + ⌊ err₀ + derr·(y - y₀) ⌋

If we do not know, or cannot guess the details of the Bresenham's line implementation, we're out of luck; I do not believe there is a solution then. Of course, we could implement a few typical variants, and test a set of lines to find out which one is correct, but that gets into heuristics and guessing.

Here is an example test bench program you can use to test my claim. It uses 32-bit signed integer coordinates (but note that the difference of any two coordinates used should still be representable by the same type), and a 31-bit error variable (1.0 = 2147483648).

Code: [Select]

#include <stdlib.h>
#include <inttypes.h>
#include <stdio.h>
#include <time.h>

/* Written by Nominal Animal, 2018.
    No rights reserved. I do not believe there is anything copyrightable here,
    but if there is, this is licensed under the Creative Commons Zero license,
        [url]https://creativecommons.org/share-your-work/public-domain/cc0/[/url]
    In other words, do whatever you like with it, but at your own risk.
    It should be safe, but I provide no warranties or guarantees of any kind. */

#define  ONE  UINT32_C(2147483648)

#define  FAIL(...) do { fprintf(stderr, __VA_ARGS__); exit(EXIT_FAILURE); } while (0)

typedef struct {
    int32_t     x;
    int32_t     y;
} i32vec2d;

/* Xorshift64* pseudo-random number generator. */
static uint64_t  prng_state = 1;

static uint64_t  prng_u64(void)
{
    uint64_t  x = prng_state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    prng_state = x;
    return x * UINT64_C(2685821657736338717);
}


static int32_t   prng_x0 = 0;
static int32_t   prng_y0 = 0;
static uint32_t  prng_dx = 0;
static uint32_t  prng_dy = 0;
static uint32_t  prng_mx = 0;
static uint32_t  prng_my = 0;

static void prng_set_point_range(const int32_t xmin, const int32_t ymin,
                                 const int32_t xmax, const int32_t ymax)
{
    if (xmin <= xmax) {
        prng_x0 = xmin;
        prng_dx = (uint32_t)(xmax - xmin);
    } else {
        prng_x0 = xmax;
        prng_dx = (uint32_t)(xmin - xmax);
    }

    if (ymin <= ymax) {
        prng_y0 = ymin;
        prng_dy = (uint32_t)(ymax - ymin);
    } else {
        prng_y0 = ymax;
        prng_dy = (uint32_t)(ymin - ymax);
    }

    prng_mx  = prng_dx;       prng_my  = prng_dy;
    prng_mx |= prng_mx >> 1;  prng_my |= prng_my >> 1;
    prng_mx |= prng_mx >> 2;  prng_my |= prng_my >> 2;
    prng_mx |= prng_mx >> 4;  prng_my |= prng_my >> 4;
    prng_mx |= prng_mx >> 8;  prng_my |= prng_my >> 8;
    prng_mx |= prng_mx >> 16; prng_my |= prng_my >> 16;
}

static i32vec2d  prng_point(void)
{
    uint64_t        r;
    uint32_t        x, y;
    i32vec2d        result;

    do {
        r = prng_u64();
        x = r & prng_mx;
        y = (r >> 32) & prng_my;
    } while (x > prng_dx || y > prng_dy);

    result.x = prng_x0 + (int32_t)x;
    result.y = prng_y0 + (int32_t)y;

    return result;
}


/* Bresenham's line algorithm with increasing major x and increasing minor y; 31-bit error term. */
static inline void line_xpyp(int32_t x0, int32_t y0,
                             int32_t x1, int32_t y1,
                             void (*set)(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y))
{
    uint32_t       derr = (x1 > x0) ? (uint64_t)(y1 - y0) * ONE / (uint64_t)(x1 - x0) : 0;
    uint32_t       err = 0;
    int32_t        x, y = y0;

    if (x1 < x0) FAIL("line_xpyp(): x1 < x0\n");
    if (y1 < y0) FAIL("line_xpyp(): y1 < y0\n");

    for (x = x0; x <= x1; x++) {
        set(x0, y0, x1, y1, x, y);

        err += derr;
        if (err >= ONE) {
            y++;
            err -= ONE;
        }
    }
}

/* Bresenham's line algorithm with increasing major x and decreasing minor y; 31-bit error term. */
static inline void line_xpyn(int32_t x0, int32_t y0,
                             int32_t x1, int32_t y1,
                             void (*set)(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y))
{
    uint32_t       derr = (x1 > x0) ? (uint64_t)(y0 - y1) * ONE / (uint64_t)(x1 - x0) : 0;
    uint32_t       err = 0;
    int32_t        x, y = y0;

    if (x1 < x0) FAIL("line_xpyn(): x1 < x0\n");
    if (y1 > y0) FAIL("line_xpyn(): y1 > y0\n");

    for (x = x0; x <= x1; x++) {
        set(x0, y0, x1, y1, x, y);

        err += derr;
        if (err >= ONE) {
            y--;
            err -= ONE;
        }
    }
}

/* Bresenham's line algorithm with increasing major y and increasing minor x; 31-bit error term. */
static inline void line_ypxp(int32_t x0, int32_t y0,
                             int32_t x1, int32_t y1,
                             void (*set)(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y))
{
    uint32_t       derr = (y1 > y0) ? (uint64_t)(x1 - x0) * ONE / (uint64_t)(y1 - y0) : 0;
    uint32_t       err = 0;
    int32_t        x = x0, y;

    if (y1 < y0) FAIL("line_ypxp(): y1 < y0\n");
    if (x1 < x0) FAIL("line_ypxp(): x1 < x0\n");

    for (y = y0; y <= y1; y++) {
        set(x0, y0, x1, y1, x, y);

        err += derr;
        if (err >= ONE) {
            x++;
            err -= ONE;
        }
    }
}

/* Bresenham's line algorithm with increasing major y and decreasing minor x; 31-bit error term. */
static inline void line_ypxn(int32_t x0, int32_t y0,
                             int32_t x1, int32_t y1,
                             void (*set)(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y))
{
    uint32_t       derr = (y1 > y0) ? (uint64_t)(x0 - x1) * ONE / (uint64_t)(y1 - y0) : 0;
    uint32_t       err = 0;
    int32_t        x = x0, y;

    if (y1 < y0) FAIL("line_ypxn(): y1 < y0\n");
    if (x1 > x0) FAIL("line_ypxn(): x1 > x0\n");

    for (y = y0; y <= y1; y++) {
        set(x0, y0, x1, y1, x, y);

        err += derr;
        if (err >= ONE) {
            x--;
            err -= ONE;
        }
    }
}

/* Bresenham's line algorithm. */
void line(int32_t x0, int32_t y0,
          int32_t x1, int32_t y1,
          void (*set)(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y))
{
    if (!set)
        return;

    if (x1 >= x0) {
        if (y1 >= y0) {
            if (x1 - x0 >= y1 - y0) {
                line_xpyp(x0, y0, x1, y1, set);
            } else {
                line_ypxp(x0, y0, x1, y1, set);
            }
        } else {
            if (x1 - x0 >= y0 - y1) {
                line_xpyn(x0, y0, x1, y1, set);
            } else {
                line_ypxn(x1, y1, x0, y0, set);
            }
        }
    } else {
        if (y1 >= y0) {
            if (x0 - x1 >= y1 - y0) {
                line_xpyn(x1, y1, x0, y0, set);
            } else {
                line_ypxn(x0, y0, x1, y1, set);
            }
        } else {
            if (x0 - x1 >= y0 - y1) {
                line_xpyp(x1, y1, x0, y0, set);
            } else {
                line_ypxp(x1, y1, x0, y0, set);
            }
        }
    }
}

static unsigned long  pixel_count = 0;
static unsigned long  error_count = 0;

/* Analytical version of Bresenham's algorithm, with increasing x major and increasing y minor. */
void is_at_xpyp(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y)
{
    if (x1 < x0) FAIL("is_at_xpyp(): x1 < x0");
    if (y1 < y0) FAIL("is_at_xpyp(): y1 < y0");
    if (x < x0 || x > x1) FAIL("is_at_xpyp(): x is not between x0 and x1.\n");
    if (y < y0 || y > y1) FAIL("is_at_xpyp(): y is not between y0 and y1.\n");

    const uint64_t  derr = (x1 > x0) ? (uint64_t)(y1 - y0) * ONE / (uint64_t)(x1 - x0) : 0;
    const int32_t   ay = y0 + (int32_t)( (0 + derr * (x - x0)) / ONE );

    pixel_count++;
    error_count += (ay != y);
}

/* Analytical version of Bresenham's algorithm, with increasing x major and decreasing y minor. */
void is_at_xpyn(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y)
{
    if (x1 < x0) FAIL("is_at_xpyn(): x1 < x0");
    if (y1 > y0) FAIL("is_at_xpyn(): y1 > y0");
    if (x < x0 || x > x1) FAIL("is_at_xpyn(): x is not between x0 and x1.\n");
    if (y < y1 || y > y0) FAIL("is_at_xpyn(): y is not between y0 and y1.\n");

    const uint64_t  derr = (x1 > x0) ? (uint64_t)(y0 - y1) * ONE / (uint64_t)(x1 - x0) : 0;
    const int32_t   ay = y0 - (int32_t)( (0 + derr * (x - x0)) / ONE );

    pixel_count++;
    error_count += (ay != y);

}

/* Analytical version of Bresenham's algorithm, with increasing y major and increasing x minor. */
void is_at_ypxp(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y)
{
    if (y1 < y0) FAIL("is_at_ypxp(): y1 < y0");
    if (x1 < x0) FAIL("is_at_ypxp(): x1 < x0");
    if (y < y0 || y > y1) FAIL("is_at_ypxp(): y is not between y0 and y1.\n");
    if (x < x0 || x > x1) FAIL("is_at_ypxp(): x is not between x0 and x1.\n");

    const uint64_t  derr = (y1 > y0) ? (uint64_t)(x1 - x0) * ONE / (uint64_t)(y1 - y0) : 0;
    const int32_t   ax = x0 + (int32_t)( (0 + derr * (y - y0)) / ONE );

    pixel_count++;
    error_count += (ax != x);
}

/* Analytical version of Bresenham's algorithm, with increasing y major and decreasing x minor. */
void is_at_ypxn(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y)
{
    if (y1 < y0) FAIL("is_at_ypxn(): y1 < y0");
    if (x1 > x0) FAIL("is_at_ypxn(): x1 > x0");
    if (y < y0 || y > y1) FAIL("is_at_ypxn(): y is not between y0 and y1.\n");
    if (x < x1 || x > x0) FAIL("is_at_ypxn(): x is not between x0 and x1.\n");

    const uint64_t  derr = (y1 > y0) ? (uint64_t)(x0 - x1) * ONE / (uint64_t)(y1 - y0) : 0;
    const int32_t   ax = x0 - (int32_t)( (0 + derr * (y - y0)) / ONE );

    pixel_count++;
    error_count += (ax != x);
}

/* Analytical version of Bresenham's algorithm. */
void is_at(int32_t x0, int32_t y0, int32_t x1, int32_t y1, int32_t x, int32_t y)
{

    if (x1 >= x0) {
        if (y1 >= y0) {
            if (x1 - x0 >= y1 - y0) {
                is_at_xpyp(x0, y0, x1, y1, x, y);
            } else {
                is_at_ypxp(x0, y0, x1, y1, x, y);
            }
        } else {
            if (x1 - x0 >= y0 - y1) {
                is_at_xpyn(x0, y0, x1, y1, x, y);
            } else {
                is_at_ypxn(x1, y1, x0, y0, x, y);
            }
        }
    } else {
        if (y1 >= y0) {
            if (x0 - x1 >= y1 - y0) {
                is_at_xpyn(x1, y1, x0, y0, x, y);
            } else {
                is_at_ypxn(x0, y0, x1, y1, x, y);
            }
        } else {
            if (x0 - x1 >= y0 - y1) {
                is_at_xpyp(x1, y1, x0, y0, x, y);
            } else {
                is_at_ypxp(x1, y1, x0, y0, x, y);
            }
        }
    }
}

int main(int argc, char *argv[])
{
    unsigned long  i, n;
    int32_t        xmin, ymin, xmax, ymax;
    i32vec2d       xy0, xy1;
    char           dummy;

    if (argc < 4 || argc > 5) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help | help ]\n", argv[0]);
        fprintf(stderr, "       %s XMIN:XMAX YMIN:YMAX LINES [ SEED ]\n", argv[0]);
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    if (sscanf(argv[1], "%" SCNi32 " : %" SCNi32 " %c", &xmin, &xmax, &dummy) != 2) {
        fprintf(stderr, "%s: Invalid x range.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (sscanf(argv[2], "%" SCNi32 " : %" SCNi32 " %c", &ymin, &ymax, &dummy) != 2) {
        fprintf(stderr, "%s: Invalid y range.\n", argv[2]);
        return EXIT_FAILURE;
    }
    if (sscanf(argv[3], " %lu %c", &n, &dummy) != 1 || n < 1) {
        fprintf(stderr, "%s: Invalid number of lines to test.\n", argv[3]);
        return EXIT_FAILURE;
    }

    if (argc > 4) {
        if (sscanf(argv[4], " %" SCNu64 " %c", &prng_state, &dummy) != 1 || !prng_state) {
            fprintf(stderr, "%s: Invalid Xorshift64* seed. (Zero is not a valid seed.)\n", argv[4]);
            return EXIT_FAILURE;
        }
    } else {
        prng_state = (uint64_t)time(NULL);
        for (i = 0; i < 100; i++) {
            prng_state ^= prng_state >> 12;
            prng_state ^= prng_state << 25;
            prng_state ^= prng_state >> 27;
        }
        if (!prng_state)
            prng_state = 1;
        fprintf(stderr, "Using seed %" PRIu64 ".\n", prng_state);
    }

    prng_set_point_range(xmin, ymin, xmax, ymax);

    for (i = 0; i < n; i++) {
        xy0 = prng_point();
        xy1 = prng_point();
        line(xy0.x, xy0.y, xy1.x, xy1.y, is_at);
    }

    if (!error_count) {
        printf("%lu pixels tested; no errors detected.\n", pixel_count);
        return EXIT_SUCCESS;
    } else {
        printf("%lu pixels tested; %lu erroneus cases (%.06f%%) found.\n", pixel_count, error_count,
               100.0 * (double)error_count / (double)pixel_count);
        return EXIT_FAILURE;
    }
}

The 0 in the calculation of ax and ay correspond to the known initial err value; zero in the classic Bresenham's line algorithm used above.

It should be very simple to add e.g. PBM/PGM/PPM output (drawing the lines to actual pixel canvas), by just redoing each line with a setpixel callback function, but as this is already quite a long code snippet, I omitted it. I'll add it, if needed.

Quote from: legacy on 11 Nov, 2018 08:15

well, I forgot mentioning that divisions must be avoided because they take too many clock-edges. This is the main reason for the existence of the Bresenham's line drawing!

Not to worry! I found the correct exact check. (Tested a million random lines in (0,0)-(1000,1000), both for false positives and false negatives, within the bounding box of the line.)

Let's say your x-major increasing y Bresenham line is

Code: [Select]

void bresenham_xpyp(const int x0, const int y0, const int x1, const int y1)
{
    const int xerr = 2*(y1-y0);
    const int yerr = 2*(x1-x0);
    int       err  = xerr - (x1-x0);
    int       x = x0, y = y0;
    while (1) {
        /* Draw pixel (x, y) */

        if (x >= x1)
            return;

        x++;

        if (err > 0) {
            y++;
            err += xerr - yerr;
        } else
            err += xerr;
    }
}

The corresponding exact test is

Code: [Select]

int on_bresenham_xpyp(const int x0, const int y0, const int x1, const int y1, const int x, const int y)
{
    /* Only tested within the axis-aligned bounding box for the line. */
    if (x < x0 || y < y0 || x > x1 || y > y1)
        return 0;

    /* Only needed for degenerate, 1-pixel long lines; when x0 == x1 and y0 == y1 */
    if (x == x0 && y == y0)
        return 1;

    const int  err = 2 * (x - x0 + 1) * (y1 - y0) - (x1 - x0) * (2 * (y - y0) + 1);

    return (err > 2*(x0 - x1 + y1 - y0)) && (err <= 2*(y1 - y0));
}

Similarly for the other quadrants. Do note that the order of the endpoints is important.

Let me know if you want the complete math walkthrough.

Quote from: legacy on 11 Nov, 2018 16:03

(...)
This is the Bresenham's algorithm. Inside the loop, the error looks something like ~err[ i+1 ] = ~err[ i ] + 2 * ( Dy – Dx ) × (y[ i+1 ] – y[ i ]). Reason why I said it's recursive.

Yes, this is a recurrence relation. It's mathematically recursive.

(Edited to fix the typo; the +1 should be outside the absolute value. Noted the error, too.)

Quote from: legacy on 11 Nov, 2018 16:03

This is the Bresenham's algorithm.

The error at point (x,y) on the Bresenham line from (x₀,y₀) to (x₁,y₁) is

2·( (abs(y-y₀)+1)·abs(x₁-x₀) - (abs(x-x₀)+1)·abs(y₁-y₀) )

because the error (as examined in magic) is initially 2·abs(x₁-x₀) - 2·abs(y₁-y₀), decreases by 2·abs(y₁-y₀) whenever x changes by one, and increases by 2·abs(x₁-x₀) whenever y changes by one.

A point is only on the Bresenham line when the error is between -abs(y₁-y₀) and abs(x₁-x₀), inclusive, because if it was any larger or smaller, either x or y would have been changed by the line-drawing routine at that point.

Quote from: legacy on 12 Nov, 2018 13:12

it seems there is something wrong

Yes, there is, and it is my facepalm-worthy error.

You see, the error formula (after fixing the typos, the +1 should be added to the absolute values) is correct, in that it yields the exact magic value on each pixel on the Bresenham line. The problem is, it is only used to determine the direction to the next pixel.

In general, there are three possible previous pixels to {abs(x-x₀), abs(x-x₀)}: {abs(x-x₀)-1, abs(x-x₀)}, {abs(x-x₀), abs(x-x₀)-1}, and {abs(x-x₀)-1, abs(x-x₀)-1}. However, because two of them are not on the Bresenham line, their error values might be off in such a way that they lead to the pixel to be tested. (I haven't bothered to check, but it seems very likely that it is possible to me.) If so, it is not possible to create a mathematically correct non-recursive formula without false positives: this version of the Bresenham line algorithm then works essentially like a hash function.

The most obvious alternative would be to modify the Bresenham line algorithm you use, so that rather than a combined error value, you use two, each tracking the error on each coordinate. It does use an extra register, and more ops to set up the variables, but the inner loop stays the same. Would that approach work for you?

Quote from: legacy on 13 Nov, 2018 02:09

With my team we have worked a speculative approach (see screenshots on previous pages), that does the job very neatly, except a few "false positive" pixels, but this is contained in 1-pixel of error as maximal error.

Based on the point-line distance in units of line length?

2 * abs( abs(x₁-x₀) * abs(y-y₀) - abs(y₁-y₀) * abs(x-y₀) ) <= abs(x₁-x₀) + abs(y₁-y₀)

applied only within the axis-aligned bounding box for the line, or something similar?

Edit: I should stop posting late at night. It has a few "false positive" errors, but I suspect there are might be good heuristics to filter those out. What I don't understand, is why you are using Bresenham and not Q15.16 fixed point for triangle filling, especially if you do texture or color blending fills.

What would happen if you compute the foot of the perpendicular, then compared the result to the input coords?

I forgot: the fixed point filling does require an unsigned division per pixel row, and that's what you want to avoid.

Just to recap to those who might not know this, there are two problems in triangle filling:

The first one is to decide which pixels belong to the triangle. The pixels between the red and green lines need to be decided. Overlap is not as bad as leaving holes. With fixed-point arithmetic with sufficient fractional bits, you can make pixel-perfect triangles with symmetric edges, with just three divisions to find out the deltas between rows.

The second one is the harder one. The number of pixels on each row varies. It is always an integer, so using a fixed-point estimator for the size of the horizontal pixel delta will not yield correct results; it can be off by up to 2 pixels. Instead, the horizontal step delta is calculated by dividing the full delta (from start of row to end of row) with the exact number of pixels used.

(In other words, you only need a fixed-point accumulators for the leftmost and rightmost pixels on each row; the change from row to another is the row delta, and requires three divisions per accumulator/weight. The start and end coordinates for each line are such accumulators also.)

However, the estimator for the horizontal pixel delta not being far off might make it suitable as a starting point for an iterative division algorthm. That is, the division by the (integer) number of pixels on each row could be approximated by a fixed-point reciprocal, with the estimator as the initial estimate, refined by a couple of rounds of Newton-Raphson, for example.

Me myself, I'd rather have an architecture with fast fixed-point multiplication and division, with a load and store instructions that use the integral parts only from two registers, and some instructions to interleave/deinterleave color component bits, when dealing with computer graphics stuff. Division being prohibitively expensive is not fun at all for me.

Quote from: legacy on 14 Nov, 2018 10:10

It avoids the division in the cycle introducing a pre-division

Yes; the numerical weakness in that is the combined properties of the two different problems I mentined above.

Mathematically, if the interpolated value is v₀ at (x_A, y_A), v₁ at (x_B, y_B), v₂ at (x_C, y_C), x₁ = x_B - x_A, y₁ = y_B - y_A, x₂ = x_C - x_A, y₂ = y_C - y_A,
then on each +x step the interpolated value changes by
v_Δx = ( (v₀ - v₂)·y₁ - (v₀ - v₁)·y₂) ) / ( x₁·y₂ - x₂·y₁ )
and on each +y step by
v_Δy = ( (v₀ - v₁)·x₂ - (v₀ - v₂)·x₁) ) / ( x₁·y₂ - x₂·y₁ )
where · denotes ordinary (scalar) multiplication, and / divison.

This is mathematically exact, derived from linear interpolation in barycentric coordinates (u, v):
Z(u, v) = (1 - v)·( (1 - u)·Z₀ + u·Z₁ ) + v·Z₂

The problem is, the pixels on the edges of the triangle (u = 0, v = 0, or u+v = 1) are rarely exactly on the edge, but a fractional pixel distance from it. This is because they only "partially" belong to the triangle. A pixel that is determined to belong to the triangle, can have barycentric coordinates outside it (inside is defined as 0 ≤ u, v, u+v ≤ 1 in barycentric coordinates), which leads to interpolation turning into extrapolation -- exceeding the range intended to be interpolated --, and thus bad color artefacts.

You can use this inside the triangle just fine, you just can't use it for the edge pixels.

How about using a mixed pixel walker approach?

The pixels at the edge are interpolated between two vertices. Each positive x step changes an interpolated variable by d_x, positive y step by d_y.
The body of the triangle is interpolated between all three vertices. Each positive x step changes an interpolated variable by r_x, positive y step by r_y.
The formulae for those in the above SVG image are exact.

You start the walk at the topmost vertex. If there is another vertex on that row, you draw the line between them, interpolating the color (noting that d_y = 0).
Then, you keep two steppers. At the beginning of the inner loop, both steppers step to a new row. They then continue stepping, until they would step to a new row, or encounter a vertex. (If one encounters the middle vertex, it updates the step parameters, and resets its coordinates to that vertex, then continues.) Then, the interval between them, not stepped yet, is walked using r_x.

(The edge walkers then keep two variables for each interpolated value: one interpolated on the edge to the next vertex using d_x and d_y, and the other for body pixel interpolation using r_x and r_y.)

If you do the edge walkers using fixed point math (i.e., calculate delta-x for each edge), you can even antialias the edge pixels if you want, with some additional math. But, this will work even for Bresenham line walking the edges.

Quote from: legacy on 16 Nov, 2018 17:56

(I mean I suspect numerical weakness)

Since the inner loops are literally just additions or subtractions on the fixed-point values, it boils down to the precision of those fixed-point accumulators. If they have at most 1 LSB of error compared to the exact mathematical value, then the error is less than half a bit for the interpolated integer values for 32767 steps (not area; Manhattan distance).

To calculate the fixed-point step factors to that precision, I'd need an unsigned integer division with 31-bit divisor, and 47-bit dividend with 16 low bits always zero; and a 31-bit result. This is the crucial operation. Eight of these need to be done for color interpolation, and three for edge walking (unless Bresenham is used to walk the edges). If that is an acceptable cost, then this makes sense; otherwise it is just an intellectual exercise without practical applications. I don't want to see that much trouble for just an intellectual exercise.

Quote from: legacy on 17 Nov, 2018 16:38

Quote from: Nominal Animal on 17 Nov, 2018 15:17
47-bit dividend with 16 low bits always zero; and a 31-bit result

how have you calculated the size of the this?

Assuming 16-bit signed integer coordinates whose difference never exceeds 2¹⁵-1 = 32767, as otherwise Q15.16 fixed point does not have sufficient range.

With that constraint, the dividend and the divisor are both 32-bit signed integers. Handling the signs separately, that makes 31-bit dividend and divisor. The result requires 16 fractional bits, so multiplying dividend by 2¹⁶ before division, the quotient will be correct to within an ULP (as this truncates the extra bits). If you add half the divisor to the multiplied dividend, the quotient will be correct to within half an ULP (but then the 16 low bits in the 47-bit divisor are no longer zero).

Here is the verifier:

Code: [Select]

#include <stdlib.h>
#include <inttypes.h>
#include <stdio.h>
#include <errno.h>
#include <time.h>
#include <math.h>

typedef struct {
    int32_t  x;
    int32_t  y;
} i32vec;

static uint64_t  prng_state = 1;

static inline uint64_t  prng_read_uint64(const char *path)
{
    uint64_t  result = 0;
    FILE     *src;

    if (!path || !*path)
        return 0;

    src = fopen(path, "rb");
    if (!src)
        return 0;

    if (fread(&result, sizeof result, 1, src) != 1) {
        fclose(src);
        return 0;
    }

    fclose(src);
    return result;
}

static inline uint64_t  prng_randomize(void)
{
    uint64_t  seed = 0;
    int       rounds = 100;

    /* Try some known randomness sources. */
    if (!seed) seed ^= prng_read_uint64("/dev/urandom");
    if (!seed) seed ^= prng_read_uint64("/dev/arandom");
    if (!seed) seed ^= prng_read_uint64("/dev/random");

    /* Mix in a bit of time. */
    seed ^= (uint64_t)time(NULL) * UINT64_C(59121997897);

    /* Mix in a bit of clock, as well. */
    seed ^= (uint64_t)clock() * UINT64_C(5054973881);

    /* Make sure it is nonzero. */
    if (!seed)
        seed = 1;

    /* Churn it a bit, to make it less predictable. */
    while (rounds-->0) {
        seed ^= seed >> 12;
        seed ^= seed << 25;
        seed ^= seed >> 27;
    }

    /* Done. */
    prng_state = seed;
    return seed;
}

static inline uint64_t  prng_u64(void)
{
    uint64_t  state = prng_state;
    state ^= state >> 12;
    state ^= state << 25;
    state ^= state >> 27;
    prng_state = state;
    return state * UINT64_C(2685821657736338717);
}

static uint32_t  prng_xmask = 0;
static uint32_t  prng_ymask = 0;
static uint32_t  prng_xmax  = 0;
static uint32_t  prng_ymax  = 0;
static int32_t   prng_xmin  = 0;
static int32_t   prng_ymin  = 0;

static inline i32vec  prng_i32vec(void)
{
    uint32_t  x, y;
    i32vec    result;

    do {
        uint64_t  u = prng_u64();
        x = u & prng_xmask;
        y = (u >> 32) & prng_ymask;
    } while (x > prng_xmax || y > prng_ymax);

    result.x = x + prng_xmin;
    result.y = y + prng_ymin;
    return result;
}

static inline void  prng_i32vec_range(int32_t  xmin, int32_t  ymin,
                                      int32_t  xmax, int32_t  ymax)
{
    if (xmin <= xmax) {
        prng_xmin = xmin;
        prng_xmax = xmax - xmin;
    } else {
        prng_xmin = xmax;
        prng_xmax = xmin - xmax;
    }

    if (ymin <= ymax) {
        prng_ymin = ymin;
        prng_ymax = ymax - ymin;
    } else {
        prng_ymin = ymax;
        prng_ymax = ymin - ymax;
    }

    prng_xmask  = prng_xmax;
    prng_xmask |= prng_xmask >> 1;
    prng_xmask |= prng_xmask >> 2;
    prng_xmask |= prng_xmask >> 4;
    prng_xmask |= prng_xmask >> 8;
    prng_xmask |= prng_xmask >> 16;

    prng_ymask  = prng_ymax;
    prng_ymask |= prng_ymask >> 1;
    prng_ymask |= prng_ymask >> 2;
    prng_ymask |= prng_ymask >> 4;
    prng_ymask |= prng_ymask >> 8;
    prng_ymask |= prng_ymask >> 16;
}

static inline double fx_div_correct(const int32_t  n, const int32_t  d)
{
    return (double)n * 65536.0 / (double)d;
}

static uint64_t  max_n = 0;
static uint32_t  max_d = 0;

static inline int32_t fx_div(const int32_t  n, const int32_t  d)
{
    uint64_t  dividend;
    uint32_t  un, ud;
    int32_t   result;
    int       negative;

    if (n < 0) {
        un = -n;
        negative = 1;
    } else {
        un =  n;
        negative = 0;
    }

    if (d < 0) {
        ud = -d;
        negative = !negative;
    } else {
        ud =  d;
    }

    if (max_d < ud)
        max_d = ud;

    /* 47-bit division */
    dividend = ((uint64_t)un) << 16;
#ifdef  ROUNDING
    dividend += (ud / 2);
#endif
    if (max_n < dividend)
        max_n = dividend;

    result = dividend / ud;
    return (negative) ? -result : result;
}

static int parse_ulong(const char  *src, unsigned long  *to)
{
    const char    *end;
    unsigned long  val;

    if (!src || !*src)
        return -1;

    end = src;
    errno = 0;
    val = strtoul(src, (char **)&end, 0);
    if (errno)
        return -1;
    if (end == src)
        return -1;

    while (*end == '\t' || *end == '\n' || *end == '\v' ||
           *end == '\f' || *end == '\r' || *end == ' ')
        end++;

    if (*end)
        return -1;

    if (to)
        *to = val;
    return 0;
}

static int parse_i32(const char  *src, int32_t  *to)
{
    const char  *end;
    long         val;

    if (!src || !*src)
        return -1;

    end = src;
    errno = 0;
    val = strtol(src, (char **)&end, 0);
    if (errno)
        return -1;
    if (end == src)
        return -1;
    if (val < INT32_MIN || val > INT32_MAX)
        return -1;

    while (*end == '\t' || *end == '\n' || *end == '\v' ||
           *end == '\f' || *end == '\r' || *end == ' ')
        end++;

    if (*end)
        return -1;

    if (to)
        *to = val;
    return 0;
}

int main(int argc, char *argv[])
{
    int32_t        min, max;
    double         fraction, correct, err, errmin, errmax;
    i32vec         xy;
    unsigned long  i, iters;

    if (argc != 4) {
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help | help ]\n", argv[0]);
        fprintf(stderr, "       %s MIN MAX ITERATIONS\n", argv[0]);
        fprintf(stderr, "\n");
        return EXIT_SUCCESS;
    }

    if (parse_i32(argv[1], &min)) {
        fprintf(stderr, "%s: Invalid minimum. Try 1.\n", argv[1]);
        return EXIT_FAILURE;
    }
    if (parse_i32(argv[2], &max)) {
        fprintf(stderr, "%s: Invalid maximum. Try 32767.\n", argv[2]);
        return EXIT_FAILURE;
    }
    if (parse_ulong(argv[3], &iters) || iters < 1) {
        fprintf(stderr, "%s: Invalid number of iterations.\n", argv[3]);
        return EXIT_FAILURE;
    }

    prng_i32vec_range(min, min, max, max);

    errmin =  HUGE_VAL;
    errmax = -HUGE_VAL;

    printf("Using seed %" PRIu64 ".\n", prng_randomize());

    for (i = 0; i < iters; i++) {
        do {
            xy = prng_i32vec();
            correct = fx_div_correct(xy.x, xy.y);
        } while (xy.y == 0 || correct < (double)INT32_MIN || correct > (double)INT32_MAX);

        fraction = fx_div(xy.x, xy.y);

        err = correct - fraction;
        if (errmin > err)
            errmin = err;
        if (errmax < err)
            errmax = err;
    }

    printf("Largest dividend was %" PRIu64 " (%.0f bits)\n", max_n, ceil(log((double)max_n)/log(2.0)));
    printf("Largest divisor was %" PRIu32 " (%.0f bits)\n", max_d, ceil(log((double)max_d)/log(2.0)));
    printf("Maximum negative error found: %9.6f ULP\n", errmin);
    printf("Maximum positive error found: %9.6f ULP\n", errmax);

    return EXIT_SUCCESS;
}

If we look at the formulae, the largest possible dividend and divisor with such coordinates do not exceed 32767×32767×2 = 2147352578 in magnitude. So, if we run the verifier with parameters -2147352578 2147352578 1000000000, it'll test a billion uniform random divisions with such dividends and divisors, limited to cases where the fixed-point quotient is within the 32-bit signed integer range.

Note that the divisor in the formulae are an effective "area": the area of a square with edge length the same as Manhattan distance between two points, or twice the area of the triangle. Since the dividend is always a multiple of "width" or "height", all cases we get in practice will have small fixed-point results; so only considering the well-behaved cases (where (the integer part of) the quotient is within the same range as "width" or "height") is warranted.

If it was compiled without defining ROUNDING preprocessor macro, the output is something like
Using seed 15011178025248273303.
Largest dividend was 140728898355200 (47 bits)
Largest divisor was 2147352578 (31 bits)
Maximum negative error found: -1.000000 ULP
Maximum positive error found: 1.000000 ULP
Note that ULP refers to Units in Least-significant Place. If you look at the fixed-point result as a simple signed integer, then the error in ULPs is the difference to the actual floating-point quotient multiplied by 2¹⁶.

If you define ROUNDING at compile time, then the divisor has 47 significant bits (not low 16 bits zero), and the output is something like
Using seed 16834975864486337449.
Largest dividend was 140729911937246 (47 bits)
Largest divisor was 2147352555 (31 bits)
Maximum negative error found: -0.500000 ULP
Maximum positive error found: 0.500000 ULP

The numerical testing indicates my logic is correct: when the coordinate and value differences are less than 2¹⁵, unsigned integer division using a 47-bit dividend and a 31-bit divisor yields fixed-point results with at most half an ULP of error. In other words, that calculating the division using double-precision floating point and multiplying the quotient by 2¹⁶, rounds to the same value as that unsigned integer division; the difference between the two is within -0.5 to +0.5.

This also means that if you do intend to use Q15.16 fixed point arithmetic, you really do need a 47:31 to 31-bit unsigned integer division (or 48:32 to 32-bit signed integer division) to construct accurate fixed point values in the first place. I do suspect that most of the numerical instabilities/inaccuracies you've seen were due to inaccurately calculated fixed point deltas.

(If a Q15.16 delta is off by 1 ULP, the error is at most 1/65536 per step. So, although the integer part may differ at each step, the error is less than 0.5 for up to 32767 steps. Since walking involves a dual loop, that 32768 steps is the Manhattan distance from the starting point. This means that for all shapes where all vertices are within 32767 steps (Manhattan distance) from each other, correctly calculated Q15.16 deltas will have less than 0.5 of error at each point in the shape.)

Quote from: legacy on 18 Nov, 2018 12:30

This is exactly what I mean with DSP-engine!
Both the multiplier and divider need more internal bits.

Right. You never showed how fx1616_mul() or fx1616_div() were implemented; I just didn't know.

Quote from: legacy on 18 Nov, 2018 12:30

This approach seems to cost more complexity and LE

I suspected that might be the case, but not having an FPGA, I cannot check/test. If you did have a 48:32 to 32 division instruction already implemented (essentially (reg1<<16 + reg2) / reg3), even a relatively slow one, this approach might be competitive (and such an instruction would help in many practical cases when defining fixed-point numbers, although might be tricky to get a compiler to generate).

If I have any further ideas, I shall get a cheap FPGA, and find out for myself, then report my findings in a separate thread.

[SOLVED] test if a point belongs to the output of the Bresenham's line drawing

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Are you sure?

Navigation

Common actions