The first thing to work out is what you want to achieve.
The second is how to do it.
The third is what to do when things go wrong.
The fourth is to work it all out in a way that you can fix later, when you discover you were wrong.
If you do not document your intentions and ideas at each step, you screw someone, usually a later version of yourself.
I liked Weiss' old Data Structures and Algorithm Analysis in C, because not only does it show good examples of various data structures, it also shows you how to evaluate and compare them along different properties (time and space complexity, best/worst/average case).
It seems that a lot of developers do not realize that on many architectures (possibly all non-microcontroller ones), typical tasks are constrained by memory bandwidth, and not CPU speed. Even in things like Discrete Cosine Transform (used in most audio and video compression schemes), the heavy computation itself is easy to optimize; the hard part is to efficiently decode and extract the components out of a binary stream.
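To illustrate why the bitstream side is the hard part, here is a minimal sketch of a variable-length decoder; the 4-bit length prefix and the end marker are hypothetical, not any real codec's format. The point is the serial dependency: you cannot know where symbol i+1 starts until symbol i has been decoded, so no amount of arithmetic optimization helps this loop.

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        const uint8_t *data;
        size_t         bitpos;   /* absolute bit offset into data */
    } bitreader;

    static uint32_t read_bits(bitreader *br, unsigned n)
    {
        uint32_t v = 0;
        while (n--) {
            size_t   byte = br->bitpos >> 3;
            unsigned bit  = 7 - (br->bitpos & 7);   /* MSB-first */
            v = (v << 1) | ((br->data[byte] >> bit) & 1u);
            br->bitpos++;
        }
        return v;
    }

    /* Decoding loop: the width of each symbol depends on its prefix,
     * so iteration i+1 cannot even start until iteration i is done. */
    static size_t decode_stream(bitreader *br, int32_t *out, size_t max)
    {
        size_t n = 0;
        while (n < max) {
            unsigned len = read_bits(br, 4) + 1;  /* hypothetical length prefix */
            if (len == 16)                        /* hypothetical end marker */
                break;
            out[n++] = (int32_t)read_bits(br, len);
        }
        return n;
    }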
Current server and desktop architectures use multi-level caches to mitigate this, which means that to maximize memory efficiency, the memory access patterns must be such that the CPU can predict them to a high degree, transferring data between cache levels and memory efficiently, without unnecessary stalling.
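A concrete sketch of what a predictable access pattern means in practice: the two functions below sum exactly the same matrix, but the row-major loop walks memory in the order C stores it, so the prefetcher can stream it, while the column-major loop jumps N*sizeof(double) bytes per access and misses cache nearly every time. On a typical desktop the difference is easily severalfold, for identical arithmetic.

    #include <stddef.h>

    #define N 4096

    /* Row-major traversal: consecutive accesses are adjacent in
     * memory, so each cache line is fully used and the hardware
     * prefetcher can keep the pipeline fed. */
    double sum_rows(const double m[N][N])
    {
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                s += m[i][j];
        return s;
    }

    /* Column-major traversal of the same data: each access jumps
     * N*sizeof(double) bytes, defeating both cache lines and the
     * prefetcher. Same result, often several times slower. */
    double sum_cols(const double m[N][N])
    {
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += m[i][j];
        return s;
    }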
A possibly surprising result is that an algorithm that behaves optimally in a microbenchmark may, in a real program on real data, be slower than an implementation that looks worse in that same microbenchmark. This is because in a microbenchmark, the cache is used only by the implementation under test. In a real program, the cache is shared by the entire task, and can even be evicted every now and then by process switches on that core. So, if you design an algorithm around specific sizes at specific cache levels, it is really unlikely to perform that way in a real program. That does not seem to stop people from developing just such algorithms and implementations...
Whenever parallel (or distributed) programming is discussed, the part that matters most for actual software development, the level at which parallelism yields the best results, is typically skipped entirely.
You can develop a 3D dot product that runs on three CPU cores, and optimize it to behave optimally even cache-wise, before realizing that the cost of saving and restoring the hardware state needed for each execution thread is several orders of magnitude higher than any speedup such an approach might gain, for any number of vectors.
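Here is a minimal sketch of exactly that mistake, using POSIX threads (the names are mine, just for illustration): each thread performs one multiplication worth roughly a nanosecond, while creating and joining it costs on the order of tens of microseconds, so the "parallel" version loses to the one-liner by orders of magnitude no matter how many vectors you feed it.

    #include <pthread.h>

    /* One thread per component of a single 3D dot product: a
     * textbook case of parallelizing at far too fine a granularity. */
    typedef struct { double a, b, product; } term;

    static void *multiply(void *arg)
    {
        term *t = arg;
        t->product = t->a * t->b;   /* ~1 ns of actual work... */
        return NULL;
    }

    double dot3_parallel(const double a[3], const double b[3])
    {
        pthread_t tid[3];
        term      t[3];
        for (int i = 0; i < 3; i++) {
            t[i].a = a[i];
            t[i].b = b[i];
            /* ...paid for with tens of microseconds of thread
             * creation, scheduling, and context-switch overhead. */
            pthread_create(&tid[i], NULL, multiply, &t[i]);
        }
        double s = 0.0;
        for (int i = 0; i < 3; i++) {
            pthread_join(tid[i], NULL);
            s += t[i].product;
        }
        return s;
    }

    /* The serial version the above can never beat. */
    double dot3(const double a[3], const double b[3])
    {
        return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
    }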
The nastiest part about that is that it is usually a human-level question, not something you can compute or detect automatically, or even assume. Sometimes the user wants a task done in the background, using the fewest resources possible, because it can take a few minutes longer while the user is doing something more important. Other times, the CPU resources needed are not important, as long as the results are obtained with as little latency (in as little wall clock time) as possible.
Then, there is the problem of what "performance" means. When optimizing computation for speed, do you optimize wall clock time taken or CPU time taken? Those two are very, very different things. Or by performance, do you mean bandwidth? What about latencies? Or bounded execution times (any sort of guarantees on performance)?
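On POSIX systems the two are directly measurable and easy to compare: CLOCK_MONOTONIC tracks wall clock time, CLOCK_PROCESS_CPUTIME_ID tracks CPU time consumed by the process across all its threads. A sketch:

    #include <stdio.h>
    #include <time.h>
    #include <unistd.h>

    static double seconds(clockid_t id)
    {
        struct timespec ts;
        clock_gettime(id, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    int main(void)
    {
        double wall0 = seconds(CLOCK_MONOTONIC);
        double cpu0  = seconds(CLOCK_PROCESS_CPUTIME_ID);

        sleep(1);                 /* waiting: wall time passes, CPU time does not */
        for (volatile long i = 0; i < 100000000L; i++)
            ;                     /* spinning: both advance together */

        printf("wall: %.3f s, cpu: %.3f s\n",
               seconds(CLOCK_MONOTONIC) - wall0,
               seconds(CLOCK_PROCESS_CPUTIME_ID) - cpu0);
        return 0;
    }

The sleep advances only the wall clock; the spin loop advances both. Run the spin on four threads and the CPU time grows about four times faster than the wall time, which is exactly why optimizing one can pessimize the other.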
Many HPC simulations involve the Monte Carlo method. One of my favourite gripes is when a large simulation is spatially divided so that the same simulation can be calculated in parallel, i.e. distributed over multiple computational cores. If each random change affects a nonzero neighborhood, and it always does if the simulation is at all useful, changes near the core boundaries need to be communicated between cores, or the boundaries will show up in the final results. In most scientific papers on Molecular Dynamics where the base energy state of some chunk of matter is found using Monte Carlo simulation and atom swaps, you can clearly see the boundaries, because communicating between cores simply slows the computation down way too much -- several orders of magnitude, in fact. As far as I know, the usual workaround, sliding windows, has still not been proven to preserve detailed balance (meaning, to yield the same results as a single undivided simulation), but it definitely beats results containing core boundaries. My own solution, using a space-filling curve and transmitting information during computation, at least one window step ahead of it, is apparently too complicated for typical scientists to grok.
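For concreteness, here is what that boundary communication looks like in the simplest case I can write down: a 1D domain decomposition with one-site halos over MPI. The names and layout are mine, not from any of those papers. Every sweep must refresh the ghost cells from the neighbouring ranks, and it is this exchange, not the local arithmetic, that dominates once the per-rank chunk gets small.

    #include <mpi.h>

    #define LOCAL 1024            /* lattice sites owned by this rank */

    /* One halo exchange for a 1D periodic decomposition: cell[0] and
     * cell[LOCAL+1] are ghost copies of the neighbours' edge sites,
     * and must be refreshed before every sweep. */
    void exchange_halos(double cell[LOCAL + 2], int rank, int nranks)
    {
        int left  = (rank - 1 + nranks) % nranks;
        int right = (rank + 1) % nranks;

        /* Send my last real cell right, receive my left ghost cell. */
        MPI_Sendrecv(&cell[LOCAL], 1, MPI_DOUBLE, right, 0,
                     &cell[0],     1, MPI_DOUBLE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Send my first real cell left, receive my right ghost cell. */
        MPI_Sendrecv(&cell[1],         1, MPI_DOUBLE, left,  1,
                     &cell[LOCAL + 1], 1, MPI_DOUBLE, right, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }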
During that research, about a decade ago, I came to the conclusion that most programmers (or developers) do not understand that to get high performance, data must flow constantly. You cannot do computation, then communication, in turns: you need to do them at the same time to get your money's worth out of the hardware you have.
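In the same hypothetical MPI setting as above, that means posting the halo transfers as nonblocking operations, updating the interior sites (which do not depend on the ghost cells) while the network works, and only then waiting and finishing the edges:

    #include <mpi.h>

    #define LOCAL 1024

    /* Hypothetical local update using the two neighbouring sites. */
    static void update_site(double cell[], int i)
    {
        cell[i] = 0.5 * (cell[i - 1] + cell[i + 1]);
    }

    void step_overlapped(double cell[LOCAL + 2], int rank, int nranks)
    {
        int left  = (rank - 1 + nranks) % nranks;
        int right = (rank + 1) % nranks;
        MPI_Request req[4];

        /* Start the halo exchange, but do not wait for it yet. */
        MPI_Irecv(&cell[0],         1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(&cell[LOCAL + 1], 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
        MPI_Isend(&cell[LOCAL],     1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[2]);
        MPI_Isend(&cell[1],         1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[3]);

        /* Compute the interior while the network moves the edges. */
        for (int i = 2; i <= LOCAL - 1; i++)
            update_site(cell, i);

        /* Now block until the ghost cells have arrived... */
        MPI_Waitall(4, req, MPI_STATUSES_IGNORE);

        /* ...and finish the two sites that needed them. */
        update_site(cell, 1);
        update_site(cell, LOCAL);
    }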
Using larger buffers does not help, either. Every buffer adds latency, because data has to flow in and fill it before continuing onwards. This is known as bufferbloat in packet-switched networks, but it affects all software, really. After all, when you read data from storage, do some kind of work on it, then write the data out, you are doing exactly what a network switch does to packets.
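That read-work-write loop is just this, written as a plain POSIX filter (the XOR is a stand-in for real work): the buffer size trades syscall overhead against how long each byte sits waiting, and making it huge only delays the first byte out without raising throughput once the pipeline is full.

    #include <unistd.h>

    #define BUF 65536   /* big enough to amortize syscalls, small
                           enough not to sit on data: a latency vs
                           throughput knob, not a free speedup */

    /* A minimal stdin -> work -> stdout filter. Like a switch port,
     * it holds each chunk only as long as the "work" takes. */
    int main(void)
    {
        static unsigned char buf[BUF];
        ssize_t n;

        while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) {
            for (ssize_t i = 0; i < n; i++)
                buf[i] ^= 0x55;              /* stand-in for real work */
            ssize_t off = 0;
            while (off < n) {                /* write may be partial */
                ssize_t w = write(STDOUT_FILENO, buf + off, n - off);
                if (w < 0)
                    return 1;
                off += w;
            }
        }
        return n < 0 ? 1 : 0;
    }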
Apologies for the ranty wall of text.