Again I did consider that but it brings in OpenCV and the like and then things get complicated and expensive to run.
Like I said, potential rabbit-hole.
I suspect that there exist some much better algorithms for saying "image A and image B are the same image, albeit different at the binary level, with a probability of X%" that would be less expensive than going "full OpenCV on its ass". If one wanted to find such an algorithm I'd expect to find it in or adjacent to the literature on watermarking images - after all, what you need for a watermark is something that will spit out a hash for an image that is the same whether it's been resized, compressed with JPEG, GIF, RLE or whatever. I vaguely remember one of Ross Anderson's guys or gals coming up with a superior version of something like this a few years back while trying to produce attacks on watermarking algorithms.
A quick search on "image hashing algorithm compression size invariant" suggests that the key term is "robust image hashing". OK, that's as close to the rabbit-hole as I'm gonna get. [Backs away carefully...]
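For anyone who wants to peer over the edge of the hole without falling in: below is a toy "average hash", about the simplest member of the robust/perceptual hashing family. It is only a sketch and not the algorithm from the watermarking literature mentioned above. It assumes the image has already been decoded to a grid of 0-255 grayscale values at least 8x8 in size (plain nested lists here, no OpenCV), and the names `shrink`, `average_hash` and `hamming` are made up for illustration.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Box-average the image down to a size x size grid (assumes img >= size in both axes)."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)
            cells = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """One bit per cell of the shrunken image: 1 if the cell is brighter than the mean."""
    flat = [v for row in shrink(img, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests 'probably the same picture'."""
    return bin(a ^ b).count("1")
```

Two images would then be treated as "probably the same" when the Hamming distance between their hashes falls under some threshold you pick by experiment. A resized or uniformly brightened copy tends to land at or near distance 0, while an unrelated image lands far away.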
Yes deep deep rabbit hole. I saw someone do something similar once and forget to solve the original problem they were trying to solve which was to make money. Fortunately he was bailed out by "rich daddy" and I continued to mug them for contract fees for another 6 months. The task in question was I suppose a very early version of Uber's routing stuff targeted at goods movement. The rabbit hole was hyper-optimisation of a working serendipitous discovery algorithm rather than marketing it properly.
Anyway back to playing AWS account consolidation whack-a-mole
*raises a paw from the back of the classroom*
Oklay... so I'm guessing there must be some massive assache involved in post-processing the DB to have it check an attachment URL to see if it has already been posted, then rewrite those to link all occurrences of said URL to a single file?
I assume that at this point, we're only talking aboot data capture; parsing and optimizing that data can come later?
mnem
Many years ago when I was working at Reed Corrugated Cases (doing support) I wandered into the developers' pit for a "pass the time of day" chat. Turns out I'd walked in on the chief programmer and one of his minions right at the point where they'd started tearing their hair out over a problem that they'd been working on for weeks. They were trying to produce optimum loads for our delivery lorries, both in terms of packing everything optimally into the lorry and having it available in the right order to take the optimum delivery route. I had to point out that "optimum packing and depletion*" and "the travelling salesman problem" are two of the hardest problems known to computer science and had eluded the greatest minds in the field for years, and they were trying to solve them both simultaneously. They didn't know whether to thank me for offering a graceful way out of the pit they had dug for themselves, or curse me for not clairvoyantly telling them so weeks earlier.
* I told them that, from memory, "optimum packing and depletion" was actually computationally undecidable, and then had to go on and try to explain the Entscheidungsproblem to guys who were bright**, but didn't have the background to really quickly grasp it all, and I'm not the best man to ask to explain it.
** But not that bright, the Chief Programmer had deliberately chosen an Austin Allegro as his company car.
*raises a paw from the back of the classroom*
Oklay... so I assume there must be some massive assache involved in post-processing the DB to have it check an attachment URL to see if it has already been posted, then rewrite those to link all occurrences of said URL to a single file?
I assume that at this point, we're only talking aboot data capture; parsing and optimizing that data can come later?
mnem
No, you can't trust that:
(1) any given URL always points to the same image, or
(2) any given image has only one URL that points to it.
It's easier to grab every image as you encounter it, hash it, see if you've already saved an image with the same hash, and either store it as a 'new' image if you haven't seen it before, or point to the previously stored image if you have previously encountered it.
Whether you do that de-duplication on the first pass, or as a post-processing pass, is to some extent a matter of taste. However, it's marginally easier to embed a local image cache URL on the first pass rather than re-parse messages and then edit them on a second pass. If you do it on a first pass anything in your database is "ready to serve". Do it on a second pass and you partition your database into "ready to serve" messages and "awaiting post processing" messages.
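A minimal sketch of that first-pass approach, for illustration only. It assumes SHA-256 over the raw downloaded bytes as the hash, keeps everything in an in-memory dict rather than on disk or in the real database, and the `ImageStore` class and the `/img-cache/` path prefix are invented names, not anything from the actual archiver.

```python
import hashlib
from typing import Dict, Tuple


class ImageStore:
    """Toy image cache keyed by the SHA-256 of the image bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def add(self, data: bytes) -> Tuple[str, bool]:
        """Store the image if unseen; return (local cache path, was_new)."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        # Same bytes always map to the same local path, whatever URL they came from.
        return f"/img-cache/{digest}", is_new

    def __len__(self) -> int:
        return len(self._blobs)
```

On the first pass you would call `add()` on each image as it is fetched and write the returned path straight into the stored message, so the source URL never needs to be trusted. Note that an exact hash like this only catches byte-identical copies; a re-compressed or resized copy of the same picture still counts as new.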
Dietary cooking time.
Bean Stew.
Hacked tomatoes, 1 pound minced meat, 3 peppers, mushrooms, onions, beans, handful of black pepper, handful of hot paprika spice, oregano, basil, bit of salt, 1 Habanero, 1 Carolina Reaper.
Good timing.
Last time I had crispy chilli chicken Beanflying moaned that I didn't have enough garnish. This better?
Note, foreshortening has bitten here, that bowl of salad is 10" across and is actually bigger than the chicken and chips in the foreground.
That's something I currently must not even think of. Also, Hubby has munched all my croissants that were in the freezer ...
grrr.
I have been on many sales courses where the trainer claimed to have cracked the travelling salesman problem by dividing the salesmen's patch up into sectors and then arranging visits to each sector on certain days, so you could tell a potential new customer when you're going to be in his area next and make the appointment.
That completely misses the point about where most of the business comes from and, of course, takes no account of the customer's potential, the urgency of their request, etc. Needless to say, no real salesperson would ever work to their plan, as the trainers had zero area knowledge or company awareness; they were in fact failed salespeople who could not cut it in the real world of sales. That problem is not solvable without the salesperson's own special knowledge of the area and the customer, and that is an ever-flexible thing that is impossible to quantify.
Except none of that is "The Travelling Salesman Problem", a well-known mathematical problem which is "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?".
It sounds deceptively simple and in fact is very hard to solve in a scalable way. It seems easy because if you have three cities you can easily list all the possible routes and find an optimal one, but the number of routes grows too fast to keep up as you increase the number of cities. The problem is to find a method that scales linearly or in polynomial complexity with the number of cities rather than in factorial complexity - nobody has. It's what's known as an NP-hard problem in mathematics, or "just plain too hard to figure out properly for more than a small example before hell freezes over" to the rest of us.
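To make the "list all the possible routes" point concrete, here is a small brute-force sketch. It is purely illustrative: it assumes the distances arrive as a square matrix with city 0 as the origin, and the function names are made up. It simply tries every ordering of the remaining cities, which is exactly the approach that stops being practical very quickly.

```python
from itertools import permutations
from math import factorial
from typing import List, Sequence, Tuple


def brute_force_tsp(dist: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Try every round trip starting and ending at city 0; return (length, tour)."""
    n = len(dist)
    best_len, best_tour = float("inf"), []
    for perm in permutations(range(1, n)):
        tour = [0, *perm, 0]
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_len:  # strict '<' keeps the first of any equally short tours
            best_len, best_tour = length, tour
    return best_len, best_tour


def tours_to_check(n: int) -> int:
    """How many orderings the brute-force search has to examine for n cities."""
    return factorial(n - 1)
```

With four cities that is 6 orderings, with eleven it is already 3,628,800, and with twenty it is 19!, roughly 1.2 x 10^17, which is where the "before hell freezes over" part comes in.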
I reckon it’s got a really simple answer that will take a million people a hundred years to trip over accidentally.
In the case of the aforementioned one it’s not actually the travelling salesman problem as such. It turned out the sources and destinations of stuff were actually fairly consistent and the drivers worked out that if they resold their jobs on the same site under a different account they could build an arbitrary hub and spoke architecture and circumvent the commission model. Turns out a few thousand monkeys are smarter than a computer scientist.
The IT contractor was the smartest of the lot though, as he was the only one who made any money out of it in the end.
Edit: removed small rant off the end of that because I am kinder than I was 5 minutes ago
Sometimes pins on a map/corkboard really is the best way to figure something out.
mnem
Oooh... lookit all the little flags!
That looks pretty much exactly like the one I did for dinner today. Except yours has some kick. If I put any chillis with balls in mine none of the kids would eat it. I have to add that to my plate afterwards via this: https://www.waitrose.com/ecom/products/encona-carolina-reaper-chilli-sauce/600058-662732-662733
No, a six year-old girl with a blackboard and chalk is almost always the best way to tackle any problem. Unless the problem is "Who gets the biggest bit of cake?", at which point the natural good sense, innocence, and innate sense of fair play of a six year-old girl goes out of the window. Ditto "What's the best colour?" unless "pink" is the answer you were already looking for.
The "reading ahead of what I'm actually reading" bit of my brain tried to read "encona" in that link as "enema".
Well it does that after a few hours anyway so that's a fair mistake to make
I think you probably need to meet my kids. They're weird. Pink is definitely off the menu here. If you buy them cake they will eat the sponge and leave the topping. Any suggestion of childlike things, especially other people's parties, is met with a look of disdain.
They can also tell the difference between a 2N3904 and a ceramic capacitor based on the feel on the bottom of the foot (that'll get 'em back for the sodding Lego!)
So you're saying that your phenotype breeds true, then...?
mnem
"You laugh at me because I'm different; I laugh at you because you're all the same."
Take up building things out of wire-wrapped SN7400 series logic in DIL. That'll teach 'em. Enough pointy bits to make up for the Lego and the nappy changing.
A related problem is "build a redundant network between cities" and is most easily solved by outsourcing it.
Ham friend using my equipment to perform a complete alignment of an Icom IC-7100 transceiver. The HP 437B worked like a charm.
The HP 8570A arrived on Tuesday, naked and in the rain as the Palletways guy had cut it off the pallet in order to get it to my door. Oh well... time to sit it by the fire for a day or so.
Finding a space for the thing, even on a temporary basis, is a problem, as it's absolutely huge (full rack, 4U, 600mm deep, and 20-25kg).
As I suspected, the fuse looked like this:
Now I just need to learn how to use it, so I can test if it works as it should. I did notice the CRT was a little unstable on initial power up, so there's likely some connection/condensation/crapacitor issues to look at.
The bottom left corner of the screen is slightly darkened, but not to the point it's a problem. I think it'll clean up nicely!
Congratulations! A fine piece of gear.
but study this picture......because when you open the case some of these will fall out......and you will need to collect them to epoxy back on the rotary discs. and you will be hating life when you reassemble the fecking things. but you will be loving life after.
Wow, things have really gone off the deep end since yesterday.
How does it go? The project always grows to fill all the time possible and more?
@bd139: TEArchiver feature creep #276: Ability to index and list all of mnem's sig quips.
#277: Notify when there's a dupe in the list from #276.
LOL I should have taken a screen shot, but I got an "error giving thanks" on that last post by bitseeker. Never seen that error before.
Tried a second time and it worked.