Author Topic: Looking for a site that will reduce HTML or make compatible with older browsers  (Read 2965 times)


Offline edy (Topic starter)

  • Super Contributor
  • ***
  • Posts: 2387
  • Country: ca
    • DevHackMod Channel
As someone who likes to hang on to retro tech, I often find myself booting up old computers or old phones with ancient browsers. The problem is, when I try to navigate to modern websites they often fail to load or have certificate/security issues.

My question is, does anyone know if a "pass-through" type website exists which will basically grab a page that you want, scrub it (or convert it) to be compatible with older browsers, and then present it to you?

For example, I would visit the converter site (call it "www.oldbrowser.com") with a request like this:

   http://www.oldbrowser.com/www.eevblog.com (or something like that)

It would then grab the page (at www.eevblog.com) and present it to me scrubbed of most of the modern HTML5 stuff and let me at least see most of the text and some basic images, and without security cert issues. It could also convert all the links so that they also have "http://www.oldbrowser.com/" in front so when I click on a link it will load it through the converter website as well.

That way, if I want to load up a page even just to see some text, it will at least display something. Back in the day they had "m" versions of sites for mobile, but now everything is handled by scripts that try to detect your OS and screen and customize the page for your device. Unfortunately none of that seems to work with the oldest browsers. The sites are completely BROKEN on older browsers: they die after a while trying to load the page, crap out due to memory overload, or fail to draw or format anything correctly, even on a page where all I want is the basic text and images.

Does such a site exist? Is anyone working on such a project, or does anyone see it as a useful thing? I can't imagine it would take much to make this happen. If I registered "oldbrowser.com", then when someone requested a URL from it (in the format I suggested above) it would fetch the page in question, try to recode certain tags or eliminate others altogether, or do some conversion (there are likely scripts already available and document conversion tools in Linux that could clean it up), and then return the modified page to the user.
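For what it's worth, here is a minimal sketch of what I have in mind, assuming Python with Flask, requests and BeautifulSoup (all just my choice of tools, and the hostname is hypothetical). It fetches the page over HTTPS on behalf of the old client, strips the tags an old browser can't use, rewrites links so they keep going through the converter, and returns plain HTML. No JavaScript is executed, so script-heavy sites will still come out incomplete:

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from flask import Flask

app = Flask(__name__)

# Tags an old browser can't use; drop them entirely.
STRIP_TAGS = ["script", "style", "noscript", "iframe", "svg", "video", "audio", "canvas"]


@app.route("/<path:target>")
def simplify(target):
    # Fetch the real page over HTTPS on behalf of the old client.
    resp = requests.get("https://" + target, timeout=15)
    soup = BeautifulSoup(resp.text, "html.parser")

    # Remove everything the old browser would choke on.
    for tag in soup(STRIP_TAGS):
        tag.decompose()

    # Rewrite links so they keep going through this converter.
    for a in soup.find_all("a", href=True):
        absolute = urljoin("https://" + target, a["href"])
        a["href"] = "/" + absolute.replace("https://", "").replace("http://", "")

    return str(soup)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

You would then point the old browser at something like http://converter-host:8080/www.eevblog.com, in the path format I suggested above.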

Thoughts? Suggestions? I just want my old browsers to at least fetch some useful information still.

Here is an example of an old website that is still working and displays perfectly on my BlackBerry Curve:  :-DD

http://www.dolekemp96.org/main.htm
« Last Edit: December 21, 2020, 08:46:18 pm by edy »
YouTube: www.devhackmod.com LBRY: https://lbry.tv/@winegaming:b Bandcamp Music Link
"Ye cannae change the laws of physics, captain" - Scotty
 
The following users thanked this post: cdev

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
You could download the OpenSSL source code and compile it for your old computer using a compatible compiler you still have, and also download up-to-date root certificates.
"What the large print giveth, the small print taketh away."
 

Offline retiredfeline

  • Frequent Contributor
  • **
  • Posts: 572
  • Country: au
A lot of sites require JavaScript to work. You can observe this if you have the NoScript add-on installed in Firefox. So those would be a problem for your scheme.
 

Offline MIS42N

  • Frequent Contributor
  • **
  • Posts: 528
  • Country: au
I think it would be more sensible to have an in-house site do this: a server on your own network. I think it would be technically feasible. It would need a modern browser back end, an Apache front end, and glue in the middle to take the output of the browser's rendering engine, recode it as HTML 2 (or whatever level the ancient browser could handle), and drop all the bits that can't be recoded.
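Just to make the idea concrete, here is a rough sketch of that glue layer, assuming Python with Playwright as the modern rendering back end and BeautifulSoup for the simplification (both are only my choice of tools). It lets a real browser engine run the scripts, then keeps only headings, paragraphs and links as plain HTML that an ancient browser should cope with:

from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright


def render_and_simplify(url: str) -> str:
    # Let a modern, headless browser execute the scripts first.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()  # the DOM after JavaScript has run
        browser.close()

    # Then throw away everything except basic text and links.
    soup = BeautifulSoup(html, "html.parser")
    parts = ["<html><body>"]
    for el in soup.find_all(["h1", "h2", "h3", "p", "a"]):
        text = el.get_text(" ", strip=True)
        if not text:
            continue
        if el.name == "a" and el.get("href"):
            parts.append('<p><a href="%s">%s</a></p>' % (el["href"], text))
        else:
            # Links inside paragraphs will show up twice; good enough for a sketch.
            parts.append("<p>%s</p>" % text)
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_and_simplify("https://www.eevblog.com/"))

The Apache front end would then just serve the returned string to the old browser over plain HTTP.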

I don't think one person could do this, but there are Special Interest Groups for all sorts of minimalist projects. Many variants of Linux. Good luck.
 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
What about all the backend chatter?

They have intentionally made it so older browsers break in order to make it nearly impossible to be anonymous on the web.

This is because they sell that information.
"What the large print giveth, the small print taketh away."
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7192
  • Country: fi
    • My home page and email address
My question is, does anyone know if a "pass-through" type website exists which will basically grab a page that you want, scrub it (or convert it) to be compatible with older browsers, and then present it to you?

For example, I would visit the converter site (call it "www.oldbrowser.com") with a request like this:

   http://www.oldbrowser.com/www.eevblog.com (or something like that)

It would then grab the page (at www.eevblog.com) and present it to me scrubbed of most of the modern HTML5 stuff and let me at least see most of the text and some basic images, and without security cert issues. It could also convert all the links so that they also have "http://www.oldbrowser.com/" in front so when I click on a link it will load it through the converter website as well.

Spammers will celebrate the second such a website is published, especially if it executes JavaScript, and only provides the rendered page to the client.  You see, that neatly sidesteps all email scraping protections.  Spammers only need to suck the pages they are interested in through your converter site, and they'll be able to scrape all emails ever published on the web.

I for one will not help you with that.

If I were you, I'd set up a remote desktop on a Linux machine within your local network, and instead of browsing the net directly on old machines, use a browser on the remote desktop.  You should be able to find VNC clients even for older phones and OSes.
 

Offline janoc

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: de
What about all the backend chatter?

They have intentionally made it so older browsers break in order to make it nearly impossible to be anonymous on the web.

This is because they sell that information.

Did you consider, in your paranoia, that keeping sites working for old browsers that don't support modern standards and are full of security holes which have since been patched can actually be an issue? It costs resources and money, you know.

And what for? So that someone can look at your website in an old version of Internet Explorer or Mozilla and possibly get their data stolen due to a browser exploit that has been patched years ago?

This has absolutely zilch to do with "being anonymous" on the web. First, the idea that you can be anonymous online is a dangerous myth. If nothing else, unless you are using something like Tor you are leaving your IP address and browser identification behind. And even Tor can be, and has been, de-anonymized.

The fact is that the older browsers were orders of magnitude worse in this regard because there was no separation between scripts (one tab could spy on every other tab), there were no anti-tracking or anti-fingerprinting features in the browsers, there was little to no sandboxing of browser extensions (ever heard about drive-by downloads installing malware and extensions onto your computer? Internet Explorer was notorious for this), etc.

I suggest you put away your tinfoil hat and try to do a bit of research about what browsers do and how they do it instead of spreading this nonsense. Yes, tracking, spying and selling your data are real. However, it has absolutely nothing to do with the problem the OP has.


To the OP:

Such "scrubbing" tool would be difficult to make, modern sites will break horribly if you strip or block javascript or some things the scripts expect are not working/supported. And that is not only the frontend (the html code) but also backend - any forms, fetching data in the background, etc.

You could try one of the filtering proxy tools that you install locally and point the browser at. A tool like Privoxy will strip away ads and a lot of the junk; perhaps that will make the site more palatable for the old browser:

http://www.privoxy.org/

« Last Edit: December 22, 2020, 04:08:41 pm by janoc »
 
The following users thanked this post: tooki

Offline janoc

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: de

Spammers will celebrate the second such a website is published, especially if it executes JavaScript, and only provides the rendered page to the client.  You see, that neatly sidesteps all email scraping protections.  Spammers only need to suck the pages they are interested in through your converter site, and they'll be able to scrape all emails ever published on the web.

That's by far the least of the problems. If you are publishing e-mail addresses on your website and are relying on JavaScript obfuscation to "protect" them, you are only deluding yourself.

There are plenty of scraping tools that are capable of executing JavaScript and can even defeat simpler forms of captchas, e.g. using OCR.

And that completely ignores the elephant in the room, which is that anyone can pay a few bucks for a database of millions of addresses that were leaked from one of the many corporate hacks. Many are even available for free, along with your other personal information. Scraping e-mails from websites is pretty worthless today, and it is only done while scanning for other things (such as unpatched/vulnerable WordPress) because it doesn't really cost the spammer anything extra.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7192
  • Country: fi
    • My home page and email address
Spammers will celebrate the second such a website is published, especially if it executes JavaScript, and only provides the rendered page to the client.  You see, that neatly sidesteps all email scraping protections.  Spammers only need to suck the pages they are interested in through your converter site, and they'll be able to scrape all emails ever published on the web.
That's by far the least of the problems. If you are publishing e-mail addresses on your website and are relying on JavaScript obfuscation to "protect" them, you are only deluding yourself.
Those who use leaked database addresses risk a liability.  OCR is slow, and scraping thousands of pages through OCR is just not cost-effective: you're burning more money in electricity than you can recover from selling the mailing lists.

I think you are overestimating the effort spammers are willing to go to gather email addresses.  That which is technically possible and easy does not always make sense financially.

Fact is, phishing and scamming emails are deliberately full of typos and errors, because they are not interested in those who notice those; they are interested in those who do not, and are therefore easier targets.  Advertisers use mailing lists based on user profiles, not scraped ones; and they buy those off eBay sellers, Banggood, Facebook, Google, et al. "legally".
 

Offline Syntax Error

  • Frequent Contributor
  • **
  • Posts: 584
  • Country: gb
@edy What is your baseline browser? IE3, Netscape Navigator, Mosaic, AOL, CompuServe... etc... or even the ancient text-only Lynx (my first WWW experience). Or which HTML version are you targeting, before HTML 4?

Concept-wise, it's no issue coding an intermediary to pull out 'modern' syntax, but how do you replicate the now-dead document nodes? For example, a DIV tag is often populated with a set of client-side jQuery callbacks; antique IE4 would do nothing or (more normally) segfault. The intermediary would need to run and translate the jQuery result.

Worse for translation though, modern web 'design patterns' have turned the trusty TABLE tag into a thought crime for the 'responsive' developer. Page layouts are most likely styled divs with a few flexboxes thrown in for trendiness. And let's not get into websockets, event handlers, or CSS with its crazy @page directives. Truth is, web sites are not made for decades-old legacy. Or even days-old!

The question is rather like asking: can I have a new car AND rip out the CAN bus? Difficult.



 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
Janoc,

You have a right to your opinion, but I have a right to mine. That information they collect is intended for commercial use in the future. It won't always be the way it is now. Companies see it as a huge race for all the money in the world; they want to get their share before it's gone. Everybody else is. They are making all this into their entitlements.

Consider that I was just speculating as to how he could update the core cryptographic libraries in his old computer. So your argument doesn't make sense. Maybe he just wants to be able to use the web to look up the occasional datasheet.

That's what has happened. Zap, can't use the web at all. Nor can people without specialized technical knowledge fix it.

Thank God for Linux. I saw this coming, so I switched to Linux years ago.

What about all the backend chatter?

They have intentionally made it so older browsers break in order to make it nearly impossible to be anonymous on the web.

This is because they sell that information.

Did you consider, in your paranoia, that keeping sites working for old browsers that don't support modern standards and are full of security holes which have since been patched can actually be an issue? It costs resources and money, you know.

And what for? So that someone can look at your website in an old version of Internet Explorer or Mozilla and possibly get their data stolen due to a browser exploit that has been patched years ago?

This has absolutely zilch to do with "being anonymous" on the web. First, the idea that you can be anonymous online is a dangerous myth. If nothing else, unless you are using something like Tor you are leaving your IP address and browser identification behind. And even Tor can be, and has been, de-anonymized.

The fact is that the older browsers were orders of magnitude worse in this regard because there was no separation between scripts (one tab could spy on every other tab), there were no anti-tracking or anti-fingerprinting features in the browsers, there was little to no sandboxing of browser extensions (ever heard about drive-by downloads installing malware and extensions onto your computer? Internet Explorer was notorious for this), etc.

I suggest you put away your tinfoil hat and try to do a bit of research about what browsers do and how they do it instead of spreading this nonsense. Yes, tracking, spying and selling your data are real. However, it has absolutely nothing to do with the problem the OP has.


Chromium is a mess as far as privacy is concerned. Firefox is too.

Have you ever built either of them from source?


I think simpler is often much more secure. 
« Last Edit: December 22, 2020, 06:41:46 pm by cdev »
"What the large print giveth, the small print taketh away."
 

Offline janoc

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: de
Spammers will celebrate the second such a website is published, especially if it executes JavaScript, and only provides the rendered page to the client.  You see, that neatly sidesteps all email scraping protections.  Spammers only need to suck the pages they are interested in through your converter site, and they'll be able to scrape all emails ever published on the web.
That's by far the least of the problems. If you are publishing e-mail addresses on your website and are relying on JavaScript obfuscation to "protect" them, you are only deluding yourself.
Those who use leaked database addresses risk a liability. 

Since when does a criminal care about (I assume criminal) liability? 90+% of spam is various scams.

Fact is, phishing and scamming emails are deliberately full of typos and errors, because they are not interested in those who notice those; they are interested in those who do not, and are therefore easier targets. 

That is an interesting theory - in other words, with this strategy you would be intentionally seeding your mail with keywords that are very unlikely to occur in legitimate mail of your target and thus they make wonderful "spamminess" indicators for various filters. Even the ancient SpamAssassin used this, almost twenty years ago. Makes total sense ...  :palm:

Not to mention that most people these days are trained to treat the mail as suspicious/scam when it is full of typos and errors, regardless of whether it is actually one. It is such a tell-tale sign.

I think that:

a) You are giving way too much credit to scammers who are most often not native speakers of the language of the targeted population and thus use weird and wonderful linguistical constructions and typos most people wouldn't make. A lot of spam is even machine translated.

b) You have never had to actually administer a larger e-mail system and deal with both web scraping and spam ... I did, for many years.

 

Offline janoc

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: de
Quote from: cdev

Consider I was just speculating as to how he could update the core cryptographic libraries in his old computer. So your argument doesnt make sense - Maybe he just wants to be able to use the web to look up the occasional datasheet.

But fixing his cryptographic libraries won't help him squat when the library is built into an old browser! You can't just swap DLLs around and expect things to work. And how is he going to update e.g. Microsoft's crypto built into something like Windows XP? Because that is what e.g. Internet Explorer uses.

Furthermore, it is a complete red herring: if a root certificate is not recognized or has expired and the browser is (correctly) refusing to load the site, then what needs to be done is importing a new certificate, not messing with crypto libraries (and likely breaking your OS in the process)!

Quote from: cdev
Thats what has happened. Zap, cant use web at all. Nor can people without specialized technical knowledge fix it.

He should install a more modern browser on the machine (if possible) or use another machine to browse the web. He obviously has one available, since he posted on this forum. What you are proposing is a good way to get your computer hijacked, and it still won't solve his problems.

Quote from: cdev
Thank God for Linux. I saw this coming so switched to Linux years ago.

I am writing this from a Linux machine myself. I guess you don't realize that Linux uses exactly the same browsers as Windows does these days (not counting Internet Explorer). Try to browse the current web with a 20-year-old Mozilla Suite and you will see how broken things are. The choice of operating system plays little to no role there.


Quote from: cdev
Chromium is a mess as far as privacy is concerned. Firefox is too.

Have you ever built either of them from source?

I think simpler is often much more secure.

Not quite sure what your point here is. If you don't like bundled telemetry in Chrome or Firefox then use versions without it - e.g. Chromium or Brave or whatever else. You have a choice there.

However, compiling something from source will still not make Facebook or Google or whatever other website work with an old browser that doesn't support current JavaScript and web standards. Good luck browsing the internet with Lynx then. Or NCSA Mosaic. You will need it.

Simpler is indeed more secure - as long as it a) works b) doesn't re-introduce a ton of security holes that have been long fixed in more recent versions.
« Last Edit: December 22, 2020, 07:03:53 pm by janoc »
 

Offline edy (Topic starter)

  • Super Contributor
  • ***
  • Posts: 2387
  • Country: ca
    • DevHackMod Channel
@edy What is your baseline browser? IE3, Netscape Navigator, Mosaic, AOL, CompuServe... etc... or even the ancient text-only Lynx (my first WWW experience). Or which HTML version are you targeting, before HTML 4?

Wow, I didn't realize how deep down the rabbit hole modern browsers have taken us. Reading up on it, I see that if you aren't keeping up with compatibility even a few years out or less, your site may break. I remember coding a webapp that used jQuery for some features, and after an OS update on the phone some of my app's features didn't work. I then had to go back and figure out whether it was jQuery, the update to the WebKit engine, or some function name/parameter changes that caused it. What a nightmare, and only so I could upload a "new" version of the same free app so that users wouldn't complain (not to mention having to keep an old version there for users who were still on the old OS version).  |O

Anyways, back to the question...

Let's say I have an old Ubuntu 6.06 (Dapper Drake) Linux machine running some 11-year-old version of Mozilla Firefox, or I have an old BlackBerry Curve phone I'm using to stream music (released on Rogers Wireless in Canada on Aug 4, 2010). Or even a BlackBerry PlayBook tablet that was released in 2011 and had its last OS update in 2012, which basically left it with a stale browser. Unfortunately I can't update these devices, but they still operate perfectly fine and connect to the internet over Wi-Fi no problem.

The goal is not to "scrape" sites. I understand most sites will not function properly. I just want a way to read a page and render it in a way that can be displayed on an old browser. Yes I can always use VNC on the device (like the Playbook) and just connect to a modern Linux machine with a VNC server and browser the internet that way. Perhaps the same for my old Ubuntu machine, if I can install an old version of VNC. However certain devices (especially "dead" ones without app support anymore) like phones would benefit from having an intermediary site that can render/reformat the page even in some way that lightens the bandwidth considerably and gets rid of much of the "bells and whistles" while still retaining at least the relevant information. That's the hard part.

For example, imagine I want to load up a news site. I could theoretically just grab the entire page, convert it to a static IMAGE, and then put interactive "hotspots" on the image where there are URL links, so that when you hover over one you can click it and navigate. So the entire site isn't even text anymore... just graphical elements. But that wouldn't work for many other sites that do more complicated things. Obviously any kind of pop-up players wouldn't work, but I don't usually need them.
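A toy version of that page-as-image idea, assuming Python with Playwright (again just my choice of tool): screenshot the rendered page and emit an old-fashioned client-side image map with one clickable area per visible link:

from playwright.sync_api import sync_playwright


def page_to_imagemap(url: str, png_path: str = "page.png") -> str:
    areas = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1024, "height": 768})
        page.goto(url, wait_until="networkidle")
        page.screenshot(path=png_path, full_page=True)

        # One rectangular hotspot per link that actually got rendered.
        for link in page.query_selector_all("a[href]"):
            box = link.bounding_box()  # None if the link is not visible
            href = link.get_attribute("href")
            if box and href:
                coords = "%d,%d,%d,%d" % (
                    box["x"], box["y"],
                    box["x"] + box["width"], box["y"] + box["height"],
                )
                areas.append('<area shape="rect" coords="%s" href="%s">' % (coords, href))
        browser.close()

    # <img>, <map> and <area> have been around since the earliest HTML versions.
    return ('<img src="%s" usemap="#links">\n<map name="links">\n%s\n</map>'
            % (png_path, "\n".join(areas)))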

On the BlackBerry Curve, I usually listen to music from internet radio stations. To get the stream URL, I go to my favourite radio station on www.streema.com, click on a station, and then open up the PAGE SOURCE and find the URL of the stream. I can browse to that directly in my BlackBerry Curve browser and it will play the station. However, the phone fails completely when it comes to even loading the Streema.com website!

I'll give you an example... on streema.com I can search for Top 40 hits and find the following website pop-up with an embedded player:

http://streema.com/radios/play/American_Top_40_AT40

Since none of this works on my BlackBerry Curve, I open the source of that page and find the following URL in there:

<source src='http://stream.revma.ihrhls.com/zc4802' type='audio/mpeg' />

I type that link into my BlackBerry Curve browser and it will give me the option to OPEN or SAVE. If I save it, it will start writing an mp3 file to local storage which will go on indefinitely (basically saving the music stream, which I can play back later). Or I can choose OPEN and it will start playing it... I connect it via the line-out phono jack and have internet radio streaming to some speakers. So while streema.com fails to load at all on my BlackBerry Curve (it complains that it is unable to connect using the current security settings), if it were formatted another way it would still be useful.
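That view-source-and-hunt-for-the-URL step is easy to automate; here is a small sketch assuming Python with requests and BeautifulSoup (my choice of tools). It just pulls the first audio/mpeg <source> tag out of the page, which is exactly what I do by hand; if streema.com changes its markup the selector would need adjusting:

from typing import Optional

import requests
from bs4 import BeautifulSoup


def find_stream_url(page_url: str) -> Optional[str]:
    # Fetch the player page and look for the embedded <source> tag.
    html = requests.get(page_url, timeout=15).text
    soup = BeautifulSoup(html, "html.parser")
    source = soup.find("source", attrs={"type": "audio/mpeg"})
    return source.get("src") if source else None


if __name__ == "__main__":
    # e.g. prints http://stream.revma.ihrhls.com/zc4802, which the Curve can open directly
    print(find_stream_url("http://streema.com/radios/play/American_Top_40_AT40"))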

I think the problem will be that in order to pull out useful information, the reformatting for each site has to be customized, and that will be an impossible task for a single "pass-through" site. That's probably the crux of the problem.
« Last Edit: December 22, 2020, 08:49:33 pm by edy »
YouTube: www.devhackmod.com LBRY: https://lbry.tv/@winegaming:b Bandcamp Music Link
"Ye cannae change the laws of physics, captain" - Scotty
 

Offline james_s

  • Super Contributor
  • ***
  • Posts: 21611
  • Country: us
All this JavaScript and other similar garbage is the reason for the security holes in the first place. Almost none of it does anything worthwhile either; IMO 90% of the web would be vastly improved by a return to early 2000s website design consisting of mostly static HTML. One of my favorite examples of excellent web design is http://lamptech.co.uk: good information density, nice photos, logical layout, and it works nicely even on browsers that don't have JavaScript and similar crap enabled. One of my favorite examples of absolutely *awful* web design is http://komonews.com. It is bloated, there is an auto-play video window that follows you around and pops up again every time you navigate back from an article, and then the part that really baffles me is that if you mouse over the upper area of the page a new menu pops up over the whole page and sometimes it's hard to get out of it. I don't know what kind of crack-smoking idiots they have designing their site, but it has gotten steadily worse and more baffling with each update over the years.
 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
You could likely use the mobile versions of sites now, if your TLS library was up to date.

Also, similarly to my earlier suggestion, you could download the source for modern browsers and possibly compile some of them yourself, if you had the dependencies. Depending on how old your OS is, this might be impossible for the most modern browsers, but there should be a midpoint where basic browsing still works. With some OSes you need to compile everything yourself. It's not that bad, it just takes a lot of time.

You could perhaps even cross-compile it on a newer machine, setting the target to be this older version.
« Last Edit: December 22, 2020, 09:28:44 pm by cdev »
"What the large print giveth, the small print taketh away."
 

Offline Syntax Error

  • Frequent Contributor
  • **
  • Posts: 584
  • Country: gb
@edy For rabbit hole read black hole. Historically...

The elephant dung in the room is that there are not, and have never been, any web standards (say wot?). There is no ISO, DIN or IEEE body making definitive rulings about the syntax of the WWW. Instead, we have vague Request-For-Comment (RFC) documents from a fuzzy entity called the World Wide Web Consortium. The result: there is only ever convergence of practice until someone (Apple, Google, Mozilla) decides they want to create their own features that only they support. There is no web law that says we cannot create the <bobbydazzler> tag or the @ripper css directive and push it out over the eevblog browser. When the geeks on Reddit hear of this new 'thing', an RFC gets raised and now it's a 'web standard'. For five minutes.

The early days of the WWW were littered with the floppy-disked corpses of browsers that succumbed in the Browser Wars of the 1990s. Then, it was not about collective convergence but corporate divergence.

Early website developers were faced with supporting at least Internet Explorer AND Netscape. Two browsers which did events and active content handling in very different ways. Early sites would often show an image that said, "this site is best viewed in Netscape." Meaning, we're not doing a friggin IE version as well. To span this chasm, early sites used Java Applets, which was a whole new parallel universe of plugin divergence. But at least JAs used the real Java language, with a graphics canvas that you could draw on. HTML5 rendered Java Applets and the whole zoo of other active object tags obsolete. Which was a good thing.

You probably could build a web time machine, but your 'Tardis' process will need to regenerate your Doctor Who backwards; from Jodie Whittaker to David Tennant.
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 7192
  • Country: fi
    • My home page and email address
a) You are giving way too much credit to scammers who are most often not native speakers of the language of the targeted population and thus use weird and wonderful linguistical constructions and typos most people wouldn't make. A lot of spam is even machine translated.
Spam, scams, and phishing are three different problems.  Email addresses are scraped for spam.  Hacked data is used for scams and phishing.

Spam is basically advertising.  Those who buy mass marketing services, don't want to pay for sending mail to addresses obtained via hacks, because it can backfire.  (Advertisers really don't like it when they are contacted by police investigating a hack, and having received email to an address that has never been public.)
Advertisers prefer profile databases sold by social networks.  Those cost money.  By scraping existing web pages with relevant keywords, spammers can construct mailing lists that do have a tenuous link to search terms, for much less money.  OCR'ing the pages, or setting up machinery to extract javascript-obfuscated email addresses is not cost effective.  If you have the technical skill to do that, you make more money by copying entire sites, and putting advertising on them.

If you ever buy anything from eBay or Banggood, your email address will be sold to spammers, for advertising.

Scams are easiest to filter out.  They do deliberately have errors, because they want responses from gullible targets.  If you can tell the message is a scam, they don't like to waste time on you anyway.  They use whatever email addresses they can find.

Phishing attempts tend to try to look as genuine as possible, and range from disguised files to links to fake web sites.  These are the hardest to filter out.  They also occur in two completely different categories: targeted, and scattershot.  Scattershot uses whatever email addresses they can find, including hacked databases, but are easily filtered out when detected (based on keywords or URI fragments in the message body).

b) You have never had to actually administer a larger e-mail system and deal with both web scraping and spam ... I did, for many years.
No large email servers, only one large mailing list server for a few years, but lots of web servers of different kinds.  I do know scraping well – both how to scrape, and how to make scraping as frustrating as possible.  Which is the point: to keep addresses off mass marketing lists.  Scams we can mostly filter out; and users just have to learn to be wary of phishing.  FWIW, one of my email addresses has been in active use over 25 years now.
 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
@edy For rabbit hole read black hole. Historically...

The elephant dung in the room is that there are not, and have never been, any web standards (say wot?). There is no ISO, DIN or IEEE body making definitive rulings about the syntax of the WWW. Instead, we have vague Request-For-Comment (RFC) documents from a fuzzy entity called the World Wide Web Consortium. The result: there is only ever convergence of practice until someone (Apple, Google, Mozilla) decides they want to create their own features that only they support. There is no web law that says we cannot create the <bobbydazzler> tag or the @ripper css directive and push it out over the eevblog browser. When the geeks on Reddit hear of this new 'thing', an RFC gets raised and now it's a 'web standard'. For five minutes.

The early days of the WWW were littered with the floppy-disked corpses of browsers that succumbed in the Browser Wars of the 1990s. Then, it was not about collective convergence but corporate divergence.

Early website developers were faced with supporting at least Internet Explorer AND Netscape. Two browsers which did events and active content handling in very different ways. Early sites would often show an image that said, "this site is best viewed in Netscape." Meaning, we're not doing a friggin IE version as well. To span this chasm, early sites used Java Applets, which was a whole new parallel universe of plugin divergence. But at least JAs used the real Java language, with a graphics canvas that you could draw on. HTML5 rendered Java Applets and the whole zoo of other active object tags obsolete. Which was a good thing.

You probably could build a web time machine, but your 'Tardis' process will need to regenerate your Doctor Who backwards; from Jodie Whittaker to David Tennant.

It seems to be not as bad as it used to be, if you avoid using the kinds of tools that produce wildly bloated code.

You bring up good points, but I do also think, all other factors aside, we do need the ability to invent new kinds of HTML tags, otherwise there would never be new kinds of content for the web.

There should be a setting in the browser that lets the browser request simpler code, too. A subset of code. Or all the bleeding edge functionality the server(s) can throw at you.


« Last Edit: December 22, 2020, 09:42:07 pm by cdev »
"What the large print giveth, the small print taketh away."
 

Offline tooki

  • Super Contributor
  • ***
  • Posts: 13156
  • Country: ch
Fact is, phishing and scamming emails are deliberately full of typos and errors, because they are not interested in those who notice those; they are interested in those who do not, and are therefore easier targets. 

That is an interesting theory - in other words, with this strategy you would be intentionally seeding your mail with keywords that are very unlikely to occur in legitimate mail of your target and thus they make wonderful "spamminess" indicators for various filters. Even the ancient SpamAssassin used this, almost twenty years ago. Makes total sense ...  :palm:

Not to mention that most people these days are trained to treat the mail as suspicious/scam when it is full of typos and errors, regardless of whether it is actually one. It is such a tell-tale sign.

I think that:

a) You are giving way too much credit to scammers who are most often not native speakers of the language of the targeted population and thus use weird and wonderful linguistical constructions and typos most people wouldn't make. A lot of spam is even machine translated.
No, Nominal Animal is absolutely correct: the misspellings are almost entirely deliberate. Why? Because it's trivial to create spam filters to find critical words, correctly spelled. (This is why, for example, people like me who earned their university diploma cum laude don't dare write that into the body of the message, instead writing "with honors". Never mind the poor residents of Scunthorpe.) So the spammers quickly began using misspellings to get around such keyword-based spam filters. Nowadays, with Unicode, they frequently use lookalike characters to fool spam filters, e.g. ΑΒΕϜΗΙΚΜΝΟΡΤΥΧΖ instead of ABEFHIKMNOPTYXZ, ϳ in place of j, р instead of p, etc. And that's just different alphabets, never mind how those alphabets are repeated multiple times as mathematical symbols. Depending on the font, they may be completely indistinguishable, and even if they're not, a human casually reading it won't know that's not deliberate. Meanwhile, software not specifically designed to equate lookalike characters will not recognize the words.

Sure, you can train a filter to find a misspelling, or even to use fuzzy logic or something to find similar misspellings. But nonetheless, parsing the correctly-spelled content of an email remains a key part of spam filtering. And the sheer variety of misspellings and weird wording that are possible means that recognizing the fingerprint of one particular variant may not give you much by which to recognize others.
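To illustrate the lookalike problem, here is a minimal sketch in Python using only the standard library (the tiny substitution table is hand-picked from the characters above; a real filter would use something like the Unicode consortium's full confusables data). NFKC normalization folds the "mathematical alphabet" copies of A-Z back to ASCII, but Greek and Cyrillic homoglyphs survive it, which is why the explicit table is needed:

import unicodedata

# A few homoglyphs from the examples above; nowhere near exhaustive.
CONFUSABLES = str.maketrans({
    "Α": "A", "Β": "B", "Ε": "E", "Ρ": "P", "Τ": "T", "Υ": "Y",  # Greek capitals
    "а": "a", "е": "e", "о": "o", "р": "p",                      # Cyrillic lowercase
    "ϳ": "j",
})


def skeleton(text: str) -> str:
    """Fold lookalike characters so a keyword filter sees plain ASCII."""
    text = unicodedata.normalize("NFKC", text)  # handles the math-alphabet copies
    return text.translate(CONFUSABLES)          # handles Greek/Cyrillic lookalikes


print(skeleton("ΡΑΥΡΑL аccount susреnded"))  # -> "PAYPAL account suspended"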
 

Offline amyk

  • Super Contributor
  • ***
  • Posts: 8526
A filtering proxy like Proxomitron or Proximodo, or any of their various clones, will be able to help with the TLS/SSL side of things, and could be used to "reduce HTML" in any way you want.

10 years is not that old; I thought you were going for something much older. (I personally use a much older setup as my "daily driver".)
 

Offline S. Petrukhin

  • Super Contributor
  • ***
  • Posts: 1273
  • Country: ru
I can also suggest the idea of using your own modern PC, which you can access via RDP, TeamViewer, etc. from an ancient PC, using it as a thin client. RDP can work as a separate window for the application, not necessarily for the entire desktop.
And sorry for my English.
 

Offline NiHaoMike

  • Super Contributor
  • ***
  • Posts: 9319
  • Country: us
  • "Don't turn it on - Take it apart!"
    • Facebook Page
There was a proxy for browsing the modern Web on truly vintage computers like the Amiga, but I'm having difficulty finding it.
Cryptocurrency has taught me to love math and at the same time be baffled by it.

Cryptocurrency lesson 0: Altcoins and Bitcoin are not the same thing.
 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
Run a browser on another machine, displaying on your old machine via X Windows (this works cross-platform), or use VNC?

(There are lots of VNC clients for different OSs)
"What the large print giveth, the small print taketh away."
 

Offline ebclr

  • Super Contributor
  • ***
  • Posts: 2331
  • Country: 00
cloudflare.com may help
 

