Regarding the use of SCLR itself, you probably have an answer there:
https://www.intel.com/content/www/us/en/support/programmable/articles/000075977.html
As to the second version not even using ENA, take a look at the schematic you got. It's more "severe" than just not using ENA: the result *looks* very inefficient, with the output of the register fed back to an input MUX, and an additional MUX to deal with the synchronous clear. While it's not "pretty" to look at, it may not actually make a difference.
The reason for the first point (synthesis not necessarily using SCLR) is optimization. In FPGAs, you don't deal with fully independent logic structures as shown on schematics; you deal with logic blocks, and their efficient use is something that often eludes us poor humans. So you may think the tool didn't do a good job, while the end result is actually every bit as good, if not better.
Now the last point regarding your second version is the use of both a clock enable and a synchronous clear. One thing to check in Intel's docs is whether the synchronous clear of a register is subject to the clock enable or not. In the code you posted, you are assuming it's not, i.e. that the clear takes effect even while the enable is low. I am not sure that matches the hardware.
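
To make that question concrete, here is a small behavioral sketch in Python (my own illustration, not taken from Intel's documentation or from your code). The two function names are hypothetical; each one computes the next register value for one of the two possible priorities. It does not claim which one the Intel registers implement, it only shows where the two differ.

```python
def next_q_sclr_independent(q, d, ena, sclr):
    """Assumed model 1: the synchronous clear ignores the clock enable."""
    if sclr:
        return 0
    if ena:
        return d
    return q  # hold


def next_q_sclr_gated(q, d, ena, sclr):
    """Assumed model 2: the synchronous clear only acts while the enable is high."""
    if not ena:
        return q  # hold, even if sclr is asserted
    if sclr:
        return 0
    return d


# Compare both models for every ena/sclr combination, with q = d = 1.
for ena in (0, 1):
    for sclr in (0, 1):
        a = next_q_sclr_independent(1, 1, ena, sclr)
        b = next_q_sclr_gated(1, 1, ena, sclr)
        print(f"ena={ena} sclr={sclr} independent={a} gated={b}")
```

The two models only disagree when ena=0 and sclr=1, so that is the case to look up in the docs or to check in simulation.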