Author Topic: C++ Template Meta Programming for regular expressions?  (Read 1897 times)

0 Members and 1 Guest are viewing this topic.

Offline westfwTopic starter

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
C++ Template Meta Programming for regular expressions?
« on: November 11, 2019, 10:11:08 am »
As I understand things, regular expressions require that the standard format we're used to be "compiled" into an FSM before actual processing occurs.  I'm curious whether anyone has written a C++ library that does the FSM generation at compile time rather than during execution (presumably using template meta programming?)
 

Offline Ash

  • Regular Contributor
  • *
  • Posts: 161
  • Country: au
Re: C++ Template Meta Programming for regular expressions?
« Reply #1 on: November 11, 2019, 10:28:09 am »
It is possible, and there is a pretty amazing library under development:

https://github.com/hanickadot/compile-time-regular-expressions

I've not used it, but I've been following its development.. there are a number of CppCon videos on Youtube about it.

*edit* Just a warning you will need a modern compiler.. C++17 minimum I think.

Enjoy,
Ash.
« Last Edit: November 11, 2019, 10:31:30 am by Ash »
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 6779
  • Country: pl
Re: C++ Template Meta Programming for regular expressions?
« Reply #2 on: November 11, 2019, 10:14:34 pm »
It's such a basic problem I'm sure there are tools which take regular expression and spit out a C function which recognizes it.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 14470
  • Country: fr
Re: C++ Template Meta Programming for regular expressions?
« Reply #3 on: November 11, 2019, 11:48:49 pm »
As it's possible to program a Tetris game with C++ templates apparently, what you're after should be perfectly possible. I don't know of any such available solution, but I'm sure it's possible.

Apart from performance reasons, what would be your rationale for using that as opposed to a generic regex library? Code size I suppose?

Edit: just found this: https://github.com/NTSFka/template_regex
« Last Edit: November 11, 2019, 11:51:06 pm by SiliconWizard »
 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
Re: C++ Template Meta Programming for regular expressions?
« Reply #4 on: November 12, 2019, 12:08:18 am »
It's such a basic problem I'm sure there are tools which take regular expression and spit out a C function which recognizes it.

Since the very earliest versions of Unix, it's called 'lex'. Not the easiest thing to use, but it's there and pretty much always has been.
Anybody got a syringe I can use to squeeze the magic smoke back into this?
 

Offline westfwTopic starter

  • Super Contributor
  • ***
  • Posts: 4199
  • Country: us
Re: C++ Template Meta Programming for regular expressions?
« Reply #5 on: November 12, 2019, 12:36:18 am »
Quote
what would be your rationale for using that as opposed to a generic regex library? Code size I suppose?
Mainly code and data memory size.  For the traditional "generate first", you need the FSM generation code present, and the resulting FSM will end up in RAM.  Doing things at compile time, you could at least theoretically eliminate the generation code and the original string, and put the whole FSM in ROMish memory.

(Mostly, this seemed like an "obvious over-optimization" that someone would have done already.)

Quote
it's called 'lex'
Hmm.  I haven't actually used LEX, but I never got the impression that it could be used similarly to modern regexp libraries, or that it aimed at producing particularly efficient code.


 

Offline Cerebus

  • Super Contributor
  • ***
  • Posts: 10576
  • Country: gb
Re: C++ Template Meta Programming for regular expressions?
« Reply #6 on: November 12, 2019, 01:21:42 am »
Hmm.  I haven't actually used LEX, but I never got the impression that it could be used similarly to modern regexp libraries, or that it aimed at producing particularly efficient code.

Lex is a bit specialised, in that it's intended to really be used to produce the lexical front end for parsing tools. But at its heart it's just a regex => FSM generator. The generated code is very fast, table based, and as efficient as you're going to get without actually generating object code that directly executes the FSM. More modern implementations such as GNU flex are functionally similar but a bit easier to use. Lex and flex are really at home when you're trying to evaluate several regexes in parallel and finding which one matches, which is what they are designed for.

If you're desperate for a solution, you could cobble one with lex or flex, but, were it me, it's not the route I'd take if I could find something better fitted. As someone else said, it's unlikely that there isn't a tool out there that's a better fit but, offhand, I don't know of one.

By pure chance it happens that I may well be starting on writing a lex/flex style tool myself in the near future. If and when I do I'll consider whether it's practical to produce a sideline to it that's targetted at generating simple standalone recogniser routines from regexes.
Anybody got a syringe I can use to squeeze the magic smoke back into this?
 

Offline magic

  • Super Contributor
  • ***
  • Posts: 6779
  • Country: pl
Re: C++ Template Meta Programming for regular expressions?
« Reply #7 on: November 13, 2019, 09:41:12 am »
Another RE to C compiler.

https://re2c.org/manual/manual.html
 


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf