I'd disagree slightly, but it's all about semantics (in the everyday sense). C and the whole Wirthian set of languages are equally easy to parse, but C is harder to translate.
It depends on what you include in parsing and what you mean by "translating". Please elaborate; getting into academic semantic subtleties was not the point. To me, parsing is the whole thing from tokenization to getting some kind of tree which makes sense of the source code from a syntactic POV.
(As defined on Wikipedia, for instance: "A parser is a software component that takes input data (frequently text) and builds a data structure – often some kind of parse tree, abstract syntax tree or other hierarchical structure, giving a structural representation of the input while checking for correct syntax. The parsing may be preceded or followed by other steps, or these may be combined into a single step.")
Again, with C, a recursive descent approach or whatever you choose is NOT sufficient, because of the heavy dependency on context. The whole idea is that the more context-dependent a given grammar is, the more difficult it is to parse. (And obviously the more difficult it is to correctly define the grammar itself to begin with.)
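To make the context dependency concrete, here is a minimal sketch of the typedef case that usually gets cited. The function names are made up for illustration. The only point is that the token sequence `T * p` is a declaration in one scope and a multiplication in another, so a parser cannot classify it without knowing what T currently names.

```c
#include <assert.h>

typedef int T;   /* at file scope, T names a type */

/* Here "T * p;" is a declaration: p is a pointer to int. */
int reads_as_declaration(void) {
    int value = 5;
    T * p;
    p = &value;
    return *p;
}

/* Here a local variable named T hides the typedef, so the same
   tokens "T * p" form a multiplication. */
int reads_as_expression(void) {
    int T = 6, p = 7;
    return T * p;
}

/* Small self-check, for illustration only. */
void check_both_readings(void) {
    assert(reads_as_declaration() == 5);
    assert(reads_as_expression() == 42);
}
```

Both functions compile as standard C; which reading applies is decided purely by the declarations in scope, which is the kind of feedback from symbol table to parser being argued about here.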
Have you actually written a complete C parser? It IS a bitch. I'll give you just one example: C declarations. Try writing a parser that correctly identifies all parts of a C declaration in the general case, and make it correct. Now do the same with Pascal (or Modula, Oberon...) Tell me what you think. I can guarantee you that C declarations are fun. As I said earlier, correctly parsing a C declaration involves taking the context into account in a non-trivial way. You may think that (as can be read sometimes) C declarations are hard to read for humans, but very easy to parse for a machine. It's not completely false, but a very naive way of seeing it. Take a complex C declaration and see how well your parser does for identifying the correct type and whatever is actually the identifier being declared (a task that is merely syntactic and that I would definitely include in a parser's job even with a strict definition of parsing.) It's impossible to do correctly without properly identifying the context, whereas parsing a - say - Pascal declaration can be done almost context-free.
To play with C declarations there's a fun site: https://cdecl.org/
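For a feel of what such a tool does, here is a rough sketch in the spirit of the dcl exercise in K&R; it is not how cdecl.org itself works. It assumes the base type is already known and is handed in as a string, and it only understands pointers, `()` and `[N]`. Typedef lookup, qualifiers and parameter lists, i.e. the context-heavy parts discussed above, are deliberately left out. `describe` and `struct decl_reader` are hypothetical names.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Reader state for one declarator; every name here is illustrative. */
struct decl_reader {
    const char *p;    /* cursor into the declarator text */
    char name[32];    /* identifier being declared */
    char out[256];    /* description accumulated so far */
};

static void skip_ws(struct decl_reader *r) {
    while (isspace((unsigned char)*r->p)) r->p++;
}

static void append(struct decl_reader *r, const char *s) {
    strncat(r->out, s, sizeof r->out - strlen(r->out) - 1);
}

static int parse_declarator(struct decl_reader *r);

/* direct-declarator: NAME or "(" declarator ")", then "()" or "[N]" suffixes */
static int parse_direct(struct decl_reader *r) {
    skip_ws(r);
    if (*r->p == '(') {
        r->p++;
        if (!parse_declarator(r)) return 0;
        skip_ws(r);
        if (*r->p != ')') return 0;
        r->p++;
    } else if (isalpha((unsigned char)*r->p) || *r->p == '_') {
        size_t n = 0;
        while (isalnum((unsigned char)*r->p) || *r->p == '_') {
            if (n + 1 < sizeof r->name) r->name[n++] = *r->p;  /* truncate long names */
            r->p++;
        }
        r->name[n] = '\0';
    } else {
        return 0;
    }
    for (;;) {
        skip_ws(r);
        if (r->p[0] == '(' && r->p[1] == ')') {
            r->p += 2;
            append(r, " function returning");
        } else if (*r->p == '[') {
            append(r, " array[");
            for (r->p++; isdigit((unsigned char)*r->p); r->p++) {
                char d[2] = { *r->p, '\0' };
                append(r, d);
            }
            if (*r->p != ']') return 0;
            r->p++;
            append(r, "] of");
        } else {
            return 1;
        }
    }
}

/* declarator: leading stars bind last, so they are emitted after the rest */
static int parse_declarator(struct decl_reader *r) {
    int stars = 0;
    skip_ws(r);
    while (*r->p == '*') { stars++; r->p++; skip_ws(r); }
    if (!parse_direct(r)) return 0;
    while (stars-- > 0) append(r, " pointer to");
    return 1;
}

/* Writes "NAME: <description> BASE" into buf; returns 1 on success, 0 on a syntax error. */
int describe(const char *base, const char *declarator, char *buf, size_t buflen) {
    struct decl_reader r = { declarator, "", "" };
    if (!parse_declarator(&r)) return 0;
    skip_ws(&r);
    if (*r.p != '\0') return 0;   /* trailing junk */
    snprintf(buf, buflen, "%s:%s %s", r.name, r.out, base);
    return 1;
}
```

Calling `describe("int", "(*fp[3])()", buf, sizeof buf)` fills buf with "fp: array[3] of pointer to function returning int". The declarator part on its own is the easy bit; the sketch works only because the caller has already decided what the base type is.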
void fred()
{
    Marshmallow bert = charlie (7);
}
Reminds me of that beauty in PHP. Not intentional — it was a bug. But still amazing: Bug #61095 PHP can't add hex numbers.
@Cerebus: I think this is sort of getting fruitless and largely outside of the scope of this thread.
C was clearly designed with an expectation that it would be parsed by recursive descent; the Wirthian languages were most likely written with the assumption that a parser generator tool would be to hand, or at least more powerful parsing techniques than recursive descent.
C assumes the use of a table-driven LALR parser.
Quote: C assumes the use of a table-driven LALR parser.
Which is rooted in C's origins, specifically lex and yacc. Even today, a lot of C compiler/interpreter projects use flex and bison, so one could say LALR parsers are kinda-sorta built-in to the C ecosystem; it is therefore not surprising that the standard (implicitly?) assumes one is used to parse C itself.
An understandable mistake because of the association of them all with Unix, but there exists here (Ritchie's description) and here (just source) a "primeval" C compiler from circa 1972, the earliest working C compiler to be preserved, written by Dennis Ritchie, with a hand-written parser and no trace of lex or yacc anywhere in it. Lex was written by Mike Lesk circa 1975, yacc by Stephen Johnson, also circa 1975. Lex and yacc were used for pcc, the Portable C Compiler, but that came later. The original version of yacc was written in B and only later translated to C, so it was available to be used in bootstrapping a C compiler if required, but wasn't.
*Contrary to your personal Finnish expectations he pronounces it "Ay-Ho", I know this because we once corresponded by email** and I asked him explicitly.
Quote: *Contrary to your personal Finnish expectations he pronounces it "Ay-Ho", I know this because we once corresponded by email** and I asked him explicitly.
Contrary to your personal British supremacist expectations, I don't expect people with Finnish surnames and origin, but a different native language, to pronounce their names like Finns do. Most Finns don't. They often say how that name would be pronounced if the person was Finnish, but do not expect them to pronounce it that way. Weird, eh?
When I see "Aho" I expect the Finnish pronunciation - is it so foolish of me to expect a Finn would too? And how the hell do you get to "British supremacist expectations"?
Chip on one's Shoulder
Quote: when I see "Aho" I expect the Finnish pronunciation - is it so foolish of me to expect a Finn would too? And how the hell do you get to "British supremacist expectations"
Exactly that. It's not foolish, just completely wrong. I believe you think so because English is the lingua franca on the internet, and naturally assume it extends to culture (general individual behaviour) as well. It does not.
I get what you're saying, overall. But that bit above still seems arse about face - the "everybody speaks English" assumption, were one to apply Anglophone cultural imperialism, would be to assume that everybody would use the anglicised pronunciations (e.g. Aho => Ay-Ho, Braun => Brawn, Porsche => Poorsh - score one for the Americans, who actually pronounce it like Ferdinand Porsche did), not the other way around, making the effort to get the native pronunciation right.
The assumption that a native of a given language would pronounce a name from that language in the native fashion isn't some form of British cultural imperialism born out of the [typical] British inability to get 'foreign' names pronounced properly and therefore an assumption that the rest of the world would be equally crap at it, it's just a reasonable working assumption that a speaker of any language would make about natives of another country and its language.
Consider it an overblown reaction, then; and that I wouldn't, nor would the majority of Finns I know, assume that a Canadian surname Aho should be pronounced Aho, or is "properly" pronounced Aho, just because the person has Finnish roots.
Quote: Consider it an overblown reaction, then; and that I wouldn't, nor would the majority of Finns I know, assume that a Canadian surname Aho should be pronounced Aho, or is "properly" pronounced Aho, just because the person has Finnish roots.
So noted. Alfred did give me quite a good potted history of the Aho family. From memory, so this may not be accurate (the emails are lost to about 3 mail server upgrades since), the family originally settled in Minnesota USA.
When MAXIMAL MUNCH attacks. "Due to maximal munch, hexadecimal integer literals ending in e and E, when followed by the operators + or -, must be separated from the operator with whitespace or parentheses in the source"
Can you see what is wrong with this, without compiling it?
#include <stdio.h>
int main(int argc, char *argv[]) {
    printf("The next number after 0xE is 0x%X\n", 0xE+1);
}
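The rule behind this can be sketched as a small scanner. This is a simplified reading of the preprocessing-number grammar (C99 6.4.8), ignoring universal character names; `pp_number_len` is a hypothetical helper, not part of any compiler.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the preprocessing number at the start of s, or 0 if s does
   not start with one. Simplified: no universal character names. */
size_t pp_number_len(const char *s) {
    size_t i;
    if (isdigit((unsigned char)s[0])) {
        i = 1;
    } else if (s[0] == '.' && isdigit((unsigned char)s[1])) {
        i = 2;
    } else {
        return 0;
    }
    for (;;) {
        unsigned char c = (unsigned char)s[i];
        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P') &&
            (s[i + 1] == '+' || s[i + 1] == '-')) {
            i += 2;   /* exponent letter plus sign is swallowed greedily */
        } else if (isalnum(c) || c == '_' || c == '.') {
            i += 1;   /* digits, letters, underscore and dot all continue the token */
        } else {
            return i;
        }
    }
}
```

With this, `pp_number_len("0xE+1")` is 5, so the whole of 0xE+1 is taken as a single token that is not a valid integer literal, whereas `pp_number_len("0xF+1")` is 3 and the + is seen as an operator. The scanner does not know or care that the number is hexadecimal; it just sees an E followed by a sign.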
No return value!
Quote: No return value!
While this is a code smell, in C (since C99) this is a well-defined function returning 0. main is a very special beast in C (and even weirder in C++). If main returns int, reaching the terminating } is equivalent to calling exit(0).(1)
____
(1) 5.1.2.2.3§1