Again Cobol:

Mark Taylor
I'm working on a Cobol grammar (oh, the foolishness of youth...  wait I'm not THAT young...) and I need some advice about the ambiguities.  In particular I'm getting the famous: "error(211): CobolTest.g:11:30: [fatal] rule if has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2.  Resolve by left-factoring or using syntactic predicates or using backtrack=true option."  Yes, this has come up before, but there was no clear answer.  This time I have a specific example (see below).

Below is the smallest grammar that exhibits the problem. You can see I have stmt+ in both the IF rule and the PERFORM rule. The problem is the 'END-IF'?. Since END-IF is optional in Cobol, there is no clear scope terminator. I've tried left-factoring the (stmt+ ....) into a separate rule for both IF and PERFORM, but that doesn't seem to work either. I don't see how a syntactic predicate could be applied here either.

If I were writing this as a recursive descent parser by hand (I'm trying Antlr so I don't have to do this), I would write a statementlist() method that would simply loop on all statement-beginning keywords. Then when an END-IF, ELSE, END-PERFORM, or some other arbitrary scope terminator appeared in the input, the loop would simply exit and return the list of valid statements. The question is: how to get Antlr to behave like this?
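The hand-written approach described above could be sketched roughly as follows, in Python rather than ANTLR. It covers only the three statements of the test grammar below; the names (tokenize, Parser, statement_list, parse_program) and the tuple shapes it returns are invented for illustration. The point is the loop in statement_list: it continues while the next token can start a statement and otherwise returns, leaving ELSE, END-IF, END-PERFORM or the period to the caller.

```python
import re

# Keywords that can begin a statement in the test grammar.
STATEMENT_STARTS = {"if", "move", "perform"}


def tokenize(text):
    # Words (including hyphenated keywords such as end-if), '==' and '.'.
    return re.findall(r"[A-Za-z][A-Za-z-]*|==|\.", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, value):
        if self.peek() != value:
            raise SyntaxError(f"expected {value!r}, got {self.peek()!r}")
        self.pos += 1

    def accept(self, value):
        # Consume an optional token; report whether it was present.
        if self.peek() == value:
            self.pos += 1
            return True
        return False

    def ident(self):
        # Simplification: any purely alphabetic word counts as an ID.
        tok = self.peek()
        if tok is None or not tok.isalpha():
            raise SyntaxError(f"expected an identifier, got {tok!r}")
        self.pos += 1
        return tok

    def statement_list(self):
        # Loop on statement-beginning keywords; anything else ends the list.
        stmts = []
        while self.peek() in STATEMENT_STARTS:
            stmts.append(self.statement())
        if not stmts:
            raise SyntaxError(f"expected a statement, got {self.peek()!r}")
        return stmts

    def statement(self):
        tok = self.peek()
        if tok == "if":
            return self.if_stmt()
        if tok == "move":
            return self.move_stmt()
        return self.perform_stmt()

    def if_stmt(self):
        self.expect("if")
        left = self.ident()
        self.expect("==")
        right = self.ident()
        self.accept("then")
        then_branch = self.statement_list()
        else_branch = self.statement_list() if self.accept("else") else []
        self.accept("end-if")  # optional scope terminator
        return ("if", (left, right), then_branch, else_branch)

    def move_stmt(self):
        self.expect("move")
        source = self.ident()
        self.expect("to")
        return ("move", source, self.ident())

    def perform_stmt(self):
        self.expect("perform")
        body = self.statement_list()
        self.expect("end-perform")
        return ("perform", body)

    def sentence(self):
        stmts = self.statement_list()
        self.expect(".")
        return stmts


def parse_program(text):
    # program: sentence+ EOF
    parser = Parser(tokenize(text))
    sentences = [parser.sentence()]
    while parser.peek() is not None:
        sentences.append(parser.sentence())
    return sentences
```

With this sketch, parse_program("if A == B then move Y to Z move X to Y.") attaches both MOVE statements to the IF, since nothing before the period ends the inner list.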

Any advice is appreciated.

<pre>

grammar CobolTest;

program:     sentence+ EOF;

sentence:    stmt+ '.' ;

stmt:        if
        |    move
        |    perform ;

if:            'if' condition 'then'? stmt+ ('else' stmt+)? 'end-if'? ;

move:        'move' ID 'to' ID ;

perform:     'perform' stmt+ 'end-perform' ;

condition:     ID '==' ID ;

ID  :       ('a'..'z'|'A'..'Z')+ ;
INT :       '0'..'9'+ ;
NEWLINE:    '\r'? '\n' {skip();} ;
WS  :       (' '|'\t')+ {skip();} ;

</pre>


List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

Re: Again Cobol:

Dave Dutcher
I'm new to Antlr myself, so maybe you already know everything I'm going to suggest.
 
So your grammar has trouble with input like
 
if A == B then move Y to Z move X to Y.
 
And you're suggesting that the parser should just consume the move X to Y as part of the if statement, even though it could also be interpreted as another separate statement? This reminds me of the classic IF IF ELSE (dangling else) problem, but I don't have my Antlr book with me to look up how that is usually solved.
 
One method might be to just turn on backtracking. Otherwise I would think you could add syntactic predicates like ((stmt)=>stmt)+ which, as I understand it, would make Antlr consume all the statements it can as that part of the rule. I haven't tested this though.
 
Dave
 
 


In reply to Mark Taylor's message of Thursday, June 11, 2009 7:39 AM, subject "[antlr-interest] Again Cobol:".


Re: Again Cobol:

Mark Taylor
Dave: Yes, Cobol is that evil.  The period closes all previous scopes that don't require explicit scope terminators (some constructs like PERFORM do, some don't (COMPUTE)).  Thanks, that worked.  I replaced the PERFORM rule with:

perform:     'perform' ((stmt)=>stmt)+ 'end-perform' ;

and I no longer get the error.

Now some rhetorical questions: why does it work? Or, why is it not obvious? Where are syntactic predicates explained? Is this in the online docs somewhere I can't find? Clearly I have some studying to do. I haven't picked up T.P.'s book yet but I aim to.



In reply to Dave Dutcher's message of Thu, Jun 11, 2009 at 8:42 AM.

Re: Again Cobol:

Dave Dutcher
I read about syntactic predicates in TP's book. Looking over the online docs, I can't really find where they are explained either. A Google search did turn up this article, which looks like it might answer some of your questions: http://www.antlr.org/wiki/display/ANTLR3/How+to+remove+global+backtracking+from+your+grammar
 
Dave
 


In reply to Mark Taylor's message of Thursday, June 11, 2009 9:34 AM, subject "Re: [antlr-interest] Again Cobol:".


Re: Again Cobol:

Jim Idle
In reply to this post by Mark Taylor
Mark Taylor wrote:
> Dave: Yes, Cobol is that evil.  The period closes all previous scopes
> that don't require explicit scope terminators (some constructs like
> PERFORM do, some don't (COMPUTE)).  Thanks, that worked.  I replaced
> the PERFORM rule with:
>
> perform:     'perform' ((stmt)=>stmt)+ 'end-perform' ;
That will parse every stmt twice and performance will be terrible. Make
a predicate rule instead:

stmt_pred
    : 'if' | 'move' | 'foo'
    ;


((stmt_pred)=>stmt)+

Jim
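The cost difference Jim describes can be illustrated with a toy model in Python. This is not ANTLR's generated code: Tokens, stmt, full_predicate_loop, first_token_loop and count_matches are hypothetical names, and the language is cut down to MOVE and PERFORM. One loop mimics ((stmt)=>stmt)+ by speculatively parsing a whole statement, rewinding, and parsing it again; the other mimics ((stmt_pred)=>stmt)+ by checking only the first token. The counter records every token match, speculative or not.

```python
STARTS = ("move", "perform")


class Tokens:
    def __init__(self, items):
        self.items = items
        self.pos = 0
        self.matched = 0  # every successful match, including speculative ones

    def peek(self):
        return self.items[self.pos] if self.pos < len(self.items) else None

    def match(self, value=None):
        tok = self.peek()
        if tok is None or (value is not None and tok != value):
            raise SyntaxError(f"expected {value!r}, got {tok!r}")
        self.pos += 1
        self.matched += 1
        return tok


def stmt(ts, loop):
    # stmt: 'perform' <loop> 'end-perform' | 'move' ID 'to' ID
    if ts.peek() == "perform":
        ts.match("perform")
        loop(ts)
        ts.match("end-perform")
    else:
        ts.match("move")
        ts.match()
        ts.match("to")
        ts.match()


def full_predicate_loop(ts):
    # Models ((stmt)=>stmt)+ : try a whole stmt, rewind, then parse it for real.
    parsed = 0
    while True:
        start = ts.pos
        try:
            stmt(ts, full_predicate_loop)  # speculative attempt
        except SyntaxError:
            ts.pos = start
            break
        ts.pos = start                     # rewind after a successful guess
        stmt(ts, full_predicate_loop)      # the parse that counts
        parsed += 1
    if parsed == 0:
        raise SyntaxError("expected a statement")


def first_token_loop(ts):
    # Models ((stmt_pred)=>stmt)+ : one token of lookahead decides.
    if ts.peek() not in STARTS:
        raise SyntaxError("expected a statement")
    while ts.peek() in STARTS:
        stmt(ts, first_token_loop)


def count_matches(loop, text):
    ts = Tokens(text.split())
    loop(ts)
    if ts.peek() is not None:
        raise SyntaxError(f"unexpected {ts.peek()!r}")
    return ts.matched
```

For a PERFORM containing two MOVEs (10 tokens), this model counts 10 matches with the first-token check and 36 with the full-statement predicate, and each extra level of nesting roughly doubles the latter. Real ANTLR output differs in detail (memoization can be enabled, for example), so the numbers are illustrative only.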


Re: Again Cobol:

Mark Taylor
In reply to this post by Dave Dutcher
Thanks Dave and Jim, that helps a lot!

In reply to Dave Dutcher's message of Thu, Jun 11, 2009 at 10:10 AM.