[sv-ac] semantics issues


Subject: [sv-ac] semantics issues
From: John Havlicek (john.havlicek@motorola.com)
Date: Wed Apr 09 2003 - 19:50:20 PDT


All:

> Will tune expansion to align with semantic defintion:
> (1 [*0:$](sequence_expr1) ##[0:$] 1 ) intersect sequence_expr2
> Detailed discussion of matching of empty words and whether the overlapping
> concatentation absorbs the empty word. It has been defined that it does, but
> John sees an associativity. John will own this as part of semantics effort.

In my opinion, the semantics of within should be defined as follows:

   sequence_expr1 within sequence_expr2

means

   (1[*0:$] ##1 sequence_expr1 ##1 1[*0:$]) intersect sequence_expr2

This is the way it is defined in the semantics document we are finishing
up.

Note that for any sequence expression S, the following are equivalent:

1. (1[*0] ##1 S)
2. (S ##1 1[*0])
3. S

This is the usual way non-overlapping concatenation interacts with
a regular expression matching the empty word.

In my opinion, we should not use the "##0" operator in the definition
of within. "##0" can match only if both its operands match non-empty
words. This is because "##0" requires the last letter of the lhs word
to overlap with the first letter of the rhs word, so, a fortiori, both
words must have at least one letter. Thus,

   (1[*0:$] ##0 S ##0 1[*0:$]) intersect T

is equivalent to
  
   (1[*1:$] ##0 (1 and S) ##0 1[*1:$]) intersect T

which forbids S, hence T, from matching empty.

Regarding association of "##0" and "##1", note that

   (R ##0 S) ##1 T

is equivalent to

   ((1 and R) ##0 (1 and S)) ##1 T

Both R and S must match non-empty words to get a match. On the
other hand,

   R ##0 (S ##1 T)

is equivalent to

   (1 and R) ##0 (1 and (S ##1 T))

Here R must match non-empty, but only the concatenation S ##1 T,
rather than S specifically, must match non-empty. To show this
difference more concretely, if I write

   a ##0 b[*0:$] ##1 c

and rely on left-association of operators ##0 and ##1 at the
same precedence, what I get is equivalent to

   a ##0 b[*1:$] ##1 c

and this is not equivalent to

   a ##0 (b[*0:$] ##1 c)

It has been suggested to me that pushing the precedence of
##0 lower than ##m m > 0 is problematic for parsing.

Perhaps we should consider changing the associativity for
concatenation from left to right. What is the PSL convention
for associativity?

Best regards,

John Havlicek



This archive was generated by hypermail 2b28 : Wed Apr 09 2003 - 19:51:48 PDT