Eigenstate: myrddin-dev mailing list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: A reference of the match compiler?


>> My question is, *should the Dtree of a match statement contain more than one 'any' node following the design of the algorithm*? Is that expected?
>> As I understand(guess), it means the 'default' case of a switch statement.
>> Am I correct?
>> 
Looking closer into it, I realized that I was wrong.
Now, I am seeing it as a placeholder for wildcard variables, representing an unconditional branch (in contrast to a compare/conditional branch).

On Fri, May 10, 2019, at 7:35 PM, Mura Li wrote:
> I tried to log the Dtree with more details to help understand it.
> The logging routines can be found at https://github.com/oridb/mc/compare/master...typeless:match-expri
> (I guess the following log snippet can be understood by the author of the algorithm without looking at the logging code. :D )
> 
> Given the following test case:
>> match (0, 0, 0)
>> | (0, 1, 3) : 1
>> | (3, 4, _) : 2
>> | _ : 3
>> ;;
> 
> I got
>> + .L4 (Otup) -> accept:.L0 any:(null) .next:[ ] end:[ ]
>>  + .L4 (Olit) -> accept:.L5 any:(null) .next:[ ] end:[ ]
>>  - .any:(null) .next:[ .L5 ] end:[ .L5 ]
>>  + .L5 (Olit) -> accept:.L6 any:(null) .next:[ ] end:[ ]
>>  - .any:(null) .next:[ .L6 ] end:[ .L6 ]
>>  + .L6 (Olit) -> accept:.L0 any:(null) .next:[ ] end:[ ]
>>  - .any:(null) .next:[ .L0 ] end:[ .L0 ]
>> - .any:(null) .next:[ .L5 ] end:[ .L0 ]
>> + .L4 (Otup) -> accept:.L1 any:(null) .next:[ .L5 ] end:[ .L0 ]
>>  + .L4 (Olit) -> accept:.L7 any:(null) .next:[ .L5 ] end:[ ]
>>  - .any:(null) .next:[ .L5 .L7 ] end:[ .L7 ]
>>  + .L7 (Olit) -> accept:.L8 any:(null) .next:[ ] end:[ ]
>>  - .any:(null) .next:[ .L8 ] end:[ .L8 ]
>>  + .L8 (Ogap) -> accept:.L1 any:(null) .next:[ ] end:[ ]
>>  + .L8 (wildrec) -> accept:.L1 any:(null) .next:[ ] end:[ ]
>>  - .any:.L1 .next:[ ] end:[ .L1 ]
>>  - .any:.L1 .next:[ ] end:[ .L1 ]
>> - .any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>> + .L4 (Ogap) -> accept:.L2 any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>>  + .L4 (wildrec) -> accept:.L2 any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>>  + .L4 (wildrec) -> accept:.L9 any:(null) .next:[ .L5 .L7 ] end:[ ]
>>  - .any:.L9 .next:[ .L5 .L7 ] end:[ .L5 .L7 .L9 ]
>>  + .L5 (wildrec) -> accept:.L10 any:(null) .next:[ .L6 ] end:[ ]
>>  - .any:.L10 .next:[ .L6 ] end:[ .L6 .L10 ]
>>  + .L7 (wildrec) -> accept:.L10 any:(null) .next:[ .L8 ] end:[ .L6 .L10 ]
>>  - .any:.L10 .next:[ .L8 ] end:[ .L6 .L10 .L8 .L10 ]
>>  + .L9 (wildrec) -> accept:.L10 any:(null) .next:[ ] end:[ .L6 .L10 .L8 .L10 ]
>>  - .any:.L10 .next:[ ] end:[ .L6 .L10 .L8 .L10 .L10 ]
>>  + .L6 (wildrec) -> accept:.L2 any:(null) .next:[ .L0 ] end:[ ]
>>  - .any:.L2 .next:[ .L0 ] end:[ .L0 .L2 ]
>>  + .L10 (wildrec) -> accept:.L2 any:(null) .next:[ ] end:[ .L0 .L2 ]
>>  - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 ]
>>  + .L8 (wildrec) -> accept:.L2 any:.L1 .next:[ ] end:[ .L0 .L2 .L2 ]
>>  - .any:.L1 .next:[ ] end:[ .L0 .L2 .L2 .L1 ]
>>  + .L10 (wildrec) -> accept:.L2 any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 ]
>>  - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 ]
>>  + .L10 (wildrec) -> accept:.L2 any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 ]
>>  - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 .L2 ]
>>  - .any:.L9 .next:[ .L5 .L7 ] end:[ .L0 .L1 .L0 .L2 .L2 .L1 .L2 .L2 ]
>> - .any:.L9 .next:[ .L5 .L7 ] end:[ .L0 .L1 .L0 .L2 .L2 .L1 .L2 .L2 ]
> 
> As you can see, the last 'gap' pattern of the match statement expands into 9 Dtree entries.
> I believe that the code related to those entries is at https://github.com/oridb/mc/blob/master/mi/match.c#L292
> 
> My question is, *should the Dtree of a match statement contain more than one 'any' node following the design of the algorithm*? Is that expected?
> As I understand(guess), it means the 'default' case of a switch statement.
> Am I correct?
> 
> 
> On Thu, May 9, 2019, at 4:25 AM, Ori Bernstein wrote:
>> On Wed, 08 May 2019 04:49:49 -0400, "Mura Li" <mail@xxxxxxx> wrote:
>> 
>> > What is the algorithm on which the current match compiler (mi/match.c) is based?
>> 
>> A thing that came out of my head. It's kind of janky -- the idea is that it's
>> based on incrementally building up a DFA. So, for example:
>> 
>> match x
>> | `Some (123, a): do1
>> | `Some (234, 789): do2
>> | `Some (234, b): do3
>> | `Some _: do4
>> | `None do5
>> ;;
>> 
>> Gets compiled incrementally to an automaton like so:
>> 
>> 
>> First match added: no multiple tails to attach to, just go through
>> and create a linear set of states.
>> 
>>  (tag(x) == `Some) ---> (some.val == 123) ---> wildcard --> do1
>> 
>> Second one matches branches after the tag:
>> 
>>  (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>>  \
>>  +-> (some.val == 234) ---> 789 --> do2
>> 
>> Third one adds in a wildcard, so we tack that on:
>> 
>>  (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>>  \
>>  +-> (some.val == 234) -----> 789 --> do2
>>  \
>>  +-> wildcard --> do3
>> 
>> the fourth one adds wildcards for all tuple, so we need to walk down
>> all branches and tack it in:
>> 
>>  +-----------> wildcard -> do4
>>  /
>>  (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>>  \ 
>>  \
>>  +-> (some.val == 234) -----> 789 --> do2
>>  | \
>>  | +-> wildcard --> do3
>>  \
>>  \---> wildcard -----------> wildcard --> do4
>> 
>> And finally, we add the none in.
>> 
>> > Any reference paper or book about it?
>> 
>> 
>> Not that I'm aware of.
>> 
>> > I have looked into a few from the internet, but they do not look like a one-to-one translation.
>> 
>> Yeah, it's a confusing stateful mess. I'd be happy to tear it out
>> and looking at put in a new one, possibly based on Augustsson's
>> paper on how to compile pattern matching.
>> 
>> -- 
>>  Ori Bernstein
>> 
> 

Follow-Ups:
Re: A reference of the match compiler?Ori Bernstein <ori@xxxxxxxxxxxxxx>
References:
A reference of the match compiler?"Mura Li" <mail@xxxxxxx>
Re: A reference of the match compiler?Ori Bernstein <ori@xxxxxxxxxxxxxx>
Re: A reference of the match compiler?"Mura Li" <mail@xxxxxxx>