[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: A reference of the match compiler?
- Subject: Re: A reference of the match compiler?
- From: "Mura Li" <mail@xxxxxxx>
- Reply-to: myrddin-dev@xxxxxxxxxxxxxx
- Date: Fri, 10 May 2019 08:02:33 -0400
- To: myrddin-dev@xxxxxxxxxxxxxx, "Ori Bernstein" <ori@xxxxxxxxxxxxxx>
>> My question is, *should the Dtree of a match statement contain more than one 'any' node following the design of the algorithm*? Is that expected?
>> As I understand(guess), it means the 'default' case of a switch statement.
>> Am I correct?
>>
Looking closer into it, I realized that I was wrong.
Now, I am seeing it as a placeholder for wildcard variables, representing an unconditional branch (in contrast to a compare/conditional branch).
On Fri, May 10, 2019, at 7:35 PM, Mura Li wrote:
> I tried to log the Dtree with more details to help understand it.
> The logging routines can be found at https://github.com/oridb/mc/compare/master...typeless:match-expri
> (I guess the following log snippet can be understood by the author of the algorithm without looking at the logging code. :D )
>
> Given the following test case:
>> match (0, 0, 0)
>> | (0, 1, 3) : 1
>> | (3, 4, _) : 2
>> | _ : 3
>> ;;
>
> I got
>> + .L4 (Otup) -> accept:.L0 any:(null) .next:[ ] end:[ ]
>> + .L4 (Olit) -> accept:.L5 any:(null) .next:[ ] end:[ ]
>> - .any:(null) .next:[ .L5 ] end:[ .L5 ]
>> + .L5 (Olit) -> accept:.L6 any:(null) .next:[ ] end:[ ]
>> - .any:(null) .next:[ .L6 ] end:[ .L6 ]
>> + .L6 (Olit) -> accept:.L0 any:(null) .next:[ ] end:[ ]
>> - .any:(null) .next:[ .L0 ] end:[ .L0 ]
>> - .any:(null) .next:[ .L5 ] end:[ .L0 ]
>> + .L4 (Otup) -> accept:.L1 any:(null) .next:[ .L5 ] end:[ .L0 ]
>> + .L4 (Olit) -> accept:.L7 any:(null) .next:[ .L5 ] end:[ ]
>> - .any:(null) .next:[ .L5 .L7 ] end:[ .L7 ]
>> + .L7 (Olit) -> accept:.L8 any:(null) .next:[ ] end:[ ]
>> - .any:(null) .next:[ .L8 ] end:[ .L8 ]
>> + .L8 (Ogap) -> accept:.L1 any:(null) .next:[ ] end:[ ]
>> + .L8 (wildrec) -> accept:.L1 any:(null) .next:[ ] end:[ ]
>> - .any:.L1 .next:[ ] end:[ .L1 ]
>> - .any:.L1 .next:[ ] end:[ .L1 ]
>> - .any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>> + .L4 (Ogap) -> accept:.L2 any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>> + .L4 (wildrec) -> accept:.L2 any:(null) .next:[ .L5 .L7 ] end:[ .L0 .L1 ]
>> + .L4 (wildrec) -> accept:.L9 any:(null) .next:[ .L5 .L7 ] end:[ ]
>> - .any:.L9 .next:[ .L5 .L7 ] end:[ .L5 .L7 .L9 ]
>> + .L5 (wildrec) -> accept:.L10 any:(null) .next:[ .L6 ] end:[ ]
>> - .any:.L10 .next:[ .L6 ] end:[ .L6 .L10 ]
>> + .L7 (wildrec) -> accept:.L10 any:(null) .next:[ .L8 ] end:[ .L6 .L10 ]
>> - .any:.L10 .next:[ .L8 ] end:[ .L6 .L10 .L8 .L10 ]
>> + .L9 (wildrec) -> accept:.L10 any:(null) .next:[ ] end:[ .L6 .L10 .L8 .L10 ]
>> - .any:.L10 .next:[ ] end:[ .L6 .L10 .L8 .L10 .L10 ]
>> + .L6 (wildrec) -> accept:.L2 any:(null) .next:[ .L0 ] end:[ ]
>> - .any:.L2 .next:[ .L0 ] end:[ .L0 .L2 ]
>> + .L10 (wildrec) -> accept:.L2 any:(null) .next:[ ] end:[ .L0 .L2 ]
>> - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 ]
>> + .L8 (wildrec) -> accept:.L2 any:.L1 .next:[ ] end:[ .L0 .L2 .L2 ]
>> - .any:.L1 .next:[ ] end:[ .L0 .L2 .L2 .L1 ]
>> + .L10 (wildrec) -> accept:.L2 any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 ]
>> - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 ]
>> + .L10 (wildrec) -> accept:.L2 any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 ]
>> - .any:.L2 .next:[ ] end:[ .L0 .L2 .L2 .L1 .L2 .L2 ]
>> - .any:.L9 .next:[ .L5 .L7 ] end:[ .L0 .L1 .L0 .L2 .L2 .L1 .L2 .L2 ]
>> - .any:.L9 .next:[ .L5 .L7 ] end:[ .L0 .L1 .L0 .L2 .L2 .L1 .L2 .L2 ]
>
> As you can see, the last 'gap' pattern of the match statement expands into 9 Dtree entries.
> I believe that the code related to those entries is at https://github.com/oridb/mc/blob/master/mi/match.c#L292
>
> My question is, *should the Dtree of a match statement contain more than one 'any' node following the design of the algorithm*? Is that expected?
> As I understand(guess), it means the 'default' case of a switch statement.
> Am I correct?
>
>
> On Thu, May 9, 2019, at 4:25 AM, Ori Bernstein wrote:
>> On Wed, 08 May 2019 04:49:49 -0400, "Mura Li" <mail@xxxxxxx> wrote:
>>
>> > What is the algorithm on which the current match compiler (mi/match.c) is based?
>>
>> A thing that came out of my head. It's kind of janky -- the idea is that it's
>> based on incrementally building up a DFA. So, for example:
>>
>> match x
>> | `Some (123, a): do1
>> | `Some (234, 789): do2
>> | `Some (234, b): do3
>> | `Some _: do4
>> | `None do5
>> ;;
>>
>> Gets compiled incrementally to an automaton like so:
>>
>>
>> First match added: no multiple tails to attach to, just go through
>> and create a linear set of states.
>>
>> (tag(x) == `Some) ---> (some.val == 123) ---> wildcard --> do1
>>
>> Second one matches branches after the tag:
>>
>> (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>> \
>> +-> (some.val == 234) ---> 789 --> do2
>>
>> Third one adds in a wildcard, so we tack that on:
>>
>> (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>> \
>> +-> (some.val == 234) -----> 789 --> do2
>> \
>> +-> wildcard --> do3
>>
>> the fourth one adds wildcards for all tuple, so we need to walk down
>> all branches and tack it in:
>>
>> +-----------> wildcard -> do4
>> /
>> (tag(x) == `Some) -----> (some.val == 123) ---> wildcard --> do1
>> \
>> \
>> +-> (some.val == 234) -----> 789 --> do2
>> | \
>> | +-> wildcard --> do3
>> \
>> \---> wildcard -----------> wildcard --> do4
>>
>> And finally, we add the none in.
>>
>> > Any reference paper or book about it?
>>
>>
>> Not that I'm aware of.
>>
>> > I have looked into a few from the internet, but they do not look like a one-to-one translation.
>>
>> Yeah, it's a confusing stateful mess. I'd be happy to tear it out
>> and looking at put in a new one, possibly based on Augustsson's
>> paper on how to compile pattern matching.
>>
>> --
>> Ori Bernstein
>>
>