Eigenstate : CLI Parsing

Summary: Command Line Parsing

Command line parsing is something that nearly every program needs. This section of libstd provides simple command line parsing with autogenerated help.

pkg std =
        type optdef = struct
                argdesc : byte[:]   /* the description for the usage */
                minargs : std.size  /* the minimum number of positional args */
                maxargs : std.size  /* the maximum number of positional args (0 = unlimited) */
                noargs  : std.bool  /* reject all positional args when true */
                opts    : optdesc[:]    /* the description of the options */
        ;;
        type optdesc = struct
                opt : char
                arg : byte[:]
                desc    : byte[:]
                optional    : bool
                dest    : std.option(byte[:]#)  /* destination for option */
        ;;
        type optparsed = struct
                opts    : (char, byte[:])[:]
                args    : byte[:][:]
                prog    : byte[:]
        ;;

        const optparse  : (optargs : byte[:][:], def : optdef# -> optparsed)
        const optusage  : (prog : byte[:], def : optdef# -> void)
;;

Syntax

A command line is composed of a list of words, known as "arguments". Some of these arguments may be options that the program can act on, known as "flags". A flag may take at most one argument of its own. To avoid confusion with the top level arguments, this document refers to a flag's argument as a "value". Anything that is not a flag is a "positional argument", or simply an "argument".

In general, the POSIX syntax for arguments is followed, with a few minor enhancements. Myrddin programs use the following semantics for command line options:

Types

The API provided for command line parsing is relatively declarative, with the options specified in a struct passed to the parsing function.

type optdef = struct
    argdesc : byte[:]
    minargs : std.size
    maxargs : std.size
    noargs  : std.bool
    opts    : optdesc[:]
;;

The optdef is the top level structure describing the command line arguments. It contains the following fields:

argdesc

Argdesc is a string describing the positional arguments passed to the program. It doesn't change the way that the arguments are parsed, but is used to document the arguments to the user.

In general, this should be a summary of the expected arguments. If a variable number of them is expected, the argument should be followed by "...".

For example, a program that takes an output and a list of inputs may provide the following for argdesc:

"output inputs..."

When the help string is generated, the output would look like:

myprog [-h?] [-o option] output inputs...

minargs

This field sets the minimum number of positional arguments that the program will accept without error. If at least 3 inputs are needed, for example, then this value should be set to 3. This does not count flags, nor does it count the program name.

If set to 0, this value is ignored. This is the default value.

maxargs

This field sets the maximum number of positional arguments that the program will accept without error. If the program takes at most 1 argument, for example, then this value should be set to 1. Just like minargs, this does not count flags or the program name.

If set to 0, this value is ignored. This is the default value.
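As a rough sketch of how minargs and maxargs combine, the definition below accepts two or three positional arguments and should reject anything else. The program and its argument names are hypothetical; only the field names come from the optdef struct above.

```myrddin
use std

const main = {args
    var cmd

    /* hypothetical tool: one output, then one or two inputs */
    cmd = std.optparse(args, &[
        .argdesc = "output input [input2]",
        .minargs = 2,   /* fewer than 2 positional args is an error */
        .maxargs = 3,   /* more than 3 positional args is an error */
    ])
    for arg : cmd.args
        std.put("arg: {}\n", arg)
    ;;
}
```

Leaving either field out keeps it at 0, which disables that check.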

noargs

When set, this field causes the program to reject any positional arguments at all.

opts

This is a list of descriptions of the options that this program takes. This list may be empty, in which case this API still provides a good way of checking that no invalid arguments are passed.

type optdesc = struct
    opt : char
    arg : byte[:]
    desc    : byte[:]
    optional    : bool
    dest    : std.option(byte[:]#)  /* destination for option */
;;

This is a description of a single command line option. It contains the following fields to be set by the user:

opt

This is a single Unicode character that is used for the option flag.

arg

This is a single word description of the argument. If it is not present or has zero length, this indicates that the flag takes no value. Otherwise, the value is mandatory, unless the optional flag is set.

optional

This is a boolean that allows for the value arg to be optionally omitted when using the flag. It is disabled by default.

desc

This is a short sentence describing arg. It has no semantic effect on the option parsing, and is only used in generating help output for the arguments.
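The three kinds of flag described so far can be sketched as follows. The flags -v, -o and -c are made up for illustration, and the dest field is left unset, as it is in the examples at the end of this document.

```myrddin
use std

const main = {args
    var cmd

    cmd = std.optparse(args, &[
        .argdesc = "files...",
        .opts = [
            /* no .arg: -v takes no value */
            [.opt='v', .desc="print verbose output"],
            /* .arg is set: -o must be given a value */
            [.opt='o', .arg="out", .desc="write output to 'out'"],
            /* .arg and .optional are set: the value of -c may be omitted */
            [.opt='c', .arg="when", .optional=true, .desc="colorize output"],
        ][:]
    ])
    for arg : cmd.args
        std.put("file: {}\n", arg)
    ;;
}
```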

dest

If there is an arg parameter, and this value is `Some ptr, then the value passed with the flag is stored through the pointer.

type optparsed = struct
        opts    : (char, byte[:])[:]
        args    : byte[:][:]
        prog        : byte[:]
;;

This is the final result of parsing the options. The opts member contains a list of options in the form of (opt, val) pairs. The option opt will be repeated once for every time that the flag opt is seen within the command line.

If there is no value passed with the flag, then the string will be the empty string. Otherwise, it will contain the string passed.

The args member contains the arguments, collected for easy iteration, and the prog member contains the binary name.
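A minimal sketch of consuming an optparsed, assuming the hypothetical flags -v (no value) and -o (with a value). It follows the matching style of the mbld example at the end of this document: a flag without a value is matched against the empty string.

```myrddin
use std

const main = {args
    var cmd

    cmd = std.optparse(args, &[
        .argdesc = "files...",
        .opts = [
            [.opt='v', .desc="print verbose output"],
            [.opt='o', .arg="out", .desc="write output to 'out'"],
        ][:]
    ])
    std.put("program: {}\n", cmd.prog)
    /* one (opt, val) pair per occurrence of a flag */
    for opt : cmd.opts
        match opt
        | ('v', ""):    std.put("verbose requested\n")
        | ('o', out):   std.put("output goes to {}\n", out)
        | _:    std.die("unreachable\n")
        ;;
    ;;
    /* positional arguments */
    for arg : cmd.args
        std.put("positional: {}\n", arg)
    ;;
}
```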

Functions

const optparse  : (optargs : byte[:][:], def : optdef# -> optparsed)

Optparse takes an array optargs containing the command line arguments passed to the program, as well as an optdef pointer describing the expected arguments, and returns an optparsed. The optargs array is expected to contain the program name.

const optusage  : (prog : byte[:], def : optdef# -> void)

Optusage takes the string prog containing the program name, and a pointer def to an optdef which describes the arguments to provide help for. It prints the help text on stderr (fd 2), and returns.
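One way the two functions might be used together is sketched below: the optdef is kept in a variable so that it can be passed to both optparse and optusage. The rule that at least one flag must be given is a hypothetical policy invented for this sketch, not something the library enforces.

```myrddin
use std

const main = {args
    var def : std.optdef
    var cmd

    def = [
        .argdesc = "inputs...",
        .minargs = 1,
        .opts = [
            [.opt='o', .arg="out", .desc="write output to 'out'"],
        ][:]
    ]
    cmd = std.optparse(args, &def)
    /* hypothetical policy: print the help text if no flags were given */
    if cmd.opts.len == 0
        std.optusage(cmd.prog, &def)
        std.exit(1)
    ;;
}
```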

Examples:

This example is a trivial one, which parses no flags, and merely errors if given any.

use std

const main = {args
    var cmd

    cmd = std.optparse(args, &[
        .argdesc = "vals",
    ])
    for arg : cmd.args
        std.put("arg: {}\n", arg)
    ;;
}

This example shows some more advanced usage, and is extracted from mbld.

const main = {args
    var dumponly
    var targname
    var bintarg
    var cmd
    var libpath
    var tags

    cmd = std.optparse(args, &[
        .argdesc = "[inputs...]",
        .opts = [
            [.opt='t', .desc="list all available targets"],
            [.opt='T', .arg="tag", .desc="build with specified systag"],
            [.opt='S', .desc="generate assembly when building"],
            [.opt='d', .desc="dump debugging information for mbld"],
            [.opt='I', .arg="inc", .desc="add 'inc' to your include path"],
            [.opt='R', .arg="root", .desc="install into 'root'"],
            [.opt='b', .arg="bin", .desc="compile binary named 'bin' from inputs"],
            [.opt='l', .arg="lib", .desc="compile lib named 'lib' from inputs"],
            [.opt='r', .arg="rt", .desc="link against runtime 'rt' instead of default"],
            [.opt='C', .arg="mc", .desc="compile with 'mc' instead of the default compiler"],
            [.opt='M', .arg="mu", .desc="merge uses with 'mu' instead of the default muse"],
        ][:]
    ])
    targname = ""
    tags = [][:]
    for opt : cmd.opts
        match opt
        | ('t', ""):    dumponly = true
        | ('S', ""):    bld.opt_genasm = true
        | ('I', arg):   bld.opt_incpaths = std.slpush(bld.opt_incpaths, arg)
        | ('R', arg):   bld.opt_instroot = arg
        | ('T', tag):   tags = std.slpush(tags, tag)
        | ('b', arg):
            targname = arg
            bintarg = true
        | ('l', arg):
            targname = arg
            bintarg = false
        | ('r', arg):
            if std.sleq(arg, "none")
                bld.opt_runtime = ""
            else
                bld.opt_runtime = arg
            ;;
        /*
        internal undocumented args; used by compiler suite for
        building with an uninstalled compiler.
        */
        | ('d', arg): bld.opt_debug = true
        | ('C', arg): bld.opt_mc = arg
        | ('M', arg): bld.opt_muse = arg
        | _:    std.die("unreachable\n")
        ;;
    ;;

    for arg : cmd.args
        /* build stuff */
    ;;
}