Monday, August 23, 2010

Ad hoc interface matching in ParaSail

When a module is instantiated in ParaSail to form a type, actual parameters must be specified for each module formal parameter.  When the module formal is itself a type, the module actual must be a type that matches the formal appropriately.   For example, given:

interface Set<Element_Type is Assignable<>> is
    function Union(Left, Right : Set) -> Set;
    function Unit_Set(Elem : Element_Type) -> Set;
end interface Set;

we will want to be able to write:

    type My_Int_Set is Set<My_Integer_Type>;

Now the question is whether My_Integer_Type must be based on an interface that is explicitly indicated as extending or implementing Assignable, or should we simply require that My_Integer_Type has all of the operations expected of a type based on Assignable.  In other words, does an actual type match a formal type only if it extends or implements the specified interface, or is ad hoc matching permitted, where at the point of instantiation a check is made that all of the required operations are present?  A similar question would come up when converting an object of a particular type to a polymorphic type (such as Assignable+).

Our initial instincts were to require that the actual type (or the type being converted) explicitly extend or implement the formal type (or the target type).  However, this will tend to cause a proliferation of interfaces being added to the list of implements for any given module, such as Assignable or Comparable or Numeric or ..., and still the actual subset of operations of interest might not be mentioned explicitly. 

Given the above concern, we now favor allowing ad hoc matching requirements for matching of module actuals to module formals, and when converting from a normal type to a polymorphic type.  There are some potential problems during maintenance if the operations of an interface change, but those problems exist anyway, since any code that makes calls on individual operations defined in an interface may suffer maintenance issues when operations change.  Note, however, that adding operations to the actual type, or removing them from the formal type, don't create maintenance issues.

In general, there are many fewer places where modules are instantiated, or conversions to polymorphic types occur, than there are normal calls on operations, and defining and using (presumably abstract) interfaces such as Assignable or Comparable as module formals, if they capture exactly what operations of interest are needed by the module, could reduce rather than increase maintenance problems when trying to decide whether to change an existing operation of some interface.

So the answer to the above question about instantiating Set with My_Integer_Type is now answered as follows:  My_Integer_Type must provide the operations defined in the interface Assignable, but it need not directly or indirectly extend or implement Assignable.  This generally means that the list of interfaces specified in the implements list is there primarily for documentation, and for imposing a linkage such that if additional operations are added to the implemented interface, the need to define the operations is known at the point of module definition, rather than being deferred until some point of use.  This also implies that when ad hoc matching is used, there is a presumption that the module formal (or target type) is very stable and new operations are not expected to be added to it.  A type like Assignable<> certainly qualifies. 


  1. Isn't what you call "ad-hoc interface matching" what most people call "duck typing"?

  2. Yes, it sounds very similar. In the WikiPedia article on "Duck Typing," they contend that Duck Typing is inherently a run-time type checking approach. We are not suggesting any run-time type checking at this point. In the WikiPedia article, they introduce the term "Structural type systems" and describe them as follows:

    Duck typing is similar to but distinct from structural typing. Structural typing is a static typing system that determines type compatibility and equivalence by a type's structure, whereas duck typing is dynamic and determines type compatibility by only that part of a type's structure that is accessed during run time.
    The Objective Caml language uses a structural type system.

  3. Ah, i see. Sorry about the confusion :-)

    I encounter this mainly with C++ templates. There, it usually comes back to bite us when we implement "generic" algorithms that turn out not to be so generic. The most common error is that somebody uses less-than to compare complex numbers, when they should be comparing magnitudes. This usually isn't hard to fix, but the error messages can be nasty. How do you plan to make debugging those kinds of syntax errors easier than in (say) C++?

  4. C++ templates define essentially no "contract" so there is no easy way to find out what sort of type will be acceptable without trying the instantiation. By contrast, in ParaSail the only operations that are available within a module are those defined for the module formal. The module actual must provide at least the same operations, and you only need look at the definition of the module formal to know what operations are needed. You don't need to look "inside" the module.

  5. Here comes a long comment, so watch out ;-) Oh, and i have to split it in two.

    Tuck wrote: "C++ templates define essentially no 'contract' so there is no easy way to find out what sort of type will be acceptable without trying the instantiation."

    C++ programmers get around this with "concept checks," which test the required syntax by, well, trying the instantiation ;-) in a way that makes it easy to catch syntax errors. They can be tedious, because doing them right requires two things:

    1. Concept checking code, that verifies the interface statically and as simply as possible (so that compiler error messages are easier to read)

    2. An "archetype class": like a Java interface, except with (skeletal) implementations of the required interface

    In the above, i'm using the terminology of the Boost Concept Check Library:

    #1 alone adds to the code maintenance burden, since as the "contract" evolves, you have to synchronize the checks with how you are actually using the object. #2 alone doesn't ensure synchronization of your idea of the contract, with the classes you think are implementing it. Nevertheless, there are still four separate bodies of code to synchronize: the archetype class, the concept checking code, the application class(es), and the application code that invokes these class(es). The most straightforward (but not easiest) way to avoid this burden would be to

  6. 1. Write an archetype class, and make the compiler check the concept, using application code: "Declared archetypes"

    This approach has implications for avoiding error-prone tedium. A common case for templates is generic code for different kinds of numbers. I've written a lot of code templated on "Scalar", which could be anything representing a real or complex number, and Ordinal, which represents an array index type. It would be tedious to declare an archetype supporting all the different kinds of things one might like to do with a Scalar, for example. It would at least have to include all of arithmetic, and all kinds of transcendental functions (such as trigonometric functions and logarithms). That would be an awful lot of effort, especially since most people are only going to instantiate Scalar with "float" or "double". (A few unlucky folks will make Scalar a complex number and discover all the places where the supposedly generic code was using less-than on Scalar!)

    To avoid this tedium, the "concepts" proposal considered for addition to the C++ standard (but rejected) offered a shorthand for "aggregate archetypes." You could say that Scalar is "FloatingPointLike" or that Ordinal is "SignedIntegralLike", and that would hopefully bring in the syntax that you want. You could also "inherit" from archetypes to create your own.

    The trouble with this approach is that somebody has to write a huge library of arbitrarily named predefined archetypes. Huge, because pretty much anything you might want to do with a "plain old data" type has to have its own concept. Arbitrarily named, because somebody has to decide what "ArithmeticLike" means. (Does it mean integers or real numbers?) This path seems to call for abstract algebra, where you have semigroups and rings and fields of a certain characteristic, and all kinds of things that nonmathematicians don't want to understand. (I'm convinced the main reason why people don't use Haskell more is because it's so hard to explain why you want to know what a monad is.)

    This is overkill because in most cases, programmers can accomplish their work without so much formality. You allude to the reason why: "... when ad hoc matching is used, there is a presumption that the module formal (or target type) is very stable and new operations are not expected to be added to it." The typical use case of C++ templates (for example) is for things that "look like double" or "look like int." That suggests a second approach:

    2. "Archetype by prototype" or "looks like type T"

    If T is something simple like "double" or "int", then you save a lot of syntax and / or library writing. If T is complicated, this forces you to write an archetype class, which is probably a good thing if T is complicated! For the numerical linear algebra algorithms i write and maintain, this would help remind me whether an algorithm (that claims to be generic on a Scalar type) can handle complex numbers. (Complex arithmetic changes linear algebra in subtle ways that don't only have to do with syntax.)

    This approach does not require special syntax for defining archetypes. However, it might still be nice to have such a syntax, and also to have a small library of archetype classes. I could see this being useful for iterators, for example.

  7. Thanks for the long comment. I think the proposed approach for ParaSail is a reasonable solution. You could create an interface (archetype) just for the purpose of using it as a module formal, but often there will be an existing interface that does the job. With the ad hoc interface matching, the only requirement is that the module actual have at least the same operations, but it need not be officially "related" to the module formal.

    ParaSail also allows the passing of individual operations as module parameters, so for things like sorting, you could simply pass in the comparison operation, rather than having to cobble together some kind of "Sortable" interface.

  8. That long comment was just me thinking out loud, so thanks for your patience ;-)

    Ad-hoc interface matching sounds like a good idea for simple use cases -- less to type, less carpal tunnel!

  9. I'm missing something; how can an interface have "the same operations" as Assignable without extending Assignable?

    Here's the definition of Assignable:

    abstract interface Assignable<> extends Any is
    operator "copy" Assignable -> Assignable;
    end interface Assignable;

    If I write another interface My_Assignable_1:

    abstract interface My_Assignable_1<> extends Any is
    operator "copy" My_Assignable_1 -> My_Assignable_1;
    end interface My_Assignable_1;

    Surely My_Assignable_1::"copy" is not the same as Assignable::"copy"?

    Maybe this:

    abstract interface My_Assignable_2<> extends Any is
    operator "copy" Assignable -> Assignable;
    operator "copy" My_Assignable_2 -> My_Assignable_2;
    end interface My_Assignable_2;

    then My_Assignable_2:"copy" (Assignable) at least appears to be the same as Assignable::"copy". But does this even make sense? What does it mean for an operation in an interface to not use the type of the interface?

  10. In ParaSail, when a module inherits an operation from an given interface, the types of the parameters that are of the same type as the existing interface are systematically substituted with that of the new module. Hence:

    interface My_Type<>
      implements Assignable is
       // operator "copy" My_Type -> My_Type;
       //   is inherited

    Other OOP languages have different rules. In general, the implicit ("this") parameter is always substituted with the new type, but whether other parameters of the same type are substituted varies.

    When we talk about ad hoc matching, the question is whether a given type implements some required interface, and hence can be passed as a parameter when instantiating some module, or can be converted to the polymorphic type associated with the interface. Without ad hoc matching, the answer is easy -- a type implements some interface if and only if its module explicitly specifies the interface as one of the ones it implements or extends, directly or through some sequence of explicitly implemented/extended interfaces.

    With ad hoc matching, a type implements an interface if it defines all of the operations required by the interface (after systematically substituting the type into the interface's operations as described above). To define a particular operation means to define (or inherit) an operation of the same name and the same parameter profile (after systematic substitution). In other words, if the names and parameter profiles match, then from ParaSail's point of view, they are the same operation.

    This blog entry was debating whether it is preferable to require explicit implements or extends for all interfaces that a module hopes to support, or allow after-the-fact ad hoc matching. We decided on ad hoc matching, but it is certainly still open for debate. We are definitely not going as far as C++ where there is no specification of the operations required of the formal types of a generic template. In ParaSail, a module specifies exactly which operations each formal type must provide, by specifying an interface, as in:

    interface Set
     <Element_Type is Comparable<>> is

    The formal type "Element_Type" must define all of the operations provided by the Comparable interface. But with ad hoc matching, it doesn't have to actually, explicitly implement or extend the interface Comparable. It just has to provide all the same operations.