

commit 878e262e9ae092e5844c3ac17417add990aba833
parent 3fff5a1dbc0c52a079a6c0344289249cd410dda4
Author: Georges Dupéron <georges.duperon@gmail.com>
Date:   Wed, 22 Mar 2017 17:05:11 +0100

Re-indented scribble file.

Diffstat:
Mscribblings/state-of-the-art.scrbl | 670++++++++++++++++++++++++++++++++++++++++----------------------------------------
1 file changed, 335 insertions(+), 335 deletions(-)

diff --git a/scribblings/state-of-the-art.scrbl b/scribblings/state-of-the-art.scrbl
@@ -66,346 +66,346 @@ spirit of implementing languages as libraries@~cite{tobin-hochstadt_languages_as_libraries_2011}}
@asection{
 @atitle{Algebraic datatypes for compilers (phc-adt)}

 The @tt{phc-adt} library implements algebraic datatypes (variants and
 structures) adapted to compiler writing.

 There is an existing ``simple'' datatype library for Typed Racket (source
 code available at @url{https://github.com/pnwamk/datatype}). It differs
 from our library on several points:
 @itemlist[
 @item{``datatype'' uses nominal typing, while we use structural typing
  (i.e. two identical type declarations in distinct modules yield the same
  type in our case). This avoids the need to centralize the type
  definition of ASTs.}
 @item{``datatype'' uses closed variants, where a constructor can only
  belong to one variant. We simply define variants as unions of
  constructors, where a constructor can belong to several variants. This
  allows later passes in the compiler to add or remove cases of variants,
  without the need to duplicate all the constructors under slightly
  different names.}
 @item{``datatype'' does not support row polymorphism, or, similarly, the
  update and extension of its product types with values for existing and
  new fields, regardless of optional fields. We implement the
  latter.@htodo{ Probably the former too.}}]
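The contrast between closed nominal variants and open structural ones can be sketched in a language with structural subtyping. The sketch below uses TypeScript with invented node names; it is an analogy, not phc-adt's actual API:

```typescript
// Tagged unions as structural "variants"; node names are illustrative.
type Lit = { tag: "lit"; value: number };
type Neg = { tag: "neg"; expr: SimpleExpr };
type Add = { tag: "add"; left: Expr; right: Expr };

// `Lit` and `Neg` belong to both unions; closed variants would forbid
// this, forcing duplicated constructors under slightly different names.
type Expr = Lit | Neg | Add;
type SimpleExpr = Lit | Neg;

// Structural typing: any value of the right shape is accepted,
// regardless of which module declared the type.
function evalSimple(e: SimpleExpr): number {
  return e.tag === "lit" ? e.value : -evalSimple(e.expr);
}

console.log(evalSimple({ tag: "neg", expr: { tag: "lit", value: 3 } })); // -3
```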
 @asection{
  @atitle{The case for bounded row polymorphism}

  @todo{Explain the ``expression problem''.}
  @todo{Talk about the various ways in which it can be ``solved'', and
   which tradeoffs we aim for. Mention
   @url{http://docs.racket-lang.org/extensible-functions}, another
   solution to the expression problem in Racket, but with rather different
   tradeoffs.}

  We strive to implement compilers using many passes. A pass should be
  able to accept a real-world AST, and accordingly produce a real-world
  transformed AST as its output. It should also be possible to use
  lightweight ``mock'' ASTs, containing only the values relevant to the
  passes under test (possibly only the pass being called, or multiple
  passes tested from end to end). The pass should then return a
  corresponding simplified output AST, omitting the fields which are
  irrelevant to this pass (and were not part of the input).
  Since the results of the pass on a real input and on a test input are
  equivalent modulo the irrelevant fields, this allows testing the pass
  in isolation with simple data, without having to fill in each
  irrelevant field with @tt{null} (and hope that they are indeed
  irrelevant, without any comforting guarantee that this is the case), as
  is commonly done when creating ``mocks'' to test object-oriented
  programs.

  This can be identified as a problem similar to the ``expression
  problem''. In our context, we do not strive to build a compiler which
  can be extended in an ``open world'', by adding new node types to the
  AST, or new operations handling these nodes. Rather, the desired
  outcome is to allow passes to support multiple known subsets of the
  same AST type, from the start.
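The real-versus-mock scenario above can be sketched with a hypothetical pass whose type threads unread fields through unchanged. The TypeScript sketch below is illustrative; `annotateType` and all field names are invented:

```typescript
// The type parameter N plays the role of a row variable: the pass types
// against the one field it reads and preserves every other field as-is.
type HasName = { name: string };

function annotateType<N extends HasName>(node: N): N & { type: string } {
  // Only `name` is read; the spread threads all other fields through.
  return { ...node, type: node.name.startsWith("i") ? "Int" : "Unknown" };
}

// Real-world node: irrelevant fields survive the pass untouched.
const real = annotateType({ name: "i0", srcLine: 42, comment: "counter" });
// Mock node for a test: no need to fill irrelevant fields with nulls.
const mock = annotateType({ name: "i0" });

console.log(real.type, real.srcLine); // Int 42
console.log(mock.type);               // Int
```

Note that `N extends HasName` is the unbounded form: extra fields are accepted silently, which is exactly what the boundedness discussed below is meant to control.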
  We succinctly list below some of the different sorts of polymorphism,
  and argue that row polymorphism is well suited for our purpose. More
  specifically, bounded row polymorphism gives the flexibility needed
  when defining passes which keep some fields intact (without reading
  their value), but the boundedness ensures that changes in the input
  type of a pass do not go unnoticed, so that the pass can be amended to
  handle the new information, or its type can be updated to ignore this
  new information.

  @subsubsub*section{Subtyping, polymorphism and alternatives}
  @|~|@(linebreak)
  @~cite{ducournau-cours-lots}. @htodo{The course includes a couple of
   other kinds of polymorphism and subtyping (or makes finer distinctions
   than in the list below). Refine and augment the list below using
   Roland's classification.} @htodo{Probably also cite:}
  @~cite{cardelli1985understanding} @htodo{(apparently does not cover
   ``Higher-Kinded Polymorphism'', ``Structural Polymorphism'' and ``Row
   Polymorphism'')}
  @itemlist[
  @item{Subtyping (also called inclusion polymorphism, subtype
   polymorphism, or nominal subtyping?): subclasses and interfaces in
   @csharp and Java, sub-structs and union types in Typed Racket,
   polymorphic variants in @CAML
   @~cite[#:precision @list{chap. 6, sec. Polymorphic Variants}]{minsky2013real}}
  @; https://realworldocaml.org/v1/en/html/variants.html
  @; See also: use of exceptions as dynamically extensible sum types:
  @; http://caml-list.inria.narkive.com/VJcoGfvp/dynamically-extensible-sum-types
  @;
  @; quote: I need a dynamically extensible sum type. I can think of three approaches:
  @; quote:
  @; quote: (1) Use polymorphic variants: `Foo of a * b, `Bar of c * d * e, etc
  @; quote:
  @; quote: (2) Use exceptions: exception Foo of a * b, exception Bar of c * d * e, etc
  @; quote:
  @; quote: (3) Use thunks: (fun () -> foo a b), (fun () -> bar c d e), etc
  @; quote:
  @; quote: Using exceptions seems somewhat sneaky to me. Does it have any advantages over
  @; quote: polymorphic variants? The polymorphic variants seem like they might be better
  @; quote: since you could actually limit the domain of certain functions... thus, one
  @; quote: part of your program could be constrained to a subrange of the sum type, while
  @; quote: other parts could be opened up fully.
  @; quote:
  @; quote: Until now I have been using the thunking approach in an event-based
  @; quote: architecture (each event on the queue is a unit->unit thunk). This seems to
  @; quote: work pretty well. But now I'm thinking that the other approaches would allow
  @; quote: arbitrary filters to be applied to events; i.e., the thunk approach imposes a
  @; quote: "read once" discipline on elements of the sum type, and in some applications
  @; quote: you might want "read multiple".
  @; quote:
  @; quote: I'm not asking the question solely in terms of event-based architectures,
  @; quote: though, and I'm interested in others experience with the different approaches
  @; quote: to dynamically extensible sum types, and what led you to choose one approach
  @; quote: over the others. Thanks!
  @item{Multiple inheritance: @NIT, @CLOS, @CPP, @csharp interfaces, Java
   interfaces; available as an extension in ``untyped'' Racket with
   Alexis King's safe
   multimethods@note{@url{https://lexi-lambda.github.io/blog/2016/02/18/simple-safe-multimethods-in-racket/}}.
   This could in principle help in our case: AST nodes would have
   @tt{.withField(value)} methods returning a copy of the node with
   the field's value updated, or a node of a different type with that
   new field, if it is not present in the initial node type. This would
   however require the declaration of many such methods in advance, so
   that they can be used when needed (or a recording mechanism like the
   one we use, so that between compilations the methods called are
   remembered and generated on the fly by a macro). Furthermore,
   @typedracket lacks support for multiple inheritance on structs. It
   supports multiple inheritance for classes @todo{?}, but classes
   currently lack the ability to declare immutable fields, which in
   turn causes problems with occurrence typing (see the note in the
   ``row polymorphism'' point below).}
  @item{Parametric polymorphism: generics in @csharp and Java,
   polymorphic types in @CAML and @typedracket}
  @item{F-bounded polymorphism: Java, @csharp, @CPP, Eiffel. Possible
   to mimic to some extent in @typedracket with (unbounded) parametric
   polymorphism and intersection types.
   @todo{Discuss how it would work/not work in our case.}}
  @item{Operator overloading (also called overloading polymorphism?) and
   multiple dispatch:
   @itemlist[
   @item{operator overloading in @csharp}
   @item{nothing in Java aside from the built-in cases for arithmetic
    and string concatenation, but those are not extensible}
   @item{@CPP}
   @item{type classes in Haskell? @todo{I'm not proficient enough in
    Haskell to be sure or give a detailed description; I have to ask
    around to double-check.}}
   @item{LISP (CLOS): has multiple dispatch}
   @item{nothing built-in in @|typedracket|}
   @item{@|CAML|?}]}
  @item{Coercion polymorphism (automatic casts to a given type).
   This includes Scala's implicits, @csharp implicit coercion operators
   (user-extensible, but at most one coercion operator is applied
   automatically, so if there is a coercion operator @${A → B}
   and a coercion operator @${B → C}, it is still impossible to
   supply an @${A} where a @${C} is expected without manually coercing
   the value), and @CPP's implicit conversions, where single-argument
   constructors are treated as implicit conversion operators unless
   annotated with the @tt{explicit} keyword. Like @csharp,
   @CPP applies only one implicit conversion, not two or more in a chain.

   Struct type properties in untyped Racket can somewhat be used to that
   effect, although they are closer to Java's interfaces than to
   coercion polymorphism. Struct type properties are unsound in
   @typedracket and are not represented within the type system, so their
   use is subject to caution anyway.}
  @item{Coercion (downcasts): present in most typed languages. This
   would not help in our case, as the different AST types are
   incomparable (especially since @typedracket lacks multiple
   inheritance).}
  @item{Higher-kinded polymorphism: a type which is parameterized by a
   @${\mathit{Type} → \mathit{Type}} function. Scala, Haskell. Maybe
   @|CAML|?

   The type expander library which we developed for @typedracket
   supports @${Λ}, used to describe anonymous type-level
   macros. They enable some limited form of
   @${\mathit{Type} → \mathit{Type}} functions, but are
   actually applied at macro-expansion time, before typechecking is
   performed, which diminishes their usefulness in some cases. For
   example, they cannot cooperate with type inference. Also, any
   recursive use of type-level macros must terminate, unless the type
   ``function'' manually falls back to using @${\mathit{Rec}} to create
   a regular recursive type. This means that a declaration like
   @${F(X) := X × F(F(X))} is not possible using anonymous type-level
   macros only.
   @; See https://en.wikipedia.org/wiki/Recursive_data_type#Isorecursive_types
   @; Is this a matter of isorecursive vs equirecursive ?

   As an example of this use of the type expander library, our
   cycle-handling code internally uses a ``type traversal'' macro. In
   the type of a node, it performs a substitution on some subparts of
   the type. It is more or less a limited form of application of a
   whole family of type functions @${aᵢ → bᵢ}, which have the same
   inputs @${aᵢ …}, part of the initial type, but different outputs
   @${bᵢ …} which are substituted in place of the @${aᵢ …} in the
   resulting type. The ``type traversal'' macro expands the initial type
   into a standard polymorphic type, which accepts the desired outputs
   @${bᵢ …} as type arguments.}
  @item{Lenses.
   They can, in a way, be compared to explicit coercions, where
   the coercion is reversible and the accessible parts can be altered.}
  @item{Structural polymorphism (also sometimes called static duck
   typing): Scala, TypeScript. It is also possible in @typedracket,
   using the algebraic datatypes library which we implemented. Possible
   to mimic in Java and @csharp with interfaces ``selecting'' the
   desired fields, but the interface has to be explicitly implemented by
   the class (i.e. at the definition site, not at the use site).

   Palmer et al. present TinyBang@~cite{types-for-flexible-objects}, a
   typed language in which flexible manipulation of objects is possible,
   including adding and removing fields, as well as changing the type of
   a field. They implement in this way a sound, decidable form of static
   duck typing, with functional updates which can add new fields and
   replace the values of existing fields. Their approach is based on two
   main aspects:
   @itemlist[
   @item{@emph{Type-indexed records supporting asymmetric
    concatenation}: by concatenating two records @${r₁ \& r₂}, a new
    record is obtained containing all the fields from @${r₁} (associated
    to their value in @${r₁}), as well as the fields from @${r₂} which
    do not appear in @${r₁} (associated to their value in @${r₂}).
    Primitive types are eliminated by allowing the use of type names as
    keys in the records: integers then become simply records with an
    @${int} key, for example.}
   @item{@emph{Dependently-typed first-class cases}: pattern-matching
    functions are expressed as
    @${pattern \mathbin{\texttt{->}} expression}, and can be
    concatenated with the @${\&} operator, to obtain functions matching
    against different cases, possibly with a different result type for
    each case. The leftmost cases can shadow existing cases (i.e.
    the case which is used is the leftmost case for which the pattern
    matches successfully).}]

   TinyBang uses an approach which is very different from the one we
   followed in our Algebraic Data Types library, but contains the
   adequate primitives to build algebraic data types which would
   fulfill our requirements (aside from the ability to bound the set of
   extra ``row'' fields). We note that our flexible structs, which are
   used within the body of node-to-node functions in passes, do support
   an extension operation, which is similar to TinyBang's @${\&}, with
   the left-hand side containing a constant and fixed set of
   fields.@htodo{we probably can have / will have the ability to merge
   non-constant left-hand side values too.}}
  @item{Row polymorphism: apparently, quoting a post on
   Quora@note{@hyperlink["https://www.quora.com/Object-Oriented-Programming-What-is-a-concise-definition-of-polymorphism"]{https://www.quora.com/Object-Oriented-Programming-What-is-a-concise-definition@|?-|-of-polymorphism}\label{quora-url-footnote}}:
   @aquote{
    Mostly only concatenative and functional languages (like Elm and
    PureScript) support this.
   }

   Classes in @typedracket can have a row type argument (but classes in
   @typedracket cannot have immutable fields (yet), and therefore
   occurrence typing does not work on class fields; occurrence typing
   is an important idiom in @typedracket, used to achieve safe but
   concise pattern matching, a feature frequently used when writing
   compilers).

   Our Algebraic Data Types library implements a bounded form of row
   polymorphism, and a separate implementation (used within the
   body of node-to-node functions in passes) allows unbounded row
   polymorphism.}
  @item{@todo{Virtual types}}
  @item{So-called ``untyped'' or ``uni-typed'' languages: these
   naturally support most of the above, but without static checks.
   @htodo{
    @; phc/reading/CITE/Object-Oriented Programming_ What is a concise definition of polymorphism_ - Quora.html
    @; TODO: fix the footnote here!
    See also the post on Quora@superscript{\ref{quora-url-footnote}},
    which links to @~cite{cardelli1985understanding}, and to a blog post
    by Sam
    Tobin-Hochstadt@note{@url|{https://medium.com/@samth/on-typed-untyped-and-uni-typed-languages-8a3b4bedf68c}|}.
    The blog post by Sam Tobin-Hochstadt explains how @typedracket tries
    to explore and understand how programmers think about programs
    written in so-called ``untyped'' languages (namely that the
    programmers still conceptually understand the arguments, variables,
    etc. as having a type (or a union of several types)). @todo{Try to
    find a better citation for that.}}

   Operator overloading can be present in ``untyped'' languages, but it
   is really an incarnation of single or multiple dispatch, based on the
   run-time, dynamic type (as there is no static type based on which the
   operation could be chosen). However, it is not possible in
   ``untyped'' languages, or in languages compiled with type erasure, to
   dispatch on ``types'' with a non-empty intersection: it is impossible
   to distinguish the values, and they are not annotated statically with
   a type.

   As mentioned above, @typedracket does not have operator overloading,
   and since the inferred types cannot be accessed reflectively at
   compile time, it is not really possible to construct it as a
   compile-time feature via macros. @typedracket also uses type erasure,
   so the same limitation as for untyped languages applies when
   implementing some form of single or multiple dispatch at run time ---
   namely, the intersection of the types must be empty.
   @todo{Here, mention (and explain in more detail later) our
    compile-time ``empty-intersection check'' feature (does not work
    with polymorphic variables).}}]

  @todo{Overview of the existing ``solutions'' to the expression
   problem; make a summary table of their tradeoffs (verbosity,
   weaknesses, strengths).}

  @todo{Compare the various sorts of subtyping and polymorphism in that
   light (possibly in the same table), even those which do not directly
   pose as a solution to the expression problem.}
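The @tt{.withField(value)} idea mentioned in the multiple-inheritance item above can be sketched as a single generic helper in a structurally-typed language. The TypeScript sketch below is hypothetical (`withField` is not an existing API):

```typescript
// Returns a copy of the node with one field replaced, or a node of a
// wider type if the field was not present in the initial node type.
function withField<N extends object, K extends string, V>(
  node: N, key: K, value: V
): Omit<N, K> & Record<K, V> {
  // Cast: TypeScript cannot infer the computed-key spread precisely.
  return { ...node, [key]: value } as Omit<N, K> & Record<K, V>;
}

const node = { tag: "lit", value: 3 };
const renumbered = withField(node, "value", 4);   // same shape, new value
const annotated = withField(node, "type", "Int"); // extended with a field

console.log(renumbered.value, annotated.type, node.value); // 4 Int 3
```

One generic helper replaces the "many such methods declared in advance" that the multiple-inheritance encoding would require.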
The + ``type traversal'' macro expands the initial type into a standard + polymorphic type, which accepts the desired outputs @${bᵢ …} as type + arguments.} + @item{Lenses. Can be in a way compared to explicit coercions, where + the coercion is reversible and the accessible parts can be altered.} + @item{Structural polymorphism (also sometimes called static duck-typing): Scala, + TypeScript. It is also possible in @typedracket, using the algebraic + datatypes library which we implemented. Possible to mimic in Java + and @csharp with interfaces ``selecting'' the desired fields, but + the interface has to be explicitly implemented by the class (i.e. at + the definition site, not at the use-site). + + Palmer et al. present TinyBang@~cite{types-for-flexible-objects}, a + typed language in which flexible manipulation of objects is possible, + including adding and removing fields, as well as changing the type of + a field. They implement in this way a sound, decidable form of static + duck typing, with functional updates which can add new fields and + replace the value of existing fields. Their approach is based on two + main aspects: + @itemlist[ + @item{@emph{Type-indexed records supporting asymmetric concatenation}: + by concatenating two records @${r₁ \& r₂}, a new record is obtained + containing all the fields from @${r₁} (associated to their value in + @${r₁}), as well as the fields from @${r₂} which do not appear in @${r₁} + (associated to their value in @${r₂}). Primitive types are eliminated + by allowing the use of type names as keys in the records: integers + then become simply records with a @${int} key, for example.} + @item{@emph{Dependently-typed first-class cases}: pattern-matching + functions are expressed as + @${pattern \mathbin{\texttt{->}} expression}, and can be concatenated + with the @${\&} operator, to obtain functions matching against + different cases, possibly with a different result type for each + case. 
The leftmost cases can shadow existing cases (i.e. the case + which is used is the leftmost case for which the pattern matches + successfully).}] + + TinyBang uses an approach which is very different from the one we + followed in our Algebraic Data Types library, but contains the + adequate primitives to build algebraic data types which would + fulfill our requirements (aside from the ability to bound the set of + extra ``row'' fields). We note that our flexible structs, which are + used within the body of node-to-node functions in passes, do support + an extension operation, which is similar to TinyBang's @${\&}, with + the left-hand side containing a constant and fixed set of + fields.@htodo{we probably can have / will have the ability to merge + non-constant left-hand side values too.}} + @item{Row polymorphism: Apparently, quoting a post on + Quora@note{@hyperlink["https://www.quora.com/Object-Oriented-Programming-What-is-a-concise-definition-of-polymorphism"]{https://www.quora.com/Object-Oriented-Programming-What-is-a-concise-definition@|?-|-of-polymorphism}\label{quora-url-footnote}}: + @aquote{ + Mostly only concatenative and functional languages (like Elm and PureScript) support this. + } + + Classes in @typedracket can have a row type argument (but classes in + @typedracket cannot have immutable fields (yet), and therefore + occurrence typing does not work on class fields. Occurrence typing + is an important idiom in @typedracket, used to achieve safe but + concise pattern-matching, which is a feature frequently used when + writing compilers). + + Our Algebraic Data Types library implements a bounded form of row + polymorphism, and a separate implementation (used within the + body of node-to-node functions in passes) allows unbounded row + polymorphism.} + @item{@todo{Virtual types}} + @item{So-called ``untyped'' or ``uni-typed'' languages: naturally + support most of the above, but without static checks. 
+ + @htodo{ + @; phc/reading/CITE/Object-Oriented Programming_ What is a concise definition of polymorphism_ - Quora.html + @; TODO: fix the footnote here! + See also the post on Quora@superscript{\ref{quora-url-footnote}}, + which links to @~cite{cardelli1985understanding}, and to a blog post by Sam + Tobin-Hochstadt@note{@url|{https://medium.com/@samth/on-typed-untyped-and-uni-typed-languages-8a3b4bedf68c}|} + The blog post by Sam Tobin-Hochstadt explains how @typedracket tries to + explore and understand how programmers think about programs written in + so-called ``untyped'' languages (namely, that the programmers still + conceptually understand the arguments, variables, etc. as having a type (or a + union of several types)). @todo{Try to find a better citation for that.}} + + Operator overloading can be present in ``untyped'' languages, but is + really an incarnation of single or multiple dispatch, based on the + run-time, dynamic type (as there is no static type based on which the + operation could be chosen). However, it is not possible in ``untyped'' + languages and languages compiled with type erasure to dispatch on + ``types'' with a non-empty intersection: it is impossible to + distinguish the values, and they are not annotated statically with a + type. + + As mentioned above, @typedracket does not have operator overloading, + and since the inferred types cannot be accessed reflectively at + compile-time, it is not really possible to implement overloading as a + compile-time feature via macros. @typedracket also uses type erasure, + so the same limitation as for untyped languages applies when + implementing some form of single or multiple dispatch at run-time --- + namely, the intersection of the types must be empty.
@todo{Here, + mention (and explain in more detail later) our compile-time + ``empty-intersection check'' feature (does not work with polymorphic + variables).}}] + + @todo{Overview of the existing ``solutions'' to the expression problem; make + a summary table of their tradeoffs (verbosity, weaknesses, strengths).} + + @todo{Compare the various sorts of subtyping and polymorphism in that light + (possibly in the same table), even those which do not directly pose as a + solution to the expression problem.} - ``Nominal types'': our tagged structures and node types are not nominal types. - - The ``trivial'' Racket library tracks static information about the types in - simple cases. The ``turnstile'' Racket language @todo{is a follow-up} work, - and makes it possible to define new typed Racket languages. It tracks the types of - values, as they are assigned to variables or passed as arguments to functions - or macros. These libraries could be used to implement operator overloads which - are based on the static type of the arguments. They could also be used to - implement unbounded row polymorphism in a way that does not cause a - combinatorial explosion of the size of the expanded code.@todo{Have a look at - the implementation of row polymorphism in @typedracket classes, cite their - work if there is something already published about it.} - - From the literate program (tagged-structure-low-level): - - @quotation{ - Row polymorphism, also known as "static duck typing", is a type system - feature which allows a single type variable to be used as a - placeholder for several omitted fields, along with their types. The - @tt{phc-adt} library supports a limited form of row polymorphism: - for most operations, a set of tuples of omitted field names must be - specified, thereby indicating a bound on the row type variable. - - This is both a limitation of our implementation (to reduce the - combinatorial explosion of possible input and output types), as well as a - desirable feature.
Indeed, this library is intended to be used to write - compilers, and a compiler pass should have precise knowledge of the - intermediate representation it manipulates. It is possible that a compiler - pass may operate on several similar intermediate representations (for - example a full-blown representation for actual compilation and a minimal - representation for testing purposes), which makes row polymorphism - desirable. It is, however, risky to accept as input to a compiler pass any - data structure containing at least the minimum set of required fields: - changes in the intermediate representation may add new fields which should, - semantically, be handled by the compiler pass. A catch-all row type variable - would simply ignore the extra fields, without raising an error. Thanks to - the bound which specifies the possible tuples of omitted field names, - changes to the input type will raise a type error, bringing the - programmer's attention to the issue. If the new type is legitimate and does not - warrant a modification of the pass, the fix is easy to implement: simply - adding a new tuple of possibly omitted fields to the bound (or replacing an - existing tuple) will allow the pass to work with the new type. If, on the - other hand, the pass needs to be modified, the type system will have - successfully caught a potential issue. - } - } - }@;{Algebraic datatypes for compilers (phc-adt) + ``Nominal types'': our tagged structures and node types are not nominal types. + + The ``trivial'' Racket library tracks static information about the types in + simple cases. The ``turnstile'' Racket language @todo{is a follow-up} work, + and makes it possible to define new typed Racket languages. It tracks the types of + values, as they are assigned to variables or passed as arguments to functions + or macros. These libraries could be used to implement operator overloads which
It could also be used to + implement unbounded row polymorphism in a way that does not cause a + combinatorial explosion of the size of the expanded code.@todo{Have a look at + the implementation of row polymorphism in @typedracket classes, cite their + work if there is something already published about it.} + + From the literate program (tagged-structure-low-level): + + @quotation{ + Row polymorphism, also known as "static duck typing" is a type system + feature which allows a single type variable to be used as a place + holder for several omitted fields, along with their types. The + @tt{phc-adt} library supports a limited form of row polymorphism: + for most operations, a set of tuples of omitted field names must be + specified, thereby indicating a bound on the row type variable. + + This is both a limitation of our implementation (to reduce the + combinatorial explosion of possible input and output types), as well as a + desirable feature. Indeed, this library is intended to be used to write + compilers, and a compiler pass should have precise knowledge of the + intermediate representation it manipulates. It is possible that a compiler + pass may operate on several similar intermediate representations (for + example a full-blown representation for actual compilation and a minimal + representation for testing purposes), which makes row polymorphism + desirable. It is however risky to allow as an input to a compiler pass any + data structure containing at least the minimum set of required fields: + changes in the intermediate representation may add new fields which should, + semantically, be handled by the compiler pass. A catch-all row type variable + would simply ignore the extra fields, without raising an error. Thanks to + the bound which specifies the possible tuples of omitted field names, + changes to the the input type will raise a type error, bringing the + programmer's attention to the issue. 
If the new type is legitimate and does not + warrant a modification of the pass, the fix is easy to implement: simply + adding a new tuple of possibly omitted fields to the bound (or replacing an + existing tuple) will allow the pass to work with the new type. If, on the + other hand, the pass needs to be modified, the type system will have + successfully caught a potential issue. + } + } + }@;{Algebraic datatypes for compilers (phc-adt)
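The asymmetric record concatenation described for TinyBang above (the left operand of @${\&} wins for fields present in both records) can be sketched at the value level in TypeScript. This is an illustrative encoding only, not TinyBang's syntax or semantics; the names `concat`, `r1`, `r2` are ours:

```typescript
// Sketch of TinyBang-style asymmetric concatenation (illustrative only).
// TinyBang's `r1 & r2` keeps r1's value for fields present in both records;
// JavaScript's object spread is right-biased, so spreading r1 *last*
// emulates the left bias.
type Rec = Record<string, unknown>;

function concat<A extends Rec, B extends Rec>(r1: A, r2: B): A & B {
  return { ...r2, ...r1 }; // r1's fields win, as with TinyBang's `&`
}

const r1 = { name: "alice", age: 30 };
const r2 = { age: 99, email: "alice@example.com" };
const merged = concat(r1, r2);
// merged is { name: "alice", age: 30, email: "alice@example.com" }:
// `age` comes from the left operand, `email` only exists on the right.
```

Note that this sketch only models the record-merging behaviour; it does not capture TinyBang's treatment of primitive types as type-indexed record keys.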
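TinyBang's first-class cases, concatenated with @${\&} so that the leftmost matching case wins, can likewise be approximated as composable partial functions. The `Case`/`cases` encoding below is our own illustration and carries none of TinyBang's dependent typing discipline:

```typescript
// A "case" is a partial function: it either matches and produces a value,
// or declines. `cases(f, g, ...)` tries its arguments left to right, so
// cases concatenated on the left shadow later ones, like TinyBang's
// leftmost-case rule.
type Case<A, B> = (x: A) => { matched: true; value: B } | { matched: false };

function cases<A, B>(...fs: Case<A, B>[]): (x: A) => B {
  return (x) => {
    for (const f of fs) {
      const r = f(x);
      if (r.matched) return r.value;
    }
    throw new Error("no case matched");
  };
}

const isZero: Case<number, string> = (n) =>
  n === 0 ? { matched: true, value: "zero" } : { matched: false };
const anyNumber: Case<number, string> = (n) =>
  ({ matched: true, value: `number ${n}` });

// isZero shadows anyNumber for the input 0.
const describe = cases(isZero, anyNumber);
```

In TinyBang the concatenated cases may each have a different result type; this sketch fixes a single result type `B` for simplicity.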
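The risk described in the quotation above (a catch-all row type variable silently ignoring newly added fields) is visible in any structurally typed language. The TypeScript sketch below is a loose illustration with hypothetical names (`greet`, `passOver`, `FullNode`, `TestNode`); the closed union only approximates phc-adt's bound on omitted-field tuples, since TypeScript does not reject extra fields on non-literal values:

```typescript
// Structural ("duck") typing: any value with a `name: string` field is
// accepted, extra fields and all -- the catch-all behaviour warned about
// in the quotation.
function greet(x: { name: string }): string {
  return `hello ${x.name}`;
}

const rich = { name: "ada", span: [1, 5], type: "Int" };
greet(rich); // accepted; `span` and `type` are silently ignored

// A *bounded* row can be approximated with a closed union of the known
// shapes: call sites must use one of the listed shapes, so extending the
// intermediate representation prompts an update of the bound.
type FullNode = { name: string; span: number[]; type: string };
type TestNode = { name: string };
type Bounded = FullNode | TestNode; // the "bound": only these shapes allowed

function passOver(node: Bounded): string {
  return greet(node);
}
```

The design intent mirrors the quotation: when the IR gains a field, the programmer must consciously extend `Bounded` (the analogue of adding a tuple of omitted fields to the bound) rather than having the pass silently accept the new shape.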