I am interested in the discovery of mathematical and logical structures of computation so that they can be manifested in programming languages and effectively guide the programmer. I approach problems mainly from the perspective of intuitionistic type theory or, in general, functional programming. For theoretic modelling, my weapon of choice is Agda; for practical programming, I use Haskell.
Over the past few years, I have mainly been working on bidirectional programming for dealing with consistency maintenance (synchronisation) problems. So far my results are built around the small “putback-based” bidirectional programming language BiGUL (Bidirectional Generic Update Language), about which there are an Agda formalisation (of which an initial version was reported at PEPM 2016), a Haskell port (which is available on Hackage), and a semi-formal treatment in terms of an axiomatic semantics (which was presented at POPL 2018). The ultimate goal is to bring the form of and support for bidirectional programming closer to general-purpose programming, such that general bidirectional programming techniques can be developed and applied in a wide variety of real-world synchronisation scenarios.
I have also been exploring the potential of dependently typed programming, where sophisticated correctness properties and proofs can be integrated into types and programs, and mechanically verified by type-checking. More interestingly, with an interactive development environment (like the Emacs mode offered by Agda) that provides semantic hints in the form of type information, the programmer can gain better “situation awareness” regarding the meaning of the program being constructed. (See, for example, my (draft) functional pearl on programming metamorphic algorithms.) This is made possible primarily because of inductive families, and my DPhil thesis developed datatype-generic techniques for improving usability and reusability of inductive families. I also studied the Algebra of Programming (a.k.a. Bird–Meertens formalism or program calculation) and its formalisation using dependent types.
I was born and grew up in Taiwan. After my undergraduate study (during which I worked as a part-time research assistant to Shin-Cheng Mu) and compulsory substitute service, I went to the UK for my DPhil (supervised by Jeremy Gibbons) and then came to Japan in 2014 to do postdoctoral research with Zhenjiang Hu at NII. Starting from October 2017, I am employed as an Assistant Professor by Special Appointment at NII to work on the BISCUITS project (Bidirectional Information Systems for Collaborative, Updatable, Interoperable, and Trusted Sharing).
My Chinese name 向上 (Hsiang-Shang), which I publish under, means “upwards” and is an unusual one; sadly, it’s impossible to approximate its pronunciation acceptably accurately in English. On the other hand, “Josh” is a name I’ve used for years (since 1996, to be precise), so I am perfectly happy (in fact happier) to be called by the name in English. (It turns out that “Hsiang-Shang” is also difficult for Japanese people to pronounce, and my Japanese colleagues are happier to call me “Josh” too.) Even in a Chinese-speaking environment, using an English name can be helpful in avoiding the strictly hierarchical and somewhat distant way of addressing people in East Asia and making it (slightly) easier to establish equality.
I listen to classical music (mainly from the Romantic period) and play the piano — I now own a Kawai CA65 digital piano, and strive to practise daily. For piano music, I feel closest to Chopin, but I’m also getting to know Beethoven and Schubert. More broadly, I also admire Brahms and Tchaikovsky.
Recently I’m attracted to technical and operational aspects of civil aviation; I collect 1/200 and 1/400 airliner models, and I’m learning to “fly” the Boeing 777 in X-Plane.
Language designers usually need to implement parsers and printers. Despite being two closely related programs, in practice they are often designed separately, and then need to be revised and kept consistent as the language evolves. It will be more convenient if the parser and printer can be unified and developed in a single program, with their consistency guaranteed automatically. Furthermore, in certain scenarios (like showing compiler optimisation results to the programmer), it is desirable to have a more powerful reflective printer that, when an abstract syntax tree corresponding to a piece of program text is modified, can propagate the modification to the program text while preserving layouts, comments, and syntactic sugar.
To address these needs, we propose a domain-specific language BiYacc, whose programs denote both a parser and a reflective printer for a fully disambiguated context- free grammar. BiYacc is based on the theory of bidirectional transformations, which helps to guarantee by construction that the generated pairs of parsers and reflective printers are consistent. Handling grammatical ambiguity is particularly challenging: we propose an approach based on generalised parsing and disambiguation filters, which produce all the parse results and (try to) select the only correct one in the parsing direction; the filters are carefully bidirectionalised so that they also work in the printing direction and do not break the consistency between the parsers and reflective printers. We show that BiYacc is capable of facilitating many tasks such as Pombrio and Krishnamurthi’s ‘resugaring’, simple refactoring, and language evolution.
Session types are a type discipline for eliminating communication errors in concurrent computing. These types can be thought of as a representation of communication protocols implemented by communicating processes. One application scenario that can be naturally supported by session types is semantics-preserving transformation of processes in response to protocol changes due to optimization, evolution, refactoring, etc. Such transformation can be seen as a particular kind of synchronization problem that has long been studied by the bidirectional transformations (BX) community. This short paper offers a preliminary analysis of the process–type synchronization problem in terms of BX, describing the prospects and challenges.
We conduct an experiment with interactive type-driven development in Agda, developing algorithms from their specifications encoded as intrinsic types, to see how useful the hints provided by Agda during an interactive development process can be. The algorithmic problem we choose is metamorphisms, whose definitional behaviour is consuming a data structure to compute an intermediate value and then producing a codata structure from that value, but there are other ways to compute metamorphisms. We develop Gibbons’s  streaming algorithm and Nakano’s  jigsaw model interactively with Agda, turning intuitive ideas about these algorithms into formal conditions and programs that are correct by construction.
Languages for programming state-based asymmetric lenses are usually based on lens combinators, whose style, having a functional programming origin, is alien to most programmers and awkward to use even for experienced functional programmers. We propose a visual syntax mimicking circuit diagrams for the combinator-based language BiGUL, provide a relational interpretation that allows the diagrams to be understood bidirectionally, and sketch how an editor for the visual syntax can help to construct, understand, and debug lens combinator programs in an intuitive and friendly way.
Putback-based bidirectional programming allows the programmer to write only one putback transformation, from which the unique corresponding forward transformation is derived for free. A key distinguishing feature of putback-based bidirectional programming is full control over the bidirectional behavior, which is important for specifying intended bidirectional transformations without any ambiguity. In this chapter, we will introduce BiGUL, a simple yet powerful putback-based bidirectional programming language, explaining the underlying principles and showing how various kinds of bidirectional application can be developed in BiGUL.
2.5.2 with Standard Library version 0.13
Among the frameworks of bidirectional transformations proposed for addressing various synchronisation (consistency maintenance) problems, Foster et al.’s  asymmetric lenses have influenced the design of a generation of bidirectional programming languages. Most of these languages are based on a declarative programming model, and only allow the programmer to describe a consistency specification with ad hoc and/or awkward control over the consistency restoration behaviour. However, synchronisation problems are diverse and require vastly different consistency restoration strategies, and to cope with the diversity, the programmer must have the ability to fully control and reason about the consistency restoration behaviour. The putback-based approach to bidirectional programming aims to provide exactly this ability, and this paper strengthens the putback-based position by proposing the first fully fledged reasoning framework for a bidirectional language — a Hoare-style logic for Ko et al.’s  putback-based language BiGUL. The Hoare-style logic lets the BiGUL programmer precisely characterise the bidirectional behaviour of their programs by reasoning solely in the putback direction, thereby offering a unidirectional programming abstraction that is reasonably straightforward to work with and yet provides full control not achieved by previous approaches. The theory has been formalised and checked in Agda, but this paper presents the Hoare-style logic in a semi-formal way to make it easily understood and usable by the working BiGUL programmer.
Pregel is a popular distributed computing model for dealing with large-scale graphs. However, it can be tricky to implement graph algorithms correctly and efficiently in Pregel’s vertex-centric model, especially when the algorithm has multiple computation stages, complicated data dependencies, or even communication over dynamic internal data structures. Some domain-specific languages (DSLs) have been proposed to provide more intuitive ways to implement graph algorithms, but due to the lack of support for remote access — reading or writing attributes of other vertices through references — they cannot handle the above mentioned dynamic communication, causing a class of Pregel algorithms with fast convergence impossible to implement.
To address this problem, we design and implement Palgol, a more declarative and powerful DSL which supports remote access. In particular, programmers can use a more declarative syntax called chain access to naturally specify dynamic communication as if directly reading data on arbitrary remote vertices. By analyzing the logic patterns of chain access, we provide a novel algorithm for compiling Palgol programs to efficient Pregel code. We demonstrate the power of Palgol by using it to implement several practical Pregel algorithms, and the evaluation result shows that the efficiency of Palgol is comparable with that of hand-written code.
Bidirectional transformation (bx) approaches provide a systematic way of specifying, restoring, and maintaining the consistency of related models. The current diversity of bx approaches is certainly beneficial, but it also poses challenges, especially when it comes to comparing the different approaches and corresponding bx tools that implement them. Although a benchmark for bx (referred to as a benchmarx) has been identified in the community as an important and currently still missing contribution, only a rather abstract description and characterisation of what a benchmarx should be has been published to date. In this paper, therefore, we focus on providing a practical and pragmatic framework, on which future concrete benchmarx can be built. To demonstrate its feasibility, we present a first non-trivial benchmarx based on a well-known example, and use it to compare and evaluate three bx tools, chosen to cover the broad spectrum of bx approaches.
18.104.22.168 with Standard Library version 0.12
Dependently typed programming advocates the use of various indexed versions of the same shape of data, but the formal relationship amongst these structurally similar datatypes usually needs to be established manually and tediously. Ornaments have been proposed as a formal mechanism to manage the relationships between such datatype variants. In this paper, we conduct a case study under an ornament framework; the case study concerns programming binomial heaps and their operations — including insertion and minimum extraction — by viewing them as lifted versions of binary numbers and numeric operations. We show how current dependently typed programming technology can lead to a clean treatment of the binomial heap constraints when implementing heap operations. We also identify some gaps between the current technology and an ideal dependently typed programming language that we would wish to have for our development.
Language designers usually need to implement parsers and printers. Despite being two intimately related programs, in practice they are often designed separately, and then need to be revised and kept consistent as the language evolves. It will be more convenient if the parser and printer can be unified and developed in one single program, with their consistency guaranteed automatically.
Furthermore, in certain scenarios (like showing compiler optimisation results to the programmer), it is desirable to have a more powerful reflective printer that, when an abstract syntax tree corresponding to a piece of program text is modified, can reflect the modification to the program text while preserving layouts, comments, and syntactic sugar.
To address these needs, we propose a domain-specific language BiYacc, whose programs denote both a parser and a reflective printer for an unambiguous context-free grammar. BiYacc is based on the theory of bidirectional transformations, which helps to guarantee by construction that the pairs of parsers and reflective printers generated by BiYacc are consistent. We show that BiYacc is capable of facilitating many tasks such as Pombrio and Krishnamurthi’s “resugaring”, language evolution, and refactoring.
22.214.171.124 with Standard Library version 0.11
The version of BiGUL described in this paper is outdated. See the “axiomatic basis” paper for a semi-formal axiomatic introduction to the current BiGUL.
Putback-based bidirectional programming allows the programmer to write only one putback transformation, from which the unique corresponding forward transformation is derived for free. The logic of a putback transformation is more sophisticated than that of a forward transformation and does not always give rise to well-behaved bidirectional programs; this calls for more robust language design to support development of well-behaved putback transformations. In this paper, we design and implement a concise core language BiGUL for putback-based bidirectional programming to serve as a foundation for higher-level putback-based languages. BiGUL is completely formally verified in the dependently typed programming language Agda to guarantee that any putback transformation written in BiGUL is well-behaved.
the more complete SLE’16 paper, which, in particular, describes the internals of BiYacc in much more detail.
Language designers usually need to implement parsers and printers. Despite being two related programs, in practice they are designed and implemented separately. This approach has an obvious disadvantage: as a language evolves, both its parser and printer need to be separately revised and kept synchronised. Such tasks are routine but complicated and error-prone. To facilitate these tasks, we propose a language called BiYacc, whose programs denote both a parser and a printer. In essence, BiYacc is a domain-specific language for writing putback-based bidirectional transformations — the printer is a putback transformation, and the parser is the corresponding get transformation. The pairs of parsers and printers generated by BiYacc are thus always guaranteed to satisfy the usual round-trip properties. The highlight that distinguishes this reflective printer from others is that the printer — being a putback transformation — accepts not only an abstract syntax tree but also a string, and produces an updated string consistent with the given abstract syntax tree. We can thus make use of the additional input string, with mechanisms such as simultaneous pattern matching on the view and the source, to provide users with full control over the printing-strategies.
2.5.2 with Standard Library version 0.13
The revised version uses https DOI links and fixes identifier hyperlinks and a few typos in the Bodleian version.
Based on a natural unification of logic and computation, Martin-Löf’s intuitionistic type theory can be regarded simultaneously as a computationally meaningful higher-order logic system and an expressively typed functional programming language, in which proofs and programs are treated as the same entities. Two modes of programming can then be distinguished: in externalism, we construct a program separately from its correctness proof with respect to a given specification, whereas in internalism, we encode the specification in a sophisticated type such that any program inhabiting the type also encodes a correctness proof, and we can use type information as a guidance on program construction. Internalism is particularly effective in the presence of inductive families, whose design can have a strong influence on program structure. Techniques and mechanisms for facilitating internalist programming are still lacking, however.
This dissertation proposes that internalist programming can be facilitated by exploiting an interconnection between internalism and externalism, expressed as isomorphisms between inductive families into which data structure invariants are encoded and their simpler variants paired with predicates expressing those invariants. The interconnection has two directions: one analysing inductive families into simpler variants and predicates, and the other synthesising inductive families from simpler variants and specific predicates. They respectively give rise to two applications, one achieving a modular structure of internalist libraries, and the other bridging internalist programming with relational specifications and program derivation. The datatype-generic mechanisms supporting the applications are based on McBride’s ornaments. Theoretically, the key ornamental constructs — parallel composition of ornaments and relational algebraic ornamentation — are further characterised in terms of lightweight category theory. Most of the results are completely formalised in the Agda programming language.
2.3.3 with Standard Library version 0.6
my DPhil dissertation, which subsumes and streamlines the content of this paper, and includes more examples.
Dependently typed programming is hard, because ideally dependently typed programs should share structure with their correctness proofs, but there are very few guidelines on how one can arrive at such integrated programs. McBride’s algebraic ornamentation provides a methodological advancement, by which the programmer can derive a datatype from a specification involving a fold, such that a program that constructs elements of that datatype would be correct by construction. It is thus an effective method that leads the programmer from a specification to a dependently typed program. We enhance the applicability of this method by generalising algebraic ornamentation to a relational setting and bringing in relational algebraic methods, resulting in a hybrid approach that makes essential use of both dependently typed programming and relational program derivation. A dependently typed solution to the minimum coin change problem is presented as a demonstration of this hybrid approach. We also give a theoretically interesting “completeness theorem” of relational algebraic ornaments, which sheds some light on the expressive power of ornaments and inductive families.
my DPhil dissertation, which subsumes and streamlines the content of this paper, and includes more examples.
Dependently typed programmers are encouraged to use inductive families to integrate constraints with data construction. Different constraints are used in different contexts, leading to different versions of datatypes for the same data structure. For example, sequences might be constrained by length or by an ordering on elements, giving rise to different datatypes “vectors” and “sorted lists” for the same underlying data structure of sequences. Modular implementation of common operations for these structurally similar datatypes has been a longstanding problem. We propose a datatype-generic solution, in which we axiomatise a family of isomorphisms between datatypes and their more refined versions as datatype refinements, and show that McBride’s ornaments can be translated into such refinements. With the ornament-induced refinements, relevant properties of the operations can be separately proven for each constraint, and after the programmer selects several constraints to impose on a basic datatype and synthesises a new datatype incorporating those constraints, the operations can be routinely upgraded to work with the synthesised datatype.
Same topic at the Workshop on Dependently Typed Programming (DTP) 2011, and “numerical representations à la ornamentation” at Fun in the Afternoon 2012
This paper has been superseded by an extended version in Progress in Informatics.
Dependently typed programmers are encouraged to use inductive families to integrate constraints with data construction. Different constraints are used in different contexts, leading to different versions of datatypes for the same data structure. Modular implementation of common operations for these structurally similar datatypes has been a longstanding problem. We propose a datatype-generic solution based on McBride’s datatype ornaments, exploiting an isomorphism whose interpretation borrows ideas from realisability. Relevant properties of the operations are separately proven for each constraint, and after the programmer selects several constraints to impose on a basic datatype and synthesises an inductive family incorporating those constraints, the operations can be routinely upgraded to work with the synthesised inductive family.
Relational program derivation is the technique of stepwise refining a relational specification to a program by algebraic rules. The program thus obtained is correct by construction. Meanwhile, dependent type theory is rich enough to express various correctness properties to be verified by the type checker. We have developed a library, AoPA, to encode relational derivations in the dependently typed programming language Agda. A program is coupled with an algebraic derivation whose correctness is guaranteed by the type system. Two non-trivial examples are presented: an optimisation problem, and a derivation of quicksort where well-founded recursion is used to model terminating hylomorphisms in a language with inductive types.
This paper has been superseded by an extended version in the Journal of Functional Programming.
Dependent type theory is rich enough to express that a program satisfies an input/output relational specification, but it could be hard to construct the proof term. On the other hand, squiggolists know very well how to show that one relation is included in another by algebraic reasoning. We demonstrate how to encode functional and relational derivations in a dependently typed programming language. A program is coupled with an algebraic derivation from a specification, whose correctness is guaranteed by the type system.