Exploring 4 Languages: Integrity and Consistency

Making, and keeping, promises – with Rust, Elm, F, and ReasonML.

March 24, 2018Filed under Tech#domain-driven design#elm#four-languages#fsharp#functional programming#reasonml#rustMarkdown source

In chapter 6, Wlaschin turns to one of the most important aspects of “domain modeling”: keeping it consistent. It’s all well and good to set up a domain model, but if you don’t have a way to make sure that model is reliable everywhere you use it, well… you’ve done a lot of extra work and you’re not going to see a lot of results for all that effort! But as Wlaschin points out, we can actually use the type systems, and the types we wrote up in the previous chapter, to help us enforce the business rules for our domain (as well as the business shapes in the domain).

An important note: you can see the latest version of this code (along with history indicating some of my travails in getting there!) in this public repository on GitHub.

A simple example: WidgetCode

We’ll start with one of the simpler examples: validating that a WidgetCode is legitimate. A WidgetCode, in this domain, is valid if, and only if, it has a W followed by four digits.

The basic tack we’ll take, in all four languages, is to leverage the way the types work to make it so we have to use a function to create a valid instance of a WidgetCode. That’s a bit of extra work (though especially in the functional-first languages, it ends up not being a lot of extra work) but it lets us use Result types to handle invalid data up front.

The downside is that we can’t just get directly at the value inside our wrapper types using basic pattern matching. Instead, we need to be provide a function for “unwrapping” it. Tradeoffs!

We’ll go at this using the most appropriate tool from each language, but in every case we’ll end up with a create function that takes a string and returns a Result with the successful option being a WidgetCode and the error option being a string describing the error; and a value function to unwrap a valid code. Throughout, I also assume an essentially-identical implementation of a related GizmoCode type; I pull both in to show how they end up being used side by side.

Rust

We are using a tuple struct to wrap the string value here. Since there is no pub modifier in the wrapped String, it’s opaque from the perspective of the caller—and this is exactly what we want. We’ll pull in the Regex crate and validate the code passed to us on creation.

use regex::Regex;

pub struct WidgetCode(String);

impl WidgetCode {
    pub fn create(code: &str) -> Result<WidgetCode, String> {
        let re = Regex::new(r"W\d{4}").expect(r"W\d{4} is a valid regex");
        if re.is_match(code) {
            Ok(WidgetCode(String::from(code)))
        } else {
            Err(String::from(
                "`WidgetCode` must begin with a 'W' and be followed by 4 digits",
            ))
        }
    }

    pub fn value(&self) -> &str {
        &self.0
    }
}

This is fairly idiomatic Rust: we’re borrowing a reference to the code as a “string slice”, and creating a new, wrapped String instance to wrap up the code or return a new String as an error. When we get the value out, we return a reference to the string,1 with &self.0: & to indicate a reference, .0 to indicate the first item of a tuple. Note as well that the final if block here is an expression. There’s no semicolon terminating it, and this whole if block ends up being the resulting value of the function.

One other point of interest here is that the creation of the regex itself is checked by the compiler for us! If we pass an invalid regular expression, this simply won’t compile.

This could also live in its own module, ordering/widget_code.rs, and in fact that’s how I would normally do this (and have in the repository where I’m working): every one of these small types would get its own module file within the containing Ordering module. It’s not necessary, but as the domain model grows, it becomes increasingly convenient in that you always know where to find things.2

Then we can import it and use it like this in ordering/mod.rs:

mod widget_code;
mod gizmo_code;

use widget_code::WidgetCode;
use gizmo_code::GizmoCode;

pub enum ProductCode {
    Widget(WidgetCode),
    Gizmo(GizmoCode),
}

fn demo_it() {
    let valid = WidgetCode::create("W1234");
    let invalid = WidgetCode::create("wat");

    let unwrapped = match valid {
        Ok(ref code) => code.value(),
        Err(_) => "",
    };
}

Notice that in Rust, the mod.rs file declares all child modules. If you had a widget_code.rs on the file system but no mod widget_code;, Rust would just ignore the declaration entirely. Then Rust also requires us to use widget_code; to access its contents. The distinction between declaring and using a given module makes some sense: by the time all is said and done with this exercise, we won’t be doing much of anything in this Ordering module; it’ll exist primarily as a grouping construct for all the other modules.

In this case, we go ahead and import the WidgetCode type from the module. We only have the one type there, with no standalone functions: everything is attached to the type via the impl block; so we can just call everything directly off of the type. This ends up feeling kind of like the way we’d do things in a traditional OOP language, but also really not, because we still have a separation between the data type and the implementation of functionality attached to it. It’s not obvious here, but we could write impl WidgetCode in some other module in the crate, and as long as there’s no conflict between the implementations, it’s fine! And then we could call whatever function we defined in that block “on” WidgetCode. This is on the one hand totally unlike what we’ll see in the other languages, and on the other hand weirdly analogous to them.

I’m going to pass over why we need ref code here, as it gets into details of Rust’s model of ownership and reference borrowing and it’s going to be unneeded because of improvements to Rust’s compiler fairly soon. The one thing to note here is that we get nice memory/allocation behavior, i.e. we’re not doing a bunch of separate heap string allocations here. This is one of the big upsides to Rust in general! It’s not quite as pretty as what we’ll see below, but the performance wins are awesome.

Elm

Elm introduces us to a pattern we’ll see in each of the more traditional “functional” languages: the use of modules for this kind of structure. First the code, then some comments on it:

-- src/ordering/WidgetCode.elm
module Ordering.WidgetCode exposing (WidgetCode, create, value)

import Regex exposing (contains, regex)


type WidgetCode
    = WidgetCode String


create : String -> Result String WidgetCode
create code =
    if contains (regex "W\\d{4}") code then
        Ok (WidgetCode code)
    else
        Err "`WidgetCode` must begin with a 'W' and be followed by 4 digits"


value : WidgetCode -> String
value (WidgetCode code) =
    code

Elm’s module system lets you choose exactly what to expose. In this case, we’re only exporting the type itself along the create and value functions—but, importantly, not the normal type constructors for the type.

You can import the things exposed both as a module and as individual items. Assume we implemented GizmoCode the same way. We’d import and use them in Ordering.elm like this:

-- Ordering.elm
import Ordering.WidgetCode as WidgetCode exposing (WidgetCode)
import Ordering.GizmoCode as GizmoCode exposing (GizmoCode)

type ProductCode
    = Widget WidgetCode
    | Gizmo GizmoCode
    
valid =
    WidgetCode.create "W1234"


invalid =
    WidgetCode.create "wat"


unwrapped =
    case valid of
        Result.Ok code ->
            WidgetCode.value (code)

        Result.Err _ ->
            ""

As with Rust, we can’t construct the type without using the provided function. As I’ve written the imports, you’d create a WidgetCode by writing WidgetCode.create "W1234". You could also import it directly, but that would have its own problems once you had the create function imported for GizmoCode as well.

Finally, notice the way we aliased the module name here with as on the import: we don’t have to write out the fully qualified path this way. And there’s no conflict between the aliased module name and the type name – they live in their own namespaces (as it should be!). Importing the type name distinctly is handy because it means we don’t have to write the body of the union type out as Widget WidgetCode.WidgetCode.

F

The F code looks a lot like the Elm code. The main differences here have to do with their module systems.

namespace Ordering

type WidgetCode = private WidgetCode of string
module WidgetCode =
    let create code =
        if Regex.IsMatch(code, @"W\d{4}") then
            Ok (WidgetCode code)
        else
            Error "`WidgetCode` must begin with a 'W' and be followed by 4 digits"

    let value (WidgetCode code) = code

Here we declare that we’re in the namespace Ordering. Everything here will be publicly visible to everything else in the namespace Ordering. We could also make this a module, and in that case we’d need to explicitly open it in other modules. Because it’s part of the base namespace we’re using for Ordering, though, we get it for “free”. There’s a downside to this, though. More on that below.

Also notice that this means that we have yet one more “namespace” for names to live in: namespace names are different from module are different from type names! So here we declare a top-level module Ordering here so that we can actually write code that does something in the file – namespaces can only contain type definitions (including module definitions).

namespace Ordering

type ProductCode =
    | Widget of WidgetCode
    | Gizmo of GizmoCode

module DemoIt =
    let valid = WidgetCode.create "W1234"
    let invalid = WidgetCode.create "wat"
    
    let unwrapped =
        match valid with
        | Ok(code) -> WidgetCode.value code
        | Error(_) -> ""

The things to notice here as particularly different from the others:

  1. We don’t have to explicitly import the module names, because we used the same namespace (Ordering) to group them. We could also have done namespace Ordering.WidgetCode and open Ordering.WidgetCode; that might actually make more or less sense in the context. I think this is probably more idiomatic, however, which is why I picked it.
  2. Since we’re keeping the rest of the containing module in the same namespace, we do have to declare module DemoIt for functionality – not just types – to live in. This is true for both Ordering.fs and WidgetCode.fs and so on.

This way of structuring things works really well, but it has one major downside compared to Elm and Rust: where any given name comes from is not obvious from any given text file. Using modules instead of namespaces and using more fully qualified names could help here, but the reality is simply that F (like C) basically leaves you out to dry here. My take is that this is basically what happens when you design a language assuming IDE-like tooling. But especially when looking at e.g. GitHub diff views, or just browsing source code in general, I strongly prefer the way Elm and Rust generally lead you to do explicit imports or fully qualified paths. (Both have an escape hatch: Rust’s use path::to::module::*; and Elm’s import Path.To.Module exposing (..), but both are actively discouraged as bad practice in most situations.)

Reason

Interestingly, Reason looks most like Rust but behaves most like F. The biggest difference is that I need a separate interface file for Reason to get the privacy benefits that I’m getting in all the other languages.

We put the definition file at ordering/Ordering_WidgetCode.rei. (I’ll comment on the long name in a moment.)

type gizmoCode = pri | GizmoCode(string);

let create: string => Js.Result.t(widgetCode, string);

let value: widgetCode => string;

With that module definition in place, we can separately supply the implementation, in ordering/Ordering_WidgetCode.re.

type widgetCode =
  | WidgetCode(string);

let create = code => {
  let isMatch =
    Js.Re.fromString("W\\d{4}") |> Js.Re.exec(code) |> Js.Option.isSome;
  if (isMatch) {
    Js.Result.Ok(WidgetCode(code));
  } else {
    Js.Result.Error(
      "`WidgetCode` must begin with a 'W' and be followed by 4 digits"
    );
  };
};

let value = (WidgetCode(code)) => code;

Note that you could do the same thing with an interface file for F. We’re also doing something that’s similar in principle to the use of private types in in F, but unlike in F we have to use the module interface to make it work as far as I can tell. The interface can declare the type private, but in the actual implementation, the type has to be non-private to be constructable. (If I’m wrong, please send me a note to let me know! But that’s what I gathered from reading OCaml docs, as well as from command line error messages as I played around.) Also, the fact that Reason has landed on the keyword pri instead of OCaml and F’s much saner private is super weird.

The interface file just defines the types, and has the .rei extension. type widgetCode here is an abstract type, which provides no information about what it contains. Note the function types are provided as well. Here I’m using specifically the Js.Result type; there is also a Result type in at least one of the OCaml standard libraries. This is one of the more complicated things about Reason compared to the others: there are… several standard libraries to choose from, which will or won’t work differently depending on what compile target you’re picking.

In any case, once we have both the module and the implementation defined, we can use it like this in ordering.re:

module WidgetCode = Ordering_WidgetCode;

module GizmoCode = Ordering_GizmoCode;

open WidgetCode;

open GizmoCode;

type productCode =
  | Widget(widgetCode)
  | Gizmo(gizmoCode);

let valid = WidgetCode.create("W1234");

let invalid = WidgetCode.create("wat");

let unwrapped =
  switch valid {
  | Js.Result.Ok(code) => WidgetCode.value(code)
  | Js.Result.Error(_) => ""
  };

We do this mapping from Ordering_WidgetCode to WidgetCode here because OCaml and therefore Reason has only a single global namespace for its module names as defined by the file system. You can nest modules, but only within files. The workaround is, well… Ordering_ and remapping the name as we have here. This lets you access the nested modules as Ordering.WidgetCode and so on elsewhere.

Then we open WidgetCode etc. so that we can write widgetCode instead of WidgetCode.widgetCode in the productCode definition. This is basically the same effect we get from just being in the same namespace in F (which, again, we could rewrite exactly this way), or from the kinds of imports we discussed above for Rust and Elm.

Numeric validation: UnitQuantity

So far, the showing tilts heavily in F’s and Elm’s favor in terms of expressiveness and elegance. However, there’s a lot of variation depending on exactly what you’re doing. If, for example, you want to validate a range, well… then Rust actually has a pretty good approach! Once again, you’ll note that these all have a lot in common; the difference mostly comes down to the degree of syntactical noise required to express the same basic thing.

In this section, I’m not really going to spend a lot of time discussing the details and differences; I’m just leaving it here to show an interesting example where the languages’ design decisions end up have slightly different ergonomic tradeoffs.

Rust

// ordering/unit_quantity.rs
pub struct UnitQuantity(u32);

impl UnitQuantity {
    pub fn create(qty: u32) -> Result<UnitQuantity, String> {
        match qty {
            0 => Err(String::from("`UnitQuantity` cannot be less than 1")),
            1...1000 => Ok(UnitQuantity(qty)),
            _ => Err(String::from("`UnitQuantity` cannot be greater than 1000")),
        }
    }

    pub fn value(&self) -> u32 {
        self.0
    }

    pub fn minimum() -> UnitQuantity {
        UnitQuantity(1)
    }
}

Elm

-- ordering/UnitQuantity.elm
module Ordering.UnitQuantity exposing (UnitQuantity, create, value)


type UnitQuantity
    = UnitQuantity Int


create : Int -> Result String UnitQuantity
create qty =
    if qty < 1 then
        Err "`UnitQuantity` cannot be less than 1"
    else if qty > 1000 then
        Err "`UnitQuantity` cannot be greater than 1000"
    else
        Ok (UnitQuantity qty)


value : UnitQuantity -> Int
value (UnitQuantity qty) =
    qty

    
minimum : UnitQuantity
minimum = UnitQuantity 1

F

// ordering/UnitQuantity.fs
namespace Ordering

type UnitQuantity = private UnitQuantity of uint32
module UnitQuantity =
    let create qty =
        if qty < 1u then
            Error "`UnitQuantity` cannot be less than 1"
        else if qty > 1000u then
            Error "`UnitQuantity` cannot be greater than 1000"
        else
            Ok (UnitQuantity qty)

    let value (UnitQuantity qty) = qty
    
    let minimum = UnitQuantity 1

Reason

/* ordering/Ordering_UnitQuantity.rei */
type unitQuantity = pri | UnitQuantity(int);

let create: int => Js.Result.t(unitQuantity, string);

let value: unitQuantity => int;

let minimum: unitQuantity;
/* ordering/Ordering_UnitQuantity.re */

type unitQuantity =
  | UnitQuantity(int);

let create = qty =>
  if (qty < 1) {
    Js.Result.Error("`UnitQuantity` cannot be less than 1");
  } else if (qty > 1000) {
    Js.Result.Error("`UnitQuantity` cannot be greater than 1000");
  } else {
    Js.Result.Ok(UnitQuantity(qty));
  };

let value = (UnitQuantity(qty)) => qty;

let minimum = UnitQuantity(1);

Aside: On Documentation

One thing that became extremely clear in the course of working all of this out is that the documentation stories for these languages are in vastly, vastly different places.

Figuring out how to write this private create/value approach was very straightforward in Rust, because it’s literally just right there in how impl blocks and the pub keyword work: things default to private, including the contents of a struct, and you always define the related functionality with pub fn declarations in the related impl block.

Elm and F were both slightly harder, in that I had to poke around a bit to figure out the right way to do it. But not that much harder. Both use module-level isolation to accomplish this; the main difference there was that F just lets you do it inline and Elm explicitly ties modules to files.

Reason… was very, very difficult to get sorted out. This is just a function of the state of the ecosystem. Reason is distinct syntax for OCaml, but it also leans on BuckleScript. That means that if you want to figure out how to do anything, you probably need to search in the docs for all of those, and if your answer turns out to come from OCaml then you have to figure out how to translate it back into Reason and BuckleScript! Ultimately, I was able to figure it out and get the project layout to how you see it in the repository, but… it took a lot more digging than with any of the other projects!

Summary

As with our previous foray, we can see a ton of similarities across these languages. All lean heavily on pattern-matching for dealing with different scenarios; all let us make use of a Result type for handling success or failure; all make heavy use of expression-bodied-ness; and all supply some way to make types constructable only in safe/controlled ways.

For Rust, that’s a matter of leaving the internals of a struct private and making pub fn helpers to do the construction and value retrieval. For Elm, F, and Reason, that’s a matter of having the normal type constructors be private while exposing the types themselves normally. They do that in different ways (F’s private type, Elm’s exposing, and Reason’s pri annotation on the type variant in a module interface file), but the effect is essentially identical, and functionally equivalent to what we see in Rust.

The main differences we see across Elm, F, and Reason have to do with the nature of the various module systems. In a lot of ways, Reason’s is the least capable for this specific purpose, because it’s directly tied to OCaml’s module system, which substantially predates any of the others. (I say “in a lot of ways” because OCaml’s modules are surprisingly capable; they end up being their own kind of types and you can do some crazy things with them, all of which I’d like to actually come to understand… eventually.) Rust’s module system, meanwhile, has a lot of similarities to Elm’s in particular, but because we actually carry functions along with the types they impl (though they get defined separately, with all the power that entails), we have a bit less boilerplate we need to write just to get at the specific functions in play.

Next time (probably only a couple of weeks away because we’re working through the book at work in a book club!), I’ll be looking at Chapter 7: Modeling Workflows as Pipelines. I suspect this will be a place where the true functional orientation of Elm, F, and Reason will much more sharply differentiate them from the sometimes-functionalish-but-not-actually-functional way we write things in Rust.


  1. This reference will live and be valid as long as the underlying WidgetCode is. We could also return a String if we wanted that value to live independently of the WidgetCode instance backing it.

  2. Putting it in its own module, whether in a separate does have implications for privacy, though we don’t much care about them in this case. Rust lets us set the privacy on a whole spectrum, from “visible everywhere” to “only visible in this specific module.”