Fork me on GitHub

F# Advent Calendar 2016

This post is part of the F# Advent Calendar in English 2016. Please check out the other posts as well.

I would also like to thank Scott Wlaschin for his willingness to make an e-book with the posts and raise money for a good cause: Tweet. Be sure to contribute with this cause (retweet and economically).

Background

It should be no secret that I’m a big fan of Elm and it shouldn’t be no secret as well that I have been working with F# for a long time and is still a bit of a fanboy.

One of the features that is missed the most when jumping between these two fantastic languages is the Semantic Versioning to handle package versions from Elm (bump and diff).

In this blog post I will argue that I have made a .NET library (very early in development stage but almost complete, you folks will decide) that will be able to help you out whenever you have to decide if a new version of your Assembly or NuGet package is a Major, Minor or just a Patch (Major.Minor.Patch).

Semantic Versioning

So what is Semantic Versioning (SemVer)? If we look into the following website: semver.org, we can read statements like:

  • In the world of software management there exists a dread place called “dependency hell”.

  • The bigger your system grows and the more packages you integrate into your software, the more likely you are to find yourself in it.

  • If dependencies are specified too loosely, you will probably end up breaking your build more than desired.

In the mentioned website there are described 12 rules, to be enforced by documentation or the code itself, to solve the issue but we will stay with the following summary:

  • … given a version number (MAJOR.MINOR.PATCH), increment the:

    • MAJOR version when you make incompatible API changes,

    • MINOR version when you add functionality in a backwards-compatible manner, and

    • PATCH version when you make backwards-compatible bug fixes

And that is pretty much it.

Note: I have removed the word documentation, as I am in the belief that this task should be done automatically by a computer and not by a human. In a similar way as a computer help us catch errors at compile time instead of end-users at runtime.

elm-package bump and diff

If we look into Elm’s tool for package publishing, called elm-package, it has two main actions which are related to SemVer: bump and diff

elm-package bump

As we can see from the GitHub repository of the tool, the version rules as very similar to the ones we just described in the previous section:

  • Versions all have exactly three parts: MAJOR.MINOR.PATCH

  • All packages start with initial version 1.0.0

  • Versions are incremented based on how the API changes:
    • PATCH - the API is the same, no risk of breaking code ✓
    • MINOR - values have been added, existing values are unchanged ✓
    • MAJOR - existing values have been changed or removed ✓
  • elm-package will bump versions for you, automatically enforcing these rules ✓✓✓

Note: You can’t actually publish a package without running the bump action first.

elm-package diff

The previous action, bump is really good to tell the hole world (broadcast) the changes you have done to your library, while, diff, is great for a consumer of a library to actually see which changes have been made and if they are actually going to have an impact on the library they are developing. Example, if changes are done to some data structures I’m not using, I would probably be able to just upgrade to the newer version without having to refactor any code:

Rust and others

Other languages have been looking into SemVer as well:

I think we all have tried to use a given package that failed to install due to issues with dependent packages right? Frustration, most of the time, tend to dropping a given package and sometimes even moving on to other languages …

SpiseMisu.SemanticVersioning library

Here is my proposal of SemVer for .NET libraries as well as for NuGet packages

Note: I support both C#/F# (VB? I’m not the right guy to do this task …)

As with Elm, I would like the rules to be enforcement by the code itself, instead of by humans. Otherwise we would be back to square one as humans tend to fail with repetitive task and when the library becomes of a considerable size, the task will become unmanageable.

Elm has it easy as everything is Open Source, therefore source code can be parsed while with .NET (proprietary libraries) we will have to fallback to handle this task just like Unit Tests are handled in Fsharp.Core with Reflection:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
asm.GetExportedTypes()
...

(* extract canonical string form for every public member of every type *)
seq {
    yield! 
	  t.GetRuntimeEvents()
	  |> Seq.filter (fun m -> m.AddMethod.IsPublic)
	  |> Seq.map cast
    yield! 
	  t.GetRuntimeProperties()
	  |> Seq.filter (fun m -> m.GetMethod.IsPublic)
	  |> Seq.map cast
    yield! 
	  t.GetRuntimeMethods()
	  |> Seq.filter (fun m -> m.IsPublic)
	  |> Seq.map cast
    yield! 
	  t.GetRuntimeFields()
	  |> Seq.filter (fun m -> m.IsPublic)
	  |> Seq.map cast
    yield! 
	  ti.DeclaredConstructors
	  |> Seq.filter (fun m -> m.IsPublic)
	  |> Seq.map cast
    yield! 
	  ti.DeclaredNestedTypes
	  |> Seq.filter (fun ty -> ty.IsNestedPublic) 
	  |> Seq.map cast
} |> Array.ofSeq

There is a bit more to it as the main issue with basic Reflection, is that it works great with C# libraries, but not so much with F#. The following functions signatures are represented on the same way in .NET canonical form (no curried arguments):

1
2
3
4
5
6
7
8
9
let foo (x,y) = x + y
let bar x y = x + y

(* both represented as *)
x:System.Int32 * y:System.Int32 -> z:System.Int32

(* but should be respectively *)
x:System.Int32 * y:System.Int32 -> z:System.Int32
x:System.Int32 -> y:System.Int32 -> z:System.Int32

Therefore we need to have in mind a few other cases such as Product Types, Modules and even Enums & Sum Types (due to pattern matching) that needs to be handled in a special way:

  • Cases like Active/Partial Patterns and MeasureOfUnits are not handled (yet? Is it even necessary?).

  • Please look into the Open Source code to see what is done for each cases.

The main goal is to create a bijective function that would replace the non-injective and surjective function which will ensure that a given input value will always have a unique output value. Think of it as a perfect hash function with no collisions.

I’m also becoming really keen to Haskell and Elm signatures readability, where the last type is the return type of the function while the others are input parameters. Example:

1
FooBar : Foo -> (Bar * Baz) -> Qux

This is also the reason why I have begun to write F# code like this:

1
2
let foobar : int -> (int * int) -> int =
    fun x (y,z) -> x + y + z

And therefore I’m aiming to produce output that comply with these signatures.

Note: The learning curve for F# would be non-existent as we already see this kind of signatures when evaluating with F# Interactive but it might be hard, at the beginning, to read for the C# Community, but they probably get used to it.

As mentioned in the title, I’m aiming to support both Assemblies and NuGet packages:

  • .NET Library (Assembly): Is usually a single file compiled to target a specific version of the .NET Framework. Example:
mscorlib,Version=4.0.0.0, Culture=neutral,PublicKeyToken=...
  • .NET NuGet package: Is a unit of distribution containing some metadata as well as binaries. In most cases, there are binaries targeting several versions of the .NET Framework. We are only interested in libraries (lib/…/???.dll)

Here are a few examples on the usage of the library:

.NET NuGet package

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#!/usr/bin/env fsharpi

#I @"./SpiseMisu.SemanticVersioning/"
#r @"SpiseMisu.SemanticVersioning.dll"

open System
open System.Diagnostics
open System.Reflection

open SpiseMisu

let pkgid = @"Newtonsoft.Json"

let assembly =
  Assembly.LoadFile
    @"./packages/Newtonsoft.Json.7.0.1/lib/net45/Newtonsoft.Json.dll"

Semantic.Versioning.nugetbump
  pkgid
  NuGet.dotNet.Net45
  assembly
|> printfn "%s"

Semantic.Versioning.nugetdiff
  pkgid
  NuGet.dotNet.Net45
  (Some "7.0.1")
  pkgid
  NuGet.dotNet.Net45
  None
|> Array.iter(printfn "%s")

Output Gist SpiseMisu.NuGet.SemVer.md

.NET Library (Assembly)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#!/usr/bin/env fsharpi

#I @"./SpiseMisu.SemanticVersioning/"
#r @"SpiseMisu.SemanticVersioning.dll"

open System
open System.Diagnostics
open System.Reflection

open SpiseMisu

let released =
  Assembly.LoadFile
    @"./packages/Newtonsoft.Json/lib/net45/Newtonsoft.Json.dll"
let modified =
  Assembly.LoadFile
    @"./packages/Newtonsoft.Json.7.0.1/lib/net45/Newtonsoft.Json.dll"

Semantic.Versioning.asmbump released modified
|> printfn "%s"

Semantic.Versioning.asmdiff released modified
|> Array.iter(printfn "%s")

Output Gist SpiseMisu.Assembly.SemVer.md

.NET Library (documentation)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#!/usr/bin/env fsharpi

#I @"./SpiseMisu.SemanticVersioning/"
#r @"SpiseMisu.SemanticVersioning.dll"

open System
open System.Diagnostics
open System.Reflection

open SpiseMisu

let assembly =
  Assembly.LoadFile
    @"./packages/Newtonsoft.Json/lib/net45/Newtonsoft.Json.dll"

Semantic.Versioning.markdown assembly
|> Array.iter(printfn "%s")

Output Gist SpiseMisu.Docs.SemVer.md

Note: A side-effect of the code is that I was actually able to document the hole public API of a given library. Therefore I thought that I would just expose this logic, this was not intentional.

.NET Library (raw)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#!/usr/bin/env fsharpi

#I @"./SpiseMisu.SemanticVersioning/"
#r @"SpiseMisu.SemanticVersioning.dll"

open System
open System.Diagnostics
open System.Reflection

open SpiseMisu

let assembly =
  Assembly.LoadFile
    @"./packages/Newtonsoft.Json/lib/net45/Newtonsoft.Json.dll"

Semantic.Versioning.raw assembly
|> Set.toArray
|> Array.iter(fun (prefix, body) -> printfn "%s - %s" prefix body)

Output Gist SpiseMisu.Raw.SemVer.txt

Note: I have decided that my output should be of the type of Markdown, if other should differ from this opinion, they can just work with the raw data and produce their own.

Trivial Pursuit

Just to have an understanding of what is going on under the hood. Lets take a look at the following example code:

Enum.cs

1
2
3
4
5
6
7
8
namespace SpiseMisu
{
    public enum Enum
    {
	Foo=0,
	Bar=42
    }
}

Enum2.cs

1
2
3
4
5
6
7
8
namespace SpiseMisu
{
    public enum Enum
    {
	Foo=42,
	Bar=0
    }
}

Enum3.cs

1
2
3
4
5
6
7
8
9
10
namespace SpiseMisu
{
    public enum Enum
    {
	Foo=0,
	Bar=42,
	Baz=4,
	Qux=2
    }
}

which is compiled with Mono producing .NET assemblies.

1
2
3
4
5
6
7
8
9
10
#!/bin/bash

for f in src/*.cs
do
    name=${f##*/}
    if [ ${name%.*} != "AssemblyInfo" ] ; then
	echo "Compiling:" $f
	mcs -target:library -out:"./lib/"${name%.*}".dll" $f "./src/AssemblyInfo.cs"
    fi
done

We will now compare the Enum.cs vs Enum2.cs and Enum.cs vs Enum3.cs

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/usr/bin/env fsharpi

#I @"./SpiseMisu.SemanticVersioning/"
#r @"SpiseMisu.SemanticVersioning.dll"

open System
open System.Diagnostics
open System.Reflection

open SpiseMisu

let asm1 =
  Assembly.LoadFile
    @"./lib/Enum.dll"
let asm2 =
  Assembly.LoadFile
    @"./lib/Enum2.dll"
let asm3 =
  Assembly.LoadFile
    @"./lib/Enum3.dll"

Semantic.Versioning.asmbump asm1 asm2
|> printfn "%s"

Semantic.Versioning.asmdiff asm1 asm2
|> Array.iter(printfn "%s")

Semantic.Versioning.asmbump asm1 asm3
|> printfn "%s"

Semantic.Versioning.asmdiff asm1 asm3
|> Array.iter(printfn "%s")

What is the outcome that you expect?

Output Gist SpiseMisu.Talk.SemVer.md

If you guessed that it should be a MAJOR update in both cases, then you are actually right. I’m one of those guys that always compile my F# code with the following flag: –warnaserror:25 to ensure that all patterns are covered. Therefore, if you add a new value to Enums and Sum Types, this should state that is a MAJOR update and you should probably revise your code to handle the newly added patterns.

What’s next?

I have given a talk on this matter at MF#K where we had an interesting discussion (a few feature addition were requested). I will see how fast I can make these changes and finalize the project, based on Project Scaffold, and then I will release the library as Open Source @ GitHub.

When the code is publicly available, I would RFC from the .NET Community, specially experts in the subject, to review the code.

Note: Have in mind, that I’m not using the F# Compiler Services as I’m aiming for a very lightweight library with as few dependencies as possible

The outcome I desire for this library is that it gets integrated with:

  • NuGet: Or something similar, please steal with pride. The goal is that we get this awesome feature and not so much that it was me who did it.

  • FAKE: This is one of the reasons I produce Markdown output as it is possible to just pipe the result of bump and diff to the top of the RELEASE_NOTES.md to automate the generation of this document, as it sometimes can be a bit tedious and boring.

  • Paket: This would be optional, but because this tool already has NuGet package publishing capabilities, therefore it would give sense to add the bump and diff functionality.

Last, but not least, in order to catch on with the C# Community, I guess that we will need a standalone executable (something like paket.exe) which is totally transparent with regard of it’s F# nature, as we all know some .NET people are allergic to the this amazing programming language. I’m thinking about using for this task:

Conclusion:

I hope I’ve convinced you of why SemVer enforced by code is so awesome.

I’m also hopping that you can agree that the presented library:

  • Supports .NET Assemblies and NuGet packages as well as C# and F# (even proprietary due to Reflection).

  • SemVer rules are also enforced by the code itself, just like elm-package.

  • The produced Markdown output is easily readable.

As I mentioned before, I gave a talk on the subject at the F#unctional Copenhageners Meetup Group where we had a really constructive discussion with different points. One of the main points, delivered by Ulrik Rasmussen was that Semantic Versioning shouldn’t be called that in the first place, but instead it should be called Syntactic Versioning as we are only able to see if a dependencies break when signatures change (syntax) and not if the behavior (semantics) change. Let’s look into a simple example where we just refactor our foo function from fold left with right:

1
2
3
4
5
let foo : int list -> int list =
  fun xs -> List.fold(fun a x -> x :: a) [] xs

let foo : int list -> int list =
  fun xs -> List.foldBack(fun x a -> x :: a) xs []

Syntactically the signatures are the same, but the semantics of the application aren’t.

That said, I think the intentions of Tom Preston-Werner was to keep the dependency hell to a minimum as well as limiting the amount of builds breaking. As with everything else, there is really no silver bullet to solve all our problems at once, but we should strive to achieve it though:

Perfection is not attainable, but if we chase perfection we can catch excellence” –Vince Lombardi

Code Snippet

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
-- HELPERS


isHex : String -> String
isHex str =
    case String.startsWith "0x" str of
        True ->
            hexToString (String.dropLeft 2 str)

        False ->
            toSingleByte str


hexToString : String -> String
hexToString hex =
    -- Take chunks of 2 hex digits and apply formula: (a * 16^1) + (b * 16^0)
    -- Example (d_16 = 13_10): 0xd3 => (d * 16^1) + (3 * 16^0) = 208 + 3 = 211
    let
        zero =
            hex |> String.all (\c -> c == '0')

        asciiToNumber x =
            let
                y =
                    x |> Char.toCode
            in
                case y > 96 of
                    -- ASCII codes for [a-f] => [97-102]
                    True ->
                        y - 97 + 10

                    -- ASCII codes for [0-9] => [48-57]
                    False ->
                        y - 48

        rec xs acc =
            case xs == "" of
                False ->
                    let
                        ( y, ys ) =
                            case String.uncons xs of
                                Nothing ->
                                    ( Nothing, "" )

                                Just ( a, rest ) ->
                                    ( Just a, rest )

                        ( z, zs ) =
                            case String.uncons ys of
                                Nothing ->
                                    ( Nothing, "" )

                                Just ( b, rest ) ->
                                    ( Just b, rest )

                        keyCode =
                            case ( y, z ) of
                                ( Just a, Just b ) ->
                                    ((asciiToNumber a) * 16)
                                        + (asciiToNumber b)

                                ( Just a, _ ) ->
                                    (asciiToNumber a) * 16

                                ( _, _ ) ->
                                    0
                    in
                        rec zs (acc ++ (keyCode |> stringify))

                True ->
                    acc
    in
        case zero of
            True ->
                ""

            False ->
                rec (hex |> String.toLower) ""

References:

Utils Code Snippet

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// You can disable this warning by using '--mlcompatibility' or '--nowarn:62
#nowarn "62" 

module Util =
    open System
    
    let iso8601 : int -> int -> int -> string =
        fun y m d -> (new DateTime(y,m,d)).ToString("o") + "Z"
    
    let byteToHex : byte -> string =
        fun b -> b.ToString("x2")
        
    let bytesToHex : byte array -> string =
        fun bytes -> bytes |> Array.fold (fun a x -> a + (byteToHex x)) ""
    
    let utf8ToBytes : string -> byte array =
        fun utf8 -> System.Text.Encoding.UTF8.GetBytes utf8
    
    let sha256' : byte array -> byte array =
        fun bytes ->
            use sha256 = System.Security.Cryptography.SHA256.Create()
            sha256.ComputeHash(buffer = bytes)
    
    (* mon@razerRamon:~$ echo -n 'foo' | sha256sum
       2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae  - *)
    let sha256 : string -> string =
        fun utf8 -> utf8 |> (utf8ToBytes >> sha256' >> bytesToHex)
    
    let ceilPow : uint64 -> uint64 =
        fun n ->
            let rec loop : (uint64 * int) -> uint64 = function
                | 0UL, acc -> 1 <<< acc |> uint64
                | m  , acc ->
                    let m' = m &&& 1UL
                    (m-m' >>> 1, acc+1) |> loop
            (n-1UL,0) |> loop

Utils Code output:

> 
module Util = begin
  val iso8601 : y:int -> m:int -> d:int -> string
  val byteToHex : b:byte -> string
  val bytesToHex : bytes:byte array -> string
  val utf8ToBytes : utf8:string -> byte array
  val sha256' : bytes:byte array -> byte array
  val sha256 : utf8:string -> string
  val ceilPow : n:uint64 -> uint64
end

JSON Code Snippet

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
module Json =
    (* http://json.org/ *)
    type value =
        | String  of string
        | Number  of float
        | Object  of (string * value) list
        | Array   of value list
        | Boolean of bool
        | Null
    with
        override json.ToString() =
            let rec print : value -> string = function
                | String  s  -> sprintf "\"%s\"" s
                | Number  n  -> sprintf "%f" n
                | Object  xs -> xs |> objectHelper
                | Array   xs -> xs |> arrayHelper
                | Boolean b  -> sprintf "%b" b
                | Null       -> "null"
            and objectHelper : (string * value) list -> string =
                function
                | [] -> "{ }"
                | xs ->
                    sprintf "{ %s }"
                      (xs
                       |> List.map (fun (name,value) ->
                            sprintf "%s: %s"
                                (String name |> print) (value |> print))
                       |> List.reduce (fun x y -> sprintf "%s, %s" x y))
            and arrayHelper : value list -> string = function
                | [] -> "[ ]"
                | xs ->
                    sprintf "[ %s ]"
                      (xs
                       |> List.map print
                       |> List.reduce (fun x y -> sprintf "%s, %s" x y))
            
            json |> print

JSON Code output:

> 
module Json = begin
  type value =
    | String of string
    | Number of float
    | Object of (string * value) list
    | Array of value list
    | Boolean of bool
    | Null
    with
      override ToString : unit -> string
    end
end

Merkle Code Snippet

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
module Merkle =
    open Util
    
    type hash  = string
    type json  = string
    type count = uint64
    
    type tree =
        private
        | Leaf   of json option
        | Branch of (hash * count) * tree * tree
    with
        override tree.ToString() =
            let rec print : (int * tree) -> string = function
                | i, Leaf  None -> 
                    sprintf "\n%s NIL" (String.replicate i "\t")
                | i, Leaf (Some json) ->
                    sprintf "\n%s json: %s" (String.replicate i "\t") json
                | i, Branch ((h,n),left,right) ->
                    sprintf "\n%s * hash:  %s" (String.replicate i "\t") h +
                    sprintf "\n%s * count: %i" (String.replicate i "\t") n +
                    sprintf "\n%s   - left: %s"
                        (String.replicate i "\t") ((i+2,left)  |> print) +
                    sprintf "\n%s   - right: %s"
                        (String.replicate i "\t") ((i+2,right) |> print)
            (0,tree) |> print
    
    module Tree =
        let init   : json -> tree =
            fun msg ->
                let h = msg |> Util.sha256
                Branch((h, 1UL), msg |> Some |> Leaf, Leaf None)
        let insert : json -> tree -> tree =
            fun msg tree ->
                let helper : tree -> hash option = function
                    | Leaf  None        -> None
                    | Leaf (Some msg)   -> msg |> Util.sha256 |> Some
                    | Branch((h,_),_,_) -> h   |> Some
                
                let rec loop : tree -> tree  = function
                    | Leaf  None         -> msg |> Some |> Leaf
                    | Leaf (Some x) as l ->
                        let h1 = x       |> Util.sha256
                        let h2 = msg     |> Util.sha256
                        let h  = h1 + h2 |> Util.sha256
                        
                        Branch((h,2UL), l, msg |> Some |> Leaf)
                        
                    | Branch((h,n),l,r) as b ->
                        match n > 1UL && n = (n |> ceilPow) with
                            | true  ->
                                let h' = h + (msg |> Util.sha256) |> Util.sha256
                                Branch((h',n+1UL), b, msg |> Some |> Leaf)
                            | false ->
                                let rt = r  |> loop
                        
                                let lh = l  |> helper
                                let rh = rt |> helper
                        
                                let h' = (lh,rh) |> function
                                    | None   , None    -> h
                                    | Some v , None
                                    | None   , Some v  -> v
                                    | Some h1, Some h2 -> h1 + h2 |> Util.sha256
                                
                                Branch((h',n+1UL), l, rt)
                
                tree |> loop

Merkle Code output:

> 
module Merkle = begin
  type hash = string
  type json = string
  type count = uint64
  type tree =
    private | Leaf of json option
            | Branch of (hash * count) * tree * tree
    with
      override ToString : unit -> string
    end
  module Tree = begin
    val init : msg:json -> tree
    val insert : msg:json -> tree:tree -> tree
  end
end

Example, see References:

                              +------------- 6 -------------+ 
                              |                             |
                    +-------- 4 --------+         +-------- 2 --------+
                    |                   |         |                   |
               +--- 2 ---+         +--- 2 ---+   'e'                 'f'
               |         |         |         |
              'a'       'b'       'c'       'd'
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
let a,b,c,d,e =
    Json.Object[
        "name", Json.String "Bridge Cafe"
        "rating", Json.Number 4.
        "date", Util.iso8601 2014 02 20 |> Json.String
        ]
    , Json.Object[
        "name", Json.String "Prima Doner"
        "rating", Json.Number 2.
        "date", Util.iso8601 2014 04 15 |> Json.String
        ]
    , Json.Object[
        "name", Json.String "The Bull"
        "rating", Json.Number 3.
        "date", Util.iso8601 2014 06 05 |> Json.String
        ]
    , Json.Object[
        "name", Json.String "The Tall Ship"
        "rating", Json.Number 5.
        "date", Util.iso8601 2014 10 30 |> Json.String
        ]
    , Json.Object[
        "name", Json.String "Roy's Rolls"
        "rating", Json.Number 3.
        "date", Util.iso8601 2015 01 10 |> Json.String
        ]

let f =
    Json.Object[
        "name", Json.String "Prima Doner"
        "rating", Json.Number 4.
        "date", Util.iso8601 2015 02 12 |> Json.String
        ]

let mtree =
    ( Merkle.Tree.init (a |> string), [b;c;d;e] )
    ||> List.fold (fun a x -> a |> Merkle.Tree.insert (x |> string))

let mtree' =
    mtree
    |> Merkle.Tree.insert (f |> string)
> 
val e : Json.value =
  Object
    [("name", String "Roy's Rolls"); ("rating", Number 3.0);
     ("date", String "2015-01-10T00:00:00.0000000Z")]
val d : Json.value =
  Object
    [("name", String "The Tall Ship"); ("rating", Number 5.0);
     ("date", String "2014-10-30T00:00:00.0000000Z")]
val c : Json.value =
  Object
    [("name", String "The Bull"); ("rating", Number 3.0);
     ("date", String "2014-06-05T00:00:00.0000000Z")]
val b : Json.value =
  Object
    [("name", String "Prima Doner"); ("rating", Number 2.0);
     ("date", String "2014-04-15T00:00:00.0000000Z")]
val a : Json.value =
  Object
    [("name", String "Bridge Cafe"); ("rating", Number 4.0);
     ("date", String "2014-02-20T00:00:00.0000000Z")]
> 
val f : Json.value =
  Object
    [("name", String "Prima Doner"); ("rating", Number 4.0);
     ("date", String "2015-02-12T00:00:00.0000000Z")]
> 
val mtree : Merkle.tree =
  
 * hash:  dc999d3a9b252bebd171775e24668293e0ec1691f8d60331e85eed24ec6ca392
 * count: 5
   - left: 
     * hash:  1ae6f3cb6407d42c9be994971b46de89b6b5facb53e7d1a01c04a92f74264483
     * count: 4
       - left: 
         * hash:  28ee16e42affeecfc1b997487e4294f5067ced3bef2ca7c6324dcf86b7961954
         * count: 2
           - left: 
             json: { "name": "Bridge Cafe", "rating": 4.000000, "date": "2014-02-20T00:00:00.0000000Z" }
           - right: 
             json: { "name": "Prima Doner", "rating": 2.000000, "date": "2014-04-15T00:00:00.0000000Z" }
       - right: 
         * hash:  255a0ad108003e34e449159a63306a292357fd0d40f6449f148467ec2532ed0c
         * count: 2
           - left: 
             json: { "name": "The Bull", "rating": 3.000000, "date": "2014-06-05T00:00:00.0000000Z" }
           - right: 
             json: { "name": "The Tall Ship", "rating": 5.000000, "date": "2014-10-30T00:00:00.0000000Z" }
   - right: 
     json: { "name": "Roy's Rolls", "rating": 3.000000, "date": "2015-01-10T00:00:00.0000000Z" }
> 
val mtree' : Merkle.tree =
  
 * hash:  fb8b96a10235da8cc444a0ddf41bdcfef035f743e84d69b15b146c1af48c6848
 * count: 6
   - left: 
     * hash:  1ae6f3cb6407d42c9be994971b46de89b6b5facb53e7d1a01c04a92f74264483
     * count: 4
       - left: 
         * hash:  28ee16e42affeecfc1b997487e4294f5067ced3bef2ca7c6324dcf86b7961954
         * count: 2
           - left: 
             json: { "name": "Bridge Cafe", "rating": 4.000000, "date": "2014-02-20T00:00:00.0000000Z" }
           - right: 
             json: { "name": "Prima Doner", "rating": 2.000000, "date": "2014-04-15T00:00:00.0000000Z" }
       - right: 
         * hash:  255a0ad108003e34e449159a63306a292357fd0d40f6449f148467ec2532ed0c
         * count: 2
           - left: 
             json: { "name": "The Bull", "rating": 3.000000, "date": "2014-06-05T00:00:00.0000000Z" }
           - right: 
             json: { "name": "The Tall Ship", "rating": 5.000000, "date": "2014-10-30T00:00:00.0000000Z" }
   - right: 
     * hash:  dae71b4d5d4f57af9abd8cbf2a621e6d1eb110bef0ed34d0a0356e5dc766eff7
     * count: 2
       - left: 
         json: { "name": "Roy's Rolls", "rating": 3.000000, "date": "2015-01-10T00:00:00.0000000Z" }
       - right: 
         json: { "name": "Prima Doner", "rating": 4.000000, "date": "2015-02-12T00:00:00.0000000Z" }

UnitTest for SHA256

mon@razerRamon:~$ echo -n '{ "name": "Bridge Cafe", "rating": 4.000000, "date":
"2014-02-20T00:00:00.0000000Z" }' | sha256sum
b07cd889093179c2923fa8bfc480bfa153fe74c0b7009c46b33045d1e2d5632b -

mon@razerRamon:~$ echo -n '{ "name": "Prima Doner", "rating": 2.000000, "date":
"2014-04-15T00:00:00.0000000Z" }' | sha256sum
32128e3d309816c07db4ff4c995aa692c3390b48d23f6ac7429538b57dc2c372 -

mon@razerRamon:~$ echo -n
'b07cd889093179c2923fa8bfc480bfa153fe74c0b7009c46b33045d1e2d5632b
32128e3d309816c07db4ff4c995aa692c3390b48d23f6ac7429538b57dc2c372' | sha256sum
28ee16e42affeecfc1b997487e4294f5067ced3bef2ca7c6324dcf86b7961954 -
1
2
3
4
5
6
7
8
9
10
let unitTest =
    let a' =
        a |> string
          |> Util.sha256
    let b' =
        b |> string
          |> Util.sha256

    "28ee16e42affeecfc1b997487e4294f5067ced3bef2ca7c6324dcf86b7961954"
        = ((a' + b') |> Util.sha256)
> 
val unitTest : bool = true

References:

Code Snippet

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
let toBinary : int -> uint64 -> string =
    fun bits n ->
        let pad : int -> string -> string = fun n x -> x.PadLeft(n,'0')
        let rec loop : (uint64 * string list) -> string list = function
            | 0UL,acc -> acc
            | m  ,acc ->
                let m' = m &&& 1UL
                ( (m-m') >>> 1, string m' :: acc ) |> loop
        ( n, [] ) |> loop |> List.fold (fun acc x -> x + acc) "" |> pad bits

let x64 = toBinary 64
let x32 = toBinary 32
let x16 = toBinary 16
let x8  = toBinary  8

225UL |> x8
225UL |> x16
42UL  |> x32
0UL   |> x64

open System

Int32.MaxValue  |> uint64 |> x32
Int64.MaxValue  |> uint64 |> x64
UInt64.MaxValue           |> x64

Code output:

> 
val toBinary : bits:int -> n:uint64 -> string

> 
val x64 : (uint64 -> string)
val x32 : (uint64 -> string)
val x16 : (uint64 -> string)
val x8 : (uint64 -> string)

> val it : string = "10000111"
> val it : string = "0000000010000111"
> val it : string = "00000000000000000000000000010101"
> val it : string =
  "0000000000000000000000000000000000000000000000000000000000000000"

> val it : string = "01111111111111111111111111111111"
> val it : string =
  "0111111111111111111111111111111111111111111111111111111111111111"
> val it : string =
  "1111111111111111111111111111111111111111111111111111111111111111"

References:

Background

This seems a topic that keeps showing up again and again. After every MF#K Meetup last Tuesday of every month we always go out for a couple of beers and speak heavily in favor of the language that we like the most. There are people who seem to need types to code, I will include myself in this group, while others seem to do fine with languages without types as for example Clojure, Erlang, Elixir, … My former workmate, Brandon Lucas, keeps trolling on how you can’t model a state-machine with types, and until I wrote this post, I would totally agree.

What you normally see in blog post when this topic is explained is something similar to this (I will draw some ASCII art to give a better understanding):

                           +: State
                           #: Transition
                          
                           +--------------------------+
                           |   TurnedOn (On Switch)   |
                           +--------------------------+
                                ʌ              |
                                |              v
                           #----------#   #-----------#
                           |  TurnOn  |   |  TurnOff  |
                           #----------#   #-----------#
                                ʌ               |
                                |               v
                           +--------------------------+
                           |  Turnedoff (Off Switch)  |
                           +--------------------------+
1
2
3
4
5
6
7
8
9
10
11
12
module WhatYouNormallySee =
    type State = On | Off
    
    // Bug due to lack of testing
    // Note: ALWAYS use FsCheck, F# implementation of Haskells QuickCheck
    let transition = function
        | On  -> On
        | Off -> On
    
    let transitionFixed = function
        | On  -> Off
        | Off -> On

What is represented here is a state machine for a light switch. The state is defined as a sum type (algebraic data type) of the two values it can be. But, then when you need to perform the state transition, you would see how people fallback to a function to handle this logic.

In my example, I have deliberately introduced a bug in the transition function just to prove why this approach is problematic.

This is one of the misconceptions that you hear people talking about when they make the transition to functional programming languages. They think just because they have modeled the domain with a few sum and product types (algebraic data types) it’s all good and you can then claim absolute sentences like: “Make illegal states unrepresentable” and “Making Impossible States Impossible” and therefore you probably don’t need to test that part of the code, which is obviously a wrong misconception of what the authors tries to point out.

We need to be very thoughtful (and mostly careful) when we make those kind of statements, specially due to the audiences that might receive (conceive) these messages.

Note: I’m not dishing neither “Yaron Minsky” nor “Richard Feldman” as I have a HUGE respect for both on their work on OCaml and Elm respectively.

Use the type system instead of functions

So how can we move the logic from the function into the type domain by using the type system?

Well firstly we will need to introduce the following three simple concepts:

  1. Phantom Types: Are parametrised types whose parameters do not all appear on the right-hand side of its definition. Example: type 'a Foo = Bar.

  2. Function Types: Define a function signature as a type. Example for the identity function: type 'a Id = 'a -> 'a.

  3. Not accessible Sum Type Case Constructors: By hiding the underlying case constructors for a given sum type, you can ensure that only specific parts of the code can instantiate your type. Example: type FooBar = private | Foo of int | Bar of float

Lets see how I use them to re-model the light switch state machine:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
module Light =
    type 'a Switch = private | State
    
    and TurnedOn  = On  Switch
    and TurnedOff = Off Switch
    
    and On  = On
    and Off = Off
    
    and TurnOn  = TurnedOff -> TurnedOn
    and TurnOff = TurnedOn  -> TurnedOff
    
    module Switch =
        let private initHelper = State
        let private turnHelper = fun _ -> State
        
        let initOn  : TurnedOn  = initHelper
        let initOff : TurnedOff = initHelper
        
        let turnOn  : TurnOn    = turnHelper
        let turnOff : TurnOff   = turnHelper
    
    module Output =
        // Expensive call cos of .NET Type Reflection
        let state (x:'a Switch)  =
            match typedefof<'a> with
                | t when t = typedefof<On>  -> "on"
                | t when t = typedefof<Off> -> "off"
                | _________________________ -> "invalid type"
> 
module Light = begin
  type 'a Switch = private | State
  and TurnedOn = On Switch
  and TurnedOff = Off Switch
  and On = | On
  and Off = | Off
  and TurnOn = TurnedOff -> TurnedOn
  and TurnOff = TurnedOn -> TurnedOff
  module Switch = begin
    val private initHelper : 'a Switch
    val private turnHelper : 'a -> 'b Switch
    val initOn : TurnedOn
    val initOff : TurnedOff
    val turnOn : TurnOn
    val turnOff : TurnOff
  end
  module Output = begin
    val state : x:'a Switch -> string
  end
end

We combine the concepts 1. and 3. to define the State type, which we limit to only two states: TurnedOn and TurnedOff, which also requires to introduce two type terms: On and Off.

Finally, we just need to expand our domain with the transition types, which we can use concept 2. to create two transition states: TurnOn and TurnOff, which will subsequently require to have the opposite state as input parameter.

That’s it. Now our domain model contains all the logic while our functions just are pure interfaces with no logic whatsoever, see both helper functions, for the initXXX and turnXXX functions. The functions just return the internal State type, which gets tagged by the type definitions. Pretty nifty right?

And we can be sure that no invalid State is created because we ensured that it can’t be instantiated from outside the module (and sub modules). So even though type 'a Switch is a generic type, we have limited only to the two states mentioned before.

The only minor issue is that type abbreviation (alias) in F# are erased at compile time and therefore not available at runtime, as Marcus Griep points out in the following tweet, therefore it’s a bit more difficult to output the currently state (see in next coding blocks how this can be overcome).

Demo:

So lets see how its used:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
open Light

let on =
    Switch.initOff
    |> Switch.turnOn

let off =
    on
    |> Switch.turnOff

let error =
    off
    // |> Switch.turnOff
    (* error FS0001: Type mismatch. Expecting a
           TurnedOff -> 'a    
       but given a
           TurnOff    
       The type 'Off' does not match the type 'On' *)

// on = off
(* error FS0001: Type mismatch. Expecting a
       TurnedOn    
   but given a
       TurnedOff    
   The type 'On' does not match the type 'Off' *)

on  |> Output.state
off |> Output.state

Produces the following output:

> val on : TurnedOn
> val off : TurnedOff
> val error : TurnedOff
> val it : string = "on"
> val it : string = "off"

A bit more complex example where we just want to automate the switch to turn on/off a couple of times in a row. To be able to do this, we introduce the Either sum type for better readability, but the built-in Choice<'a,'b> F# type could be used as well. This construct will also allow us to make a better output printer than the one that is based on .NET Reflection as we have a guarantee of which types go in to the Left and Right wrappers.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// You can disable this warning by using '--mlcompatibility' or '--nowarn:62'
#nowarn "62" 

module Util =
    type ('a,'b) Either = Right of 'a | Left of 'b

open Util
open Light

let blinking : (TurnedOn,TurnedOff) Either -> (TurnedOn,TurnedOff) Either =
    function
        | Right on  -> on  |> Switch.turnOff |> Left
        | Left  off -> off |> Switch.turnOn  |> Right

// Or just "on" or "off" as we have the certainty of the types and it's a lot
// cheaper than .NET Reflection
let sprint : (TurnedOn,TurnedOff) Either -> string = function
    | Right s -> s |> Output.state // or just "on"
    | Left  s -> s |> Output.state // or just "off"

let foldHelper = function
    | [   ] -> fun _ -> [                   ]
    | x::xs -> fun _ -> blinking x :: x :: xs

([ Left off ], [ 1 .. 15 ])
||> List.fold foldHelper
|> List.map sprint
|> List.iter (printf "%s ")
> 
module Util = begin
  type ('a,'b) Either =
    | Right of 'a
    | Left of 'b
end

> 
val blinking :
  _arg1:(TurnedOn,TurnedOff) Either -> (TurnedOn,TurnedOff) Either

> 
val sprint : _arg1:(TurnedOn,TurnedOff) Either -> string

> 
val foldHelper :
  _arg1:(TurnedOn,TurnedOff) Either list ->
    ('a -> (TurnedOn,TurnedOff) Either list)

> on off on off on off on off on off on off on off on off val it : unit = ()

Conclusion:

I hope I can convince others that it is possible to model a state machine exclusively by using the type system, while keeping the logic out of the function layer. It uses a few type tricks that are present in F# but probably also in other ML alike languages.

Note: Don’t forget to ALWAYS use FsCheck, F# implementation of Haskells QuickCheck, even if you use this kind of approach. We are all human and therefore can fail. If you just remember this last part, You would make me a happy person.

References: