Pipeline operator: Hack vs F#

2021-09-17

Taking a look at the proposed pipeline operator in JavaScript, seeing why it is useful, and comparing the Hack and F# proposals. Plus hearing both sides of the argument.

A fire red pipeline suspended in the air by cables. The first sections of the pipeline are clearly visible, the later sections are surrounded by a thick fog. The fog represents the uncertain future of the pipeline operator.
Photo by JJ Ying

Intro

Last week the Hack proposal for the pipeline operator advanced to stage 2. The competing F# proposal is still in stage 1. This has caused quite a stir in some corners of the JavaScript ecosystem.

This is because some people strongly prefer one approach over the other.

In this post I will try to explain the differences between the two approaches. I'll do so in a way that is friendlier for beginners, as the documents you'll find on GitHub are quite technical.

Why pipelines

First let us look at an example where pipelines would be useful.

In the example we are an HR manager. We are responsible for doing the yearly salary rounds at our company. Instead of using Excel we program this routine in JavaScript.

The data structure we are going to process is an array of objects:

const employees = [
  {
    name: 'John',
    wage: 2800,
    level: "medior",
    performance: 4,
  },
  {
    name: 'Sandra',
    wage: 3800,
    level: "medior",
    performance: 10,
  }
  // and more
];

The performance is a number from 0 to 10, which indicates how well an employee did this year. It will determine how much of an increase the employee will get.

The level is a string which can be either junior, medior, or senior. The level is based purely on someone's wage.

The program we run yearly (after putting in the inflation, and performance) is:

function processEmployee(employee) {
  const employeeIncreasedByPerformance = increaseByPeformance(employee);
  const employeeIncreasedByInflation = increaseByInflation(employeeIncreasedByPerformance, 1.25);
  const employeeRounded = roundWage(employeeIncreasedByInflation);
  return promoteEmployee(employeeRounded);
}

function increaseByInflation(employee, inflation) {
  const wage = employee.wage + employee.wage / 100 * inflation;
  return {...employee, wage};
}

function increaseByPeformance(employee) {
  const wage = employee.wage + employee.wage / 100 * employee.performance;
  return {...employee, wage};
}

function roundWage(employee) {
  const wage = Math.floor(employee.wage / 100) * 100;
  return {...employee, wage};
}

function promoteEmployee(employee) {
  if (employee.wage >= 4000) {
    return {...employee, level: "senior"};
  } else if (employee.wage >= 2500) {
    return {...employee, level: "medior"};
  } else {
    return {...employee, level: "junior"};
  }
}

console.log(employees.map(processEmployee));

The main function we are going to look at in this post is the processEmployee function. It does the following things:

  • Increase the wage based on the performance.
  • Increase the wage based on the inflation rate.
  • Round the wages so they look nice, and don't feel as random.
  • Finally promote all employees which are now in a new wage bracket.

The helper functions handle each step in the process. Each helper does not mutate the employee, but returns a copy.

The reason our helpers copy instead of mutate is that we, as the HR manager, want to compare the old and new wages. We like to use the developer console to interactively run some simulations. When the employee is not actually mutated we can call our helper functions safely, without fear of corrupting the data.

Without this constraint we could have simply mutated the employee in each helper function, and we'd get the same end result. But we'd lose the ability to experiment and mess around.
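To make the difference concrete, here is a sketch contrasting a hypothetical mutating variant of roundWage with the copying one. Only the copying version leaves the original employee intact for further experiments:

```javascript
// Hypothetical mutating variant: changes the employee in place.
function roundWageMutating(employee) {
  employee.wage = Math.floor(employee.wage / 100) * 100;
  return employee;
}

// The copying variant from above: returns a new object, leaves the input alone.
function roundWage(employee) {
  const wage = Math.floor(employee.wage / 100) * 100;
  return {...employee, wage};
}

const john = {name: 'John', wage: 2850, level: 'medior', performance: 4};

const roundedCopy = roundWage(john);
console.log(john.wage);        // still 2850, safe to experiment again
console.log(roundedCopy.wage); // 2800

roundWageMutating(john);
console.log(john.wage);        // now 2800, the original value is gone
```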

Now let's focus on this bit of code in particular:

function processEmployee(employee) {
  const employeeIncreasedByPerformance = increaseByPeformance(employee);
  const employeeIncreasedByInflation = increaseByInflation(employeeIncreasedByPerformance, 1.25);
  const employeeRounded = roundWage(employeeIncreasedByInflation);
  return promoteEmployee(employeeRounded);
}

Say we really hate the idea of those result variables. They do not add much, and they take up a lot of space. What can we do about this?

Let's look at some solutions, some with pipelines and some without.

A. Nested calls

function processEmployee(employee) {
  return promoteEmployee(
            roundWage(
              increaseByInflation(
                increaseByPeformance(employee), 
                1.25)));
}

This solution removes all result variables by simply passing the result directly into the next function.

As a negative consequence, the code must now be read from the innermost function call to the outermost function call.

In increaseByInflation you see the weakness of this approach: having more than one parameter for a function breaks the readability. This example has only one such function; just imagine if there were more.

My verdict: this is not an improvement.

B. Fluent interface

function processEmployee(employee) {
  return fluentify(employee)
    .increaseByPeformance()
    .increaseByInflation(1.25)
    .roundWage()
    .promoteEmployee()
    .value();
}

function fluentify(employee) {
  return {
    increaseByInflation(inflation) {
      const e = increaseByInflation(employee, inflation)
      return fluentify(e);
    },
    increaseByPeformance() {
      const e = increaseByPeformance(employee)
      return fluentify(e);
    },
    roundWage() {
      const e = roundWage(employee)
      return fluentify(e);
    },
    promoteEmployee() {
      const e = promoteEmployee(employee)
      return fluentify(e);
    },
    value() {
      return employee;
    }
  }
}

// The rest of the helper functions remain the same.

This solution uses a so-called fluent interface. Fluent interfaces allow calls to be chained together, making the code very readable.

The price we pay is in the fluentify function: it is rather mind-bending. It returns an object with methods named after each "helper", where each "helper" is wrapped so it can be chained. Calling value stops the chain and gives the final answer.

The point is that a fluent interface requires either more code, as in the example above, or a change in the way you write your functions in the first place.

Is it worth it?

C. A pipe function

function processEmployee(employee) {
  return pipe(employee, [
    increaseByPeformance,
    (e) => increaseByInflation(e, 1.25),
    roundWage,
    promoteEmployee
  ]);
}

function pipe(value, functions) {
  return functions.reduce((acc, fn) => fn(acc), value);
}

Here we see a very simple pipe function: it accepts a value and an array of functions, and calls each function sequentially with the result of the previous function.

This idea is not new: lodash's flow and Ramda's pipe are quite similar. The difference is that they both return a function which invokes the provided functions, instead of returning the value directly as in the example above.
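To illustrate that difference, here is a minimal sketch of a flow-style pipe. This is my own toy version, not lodash's or Ramda's actual implementation; it returns a reusable function instead of a value:

```javascript
// Returns a new function which pipes its input through all given functions.
function flow(functions) {
  return (value) => functions.reduce((acc, fn) => fn(acc), value);
}

// Hypothetical helpers, purely for illustration.
const double = (x) => x * 2;
const increment = (x) => x + 1;

// The pipeline is now a reusable function.
const doubleThenIncrement = flow([double, increment]);

console.log(doubleThenIncrement(5));  // 11
console.log(doubleThenIncrement(10)); // 21
```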

What is the drawback here? Nothing really, except that functions calling functions can get a little abstract.

Pipelines

We've now seen three solutions, each with their own drawbacks. Let's look at the |> pipeline operator now.

F# solution

With the F# version of the pipeline operator, the processEmployee function can be rewritten like so:

function processEmployee(employee) {
  return employee
    |> increaseByPeformance
    |> (employee) => increaseByInflation(employee, 1.25) 
    |> roundWage
    |> promoteEmployee;
}

Hack solution

With the Hack version of the pipeline operator we can write processEmployee like so:

function processEmployee(employee) {
  return employee
    |> increaseByPeformance(^) 
    |> increaseByInflation(^, 1.25) 
    |> roundWage(^)
    |> promoteEmployee(^);
}

The gist of what happens in both proposals is this: a value is "piped" from one function / expression to the next. So instead of writing:

const result = qux(bar(foo(42)))

You can write:

const result = 42 |> foo |> bar |> qux

If you already know Array.map you might look at it as a map which works on any value. Hypothetically this would work like this (if it were possible):

const result = 42
    .map(foo)
    .map(bar)
    .map(qux);

Alas, numbers in JavaScript do not have a map method, so the example above is not possible.
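You could, however, get close to the analogy by wrapping the value in a tiny object with its own map method. This box helper is purely a toy for illustration:

```javascript
// A toy wrapper which gives any value a chainable map method.
function box(value) {
  return {
    map: (fn) => box(fn(value)),
    value: () => value,
  };
}

// Hypothetical functions, just to show the chaining.
const foo = (x) => x + 1;
const bar = (x) => x * 2;
const qux = (x) => x - 3;

const result = box(42)
  .map(foo)
  .map(bar)
  .map(qux)
  .value();

console.log(result); // 83
```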

If you look at both pipeline examples it is easy to see why this new operator can be appealing:

  • No more need for result / intermediate variables, but still get to read the code from top to bottom.
  • Also without intermediate variables there is less room for assigning to the wrong variable by accident.
  • It makes nested calls unnecessary and less tempting to use, which should help with the readability of our code.
  • You can get the same benefits of fluent interfaces without having to alter your code, or write your code with fluency in mind.
  • It provides pipelines out of the box, no libraries needed.

Both proposals use the |> character sequence for the pipeline operator, but the similarities end there.

Both have a completely different outlook on how the pipeline operator should work.

Let's dive into both proposals in more detail.

F#

The name for the F# proposal comes from the F# (F sharp) functional programming language. It contains a pipeline operator, |>, on which the proposal is based.

In the F# proposal the right hand side (RHS) of the |> is always a function. That function should take exactly one argument, which is the value being piped.

Hack

Hack is also a programming language which contains a |> pipeline operator. This means both proposals are inspired by other programming languages.

In the Hack proposal the right hand side (RHS) of the |> is an expression. In the expression you can use a placeholder, which represents the piped value. In the examples above I've used ^ as the placeholder.

The final placeholder is still being discussed, so nothing is finalized yet. You may encounter examples on the internet using %, as it was the old candidate. Also note that at the time of writing the Babel plugin only supports using % or #.
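For those who want to try the proposals today, enabling the Babel plugin looks roughly like this. This is a sketch based on my reading of the Babel docs at the time of writing; the plugin name and option values may change as the proposals evolve:

```javascript
// babel.config.js — sketch, assuming @babel/plugin-proposal-pipeline-operator.
module.exports = {
  plugins: [
    ["@babel/plugin-proposal-pipeline-operator", {
      // Which flavor of the proposal to compile: "hack" or "fsharp".
      proposal: "hack",
      // The placeholder token; "%" or "#" at the time of writing.
      topicToken: "%",
    }],
  ],
};
```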

F# vs Hack

In this section lets break apart the arguments you will encounter online.

Syntax tax

Taxes are not fun, but inevitable.

In the context of programming languages the idea of a syntax tax is that no one likes typing, but some of it is inevitable. So whenever a programming language designer adds a feature, they want to make it as ergonomic as possible for us developers.

This often means striking a balance: you want the most common path to be the easiest, meaning the fewest characters and the nicest to read, while still keeping the lesser used paths possible.

In the case of the pipeline proposal the syntax tax is the following:

// F#
|> (employee) => increaseByInflation(employee, 1.25) 

// Hack
|> increaseByInflation(^, 1.25)

For the F# proposal we pay more syntax tax whenever the function we want to pipe has more than one argument. In the example above the pesky second argument for the inflation percentage causes the F# variant to blow up in size.

The perceived downside of having to use lambdas / arrow functions in F# is mitigated by another proposal called partial application.

The partial application proposal would introduce the means to easily create partial functions, using a special placeholder syntax. With partial application you can fix a parameter / argument of a function to a specific value. I'll go into more detail later.

This would allow the F# solution to our employee problem to be written as:

|> increaseByInflation(?, 1.25)

One argument you will hear from the F# side is that you cannot judge Hack vs F# correctly without also taking the partial application proposal into consideration.

A function with one argument is called a unary function, and it is where F# excels. So here Hack pays the most tax:

// F#
|> increaseByPeformance

// Hack
|> increaseByPeformance(^)

In the above example you can see that there is no need for the magical ^ symbol in the F# variant.

If you have a bunch of simple expressions and you do not want to explicitly name them / turn them into functions, F# pays the most tax:

// F#
|> (x) => x * x;

// Hack
|> ^ * ^

This is because Hack is based on expressions, and F# requires functions. This means that the lambda / arrow function is required in the F# example.

Some arguments based on syntax tax:

  • The F# proposal contains less magic, because it does not require a placeholder.
  • A counter from the Hack crowd against the first point: F# argues against magic placeholders, but the often-touted partial application proposal also contains one, so placeholders are not always a bad idea.
  • The Hack proposal will pay less syntax tax in the long run, because most functions in most code bases are not unary.

await and yield

Both proposals work differently with the await and yield keywords.

In an old post of mine the yield keyword is explained.

Here is an example of fetching a Pokémon by ID from a REST API, in F#:

function buildUrl(id) {
  return `https://pokeapi.co/api/v2/pokemon/${id}`;
} 

const pokemon = 1
 |> buildUrl
 |> fetch
 |> await
 |> (response) => response.json()
 |> await;

Here is the equivalent in Hack:

function buildUrl(id) {
  return `https://pokeapi.co/api/v2/pokemon/${id}`;
} 

const pokemon = 1
 |> buildUrl(^)
 |> await fetch(^)
 |> await ^.json();

In the F# proposal await and yield require a special syntax, because await and yield are not functions.

The special syntax is that you must put the await / yield on its own separate "pipe", whereas in the Hack proposal you can put it in the same pipe.

In the example above I created the buildUrl function because it feels natural, but not a parseJson function because that feels unnatural. I'm not purposely trying to put F# at a disadvantage here, it is just how I would write both examples.
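For reference, the same fetch in today's JavaScript, without any pipeline operator, reads like this (reusing the buildUrl helper from the examples above):

```javascript
function buildUrl(id) {
  return `https://pokeapi.co/api/v2/pokemon/${id}`;
}

// Plain async/await, no pipelines involved.
async function getPokemon(id) {
  const response = await fetch(buildUrl(id));
  return response.json();
}

// Usage: getPokemon(1).then((pokemon) => console.log(pokemon.name));
```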

Some arguments based on await / yield:

  • The F# proposal differs from normal JavaScript syntax: having to put the await / yield after the fact feels strange and less readable.
  • The Hack proposal shines here because it is shorter and does not require a special syntax for await and yield.

FP vs ...

One party in this debate is the functional programmers. Functional programming (FP) is a paradigm in which functions are the most important building blocks. The F# proposal is more functional, so team F# has many functional programmers.

This is why you'll encounter lots of FP-related terminology when investigating both sides of the argument: unary functions, currying, point-free style / tacit programming, partial application, higher-order functions, and so on.

Let's define each term:

point-free style / tacit programming: a way of programming in which you do not explicitly name arguments or intermediate results. The pipeline operator allows us to do this. Contrast the following code:

// Every sub result is named in a const
function processEmployee(employee) {
  const employeeIncreasedByPerformance = increaseByPeformance(employee);
  const employeeIncreasedByInflation = increaseByInflation(employeeIncreasedByPerformance, 1.25);
  const employeeRounded = roundWage(employeeIncreasedByInflation);
  return promoteEmployee(employeeRounded);
}

// Nothing needs an explicit name, so it is Tacit.
function processEmployee(employee) {
  return employee
    |> increaseByPeformance
    |> (employee) => increaseByInflation(employee, 1.25) 
    |> roundWage
    |> promoteEmployee;
}

unary function: a function with only one argument. The term binary function is used when there are two parameters, and the term N-ary is used for a function with N arguments / parameters. Here is an example:

// Unary because it only has one argument.
function unary(x) {
  return x + 1;
}

// Binary because it has two arguments
function binary(x, y) {
  return x + y;
}

// n-ary (here ternary) because it has three arguments.
function nAry(x, y, z) {
  return x + y + z;
}

Higher order Function: any function which accepts another function as an argument, or which returns a function. Here is an example of both:

// Array.map is a function which accepts another function.
[1, 2, 3].map((x) => x * 2);

// makeGreeter returns another function
function makeGreeter(greet) {
  return function (name) {
    return `${greet} ${name}`;
  };
}

const frenchy = makeGreeter('Bonjour');
frenchy("Ethel"); // "Bonjour Ethel"
frenchy("Fred"); // "Bonjour Fred"

const dutchy = makeGreeter('Hallo');
dutchy("Ethel"); // "Hallo Ethel"
dutchy("Fred"); // "Hallo Fred"

partial application: fixing one or more arguments of a function to specific values, by creating another function. For example:

function add(a, b) {
  return a + b;
}

// Fixing the second parameter "b" to the value 5
const addFive = (a) => add(a, 5);
addFive(5); // 10

Currying: being able to call a function without all of its parameters. If you do, instead of getting the value, you receive a function with the given parameters already filled in. For example:

function add(a) {
  return function(b) {
    return a + b;
  }
}

// Now we can use "add" like this
add(1)(2) // 3

// or alternatively
const addOne = add(1);
addOne(2) // 3

Normally you do not "curry" or "partial" the functions yourself. You let your programming language do it for you, such as in Haskell. When your language does not natively support it, such as JavaScript, you can use a library like lodash, which has a curry and a partial function.
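As a sketch of what such a helper does under the hood, here is a minimal curry for two-argument functions. Real library versions, like lodash's, handle any arity; this toy only handles the binary case:

```javascript
// Minimal curry for two-argument functions.
function curry(fn) {
  return function (a, b) {
    // Called with both arguments: just apply the function.
    if (arguments.length >= 2) {
      return fn(a, b);
    }
    // Called with one argument: return a function awaiting the rest.
    return (b2) => fn(a, b2);
  };
}

const add = (a, b) => a + b;
const curriedAdd = curry(add);

console.log(curriedAdd(1, 2)); // 3
console.log(curriedAdd(1)(2)); // 3

const addOne = curriedAdd(1);
console.log(addOne(41)); // 42
```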

Now why is FP so important in this discussion? The answer is that JavaScript libraries written in the FP style are at a disadvantage in the Hack proposal.

They extensively rely on HoF, currying and partial application.

This forces functionally oriented libraries, such as RxJS, to change their APIs if they want to make the pipeline operator more useful / readable.

I find that this snippet from the GitHub issue makes their point:

// Now:
source$.pipe(
  filter(x => x % 2 === 0),
  map(x => x + x),
  concatMap(x => of(x).pipe(delay(1000)))
)
.subscribe(console.log)

// Hack:
source$
  |> filter(x => x % 2 === 0)(^)
  |> map(x => x + x)(^)
  |> concatMap(x => of(x) |> delay(1000)(^))(^)
  |> ^.subscribe(console.log)

The ^ appears quite clunky in the example above, and is justifiably called ugly.

The point is that the F# proposal would support functional libraries better, and of course a more functional style in general.

The crux

I think it all boils down to this question:

Should JavaScript cater to the functional programming paradigm?

I'm going to give the answer as I understand both parties would give it.

The F# side's answer is something like this:

JavaScript is not a functional language in the purest sense, but would benefit from functional programming features. JavaScript can be made more compatible with FP by accepting both the F#, and partial application proposal.

The Hack side's answer is:

JavaScript must cater to many different types of developers, not only to the functional crowd. Therefore picking a syntax which is more accessible to a wider audience is more important. Simple expressions are easier to understand than unary functions and currying, and we believe Hack is what more developers would understand.

More resources

Now it is time to send you off to learn more about the argument yourself, from the people at the heart of the discussion. Hopefully with my help it has become a little easier to understand their points of view.

My opinion

So what is my take on this topic?

My initial instinct was to go with F# proposal. This is due to my fondness of the functional programming paradigm.

But after my research I'm not so sure anymore. For me the F# await and yield weigh heavily on my soul. In writing the F# pokémon API example, I kept feeling something was wrong. I'm even at the point that I hope I'm misunderstanding the F# await syntax. Please correct me if I'm wrong.

So for now I'm in favor of Hack. Till next time.