Web Workers in PureScript – part I

Sunday, November 27, 2016 - 10 mins

This article introduces the PureScript support for Web Workers that I am currently working on. Any form of feedback on the approach or API design is more than welcome: feel free to comment below or on the GitHub commits.

Web Workers introduction
- Inter-thread communication
- JavaScript API
Web Workers in PureScript
- PureScript API
  - Low-level API
  - Channel-like API
- Example
Summary

Web Workers introduction

Skip to the section Web Workers in PureScript if you already know the Web Workers JavaScript API.

Web Workers is an HTML5 technology that brings multithreading support to JavaScript in browsers. Its core use case in web development is the execution of long-running, computationally heavy tasks that would normally block the main thread and freeze the user interface.

Web Workers address this by offloading the blocking task to a separate thread, leaving the main thread free to keep the user interface responsive.

Inter-thread communication

Usually, we need some sort of communication between the main thread and the worker to convey information back and forth. Web Workers don't use the traditional shared-memory approach to concurrency; instead, they provide a bidirectional messaging API in a shared-nothing style.

Thanks to the shared-nothing architecture, JavaScript doesn't have to deal with the implications of shared memory: data races, synchronization, etc.
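To make the shared-nothing idea concrete: data passed through postMessage() is copied with the structured clone algorithm rather than shared. The sketch below only imitates that copy, with no worker involved. It assumes the global structuredClone function, which exists only in recent browsers and Node.js 17+; the JSON round-trip fallback is a rough approximation that works for plain data like this.

```javascript
// Imitate the copy that postMessage() performs on its argument.
// Assumption: structuredClone is available; otherwise fall back to a JSON
// round-trip, which is only an approximation valid for plain data.
const clone = typeof structuredClone === "function"
  ? structuredClone
  : function (value) { return JSON.parse(JSON.stringify(value)); };

// What the main thread would send.
const request = { id: 1, payload: ["a", "b"] };

// What the worker would receive: an independent copy.
const received = clone(request);

// Mutating the "received" copy does not touch the sender's object.
received.payload.push("c");

console.log(request.payload.length);  // 2
console.log(received.payload.length); // 3
```

Because each side works on its own copy, neither thread can observe the other's mutations, which is why no locking is needed.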

JavaScript API

A complete API description can be found in MDN's Using Web Workers article.

In summary, the API provides means to:

Start a worker by passing the URL of the worker's source (e.g. a JS file) to new Worker().
Send a message to and from the worker via postMessage().
Define a callback for processing incoming messages via onmessage.
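The three operations above can be sketched in a single file as follows. This is a minimal illustration, not part of the proposed library: workerSource and startEchoWorker are hypothetical names, and the worker script is built from a Blob only to keep the example self-contained. Normally it would live in its own JS file whose URL is passed to new Worker().

```javascript
// Worker side: the script the worker thread executes, kept as a string here
// so the sketch fits in one file (normally a separate JS file).
const workerSource = [
  "onmessage = function (event) {",
  "  // event.data holds the payload sent by the main thread",
  "  postMessage('echo: ' + event.data);",
  "};"
].join("\n");

// Main-thread side: start a worker, register a callback, send one message.
function startEchoWorker(onReply) {
  const blob = new Blob([workerSource], { type: "application/javascript" });
  const worker = new Worker(URL.createObjectURL(blob)); // start the worker
  worker.onmessage = function (event) {                 // callback for replies
    onReply(event.data);
  };
  worker.postMessage("hello");                          // send a message
  return worker;
}

// Only run where the Web Worker API actually exists (i.e. in a browser).
if (typeof window !== "undefined" && typeof Worker !== "undefined") {
  startEchoWorker(function (reply) { console.log(reply); });
}
```

Note that postMessage() and onmessage appear on both sides: the worker script uses the global versions, while the main thread uses the ones on the Worker instance.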

Web Workers in PureScript

The JavaScript API could be used directly, without any library, via PureScript’s FFI. We would have to:

define an FFI mapping,
create a PureScript module for each worker, defining a main function with the FFI calls,
compile all worker modules as standalone “binaries”, one at a time,
instantiate workers via FFI calls, passing the name of the compiled JS file as a String parameter.

However, as you can imagine, this approach is tiresome and error-prone, and the messaging API in particular can be modeled in a more PureScript-friendly way. Let's ask ourselves a question:

What can be done to make the adoption of Web Workers in PureScript seamless?

Provide a functional API befitting PureScript’s ecosystem.

Eliminate the need to compile a standalone JS file with main for each worker.

Allow any function to be run as an anonymous Web Worker, like Haskell’s forkIO.

What comes next is my proposal for this matter.

PureScript API

The overarching philosophy is:

Keep the low-level API close to the metal while allowing higher-level abstractions to be built on top of it.

Low-level API

The low-level API is a simple foreign function interface based on the Eff monad, located in the namespace Control.Monad.Eff.Worker. It consists of two similar sets of functions: one for the main thread and one for the worker thread.

Control.Monad.Eff.Worker.Master allows us to:

spawn a worker from the main thread with startWorker,
send a message to the worker via sendMessage,
and define a callback for messages from the worker with onMessage.

Analogously, Control.Monad.Eff.Worker.Slave allows us to:

send a message to the main thread via sendMessage,
and define a callback for messages from the main thread with onMessage.

To add type safety to the native JS API, Control.Monad.Eff.Worker introduces two types, both parameterized by the types of incoming and outgoing messages:

WorkerModule req res represents a PureScript module in which the worker code is located.
Worker req res is an instance of the underlying JavaScript Web Worker.

The type parameters req and res are the types of the request messages (sent by the main thread) and the response messages (sent by the worker thread).

The role of both types will be demonstrated in the example below.

Channel-like API

The low-level API from the previous section is suitable for simple worker tasks where no complex communication is required. For more complex use cases, however, it can be a bit clumsy to use. An asynchronous API based on the Aff monad and the AVar primitive may be more useful in such cases.

The basic concept behind this approach is that both incoming and outgoing messages are queued in two AVar primitives – one for each direction.

The Aff monad lets us access the message queues via the putVar and takeVar functions. Although the underlying operations are asynchronous, the code looks like ordinary, imperative, synchronous code:

main = launchAff $ do
  -- AVar setup omitted
  putVar requestQueue "42" -- add message "42" to request queue
  response <- takeVar responseQueue -- await message on response queue
  liftEff $ log response
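For readers who think in JavaScript, the two-queue idea can be modeled roughly as below. This is a simplified, callback-based stand-in written for illustration only; it is not how AVar or the library is implemented, and makeQueue, put and take are hypothetical names. A take on an empty queue parks its callback until a matching put arrives, which mirrors how takeVar awaits a message.

```javascript
// A tiny unbounded queue: `put` adds a message, `take` hands the next
// message to a callback, waiting if none is available yet.
function makeQueue() {
  const values = []; // messages put but not yet taken
  const takers = []; // callbacks waiting for a message
  return {
    put: function (value) {
      if (takers.length > 0) takers.shift()(value); // someone is waiting
      else values.push(value);                      // store for later
    },
    take: function (callback) {
      if (values.length > 0) callback(values.shift()); // message ready
      else takers.push(callback);                      // wait for a put
    }
  };
}

// One queue per direction, as in the channel-like API.
const requestQueue = makeQueue();
const responseQueue = makeQueue();
const log = [];

// "Worker" side: await one request and echo it back.
requestQueue.take(function (message) { responseQueue.put(message); });

// "Main" side: await the response, then send the request.
responseQueue.take(function (response) { log.push(response); });
requestQueue.put("42");

console.log(log); // [ '42' ]
```

In the PureScript version the callbacks disappear into the Aff monad's do notation, which is what makes the code read as if it were synchronous.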

However, before we can queue and dequeue messages, we have to create queues bound to the worker. The function makeChan does this; it is available from Control.Monad.Aff.Worker.Master or Control.Monad.Aff.Worker.Slave, depending on whether it is used in the main-thread or the worker-thread context.

Both flavours return a tuple containing two AVars that represent the request and response queues. We'll see their usage in the example below.

Example

The following example demonstrates the proposed API and shows how to deal with the worker code separation required by the JavaScript Web Worker API.

The goal is to create a simple echo worker that resends every incoming message back.

Worker code

Let’s start by defining the echo worker. Basically, we need to wait for an incoming message, process it, and send it back.

To achieve this, we'll use the Channel-like API described in the previous section.

First, we'll create the Echo.purs module housing the worker thread logic:

module Echo where

-- imports redacted
import Control.Monad.Aff.AVar (putVar, takeVar, AVAR)
import Control.Monad.Aff.Worker.Slave (makeChan)
import Control.Monad.Eff.Worker (WORKER, WorkerModule)

echo :: forall e. Aff (avar :: AVAR, console :: CONSOLE, worker :: WORKER | e) Unit
echo = do
  Tuple req res <- makeChan workerModule -- create worker-bound AVar queues
  forever $ void $ do
    message <- takeVar req -- await message on request queue
    liftEff $ log $ "Worker received: " <> message
    putVar res message -- send the message back

default :: forall e. Eff (avar :: AVAR, console :: CONSOLE, err :: EXCEPTION, worker :: WORKER | e) Unit
default = void $ launchAff echo

In the echo function, we create the queues via makeChan and then loop the following asynchronous computation: await an incoming request from the main thread via takeVar, log the message, and send it back by putting it on the response queue with putVar.

The last function, default, is the equivalent of a main function. In this case, it is the function that is executed when the worker thread starts, and it takes care of launching the echo handler.

Eliminating per-worker compilation

Compiling each worker module separately, e.g. via pulp browserify -m Worker --to dist/Worker.js, to get a standalone JavaScript file that can later be fed to the new Worker(url) constructor, is a somewhat tedious approach.

Fortunately, webworkify in combination with browserify allows any module to be used as a worker’s source. The only task left to the programmer is to explicitly call require() with the hardcoded name of the module that contains the worker definition. Thanks to that require call, webworkify's post-processing will do the rest.
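For readers unfamiliar with webworkify: in plain JavaScript, a worker module exports a function that receives the worker's global scope. The sketch below shows that shape, going by webworkify's README. The names echoWorker and makeFakeScope and the file name in the comment are hypothetical, and the fake scope merely stands in for a real worker so that the snippet can run outside a browser.

```javascript
// Worker module in the shape webworkify expects: a function that receives
// the worker's global scope (conventionally called `self`).
function echoWorker(self) {
  self.addEventListener("message", function (event) {
    self.postMessage(event.data); // echo the payload back
  });
}

// In a browserify bundle, the main thread would do roughly:
//   var work = require("webworkify");
//   var worker = work(require("./echo-worker.js")); // hypothetical file
//   worker.postMessage("foobar");

// Stand-in for the worker scope, for illustration only.
function makeFakeScope() {
  const listeners = [];
  const sent = [];
  return {
    sent: sent, // records everything the "worker" posts back
    addEventListener: function (type, listener) {
      if (type === "message") listeners.push(listener);
    },
    postMessage: function (data) { sent.push(data); },
    // Simulate a message arriving from the main thread.
    deliver: function (data) {
      listeners.forEach(function (listener) { listener({ data: data }); });
    }
  };
}

const scope = makeFakeScope();
echoWorker(scope);
scope.deliver("foobar");
console.log(scope.sent); // [ 'foobar' ]
```

The require() call is what lets the bundler locate the module's source; nothing in the worker module itself needs to know it will end up in a separate thread.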

Let's add a foreign module to our Echo module, Echo.js:

exports.workerModule = require("Echo");

And use FFI to introduce it in Echo.purs:

type Request = String
type Response = String

foreign import workerModule :: WorkerModule Request Response

Apart from exposing the foreign workerModule, this declaration also fixes the types of incoming and outgoing messages. They are named Request and Response so that their meaning stays unambiguous whether we reason from the perspective of the main thread or the worker thread.

Running the worker

With worker logic defined, we’ll take a look at spawning a worker and sending a message:

module Main where

-- imports redacted
import Control.Monad.Aff.AVar (putVar, takeVar, AVAR)
import Control.Monad.Aff.Worker.Master (makeChan)
import Control.Monad.Eff.Worker (Worker, WORKER)
import Control.Monad.Eff.Worker.Master (startWorker)

import Echo (Request, Response, workerModule)

ping :: forall e. Worker Request Response -> Request -> Aff (avar :: AVAR, console :: CONSOLE, worker :: WORKER | e) Unit
ping w message = do
  Tuple req res <- makeChan w
  forkAff $ forever do
    response <- takeVar res
    liftEff $ log $ "Worker returned: " <> response
  putVar req message

main :: forall e. Eff (avar :: AVAR, console :: CONSOLE, err :: EXCEPTION, worker :: WORKER | e) Unit
main = void $ do
  worker <- startWorker workerModule
  void $ launchAff $ ping worker "foobar"

First, we create the request and response queues with makeChan, this time from the main-thread context. Then we fork an asynchronous computation that awaits responses from the worker via takeVar. Finally, we send the initial message to the worker via putVar.

Without forkAff, the message would never be sent – forever would block for, well, forever.

Again, there's a main function, which is responsible for starting up the JavaScript worker and proceeding with our ping computation.

When we compile the example and open it in a browser, we'll see the following log in the console:

Worker received: foobar
Worker returned: foobar

Summary

The source code can be found on GitHub: JanDupal/purescript-web-workers. There are still a lot of aspects to cover:

Polishing the API
Publishing the library
Documentation and test coverage
forkIO-like seamless integration to allow almost any function to be executed as a worker (requires changes in the PureScript compiler)

And more, as tracked on the project's board.

Stay tuned for more. Any feedback appreciated!

Jan Dupal

Jan is a Software Engineering Team Lead at NetSuite and a Computer Science student at FI MUNI interested in distributed systems, PL theory and Data Analytics.