Programming paradigms in JavaScript – callbacks, Promise, async/await, promisify, functional Ramda and reactive RxJS

As a quick exercise, I wanted to read all URLs from my page’s sitemap instead of crawling the site. Just after I added a nested callback I decided to apply the Single Responsibility Principle and convert calls to Promises. Later I experimented with async/await and automatic conversion with Node’s promisify. Finally, I rewrote the solution in functional style using Ramda and reactive using RxJS. Read on to follow the evolution of callbacks in JavaScript.

In total, I refactored the code 8 times to test different approaches. I hope it will give you some ideas on how you could improve the readability and maintainability of your application. I will follow the evolution of paradigms depicted in this graph:

Information

You can find all the sources in my GitHub repo.

First, I created a new Node.js application (npm init in a new folder).

I. Single function

This approach, where a single method does all the things (or even just several different things), should always be avoided.

Let’s consider an application that retrieves some URLs (for example http://host/page-sitemap.xml and http://host/post-sitemap.xml). These URLs point to XML files that contain, among other things, links, which have to be parsed and extracted. Finally, the links should be printed to the console.

First, I installed the required packages:

  • npm install request
    • request is used to retrieve data from a URL. It’s a good example because it contains an asynchronous function with a callback
  • npm install xpath
    • xpath is used to run XPath queries on XML contents
  • npm install xmldom
    • xmldom is used to parse the XML text into a document that xpath can query

Then I wrote a sample application:

JS
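A minimal sketch of this single-function approach could look as follows. It is not the exact listing, so the line numbers in the walkthrough do not map onto it. To stay runnable without npm packages or network access it rests on two assumptions: a hypothetical fakeGet() stands in for request.get() and serves canned XML, and a simple regular expression stands in for the xmldom/xpath query.

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

// One function does everything: builds the URL, downloads, parses and prints.
function retrieve(name) {
  const url = 'https://example.com/' + name + '-sitemap.xml';
  fakeGet(url, (error, response, body) => {
    if (error) {
      console.error(error.message);
      return;
    }
    // Regex stand-in for the XPath query that selects the <loc> elements.
    const urls = [...body.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
    urls.forEach(link => {
      console.log(link);
    });
  });
}

names.forEach(name => retrieve(name));
```

Even in this short version the printing code already sits three levels deep.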

Let’s quickly follow the code.

Line 9 – I compose the URL by joining the server name ('https://example.com/') with 'page' or 'post', and '-sitemap.xml'. Each iteration of the loop in line 24 processes another file name.

Line 11 – parse the body as an XML document

Line 13 – extract the URLs from the XML. Here is an example of the document:

XML

Usually the XPath to get <loc> tags would look like //url/loc, but it won’t work without providing the default namespace (xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"). A common workaround is a query that ignores namespaces, at the cost of a more complex expression:

//*[local-name(.)="url"]/*[local-name(.)="loc"]

The only difference is the use of a function: *[local-name(.)="url"] in place of url, and likewise for loc. That function compares the “local name” (i.e. the name without the namespace prefix) of a given node (.) with "url".

Back to the previous code:

Line 15 – retrieves the inner text of the selected loc nodes.

Line 17-19 – print the URLs.

Line 24 – call the retrieve() function for every item in the names array. I hope it is clear that this call could be simplified (and obfuscated) to:

names.forEach(retrieve);
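One caveat about the shorter form, sketched here with a hypothetical record() helper that is not part of the application: forEach calls its callback with three arguments (element, index, array), so the two forms are interchangeable only when the function ignores the extra ones.

```javascript
const names = ['page', 'post'];
const received = [];

// Hypothetical helper: remembers how many arguments each call received.
function record(...args) {
  received.push(args.length);
}

names.forEach(name => record(name)); // the wrapper passes exactly one argument
names.forEach(record);               // forEach passes element, index and array

console.log(received); // [ 1, 1, 3, 3 ]
```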

Summary: although that function is very short, I’ll use it to show different approaches to processing data in modern applications. Nonetheless, you can already notice the problem of nested calls, known as callback hell or the pyramid of doom (you will find amusing images if you search for it).

II. Extract methods – SRP

Before doing further experiments, I split that one “long” method into separate specialized methods. As you can notice, there are at least 3 things going on in the previous method:

  1. download an XML
  2. extract URLs from that file
  3. display the URLs

I exposed these responsibilities in different methods:

JS
index.js
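The split could be sketched like this, one function per responsibility. It is a sketch rather than the exact listing: fakeGet() is a hypothetical stand-in for request.get() with canned data, and a regular expression replaces the xmldom/xpath query so that the example runs without extra packages.

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

// 3. Display the URLs.
function display(urls) {
  urls.forEach(url => console.log(url));
}

// 2. Extract the URLs from the XML (regex stand-in for the XPath query).
function parseXmlAndGetUrls(xml) {
  return [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
}

// 1. Download the XML. This function still decides what happens next.
function retrieve(name) {
  const url = 'https://example.com/' + name + '-sitemap.xml';
  fakeGet(url, (error, response, body) => {
    if (error) {
      console.error(error.message);
      return;
    }
    display(parseXmlAndGetUrls(body));
  });
}

names.forEach(name => retrieve(name));
```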

Now some methods can be reused in other files or solutions, the nesting inside functions and the coupling between them are reduced (but not removed completely yet), and comments become unnecessary because the names of the functions describe what they are responsible for. The code got closer to SRP (the Single Responsibility Principle).

III. Add callback

There is still a tight dependency between the first method and the others, which makes it impossible to reuse the retrieving function. Sometimes that’s fine, sometimes it’s not. To make further refactoring into Promises easier, I encapsulated the processing in a callback function:

JS
index.js
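A minimal sketch of this step, not the exact listing (so line numbers mentioned in the text do not apply to it). Assumptions: fakeGet() is a hypothetical stand-in for request.get(), a regular expression replaces the XPath query, and the processing function is called processXml() here instead of process() to avoid shadowing Node’s global process object. The error-first callback signature is also an assumption.

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

function display(urls) {
  urls.forEach(url => console.log(url));
}

// Regex stand-in for the XPath query.
function parseXmlAndGetUrls(xml) {
  return [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
}

// retrieve() only downloads; what happens next is up to the callback.
function retrieve(name, callback) {
  const url = 'https://example.com/' + name + '-sitemap.xml';
  fakeGet(url, (error, response, body) => {
    if (error) return callback(error);
    callback(null, body);
  });
}

// The flow is now specified by the caller.
function processXml(error, xml) {
  if (error) {
    console.error(error.message);
    return;
  }
  display(parseXmlAndGetUrls(xml));
}

names.forEach(name => retrieve(name, processXml));
```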

The retrieve function does not know what it should do next with the retrieved file. It is the responsibility of the caller (line 18) to specify the flow.

IV. Promise

I wanted to get rid of that callback because it interrupts the flow of the program. The first idea is to convert the method with a callback into a Promise.

The conversion was not hard. I needed to wrap the execution in a new Promise(function(resolve, reject) { ... }) call and use its two callbacks (yes, more callbacks… trust me, this is going to improve the situation):

  • resolve to return a value or
  • reject to throw an error.
JS
index.js
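A sketch of the conversion, under the same assumptions as before: fakeGet() is a hypothetical stand-in for request.get(), a regular expression replaces the XPath query, and processXml() is this sketch’s name for the processing step (chosen to avoid shadowing Node’s global process object).

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

// Regex stand-in for the XPath query.
function parseXmlAndGetUrls(xml) {
  return [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
}

function processXml(xml) {
  parseXmlAndGetUrls(xml).forEach(url => console.log(url));
}

// No callback parameter any more: the result travels through the Promise.
function retrieve(name) {
  return new Promise((resolve, reject) => {
    const url = 'https://example.com/' + name + '-sitemap.xml';
    fakeGet(url, (error, response, body) => {
      if (error) {
        reject(error); // "throw" the error to the caller
      } else {
        resolve(body); // "return" the value to the caller
      }
    });
  });
}

// The caller chains the next step and handles errors itself.
names.forEach(name =>
  retrieve(name)
    .then(processXml)
    .catch(error => console.error(error.message))
);
```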

Notice that I got rid of the callback parameter in the retrieve function. Calling that function seems to be more complicated at first:

Before:

JS

Now:

JS

However, now it is clear what happens with the process() method: it is called after retrieve() finishes. Previously it was completely vague. Besides, Promises allow chaining further functions with .then() and defining error handling at the caller, not the callee.

Information

Caller – function that calls another function – here forEach callback
Callee – function called by another – here retrieve

V. Async/await

Wherever we have Promises, we can use the async/await pattern.

This pattern simplifies calling a Promise, as the next refactoring shows:

JS
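A sketch of the same call site written with async/await. As in the other sketches, fakeGet() is a hypothetical stand-in for request.get(), a regular expression replaces the XPath query, and processXml() is an assumed name for the processing step.

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

// Regex stand-in for the XPath query.
function parseXmlAndGetUrls(xml) {
  return [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
}

function processXml(xml) {
  parseXmlAndGetUrls(xml).forEach(url => console.log(url));
}

function retrieve(name) {
  return new Promise((resolve, reject) => {
    const url = 'https://example.com/' + name + '-sitemap.xml';
    fakeGet(url, (error, response, body) => (error ? reject(error) : resolve(body)));
  });
}

// The arrow function must be async so that await is allowed inside it.
names.forEach(async name => {
  try {
    const xml = await retrieve(name); // reads like a synchronous call
    processXml(xml);
  } catch (error) {
    console.error(error.message); // a rejected Promise surfaces as an exception
  }
});
```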

By calling a Promise (or an async method) with the await keyword, you can make linear calls to Promises without resolving them with the .then() method.

Information

The await keyword before a call to a method means waiting (without blocking the rest of the program) for the completion of the method (in the case of a Promise, for its result to be resolved). After that, the result is available and the program can proceed to the next lines.

There is a restriction, though: you can use the await keyword only inside a method declared with the async keyword. In most cases that only means you need to add async to the caller. And async/await to its caller, and so on…
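The non-blocking nature of await can be seen in a small experiment. The names wait(), slow() and order are hypothetical and exist only for this illustration.

```javascript
const order = [];

// Hypothetical helper: a Promise that resolves after the given delay.
const wait = ms => new Promise(resolve => setTimeout(resolve, ms));

async function slow() {
  order.push('start');
  await wait(10); // suspends slow(), but not the rest of the program
  order.push('done');
}

const finished = slow();
order.push('after call'); // runs while slow() is still waiting

finished.then(() => console.log(order)); // [ 'start', 'after call', 'done' ]
```

The code after the call to slow() runs before slow() completes, because await suspends only the async function it appears in.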

VI. All Promises

Another option is converting all methods to Promises, whether it makes sense or not. The advantage of this will be a clearer flow of the program in the last lines.

I will refactor “IV. Promise“:

JS
index.js

I boxed each of the 3 functions into a new Promise() { ... } call and changed the last call to:

JS

which could be simplified to

JS

A clear flow of the application emerges: retrieve() → parseXmlAndGetUrls() → display(). But at the cost of adding extra lines to every function declaration.
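The three boxed functions and both forms of the final call could be sketched as follows. This is an illustration under assumptions, not the exact listing: fakeGet() is a hypothetical stand-in for request.get(), and a regular expression replaces the XPath query.

```javascript
// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

function retrieve(name) {
  return new Promise((resolve, reject) => {
    const url = 'https://example.com/' + name + '-sitemap.xml';
    fakeGet(url, (error, response, body) => (error ? reject(error) : resolve(body)));
  });
}

// Synchronous work boxed into a Promise (regex stand-in for the XPath query).
function parseXmlAndGetUrls(xml) {
  return new Promise(resolve => {
    resolve([...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]));
  });
}

function display(urls) {
  return new Promise(resolve => {
    urls.forEach(url => console.log(url));
    resolve(urls);
  });
}

// Long form:
//   retrieve(name).then(xml => parseXmlAndGetUrls(xml)).then(urls => display(urls));
// Simplified form, passing the functions themselves:
names.forEach(name =>
  retrieve(name)
    .then(parseXmlAndGetUrls)
    .then(display)
    .catch(error => console.error(error.message))
);
```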

VII. Promisify

There is a clever function in Node.js utilities that automatically converts a method with a callback to a Promise: promisify(). There are some restrictions for the callback argument, but fortunately request() satisfies them.

Namely, promisify() requires a callback with two parameters: error and value. The callback used by request.get() has 3 parameters: error, response and body. This means that after promisifying request’s get() function I get the entire response instead of just the body. This is not a problem, as body is a field of response.

Again I will refactor “IV. Promise“:

JS
index.js
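The effect of promisify() on a three-parameter callback can be tried out in isolation. This sketch is deliberately tidier than the version discussed in the text and is not the exact listing, so the line numbers in the walkthrough do not apply to it. Assumptions: fakeGet() is a hypothetical stand-in for request.get() with the same callback shape (error, response, body), and a regular expression replaces the XPath query.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

// The promisified function takes the same arguments minus the callback.
const retrieve = promisify(fakeGet);

// The URL is now built outside of retrieve(), which keeps retrieve() generic.
const getUrl = name => 'https://example.com/' + name + '-sitemap.xml';

// Regex stand-in for the XPath query.
function parseXmlAndGetUrls(xml) {
  return [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
}

function display(urls) {
  urls.forEach(url => console.log(url));
}

names.forEach(name =>
  retrieve(getUrl(name))
    // Only the first success value (response) is passed on; body is read from it.
    .then(response => display(parseXmlAndGetUrls(response.body)))
    .catch(error => console.error(error.message))
);
```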

There are several new things:

Line 1: The promisify function was imported.

Line 4: promisify() automatically creates the retrieve() function from request.get(). The new function accepts the same parameters as the original except the callback argument, that is, only the URL to be retrieved.

Line 5: Previously I didn’t pass the full URL to the retrieve() function, but generated it inside. This prevented that function from being generic. Now there was no other option but to extract the generation of the URL either to a variable or to a method. I chose the latter.

Lines 12-17: This is a total disaster! I mixed nested calls, Promises and imperative programming.

I did not refactor it further on purpose, but instead thought of taking a functional approach to this problem.

VIII. Functional solution with Ramda

I wondered how to make the flow more explicit and linear, and how to apply functional programming concepts in my example. There are numerous functional JavaScript libraries, and Ramda is one of the more mature ones.

Two prerequisites:

  • install the library: npm install ramda
  • import it: const R = require('ramda');

I made the following assumptions:

  • everything must be a function
  • as many functions as possible are piped, i.e. the output of one function is the input for another
  • name goes into getUrl()
  • getUrl() returns URL which goes into retrieve()
  • retrieve() returns a Promise which is resolved by a helper function resolve():
    const resolve = promise => Promise.resolve(promise);
  • now the flow must break as I have to call .then() on the result…
  • but I will pass another flow as the argument for .then():
  • then() returns the response which goes into getBody() to extract the body. I’ll use a Ramda function R.prop to return property with the name “body”:
    const getBody = R.prop('body');
  • getBody() returns the XML contents of the file which goes to parseXmlAndGetUrls()
  • parseXmlAndGetUrls() returns list of detected URLs which goes into display()
  • this finishes the processing

Here is the source code:

JS
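The piping described above can be sketched without installing anything. Note the substitutions, which are assumptions of this sketch: Ramda itself is not loaded, and tiny hand-written pipe(), prop() and forEach() helpers stand in for R.pipe, R.prop and the curried R.forEach. fakeGet() is a hypothetical stand-in for request.get(), then() is a hypothetical helper for chaining, and a regular expression replaces the XPath query. This is not the exact listing, so line numbers mentioned in the text do not apply to it.

```javascript
// Hand-written stand-ins for R.pipe, R.prop and the curried R.forEach.
const pipe = (...fns) => input => fns.reduce((value, fn) => fn(value), input);
const prop = key => obj => obj[key];
const forEach = fn => list => {
  list.forEach(item => fn(item));
  return list;
};

// Hypothetical stand-in for request.get(): canned sitemaps, no network access.
const SITEMAPS = {
  'https://example.com/page-sitemap.xml':
    '<urlset><url><loc>https://example.com/about</loc></url></urlset>',
  'https://example.com/post-sitemap.xml':
    '<urlset><url><loc>https://example.com/hello</loc></url></urlset>',
};
function fakeGet(url, callback) {
  setImmediate(() => {
    const body = SITEMAPS[url];
    if (body === undefined) return callback(new Error('Not found: ' + url));
    callback(null, { statusCode: 200, body }, body);
  });
}

const names = ['page', 'post'];

const getUrl = name => 'https://example.com/' + name + '-sitemap.xml';
const retrieve = url =>
  new Promise((fulfil, reject) =>
    fakeGet(url, (error, response) => (error ? reject(error) : fulfil(response))));
const resolve = promise => Promise.resolve(promise);
const then = fn => promise => promise.then(fn); // hypothetical helper
const getBody = prop('body');
const parseXmlAndGetUrls = xml =>
  [...xml.matchAll(/<loc>([^<]*)<\/loc>/g)].map(match => match[1]);
const display = urls => urls.forEach(url => console.log(url));

// The second flow, passed to .then(): response -> body -> URLs -> console.
const processResponse = pipe(getBody, parseXmlAndGetUrls, display);

// The first flow: name -> URL -> Promise, then hand over to the second flow.
const fullProcess = pipe(getUrl, retrieve, resolve, then(processResponse));

forEach(fullProcess)(names);
```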

Instead of using names.forEach() I used R.forEach(). Thanks to currying, I could write either:

R.forEach(fullProcess, names)

or

R.forEach(fullProcess)(names)

I preferred separating data from operations with brackets instead of comma.

What does this functional solution look like? It’s far from clear. I could also combine lines 17-31:

JS

Maybe better, but the Promise breaks the flow.

IX. Reactive solution with RxJS

In the end, I wanted to try yet another approach: using a reactive library. I like the versatility and usefulness of reactive programming and I expected it would improve readability. As I have worked a lot with RxJS, I installed it in Node.js too: npm install rxjs.

JS

Thanks to stream operations, it now looks super clear. Let me explain lines 13-21:

  • from(names) – create a stream from the names array, i.e. for every element name do the following steps
  • map(name => getUrl(name)) – take name, call getUrl() to get url and use that value in next steps
  • concatMap(url => retrieve(url)) – take url, call retrieve() to get response and use that value in next steps; concatMap here resolves the promise and simplifies chaining
  • map(response => response.body) – take response, extract body and use that value in next steps
  • map(xml => process(xml)) – take body (xml), call parseXmlAndGetUrls() to get list of URLs and use that value in the last step
  • map(urls => display(urls)) – take URLs and call display()
  • subscribe() runs the processing

Source code

You can find all the sources in my GitHub repo.
