REST API Design: Transitive Inclusion, or Inlining Associated Objects

Transitive Inclusion (Transclusion for short) is a REST API feature for clients that want to get more than just the resource named in the URL path, in one request.
We use a query string parameter called inline, or perhaps include, set to the names of the associated objects to include.

For example, suppose that in a Books API, author objects have associated books.
A request for an author:

GET /authors/12345
{
  "id": 12345,
  "name": "Ralph Ellison"
}

The list of books associated with the author:

GET /authors/12345/books
[
  {
    "id": 1,
    "title": "Invisible Man"
  },
  {
    "id": 2,
    "title": "Juneteenth"
  }
]

We can pass the inline parameter on an author request:

GET /authors/12345?inline=books
{
  "id": 12345,
  "name": "Ralph Ellison",
  "books": [
    {
      "id": 1,
      "title": "Invisible Man"
    },
    {
      "id": 2,
      "title": "Juneteenth"
    }
  ]
}

Note the extra field “books” included, or inlined, in the author object.
This way a client app can get the author object and its associated books in one request, reducing round-trips. Of course, if the associated objects are small, including them by default is the best option; transclusion is mostly useful for large, expensive-to-retrieve associated objects.

The transitive inclusion feature can be even more powerful if we get a list of all authors in the API, inlining all of their associated books:

GET /authors?inline=books

Or we can inline several kinds of associated objects at once (here using the alternative parameter name, include):

GET /authors?include=books,articles

The result is a comprehensive list object, all in one request:

[
  {
    "id": 12345,
    "name": "Ralph Ellison",
    "books": [
      {
        "id": 1,
        "title": "Invisible Man"
      },
      {
        "id": 2,
        "title": "Juneteenth"
      }
    ],
    "articles": [ ... ]
  },
  {
    "id": 67890,
    "name": "Doris Lessing",
    "books": [
      {
        "id": 10,
        "title": "The Golden Notebook"
      },
      {
        "id": 11,
        "title": "The Fifth Child"
      },
      ...
    ],
    "articles": [ ... ]
  },
  ...
]

Such a comprehensive response object is convenient for an app building a page that lists authors and their books, for example; a single HTTP request suffices.
Note that if the set is large, pagination can be used in addition.
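
As a sketch of how a server might support this, the following hypothetical Express handler parses the inline parameter and attaches the requested associations. The route and the findAuthor/findBooksByAuthor/findArticlesByAuthor data-access helpers are assumptions for illustration, not a prescribed implementation:

const express = require('express')
const app = express()

// Hypothetical data-access helpers, assumed for illustration.
const { findAuthor, findBooksByAuthor, findArticlesByAuthor } = require('./db')

app.get('/authors/:id', async (req, res) => {
  const author = await findAuthor(req.params.id)

  // Parse e.g. ?inline=books,articles into ['books', 'articles'].
  const requested = (req.query.inline || '').split(',').filter(Boolean)

  if (requested.includes('books')) {
    author.books = await findBooksByAuthor(author.id)
  }
  if (requested.includes('articles')) {
    author.articles = await findArticlesByAuthor(author.id)
  }

  res.json(author)
})

app.listen(3000)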

 

Base64 Encode and Decode in Node.js

Using Node on the server side, we find that we cannot use the JavaScript functions atob and btoa. They are not defined; we get the following errors:

ReferenceError: btoa is not defined
ReferenceError: atob is not defined

The functions are defined in JavaScript in the browser, but not in server-side Node.js.

To encode to base-64, instead of btoa use the following:

const myString = 'hello'
const encoded = Buffer.from(myString)
                      .toString('base64')
console.log(encoded)

Output:

aGVsbG8=

To decode from base-64, instead of atob use the following:

const base64String = 'aGVsbG8='
const decoded = Buffer.from(base64String, 'base64')
                      .toString('utf-8')
console.log(decoded)

Output:

hello
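
If browser-style names are convenient, the two snippets can be wrapped as drop-in helpers. A minimal sketch:

// Drop-in equivalents of the browser's btoa/atob for Node.js.
const btoa = (text) => Buffer.from(text).toString('base64')
const atob = (base64) => Buffer.from(base64, 'base64').toString('utf-8')

console.log(btoa('hello'))    // aGVsbG8=
console.log(atob('aGVsbG8=')) // hello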

 

Mock Only a Single Function from a Module with Jest

Recall that using jest.mock(module) mocks every function exported from the module.

But sometimes we need to mock only one or some functions in a module, leaving the others’ real implementations as-is.

Suppose we are using the following module of functions:

my-functions.js:

const f = () => 'Return from f'
const g = () => 'Return from g'
const h = () => 'Return from h'

module.exports = {
  f,
  g,
  h
}

The code below uses a Jest spy with a mock implementation to mock out just one function while leaving alone the original implementations of the other two.

my-functions.test.js:

const functions = require('./my-functions')

describe('mock only one function from module', () => {
  it('should return only one mocked result', () => {
    jest.spyOn(functions, 'g')
      .mockImplementation(() => '_mock_')

    console.log(functions.f())
    console.log(functions.g())
    console.log(functions.h())

    expect(functions.f()).toEqual('Return from f')
    expect(functions.g()).toEqual('_mock_')
    expect(functions.h()).toEqual('Return from h')
  })
})

Running the tests:

$ jest my-functions.test.js
PASS ./my-functions.test.js
mock only one function from module
✓ should return only one mocked result (14ms)

console.log my-functions.test.js:8
Return from f

console.log my-functions.test.js:9
_mock_

console.log my-functions.test.js:10
Return from h

Test Suites: 1 passed, 1 total
Tests: 1 passed, 1 total
Snapshots: 0 total
Time: 0.956s
Ran all test suites matching /my-functions.test.js/i.

Note that only g() returns a mocked string.
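
If the real implementation of g is needed again later in the same test file, the spy can be undone with mockRestore. A minimal sketch:

const spy = jest.spyOn(functions, 'g').mockImplementation(() => '_mock_')
expect(functions.g()).toEqual('_mock_')

// Restore the original implementation of g.
spy.mockRestore()
expect(functions.g()).toEqual('Return from g')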

 

Jest Mock Behaviour with mockReturnValueOnce

A Jest mock function can be set to return a specific value for all calls, or just once.

Note that if we define a return value with mockReturnValueOnce, the mock function returns that value for the first call only; it returns undefined for all subsequent calls.
It does not throw an error, so this can go unnoticed and cause undesirable behaviour.

If a Jest mock unexpectedly returns undefined, check how its return value was defined and how many times the mock was called.

The code below shows the different behaviours:

describe('mock once tests', () => {
  test('return the value only once', () => {
    const f = jest.fn()
    f.mockReturnValueOnce('result')

    const result1 = f()
    console.log(result1)

    const result2 = f()
    console.log(result2)

    const result3 = f()
    console.log(result3)
  })

  test('return the value every time', () => {
    const f = jest.fn()
    f.mockReturnValue('result')

    const result1 = f()
    console.log(result1)

    const result2 = f()
    console.log(result2)

    const result3 = f()
    console.log(result3)
  })
})

Output:

result
undefined
undefined

result
result
result
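
Calls to mockReturnValueOnce can also be chained, optionally with a mockReturnValue fallback for any further calls. A minimal sketch:

const f = jest.fn()
  .mockReturnValueOnce('first')
  .mockReturnValueOnce('second')
  .mockReturnValue('default') // used for every call after the first two

console.log(f()) // 'first'
console.log(f()) // 'second'
console.log(f()) // 'default'
console.log(f()) // 'default'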

 

Force Vagrant to Re-download the Box File

Sometimes we want to download the operating system image file again and start from scratch when building a Vagrant VM.

If vagrant up does not load a new box even after a vagrant destroy of the existing VM, do the following.

Ensure that config.vm.box_url in the Vagrantfile is set to the correct URL of the new box.

View existing boxes to get the existing box name:

$ vagrant box list

Output:

myBox (virtualbox, 0)

Fully remove the box file:

$ vagrant box remove myBox

Now the box file listed in the Vagrantfile will be downloaded again when running vagrant up.

 

Format a Date as a Year-Month-Day String in JavaScript

To format a date in YYYYMMDD or similar string format in JavaScript, we can use the following. In this example we use the current date.

Using dateformat:

$ npm install dateformat
const dateFormat = require('dateformat')

const now = new Date()
const dateString = dateFormat(now, 'yyyymmdd')

console.log(dateString)

Result:

20210313

Using moment.js:

$ npm install moment
const moment = require('moment')

const now = new Date()
const dateString = moment(now).format('YYYYMMDD')

console.log(dateString)

Result:

20210313
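
Alternatively, with no dependency at all, the same string can be built from the Date accessors. A minimal sketch:

const now = new Date()

// Pad month and day to two digits, e.g. 3 -> '03'.
const year = now.getFullYear()
const month = String(now.getMonth() + 1).padStart(2, '0')
const day = String(now.getDate()).padStart(2, '0')

console.log(`${year}${month}${day}`) // e.g. 20210313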

References

https://www.npmjs.com/package/dateformat

https://www.npmjs.com/package/moment

 

Make Commit Message Lint Git Hook Run First with Commitlint and Husky

Enforcing a standard for commit messages with a commit-msg hook and something like commitlint is useful, but ideally the commit message lint should run first, before any test suite.
That is, if we have a large unit test suite which must pass before committing is possible, and it takes a long time to run (e.g. a minute or two), it is better for a commit message problem to fail first, quickly. It is annoying to run through a large, slow test suite successfully only to have a commit message lint error prevent the entire commit at the end!

It is not possible to change the order of Git hook execution: pre-commit always runs before commit-msg, and Git does not have access to the commit message at the pre-commit stage.

However, using commitlint and Husky, we can achieve the desired order by simply using the commit-msg hook to trigger our tests instead of the pre-commit hook.
Inside .huskyrc (or the “husky” field inside package.json) we can change this:

{
  "hooks": {
    "commit-msg": "commitlint -e $GIT_PARAMS",
    "pre-commit": "npm test"
  }
}

Into this:

{
  "hooks": {
    "commit-msg": "commitlint -e $GIT_PARAMS && npm test"
  }
}

The commit-msg hook still runs before the commit is actually made, so this effectively remains a pre-commit hook requiring a successful unit test run before one can commit.
But any problem with the commit message itself will show up immediately.

 

Change Commit Message Length Limit in Commitlint Config

Writing descriptive commit messages is helpful to readers of code, but sometimes the default character limit enforced by Commitlint in a given configuration is too short for descriptions of very specific changes. To change the character length limit, do the following.
Suppose we are using an existing set of rules (in this example, the Angular config) and just want to change the character limit to 200.
The config file below should achieve this:

commitlint.config.js

module.exports = {
  extends: ['@commitlint/config-angular'],
  rules: {
    'header-max-length': [2, 'always', 200],
  }
}

This will use all of the rules from config-angular but override the commit message length rule (header-max-length).

The first element 2 means “throw error” (1 means “warning”, 0 means “disable the rule”).
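
For example, the same rule could be downgraded to a warning, or disabled entirely, by changing the first element. A sketch:

module.exports = {
  extends: ['@commitlint/config-angular'],
  rules: {
    'header-max-length': [1, 'always', 200] // 1 = warn instead of error; [0] would disable the rule
  }
}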

References

https://commitlint.js.org/#/reference-rules

Classify Handwritten Digits Using Keras

We can build a very small but very accurate deep learning model for classifying handwritten MNIST digits in Python with Keras, easily trained on a laptop CPU. 

This assumes Keras and numpy are installed; see this post for installing on macOS for example.

The labeled data for training is available inside the Keras library itself, in keras.datasets.

The load_data function returns a pair of tuples (training and test data), each containing a numpy array of images (stored as tensors) and an array of labels. In the training script we use only the training data.

The model is a network of two densely-connected layers. 

We use categorical_crossentropy for the loss function since there are multiple categories to classify (one for each digit). 

The training and testing images are re-shaped to match the input shape of the Dense layer which is the first layer in the model. They are also converted to float32 type. 

The fit function executes the training process, after which the trained model is saved to a file. The classification script will read the model from the file and use it. 

The following is the training script.

train.py

from keras.datasets import mnist
from keras import models
from keras import layers
from keras.utils import to_categorical
 
(trainImages, trainLabels) = mnist.load_data()[0]
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
network.summary()
trainImages = trainImages.reshape((60000, 28 * 28))
trainImages = trainImages.astype('float32') / 255
trainLabels = to_categorical(trainLabels)
network.fit(trainImages, trainLabels, epochs=5, batch_size=128)
network.save('digit_model.keras')

Train the network with: 

$ python train.py

 

To classify a specific image, we select one example from the test data (also available with Keras) and load the saved model from the file to use for classification. Only the test images array is extracted; the specific image to test is chosen by user input.

The trained model is loaded with load_model.

An index is read from the user to choose one of the testing data images (stored as a tensor). The image is reshaped into the shape required as input into the model. 

We use pyplot to show the actual image before running the classification. 

Running the classification is done by calling predict; note that the input being classified as one of the 10 digits must be in the correct shape as a numpy array.

The function argmax is then used to select the highest value out of the list of predictions, which are probabilities indicating how likely it is that the input is in the class at the given position. For example, for an image most likely classified as 2, the third position value will have the highest probability (the list starts from zero). The output is printed to show the actual numerical representation.

The following is the classification script.

classify.py

from keras import models
from keras.datasets import mnist
from matplotlib import pyplot
from numpy import array, argmax

mnistDataTuples = mnist.load_data()
testData = mnistDataTuples[1]
testImages = testData[0]

network = models.load_model('digit_model.keras')

testImages = testImages.reshape((10000, 28 * 28))
testImages = testImages.astype('float32') / 255

testImageIndex = int(input('Select test image index (0-9999): '))
inputImage = testImages[testImageIndex]
inputImageScaled = inputImage.reshape(28, 28)

pyplot.imshow(inputImageScaled, cmap='gray')
pyplot.show() # Opens blocking window; close it to continue.

# Predict class using the correct shape of the test image.
resultPredictions = network.predict( array( [inputImage,] ) )
print(resultPredictions)

resultClass = argmax(resultPredictions)

print('The digit is: ' + str(resultClass))

To run the classification, execute the following and select an index: 

$ python classify.py
Select test image index (0-9999):

The original image will be displayed in a window; after it is closed the result of the classification will be printed in the terminal. 

This can classify the MNIST digits with about 98% accuracy. 

API Design: Slug Fields and Identifiers

Using slug values and parameters in APIs makes URIs more ‘hackable’ and easier to recall, improving Developer Experience (DX), i.e. API UX.
For example, in a Books API, the resource representing the book “Invisible Man” would have a slug equal to “invisible-man” in addition to its canonical numeric ID.
Slugs should be unique; they can be computed by widely available library routines, which typically remove punctuation, lower-case the text and replace whitespace with hyphens.
This ability lets us access specific objects much more easily.
For example, we can use:

GET /api/books/invisible-man

In addition to:

GET /api/books/4456234

Both are valid requests for the object:

{
  "id": "4456234",
  "slug": "invisible-man",
  "title": "Invisible Man",
  "author": "Ralph Ellison"
}
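
As an illustration of the typical slug computation, a minimal slugify helper might look like the following sketch (real libraries handle more edge cases, such as accents and symbol transliteration):

// Minimal slugify sketch: lower-case, strip punctuation, hyphenate whitespace.
const slugify = (text) =>
  text
    .toLowerCase()
    .replace(/[^\w\s-]/g, '') // remove punctuation
    .trim()
    .replace(/\s+/g, '-')     // whitespace -> hyphens

console.log(slugify('Invisible Man')) // 'invisible-man'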

An API is usually up and running in at least three environments: Dev, Staging and Production, and sometimes more.
With slug values identifying objects, a client need not know or discover IDs in different environments. For example, in the Books API we can access the same object across environments with /books/invisible-man on different hosts, such as api.dev.books.com/books/invisible-man or api.books.com/books/invisible-man.
The actual IDs in the underlying databases need not be known; a slug value lets us quickly access the resource’s copy in any environment.

Further, slug values for collection objects can be even more powerful. For example, exploring entire categories of books can be much easier if slug values for categories can be used:

GET /categories/novels/books

Instead of a specific ID-based request:

GET /categories/5423/books

In both cases the result is a list of all books in the category novels.

A frontend app using the API need not know the ID of a category to request objects in the category.
This means the app can make an API call by slug derived directly from the web URL, without having to determine any resource IDs.
Consequently, slugs are ideal for URL paths for web apps, providing a correspondence between user URLs and API parameters for developers.

If a single object can have multiple kinds of slugs, we can use fields like titleSlug and categorySlug.