Force Vagrant to Re-download the Box File

Sometimes we want to download the operating system image file again and start from scratch when building a Vagrant VM.

If vagrant up does not load a new box even after a vagrant destroy of the existing VM, do the following.

Ensure the box_url variable in the Vagrantfile is set to the correct URL of the new box (config.vm.box_url).

View existing boxes to get the existing box name:

$ vagrant box list

Output:

myBox (virtualbox, 0)

Fully remove the box file:

$ vagrant box remove myBox

Now the box file listed in the Vagrantfile will be downloaded again when running vagrant up.

 

Format a Date as a Year-Month-Day String in JavaScript

To format a date in YYYYMMDD or similar string format in JavaScript, we can use the following. In this example we use the current date.

Using dateformat:

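$ npm install dateformat

// require() works with dateformat 4.x; versions 5 and later are ES modules and need import instead.
const dateFormat = require('dateformat')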

const now = new Date()
const dateString = dateFormat(now, 'yyyymmdd')

console.log(dateString)

Result:

20210313

Using moment.js:

$ npm install moment

const moment = require('moment')

const now = new Date()
const dateString = moment(now).format('YYYYMMDD')

console.log(dateString)

Result:

20210313
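
Without a library:

For this one fixed format, the built-in Date methods are enough. The following is only a minimal sketch: the helper name formatYyyymmdd is hypothetical and not part of either package above, and it uses the local time zone.

```javascript
// Hypothetical helper: builds a YYYYMMDD string from a Date using local time.
function formatYyyymmdd(date) {
  const year = String(date.getFullYear()).padStart(4, '0')
  // getMonth() is zero-based, so add 1 before padding.
  const month = String(date.getMonth() + 1).padStart(2, '0')
  const day = String(date.getDate()).padStart(2, '0')
  return year + month + day
}

// Month index 2 is March.
console.log(formatYyyymmdd(new Date(2021, 2, 13))) // 20210313
```

For anything beyond a single fixed pattern, one of the libraries above is likely the more convenient choice.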

References

https://www.npmjs.com/package/dateformat

https://www.npmjs.com/package/moment

 

Make Commit Message Lint Git Hook Run First with Commitlint and Husky

Enforcing a commit message standard with a commit-msg hook, using a tool such as commitlint, is useful. Ideally, though, the commit message lint should run first, before any test suite.
That is, if a large unit test suite must pass before a commit is possible, and it takes a long time to run (e.g. a minute or two), it is better for a commit message problem to fail first, quickly. It is annoying to wait for a large, slow test suite to pass only to have the commit message lint reject the entire commit at the end!

It is not possible to change the order in which Git executes its hooks: pre-commit always runs before commit-msg, and Git does not have access to the commit message while the pre-commit hook is running.

However, using commitlint and Husky, we can achieve the desired order by simply using the commit-msg hook to trigger our tests instead of the pre-commit hook.
Inside .huskyrc (or the “husky” field inside package.json) we can change this:

{
  "hooks": {
    "commit-msg": "commitlint -e $GIT_PARAMS",
    "pre-commit": "npm test"
  }
}

Into this:

{
  "hooks": {
    "commit-msg": "commitlint -e $GIT_PARAMS && npm test"
  }
}

The commit-msg hook runs before the commit is actually made, so it still acts as a pre-commit gate: the unit tests must run successfully before one can commit.
But any problem with the commit message itself will show up immediately.

 

Change Commit Message Length Limit in Commitlint Config

Writing descriptive commit messages is helpful to readers of code, but sometimes the default character limit that commitlint enforces in a given configuration is too short to describe very specific changes. To change the limit, do the following.
Suppose we are using an existing set of rules (in this example, the Angular config) and just want to change the character limit to 200.
The config file below should achieve this:

commitlint.config.js

module.exports = {
  extends: ['@commitlint/config-angular'],
  rules: {
    'header-max-length': [2, 'always', 200],
  }
}

This will use all of the rules from config-angular but override the maximum length of the commit message header (header-max-length).

The first element 2 means “throw error” (1 means “warning”, 0 means “disable the rule”).

References

https://commitlint.js.org/#/reference-rules

Classify Handwritten Digits Using Keras

We can build a very small but very accurate deep learning model for classifying handwritten MNIST digits in Python with Keras, easily trained on a laptop CPU. 

This assumes Keras and numpy are installed; for example, see the post Install Keras on macOS.

The labeled data for training is available inside the Keras library itself, in keras.datasets.

The load_data function returns a pair of tuples, one for training and one for testing. Each tuple holds two numpy arrays: the images (stored as tensors) and their labels. In the training script, we only use the training data.

The model is a network of two densely-connected layers. 

We use categorical_crossentropy for the loss function since there are multiple categories to classify (one for each digit). 

The training and testing images are reshaped to match the input shape of the Dense layer, which is the first layer in the model. They are also converted to float32 and divided by 255 so that pixel values lie between 0 and 1.

The fit function executes the training process, after which the trained model is saved to a file. The classification script will read the model from the file and use it. 

The following is the training script.

train.py

from keras.datasets import mnist
from keras import models
from keras import layers
from keras.utils import to_categorical
 
# Load only the training split (the first of the two tuples).
(trainImages, trainLabels) = mnist.load_data()[0]

# Two densely-connected layers; the last outputs one probability per digit.
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
network.compile(
    optimizer='rmsprop',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
network.summary()

# Flatten each 28x28 image and scale pixel values to the range 0-1.
trainImages = trainImages.reshape((60000, 28 * 28))
trainImages = trainImages.astype('float32') / 255

# One-hot encode the labels, as categorical_crossentropy expects.
trainLabels = to_categorical(trainLabels)

network.fit(trainImages, trainLabels, epochs=5, batch_size=128)
network.save('digit_model.keras')

Train the network with: 

$ python train.py

 

To classify a specific image, we select one example from the test data (available with Keras) and open the saved model from the file to use for classification. First, only the test images array is extracted; a specific image to test is chosen by user input. 

The trained model is loaded with load_model.

An index is read from the user to choose one of the testing data images (stored as a tensor). The image is reshaped into the shape required as input into the model. 

We use pyplot to show the actual image before running the classification. 

Running the classification is done by calling predict; note that the input being classified as one of the 10 digits must be in the correct shape as a numpy array.

The function argmax is then used to select the position of the highest value in the list of predictions, which are probabilities indicating how likely it is that the input is in the class at the given position. For example, for an image most likely classified as 2, the value in the third position will have the highest probability (positions start from zero), so argmax returns 2. The list of predictions is also printed to show the actual numerical values.

The following is the classification script.

classify.py

from keras import models
from keras.datasets import mnist
from matplotlib import pyplot
from numpy import array, argmax

mnistDataTuples = mnist.load_data()
testData = mnistDataTuples[1]
testImages = testData[0]

network = models.load_model('digit_model.keras')

testImages = testImages.reshape((10000, 28 * 28))
testImages = testImages.astype('float32') / 255

testImageIndex = int(input('Select test image index (0-9999): '))
inputImage = testImages[testImageIndex]
inputImageScaled = inputImage.reshape(28, 28)

pyplot.imshow(inputImageScaled, cmap='gray')
pyplot.show() # Opens blocking window; close it to continue.

# Predict class using the correct shape of the test image.
resultPredictions = network.predict( array( [inputImage,] ) )
print(resultPredictions)

resultClass = argmax(resultPredictions)

print('The digit is: ' + str(resultClass))

To run the classification, execute the following and select an index: 

$ python classify.py
Select test image index (0-9999):

The original image will be displayed in a window; after it is closed the result of the classification will be printed in the terminal. 

This can classify the MNIST digits with about 98% accuracy. 

API Design: Slug Fields and Identifiers

Using slug values as fields and parameters in APIs makes resource URIs more ‘hackable’ and easier to recall, which helps Developer Experience (DX), i.e. API UX.
For example, in a Books API, the resource representing the book “Invisible Man” would have a slug equal to “invisible-man” in addition to its numeric canonical ID.
Slugs should be unique; they can be computed with widely available library routines, which typically remove punctuation, lower-case the text and replace whitespace with hyphens.
This ability lets us access specific objects much more easily.
For example, we can use:

GET /api/books/invisible-man

In addition to:

GET /api/books/4456234

Both are valid requests for the object:

{
  "id": "4456234",
  "slug": "invisible-man",
  "title": "Invisible Man",
  "author": "Ralph Ellison"
}
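
The slug computation and the two kinds of lookup described above can be sketched as follows. This is only an illustration under stated assumptions: slugify and findBook are hypothetical names, the in-memory array stands in for a real data store, and the slug routine handles ASCII text only.

```javascript
// Hypothetical in-memory data standing in for a real database.
const books = [
  { id: '4456234', slug: 'invisible-man', title: 'Invisible Man', author: 'Ralph Ellison' },
];

// Lower-case, strip punctuation, and replace runs of whitespace with hyphens.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')
    .trim()
    .replace(/[\s-]+/g, '-');
}

// Resolve a path segment that may be either a canonical ID or a slug.
function findBook(idOrSlug) {
  return books.find((book) => book.id === idOrSlug || book.slug === idOrSlug);
}

console.log(slugify('Invisible Man'));        // invisible-man
console.log(findBook('invisible-man').title); // Invisible Man
console.log(findBook('4456234').title);       // Invisible Man
```

Matching on either field keeps one route handler for both request forms; it assumes a slug can never collide with an ID, which holds here only because slugs derived from titles are unlikely to be purely numeric.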

An API is usually up and running at least in three environments: Dev, Staging and Production, and sometimes more.
With slug values identifying objects, a client need not know or discover IDs in different environments. For example in the books API, we can access the same object’s different versions across environments with /books/invisible-man, e.g. on different hosts such as api.dev.books.com/books/invisible-man or api.books.com/books/invisible-man.
The actual IDs in the underlying databases need not be known; a slug value lets us quickly access the resource’s copy in any environment.

Further, slug values for collection objects can be even more powerful. For example, exploring entire categories of books can be much easier if slug values for categories can be used:

GET /categories/novels/books

Instead of a specific ID-based request:

GET /categories/5423/books

In both cases the result is a list of all books in the category novels.

A frontend app using the API need not know the ID of a category to request objects in the category.
This means the app can make an API call by slug directly from parsing the web URL, without having to determine any resource IDs.
Consequently, slugs are ideal for URL paths for web apps, providing a correspondence between user URLs and API parameters for developers.

If a single object can have multiple kinds of slugs, we can use fields like titleSlug and categorySlug.

Install Keras on macOS

To install the Keras deep learning library on macOS (no GPU required), first make sure to have a recent version of Python 3.

The best way to manage recent Python versions is using pyenv. Assuming starting from scratch:

brew install pyenv

Add these lines to .bash_profile:

# For pyenv.
if command -v pyenv 1>/dev/null 2>&1; then
  eval "$(pyenv init -)"
fi

Be careful with version compatibility; at the time of writing Python 3.6 works well.

pyenv install 3.6.0

pyenv global 3.6.0

First install the TensorFlow backend, then Keras itself:

pip install tensorflow

pip install keras

Test the installation:

python

>>> import keras

The import should succeed without error if the installation is complete.

 

Undo a Commit on the Current Branch in Git

To undo the most recent commit on a branch (e.g. master), unstage its changes and keep them as local changes in the working directory, use the following:

$ git reset HEAD~

Unstaged changes after reset:
M file.js

$ git status
On branch master
Changes not staged for commit:
 (use "git add <file>..." to update what will be committed)
 (use "git checkout -- <file>..." to discard changes in working directory)

modified: file.js

no changes added to commit (use "git add" and/or "git commit -a")
$

The commit is removed from the branch, and its changes remain as local changes only.

NOTE: this requires at least 2 commits in the repo or the reference will not exist.

 

Increase Readability of Function Calls in JavaScript with an Argument Object

An argument object is a pattern to create more readable code involving function calls.

As the number of function arguments increases, readability rapidly declines.

We can create cleaner code by using an argument object: an object with keys for each function parameter. The reader does not have to inspect or deduce the meaning of positional arguments.

Suppose we have a function to locate a point based on latitude and longitude, but also with a radius and a flag indicating placing a marker.

A function call would look like this:

geoLocate(40.730610, -73.935242, 100.0, true);

Coming across the above call in code would require digging into the meaning of each positional argument.

We can make this more readable by making the function accept an argument object instead as below:

geoLocate({ 
  lat: 40.730610,
  lng: -73.935242,
  radius: 100.0,
  placeMarker: true 
});

This approach is especially helpful with boolean flags as arguments, which as positional parameters can be quite unreadable.
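
On the receiving side, the function can take the argument object apart with parameter destructuring. The following is a minimal sketch, not a real geolocation implementation: the body only echoes its inputs, and the default values for radius and placeMarker are assumptions added for illustration.

```javascript
// Hypothetical implementation: the body only returns what it received.
// Destructuring in the parameter list names each argument and allows defaults.
function geoLocate({ lat, lng, radius = 50.0, placeMarker = false }) {
  return { lat, lng, radius, placeMarker };
}

// Arguments can be passed in any order, and optional ones can be omitted.
const result = geoLocate({ placeMarker: true, lng: -73.935242, lat: 40.730610 });

console.log(result.radius);      // 50
console.log(result.placeMarker); // true
```

A side benefit of this design is that adding a new optional argument later does not break existing calls.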

 

REST API Design: Health Check Endpoint

A health-check, or simply health, endpoint can be very useful for testing and inspecting a running API, especially when it exposes implementation-specific information. It can also serve as the endpoint called by automated monitoring and alerting.

For a microservice, the health often depends on several dependencies. The health endpoint can list the status of the dependencies.

An example call:

GET /api/health

{
  "healthy": true,
  "dependencies": [
    {
      "name": "serviceA",
      "healthy": true
    },
    {
      "name": "serviceB",
      "healthy": true
    }
  ]
}

Some other options for naming the field: “status” or “isHealthy”.

The response status code is 200 OK for a healthy API, and a good choice for unhealthy is 503 Service Unavailable.
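
One way to derive the body and status code from dependency checks is sketched below. It assumes, as a design choice not stated above, that the API counts as healthy only when every dependency is healthy; buildHealthResponse is a hypothetical helper, independent of any web framework.

```javascript
// Hypothetical helper: derives overall health and the HTTP status code
// from a list of dependency check results.
function buildHealthResponse(dependencies) {
  // Assumption: a single unhealthy dependency makes the whole API unhealthy.
  const healthy = dependencies.every((dependency) => dependency.healthy);
  return {
    statusCode: healthy ? 200 : 503,
    body: { healthy, dependencies },
  };
}

const response = buildHealthResponse([
  { name: 'serviceA', healthy: true },
  { name: 'serviceB', healthy: false },
]);

console.log(response.statusCode);   // 503
console.log(response.body.healthy); // false
```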

Getting more specific, we can design sub-resources for health information pertaining to subdomains or functionality of the API.
For a concrete example, imagine an API which can report on the status of a data ingest service as part of its own health:

GET /api/health/data-ingest

{
  "isHealthy": false,
  "databaseUpdatedAt": 1592716496,
  "memoryUsage": "255MB"
}

This sub-resource gives us specific information about a data ingest subsystem.

We can use this sort of design to make monitoring more granular: imagine an alert being fired only when certain specific health sub-resources return an error status, but not all.

Using Headers for Health Status

Another option is to use HTTP headers for specific details and keep the JSON result body small, showing only the status. For example:

GET /api/health/data-ingest

Content-type: application/json
X-Health-Database-Updated-At: 1592716496
X-Health-Memory-Usage: 255MB

{
  "healthy": true
}
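
A header-based response like the one above could be assembled as in the following sketch. The helper name buildHeaderHealthResponse and the shape of its input are assumptions for illustration; it returns a plain object rather than using a particular web framework.

```javascript
// Hypothetical helper: puts health details in X-Health-* headers and keeps
// the JSON body down to the status flag only.
function buildHeaderHealthResponse({ healthy, databaseUpdatedAt, memoryUsage }) {
  return {
    statusCode: healthy ? 200 : 503,
    headers: {
      'Content-Type': 'application/json',
      // Header values are strings, so the timestamp is converted explicitly.
      'X-Health-Database-Updated-At': String(databaseUpdatedAt),
      'X-Health-Memory-Usage': memoryUsage,
    },
    body: JSON.stringify({ healthy }),
  };
}

const ingestHealth = buildHeaderHealthResponse({
  healthy: true,
  databaseUpdatedAt: 1592716496,
  memoryUsage: '255MB',
});

console.log(ingestHealth.headers['X-Health-Memory-Usage']); // 255MB
console.log(ingestHealth.body);                             // {"healthy":true}
```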