Classify Handwritten Digits Using Keras

We can build a very small but very accurate deep learning model for classifying handwritten MNIST digits in Python with Keras, easily trained on a laptop CPU. 

This assumes Keras and numpy are installed; see the post on installing Keras on macOS, for example.

The labeled training data ships with the Keras library itself, in keras.datasets.

The load_data function returns a pair of tuples, one for training and one for testing. Each tuple holds two numpy arrays: the images (stored as tensors) and the labels. The training script uses only the training data.

The model is a network of two densely-connected layers. 

We use categorical_crossentropy for the loss function since there are multiple categories to classify (one for each digit). 
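To make the loss more concrete, here is a minimal pure-Python sketch of what categorical cross-entropy computes for a single example. It is an illustration only, not Keras's implementation; the helper names one_hot and categorical_crossentropy below are written for this example, with one_hot standing in for what to_categorical does to one label.

```python
import math

def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(target, predicted):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    eps = 1e-7  # Guard against log(0).
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

target = one_hot(2)          # The true digit is 2.

confident = [0.01] * 10      # Most probability mass on class 2.
confident[2] = 0.91
unsure = [0.1] * 10          # Uniform guess over all ten digits.

print(categorical_crossentropy(target, confident))  # Small loss.
print(categorical_crossentropy(target, unsure))     # Larger loss.
```

The loss shrinks as the predicted probability for the correct digit approaches 1, which is what training pushes the network towards.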

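The training images are reshaped from 28 x 28 arrays into flat vectors of 784 values, matching the input shape of the Dense layer which is the first layer in the model. They are also converted to float32 type and scaled to the range 0 to 1. The classification script later applies the same steps to the test images.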
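The reshaping and scaling step can be pictured for a single image with plain Python lists. This is only a sketch of the idea; the scripts do it for the whole array at once with numpy's reshape and astype, and flatten_and_scale is a hypothetical helper.

```python
def flatten_and_scale(image):
    """Flatten a 28 x 28 grid of 0-255 pixel values into 784 floats in [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]

# A dummy all-black image with one white pixel in the middle.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255

vector = flatten_and_scale(image)
print(len(vector))            # 784
print(vector[14 * 28 + 14])   # 1.0
```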

The fit function executes the training process, after which the trained model is saved to a file. The classification script will read the model from the file and use it. 

The following is the training script.

train.py

from keras.datasets import mnist
from keras import models
from keras import layers
from keras.utils import to_categorical
 
(trainImages, trainLabels) = mnist.load_data()[0]
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
network.summary()
trainImages = trainImages.reshape((60000, 28 * 28))
trainImages = trainImages.astype('float32') / 255
trainLabels = to_categorical(trainLabels)
network.fit(trainImages, trainLabels, epochs=5, batch_size=128)
network.save('digit_model.keras')

Train the network with: 

$ python train.py

 

To classify a specific image, we select one example from the test data (available with Keras) and open the saved model from the file to use for classification. First, only the test images array is extracted; a specific image to test is chosen by user input. 

The trained model is loaded with load_model.

An index is read from the user to choose one of the testing data images (stored as a tensor). The image is reshaped into the shape required as input into the model. 

We use pyplot to show the actual image before running the classification. 

Running the classification is done by calling predict; note that the input being classified as one of the 10 digits must be in the correct shape as a numpy array.

The function argmax is then used to select the highest value out of the list of predictions, which are probabilities indicating how likely it is that the input is in the class at the given position. For example, for an image most likely classified as 2, the third position value will have the highest probability (the list starts from zero). The output is printed to show the actual numerical representation.
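As a small illustration of the argmax step, the sketch below picks the most likely digit from a made-up list of ten probabilities. The values are invented for the example, and the argmax helper is a plain-Python stand-in for numpy's function of the same name.

```python
def argmax(values):
    """Return the index of the largest value in the list."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical softmax output for one image: one probability per digit 0-9.
predictions = [0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

print(argmax(predictions))  # 2, the third position (counting from zero)
```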

The following is the classification script.

classify.py

from keras import models
from keras.datasets import mnist
from matplotlib import pyplot
from numpy import array, argmax

mnistDataTuples = mnist.load_data()
testData = mnistDataTuples[1]
testImages = testData[0]

network = models.load_model('digit_model.keras')

testImages = testImages.reshape((10000, 28 * 28))
testImages = testImages.astype('float32') / 255

testImageIndex = int(input('Select test image index (0-9999): '))
inputImage = testImages[testImageIndex]
inputImageScaled = inputImage.reshape(28, 28)

pyplot.imshow(inputImageScaled, cmap='gray')
pyplot.show() # Opens blocking window; close it to continue.

# Predict class using the correct shape of the test image.
resultPredictions = network.predict( array( [inputImage,] ) )
print(resultPredictions)

resultClass = argmax(resultPredictions)

print('The digit is: ' + str(resultClass))

To run the classification, execute the following and select an index: 

$ python classify.py
Select test image index (0-9999):

The original image will be displayed in a window; after it is closed the result of the classification will be printed in the terminal. 

This can classify the MNIST digits with about 98% accuracy. 

API Design: Slug Fields and Identifiers

Using slug values as parameters in APIs makes resource URIs more ‘hackable’ and easier to recall, which improves Developer Experience (DX), i.e. API UX.
For example, in a Books API, the resource representing the book “Invisible Man” would have a slug equal to “invisible-man” in addition to its numeric canonical ID.
Slugs should be unique; many widely available library routines can compute them, typically by removing punctuation, lower-casing and replacing whitespace with hyphens.
A slug lets us access specific objects much more easily.
For example, we can use:

GET /api/books/invisible-man

In addition to:

GET /api/books/4456234

Both are valid requests for the object:

{
  "id": "4456234",
  "slug": "invisible-man",
  "title": "Invisible Man",
  "author": "Ralph Ellison"
}
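The slug computation described earlier (remove punctuation, lower-case, replace whitespace with hyphens) might be sketched as below. This is a minimal illustration rather than a replacement for an established library: the slugify function is written for this example, it drops non-ASCII letters, and it does not check uniqueness.

```python
import re

def slugify(title):
    """Turn a title into a lower-case, hyphen-separated slug (ASCII only)."""
    slug = title.lower()
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)         # Remove punctuation.
    slug = re.sub(r"[\s-]+", "-", slug).strip("-")   # Whitespace runs become one hyphen.
    return slug

print(slugify("Invisible Man"))  # invisible-man
```

Uniqueness still has to be enforced where the slugs are stored, for example by appending a suffix when two titles produce the same slug.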

An API is usually up and running at least in three environments: Dev, Staging and Production, and sometimes more.
With slug values identifying objects, a client need not know or discover IDs in different environments. For example in the books API, we can access the same object’s different versions across environments with /books/invisible-man, e.g. on different hosts such as api.dev.books.com/books/invisible-man or api.books.com/books/invisible-man.
The actual IDs in the underlying databases need not be known; a slug value lets us quickly access the resource’s copy in any environment.

Further, slug values for collection objects can be even more powerful. For example, exploring entire categories of books can be much easier if slug values for categories can be used:

GET /categories/novels/books

Instead of a specific ID-based request:

GET /categories/5423/books

In both cases the result is a list of all books in the category novels.
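On the server side, one possible way to support both request styles is to resolve the path parameter against either field. The sketch below uses in-memory lists in place of a database; the data, the categoryId field and the function names are all assumptions made for illustration.

```python
# Hypothetical in-memory data standing in for a database.
CATEGORIES = [
    {"id": "5423", "slug": "novels", "name": "Novels"},
]
BOOKS = [
    {"id": "4456234", "slug": "invisible-man",
     "title": "Invisible Man", "categoryId": "5423"},
]

def find_by_id_or_slug(items, key):
    """Return the first item whose id or slug equals `key`, or None."""
    for item in items:
        if key in (item["id"], item["slug"]):
            return item
    return None

def books_in_category(key):
    """List the books in a category addressed by ID or by slug."""
    category = find_by_id_or_slug(CATEGORIES, key)
    if category is None:
        return []
    return [book for book in BOOKS if book["categoryId"] == category["id"]]

print(books_in_category("novels") == books_in_category("5423"))  # True
```

This simple matching assumes a slug can never look like another object's ID; a real API would need a rule for that, such as disallowing all-numeric slugs.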

A frontend app using the API need not know the ID of a category to request objects in the category.
This means the app can make an API call by slug, taken directly from parsing the web URL, without having to determine any resource IDs.
Consequently, slugs are ideal for URL paths for web apps, providing a correspondence between user URLs and API parameters for developers.

If a single object can have multiple kinds of slugs, we can use fields like titleSlug and categorySlug.

Install Keras on macOS

To install the Keras deep learning library on macOS (no GPU required), first make sure to have a recent version of Python 3.

The best way to manage recent Python versions is using pyenv. Assuming starting from scratch:

brew install pyenv

Add these lines to .bash_profile:

# For pyenv.
if command -v pyenv 1>/dev/null 2>&1; then
  eval "$(pyenv init -)"
fi

Be careful with version compatibility; at the time of writing Python 3.6 works well.

pyenv install 3.6.0

pyenv global 3.6.0

First install the TensorFlow backend, then Keras itself:

pip install tensorflow

pip install keras

Test the installation:

python

>>> import keras

The import should succeed without error if the installation is complete.

 

Undo a Commit on the Current Branch in Git

To undo the most recent commit on a branch (e.g. master), unstaging its changes but keeping them as local modifications in the working directory, use the following:

$ git reset HEAD~

Unstaged changes after reset:
M file.js

$ git status
On branch master
Changes not staged for commit:
 (use "git add <file>..." to update what will be committed)
 (use "git checkout -- <file>..." to discard changes in working directory)

modified: file.js

no changes added to commit (use "git add" and/or "git commit -a")
$

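The commit is removed from the branch history, and its changes remain as local, unstaged modifications. The commit itself stays recoverable through git reflog until it is garbage-collected.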

NOTE: this requires at least 2 commits on the branch; otherwise HEAD~ does not refer to an existing commit.

 

Increase Readability of Function Calls in JavaScript with an Argument Object

An argument object is a pattern to create more readable code involving function calls.

As the number of function arguments increases, readability rapidly declines.

We can create cleaner code by using an argument object: an object with keys for each function parameter. The reader does not have to inspect or deduce the meaning of positional arguments.

Suppose we have a function to locate a point based on latitude and longitude, but also with a radius and a flag indicating whether to place a marker.

A function call would look like this:

geoLocate(40.730610, -73.935242, 100.0, true);

Coming across the above call in code would require digging into the meaning of each positional argument.

We can make this more readable by making the function accept an argument object instead as below:

geoLocate({ 
  lat: 40.730610,
  lng: -73.935242,
  radius: 100.0,
  placeMarker: true 
});

This approach is especially helpful with boolean flags as arguments, which as positional parameters can be quite unreadable.

 

REST API Design: Health Check Endpoint

A health-check (or simply health) endpoint can be very useful for testing and inspecting a running API, especially when it exposes implementation-specific information. It can also serve as the endpoint called by automated monitoring and alerting.

For a microservice, the health often depends on several dependencies. The health endpoint can list the status of the dependencies.

An example call:

GET /api/health

{
  "healthy": true,
  "dependencies": [
    {
      "name": "serviceA",
      "healthy": true
    },
    {
      "name": "serviceB",
      "healthy": true
    }
  ]
}

Some other options for naming: “status”, or “isHealthy”.

The response status code is 200 OK for a healthy API, and a good choice for unhealthy is 503 Service Unavailable.
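A handler for this endpoint could derive the overall flag and the status code from the dependency list, roughly as sketched below. The function name build_health_response and the shape of its input are assumptions for this example; how each dependency is actually probed is left out.

```python
def build_health_response(dependencies):
    """Return (status_code, body) for a list of {"name", "healthy"} dicts."""
    healthy = all(dep["healthy"] for dep in dependencies)
    body = {"healthy": healthy, "dependencies": dependencies}
    status_code = 200 if healthy else 503  # 503 Service Unavailable when unhealthy.
    return status_code, body

status_code, body = build_health_response([
    {"name": "serviceA", "healthy": True},
    {"name": "serviceB", "healthy": False},
])
print(status_code)      # 503
print(body["healthy"])  # False
```

Here the API counts as healthy only if every dependency is healthy; a service that can degrade gracefully might choose a looser rule.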

Getting more specific, we can design sub-resources for health information pertaining to subdomains or functionality of the API.
For a concrete example, imagine an API which can report on the status of a data ingest service as part of its own health:

GET /api/health/data-ingest

{
  "isHealthy": false,
  "databaseUpdatedAt": 1592716496,
  "memoryUsage": "255MB"
}

This sub-resource gives us specific information about a data ingest subsystem.

We can use this sort of design to make monitoring more granular: imagine an alert being fired only when certain specific health sub-resources return an error status, but not all.
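The granular alerting idea could look something like the following on the monitoring side. Everything here is hypothetical: the set of watched paths, the /api/health/search sub-resource, and the should_alert helper are made up to illustrate the rule "alert only for certain sub-resources".

```python
# Hypothetical set of health sub-resources that should trigger an alert.
ALERTING_PATHS = {"/api/health/data-ingest"}

def should_alert(status_by_path, alerting_paths=ALERTING_PATHS):
    """Return True if any watched sub-resource reports a 5xx status."""
    return any(
        status >= 500
        for path, status in status_by_path.items()
        if path in alerting_paths
    )

statuses = {
    "/api/health/data-ingest": 200,
    "/api/health/search": 503,  # Hypothetical sub-resource, not watched.
}
print(should_alert(statuses))  # False
```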

Using Headers for Health Status

Another option is to use HTTP headers for specific details and keep the JSON result body small, showing only the status. For example:

GET /api/health/data-ingest

Content-type: application/json
X-Health-Database-Updated-At: 1592716496
X-Health-Memory-Usage: 255MB

{
  "healthy": true
}

 

Get Current Date in Unix Epoch Time in JavaScript

We can get the current Unix epoch timestamp in JavaScript (e.g. Node.js) with the following:

const epoch = Math.round(new Date().getTime() / 1000) 
console.log(epoch)

Result:

1601941415

We can also use valueOf:

const epoch = Math.round(new Date().valueOf() / 1000)
console.log(epoch)

Result:

1601941860

Probably the best way is using the static method now from Date:

const epoch = Math.round(Date.now() / 1000) 
console.log(epoch)

Result:

1601941936

Note that we need to divide by 1000 because the original result is in milliseconds.

 

Dynamically Generate Variable Names in Perl

NOTE: this is not recommended, but it is a powerful feature which can be useful.

We can generate dynamic variable names in Perl using Symbolic References. To use the feature, we have to turn off strict refs.

The code below generates the variable names ‘var1’, ‘var2’, ‘var3’ dynamically in a loop as strings, names which can be used as actual variable names with the help of symbolic references.
Of course, hashes should be used instead whenever possible; this is for demonstration.

use strict;
 
our $var1 = 'a';
our $var2 = 'b';
our $var3 = 'c';

for (my $i = 1; $i < 4; $i++) {
  my $variableName;
  {
    # Symbolic References require 'no strict'.
    no strict 'refs';
    $variableName = ${'var' . $i}; # Dynamic name.
  }
  print $variableName . "\n";
}

Output:

a
b
c

 

Istanbul Ignore Syntax for Jest Code Coverage

Istanbul is the tool Jest uses to calculate test coverage. Sometimes we need to exclude some code from the coverage calculations. This is done with special comments which are parsed by Istanbul. There are a few variations of the syntax.

Ignore a Function

/* istanbul ignore next */
const f = () => {
  return 'abc'
}

This will exclude the entire function from code coverage requirements.

Ignore a Whole File

/* istanbul ignore file */

... file contents ...

Use this as the first line of the file. The entire file will be excluded from code coverage.

Ignore a Method in a Class

class A {
  f() {
    console.log("f called")
  }
  /* istanbul ignore next */ g() {
    console.log("g called")
  }
}

The comment must be on the same line as the method definition, or on the line above it, for the method to be excluded from coverage.

Function Inside an Exported Object

Sometimes we have a module which exports some functions inside an object.
The example below shows how to ignore these for coverage. The comment must be right before the function definition.

module.exports = {
  f: () => { 
    console.log('f called')
  },
  g: /* istanbul ignore next */ () => {
    console.log('g called')
  },
  h: /* istanbul ignore next */ async () => {
    console.log('h called')
  }
}

Note that for async functions we must place the comment before the async keyword.

Ignore Else Cases

To ignore just the else case of a block of code for test coverage, use the syntax as below.

function f(x) {
  /* istanbul ignore else */
  if (x >= 0) {
    console.log('non-negative')
  } else { // Ignore this block for code coverage.
    console.log('negative')
  }
}

NOTE: the ignore-else comment is placed above the if statement to ignore the else case for coverage.

References

https://github.com/gotwarlost/istanbul/blob/master/ignoring-code-for-coverage.md

 

Log Values Inside Ramda.js Pipelines to the Console

It can be difficult to debug Ramda pipelines by logging intermediate values with console.log.

The Ramda function tap comes in handy here and is like a wiretap: we can use it to tap into a functional pipeline to insert some behaviour in between steps, while letting the pipeline continue to work as usual. We can insert R.tap in between functions inside pipelines built with R.compose or R.pipe.

In diagram form:

A -> B -> C
A -> B -> R.tap(fn) -> C

(pipeline proceeds as usual, passing output from B to the input of C)

The following simple example reverses a list and then takes the average (mean).

const R = require('ramda')
 
const f = R.compose(
  R.mean, // Take average.
  R.reverse // Reverse elements.
)

const result = f([1, 2, 3])

console.log(result)

This prints:

2

Suppose we want to debug this function composition and print out the intermediate reversed list.

We can insert the “wiretap” with R.tap with a function to log the input to see the intermediate value flowing through:

const R = require('ramda')
 
const f = R.compose(
  R.mean, // Take average.
  R.tap(x => console.log(x)), // Print and continue.
  R.reverse // Reverse elements.
)

const result = f([1, 2, 3])

console.log(result)

Result:

[ 3, 2, 1 ]
2

Inserting R.tap this way can be very handy for debugging more complex code built with R.pipe and R.compose.

References

https://ramdajs.com/docs/#tap