Test XPath in the Terminal

There is a handy command-line tool to run XPath expressions and test them in the terminal on macOS.
The command is: xpath.

Assuming we have an XML file test.xml:

<books>
  <book id="123">
    <title>Book A</title>
  </book>
  <book id="456">
    <title>Book B</title>
  </book>
</books>

To test an XPath statement on this file, we can use:

$ xpath test.xml "//book[@id=456]"

Result:

Found 1 nodes:
-- NODE --
<book id="456">
  <title>Book B</title>
</book>

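Note that the file argument comes first and the XPath expression second, in double quotes. Newer versions of the xpath script reverse this and take the expression through a flag, as in xpath -e "//book[@id=456]" test.xml, so check the usage message if the command above fails.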
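The same query can also be checked from a script. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree, which supports only a subset of XPath. The inline XML string mirrors test.xml above, and the helper name find_titles is made up for illustration. In this dialect the path has to be relative and the attribute value quoted.

```python
import xml.etree.ElementTree as ET

# Inline copy of test.xml from above, so the sketch is self-contained.
XML = """<books>
  <book id="123">
    <title>Book A</title>
  </book>
  <book id="456">
    <title>Book B</title>
  </book>
</books>"""


def find_titles(xml_text, book_id):
    """Return the titles of all <book> elements with the given id attribute."""
    root = ET.fromstring(xml_text)
    # ElementTree's limited XPath needs a relative path and a quoted value.
    books = root.findall(f".//book[@id='{book_id}']")
    return [book.findtext("title") for book in books]


if __name__ == "__main__":
    print(find_titles(XML, "456"))
```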

 

Merge Standard Error with Standard Output when Using a Pipe

When piping the output of one terminal command to another on a Unix-based system, by default only the Standard Output (stdout) of the first command is piped to the Standard Input (stdin) of the second command.
We can merge streams to make sure stderr is piped to the second command.
The general syntax is:

$ cmd1 2>&1 | cmd2

For a specific example: the “No such file” error below is sent to Standard Error.
The word count command receives an empty input.

$ cat invalid-file | wc
cat: invalid-file: No such file or directory
    0     0     0

Now, if we merge streams, Standard Error is piped to word-count, which counts and prints the number of lines, words, and bytes in the error message:

$ cat invalid-file 2>&1 | wc
    1     7     45

This shows that the error output was sent through the pipe to the second command.
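The same merge can be done when starting a process from a script. Below is a minimal sketch in Python, where stderr=subprocess.STDOUT plays the role of 2>&1. The helper names and the small child process (which stands in for the failing cat) are assumptions for illustration. Unlike the shell example, the unmerged variant discards stderr rather than letting it reach the terminal.

```python
import subprocess
import sys


def run_merged(cmd):
    """Run cmd and return its stdout with stderr merged in (like 2>&1)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # send stderr to the same pipe as stdout
        text=True,
    )
    return result.stdout


def run_stdout_only(cmd):
    """Run cmd and return only its stdout; stderr is discarded in this sketch."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return result.stdout


# A child process that writes only to stderr, standing in for the failing cat.
CHILD = [sys.executable, "-c", "import sys; sys.stderr.write('some error\\n')"]

if __name__ == "__main__":
    print(repr(run_stdout_only(CHILD)))
    print(repr(run_merged(CHILD)))
```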

 

Run a Large Language Model Locally in the Terminal

We can run a Large Language Model (LLM) – although not quite as good as ChatGPT – on a local machine.
One of the easiest to run is Alpaca, a fine-tuning of LLaMA.
The following works on an Apple M1 Mac.

Clone and build the repo:

$ git clone https://github.com/antimatter15/alpaca.cpp

$ cd alpaca.cpp/

$ make chat

Download the pre-trained model weights:

$ wget -O ggml-alpaca-7b-q4.bin -c https://gateway.estuary.tech/gw/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC

(See the source repo below for alternatives if this fails).

Run the model:

$ ./chat

Output:

main: seed = 1679968451
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size = 512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291

== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- If you want to submit another line, end your input in '\'.

> What is the age of the universe?
The current estimate for when our Universe was created, 
according to modern cosmology and astronomy, 
is 13.798 billion years ago (±0.2%).
>

References

https://github.com/antimatter15/alpaca.cpp

When npm prestart is Executed

The package.json file allows us to specify a “prestart” script.
When does the npm prestart script run?
The entry under “prestart” runs first when “start” is called, before the script specified by “start”.

The example below demonstrates this.

package.json:

{
  "name": "example-repo",
  "scripts": {
    "prestart": "echo 'In prestart...'",
    "start": "node index.js"
  }
}

Running start:

$ npm run start

Output:

> prestart
> echo 'In prestart...'

In prestart...

> start
> node index.js

This can be useful for running scripts that are a prerequisite for a build, for example.
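The ordering can be modelled in a few lines. This is a simplified sketch of the naming convention, not npm's actual implementation: for a script name, npm looks for a matching pre-prefixed entry to run before it and a post-prefixed entry to run after it. The function name execution_order is made up for illustration.

```python
import json

# The package.json from the example above.
PACKAGE_JSON = """
{
  "name": "example-repo",
  "scripts": {
    "prestart": "echo 'In prestart...'",
    "start": "node index.js"
  }
}
"""


def execution_order(scripts, name):
    """Return the script names that would run, in order, for `npm run <name>`."""
    candidates = [f"pre{name}", name, f"post{name}"]
    # Only entries that are actually defined in "scripts" are run.
    return [key for key in candidates if key in scripts]


if __name__ == "__main__":
    scripts = json.loads(PACKAGE_JSON)["scripts"]
    print(execution_order(scripts, "start"))
```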

 

REST API Design: Endpoint to Delete a Large Number of Items

In a proper REST API design, we use the DELETE HTTP verb to delete items by ID:

DELETE /resources/{id}

Deleting a list of IDs could look like:

DELETE /resources/{id1},{id2},{id3}

Or as a list of query string parameters:

DELETE /resources?ids=id1,id2,id3

What if the list of items to delete is really large? Meaning, what if clients need to delete items by specifying thousands of IDs?
We then run into the URI length limits imposed by servers, proxies, and clients, so the above options will not be enough.

One way of dealing with this would be to avoid the issue using deletion by some criteria, e.g. tags.

DELETE /resources?tag=items-to-delete

But what if we really have no choice but to delete a very large ad-hoc unpredictable list specified by the client?

The following design can work well. Since a body on a DELETE request has no defined semantics and may be ignored or rejected by servers and intermediaries, we can use POST.

Because the verb POST no longer properly reflects the action being performed by the API, we can add a sub-resource named /bulk-deletion under resources:

POST /resources/bulk-deletion

{
  "idsToDelete": [ id1, id2, id3, id4, ..., idN ]
}

As a variation, the sub-resource could be /deletion-list or even /ids-to-delete:

POST /resources/ids-to-delete

{
  "ids": [ id1, id2, id3, id4, ..., idN ]
}

This way our limit on the number of IDs is the POST body size limit, so the design can handle a very long list.
The path makes the operation unambiguous and also keeps the route name as a noun so we avoid action names in the URI.
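To make the difference concrete, here is a minimal client-side sketch in Python that builds the bulk-deletion request and compares it with the query-string variant. The helper names and the generated IDs are assumptions for illustration. No request is actually sent.

```python
import json


def build_bulk_delete_request(ids):
    """Build the method, path, and JSON body for the bulk-deletion call."""
    body = json.dumps({"idsToDelete": list(ids)})
    return "POST", "/resources/bulk-deletion", body


def query_string_uri(ids):
    """Build the query-string variant, to see how long the URI becomes."""
    return "/resources?ids=" + ",".join(str(i) for i in ids)


if __name__ == "__main__":
    ids = [f"id{n}" for n in range(1, 5001)]  # hypothetical IDs
    method, path, body = build_bulk_delete_request(ids)
    print(method, path, f"(body: {len(body)} bytes)")
    print(f"query-string URI length: {len(query_string_uri(ids))} characters")
```

With thousands of IDs the query-string URI grows to tens of thousands of characters, which is well beyond the few kilobytes many servers accept for a request line, while the POST path stays short and constant.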

 

Install golang-migrate Inside a Docker Container

To run database migrations using golang-migrate in a pipeline for a Go application, we need the binary migrate command available in the container.

Installing the database migration utility golang-migrate for Go inside a Linux container can be accomplished with the following.
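Add this to the Dockerfile (note the full path in the symlink: it applies when Go places the binary under a platform subdirectory such as /go/bin/linux_amd64, which happens when cross-compiling; otherwise the binary is installed directly in /go/bin):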

RUN go install -tags 'postgres' github.com/golang-migrate/migrate/v4/cmd/migrate@latest
RUN ln -s /go/bin/linux_amd64/migrate /usr/local/bin/migrate

After running docker build and docker run, open a shell inside the container to check that everything is correct.
Use docker ps to get the container ID, then docker exec to start the shell. The following commands accomplish this:

$ docker build -t my-tag .

$ docker run -d -p 5000:5000 my-tag

$ docker ps
CONTAINER ID IMAGE ...
8e402138e4b2 my-tag ...

$ docker exec -it 8e402138e4b2 /bin/sh

Now we should have a shell inside the container.
Check if the migrate command installed correctly:

/# which migrate

This should show a path to the command, e.g.:

/go/bin/migrate

Check the version of the command:

/# migrate -version

This should show, e.g.:

dev

The container should now be able to run Go database migrations, preferably by running a startup script when the container starts.
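A startup script of that kind could look like the sketch below. It is written in Python for illustration and assumes Python is available in the container; a plain shell script would be more typical. The migrations directory and database URL are hypothetical placeholders. The -path and -database flags and the up command belong to the migrate CLI.

```python
import shutil
import subprocess


def build_migrate_command(migrations_dir, database_url):
    """Return the argument list that applies all pending migrations."""
    return ["migrate", "-path", migrations_dir, "-database", database_url, "up"]


def run_migrations(migrations_dir, database_url):
    """Run the migrations and return the exit code of the migrate command."""
    if shutil.which("migrate") is None:
        raise FileNotFoundError("migrate binary not found on PATH")
    command = build_migrate_command(migrations_dir, database_url)
    return subprocess.run(command).returncode
```

Calling run_migrations("/app/migrations", "postgres://user:pass@db:5432/app?sslmode=disable") before starting the application would apply any pending migrations. Both arguments here are made-up values.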

 

Perform Speech Recognition in the Terminal with Whisper

OpenAI Whisper is a state-of-the-art speech recognition model that we can run from the command line.

This post assumes macOS with Python >= 3.7 installed.

First we need to install FFmpeg for audio processing.

$ brew install ffmpeg

Install Whisper:

$ pip install openai-whisper

This will also install a command-line program: whisper

Now, record a piece of audio using QuickTime or similar.

Save the file to file.m4a, for example.

Then, to run the speech recognition:

$ whisper file.m4a --model small

The output will look something like this:

Detecting language using up to the first 30 seconds. 
Use `--language` to specify the language
Detected language: English
[00:00.000 --> 00:01.420] Hello there.
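If the transcript needs further processing, the timestamped lines can be parsed with a short script. The sketch below is in Python and assumes the line format shown in the sample output above. Longer recordings may add an hours field, which the pattern also allows. The function names are made up for illustration.

```python
import re

# Matches lines such as: [00:00.000 --> 00:01.420] Hello there.
SEGMENT = re.compile(r"^\[(\d+(?::\d+)+\.\d+) --> (\d+(?::\d+)+\.\d+)\]\s*(.*)$")


def to_seconds(stamp):
    """Convert a colon-separated timestamp such as 00:01.420 to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_segments(output):
    """Return (start, end, text) tuples for each timestamped line."""
    segments = []
    for line in output.splitlines():
        match = SEGMENT.match(line.strip())
        if match:  # skip status lines such as "Detected language: English"
            start, end, text = match.groups()
            segments.append((to_seconds(start), to_seconds(end), text))
    return segments


SAMPLE = """Detected language: English
[00:00.000 --> 00:01.420] Hello there.
"""

if __name__ == "__main__":
    print(parse_segments(SAMPLE))
```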

Notes

We can also install directly from the repo URI if the PyPI package does not work on a system:

$ pip install git+https://github.com/openai/whisper.git

We can use the medium or large models if the small model is not sufficiently accurate:

$ whisper file.m4a --model medium
$ whisper file.m4a --model large

 

Run TypeScript File from the Command Line

To run a script written in TypeScript directly from the terminal, we cannot simply pass it to the regular Node.js binary, which has traditionally only executed JavaScript.
Instead, we can install the ts-node command, which compiles and runs the file in one step:

$ npm install -g ts-node

If TypeScript itself is not installed, install it with:

$ npm install -g typescript

Then, we can run the TypeScript file using:

$ ts-node file.ts

 

GraphQL Mesh Gateway Health Check Endpoint

When running the GraphQL Mesh server in a cloud environment in a container, perhaps using a platform like Kubernetes or Elastic Container Service on AWS, we need to specify an HTTP endpoint to act as a health check for the container.

Though it is not well-documented, the Mesh server does have a healthcheck endpoint available.

The route is simply: /healthcheck

Here is an example calling the healthcheck of a locally running Mesh server:

$ curl "http://localhost:4000/healthcheck" -v
* Connected to localhost (127.0.0.1) port 4000 (#0)
> GET /healthcheck HTTP/1.1
> Host: localhost:4000
> User-Agent: curl/7.77.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 19 Dec 2022 03:06:17 GMT
< Connection: keep-alive
< Keep-Alive: timeout=5
< Content-Length: 0
<

Note that the root path (“/”) returns an HTTP 302 Found, so /healthcheck is the better choice when the platform expects a 200 OK health status response.
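The same check can be scripted, for example in a smoke test. Below is a minimal sketch in Python using only the standard library. The function name is_healthy and the timeout value are assumptions for illustration.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True if GET <base_url>/healthcheck answers with HTTP 200."""
    url = base_url.rstrip("/") + "/healthcheck"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and error statuses count as unhealthy.
        return False
```

For the local server in the example above, the call would be is_healthy("http://localhost:4000").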

Set a POST Request to be a Query in GraphQL Mesh using OpenAPI

When integrating REST APIs into GraphQL Mesh through the OpenAPI plugin, sometimes we want to specify that a POST path is actually a query and not a mutation.
That is, we need to override the default GraphQL Mesh OpenAPI/Swagger plugin behaviour of assuming that a path with a POST method is a mutation.

In the GraphQL Mesh YAML file (.meshrc.yml) we use the following in the declaration of our example service:

- name: example-service
  handler:
    openapi:
      source: '${EXAMPLE_SERVICE_BASE_URI}/docs'
      baseUrl: '${EXAMPLE_SERVICE_BASE_URI}'
      operationHeaders:
        Authorization: "{context.headers['authorization']}"
      selectQueryOrMutationField:
        - fieldName: 'exampleAction'
          type: Query

Note that fieldName is the value of operationId in the OpenAPI Spec (in this example: ‘exampleAction’).

Version Differences

Important: there was a change in the syntax for this feature in November 2022.

The previous format was as below:

- name: example-service
  handler:
    openapi:
      source: '${EXAMPLE_SERVICE_BASE_URI}/docs'
      baseUrl: '${EXAMPLE_SERVICE_BASE_URI}'
      operationHeaders:
        Authorization: "{context.headers['authorization']}"
      selectQueryOrMutationField:
        - title: 'Example Service Spec'
          path: /v1/example-service/resource
          method: post
          type: Query

In this older format, make sure title is a key in the YAML entry and exactly matches the title in the referenced OpenAPI Spec.

To confirm either format works, start the Mesh server and check that the operation shows up as a Query instead of a Mutation.

References

https://github.com/Urigo/graphql-mesh/discussions/2921