Synthesize Speech in a Different Language using Python

To synthesize speech in Python in a language other than English using pyttsx3, we need to find which voice is available for the desired language.

First, we can print out the list of all available voices.
Each of the voice objects will include a list of languages that the voice supports (usually one).

In this example we will synthesize a string in Polish. For any other language, find a voice that supports it in the full list of voices.

 

import pyttsx3

synthesizer = pyttsx3.init()

voices = synthesizer.getProperty("voices")

for voice in voices:
  if "zosia" in voice.id: # The Polish voice.
    print(voice.id) # Full ID string.
    print("Languages for voice:")
    print(voice.languages)

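# The language follows from the chosen voice; "language" is not among the
# properties that pyttsx3 documents for setProperty (rate, voice, volume).
# This voice ID is specific to macOS.
synthesizer.setProperty("voice",
  "com.apple.speech.synthesis.voice.zosia"
)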

synthesizer.say("Cześć, jak się masz?")

synthesizer.runAndWait()
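
Rather than matching on a known voice name such as "zosia", the voice list can be filtered by language. The sketch below is one possible way to do that. The helper name voices_for_language and the stand-in voice objects are assumptions made for illustration, and the entries in voice.languages are assumed to be either str or bytes depending on the platform driver.

```python
from types import SimpleNamespace


def voices_for_language(voices, language_code):
    """Return the voices whose `languages` list mentions the given code.

    Entries are normalized to lowercase text first, because some drivers
    report them as bytes (possibly with a leading control byte).
    """
    code = language_code.lower()
    matches = []
    for voice in voices:
        for entry in voice.languages or []:
            if isinstance(entry, bytes):
                entry = entry.decode("utf-8", errors="ignore")
            # Substring match, so "pl" matches both "pl_PL" and "\x05pl".
            if code in str(entry).lower():
                matches.append(voice)
                break
    return matches


# Stand-in voice objects so the sketch runs without pyttsx3 or audio hardware.
# With pyttsx3 installed, pass synthesizer.getProperty("voices") instead.
sample_voices = [
    SimpleNamespace(id="com.apple.speech.synthesis.voice.Alex", languages=["en_US"]),
    SimpleNamespace(id="com.apple.speech.synthesis.voice.zosia", languages=["pl_PL"]),
]

for voice in voices_for_language(sample_voices, "pl"):
    print(voice.id)
```

The substring match is deliberately loose, so short codes may match more voices than intended. Check the printed IDs before choosing one.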

API Design: Paginated Responses by Default

One important way to reduce performance issues and potential abuse in an API is using pagination by default.

For example, suppose we have a call like:

GET /items

Conceptually, this REST resource represents a list of all items available.

In a real production API, however, this call should return only the first page by default. Specifically:

GET /items

should be an equivalent call to:

GET /items?page=1

This is because as the items collection grows, in theory the list of all items can become extremely large.

If /items attempts to return all items at once, the endpoint becomes a performance problem and a potential API security issue: it opens the API up to Resource Exhaustion Attacks. Attackers can abuse the API by requesting very large lists repeatedly, in parallel, potentially depleting server resources and causing denial-of-service to legitimate users.
Implementing pagination by default helps prevent this abuse.

There should also be a hard limit on the maximum page size, for calls where the page size is specified.
For example, 500 items could be a hard maximum.
For any larger page sizes, we can return an error such as the following:

GET /items?pageSize=501
400 Bad Request
{
  "error": "Page size too large"
}

Following these guidelines will help an API be more performant and resistant to abuse.
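
The rules above (first page by default, hard maximum on page size) can be sketched in a few lines of Python. This is a minimal, framework-free illustration, not a definitive implementation. The names get_items_page and handle_get_items and the default page size of 50 are assumptions. Only the 500-item maximum and the error message come from the text.

```python
DEFAULT_PAGE_SIZE = 50   # assumed default; the text does not specify one
MAX_PAGE_SIZE = 500      # hard maximum from the example above


class PageSizeTooLarge(ValueError):
    """Raised when the requested page size exceeds the hard limit."""


def get_items_page(items, page=1, page_size=DEFAULT_PAGE_SIZE):
    """Return one page of items; page numbers start at 1."""
    if page < 1 or page_size < 1:
        raise ValueError("page and pageSize must be positive")
    if page_size > MAX_PAGE_SIZE:
        raise PageSizeTooLarge("Page size too large")
    start = (page - 1) * page_size
    return items[start:start + page_size]


def handle_get_items(items, query):
    """Map a parsed query string (a dict) to (status code, response body)."""
    try:
        page = int(query.get("page", 1))
        page_size = int(query.get("pageSize", DEFAULT_PAGE_SIZE))
        return 200, {"items": get_items_page(items, page, page_size)}
    except ValueError as error:  # also covers PageSizeTooLarge
        return 400, {"error": str(error)}


all_items = list(range(1200))
print(handle_get_items(all_items, {}) == handle_get_items(all_items, {"page": "1"}))
print(handle_get_items(all_items, {"pageSize": "501"}))
```

Because the defaults live in one place, a call without parameters can never return the whole collection.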

 

Read Header Values from a File in a cURL Request

It can be cumbersome to type many different header names and values when composing a cURL command.

We can read all of the headers sent with a request from a file using the syntax below.

NOTE: make sure to have curl version 7.55.0 or higher.

To check:

$ curl --version

To cURL with headers read from a file:

$ curl -H @headers_file.txt http://somesite.com

Here is an example file:

headers_file.txt:

Accept: application/json
Content-type: application/json

We can confirm the headers are sent correctly using verbose mode (-v).

$ curl -H @headers_file.txt http://somesite.com -v

As another test, we can see exactly what is sent to a remote server by first receiving the request locally with netcat. In one terminal, start a listener:

$ nc -l 9090

Then launch the request in a second terminal:

$ curl -H @headers_file.txt http://localhost:9090

In the terminal listening with netcat, we should receive a request with the headers specified in the file:

GET / HTTP/1.1
Host: localhost:9090
User-Agent: curl/7.77.0
Accept: application/json
Content-type: application/json

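Note that a header in the file overrides curl's default value for the same header name. In the request above, the default "Accept: */*" is replaced by "Accept: application/json".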
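
The override behaviour can be pictured with a small Python model. This is only a simplified sketch of the outcome seen in the netcat output, not how curl is implemented. The function names parse_headers and apply_overrides are made up for illustration.

```python
def parse_headers(text):
    """Parse 'Name: value' lines into a list of (name, value) pairs."""
    headers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(":", 1)
        headers.append((name.strip(), value.strip()))
    return headers


def apply_overrides(defaults, overrides):
    """Return defaults with same-named headers replaced (case-insensitive)."""
    merged = {name.lower(): (name, value) for name, value in defaults}
    for name, value in overrides:
        merged[name.lower()] = (name, value)
    return list(merged.values())


# Defaults as seen in the netcat output, plus curl's default Accept header.
default_headers = [
    ("Host", "localhost:9090"),
    ("User-Agent", "curl/7.77.0"),
    ("Accept", "*/*"),
]
file_contents = "Accept: application/json\nContent-type: application/json\n"

final_headers = apply_overrides(default_headers, parse_headers(file_contents))
for name, value in final_headers:
    print(f"{name}: {value}")
```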

 

Synthesize Speech using Python

We can perform text-to-speech in Python using the pyttsx3 speech synthesis library.

Install the pyttsx3 library:

$ pip install pyttsx3

The following example script will synthesize the audio for speaking “hello”.

synthesize-hello.py:

import pyttsx3

synthesizer = pyttsx3.init()

synthesizer.say("hello")

synthesizer.runAndWait()
synthesizer.stop()

To perform speech synthesis with a specific voice, use the following script.
The voice ID prefix it uses is specific to macOS.

synthesize-by-voice.py:

import pyttsx3

synthesizer = pyttsx3.init()

voices = synthesizer.getProperty("voices")
for voice in voices:
  print(voice.id)

voiceChoice = input("Enter name: ")

synthesizer.setProperty("voice",
  "com.apple.speech.synthesis.voice." + str(voiceChoice))

stringToSay = input("Enter text to read: ")

synthesizer.say(stringToSay)
synthesizer.runAndWait()
synthesizer.stop()

The example run below shows the available voices and the input choosing a specific voice.
The input string is then synthesized as speech.

com.apple.speech.synthesis.voice.Alex
com.apple.speech.synthesis.voice.alice
com.apple.speech.synthesis.voice.alva

...

com.apple.speech.synthesis.voice.yuri
com.apple.speech.synthesis.voice.zosia
com.apple.speech.synthesis.voice.zuzana

Enter name: yuri
Enter text to read: this is a fake voice

The output is audio: the entered text is spoken in the chosen voice.

Sentiment Analysis with Keras based on Twitter Training Data

This post shows how to build a sentiment classifier for strings using Deep Learning, specifically Keras.
The classifier is trained from scratch using labelled data from Twitter.
The data set can be obtained from:
The CSV data looks like the following, inside the file:

training.1600000.processed.noemoticon.csv:

"0","1467810369","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","_TheSpecialOne_","@switchfoot http://twitpic.com/2y1zl - Awww, that's a bummer.  You shoulda got David Carr of Third Day to do it. ;D"
"0","1467810672","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","scotthamilton","is upset that he can't update his Facebook by texting it... and might cry as a result  School today also. Blah!"
"0","1467810917","Mon Apr 06 22:19:53 PDT 2009","NO_QUERY","mattycus","@Kenichan I dived many times for the ball. Managed to save 50%  The rest go out of bounds"
"0","1467811184","Mon Apr 06 22:19:57 PDT 2009","NO_QUERY","ElleCTF","my whole body feels itchy and like its on fire "
...

The first value is the sentiment (“0” is negative, “4” is positive).

To start, we need to convert the file to remove characters that are not valid UTF-8:

$ iconv -c training.1600000.processed.noemoticon.csv > training-data.csv

The -c option tells iconv to omit characters that cannot be converted, instead of stopping with an error.

The file training-data.csv is now the input for the next step.
The following script cleans the data to extract only the parts we need: the strings and the labelled sentiment values.

prepare-training-data.py:

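import numpy as np

# Read all lines of the converted CSV into memory.
with open("training-data.csv") as inputFile:
  lines = inputFile.readlines()

# Shuffle so that the subset taken later for training is a random sample.
np.random.shuffle(lines)

outputLines = []

for line in lines:
  # Note: a plain split on "," also splits on commas inside the tweet text,
  # so parts[5] holds the text only up to its first comma.
  parts = line.split(",")
  sentiment = parts[0]
  text = parts[5]
  outputLine = text.strip() + " , " + sentiment + "\n"
  outputLines.append(outputLine)

# The with block closes the file, which flushes everything to disk.
with open("cleaned-sentiment-data.csv", "w") as outputFile:
  outputFile.writelines(outputLines)

Run the script to generate the cleaned file:

$ python prepare-training-data.py

The cleaned training file has only two fields: the text and the sentiment.
The file looks like this: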

cleaned-sentiment-data.csv:

"@realdollowner Today is a better day.  Overslept , "4"
"@argreen Boo~ , "0"
"Just for the people I don't know  x" , "4"
...
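
In the sample above, the first two tweets have no closing quote. This is because splitting a line on "," also cuts the tweet text at its first comma. If the full text is wanted, Python's standard csv module can parse the quoted fields instead. The sketch below shows the difference on one raw line. The helper name parse_tweet_row is an assumption for illustration.

```python
import csv


def parse_tweet_row(line):
    """Return (sentiment, text) from one line of the raw CSV file."""
    row = next(csv.reader([line]))
    return row[0], row[5]


sample = (
    '"0","1467810369","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY",'
    '"_TheSpecialOne_","@switchfoot http://twitpic.com/2y1zl - Awww, '
    "that's a bummer.  You shoulda got David Carr of Third Day to do it. ;D\""
)

sentiment, text = parse_tweet_row(sample)
print(sentiment)
print(text)

# For comparison: the plain split keeps only the text before the first comma.
print(sample.split(",")[5])
```

With full text that may contain commas, the training script could no longer split its input on ",", so its parsing would need to change as well. The rest of this post keeps the simpler comma split, so the cleaned file shown above is unchanged.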

We can now use this file to train our model.
The following script runs the training process.

sentiment-train.py:

import pickle

from keras.layers import Embedding, LSTM, Dense
from keras.models import Sequential
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer 
from numpy import array

trainFile = open("cleaned-sentiment-data.csv", "r")

labels = []
wordVectors = []

allLines = trainFile.readlines()

# Take a subset of the data.
lines = allLines[:600000]

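for line in lines:
  parts = line.split(",")
  string = parts[0].strip()
  sentiment = parts[1].strip()

  # Keep the text only when its label is recognized, so that
  # wordVectors and labels always have the same length.
  if sentiment == "\"4\"": # Positive.
    wordVectors.append(string)
    labels.append(array([1, 0]))
  elif sentiment == "\"0\"": # Negative.
    wordVectors.append(string)
    labels.append(array([0, 1]))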

labels = array(labels)

tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(wordVectors)

# Save tokenizer to file; will be needed for categorization script.
with open("tokenizer.pickle", "wb") as handle:
  pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)

sequences = tokenizer.texts_to_sequences(wordVectors)

paddedSequences = pad_sequences(sequences, maxlen=60)

model = Sequential()
# Embedding layer: number of possible words, size of the embedding vectors.
model.add(Embedding(10000, 60))
model.add(LSTM(15, dropout=0.2))
model.add(Dense(2, activation='softmax'))

model.compile(optimizer='adam',
  loss='categorical_crossentropy',
  metrics=['accuracy']
)

model.summary()

model.fit(paddedSequences, labels, epochs=5, batch_size=128)

model.save("sentiment-model.h5")

The training process output will be something similar to:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, None, 60)          600000
_________________________________________________________________
lstm (LSTM)                  (None, 15)                4560
_________________________________________________________________
dense (Dense)                (None, 2)                 32
=================================================================
Total params: 604,592
Trainable params: 604,592
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
4688/4688 [==============================] - 134s 29ms/step - loss: 0.4844 - accuracy: 0.7660
Epoch 2/5
4688/4688 [==============================] - 111s 24ms/step - loss: 0.4488 - accuracy: 0.7879
Epoch 3/5
4688/4688 [==============================] - 110s 24ms/step - loss: 0.4342 - accuracy: 0.7961
Epoch 4/5
4688/4688 [==============================] - 111s 24ms/step - loss: 0.4226 - accuracy: 0.8026
Epoch 5/5
4688/4688 [==============================] - 127s 27ms/step - loss: 0.4128 - accuracy: 0.8079
The model created is saved to the file: sentiment-model.h5
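
The training script pads every tokenized tweet to a fixed length with pad_sequences(sequences, maxlen=60). As a rough pure-Python illustration of what that call does with Keras's default settings (zeros added at the front, and overly long sequences truncated from the front), consider the sketch below. The function pad_pre is a stand-in written for this post, not part of Keras, and it assumes maxlen is at least 1.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` entries."""
    trimmed = sequence[-maxlen:]
    return [value] * (maxlen - len(trimmed)) + trimmed


# A short sequence gains leading zeros.
print(pad_pre([5, 9, 2], 5))

# A long sequence keeps only its last entries.
print(pad_pre([1, 2, 3, 4, 5, 6], 4))
```

Every input then has the same length, which lets the sequences be stacked into a single array for training.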
To use the model, we can use the following script:

sentiment-classify.py:

import pickle
from keras import models
from keras.preprocessing.sequence import pad_sequences

with open("tokenizer.pickle", "rb") as handle:
    tokenizer = pickle.load(handle)

userInput = input("Enter a phrase: ")

inputSequence = tokenizer.texts_to_sequences([userInput])

# Pad to the same length that was used during training (maxlen=60).
paddedSequence = pad_sequences(inputSequence, maxlen=60)

model = models.load_model("sentiment-model.h5")

predictions = model.predict(paddedSequence)

print(predictions[0])

if predictions[0][0] > predictions[0][1]:
  print("Positive")
else:
  print("Negative")

Example usage:

$ python sentiment-classify.py 
Enter a phrase: what a great day!

[0.8984171  0.10158285]
Positive

$ python sentiment-classify.py
Enter a phrase: yesterday was terrible

[0.13580368 0.86419624]
Negative

Test XPath in the Terminal

There is a handy command-line tool to run XPath expressions and test them in the terminal on macOS.
The command is: xpath.

Assuming we have an XML file test.xml:

<books>
  <book id="123">
    <title>Book A</title>
  </book>
  <book id="456">
    <title>Book B</title>
  </book>
</books>

To test an XPath statement on this file, we can use:

$ xpath test.xml "//book[@id=456]"

Result:

Found 1 nodes:
-- NODE --
<book id="456">
  <title>Book B</title>
</book>

Note that the file argument is first and the XPath expression second, in double-quotes.
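
On systems without the xpath command, a similar check can be run with Python's standard library. The sketch below uses xml.etree.ElementTree, which supports only a subset of XPath. In that subset the attribute value must be quoted, so the expression is written as .//book[@id='456'].

```python
import xml.etree.ElementTree as ET

XML = """<books>
  <book id="123">
    <title>Book A</title>
  </book>
  <book id="456">
    <title>Book B</title>
  </book>
</books>"""

root = ET.fromstring(XML)

# Find every <book> element, at any depth, whose id attribute is "456".
matches = root.findall(".//book[@id='456']")

print(f"Found {len(matches)} node(s)")
for book in matches:
    print(ET.tostring(book, encoding="unicode").strip())
```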

 

Merge Standard Error with Standard Output when Using a Pipe

When piping the output of one terminal command to another on a Unix-based system, by default only the Standard Output (stdout) of the first command is piped to the Standard Input (stdin) of the second command.
We can merge streams to make sure stderr is piped to the second command.
The general syntax is:

$ cmd1 2>&1 | cmd2

As a specific example, the “No such file” error below is sent to Standard Error, so it is printed directly to the terminal.
The word count command receives empty input.

$ cat invalid-file | wc
cat: invalid-file: No such file or directory
    0     0     0

Now, if we merge the streams, Standard Error is piped to wc, which counts and prints the lines, words, and bytes in the error message:

$ cat invalid-file 2>&1 | wc
    1     7     45

This shows that the error output was sent through the pipe to the second command.
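
The same merge is available when launching a command from Python: passing stderr=subprocess.STDOUT to subprocess.run plays the role of 2>&1. The sketch below uses a small child Python process as a stand-in for cmd1, and the helper name run_child is an assumption. In the unmerged case stderr is discarded here to keep the output quiet, whereas in a shell it would go to the terminal.

```python
import subprocess
import sys

# Child process that writes one line to stdout and one line to stderr.
CHILD = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')",
]


def run_child(merge_stderr):
    """Run the child and return whatever arrived on the stdout pipe."""
    result = subprocess.run(
        CHILD,
        stdout=subprocess.PIPE,
        # STDOUT redirects stderr into the stdout pipe, like 2>&1.
        stderr=subprocess.STDOUT if merge_stderr else subprocess.DEVNULL,
        text=True,
    )
    return result.stdout


print(repr(run_child(merge_stderr=False)))
print(repr(run_child(merge_stderr=True)))
```

The order of the two lines in the merged output depends on how the child buffers each stream, so it should not be relied on.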

 

Run a Large Language Model Locally in the Terminal

We can run a Large Language Model (LLM) – although not quite as good as ChatGPT – on a local machine.
One of the easiest to run is Alpaca, a fine-tuning of LLaMA.
The following works on an Apple M1 Mac.

Clone and build the repo:

$ git clone https://github.com/antimatter15/alpaca.cpp

$ cd alpaca.cpp/

$ make chat

Download the pre-trained model weights:

$ wget -O ggml-alpaca-7b-q4.bin -c https://gateway.estuary.tech/gw/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC

(See the source repo below for alternatives if this fails.)

Run the model:

$ ./chat

Output:

main: seed = 1679968451
llama_model_load: loading model from 'ggml-alpaca-7b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size = 512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from 'ggml-alpaca-7b-q4.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291

== Running in chat mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- If you want to submit another line, end your input in '\'.

> What is the age of the universe?
The current estimate for when our Universe was created, 
according to modern cosmology and astronomy, 
is 13.798 billion years ago (±0.2%).
>

References

https://github.com/antimatter15/alpaca.cpp

When npm prestart is Executed

The package.json file allows us to specify a “prestart” script.
When does the npm prestart script run?
The entry under “prestart” runs first when “start” is called, before the script specified by “start”.

The example below demonstrates this.

package.json:

{
  "name": "example-repo",
  "scripts": {
    "prestart": "echo 'In prestart...'",
    "start": "node index.js"
  }
}

Running start:

$ npm run start

Output:

> prestart
> echo 'In prestart...'

In prestart...

> start
> node index.js

This can be useful for running scripts that are a prerequisite for a build, for example.
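
The ordering follows a naming convention: for a script called name, npm looks for prename before it and postname after it. The Python sketch below is a simplified model of that lookup and is not npm's own code. The function name lifecycle_order is made up for illustration.

```python
def lifecycle_order(scripts, name):
    """Return the script names that would run for `npm run <name>`, in order."""
    candidates = ["pre" + name, name, "post" + name]
    return [script for script in candidates if script in scripts]


scripts = {
    "prestart": "echo 'In prestart...'",
    "start": "node index.js",
}

for script in lifecycle_order(scripts, "start"):
    print(f"> {script}")
    print(f"> {scripts[script]}")
```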

 

REST API Design: Endpoint to Delete a Large Number of Items

In a proper REST API design, we use the DELETE HTTP verb to delete items by ID:

DELETE /resources/{id}

Deleting a list of IDs could look like:

DELETE /resources/{id1},{id2},{id3}

Or as a list of query string parameters:

DELETE /resources?ids=id1,id2,id3

What if the list of items to delete is really large? Meaning, what if clients need to delete items by specifying thousands of IDs?
We then run into URI length limits, so the above options will not be enough.

One way of dealing with this would be to avoid the issue using deletion by some criteria, e.g. tags.

DELETE /resources?tag=items-to-delete

But what if we really have no choice but to delete a very large ad-hoc unpredictable list specified by the client?

The following design can work well. Since a request body on DELETE has no defined semantics and may be ignored or rejected by servers and intermediaries, we can use POST.

Because the verb POST no longer properly reflects the action being performed by the API, we can add a sub-resource named /bulk-deletion under resources:

POST /resources/bulk-deletion

{
  "idsToDelete": [ id1, id2, id3, id4, ..., idN ]
}

As a variation, the sub-resource could be /deletion-list or even /ids-to-delete:

POST /resources/ids-to-delete

{
  "ids": [ id1, id2, id3, id4, ..., idN ]
}

This way our limit on the number of IDs is the POST body size limit, so the design can handle a very long list.
The path makes the operation unambiguous and also keeps the route name as a noun so we avoid action names in the URI.
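
As a rough sketch of the server side, the handler for POST /resources/bulk-deletion could validate the body and report which IDs were removed. Everything below is an assumption made for illustration: the function name handle_bulk_deletion, the in-memory dict used as the store, string IDs, and the "deleted" / "notFound" response fields. Only the "idsToDelete" field comes from the example above.

```python
import json


def handle_bulk_deletion(store, body):
    """Delete the IDs listed in a JSON body from `store` (a dict).

    Returns (status code, response body).
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "Body must be valid JSON"}

    ids = payload.get("idsToDelete") if isinstance(payload, dict) else None
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        return 400, {"error": "idsToDelete must be a list of string IDs"}

    deleted, not_found = [], []
    for item_id in ids:
        if item_id in store:
            del store[item_id]
            deleted.append(item_id)
        else:
            not_found.append(item_id)
    return 200, {"deleted": deleted, "notFound": not_found}


resources = {"id1": "first", "id2": "second", "id3": "third"}
print(handle_bulk_deletion(resources, '{"idsToDelete": ["id1", "id3", "id9"]}'))
print(resources)
```

Reporting the IDs that were not found lets a client retry safely, since repeating the same request does not fail on items that are already gone.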