Classify an Object in an Image in Python Using the YOLO Model

To classify the object in an image file using Python, we can use a pre-trained, open-source YOLO model from Ultralytics.

First, install the library using:

$ pip install ultralytics

For example, assume we have an image of a tractor saved locally as images/tractor.jpeg.

Note that we can also run the model from the command line using:

$ yolo predict source='images/tractor.jpeg'

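In Python, we have to pull the label out of the model output ourselves, which takes a bit more code.

The model’s predict function returns a list of result objects, one per input image. For a classification model, each result holds the class probabilities (probs) and a dictionary that maps every label number to its name (names).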

The code below will extract the highest probability label and print it.

from ultralytics import YOLO

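# Pass the name or path of a classification model here.
# result.probs is only filled in for classification models.
model = YOLO("")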

# Path to an image file assumed to exist.
results = model.predict("images/tractor.jpeg")

# predict returns a list with one result per image.
result = results[0]

probabilities = result.probs

# top1 is the label number of the most likely class.
topLabelNumber = probabilities.top1

# Now find the label name for that label number.
# names is a dictionary keyed by label number, so we can look it up directly.
allNames = result.names
resultLabel = allNames[topLabelNumber]

print("Classification result:", resultLabel)
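
The same steps can be wrapped in a small helper. The sketch below is one possible arrangement, not part of the Ultralytics API: the function name classify_image and the StandInModel class are hypothetical, and it assumes the result object exposes probs.top1conf alongside the probs.top1 and names attributes used above. It also checks for a missing probs value, which is what a model that is not a classifier would return.

```python
from types import SimpleNamespace


def classify_image(model, image_path):
    """Return (label, confidence) for the most likely class of one image."""
    result = model.predict(image_path)[0]  # one result per input image
    if result.probs is None:
        # Models that are not classifiers do not fill in probs.
        raise ValueError("model did not return class probabilities")
    top_index = result.probs.top1
    return result.names[top_index], float(result.probs.top1conf)


class StandInModel:
    """Hypothetical stand-in so the sketch runs without downloading weights."""

    def predict(self, image_path):
        probs = SimpleNamespace(top1=1, top1conf=0.82)
        names = {0: "harvester", 1: "tractor"}
        return [SimpleNamespace(probs=probs, names=names)]


label, confidence = classify_image(StandInModel(), "images/tractor.jpeg")
print("Classification result:", label, confidence)
```

With the real library, a model object created by YOLO(...) from a classification checkpoint (for instance yolov8n-cls.pt, which is an assumption here) would be passed in place of StandInModel().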


Synthesize Speech in a Different Language Using Python

To synthesize speech in Python in a language other than English using pyttsx3, we need to find which voice is available for the desired language.

First, we can print out the list of all available voices.
Each of the voice objects will include a list of languages that the voice supports (usually one).

In this example we will synthesize a string in Polish. For any other language, simply find the voice that supports it in the full list of voices.


import pyttsx3

synthesizer = pyttsx3.init()

voices = synthesizer.getProperty("voices")

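polishVoiceId = None
for voice in voices:
  if "zosia" in voice.id.lower(): # The Polish voice.
    print(voice.id) # Full ID string.
    print("Languages for voice:", voice.languages)
    polishVoiceId = voice.id

# The language is chosen by selecting a voice through the "voice" property.
if polishVoiceId is not None:
  synthesizer.setProperty("voice", polishVoiceId)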


synthesizer.say("Cześć, jak się masz?")

# Process the queued speech and block until it has been spoken.
synthesizer.runAndWait()
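
Instead of matching a voice by name, the voices can be filtered by the languages they report. The following is a minimal sketch under stated assumptions: the helper name find_voice_ids and the voice IDs in the sample list are hypothetical, and it assumes each voice has id and languages attributes, with language entries that may be strings such as pl_PL or byte strings with a leading control character, depending on the platform's speech driver.

```python
import re
from types import SimpleNamespace


def find_voice_ids(voices, language_prefix):
    """Return the IDs of voices that report a language starting with the prefix."""
    prefix = language_prefix.lower()
    matches = []
    for voice in voices:
        for language in voice.languages or []:
            if isinstance(language, bytes):
                language = language.decode("utf-8", errors="ignore")
            # Normalize case and separators, then drop leading non-letters.
            normalized = language.lower().replace("-", "_")
            normalized = re.sub(r"^[^a-z]+", "", normalized)
            if normalized.startswith(prefix):
                matches.append(voice.id)
                break  # one match per voice is enough
    return matches


# Hypothetical sample data standing in for synthesizer.getProperty("voices").
sample_voices = [
    SimpleNamespace(id="voice.samantha", languages=["en_US"]),
    SimpleNamespace(id="voice.zosia", languages=["pl_PL"]),
    SimpleNamespace(id="voice.espeak-pl", languages=[b"\x05pl"]),
]

polish_ids = find_voice_ids(sample_voices, "pl")
print(polish_ids)
```

With a real synthesizer, the list from getProperty("voices") would be passed in, and one of the returned IDs would be given to setProperty("voice", ...).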