Simple OpenCV Digit Recognition in Python

By Jan on 02/17/2025

Learn how to build a simple Optical Character Recognition (OCR) system using OpenCV-Python to recognize handwritten digits with ease.

Introduction

This code snippet demonstrates a simple optical character recognition (OCR) process for recognizing handwritten digits in an image. It utilizes OpenCV for image processing and a pre-trained machine learning model for digit classification.

Step-by-Step Guide

  1. Import necessary libraries:
import cv2
import numpy as np
  2. Load and preprocess the image:
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'image.jpg'")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Invert so the digits become white on black; Otsu picks the threshold value
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
  3. Find contours of the digits:
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# OpenCV 4.x returns (contours, hierarchy); 3.x returns (image, contours, hierarchy)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
  4. Loop through each contour and recognize the digit:
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    ROI = thresh[y:y+h, x:x+w]
    # Resize the digit image to a standard size
    ROI = cv2.resize(ROI, (28, 28), interpolation=cv2.INTER_AREA)
    # Flatten the image into a 1D array and scale pixels to [0, 1]
    # (assumes the model was trained on pixels scaled the same way)
    ROI = ROI.reshape((1, 784)) / 255.0
    # `model` is a classifier trained beforehand (e.g., KNN, SVM); see the Code Example section
    prediction = model.predict(ROI)
    print("Predicted digit:", prediction[0])
  5. Display the results:
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Explanation:

  • The code first preprocesses the image by converting it to grayscale, applying thresholding to segment the digits from the background, and finding contours of the digits.
  • For each contour, it extracts the region of interest (ROI) containing the digit, resizes it to a standard size, and flattens it into a 1D array.
  • A pre-trained machine learning model (e.g., KNN, SVM) is then used to predict the digit based on the extracted features.
  • Finally, the predicted digit is printed to the console and the original image is displayed. Drawing the predictions onto the image is covered in the Code Example section.

Code Example

This Python code performs handwritten digit recognition using a KNN classifier. It loads the MNIST dataset for training, preprocesses an input image to find digit contours, and then uses the trained KNN model to predict the digit in each contour. The predictions are then displayed on the original image.

import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Load the MNIST training data (the dataset itself is not pre-trained; the model is trained below)
with np.load('mnist.npz') as data:
    train_images = data['x_train']
    train_labels = data['y_train']

# Flatten the images and normalize pixel values
train_images = train_images.reshape((-1, 784)) / 255.0

# Create a KNN classifier and train it on the MNIST dataset
model = KNeighborsClassifier(n_neighbors=5)
model.fit(train_images, train_labels)

# Load and preprocess the image
image = cv2.imread('digits_image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'digits_image.jpg'")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours of the digits
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Loop through each contour and recognize the digit
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    ROI = thresh[y:y + h, x:x + w]
    # Resize the digit image to a standard size
    ROI = cv2.resize(ROI, (28, 28), interpolation=cv2.INTER_AREA)
    # Flatten the image into a 1D array and scale to [0, 1] to match the training data
    ROI = ROI.reshape((1, 784)) / 255.0
    # Use the trained KNN model to predict the digit
    prediction = model.predict(ROI)
    print("Predicted digit:", prediction[0])

    # Draw a rectangle around the digit and display the prediction
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, str(prediction[0]), (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

# Display the results
cv2.imshow('Image', image)
cv2.waitKey(0)

Before running the code:

  1. Install necessary libraries:

    pip install opencv-python numpy scikit-learn
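  2. Download the MNIST dataset as mnist.npz and place it in the working directory. The code expects the file to contain the arrays x_train and y_train.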

  3. Replace 'digits_image.jpg' with the actual path to your image containing digits.

Explanation:

  • This code uses the KNN algorithm for digit recognition.
  • It loads the MNIST training images and labels and fits the KNN model on them. Nothing is pre-trained here; training happens each time the script runs.
  • The code then preprocesses the input image, finds contours of digits, resizes them to 28x28 pixels, and feeds them to the trained KNN model for prediction.
  • Finally, it draws bounding boxes around the recognized digits and displays the predicted digit on the image.

Additional Notes

General:

  • This code provides a basic framework for digit recognition in images.
  • It's important to note that this is a simplified example and may not perform well on complex images or handwritten digits with variations in style.

Preprocessing:

  • Thresholding: The code uses Otsu's thresholding which automatically determines the optimal threshold value for separating the digits from the background. This works well for images with a clear separation between foreground and background. For more complex images, adaptive thresholding techniques might be necessary.
  • Contour Finding: The cv2.RETR_EXTERNAL flag ensures that only the outermost contours of the digits are detected, avoiding nested contours.

Digit Recognition:

  • MNIST Dataset: The code uses the MNIST dataset for training the KNN model. MNIST is a large dataset of handwritten digits commonly used for training and benchmarking image recognition models.
  • KNN Classifier: KNN is a simple and effective classification algorithm, but other algorithms like Support Vector Machines (SVM) or Convolutional Neural Networks (CNNs) might provide better accuracy, especially for more complex scenarios.
  • Feature Extraction: The code uses raw pixel values as features for the KNN classifier. More sophisticated feature extraction techniques like Histogram of Oriented Gradients (HOG) or Local Binary Patterns (LBP) could improve recognition accuracy.
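
Swapping the classifier is straightforward in scikit-learn because the estimators share the same fit and predict interface. The sketch below compares KNN with an SVM on the small 8x8 digits dataset bundled with scikit-learn, which is used here only as a quick stand-in for MNIST. The split ratio, random seed, and SVM settings are assumptions, and the scores will not carry over directly to 28x28 MNIST images.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Small 8x8 digits dataset shipped with scikit-learn (a stand-in for MNIST)
digits = load_digits()
X = digits.data / 16.0  # pixel values in this dataset range from 0 to 16
y = digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Both estimators expose the same fit/score interface
candidates = {
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale"),
}
scores = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print(f"{name}: test accuracy {scores[name]:.3f}")
```

Because both models are used the same way, replacing KNeighborsClassifier with SVC in the main script only changes the line that constructs the model.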

Improvements:

  • Error Handling: Implement checks for cases where no contours are found or the prediction fails.
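  • Bounding Box Optimization: cv2.boundingRect already returns the tightest upright box around a contour, but stretching that box to 28x28 distorts narrow digits such as 1. Padding the ROI to a square with a small margin before resizing keeps the aspect ratio and better matches how MNIST digits are framed.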
  • Performance Optimization: For real-time applications, optimize the code for speed by using techniques like resizing the image before processing or implementing contour analysis on a smaller ROI.
  • Generalization: To improve the model's ability to recognize digits with different writing styles and variations, train the model on a more diverse dataset or use data augmentation techniques.

Additional Considerations:

  • Image Quality: The quality of the input image significantly impacts the accuracy of digit recognition. Ensure good lighting, contrast, and minimal noise in the input image.
  • Digit Segmentation: If the digits are touching or overlapping, additional preprocessing steps like erosion, dilation, or watershed segmentation might be required to separate them before recognition.
  • Real-World Applications: This code can be extended for various real-world applications like recognizing digits in license plates, reading handwritten forms, or automating data entry tasks.

Summary

This code snippet demonstrates a simple pipeline for recognizing handwritten digits in an image using OpenCV and a pre-trained machine learning model.

Here's a breakdown of the process:

  1. Image Preprocessing:

    • The input image is converted to grayscale.
    • Thresholding is applied to separate the digits from the background.
    • Contours of individual digits are detected.
  2. Digit Extraction and Recognition:

    • Each detected contour is processed individually.
    • A bounding box is computed for the contour and used to crop the Region of Interest (ROI) containing the digit.
    • The ROI is resized to a standard size (28x28 pixels) for compatibility with the pre-trained model.
    • The resized ROI is flattened into a 1D array, representing the digit's features.
    • A pre-trained machine learning model (e.g., KNN, SVM) predicts the digit based on these features.
  3. Result Display:

    • The predicted digit is printed to the console.
    • The original image is displayed, potentially with recognized digits highlighted.

Key Libraries Used:

  • OpenCV (cv2): Used for image loading, preprocessing, contour detection, and display.
  • NumPy (np): Used for array manipulation and reshaping.
  • A Machine Learning Library: Used for training or loading the digit recognition model. The step-by-step snippet leaves it out; the Code Example section uses scikit-learn's KNeighborsClassifier.

This code provides a basic framework for handwritten digit recognition. You can adapt and expand it by swapping in different models, improving the preprocessing, and drawing the predictions onto the image as the Code Example section does.

Conclusion

This code provides a practical example of using OpenCV and machine learning for handwritten digit recognition. By combining image processing with a classifier trained on MNIST, it can identify and classify digits in clean, well-separated images, although accuracy will drop on noisy input or unusual writing styles. This foundation can be developed further for applications like automating data entry from handwritten forms, recognizing digits in license plates, and other OCR tasks. Because preprocessing, segmentation, and classification are separate steps, each one can be replaced independently to suit a different digit recognition scenario.
