Simple OpenCV Digit Recognition in Python

Introduction
Step-by-Step Guide
Code Example
Additional Notes
Summary
Conclusion

Introduction

This code snippet demonstrates a simple optical character recognition (OCR) process for recognizing handwritten digits in an image. It utilizes OpenCV for image processing and a pre-trained machine learning model for digit classification.

Step-by-Step Guide

Import necessary libraries:

import cv2
import numpy as np

Load and preprocess the image:

image = cv2.imread('image.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

Find contours of the digits:

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

Loop through each contour and recognize the digit:

for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    ROI = thresh[y:y+h, x:x+w]
    # Resize the digit image to a standard size
    ROI = cv2.resize(ROI, (28,28), interpolation=cv2.INTER_AREA)
    # Flatten the image into a 1D array
    ROI = ROI.reshape((1,784))
    # `model` must be a classifier trained beforehand (e.g., KNN, SVM) on
    # 784-element inputs; scale ROI the same way the training data was scaled
    prediction = model.predict(ROI)
    print("Predicted digit:", prediction[0])

Display the results:

cv2.imshow('Image', image)
cv2.waitKey(0)

Explanation:

The code first preprocesses the image by converting it to grayscale, applying thresholding to segment the digits from the background, and finding contours of the digits.
For each contour, it extracts the region of interest (ROI) containing the digit, resizes it to a standard size, and flattens it into a 1D array.
A pre-trained machine learning model (e.g., KNN, SVM) is then used to predict the digit based on the extracted features.
Finally, the predicted digit is printed and the original image is displayed. This snippet does not draw on the image; highlighting the recognized digits is added in the Code Example section.

Code Example

This Python code performs handwritten digit recognition using a KNN classifier. It loads the MNIST dataset for training, preprocesses an input image to find digit contours, and then uses the trained KNN model to predict the digit in each contour. The predictions are then displayed on the original image.

import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Load the MNIST training data (a dataset is not "pre-trained"; the model is trained below)
with np.load('mnist.npz') as data:
    train_images = data['x_train']
    train_labels = data['y_train']

# Flatten each 28x28 image to 784 values and scale pixels to [0, 1]
train_images = train_images.reshape((-1, 784)) / 255.0

# Create a KNN classifier and train it on the MNIST dataset
model = KNeighborsClassifier(n_neighbors=5)
model.fit(train_images, train_labels)

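# Load and preprocess the image
image = cv2.imread('digits_image.jpg')
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read 'digits_image.jpg'")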
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours of the digits
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Loop through each contour and recognize the digit
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    ROI = thresh[y:y + h, x:x + w]
    # Resize the digit image to a standard size
    ROI = cv2.resize(ROI, (28, 28), interpolation=cv2.INTER_AREA)
    # Flatten into a 1D array and scale to [0, 1] to match the training data
    ROI = ROI.reshape((1, 784)) / 255.0
    # Use the trained KNN model to predict the digit
    prediction = model.predict(ROI)
    print("Predicted digit:", prediction[0])

    # Draw a rectangle around the digit and label it, keeping the label inside the image
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, str(prediction[0]), (x, max(y - 10, 20)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)

# Display the results
cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Before running the code:

Install necessary libraries:

pip install opencv-python numpy scikit-learn

Download the MNIST dataset:

Download mnist.npz from https://s3.amazonaws.com/img-datasets/mnist.npz and save it in the same directory as your Python script.

Set the input image:

Replace 'digits_image.jpg' with the actual path to your image containing digits.

Explanation:

This code uses the KNN algorithm for digit recognition.
It loads the MNIST training images and labels and fits the KNN model on them.
The code then preprocesses the input image, finds contours of digits, resizes them to 28x28 pixels, and feeds them to the trained KNN model for prediction.
Finally, it draws bounding boxes around the recognized digits and displays the predicted digit on the image.
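
Before trusting predictions on your own image, it helps to measure the classifier on data it has not seen. The sketch below does this with scikit-learn's bundled 8x8 digits dataset (load_digits) as a small stand-in, so it runs without downloading anything. The stand-in dataset and the 75/25 split are assumptions for illustration; with mnist.npz, the same check would presumably use its held-out test arrays instead.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset: 1797 images of 8x8 digits with pixel values in 0..16
digits = load_digits()
X = digits.data / 16.0  # scale to [0, 1], mirroring the /255.0 step for MNIST
y = digits.target

# Hold out a quarter of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Fraction of held-out digits classified correctly
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.3f}")
```

A high score here only says the model handles digits that look like its training data; photographed or scanned digits can still differ enough to lower accuracy.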

Additional Notes

General:

This code provides a basic framework for digit recognition in images.
It's important to note that this is a simplified example and may not perform well on complex images or handwritten digits with variations in style.

Preprocessing:

Thresholding: The code uses Otsu's thresholding which automatically determines the optimal threshold value for separating the digits from the background. This works well for images with a clear separation between foreground and background. For more complex images, adaptive thresholding techniques might be necessary.
Contour Finding: The cv2.RETR_EXTERNAL flag ensures that only the outermost contour of each digit is detected, so the inner holes of digits such as 0, 6, 8, and 9 do not produce separate contours.

Digit Recognition:

MNIST Dataset: The code uses the MNIST dataset for training the KNN model. MNIST is a large dataset of handwritten digits commonly used for training and benchmarking image recognition models.
KNN Classifier: KNN is a simple and effective classification algorithm, but other algorithms like Support Vector Machines (SVM) or Convolutional Neural Networks (CNNs) might provide better accuracy, especially for more complex scenarios.
Feature Extraction: The code uses raw pixel values as features for the KNN classifier. More sophisticated feature extraction techniques like Histogram of Oriented Gradients (HOG) or Local Binary Patterns (LBP) could improve recognition accuracy.

Improvements:

Error Handling: Implement checks for cases where no contours are found or the prediction fails.
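Bounding Box Optimization: cv2.boundingRect already returns the tightest axis-aligned box around a contour, but resizing that box directly to 28x28 stretches narrow digits such as 1. Consider scaling each digit with its aspect ratio preserved and padding it to a square, which is closer to how MNIST digits are laid out.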
Performance Optimization: For real-time applications, optimize the code for speed by using techniques like resizing the image before processing or implementing contour analysis on a smaller ROI.
Generalization: To improve the model's ability to recognize digits with different writing styles and variations, train the model on a more diverse dataset or use data augmentation techniques.

Additional Considerations:

Image Quality: The quality of the input image significantly impacts the accuracy of digit recognition. Ensure good lighting, contrast, and minimal noise in the input image.
Digit Segmentation: If the digits are touching or overlapping, additional preprocessing steps like erosion, dilation, or watershed segmentation might be required to separate them before recognition.
Real-World Applications: This code can be extended for various real-world applications like recognizing digits in license plates, reading handwritten forms, or automating data entry tasks.

Summary

This code snippet demonstrates a simple pipeline for recognizing handwritten digits in an image using OpenCV and a pre-trained machine learning model.

Here's a breakdown of the process:

Image Preprocessing:
- The input image is converted to grayscale.
- Thresholding is applied to separate the digits from the background.
- Contours of individual digits are detected.
Digit Extraction and Recognition:
- Each detected contour is processed individually.
- A bounding box is computed around the contour and used to crop the Region of Interest (ROI) containing the digit.
- The ROI is resized to a standard size (28x28 pixels) for compatibility with the pre-trained model.
- The resized ROI is flattened into a 1D array, representing the digit's features.
- A pre-trained machine learning model (e.g., KNN, SVM) predicts the digit based on these features.
Result Display:
- The predicted digit is printed to the console.
- The original image is displayed, potentially with recognized digits highlighted.

Key Libraries Used:

OpenCV (cv2): Used for image loading, preprocessing, contour detection, and display.
NumPy (np): Used for array manipulation and reshaping.
A Machine Learning Library: Used to train or load the digit recognition model. The Code Example uses scikit-learn's KNeighborsClassifier; the Step-by-Step snippet assumes such a model already exists.

This code provides a basic framework for handwritten digit recognition. You can adapt and expand it by swapping in different models and improving the preprocessing; the Code Example already adds bounding box visualization on top of the basic steps.

Conclusion

This code provides a practical example of using OpenCV and machine learning for handwritten digit recognition. By combining image processing with a classifier trained on MNIST, it identifies and classifies digits in clean images where the digits are well separated; accuracy on harder inputs is not guaranteed. This foundation can be further developed for applications like automating data entry from handwritten forms, recognizing digits in license plates, and other OCR tasks. Because preprocessing, segmentation, and classification are separate steps, different machine learning models and preprocessing steps can be swapped in for other digit recognition scenarios.
