Creating a Hand Gesture Recognition System with Convolutional Neural Networks (CNN) and OpenCV

Hand gesture recognition is a fascinating application at the intersection of computer vision and machine learning. In this blog post, we’ll explore how to build a hand gesture recognition system using a Convolutional Neural Network (CNN) and OpenCV for real-time video processing.

Building the Neural Network

Let’s start by assembling the neural network with the Keras API and compiling it for training. The network is a simple CNN: three convolution/max-pooling blocks followed by dense layers. It takes 54×54 grayscale images as input and produces a softmax distribution over four gesture classes. The model is compiled with categorical cross-entropy loss and the Adam optimizer.

# Import necessary libraries
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define the model
model = Sequential()
# First convolutional block: 32 filters on the 54x54 grayscale input
model.add(Conv2D(32, (3, 3), input_shape=(54, 54, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Second convolutional block
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Third convolutional block with more filters
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Flatten the feature maps and add a dense layer with dropout
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
# Output layer: one unit per gesture class
model.add(Dense(4))
model.add(Activation('softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
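
To sanity-check the architecture before training, we can print a layer-by-layer summary:

# Inspect layer output shapes and parameter counts
model.summary()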

Preparing Data for Training

To train the model, we need to prepare our data. We use Keras’s ImageDataGenerator for on-the-fly data augmentation and read labeled batches directly from directories on disk.

# Set batch size
batch_size = 16

# Training generator: augmentation makes the model robust to variations in hand pose
training_datagen = ImageDataGenerator(
    rotation_range=50,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Validation generator: only light augmentation
validation_datagen = ImageDataGenerator(zoom_range=0.2, rotation_range=10)

# Flow data from directories
training_generator = training_datagen.flow_from_directory(
    'training_data',
    target_size=(54, 54),
    batch_size=batch_size,
    color_mode='grayscale'
)

validation_generator = validation_datagen.flow_from_directory(
    'validation_data',
    target_size=(54, 54),
    batch_size=batch_size,
    color_mode='grayscale'
)
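
Note that flow_from_directory infers the class labels from subdirectory names and one-hot encodes them, which matches our categorical cross-entropy loss. Both directories therefore need one subfolder per gesture class; the gesture names below are placeholders for illustration:

training_data/
├── fist/
├── palm/
├── thumbs_up/
└── ok/

validation_data/ mirrors the same four-class layout.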

Training the Network

Now we can train the model on the augmented data, keeping the returned History object so we can plot the learning curves afterwards.

# Train the model (fit_generator is deprecated; in TensorFlow 2,
# model.fit accepts generators directly)
history = model.fit(
    training_generator,
    steps_per_epoch=2000 // batch_size,
    epochs=50,
    verbose=1,
    validation_data=validation_generator,
    validation_steps=200 // batch_size,
)
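
Because the next section will load this model into an OpenCV pipeline, it’s worth persisting the trained weights to disk. The filename here is just an assumption:

# Save the trained model for later reuse (filename is illustrative)
model.save('gesture_model.h5')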

Plotting Model History

We can visualize the training and validation performance using the History object returned by model.fit, which records every compiled metric per epoch.

# Plot the training history
import matplotlib.pyplot as plt

# The History object stores per-epoch metrics in a dict
data = history.history  # keys: 'loss', 'accuracy', 'val_loss', 'val_accuracy'

# Plot the training loss
plt.figure()
plt.title("Training Loss")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.plot(data['loss'])

# Repeat for the other metrics (accuracy, validation loss, validation accuracy)

plt.show()

This concludes the training phase of our hand gesture recognition model. In the next section, we’ll integrate this model with OpenCV for real-time video processing and hand tracking.
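
As a preview, here is a minimal sketch of what that real-time loop could look like. It assumes the model was saved as gesture_model.h5 above, and the gesture class names are placeholders that must match your training subdirectories:

# Preview: minimal real-time inference loop with OpenCV
# (gesture_model.h5 and the class names are illustrative placeholders)
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model('gesture_model.h5')
classes = ['fist', 'palm', 'thumbs_up', 'ok']  # must match training subdirectories

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Preprocess exactly as during training: grayscale, 54x54, add batch/channel dims
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (54, 54))
    batch = resized.reshape(1, 54, 54, 1).astype('float32')
    # Predict the gesture and overlay the label on the frame
    prediction = model.predict(batch, verbose=0)
    label = classes[int(np.argmax(prediction))]
    cv2.putText(frame, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
    cv2.imshow('Gesture Recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()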

