
Virtual Keyboard in Python using OpenCV

Last Updated : 20 Jun, 2024

Virtual keyboards are becoming a necessary tool for many applications, particularly those using touchscreens and augmented reality. Hand gestures combined with a virtual keyboard can provide an intuitive and engaging user interface. This post demonstrates how to build a basic virtual keyboard in Python using OpenCV, a powerful computer vision library.

Prerequisites & libraries

First, let us install the required modules.

pip install opencv-python cvzone numpy pynput

Step-by-Step Implementation

1. Initialization

  • Libraries are imported for video capture, hand detection, keyboard control, and image processing.
  • A webcam object is created to capture live video.
    • cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080)
    • cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 520)
    • The width (`1080`) and height (`520`) set the resolution of the frames captured by the camera; the resolution is kept low for faster processing.
  • The HandDetector from CVZone is initialized to detect hands in the video frames. The detection confidence is set to 0.8, and the tracking confidence to 0.2.
  • The virtual keyboard layout is defined as a list of nested lists, representing rows and keys.
  • A Keyboard Controller object is created to interact with the system keyboard.
Python
import cv2
import cvzone
from cvzone.HandTrackingModule import HandDetector
from time import sleep
import numpy as np
from pynput.keyboard import Controller, Key

# Initialize video capture
cap = cv2.VideoCapture(0)
# Set the frame width and height
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080)  # Width
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 520)   # Height

cv2.namedWindow("Virtual Keyboard", cv2.WINDOW_NORMAL)

# Initialize HandDetector for hand tracking
# Detection and tracking confidence thresholds from CVZone
detector = HandDetector(detectionCon=0.8, minTrackCon=0.2)

# Define virtual keyboard layout
keyboard_keys = [
    ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"],
    ["Q", "W", "E", "R", "T", "Y", "U", "I", "O", "P"],
    ["A", "S", "D", "F", "G", "H", "J", "K", "L", ";"],
    ["Z", "X", "C", "V", "B", "N", "M", ",", ".", "/"],
    ["SPACE", "ENTER", "BACKSPACE"]
]

keyboard = Controller()  # Create a keyboard controller instance

2. Drawing Buttons

  • Button Class:
    • This class is defined to represent each key on the virtual keyboard. It stores the button's position, text label, and size.
Python
class Button:
    def __init__(self, pos, text, size=(85, 85)):
        self.pos = pos
        self.size = size
        self.text = text
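As a quick illustration (this snippet is not part of the original article), a button constructed without an explicit size gets the default 85×85 dimensions, while special keys pass their own size:

```python
class Button:
    def __init__(self, pos, text, size=(85, 85)):
        self.pos = pos
        self.size = size
        self.text = text

# A regular key at column 0, row 0 of the layout grid
q_key = Button((25, 50), "Q")
# A wider special key with an explicit size
space_key = Button((780, 450), "SPACE", (220, 85))

print(q_key.text, q_key.size)        # Q (85, 85)
print(space_key.text, space_key.size)  # SPACE (220, 85)
```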
  • The `draw_buttons` function takes an image and a list of buttons, drawing each button on the image using OpenCV drawing functions.
    • Button box:
      • cvzone.cornerRect(img, (x, y, w, h), 20, rt=0)
      • cv2.rectangle(img, button.pos, (int(x + w), int(y + h)), (37, 238, 250), cv2.FILLED)
      • Rounded Corners: cvzone.cornerRect draws a rectangle with rounded corners.
      • Filled Rectangle: cv2.rectangle draws a filled rectangle with the specified color (37, 238, 250).
      • For more detail, refer to the cvzone `cornerRect` and OpenCV `rectangle` documentation.
Python
def draw_buttons(img, button_list):
    """
    Draws buttons on the given image.

    Args:
        img (numpy.ndarray): The image on which the buttons will be drawn.
        button_list (list): A list of Button objects representing the buttons to be drawn.

    Returns:
        numpy.ndarray: The image with the buttons drawn.
    """
    for button in button_list:
        x, y = button.pos
        w, h = button.size
        cvzone.cornerRect(img, (x, y, w, h), 20, rt=0)
        cv2.rectangle(img, button.pos, (int(x + w), int(y + h)),
                      (37, 238, 250), cv2.FILLED)
        cv2.putText(img, button.text, (x + 20, y + 65),
                    cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)
    return img

3. Button Objects

  • A list of Button objects is created based on the keyboard layout definition. This list represents the virtual keyboard displayed on the screen.
Python
button_list = []

# Create Button objects based on keyboard_keys layout
for k in range(len(keyboard_keys)):
    for x, key in enumerate(keyboard_keys[k]):
        if key != "SPACE" and key != "ENTER" and key != "BACKSPACE":
            button_list.append(Button((100 * x + 25, 100 * k + 50), key))
        elif key == "ENTER":
            button_list.append(
                Button((100 * x - 30, 100 * k + 50), key, (220, 85)))
        elif key == "SPACE":
            button_list.append(
                Button((100 * x + 780, 100 * k + 50), key, (220, 85)))
        elif key == "BACKSPACE":
            button_list.append(
                Button((100 * x + 140, 100 * k + 50), key, (400, 85)))
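The hard-coded offsets above place every key on a 100-pixel grid. As a sanity check (this helper is not in the original article), you can compute the overall extent of the generated buttons; note that the bottom row ends at y = 535, slightly below the requested 520-pixel frame height, so the actual frame size delivered by the camera matters in practice:

```python
class Button:
    def __init__(self, pos, text, size=(85, 85)):
        self.pos = pos
        self.size = size
        self.text = text

keyboard_keys = [
    ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"],
    ["Q", "W", "E", "R", "T", "Y", "U", "I", "O", "P"],
    ["A", "S", "D", "F", "G", "H", "J", "K", "L", ";"],
    ["Z", "X", "C", "V", "B", "N", "M", ",", ".", "/"],
    ["SPACE", "ENTER", "BACKSPACE"],
]

# Same placement rules as in the article's layout loop
button_list = []
for k in range(len(keyboard_keys)):
    for x, key in enumerate(keyboard_keys[k]):
        if key not in ("SPACE", "ENTER", "BACKSPACE"):
            button_list.append(Button((100 * x + 25, 100 * k + 50), key))
        elif key == "ENTER":
            button_list.append(Button((100 * x - 30, 100 * k + 50), key, (220, 85)))
        elif key == "SPACE":
            button_list.append(Button((100 * x + 780, 100 * k + 50), key, (220, 85)))
        else:  # BACKSPACE
            button_list.append(Button((100 * x + 140, 100 * k + 50), key, (400, 85)))

# Right and bottom edges of the furthest-extending buttons
max_right = max(b.pos[0] + b.size[0] for b in button_list)
max_bottom = max(b.pos[1] + b.size[1] for b in button_list)
print(max_right, max_bottom)  # 1010 535
```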

4. Main Loop

  • Main loop for capturing frames and detecting hand gestures.
    • The main loop continuously reads video frames from the webcam.
    • Hand detection is performed on each frame using the HandDetector object (`detector`) initialized above.
      • allHands, img = detector.findHands(img) # Find hands in the frame
      • allHands is a list with one dictionary per detected hand; each dictionary holds the landmark coordinates (`lmList`), the bounding box (`bbox`), and the hand type (left or right).
      • If no hands are detected in the frame, allHands is an empty list.
    • Buttons are drawn on the frame using the draw_buttons function.
    • If a hand is detected, fingertip positions (landmarks) are analyzed.
    • If the index fingertip and thumb tip pinch together over a button (their distance falls below 30 pixels), the corresponding key press is simulated with the keyboard Controller object. A small delay prevents accidental repeated presses.
    • For space, enter, and backspace, the appropriate Key object from the pynput library is used to simulate the key press and release.
    • On Successful click button color changes to Green. `cv2.rectangle(img, button.pos, (x + w, y + h), (0, 255, 0), cv2.FILLED)`
  • Exit and Cleanup:
    • The loop exits when the ESC key is pressed.
Python
while True:
    success, img = cap.read()  # Read frame from camera
    allHands, img = detector.findHands(img)  # Find hands in the frame

    if len(allHands) == 0:
        lm_list, bbox_info = [], []
    else:
        # Find landmarks and bounding box info
        lm_list, bbox_info = allHands[0]['lmList'], allHands[0]['bbox']

    img = draw_buttons(img, button_list)  # Draw buttons on the frame

    # Check if landmarks (lmList) are detected
    if lm_list:
        for button in button_list:
            x, y = button.pos
            w, h = button.size

            # Check if index finger (lmList[8]) is within the button bounds
            if x < lm_list[8][0] < x + w and y < lm_list[8][1] < y + h:
                cv2.rectangle(img, button.pos, (x + w, y + h),
                              (247, 45, 134), cv2.FILLED) # Highlight the button on hover
                cv2.putText(img, button.text, (x + 20, y + 65),
                            cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)

                # Calculate distance between thumb (lmList[4]) and index finger (lmList[8])
                distance = np.sqrt(
                    (lm_list[8][0] - lm_list[4][0])**2 + (lm_list[8][1] - lm_list[4][1])**2)

                # If distance is small, simulate key press
                if distance < 30:
                    # Check for special keys
                    if button.text not in ["ENTER", "BACKSPACE", "SPACE"]:
                        keyboard.press(button.text)    # Press the key
                        keyboard.release(button.text)  # Release it so the key is not left held down
                        # Small delay for better usability & to prevent accidental key presses
                        sleep(0.2)
                    else:
                        if button.text == "SPACE":
                            keyboard.press(Key.space)
                            keyboard.release(Key.space)
                            sleep(0.2)

                        elif button.text == "ENTER":
                            keyboard.press(Key.enter)
                            keyboard.release(Key.enter)
                            sleep(0.2)

                        elif button.text == "BACKSPACE":
                            keyboard.press(Key.backspace)
                            keyboard.release(Key.backspace)
                            sleep(0.2)

                    cv2.rectangle(img, button.pos, (x + w, y + h),
                                  (0, 255, 0), cv2.FILLED)
                    cv2.putText(img, button.text, (x + 20, y + 65),
                                cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)

    # Display the frame with virtual keyboard
    cv2.imshow("Virtual Keyboard", img)
    if cv2.waitKey(1) & 0xFF == 27:  # Exit loop on ESC key press
        break
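The pinch test inside the loop can be factored into a small helper; `math.hypot` computes the same Euclidean distance as the explicit square root above. This refactoring is a sketch and is not part of the original code (landmarks 4 and 8 are the thumb tip and index fingertip in MediaPipe's hand model, which cvzone uses):

```python
import math

def is_pinch(lm_list, threshold=30):
    """Return True when the thumb tip (landmark 4) and index fingertip
    (landmark 8) are closer than `threshold` pixels."""
    thumb, index = lm_list[4], lm_list[8]
    return math.hypot(index[0] - thumb[0], index[1] - thumb[1]) < threshold

# Illustrative landmark lists (only indices 4 and 8 matter here)
open_hand = [(0, 0)] * 21
open_hand[4], open_hand[8] = (100, 100), (200, 220)  # far apart -> no pinch
pinched = [(0, 0)] * 21
pinched[4], pinched[8] = (100, 100), (110, 115)      # ~18 px apart -> pinch
print(is_pinch(open_hand), is_pinch(pinched))  # False True
```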

The webcam object and OpenCV windows are released for proper resource management.

Python
# Release resources
cap.release()
cv2.destroyAllWindows()

Video Demonstration

Conclusion

This code shows a basic hand-tracking keyboard built with OpenCV and CVZone's hand-tracking module. It demonstrates hand detection on a live video feed and key-press simulation with pynput for touchless gestures. This is an elementary solution that leaves plenty of room to grow: customized keyboard layouts, machine learning for better accuracy, and integration into larger applications. Experiment and keep coding!

