Virtual Keyboard in Python Using OpenCV
Virtual keyboards are becoming a necessary tool for many applications, particularly those using touchscreens and augmented reality. Combined with hand gestures, a virtual keyboard can provide an intuitive and engaging user interface. This post demonstrates how to build a basic virtual keyboard in Python with OpenCV, a powerful library for computer vision applications.
Prerequisites & libraries
First, let us install the required modules.
pip install opencv-python cvzone numpy pynput
Step-by-Step Implementation
1. Initialization
- Libraries are imported for video capture, hand detection, keyboard control, and image processing.
- A webcam object is created to capture live video.
- cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080)
- cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 520)
- W: `1080` and H: `520` set the resolution of the image captured by the camera; it is reduced for faster processing.
- The HandDetector from CVZone is initialized to detect hands in the video frames. The detection confidence is set to 0.8, and the tracking confidence to 0.2.
- The virtual keyboard layout is defined as a list of nested lists, representing rows and keys.
- A pynput keyboard `Controller` object is created to interact with the system keyboard.
import cv2
import cvzone
from cvzone.HandTrackingModule import HandDetector
from time import sleep
import numpy as np
from pynput.keyboard import Controller, Key
# Initialize video capture
cap = cv2.VideoCapture(0)
# Set the frame width and height
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1080) # Width
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 520) # Height
cv2.namedWindow("Virtual Keyboard", cv2.WINDOW_NORMAL)
# Initialize HandDetector for hand tracking
# Detection and tracking confidence thresholds from CVZone
detector = HandDetector(detectionCon=0.8, minTrackCon=0.2)
# Define virtual keyboard layout
keyboard_keys = [
["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"],
["Q", "W", "E", "R", "T", "Y", "U", "I", "O", "P"],
["A", "S", "D", "F", "G", "H", "J", "K", "L", ";"],
["Z", "X", "C", "V", "B", "N", "M", ",", ".", "/"],
["SPACE", "ENTER", "BACKSPACE"]
]
keyboard = Controller() # Create a keyboard controller instance
2. Drawing Buttons
- Button Class:
- This class is defined to represent each key on the virtual keyboard. It stores the button's position, text label, and size.
class Button:
def __init__(self, pos, text, size=(85, 85)):
self.pos = pos
self.size = size
self.text = text
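As a quick aside, the sketch below shows how a button's `pos` and `size` together define the hit region that the main loop later tests against the fingertip position. The `contains` helper is hypothetical, added here only for illustration; the article's loop inlines the same comparison.

```python
# Minimal standalone sketch (not part of the article's code): a Button's
# position and size define the rectangle used for hover/click detection.
class Button:
    def __init__(self, pos, text, size=(85, 85)):
        self.pos = pos
        self.size = size
        self.text = text

    def contains(self, point):
        """Hypothetical helper: is a (px, py) fingertip inside this key?"""
        x, y = self.pos
        w, h = self.size
        px, py = point
        return x < px < x + w and y < py < y + h

key_q = Button((125, 150), "Q")
print(key_q.contains((150, 180)))  # fingertip inside the key -> True
print(key_q.contains((50, 50)))    # fingertip outside the key -> False
```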
- The `draw_buttons` function takes an image and a list of buttons, and draws each button on the image using OpenCV drawing functions.
- Button box:
- cvzone.cornerRect(img, (x, y, w, h), 20, rt=0)
- cv2.rectangle(img, button.pos, (int(x + w), int(y + h)), (37, 238, 250), cv2.FILLED)
- Rounded Corners: cvzone.cornerRect draws a rectangle with rounded corners.
- Filled Rectangle: cv2.rectangle draws a filled rectangle with the specified color (37, 238, 250).
- For more detailed info, refer to the cvzone `cornerRect` and OpenCV `rectangle` documentation.
- Full `draw_buttons` function:
def draw_buttons(img, button_list):
"""
Draws buttons on the given image.
Args:
img (numpy.ndarray): The image on which the buttons will be drawn.
button_list (list): A list of Button objects representing the buttons to be drawn.
Returns:
numpy.ndarray: The image with the buttons drawn.
"""
for button in button_list:
x, y = button.pos
w, h = button.size
cvzone.cornerRect(img, (x, y, w, h), 20, rt=0)
cv2.rectangle(img, button.pos, (int(x + w), int(y + h)),
(37, 238, 250), cv2.FILLED)
cv2.putText(img, button.text, (x + 20, y + 65),
cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)
return img
3. Button objects
- A list of Button objects is created based on the keyboard layout definition. This list represents the virtual keyboard displayed on the screen.
button_list = []
# Create Button objects based on keyboard_keys layout
for k in range(len(keyboard_keys)):
for x, key in enumerate(keyboard_keys[k]):
if key != "SPACE" and key != "ENTER" and key != "BACKSPACE":
button_list.append(Button((100 * x + 25, 100 * k + 50), key))
elif key == "ENTER":
button_list.append(
Button((100 * x - 30, 100 * k + 50), key, (220, 85)))
elif key == "SPACE":
button_list.append(
Button((100 * x + 780, 100 * k + 50), key, (220, 85)))
elif key == "BACKSPACE":
button_list.append(
Button((100 * x + 140, 100 * k + 50), key, (400, 85)))
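Since the positions come from small arithmetic formulas, a quick standalone sanity check (duplicating the layout arithmetic, independent of OpenCV) can confirm how many keys are created and how much of the frame they cover:

```python
# Standalone sketch: rebuild the layout positions from the article's formulas
# and report the keyboard's overall extent.
keyboard_keys = [
    ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"],
    ["Q", "W", "E", "R", "T", "Y", "U", "I", "O", "P"],
    ["A", "S", "D", "F", "G", "H", "J", "K", "L", ";"],
    ["Z", "X", "C", "V", "B", "N", "M", ",", ".", "/"],
    ["SPACE", "ENTER", "BACKSPACE"],
]

positions = []
for k, row in enumerate(keyboard_keys):
    for x, key in enumerate(row):
        if key == "ENTER":
            pos, size = (100 * x - 30, 100 * k + 50), (220, 85)
        elif key == "SPACE":
            pos, size = (100 * x + 780, 100 * k + 50), (220, 85)
        elif key == "BACKSPACE":
            pos, size = (100 * x + 140, 100 * k + 50), (400, 85)
        else:
            pos, size = (100 * x + 25, 100 * k + 50), (85, 85)
        positions.append((key, pos, size))

max_x = max(p[0] + s[0] for _, p, s in positions)
max_y = max(p[1] + s[1] for _, p, s in positions)
print(len(positions), max_x, max_y)  # 43 keys, extent 1010 x 535
```

Note that the bottom row ends at y = 535, slightly beyond the requested 520-pixel frame height; the resolution actually delivered depends on the webcam, so the offsets may need tuning for your camera.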
4. Main Loop
- Main loop for capturing frames and detecting hand gestures.
- The main loop continuously reads video frames from the webcam.
- Hand detection is performed on each frame using the HandDetector object (`detector`) initialized above.
- allHands, img = detector.findHands(img) # Find hands in the frame
- `allHands` is a list of dictionaries, one per detected hand, each containing the landmark coordinates (`lmList`), the bounding box (`bbox`), and the hand type (left or right).
- If no hands are detected in the frame, `allHands` is empty.
- Buttons are drawn on the frame using the draw_buttons function.
- If a hand is detected, fingertip positions (landmarks) are analyzed.
- If the index fingertip and the thumb tip pinch together over a button (their distance is less than 30 pixels), the corresponding key press is simulated using the keyboard Controller object. A small delay is added to prevent accidental repeated presses.
- For space, enter, and backspace, the appropriate Key object from the pynput library is used to simulate the key press and release.
- On a successful click, the button color changes to green: `cv2.rectangle(img, button.pos, (x + w, y + h), (0, 255, 0), cv2.FILLED)`
- Exit and Cleanup:
- The loop exits when the ESC key is pressed.
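The pinch test at the heart of the loop below can be isolated into a small helper for clarity. The sketch is standalone and `is_pinch` is a hypothetical function added for illustration; the main loop inlines the same distance calculation. The landmark lists here are fabricated just to exercise it (only indices 4 and 8 matter).

```python
import numpy as np

def is_pinch(lm_list, threshold=30):
    """Return True when the thumb tip (landmark 4) and index fingertip
    (landmark 8) are within `threshold` pixels of each other."""
    thumb = np.array(lm_list[4][:2])
    index = np.array(lm_list[8][:2])
    return float(np.linalg.norm(index - thumb)) < threshold

lms = [(0, 0)] * 21                       # 21 landmarks, as MediaPipe provides
lms[4], lms[8] = (400, 300), (410, 290)   # ~14 px apart -> pinch
print(is_pinch(lms))  # True
lms[8] = (480, 360)                       # ~100 px apart -> no pinch
print(is_pinch(lms))  # False
```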
while True:
success, img = cap.read() # Read frame from camera
allHands, img = detector.findHands(img) # Find hands in the frame
if len(allHands) == 0:
lm_list, bbox_info = [], []
else:
# Find landmarks and bounding box info
lm_list, bbox_info = allHands[0]['lmList'], allHands[0]['bbox']
img = draw_buttons(img, button_list) # Draw buttons on the frame
# Check if landmarks (lmList) are detected
if lm_list:
for button in button_list:
x, y = button.pos
w, h = button.size
# Check if index finger (lmList[8]) is within the button bounds
if x < lm_list[8][0] < x + w and y < lm_list[8][1] < y + h:
cv2.rectangle(img, button.pos, (x + w, y + h),
(247, 45, 134), cv2.FILLED) # Highlight the button on hover
cv2.putText(img, button.text, (x + 20, y + 65),
cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)
# Calculate distance between thumb (lmList[4]) and index finger (lmList[8])
distance = np.sqrt(
(lm_list[8][0] - lm_list[4][0])**2 + (lm_list[8][1] - lm_list[4][1])**2)
# If distance is small, simulate key press
if distance < 30:
# Check for special keys
                    if button.text not in ['ENTER', "BACKSPACE", "SPACE"]:
                        keyboard.press(button.text)    # Press the key
                        keyboard.release(button.text)  # Release it so the key is not held down
                        # Small delay for better usability & prevent accidental key presses
                        sleep(0.2)
else:
if button.text == "SPACE":
keyboard.press(Key.space)
keyboard.release(Key.space)
sleep(0.2)
elif button.text == "ENTER":
keyboard.press(Key.enter)
keyboard.release(Key.enter)
sleep(0.2)
elif button.text == "BACKSPACE":
keyboard.press(Key.backspace)
keyboard.release(Key.backspace)
sleep(0.2)
cv2.rectangle(img, button.pos, (x + w, y + h),
(0, 255, 0), cv2.FILLED)
cv2.putText(img, button.text, (x + 20, y + 65),
cv2.FONT_HERSHEY_PLAIN, 4, (0, 0, 0), 4)
# Display the frame with virtual keyboard
cv2.imshow("Virtual Keyboard", img)
if cv2.waitKey(1) & 0xFF == 27: # Exit loop on ESC key press
break
The webcam object and OpenCV windows are released for proper resource management.
# Release resources
cap.release()
cv2.destroyAllWindows()
Conclusion
This post showed a basic implementation of a hand-tracking virtual keyboard using OpenCV and cvzone's hand-tracking module. It demonstrates hand detection on a live video feed and key-press simulation (via pynput) for touchless input. This is, of course, an elementary solution that can be extended into more sophisticated virtual keyboards and touchless controls: customized layouts, machine learning for better accuracy, and integration into other applications. Experiment and keep coding!