Building a CAPTCHA Recognition System with Keras

ttocr.com · Published 2025-02-19 23:10:09

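In this tutorial we use Keras, a high-level neural network library usually used together with TensorFlow, to build a CAPTCHA recognition system. We train a convolutional neural network (CNN) to recognize the characters in CAPTCHA images. Keras offers a concise API that is well suited to building and training deep learning models quickly.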

1. Environment Setup
First, make sure the following libraries are installed:

bash

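pip install tensorflow opencv-python numpy matplotlib pillow

TensorFlow: the deep learning framework; Keras is its high-level API.
opencv-python: image loading and processing.
numpy: data handling and array operations.
matplotlib: plotting loss and accuracy during training.
pillow: the image-loading backend that Keras's load_img relies on.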
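
2. Dataset Preparation and Image Preprocessing
CAPTCHA images usually contain noise, interference lines, and distorted characters, and handling these is a key step toward better recognition accuracy. Before training, we run each image through a series of preprocessing steps: grayscale conversion, binarization, and denoising.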

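(1) Loading and preprocessing images
First we load the CAPTCHA image and convert it to grayscale, which collapses the three color channels into a single intensity channel and removes color as a source of interference. We then binarize the image into pure black and white so that the characters stand out from the background.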

python

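import cv2
import numpy as np

def preprocess_image(img_path):
    # Read the image; cv2.imread returns None when the file cannot be read
    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"Cannot read image: {img_path}")

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Binarize with Otsu's method, which picks the threshold automatically.
    # THRESH_BINARY_INV maps pixels above the threshold to 0 and the rest to 255.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Gaussian blur to suppress noise (the result is no longer strictly two-valued)
    blurred = cv2.GaussianBlur(binary, (5, 5), 0)

    return blurred

# Example image path
img_path = 'captcha_images/test1.png'
processed_img = preprocess_image(img_path)

# Show the processed image
cv2.imshow('Processed Image', processed_img)
cv2.waitKey(0)
cv2.destroyAllWindows()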
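
To make the automatic threshold selection less of a black box, here is a minimal NumPy sketch of Otsu's criterion: it picks the gray level that maximizes the variance between the two resulting pixel classes. The function names `otsu_threshold` and `binarize_inverted` are made up for this illustration, and the tie-breaking may differ from what OpenCV returns, so treat it as an explanation of the idea rather than a replacement for cv2.threshold.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level that maximizes between-class variance.

    `gray` is assumed to be a 2-D uint8 array (hypothetical helper).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    omega = np.cumsum(prob)           # share of pixels with value <= t
    mu = np.cumsum(prob * levels)     # cumulative mean up to t
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0         # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

def binarize_inverted(gray):
    """Inverted binarization: pixels above the threshold become 0, others 255."""
    t = otsu_threshold(gray)
    binary = np.where(gray > t, 0, 255).astype(np.uint8)
    return binary, t

# Synthetic example: a dark block (value 30) on a light background (value 200)
gray = np.full((20, 60), 200, dtype=np.uint8)
gray[5:15, 10:20] = 30
binary, t = binarize_inverted(gray)
print(t, binary[10, 15], binary[0, 0])
```

With the inverted mode, dark characters on a light background end up white on black, which is the form findContours expects.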
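
(2) Extracting character regions
We use contour detection to extract the region of each character. OpenCV's findContours function detects the contours in the image, and boundingRect gives the bounding box of each one.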

python

def extract_characters(processed_img):
    # OpenCV 4.x returns (contours, hierarchy)
    contours, _ = cv2.findContours(processed_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    char_boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w > 10 and h > 10:  # ignore small noise blobs
            char_img = processed_img[y:y+h, x:x+w]
            char_boxes.append((x, char_img))

    # Sort left to right by the x coordinate of each bounding box's top-left corner
    char_boxes.sort(key=lambda box: box[0])
    return [char_img for _, char_img in char_boxes]

# Extract the character regions
char_images = extract_characters(processed_img)

# Show the extracted characters
for i, char_img in enumerate(char_images):
    cv2.imshow(f'Character {i+1}', char_img)
    cv2.waitKey(0)

cv2.destroyAllWindows()
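
The extracted crops come in different sizes, while the network in the next section expects a fixed 28x28x1 input. One possible way to bridge the two, sketched here with plain NumPy, is to pad each crop to a square and resample it with nearest-neighbor indexing. The helper name `to_model_input` is hypothetical; in practice cv2.resize would be the more usual choice for the resampling step.

```python
import numpy as np

def to_model_input(char_img, size=28):
    """Pad a 2-D uint8 crop to a square, resize it to size x size,
    scale to [0, 1], and add a channel axis (hypothetical helper)."""
    h, w = char_img.shape
    side = max(h, w)

    # Center the crop on a black square canvas so the aspect ratio is kept
    square = np.zeros((side, side), dtype=char_img.dtype)
    top = (side - h) // 2
    left = (side - w) // 2
    square[top:top + h, left:left + w] = char_img

    # Nearest-neighbor resize: pick one source row/column per target pixel
    idx = np.arange(size) * side // size
    resized = square[idx][:, idx]

    # Normalize and append the channel dimension expected by Conv2D
    return (resized.astype(np.float32) / 255.0)[..., np.newaxis]

# Example: a tall, narrow all-white crop
crop = np.full((20, 12), 255, dtype=np.uint8)
x = to_model_input(crop)
print(x.shape, x.dtype)
```

Padding before resizing avoids stretching narrow characters such as "1" or "I" to the full width of the input.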
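
3. Building the Convolutional Neural Network (CNN)
In this part we use Keras to build a convolutional neural network (CNN). A CNN is a neural network architecture designed for image data: convolutional layers extract image features, and fully connected layers classify the character.

(1) Building the CNN model
We build a simple CNN consisting of two convolutional layers, each followed by a pooling layer, then a flatten layer and fully connected layers.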

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_cnn_model(input_shape=(28, 28, 1), num_classes=36):
    model = Sequential()

    # Convolutional block 1
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))

    # Convolutional block 2
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))

    # Flatten layer
    model.add(Flatten())

    # Fully connected layer
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))  # Dropout layer to reduce overfitting

    # Output layer: the character set is assumed to be 0-9 and A-Z, 36 characters in total
    model.add(Dense(num_classes, activation='softmax'))

    return model

# Build the model
model = build_cnn_model(input_shape=(28, 28, 1), num_classes=36)

# Print the model structure
model.summary()
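
As a cross-check on what model.summary() prints, the layer shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used above (3x3 kernels, stride 1, 'valid' padding, 2x2 pooling); the function names are invented for this illustration.

```python
def conv_out(n, k=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return n - k + 1

def pool_out(n, p=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return n // p

def cnn_shape_report(size=28, channels=1, num_classes=36):
    # Block 1: Conv2D(32, 3x3) then MaxPooling2D(2x2)
    s1 = pool_out(conv_out(size))            # 28 -> 26 -> 13
    conv1_params = 32 * (3 * 3 * channels + 1)

    # Block 2: Conv2D(64, 3x3) then MaxPooling2D(2x2)
    s2 = pool_out(conv_out(s1))              # 13 -> 11 -> 5
    conv2_params = 64 * (3 * 3 * 32 + 1)

    # Flatten, Dense(128), Dense(num_classes); Dropout has no parameters
    flat = s2 * s2 * 64
    dense_params = flat * 128 + 128
    out_params = 128 * num_classes + num_classes

    total = conv1_params + conv2_params + dense_params + out_params
    return {"after_block1": s1, "after_block2": s2, "flatten": flat,
            "conv1": conv1_params, "conv2": conv2_params,
            "dense": dense_params, "output": out_params, "total": total}

report = cnn_shape_report()
print(report)
```

Most of the parameters sit in the first Dense layer, so it is the first place to look if the model needs to shrink.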
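
(2) Data preprocessing and training
Before training, we normalize the image data to the range [0, 1] and convert the labels to one-hot encoding. We assume the character set is 0-9 and A-Z, 36 characters in total.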

python

import numpy as np
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import img_to_array, load_img
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def load_and_preprocess_image(image_path):
    # Load the image as grayscale and resize it to 28x28
    img = load_img(image_path, color_mode='grayscale', target_size=(28, 28))
    img = img_to_array(img)  # shape (28, 28, 1)
    img = img / 255.0  # normalize to [0, 1]

    return img

# Assume we have training image paths and labels.
# Each image is assumed to contain a single character, and each label is
# that character's index in the 36-character set.
train_image_paths = ['captcha_images/train1.png', 'captcha_images/train2.png']  # example paths
train_labels = [0, 1]  # example labels

# Load and preprocess the training images
train_images = np.array([load_and_preprocess_image(img_path) for img_path in train_image_paths])

# Convert the labels to one-hot encoding
train_labels = to_categorical(train_labels, num_classes=36)

# Data augmentation (optional)
datagen = ImageDataGenerator(rotation_range=10, zoom_range=0.1, width_shift_range=0.1, height_shift_range=0.1)
datagen.fit(train_images)

# Train the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(datagen.flow(train_images, train_labels, batch_size=32), epochs=10)
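
The example labels above are already integer class indices. When labels start out as characters, they first need to be mapped to indices in the same order as the character set. The following is a small NumPy sketch of that mapping and of the one-hot step; the names `CHAR_SET`, `CHAR_TO_INDEX`, and `encode_labels` are assumptions made for this example, with the character order taken from the prediction code later in the article.

```python
import numpy as np

# Character order must match the order used when decoding predictions
CHAR_SET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHAR_SET)}

def encode_labels(chars):
    """Turn a sequence of characters into a (len(chars), 36) one-hot matrix."""
    indices = np.array([CHAR_TO_INDEX[c] for c in chars])
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(len(CHAR_SET), dtype=np.float32)[indices]

one_hot = encode_labels("0A9Z")
print(one_hot.shape, one_hot.argmax(axis=1))
```

The result has the same layout as the output of to_categorical for the corresponding indices, so either can be passed to model.fit.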
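
4. Model Evaluation and Prediction
Once training is finished, we can evaluate the model's performance and run predictions on new CAPTCHA images.

(1) Evaluating the model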
python

# Assume you have test image paths and labels
test_image_paths = ['captcha_images/test1.png']
test_labels = [0]

# Load and preprocess the test images
test_images = np.array([load_and_preprocess_image(img_path) for img_path in test_image_paths])

# Convert the labels to one-hot encoding
test_labels = to_categorical(test_labels, num_classes=36)

# Evaluate the model
loss, accuracy = model.evaluate(test_images, test_labels)
print(f"Test Accuracy: {accuracy * 100:.2f}%")
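
The accuracy reported by model.evaluate is per character. A CAPTCHA only counts as solved when every one of its characters is right, so the whole-CAPTCHA accuracy is usually lower. The sketch below computes both from predicted and true class indices; the function names and the fixed `chars_per_captcha` layout are assumptions for illustration, and the numbers are made up rather than measured.

```python
import numpy as np

def char_accuracy(pred_idx, true_idx):
    """Fraction of individual characters classified correctly."""
    pred_idx, true_idx = np.asarray(pred_idx), np.asarray(true_idx)
    return float(np.mean(pred_idx == true_idx))

def captcha_accuracy(pred_idx, true_idx, chars_per_captcha):
    """Fraction of CAPTCHAs in which every character is correct.

    Assumes the indices are stored CAPTCHA by CAPTCHA, each with the
    same number of characters.
    """
    correct = np.asarray(pred_idx) == np.asarray(true_idx)
    per_captcha = correct.reshape(-1, chars_per_captcha).all(axis=1)
    return float(np.mean(per_captcha))

# Two 4-character CAPTCHAs; the last character of the second one is wrong
true_idx = [1, 2, 3, 4, 5, 6, 7, 8]
pred_idx = [1, 2, 3, 4, 5, 6, 7, 0]
char_acc = char_accuracy(pred_idx, true_idx)
full_acc = captcha_accuracy(pred_idx, true_idx, chars_per_captcha=4)
print(char_acc, full_acc)
```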

(2) Predicting a CAPTCHA
python
def predict_captcha(model, img_path, char_set="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    # Load and preprocess the image (assumed to contain a single character)
    img = load_and_preprocess_image(img_path)
    img = np.expand_dims(img, axis=0)  # add the batch dimension

    # Predict class probabilities
    pred = model.predict(img)
    predicted_class = np.argmax(pred, axis=1)[0]

    # Look up the predicted character
    predicted_char = char_set[predicted_class]

    return predicted_char

# Run a prediction on an image
captcha_image = 'captcha_images/test1.png'
predicted_label = predict_captcha(model, captcha_image)
print(f"Predicted CAPTCHA label: {predicted_label}")
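
The function above returns one character per image. To read a multi-character CAPTCHA, one option is to stack the segmented character crops into a single batch, call model.predict once, and decode each row of the resulting probability matrix. The decoding step can be sketched on its own as follows; `decode_captcha` is a hypothetical helper, and the probability matrix is built by hand here in place of real model output.

```python
import numpy as np

CHAR_SET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def decode_captcha(probs, char_set=CHAR_SET):
    """Map a (num_chars, num_classes) probability matrix to a string.

    Rows are assumed to be ordered left to right, as produced by
    sorting the character crops by x coordinate.
    """
    class_ids = np.argmax(probs, axis=1)
    return "".join(char_set[i] for i in class_ids)

# Stand-in for model.predict output: four rows, each peaked on one class
probs = np.full((4, len(CHAR_SET)), 0.01)
for row, class_id in enumerate([10, 3, 35, 0]):
    probs[row, class_id] = 0.65

text = decode_captcha(probs)
print(text)
```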