10.2. OCR of Hand-written Data using kNN
10.2.1. Goal
10.2.2. OCR of Hand-written Digits
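Our goal is to build an application that can read handwritten digits. For this, we need some training data and some test data. OpenCV comes with an image, digits.png (in the folder opencv/samples/python2/data/).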
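The session that follows cuts the digits sheet into 20x20 cells and flattens each cell into a 400-value row. As a minimal sketch of that splitting step, the example here uses a small synthetic sheet instead of digits.png; the 4x6 grid and the fill values are assumptions chosen only so that the shapes are easy to check.

```python
import numpy as np

# Synthetic stand-in for the grayscale sheet: 4 rows x 6 columns of 20x20 cells.
# Every cell is filled with its own index so the cells can be told apart.
rows, cols, cell = 4, 6, 20
sheet = np.zeros((rows * cell, cols * cell), dtype=np.uint8)
for r in range(rows):
    for c in range(cols):
        sheet[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell] = r * cols + c

# Cut the sheet into horizontal strips, then cut every strip into cells.
cells = np.array([np.hsplit(strip, cols) for strip in np.vsplit(sheet, rows)])

# Left half of the columns becomes training data, right half becomes test data.
# Each 20x20 cell is flattened into one row of 400 float32 values.
train = cells[:, :cols // 2].reshape(-1, cell * cell).astype(np.float32)
test = cells[:, cols // 2:].reshape(-1, cell * cell).astype(np.float32)

print(cells.shape, train.shape, test.shape)
```

The same calls with 50 strips and 100 cells per strip give the (50,100,20,20) array used in the session.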
>>> import numpy as np
>>> import cv2 as cv
>>> from matplotlib import pyplot as plt
>>> img = cv.imread('/cvdata/digits.png')
>>> gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
>>> # Now we split the image into 5000 cells, each 20x20 pixels
>>> cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]
>>> # Make it into a NumPy array. Its size will be (50,100,20,20)
>>> x = np.array(cells)
>>> # Now we prepare train_data and test_data.
>>> train = x[:,:50].reshape(-1,400).astype(np.float32) # Size = (2500,400)
>>> test = x[:,50:100].reshape(-1,400).astype(np.float32) # Size = (2500,400)
>>> # Create labels for train and test data
>>> k = np.arange(10)
>>> train_labels = np.repeat(k,250)[:,np.newaxis]
>>> test_labels = train_labels.copy()
>>> # Initiate kNN, train it on the training data, then test it on the test data with k=5
>>> knn = cv.ml.KNearest_create()
>>> knn.train(train, cv.ml.ROW_SAMPLE, train_labels)
>>> ret,result,neighbours,dist = knn.findNearest(test,k=5)
>>> # Now we check the accuracy of classification
>>> # For that, compare the result with test_labels and check which are wrong
>>> matches = result==test_labels
>>> correct = np.count_nonzero(matches)
>>> accuracy = correct*100.0/result.size
>>> print( accuracy )
>>> # save the data
>>> np.savez('knn_data.npz',train=train, train_labels=train_labels)
>>> # Now load the data
>>> with np.load('knn_data.npz') as data:
...     print( data.files )
...     train = data['train']
...     train_labels = data['train_labels']
...
['train', 'train_labels']
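To make the classification step less of a black box, here is a minimal brute-force sketch of the k-nearest-neighbour idea in plain NumPy. It is not OpenCV's implementation: the function name knn_predict, the tiny data set, and the tie-breaking rule (lowest label wins) are assumptions made for illustration.

```python
import numpy as np

def knn_predict(train, train_labels, samples, k=5):
    """Brute-force kNN: majority label among the k closest training rows."""
    # Squared Euclidean distance between every sample and every training row.
    diffs = samples[:, np.newaxis, :] - train[np.newaxis, :, :]
    dist2 = np.sum(diffs ** 2, axis=2)
    # Indices of the k smallest distances for each sample.
    nearest = np.argsort(dist2, axis=1)[:, :k]
    neighbour_labels = train_labels.ravel()[nearest].astype(int)
    # Majority vote per sample (ties go to the lowest label).
    votes = [np.bincount(row).argmax() for row in neighbour_labels]
    return np.array(votes)[:, np.newaxis]

# Made-up 2-D data: three points near the origin (label 0), three near (10, 10) (label 1).
train = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=np.float32)
train_labels = np.array([0, 0, 0, 1, 1, 1])[:, np.newaxis]
samples = np.array([[0.5, 0.5], [9.5, 10.5]], dtype=np.float32)

result = knn_predict(train, train_labels, samples, k=3)
print(result.ravel())
```

The broadcasted difference array grows with samples x training rows x features, so this form is only practical for small inputs.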
10.2.3. OCR of the English Alphabet
Next, we will do the same for the English alphabet, but with a slight change in the data and the feature set. Here, instead of images, OpenCV comes with a data file, letter-recognition.data, in the opencv/samples/cpp/ folder. If you open it, you will see 20000 lines which may, at first sight, look like garbage. In fact, the first column of each row is a letter, which is our label. The 16 numbers that follow it are its features. These features are obtained from the UCI Machine Learning Repository. You can find the details of these features on this page.
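As a small sketch of that row layout, the example here parses a single line by hand. The row is hypothetical, invented for illustration, and is not copied from letter-recognition.data; only the layout (one letter, then 16 comma-separated numbers) follows the description above.

```python
import numpy as np

# Hypothetical row: a capital letter followed by 16 integer features.
row = "C,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16"

fields = row.split(",")
# Map the letter to a number: 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25.
label = ord(fields[0]) - ord("A")
# The remaining 16 fields are the feature vector.
features = np.array([float(v) for v in fields[1:]], dtype=np.float32)

print(label, features.shape)
```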
>>> import cv2 as cv
>>> import numpy as np
>>> # Load the data, converters convert the letter to a number
>>> data = np.loadtxt('/cvdata/letter-recognition.data',
...                   dtype='float32',
...                   delimiter=',',
...                   converters={0: lambda ch: ord(ch)-ord('A')})
>>> # Split the data into two halves, 10000 rows each for training and testing
>>> train, test = np.vsplit(data,2)
>>> # Split each half into labels (first column) and features (remaining 16 columns)
>>> responses, trainData = np.hsplit(train,[1])
>>> labels, testData = np.hsplit(test,[1])
>>> # Initiate the kNN, classify, measure accuracy.
>>> knn = cv.ml.KNearest_create()
>>> knn.train(trainData, cv.ml.ROW_SAMPLE, responses)
>>> ret, result, neighbours, dist = knn.findNearest(testData, k=5)
>>> correct = np.count_nonzero(result == labels)
>>> accuracy = correct*100.0/10000
>>> print( accuracy )
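Because the converter stores each letter as a number from 0 to 25, the predictions come back as numbers too. A minimal sketch of mapping them back to letters and listing the mistakes might look like this; the result and labels arrays are made up to mimic the (N, 1) float32 arrays from the session, and to_letters is a hypothetical helper, not an OpenCV function.

```python
import numpy as np

# Made-up predictions and ground truth in the same (N, 1) float32 layout.
result = np.array([[0.], [2.], [25.], [7.]], dtype=np.float32)
labels = np.array([[0.], [2.], [24.], [7.]], dtype=np.float32)

def to_letters(codes):
    """Invert the ord(ch) - ord('A') encoding: 0 -> 'A', ..., 25 -> 'Z'."""
    return [chr(int(c) + ord('A')) for c in codes.ravel()]

predicted = to_letters(result)
expected = to_letters(labels)
# Pairs of (true letter, predicted letter) where the classifier was wrong.
mistakes = [(e, p) for e, p in zip(expected, predicted) if e != p]
# Dividing by result.size avoids hard-coding the number of test rows.
accuracy = 100.0 * np.count_nonzero(result == labels) / result.size

print(predicted, mistakes, accuracy)
```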