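Machine learning has many applications in pattern recognition, and handwriting recognition is one of them. The simplest case, digit recognition, is a multi-class classification problem. We use it here to introduce TensorFlow, the framework Google recently open-sourced; the later deep learning material will be explained and demonstrated with TensorFlow.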
What is TensorFlow
The name combines two words: "tensor", the mathematical object, and "flow", as in a stream of data.
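"Tensor" was originally a term from mechanics, describing the stress state at each point of an elastic medium. In mathematics, a tensor is a generalized "quantity": a rank-0 tensor is a scalar (for example 0, 1, 2, ...), a rank-1 tensor is a vector (for example (1, 3, 4)), and a rank-2 tensor is a matrix. These forms look unrelated, but they are all classed as tensors because they share certain properties: 1) they can be expressed in a coordinate system; 2) they obey the same transformation rules under a change of coordinates; 3) they support the same basic operations (such as addition, subtraction, multiplication, division, scaling, dot product, symmetrization, ...).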
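TensorFlow can therefore be understood as a framework that processes tensors in the form of a "flow". It was developed and open-sourced by Google and is already used in Google Brain projects.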
Installing TensorFlow
sudo pip install https://storage.googleapis.com/tensorflow/mac/tensorflow-0.9.0-py2-none-any.whl
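For other platforms, use the corresponding whl package.
A problem you may run into:
If import tensorflow fails, the cause is a mismatched protobuf version. Uninstall protobuf first and then install TensorFlow again, which automatically installs protobuf 3.0: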
sudo pip uninstall protobuf
sudo brew remove protobuf260
sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.9.0-py2-none-any.whl
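Getting the handwritten digit dataset
The handwritten digit dataset can be downloaded from http://yann.lecun.com/exdb/mnist/ . The two files used here are http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz and http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz. After downloading and decompressing them, you will find that they are not image files but use their own binary format. To show what the data looks like, I wrote a program that displays the digits: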
/************************
* author: SharEDITor
* date: 2016-08-02
* brief: read MNIST data
************************/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>
unsigned char *lables = NULL;
/**
 * All the integers in the files are stored in the MSB first (big-endian) format
 */
void copy_int(uint32_t *target, unsigned char *src)
{
    // Assemble the value byte by byte so the result does not depend on the host byte order
    *target = ((uint32_t)src[0] << 24) |
              ((uint32_t)src[1] << 16) |
              ((uint32_t)src[2] << 8) |
              (uint32_t)src[3];
}
int read_lables()
{
    // open in binary mode so the bytes are read unmodified on every platform
    FILE *fp = fopen("./train-labels-idx1-ubyte", "rb");
    if (NULL == fp)
    {
        return -1;
    }
    // header: 4-byte magic number followed by 4-byte item count
    unsigned char head[8];
    fread(head, sizeof(unsigned char), 8, fp);
    uint32_t magic_number = 0;
    uint32_t item_num = 0;
    copy_int(&magic_number, &head[0]);
    // magic number check
    assert(magic_number == 2049);
    copy_int(&item_num, &head[4]);
    // one byte per label
    uint64_t values_size = sizeof(unsigned char) * item_num;
    lables = (unsigned char*)malloc(values_size);
    if (NULL == lables)
    {
        fclose(fp);
        return -1;
    }
    fread(lables, sizeof(unsigned char), values_size, fp);
    fclose(fp);
    return 0;
}
int read_images()
{
    // open in binary mode so the bytes are read unmodified on every platform
    FILE *fp = fopen("./train-images-idx3-ubyte", "rb");
    if (NULL == fp)
    {
        return -1;
    }
    // header: magic number, image count, rows, cols (4 bytes each)
    unsigned char head[16];
    fread(head, sizeof(unsigned char), 16, fp);
    uint32_t magic_number = 0;
    uint32_t images_num = 0;
    uint32_t rows = 0;
    uint32_t cols = 0;
    copy_int(&magic_number, &head[0]);
    // magic number check
    assert(magic_number == 2051);
    copy_int(&images_num, &head[4]);
    copy_int(&rows, &head[8]);
    copy_int(&cols, &head[12]);
    // one byte per pixel
    uint64_t image_size = rows * cols;
    uint64_t values_size = sizeof(unsigned char) * images_num * rows * cols;
    unsigned char *values = (unsigned char*)malloc(values_size);
    if (NULL == values)
    {
        fclose(fp);
        return -1;
    }
    fread(values, sizeof(unsigned char), values_size, fp);
    for (int image_index = 0; image_index < images_num; image_index++)
    {
        // print the label
        printf("========================================= %d ======================================\n", lables[image_index]);