A Deep Learning Tutorial with the TensorFlow Framework on Big Data

import argparse
import os

import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets import mnist


def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def convert_to(data_set, name):
    """Converts a dataset to TFRecords and writes it into FLAGS.directory."""
    images = data_set.images
    labels = data_set.labels
    num_examples = data_set.num_examples

    if images.shape[0] != num_examples:
        raise ValueError('Images size %d does not match label size %d.' %
                         (images.shape[0], num_examples))
    rows = images.shape[1]
    cols = images.shape[2]
    depth = images.shape[3]

    filename = os.path.join(FLAGS.directory, name + '.tfrecords')
    print('Writing', filename)
    writer = tf.python_io.TFRecordWriter(filename)
    for index in range(num_examples):
        image_raw = images[index].tostring()
        # Each image becomes one Example holding its shape, label and raw bytes.
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(rows),
            'width': _int64_feature(cols),
            'depth': _int64_feature(depth),
            'label': _int64_feature(int(labels[index])),
            'image_raw': _bytes_feature(image_raw)}))
        writer.write(example.SerializeToString())
    writer.close()


def main(argv):
    # Get the data.
    data_sets = mnist.read_data_sets(FLAGS.directory,
                                     dtype=tf.uint8,
                                     reshape=False,
                                     validation_size=FLAGS.validation_size)

    # Convert to Examples and write the result to TFRecords.
    convert_to(data_sets.train, 'train')
    convert_to(data_sets.validation, 'validation')
    convert_to(data_sets.test, 'test')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--directory',
        type=str,
        default='/tmp/data',
        help='Directory to download data files and write the converted result'
    )
    parser.add_argument(
        '--validation_size',
        type=int,
        default=5000,
        help='Number of examples to separate from the training data for the '
             'validation set.'
    )
    FLAGS = parser.parse_args()
    tf.app.run()
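
To sanity-check the conversion, the written records can be read back and decoded using the same feature keys. The following is only a minimal sketch; the path /tmp/data/train.tfrecords is simply the default output location of the script above, and the variable names are illustrative:

import numpy as np
import tensorflow as tf

# Sketch: iterate over the records written by convert_to() and rebuild one image.
for record in tf.python_io.tf_record_iterator('/tmp/data/train.tfrecords'):
    example = tf.train.Example()
    example.ParseFromString(record)
    feature = example.features.feature
    height = feature['height'].int64_list.value[0]
    width = feature['width'].int64_list.value[0]
    depth = feature['depth'].int64_list.value[0]
    label = feature['label'].int64_list.value[0]
    image = np.frombuffer(feature['image_raw'].bytes_list.value[0], dtype=np.uint8)
    image = image.reshape(height, width, depth)
    print('label:', label, 'image shape:', image.shape)
    break  # inspect only the first record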

II. Configuring TensorFlow to Read Data from HDFS

The previous sections covered the HDFS configuration and how to convert the data and store it in HDFS. Reading from HDFS in TensorFlow takes only two simple steps. First, when launching the program, prefix the command with the required environment setting:

CLASSPATH=$($HADOOP_HDFS_HOME/bin/hadoop classpath --glob) python example.py
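
Equivalently, the same classpath can be assembled from inside the Python program before any HDFS path is accessed. This is only a sketch under the assumption that HADOOP_HDFS_HOME is already exported to point at the Hadoop installation:

import os
import subprocess

# Sketch (assumption): build the Hadoop classpath programmatically rather than on the
# command line; HADOOP_HDFS_HOME must already be set in the environment.
hadoop = os.path.join(os.environ['HADOOP_HDFS_HOME'], 'bin', 'hadoop')
os.environ['CLASSPATH'] = subprocess.check_output(
    [hadoop, 'classpath', '--glob']).decode().strip()

import tensorflow as tf  # CLASSPATH must be set before the first hdfs:// path is touched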

Second, when reading data, prefix the data path with the HDFS scheme, for example:

hdfs://default/user/data/example.txt
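
Combined with the TFRecord conversion above, a queue-based input pipeline reading such an HDFS path could look roughly like the sketch below; the hdfs:// filename and the 28x28x1 image shape are assumptions for illustration, not part of the original tutorial:

import tensorflow as tf

# Sketch (assumed path): read the MNIST TFRecords previously written to HDFS.
filename_queue = tf.train.string_input_producer(
    ['hdfs://default/user/data/train.tfrecords'])

reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)

# Parse the features using the same keys written by convert_to().
features = tf.parse_single_example(
    serialized_example,
    features={
        'label': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
    })
image = tf.decode_raw(features['image_raw'], tf.uint8)
image = tf.reshape(image, [28, 28, 1])  # MNIST images are 28x28x1
label = tf.cast(features['label'], tf.int32)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    img, lbl = sess.run([image, label])
    print('first label read from HDFS:', lbl)
    coord.request_stop()
    coord.join(threads)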

III. Example Code for the Distributed Model

This example reads the MNIST data from HDFS, sets up the corresponding parameter-server and worker cluster, and builds a three-layer deep network consisting of two convolutional layers and one softmax layer. The code is as follows:

from __future__ import