DL&ML

模型评估

模型评估此节内容只针对分类模型，使用sklearn库 1、准确率 accuracy_score函数计算精度，在多标签分类中，该函数返回子集精度。如果样本的整个预测标签集与真实的标签集严格匹配，则子集精度为1.0; 否则它是0.0。如果$\hat{y}i$是第$i$类样本预测值，$y_i$是相应的真值，那么正确预测的分数$n\text{samples}$被定义为$$\texttt{accuracy}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples}-1} 1(\hat{y}_i = y_i)$$ import numpy as np from sklearn.metrics import accuracy_score y_pred = [0, 2, 1, 3] y_true = [0, 1, 2, 3] accuracy_score(y_true, y_pred) 0.5 accuracy_score(y_true, y_pred, normalize=False) # 若normalize为False,则返回正确分类的样本数 2 2、混淆矩阵该confusion_matrix函数通过计算混淆矩阵来评估分类准确性，行对应于真正的类，列表示预测值。 from sklearn.metrics import confusion_matrix y_true = [2, 0, 2, 2, 0, 1] y_pred = [0, 0, 2, 2, 0, 2] confusion_matrix(y_true, y_pred) array([[2, 0, 0], [0, 0, 1], [1, 0, 2]]) 3、汉明损失如果$\hat{y}j$是预测为第$j$类的样本，$y_j$是真值，$n\text{labels}$是类别的数目，则两个样本之间的汉明损失定义为：$$L_{Hamming}(y, \hat{y}) = \frac{1}{n_\text{labels}} \sum_{j=0}^{n_\text{labels} - 1} 1(\hat{y}_j \not= y_j)$$ $1(x)$是指标函数...

温度预测

温度预测 1. 观察耶拿天气数据集的数据 import os fname = './data/jena_climate_2009_2016.csv' f = open(fname) data = f.read() f.close() lines = data.split('\n') header = lines[0].split(',') lines = lines[1:] print(header) print(len(lines)) ['"Date Time"', '"p (mbar)"', '"T (degC)"', '"Tpot (K)"', '"Tdew (degC)"', '"rh (%)"', '"VPmax (mbar)"', '"VPact (mbar)"', '"VPdef (mbar)"', '"sh (g/kg)"', '"H2OC (mmol/mol)"', '"rho (g/m**3)"', '"wv (m/s)"', '"max. wv (m/s)"', '"wd (deg)"'] 420551 # 解析数据 import numpy as np float_data = np.zeros((len(lines), len(header) - 1)) for i, line in enumerate(lines): values = [float(x) for x in line....

理解LSTM层与GRU层

理解LSTM层与GRU层 SimpleRNN的问题在于，在时刻t，理论上来说，它应该能够记住许多时间部之前见过的各种信息，但实际上它是不可能学到这种长期依赖的。其原因在于“梯度消失”问题，这一效应类似于在层数较多的非循环网络中观察到的效应，随着层数的增加，网络最终变得无法训练。 1. LSTM层 LSTM层是SimpleRNN的一种变体，它增加了一种携带信息跨越多个时间步的方法。假设有一条传送带，其运行方向平行于你所处理的序列。序列中的信息可以在任意位置跳上传送带，然后被传送到更晚的时间步，并在需要时原封不动地跳回来。这实际上就是LSTM的原理：它保存信息以便后面使用，从而防止较早期的信号在处理过程中逐渐消失。 1.1 Keras 中一个LSTM的例子 # 准备数据 from keras.datasets import imdb from keras.preprocessing import sequence max_features = 10000 maxlen = 500 batch_size = 32 print('Loading data...') (input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features) print(len(input_train), 'train_sequences') print(len(input_test), 'test sequences') print('Pad sequences (samples x time)') input_train = sequence.pad_sequences(input_train, maxlen=maxlen) input_test = sequence.pad_sequences(input_test, maxlen=maxlen) print('input_train shape: ', input_train.shape) print('input_test shape:', input_test.shape) Loading data... 25000 train_sequences 25000 test sequences Pad sequences (samples x time) input_train shape: (25000, 500) input_test shape: (25000, 500) # 使用Keras中的LSTM层 from keras....

理解循环神经网络

理解循环神经网络 1. 简单的循环神经网络 RNN以渐进的方式处理信息，同时保存一个关于所处理内容的内部模型，这个模型是根据过去的信息构建的，并随着新信息的进入而不断更新。 RNN处理序列的方式是：遍历所有序列元素，并保存一个状态（State），其中包含与已查看内容相关的信息。 RNN的伪代码： state_t = 0 for input_t in input_sequence: output_t = f(input_t, state_t) state_t = output_t 可以给出具体的函数f,从输入和状态到输出的变换，其参数包括两个矩阵（W和U）和一个偏置向量。它类似于前馈网络中密集连接层所做的变换。 state_t = 0 for input_t in input_sequence: output_t = activation(dot(W, input_t) + dot(U, state_t) + b) state_t = output_t # 简单RNN的numpy实现 import numpy as np timesteps = 100 # 输入序列的时间步数 input_features = 32 # 输入特征空间的维度 output_features = 64 # 输出特征空间的维度 inputs = np.random.random((timesteps, input_features)) # 输入数据：随机噪声，仅作为示例 state_t = np.zeros((output_features,)) # 初试状态：全零向量 # 创建随机的权重矩阵 W = np....