1 Specifying the name of the network used for optimization
Set Network to the name of the network created on the EDIT tab.
2 Specifying the name of the dataset used for optimization
Set Dataset to the name of the dataset loaded on the DATASET tab.
3 Specifying the parameter update method
From the Config list, select Optimizer.
Select an updater from the following (“Adam” is used by default).
| Updater | Update expression | Reference |
| --- | --- | --- |
| Adadelta | $$g_t \leftarrow \Delta w_t\\ G_t \leftarrow \gamma G_{t-1} + (1 - \gamma) g_t^2\\ v_t \leftarrow \frac{\sqrt{H_{t-1} + \epsilon}}{\sqrt{G_t + \epsilon}} g_t\\ H_t \leftarrow \gamma H_{t-1} + (1 - \gamma) v_t^2\\ w_{t+1} \leftarrow w_t - \eta v_t$$ | Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method |
| Adagrad | $$g_t \leftarrow \Delta w_t\\ G_t \leftarrow G_{t-1} + g_t^2\\ w_{t+1} \leftarrow w_t - \frac{\eta}{\sqrt{G_t} + \epsilon} g_t$$ | John Duchi, Elad Hazan and Yoram Singer. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization |
| Adam | $$m_t \leftarrow \beta_1 m_{t-1} + (1 - \beta_1) g_t\\ v_t \leftarrow \beta_2 v_{t-1} + (1 - \beta_2) g_t^2\\ w_{t+1} \leftarrow w_t - \alpha \frac{\sqrt{1 - \beta_2^t}}{1 - \beta_1^t} \frac{m_t}{\sqrt{v_t} + \epsilon}$$ | Kingma and Ba. Adam: A Method for Stochastic Optimization |
| Adamax | $$m_t \leftarrow \beta_1 m_{t-1} + (1 - \beta_1) g_t\\ v_t \leftarrow \max\left(\beta_2 v_{t-1}, \vert g_t \vert\right)\\ w_{t+1} \leftarrow w_t - \frac{\alpha}{1 - \beta_1^t} \frac{m_t}{v_t + \epsilon}$$ | Kingma and Ba. Adam: A Method for Stochastic Optimization |
| Momentum | $$v_t \leftarrow \gamma v_{t-1} + \eta \Delta w_t\\ w_{t+1} \leftarrow w_t - v_t$$ | Ning Qian. On the momentum term in gradient descent learning algorithms |
| Nag | $$v_t \leftarrow \gamma v_{t-1} - \eta \Delta w_t\\ w_{t+1} \leftarrow w_t - \gamma v_{t-1} + \left(1 + \gamma\right) v_t$$ | Yurii Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2) |
| RMSprop | $$g_t \leftarrow \Delta w_t\\ v_t \leftarrow \gamma v_{t-1} + \left(1 - \gamma\right) g_t^2\\ w_{t+1} \leftarrow w_t - \eta \frac{g_t}{\sqrt{v_t} + \epsilon}$$ | Geoff Hinton. Lecture 6a: Overview of mini-batch gradient descent http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf |
| Sgd | $$w_{t+1} \leftarrow w_t - \eta \Delta w_t$$ | |
w: Parameter to be updated
g: Gradient
η, α: Learning Rate, Alpha (the learning rate)
γ, β1, β2: Momentum or Decay, Beta1, Beta2 (momentum and decay parameters)
ε: Epsilon (a small value used to prevent division by zero)
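As a reference for how these symbols map onto an implementation, the following is a minimal NumPy sketch of the Adam update expression in the table above. The function name and variable names are illustrative, not an NNabla API; the default values shown are those from Kingma and Ba's paper.

```python
import numpy as np

def adam_step(w, g, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update following the table's expression.

    w: parameter, g: gradient, m/v: running moment estimates, t: 1-based step count.
    """
    m = beta1 * m + (1 - beta1) * g             # m_t <- beta1*m_{t-1} + (1-beta1)*g_t
    v = beta2 * v + (1 - beta2) * g ** 2        # v_t <- beta2*v_{t-1} + (1-beta2)*g_t^2
    step = alpha * np.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)   # bias-corrected step size
    w = w - step * m / (np.sqrt(v) + eps)       # w_{t+1} <- w_t - step * m_t / (sqrt(v_t)+eps)
    return w, m, v
```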
4 Setting the Weight Decay (L2 regularization) strength
Specify the weight decay coefficient in Weight Decay.
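As a sketch of what this coefficient does: L2 regularization adds the term weight_decay × w to each gradient before the selected updater is applied, which is equivalent to adding (weight_decay / 2) × ‖w‖² to the training loss. The function below is only illustrative, not NNabla code.

```python
def apply_weight_decay(grad, w, weight_decay):
    """Add the gradient of the L2 penalty, weight_decay * w, to the loss gradient.

    grad and w are NumPy arrays (or compatible); this runs before the updater's step.
    """
    return grad + weight_decay * w
```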
5 Gradually decaying the learning rate
Specify the factor by which the learning rate is decayed in Learning Rate Multiplier, and specify the interval at which the decay is applied, in number of mini-batches, in LR Update Interval (NNabla only). For example, to multiply the learning rate by 0.9999 every mini-batch, set Learning Rate Multiplier to 0.9999 and LR Update Interval to 1. To make the learning rate 10 times smaller every 20 epochs, set Learning Rate Multiplier to 0.1 and LR Update Interval to (number of training data samples ÷ mini-batch size) × 20, that is, the number of mini-batches per epoch × 20.
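As a check on that arithmetic, the sketch below computes the learning rate in effect at a given mini-batch from a base rate, a Learning Rate Multiplier, and an LR Update Interval. The function itself is illustrative only; it simply models "multiply by the multiplier once per interval".

```python
def effective_learning_rate(base_lr, multiplier, update_interval, iteration):
    """Learning rate in effect at a given mini-batch (iteration is 0-based).

    The rate is multiplied by `multiplier` once every `update_interval` mini-batches.
    """
    return base_lr * multiplier ** (iteration // update_interval)

# Example: 1000 training samples, mini-batch size 100 -> 10 mini-batches per epoch.
# To make the rate 10x smaller every 20 epochs: multiplier 0.1, interval 10 * 20 = 200.
print(effective_learning_rate(0.001, 0.1, 200, 199))   # 0.001   (still epoch 19)
print(effective_learning_rate(0.001, 0.1, 200, 200))   # ~0.0001 (from epoch 20 on)
```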
6 Updating parameters once every several mini-batches
Specify the parameter update interval in Update Interval. For example, to calculate four gradients using mini-batches containing 64 data samples and then update the parameters using these gradients every four mini-batches, set Batch Size to 64 and Update Interval to 4.
Notes
In order to perform optimization using multiple training networks, the Update Interval must be set to 1.
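The sketch below illustrates the Batch Size 64 / Update Interval 4 example above: gradients from consecutive mini-batches are accumulated and a single parameter update is applied once per interval. A plain SGD step and a toy least-squares gradient stand in for whatever network and updater are configured; this is not NNabla code.

```python
import numpy as np

def train_with_update_interval(batches, w, lr=0.01, update_interval=4):
    """Accumulate gradients over `update_interval` mini-batches, then update once."""
    def grad_fn(w, batch):
        x, y = batch                      # one mini-batch of inputs and targets
        return x.T @ (x @ w - y) / len(y) # least-squares gradient as a stand-in

    accumulated = np.zeros_like(w)
    for i, batch in enumerate(batches, start=1):
        accumulated += grad_fn(w, batch)
        if i % update_interval == 0:      # one update every `update_interval` mini-batches
            w = w - lr * accumulated / update_interval
            accumulated = np.zeros_like(w)
    return w
```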
7 Adding a new optimizer
Click the hamburger menu (≡) or right-click the Config list to open a shortcut menu, and click Add Optimizer.
8 Renaming an optimizer
- Click the hamburger menu (≡) or right-click the Config list to open a shortcut menu, and click Rename.
- Alternatively, on the Config list, double-click the optimizer you want to rename.
- Type the new name, and press Enter.
9 Deleting an optimizer
- From the Config list, select the optimizer you want to delete.
- Click the hamburger menu (≡) or right-click the Config list to open a shortcut menu, and click Delete.
- Alternatively, press Delete on the keyboard.