
CS231n Assignment1--Q4

2019-11-06 07:35:44

Q4: Two-Layer Neural Network

two_layer_net.ipynb

Forward pass: compute scores

Your scores:
[[-0.81233741 -1.27654624 -0.70335995]
 [-0.17129677 -1.18803311 -0.47310444]
 [-0.51590475 -1.01354314 -0.8504215 ]
 [-0.15419291 -0.48629638 -0.52901952]
 [-0.00618733 -0.12435261 -0.15226949]]

correct scores:
[[-0.81233741 -1.27654624 -0.70335995]
 [-0.17129677 -1.18803311 -0.47310444]
 [-0.51590475 -1.01354314 -0.8504215 ]
 [-0.15419291 -0.48629638 -0.52901952]
 [-0.00618733 -0.12435261 -0.15226949]]

Difference between your scores and correct scores: 3.68027204961e-08
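The scores checked above come from a two-layer net: an affine layer, a ReLU, and a second affine layer. A minimal NumPy sketch of that computation (toy shapes assumed; `forward_scores` is an illustrative helper, not the assignment's API):

```python
import numpy as np

def forward_scores(X, W1, b1, W2, b2):
    """Affine -> ReLU -> affine, returning unnormalized class scores."""
    hidden = np.maximum(0, X.dot(W1) + b1)  # ReLU hidden activations
    return hidden.dot(W2) + b2              # (N, C) class scores

# Toy shapes: N=5 inputs, D=4 features, H=10 hidden units, C=3 classes
rng = np.random.RandomState(0)
X = rng.randn(5, 4)
W1, b1 = 1e-4 * rng.randn(4, 10), np.zeros(10)
W2, b2 = 1e-4 * rng.randn(10, 3), np.zeros(3)
print(forward_scores(X, W1, b1, W2, b2).shape)  # (5, 3)
```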

Forward pass: compute loss

Difference between your loss and correct loss: 1.79412040779e-13
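The loss here is the softmax cross-entropy over the scores plus an L2 penalty on the weights. A hedged sketch (whether the regularization term carries a factor of 0.5 depends on the assignment version):

```python
import numpy as np

def softmax_loss(scores, y, reg, W1, W2):
    """Softmax data loss plus L2 regularization on both weight matrices."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numeric stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    N = scores.shape[0]
    data_loss = -np.log(probs[np.arange(N), y]).mean()
    reg_loss = reg * (np.sum(W1 * W1) + np.sum(W2 * W2))
    return data_loss + reg_loss

# Sanity check: with all-zero scores every class is equally likely,
# so with reg = 0 the loss is log(C) = log(3)
scores = np.zeros((5, 3))
y = np.array([0, 1, 2, 0, 1])
print(softmax_loss(scores, y, 0.0, np.zeros((4, 10)), np.zeros((10, 3))))
```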

Backward pass

W1 max relative error: 3.669858e-09
W2 max relative error: 3.440708e-09
b2 max relative error: 3.865039e-11
b1 max relative error: 1.125423e-09
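The small relative errors above come from comparing analytic gradients against numeric ones. A self-contained sketch of the backward pass with a central-difference check on b1 (toy shapes; `loss_and_grads` is illustrative, not the assignment's API):

```python
import numpy as np

def loss_and_grads(X, y, W1, b1, W2, b2, reg):
    """Forward pass plus analytic gradients for a two-layer ReLU net."""
    N = X.shape[0]
    hidden = np.maximum(0, X.dot(W1) + b1)
    scores = hidden.dot(W2) + b2
    shifted = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean() \
        + reg * (np.sum(W1 * W1) + np.sum(W2 * W2))

    dscores = probs.copy()               # gradient of softmax loss w.r.t. scores
    dscores[np.arange(N), y] -= 1
    dscores /= N
    grads = {'W2': hidden.T.dot(dscores) + 2 * reg * W2,
             'b2': dscores.sum(axis=0)}
    dhidden = dscores.dot(W2.T)
    dhidden[hidden <= 0] = 0             # ReLU gate: no gradient where input <= 0
    grads['W1'] = X.T.dot(dhidden) + 2 * reg * W1
    grads['b1'] = dhidden.sum(axis=0)
    return loss, grads

rng = np.random.RandomState(1)
X, y = rng.randn(5, 4), np.array([0, 1, 2, 2, 1])
W1, b1 = 0.1 * rng.randn(4, 10), 0.1 * rng.randn(10)
W2, b2 = 0.1 * rng.randn(10, 3), 0.1 * rng.randn(3)
loss, grads = loss_and_grads(X, y, W1, b1, W2, b2, reg=0.05)

# Central-difference numeric gradient for b1
h = 1e-5
num_b1 = np.zeros_like(b1)
for i in range(b1.size):
    bp, bm = b1.copy(), b1.copy()
    bp[i] += h
    bm[i] -= h
    lp, _ = loss_and_grads(X, y, W1, bp, W2, b2, reg=0.05)
    lm, _ = loss_and_grads(X, y, W1, bm, W2, b2, reg=0.05)
    num_b1[i] = (lp - lm) / (2 * h)

rel_err = np.abs(num_b1 - grads['b1']).max() / max(1e-8, np.abs(num_b1).max())
print(rel_err)
```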

Final training loss: 0.0171496079387

[figure: training loss curve on the toy dataset]

Load the data

Train data shape: (49000, 3072)
Train labels shape: (49000,)
Validation data shape: (1000, 3072)
Validation labels shape: (1000,)
Test data shape: (1000, 3072)
Test labels shape: (1000,)
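The (N, 3072) shapes come from flattening each 32x32x3 CIFAR-10 image into one row and zero-centering with the training mean. A small sketch of that preprocessing (a stand-in array of 100 images replaces the real 49000-image split):

```python
import numpy as np

N = 100  # stand-in; the real training split has 49000 images
X_raw = np.zeros((N, 32, 32, 3), dtype=np.float32)  # stand-in for CIFAR-10 pixels
X = X_raw.reshape(N, -1)       # (N, 3072): one row per flattened image
mean_image = X.mean(axis=0)    # per-pixel mean over the training set
X = X - mean_image             # zero-center the data
print(X.shape)  # (100, 3072)
```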

Train a network

iteration 0 / 1000: loss 2.302970
iteration 100 / 1000: loss 2.302474
iteration 200 / 1000: loss 2.297076
iteration 300 / 1000: loss 2.257328
iteration 400 / 1000: loss 2.230484
iteration 500 / 1000: loss 2.150620
iteration 600 / 1000: loss 2.080736
iteration 700 / 1000: loss 2.054914
iteration 800 / 1000: loss 1.979290
iteration 900 / 1000: loss 2.039101
Validation accuracy: 0.287
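Each of the 1000 iterations logged above performs one SGD step on a sampled minibatch. A minimal sketch of that update (stand-in parameters and gradients; in the real loop `net.loss` supplies `grads`):

```python
import numpy as np

rng = np.random.RandomState(0)
num_train, batch_size, learning_rate = 49000, 200, 1e-4

# Sample a minibatch of indices (with replacement, as in the assignment)
batch_idx = rng.choice(num_train, batch_size)
# X_batch, y_batch = X_train[batch_idx], y_train[batch_idx]
# loss, grads = net.loss(X_batch, y_batch, reg=reg)

params = {'W1': np.ones(3), 'b1': np.zeros(3)}          # stand-in parameters
grads = {'W1': np.full(3, 2.0), 'b1': np.full(3, 0.5)}  # stand-in gradients
for name in params:
    params[name] -= learning_rate * grads[name]  # vanilla SGD step

learning_rate *= 0.95  # learning_rate_decay, applied once per epoch
print(params['W1'][0])  # 1 - 1e-4 * 2.0 = 0.9998
```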

Debug the training

[figure: loss history]

[figure: train / validation classification accuracy history]
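The two debug plots are drawn from the `stats` dict that `net.train()` returns. A hedged sketch with stand-in histories (the key names `loss_history`, `train_acc_history`, and `val_acc_history` match this assignment):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Stand-in training statistics shaped like the dict net.train() returns
stats = {
    'loss_history': [2.3 * 0.999 ** i for i in range(1000)],
    'train_acc_history': [0.1 + 0.03 * i for i in range(10)],
    'val_acc_history': [0.1 + 0.025 * i for i in range(10)],
}

plt.subplot(2, 1, 1)
plt.plot(stats['loss_history'])
plt.title('Loss history')
plt.xlabel('Iteration')
plt.ylabel('Loss')

plt.subplot(2, 1, 2)
plt.plot(stats['train_acc_history'], label='train')
plt.plot(stats['val_acc_history'], label='val')
plt.title('Classification accuracy history')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.tight_layout()
plt.savefig('debug_training.png')
```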

Tune your hyperparameters
best_net = None  # store the best model into this

hidden_size_choice = [400]
learning_rate_choice = [3e-3]
reg_choice = [0.02, 0.05, 0.1]
batch_size_choice = [500]
num_iters_choice = [1200]
best_acc = -1
best_stats = None
input_size = 32 * 32 * 3

#################################################################################
# TODO: Tune hyperparameters using the validation set. Store your best trained  #
# model in best_net.                                                            #
#                                                                               #
# To help debug your network, it may help to use visualizations similar to the  #
# ones we used above; these visualizations will have significant qualitative    #
# differences from the ones we saw above for the poorly tuned network.          #
#                                                                               #
# Tweaking hyperparameters by hand can be fun, but you might find it useful to  #
# write code to sweep through possible combinations of hyperparameters          #
# automatically like we did on the previous exercises.                          #
#################################################################################
for batch_size_curr in batch_size_choice:
    for reg_cur in reg_choice:
        for learning_rate_curr in learning_rate_choice:
            for hidden_size_curr in hidden_size_choice:
                for num_iters_curr in num_iters_choice:
                    print
                    print "current training hidden_size:", hidden_size_curr
                    print "current training learning_rate:", learning_rate_curr
                    print "current training reg:", reg_cur
                    print "current training batch_size:", batch_size_curr
                    net = TwoLayerNet(input_size, hidden_size_curr, num_classes)
                    stats = net.train(X_train, y_train, X_val, y_val,
                                      num_iters=num_iters_curr,
                                      batch_size=batch_size_curr,
                                      learning_rate=learning_rate_curr,
                                      learning_rate_decay=0.95,
                                      reg=reg_cur, verbose=True)
                    val_acc = (net.predict(X_val) == y_val).mean()
                    print "current val_acc:", val_acc
                    if val_acc > best_acc:
                        best_acc = val_acc
                        best_net = net
                        best_stats = stats
                    print
                    print "best_acc:", best_acc
                    print "best hidden_size:", best_net.params['W1'].shape[1]
                    # hyper_params assumes train() was modified to record its arguments;
                    # the stock TwoLayerNet does not keep them
                    print "best learning_rate:", best_net.hyper_params['learning_rate']
                    print "best reg:", best_net.hyper_params['reg']
                    print "best batch_size:", best_net.hyper_params['batch_size']
#################################################################################
#                              END OF YOUR CODE                                 #
#################################################################################

current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.02
current training batch_size: 500
iteration 0 / 1200: loss 2.302670
iteration 100 / 1200: loss 1.685716
iteration 200 / 1200: loss 1.599757
iteration 300 / 1200: loss 1.385544
iteration 400 / 1200: loss 1.479385
iteration 500 / 1200: loss 1.466029
iteration 600 / 1200: loss 1.456854
iteration 700 / 1200: loss 1.309732
iteration 800 / 1200: loss 1.236479
iteration 900 / 1200: loss 1.221071
iteration 1000 / 1200: loss 1.210234
iteration 1100 / 1200: loss 1.123294
current val_acc: 0.5

best_acc: 0.5
best hidden_size: 400
best learning_rate: 0.003
best reg: 0.02
best batch_size: 500

current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.05
current training batch_size: 500
iteration 0 / 1200: loss 2.302935
iteration 100 / 1200: loss 1.693358
iteration 200 / 1200: loss 1.509740
iteration 300 / 1200: loss 1.572148
iteration 400 / 1200: loss 1.495700
iteration 500 / 1200: loss 1.400046
iteration 600 / 1200: loss 1.370000
iteration 700 / 1200: loss 1.249708
iteration 800 / 1200: loss 1.305766
iteration 900 / 1200: loss 1.342539
iteration 1000 / 1200: loss 1.277757
iteration 1100 / 1200: loss 1.232157
current val_acc: 0.512

best_acc: 0.512
best hidden_size: 400
best learning_rate: 0.003
best reg: 0.05
best batch_size: 500

current training hidden_size: 400
current training learning_rate: 0.003
current training reg: 0.1
current training batch_size: 500
iteration 0 / 1200: loss 2.303187
iteration 100 / 1200: loss 1.815929
iteration 200 / 1200: loss 1.736408
iteration 300 / 1200: loss 1.503271
iteration 400 / 1200: loss 1.571691
iteration 500 / 1200: loss 1.474189
iteration 600 / 1200: loss 1.478976
iteration 700 / 1200: loss 1.355830
iteration 800 / 1200: loss 1.261623
iteration 900 / 1200: loss 1.272220
iteration 1000 / 1200: loss 1.303129
iteration 1100 / 1200: loss 1.320341
current val_acc: 0.517

best_acc: 0.517
best hidden_size: 400
best learning_rate: 0.003
best reg: 0.1
best batch_size: 500

[figure: visualized first-layer weights of the best network]
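The final figure typically shows the learned first-layer weights: each column of W1 (3072 x H) maps back to a 32x32x3 image tile. A sketch of that reshape, with a random W1 standing in for `best_net.params['W1']`:

```python
import numpy as np

H = 400  # hidden size of the best network found above
W1 = np.random.randn(32 * 32 * 3, H)  # stand-in for best_net.params['W1']

# Undo the flattening: pixel index -> (row, col, channel), filters on axis 0
tiles = W1.reshape(32, 32, 3, H).transpose(3, 0, 1, 2)  # (H, 32, 32, 3)

# Normalize each value into [0, 255] so the tiles can be shown as images
low, high = tiles.min(), tiles.max()
tiles = 255.0 * (tiles - low) / (high - low)
print(tiles.shape)  # (400, 32, 32, 3)
```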

