If you run this notebook you can train, interrupt the kernel, evaluate, and continue training later. Comment out the lines where the encoder and decoder are initialized and run `train` again.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"hidden_size = 128\nbatch_size = 32\n\n# Build the vocabularies and the batched training data\ninput_lang, output_lang, train_dataloader = get_dataloader(batch_size)\n\nencoder = EncoderRNN(input_lang.n_words, hidden_size).to(device)\ndecoder = AttnDecoderRNN(hidden_size, output_lang.n_words).to(device)\n\n# Train for 80 epochs, reporting and plotting the loss every 5 epochs\ntrain(train_dataloader, encoder, decoder, 80, print_every=5, plot_every=5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Set the dropout layers to `eval` mode before evaluating:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"encoder.eval()\ndecoder.eval()\nevaluateRandomly(encoder, decoder)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Visualizing Attention\n=====================\n\nA useful property of the attention mechanism is that its outputs are\nhighly interpretable. Because attention is used to weight specific\nencoder outputs of the input sequence, we can see where the network\nfocuses most at each time step.\n\nYou could simply run `plt.matshow(attentions)` to see the attention\noutput displayed as a matrix. For a better viewing experience we will do\nthe extra work of adding axes and labels:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def showAttention(input_sentence, output_words, attentions):\n fig = plt.figure()\n ax = fig.add_subplot(111)\n cax = ax.matshow(attentions.cpu().numpy(), cmap='bone')\n fig.colorbar(cax)\n\n # Set up axes\n ax.set_xticklabels([''] + input_sentence.split(' ') +\n ['