Args:
dictionary (~fairseq.data.Dictionary): encoding dictionary
embed_dim (int, optional): embedding dimension
embed_dict (str, optional): filename from which to load pre-trained
embeddings
max_positions (int, optional): maximum supported input sequence length
convolutions (list, optional): the convolutional layer structure. Each
list item `i` corresponds to convolutional layer `i`. Layers are
given as ``(out_channels, kernel_width, [residual])``. Residual