A batched softmax wrapper that masks out the probabilities of padding positions.

For instance, consider a batch of three sequences in which X is a real token and A is padding:

XXXXAA
XXAAAA
XXXXXX


MaskedSoftmax ensures that no probability is given to the A's, so each row's probabilities sum to 1 over its X positions only.

For this example, sourceSizes is {4, 2, 6} and sourceLength is 6.

• sourceSizes - the true (unpadded) length of each sequence in the batch (with left padding).
• sourceLength - the padded length shared by every sequence in the batch, i.e. the longest sequence.
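
The masking described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the module's actual implementation: the function name masked_softmax, its argument names, and the left_padded flag are all hypothetical. The diagram draws the padding at the end of each row while the parameter list mentions left padding, so the sketch accepts either layout through that flag.

```python
import math


def masked_softmax(scores, source_sizes, source_length, left_padded=False):
    """Row-wise softmax that gives zero probability to padding positions.

    scores        -- list of rows, each holding source_length raw scores
    source_sizes  -- true (unpadded) length of each row
    source_length -- padded length shared by all rows
    left_padded   -- hypothetical flag: True if padding comes first in a row,
                     False if it comes last (as drawn in the diagram)
    """
    probs = []
    for row, size in zip(scores, source_sizes):
        if len(row) != source_length:
            raise ValueError("every row must have source_length entries")
        if not 0 < size <= source_length:
            raise ValueError("each size must be in 1..source_length")

        # Indices of the real (non-padding) positions in this row.
        pad = source_length - size
        real = range(pad, source_length) if left_padded else range(size)

        # Subtract the largest real score before exponentiating, for stability.
        peak = max(row[i] for i in real)
        exps = [0.0] * source_length  # padding keeps a weight of exactly zero
        for i in real:
            exps[i] = math.exp(row[i] - peak)

        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


# The batch from the example: sourceSizes = {4, 2, 6}, sourceLength = 6.
scores = [[0.0] * 6 for _ in range(3)]
probs = masked_softmax(scores, [4, 2, 6], 6)
# probs[0] is [0.25, 0.25, 0.25, 0.25, 0.0, 0.0]
# probs[1] is [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
```

With uniform scores, each row spreads its probability evenly over its real positions, which makes the effect of the mask easy to see. A batched tensor implementation would typically build the same mask once and apply it to the whole score matrix instead of looping over rows.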