Which of the following statements about the standard normal distribution are true?
Maximum likelihood estimation (MLE) requires knowledge of the sample data's distribution type.
Which of the following statements about the roles of layer normalization and residual connections in the Transformer is true?
Overfitting is a condition in which a model is overly simple, resulting in excessive generalization error.