Huawei Related Exams
H13-321_V2.5 Exam
In 2017, the Google machine translation team proposed the Transformer in their paper *Attention Is All You Need*. The Transformer dispenses with recurrence (LSTM) and convolution (CNN) entirely, relying solely on attention mechanisms.
If OpenCV is used to read an image and save it to the variable "img" during image preprocessing, (h, w) = img.shape[:2] can be used to obtain the image height and width.
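A minimal sketch of the shape idiom above. OpenCV's imread returns a NumPy array laid out as (height, width, channels) for a color image; since no image file is given here, a synthetic array of assumed size 480x640 stands in for the loaded image.

```python
import numpy as np

# Simulated stand-in for img = cv2.imread("some_image.png"):
# a hypothetical 480x640 3-channel (BGR) image.
img = np.zeros((480, 640, 3), dtype=np.uint8)

# Slicing the first two shape entries yields (height, width),
# matching the (h, w) = img.shape[:2] idiom in the statement.
(h, w) = img.shape[:2]
print(h, w)  # prints: 480 640
```

Note that the height comes first, so unpacking as (h, w) rather than (w, h) is the correct order.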
Which of the following statements about the multi-head attention mechanism of the Transformer are true?
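For reference when judging such statements, the following is an illustrative sketch of multi-head attention: each head applies scaled dot-product attention with its own (here randomly drawn, purely hypothetical) projection weights, and the head outputs are concatenated and passed through a final linear projection. Shapes and weight initialization are assumptions for illustration only.

```python
import numpy as np

def multi_head_attention(x, num_heads, rng):
    # x: (seq_len, d_model); d_model must divide evenly across the heads.
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_k = d_model // num_heads
    outputs = []
    for _ in range(num_heads):
        # Per-head query/key/value projections (random weights for the sketch).
        Wq = rng.standard_normal((d_model, d_k))
        Wk = rng.standard_normal((d_model, d_k))
        Wv = rng.standard_normal((d_model, d_k))
        Q, K, V = x @ Wq, x @ Wk, x @ Wv
        # Scaled dot-product attention scores, softmax over the key axis.
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ V)              # (seq_len, d_k) per head
    # Concatenate all heads, then mix them with an output projection.
    concat = np.concatenate(outputs, axis=-1)    # (seq_len, d_model)
    Wo = rng.standard_normal((d_model, d_model))
    return concat @ Wo

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                  # 4 tokens, d_model = 8
out = multi_head_attention(x, num_heads=2, rng=rng)
print(out.shape)  # prints: (4, 8)
```

The key property the sketch shows is that the output has the same (seq_len, d_model) shape as the input, because each of the heads attends in a lower-dimensional subspace of size d_model / num_heads before concatenation.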