Conference paper
Beyond RNNs: Positional self-attention with co-attention for video question answering