Artificial neural networks are increasingly explored for learning relationships in temporal sequence data for purposes of classification, prediction, and anomaly detection, with the hope of exceeding the performance of more traditional machine learning algorithms. While Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks remain the preferred choices of many researchers, such recurrent networks are sub-optimal at learning relationships within and across longer sequences. Transformer neural networks, originally designed to improve the performance of natural language processing tasks, present an interesting alternative, as their attention mechanisms are better able to capture context and meaning within longer sequences. These properties create opportunities to also apply transformer networks to temporal sequences of financial asset prices. This thesis introduces an extension of the original transformer neural network that is capable of multivariate time series representation learning in a supervised learning context, and trains it on temporal sequences of financial asset prices. The prediction accuracy of the transformer extension exceeds that of two of the most popular recurrent neural networks used for temporal sequence prediction. The experiments are conducted in the context of a trading algorithm that showcases the model's practical potential and its implications. As the model is not specific to its input data, opportunities exist to transfer these enhancements to other domains.
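
The attention mechanism referred to above can be illustrated with a minimal sketch of single-head scaled dot-product attention over a multivariate price window. This is not the thesis's model; the random projection matrices stand in for learned weights, and the shapes (50 time steps, 4 features) are illustrative assumptions only.

```python
import numpy as np

def scaled_dot_product_attention(x, d_k=16, seed=0):
    """Single-head self-attention over a multivariate sequence.

    x: array of shape (seq_len, n_features), e.g. a window of asset prices.
    Returns the attended sequence and the attention weights.
    """
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    # Hypothetical random projections standing in for learned Q/K/V weights.
    w_q = rng.standard_normal((n_features, d_k))
    w_k = rng.standard_normal((n_features, d_k))
    w_v = rng.standard_normal((n_features, d_k))

    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d_k)                 # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v, weights

# A toy window of 50 time steps with 4 features (e.g. OHLC prices).
window = np.random.default_rng(1).standard_normal((50, 4))
out, attn = scaled_dot_product_attention(window)
print(out.shape, attn.shape)  # (50, 16) (50, 50)
```

Because every position attends to every other position in one step, the distance between two time steps does not attenuate their interaction, which is the advantage over recurrent architectures that the abstract alludes to.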