The document proposes the Heterogeneous Graph Transformer (HGT), a model for learning on heterogeneous graphs such as the Open Academic Graph. Each HGT layer combines three components: heterogeneous mutual attention, which scores how much each source node should contribute to a target node using parameters that depend on the node and edge types involved; heterogeneous message passing, which computes the type-specific message each source node sends; and target-specific aggregation, which combines the attention-weighted messages into the target node's embedding. HGT is evaluated on several node classification, link prediction, and author disambiguation tasks, where it outperforms baselines including GCN, GAT, R-GCN, HetGNN, and HAN.
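
The three components named above can be illustrated with a minimal single-head sketch in plain Python. This is not the authors' implementation: the function name `hgt_layer_sketch`, the `params` dictionary and its keys, the identity weight matrices, and the toy two-dimensional features are all assumptions made for illustration, and multi-head attention and other details of the full model are omitted.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hgt_layer_sketch(target, neighbors, params):
    """Simplified, hypothetical HGT-style layer for one target node.

    target:    (node_type, features)
    neighbors: list of (node_type, edge_type, features)
    params:    type-indexed weight matrices (names are assumptions)
    """
    t_type, t_feat = target
    d = len(t_feat)
    # Query projection depends on the target node's type.
    q = matvec(params["Q"][t_type], t_feat)

    scores, messages = [], []
    for s_type, e_type, s_feat in neighbors:
        # 1) Heterogeneous mutual attention: the key projection depends on
        #    the source node type, then on the edge type.
        k = matvec(params["W_att"][e_type], matvec(params["K"][s_type], s_feat))
        scores.append(dot(q, k) / math.sqrt(d))
        # 2) Heterogeneous message passing: the message projection also
        #    depends on the source node type and the edge type.
        m = matvec(params["W_msg"][e_type], matvec(params["M"][s_type], s_feat))
        messages.append(m)

    # 3) Target-specific aggregation: softmax over all neighbors, weighted
    #    sum of messages, target-type projection, residual connection.
    weights = softmax(scores)
    agg = [sum(w * msg[i] for w, msg in zip(weights, messages)) for i in range(d)]
    out = matvec(params["A"][t_type], agg)
    return [o + x for o, x in zip(out, t_feat)], weights

# Toy example: identity matrices stand in for learned weights (assumption).
I = [[1.0, 0.0], [0.0, 1.0]]
params = {
    "Q": {"paper": I, "author": I},
    "K": {"paper": I, "author": I},
    "M": {"paper": I, "author": I},
    "A": {"paper": I, "author": I},
    "W_att": {"writes": I, "cites": I},
    "W_msg": {"writes": I, "cites": I},
}
target = ("paper", [1.0, 0.0])
neighbors = [
    ("author", "writes", [1.0, 0.0]),  # author linked to the paper
    ("paper", "cites", [0.0, 1.0]),    # paper linked by a citation edge
]
new_embedding, weights = hgt_layer_sketch(target, neighbors, params)
print(new_embedding, weights)
```

In this toy setup the first neighbor's key aligns with the target's query, so it receives the larger attention weight. In a trained model, the separate matrices per node type and edge type are what would let the layer treat an author-paper link differently from a paper-paper link.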