In this paper, we present the Cross-modal Augmented Transformer (CAT), a novel approach to medical report generation. Unlike previous methods that rely on coarse-grained features without human ...