G2ANet
Graph Extraction
Because interactions between agents are local, G2ANet models the relationships between agents as an agent-coordination graph.

The coordination graph is extracted from the agents' individual observation embeddings by a two-stage attention mechanism.

Hard Attention
In the first stage, hard attention uses a Bi-LSTM to sample a binary hard attention weight for each pair of agents.
To enable back-propagation of gradients, this sampling process is approximated with Gumbel-softmax, whose temperature coefficient controls the smoothness of the softmax.
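The Gumbel-softmax relaxation mentioned above can be illustrated with a minimal standard-library sketch. The function name `gumbel_softmax`, the two-element logits (edge off, edge on), and the default temperature are assumptions made for illustration; in G2ANet the logits would come from the Bi-LSTM, which is not modelled here.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed (differentiable) sample from a categorical distribution.

    logits: unnormalized scores, e.g. [edge_off, edge_on] for one agent pair
    tau:    temperature; smaller values push the output towards one-hot
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform sample away from 0 and 1 to avoid log(0).
        u = min(max(rng.random(), 1e-10), 1.0 - 1e-10)
        g = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + g) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a single pair of agents.
sample = gumbel_softmax([0.2, 1.5], tau=0.5, rng=random.Random(0))
```

With a high temperature the output stays close to uniform; as the temperature approaches zero it approaches a one-hot vector, which recovers the binary hard attention weight.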
Soft Attention
In the second stage, soft attention computes a scaled dot product between agent embeddings, building on the hard attention weights from the first stage.
The result forms the final edge weights of the coordination graph, which are passed to the downstream network or module.
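As a rough illustration of this second stage, the sketch below scores one agent against the others with a scaled dot product and then gates the result with the hard attention weights. The name `soft_attention_weights` is hypothetical, the embeddings are assumed to be already projected into query and key space, and multiplying the softmax output by the hard weights is an assumption about how the two stages combine.

```python
import math


def soft_attention_weights(query, keys, hard_weights):
    """Edge weights from one agent to each of the other agents.

    query:        embedding of the attending agent
    keys:         embeddings of the other agents
    hard_weights: hard attention weight (0 or 1, or a relaxed value) per key
    """
    d = len(query)
    # Scaled dot product between the query and every key.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys
    ]
    # Numerically stable softmax over the scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    soft = [e / total for e in exps]
    # Assumption: an edge removed by hard attention gets zero final weight.
    return [h * s for h, s in zip(hard_weights, soft)]


weights = soft_attention_weights(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    hard_weights=[1, 0, 1],
)
```

In this example the second agent is cut by hard attention, so its edge weight is zero regardless of its dot-product score.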
Network Architecture
Based on the weighted coordination graph, G2ANet uses a GNN to integrate the information of neighbouring agents.
Policy | Value |
---|---|
[figure: policy network] | [figure: value network] |
The policy and value networks can then be derived from the GNN-encoded embeddings and trained end-to-end.
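The neighbour-integration step described in this section can be sketched as a weighted sum of neighbour embeddings, using the edge weights of the coordination graph. This is a simplified stand-in for one GNN aggregation step, not the library's implementation; the function name `aggregate_neighbours` and the example values are hypothetical.

```python
def aggregate_neighbours(weights, embeddings):
    """Weighted sum of neighbour embeddings for one agent.

    weights:    edge weight from this agent to each neighbour
    embeddings: one embedding (list of floats) per neighbour
    """
    dim = len(embeddings[0])
    out = [0.0] * dim
    for w, h in zip(weights, embeddings):
        for k in range(dim):
            out[k] += w * h[k]
    return out


# The second neighbour has weight 0, so it contributes nothing.
context = aggregate_neighbours(
    weights=[0.5, 0.0, 0.5],
    embeddings=[[2.0, 0.0], [10.0, 10.0], [0.0, 4.0]],
)
# context == [1.0, 2.0]
```

The aggregated vector would then typically be combined with the agent's own embedding before being fed to the policy or value head.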