
[Figure: graphs of different attention mechanisms]

More Efficient Transformers

As transformer networks move to the fore in applications from language to vision, the time they take to process longer sequences becomes a more pressing issue: standard self-attention scales quadratically with sequence length. A new method lightens the computational load using sparse attention.
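To illustrate the general idea of sparse attention (not the specific method covered in this article), the sketch below implements a simple local-window variant in NumPy: each position attends only to neighbors within a fixed window rather than to all positions, so the number of nonzero attention weights grows linearly with sequence length instead of quadratically. The function name and window size are hypothetical choices for this example.

```python
import numpy as np

def local_attention(Q, K, V, window=2):
    """Sparse attention sketch: each position attends only to positions
    within `window` steps of itself, instead of all n positions (dense).
    Illustrative only; not the method described in the article."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    # Mask out pairs farther apart than the window, leaving O(n * window)
    # active entries instead of the dense O(n^2).
    idx = np.arange(n)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -np.inf
    # Row-wise softmax over the surviving (in-window) scores.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = rng.normal(size=(3, n, d))
out = local_attention(Q, K, V, window=2)
print(out.shape)  # (8, 4)
```

Because the mask zeroes out all weights beyond the window, perturbing a distant value vector leaves nearby outputs unchanged, which is what makes the computation cheap for long sequences.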
