The Perceiver treats text as a sequence of raw bytes rather than traditional word-level tokens, allowing it to understand the meaning of text directly from its individual characters. Two ideas make this practical:

- The model uses a small set of "latent" variables to attend to the much larger input text. This "cross-attention" step decouples the depth of the network from the size of the input, making it much faster for long documents.
- After initially looking at the text, the model repeatedly refines its understanding through "latent transformer" blocks, essentially "thinking" about the data in its own internal space. A minimal sketch of both steps follows this list.
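To make the read-then-refine pattern concrete, here is a minimal sketch in PyTorch. The `MiniPerceiver` class and all dimensions (256-d embeddings, 64 latents, 4 refinement blocks) are illustrative assumptions for this article, not the published configuration, and standard attention modules stand in for the paper's exact blocks.

```python
import torch
import torch.nn as nn

class MiniPerceiver(nn.Module):
    """Toy Perceiver-style encoder: a small latent array reads a long
    byte sequence via cross-attention, then refines itself with
    self-attention blocks. All sizes here are illustrative."""

    def __init__(self, dim=256, num_latents=64, depth=4, num_heads=8):
        super().__init__()
        # Learned latent array: its size is fixed regardless of input length.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        # Byte-level input: one embedding per possible byte value (0-255).
        self.byte_embed = nn.Embedding(256, dim)
        # Cross-attention: latents (queries) attend to the input (keys/values).
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Latent transformer: self-attention over the latents only, so its
        # cost does not depend on the input length.
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.latent_transformer = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, byte_ids):            # byte_ids: (batch, seq_len)
        x = self.byte_embed(byte_ids)       # (batch, seq_len, dim)
        q = self.latents.expand(byte_ids.size(0), -1, -1)
        # One "read" of the input: O(num_latents * seq_len) attention,
        # instead of the O(seq_len^2) of a standard transformer.
        z, _ = self.cross_attn(query=q, key=x, value=x)
        # Repeated refinement in latent space ("thinking" internally).
        return self.latent_transformer(z)   # (batch, num_latents, dim)

model = MiniPerceiver()
bytes_in = torch.randint(0, 256, (2, 2048))  # two documents of 2048 raw bytes
latent_summary = model(bytes_in)
print(latent_summary.shape)                  # torch.Size([2, 64, 256])
```

Note how the input length (2048 bytes) only appears in the single cross-attention read; doubling it leaves the latent transformer's work unchanged, which is the decoupling the list above describes.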
Evolution: Perceiver IO and Perceiver AR

Following the original model, several specialized versions were released: