Eagle (RWKV-5) and Finch (RWKV-6): Marking Substantial Progress in Recurrent Neural Network-Based Language Models by Integrating Multiheaded Matrix-Valued States and Dynamic Data-Driven Recurrence Mechanisms

Large Language Models (LLMs) have transformed Natural Language Processing, but the dominant Transformer architecture has compute and memory costs that scale quadratically with sequence length. While techniques like sparse attention have aimed to reduce this complexity, a new breed of models is...