What Makes AI Smarter? Inside the Training of Language Models


by @rendering

Scaling laws guide language model improvements, with Transformer++ and Mamba showing strong results. Training recipes significantly affect performance, while larger state dimensions improve state space models (SSMs).

Table of Links

Abstract and 1. Introduction

2 State Space Models

3 Selective State Space Models and 3.1 Motivation: Selection as a Means of Compression

3.2 Improving SSMs with Selection

3.3 Efficient Implementation of Selective SSMs

3.4 A Simplified SSM Architecture

3.5 Properties of Selection Mechanisms

3.6 Additional Model Details

4 Empirical Evaluation and 4.1 Synthetic Tasks

4.2 Language Modeling

4.3 DNA Modeling

4.4 Audio Modeling and Generation

4.5 Speed and Memory Benchmarks

4.6 Model Ablations

5 Discussion

6 Conclusion, Acknowledgments and References

A Discussion: Selection Mechanism

B Related Work and B.1 S4 Variants and Derivatives

B.2 SSM Architectures

B ...


Copyright of this story belongs solely to hackernoon.com.