Task Prompt Design For LLM Video Generation
VideoPoet conditions on task-specific prefixes of text, visual, and audio tokens, marked by special task prompts, and applies the training loss only to output tokens such as visual and audio tokens.
Table of Links
3. Model Overview and 3.1. Tokenization
3.2. Language Model Backbone and 3.3. Super-Resolution
4. LLM Pretraining for Generation
5. Experiments
5.2. Pretraining Task Analysis
5.3. Comparison with the State-of-the-Art
5.4. LLM’s Diverse Capabilities in Video Generation and 5.5. Limitations
6. Conclusion, Acknowledgements, and References
4.1. Task Prompt Design
We design a mixture of pretraining tasks, each with a defined prefix input and output. The model conditions on the prefix, and the loss is applied solely to the output. Fig. 2 shows a typical input-output sequence layout. For ...
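The mechanics of this setup are easy to illustrate: concatenate a special task prompt and the conditioning tokens as a prefix, append the target tokens, and mask the loss so only the output positions contribute. The sketch below is a minimal illustration under assumed conventions, not the paper's actual implementation; the special-token ids, the toy vocabulary size, and the helper names `build_sequence` and `masked_lm_loss` are all hypothetical.

```python
# Minimal sketch of prefix-conditioned training with loss masking,
# assuming hypothetical special-token ids and a generic decoder-only LM.
import torch
import torch.nn.functional as F

# Hypothetical special tokens: a task prompt and an output-modality boundary.
TASK_T2V = 0    # e.g. a "text-to-video" task prompt (assumed id)
BOS_VIDEO = 1   # start of the visual-token output (assumed id)

def build_sequence(text_tokens, visual_tokens):
    """Concatenate task prompt, conditioning prefix, and target output tokens.

    Returns the full token sequence and a mask that is 1 only over the
    output (visual) tokens, so the training loss ignores the prefix.
    """
    prefix = [TASK_T2V] + list(text_tokens) + [BOS_VIDEO]
    tokens = prefix + list(visual_tokens)
    loss_mask = [0] * len(prefix) + [1] * len(visual_tokens)
    return torch.tensor(tokens), torch.tensor(loss_mask, dtype=torch.bool)

def masked_lm_loss(logits, tokens, loss_mask):
    """Next-token cross-entropy computed only where loss_mask is set."""
    # Shift so position t predicts token t+1, as in standard causal LM training.
    logits = logits[:-1]
    targets = tokens[1:]
    mask = loss_mask[1:]
    per_token = F.cross_entropy(logits, targets, reduction="none")
    return (per_token * mask).sum() / mask.sum().clamp(min=1)

# Toy usage with random stand-in logits over a 32-token vocabulary.
tokens, mask = build_sequence(text_tokens=[10, 11, 12], visual_tokens=[20, 21, 22, 23])
logits = torch.randn(len(tokens), 32)
print(masked_lm_loss(logits, tokens, mask))
```

Masking rather than truncating the prefix keeps a single sequence format across tasks: swapping the task-prompt token and the choice of which positions count toward the loss is enough to express different input-output pairings.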