Task Prompt Design For LLM Video Generation
VideoPoet conditions on task-specific prefixes of text, visual, and audio tokens, marked by special task prompts, and applies the training loss only to output tokens such as visual and audio tokens.
Table of Links
3. Model Overview and 3.1. Tokenization
3.2. Language Model Backbone and 3.3. Super-Resolution
4. LLM Pretraining for Generation
5. Experiments
5.2. Pretraining Task Analysis
5.3. Comparison with the State-of-the-Art
5.4. LLM’s Diverse Capabilities in Video Generation and 5.5. Limitations
6. Conclusion, Acknowledgements, and References
4.1. Task Prompt Design
We design a mixture of pretraining tasks, each with a defined prefix input and output. The model conditions on the prefix, and the loss is applied solely to the output. Fig. 2 shows a typical input-output sequence layout. For ...
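The mechanics of this setup are easy to illustrate: concatenate a special task prompt and the conditioning tokens as a prefix, append the target tokens, and mask the loss so only the output positions contribute. The sketch below is a minimal illustration under assumed conventions, not the paper's actual implementation; the special-token ids, the toy vocabulary size, and the helper names `build_sequence` and `masked_lm_loss` are all hypothetical.

```python
# Minimal sketch of prefix-conditioned training with loss masking,
# assuming hypothetical special-token ids and a generic decoder-only LM.
import torch
import torch.nn.functional as F

# Hypothetical special tokens: a task prompt and an output-modality boundary.
TASK_T2V = 0    # e.g. a "text-to-video" task prompt (assumed id)
BOS_VIDEO = 1   # start of the visual-token output (assumed id)

def build_sequence(text_tokens, visual_tokens):
    """Concatenate task prompt, conditioning prefix, and target output tokens.

    Returns the full token sequence and a mask that is 1 only over the
    output (visual) tokens, so the training loss ignores the prefix.
    """
    prefix = [TASK_T2V] + list(text_tokens) + [BOS_VIDEO]
    tokens = prefix + list(visual_tokens)
    loss_mask = [0] * len(prefix) + [1] * len(visual_tokens)
    return torch.tensor(tokens), torch.tensor(loss_mask, dtype=torch.bool)

def masked_lm_loss(logits, tokens, loss_mask):
    """Next-token cross-entropy computed only where loss_mask is set."""
    # Shift so position t predicts token t+1, as in standard causal LM training.
    logits = logits[:-1]
    targets = tokens[1:]
    mask = loss_mask[1:]
    per_token = F.cross_entropy(logits, targets, reduction="none")
    return (per_token * mask).sum() / mask.sum().clamp(min=1)

# Toy usage with random stand-in logits over a 32-token vocabulary.
tokens, mask = build_sequence(text_tokens=[10, 11, 12], visual_tokens=[20, 21, 22, 23])
logits = torch.randn(len(tokens), 32)
print(masked_lm_loss(logits, tokens, mask))
```

Masking rather than truncating the prefix keeps a single sequence format across tasks: swapping the task-prompt token and the choice of which positions count toward the loss is enough to express different input-output pairings.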