|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Cryptocurrency News Articles
Pyramid Flow: A New Technique for High-Quality AI Videos
Oct 10, 2024 at 11:23 pm
AI video generation is a computationally intensive task that typically involves modeling large spatiotemporal spaces. Traditional methods often require
A new AI video generation model, Pyramid Flow, was released this week, offering high-quality video clips up to 10 seconds in length — quickly, and all open source.
Developed by a collaboration of researchers from Peking University, Beijing University of Posts and Telecommunications, and Kuaishou Technology — the latter the creator of the well-reviewed proprietary Kling AI video generator — Pyramid Flow leverages a new technique wherein a single AI model generates video in stages, most of them low resolution, saving only a full-res version for the end of its generation process.
It’s available as raw code for download on Hugging Face and Github, and can be run in an inference shell here but requires the user to download and run the model code on their own machine.
At inference, the model can generate a 5-second, 384p video in just 56 seconds—on par with or faster than many full-sequence diffusion counterparts — though Runway’s Gen 3-Alpha Turbo still takes cake in terms of speed of AI video generation, coming in at under one minute and often times 10-20 seconds in our tests.
We haven’t had a chance to test Pyramid Flow yet, but the videos posted by the model creators appear to be incredibly lifelike, high enough resolution, and compelling — analogous to those of proprietary offerings. You can see various examples here on its Github project page.
Indeed, Pyramid Flow is available designed now to download and use — even for commercial/enterprise purposes — and is designed to compete directly with paid proprietary offerings such as Runway’s Gen-3 Alpha, Luma’s Dream Machine, Kling, and Haulio, which can cost hundreds of even thousands of dollars a year for users on unlimited generation subscriptions.
As the race between various AI video providers to gain users continues, Pyramid Flow aims to bring more efficiency and flexibility to developers, artists, and creators seeking advanced video generation capabilities.
A new technique for high-quality AI videos: ‘pyramidal flow matching’
AI video generation is a computationally intensive task that typically involves modeling large spatiotemporal spaces. Traditional methods often require separate models for different stages of the process, which limits flexibility and increases the complexity of training.
Pyramid Flow is built on the concept of pyramidal flow matching, a method that drastically cuts down the computational cost of video generation while maintaining high visual quality, completing the video generation process as a series of “pyramid” stages, with only the final stage operating at full resolution.
It’s described in a pre-reviewed paper, “Pyramidal Flow Matching for Efficient Video Generative Modeling,” submitted to open access science journal arXiv on October 8, 2024.
The authors include Yang Jin, Zhicheng Sun, Ningyuan Li, Kun Xu, Hao Jiang, Nan Zhuang, Quzhe Huang, Yang Song, Yadong Mu, and Zhouchen Lin. Most of these researchers are affiliated with Peking University, while others are from Kuaishou Technology.
As they write, the ability to compress and optimize video generation at different stages leads to faster convergence during training, allowing Pyramid Flow to generate more samples per training batch.
For example, the proposed pyramidal flow reduces the token count by a factor of four compared to traditional diffusion models, which results in more efficient training.
The model can produce 5- to 10-second videos at 768p resolution and 24 frames per second, all while being trained on open-source datasets. Specifically, the paper states that Pyramid Flow was trained on trained on:
In total, the authors curated approximately 10 million single-shot videos.
However, many of these “public” or “open source” datasets have in recent years come under fire from critics for including copyrighted material without permission or informed consent of the copyright holders, and LAION-5B in particular accused of hosting child sexual abuse material.
Separately, Runway is among the companies being sued by artists in a class action lawsuit for training on materials without permission, compensation, or consent — allegedly in violation of U.S. copyright. The case remains being argued in court, for now.
Permissively licensed, open source for commercial usage
Pyramid Flow is released under the MIT License, allowing for a wide range of uses, including commercial applications, modifications, and redistribution, provided the copyright notice is preserved.
This makes Pyramid Flow an attractive option for developers and companies looking to integrate the model into proprietary systems, and could challenge Luma AI and Runway as both look to offer paid application programming interfaces for developers seeking to integrate their proprietary AI video generation technology into customer or employee-facing apps.
Yet those proprietary models already exist as inferences suitable for developers, while Pyramid Flow has a demo inference on Hugging Face, it is not suitable for building full applications atop it and users would need to host their own version of an inference, which
Disclaimer:info@kdj.com
The information provided is not trading advice. kdj.com does not assume any responsibility for any investments made based on the information provided in this article. Cryptocurrencies are highly volatile and it is highly recommended that you invest with caution after thorough research!
If you believe that the content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will delete it promptly.
-
- Feds Bust Four Crypto Firms in Sting Operation, Seize $25 Million in Crypto
- Oct 11, 2024 at 02:25 am
- Federal prosecutors are getting creative in order to catch bad actors in the cryptocurrency market. After receiving a tip from the Securities and Exchange Commission, federal prosecutors set up a fake crypto company to investigate a firm based in Boston.
-
- Lombard Vault, a Bitcoin-Focused Yield Product by Lombard Finance, Surpassed the $100 Million Mark in TVL
- Oct 11, 2024 at 02:25 am
- The product enables users to deposit Wrapped Bitcoin (WBTC), LBTC (a liquid staking token representing Bitcoin deposited with Lombard), and other tokens to leverage decentralized finance (DeFi) strategies
-
- Current Trends in the Cryptocurrency Market Paint Three Different Pictures
- Oct 11, 2024 at 02:25 am
- Cardano is wading through rough waters. Its price hovers below the $0.350 mark and the 100-hour simple moving average. Cardano's price forecast points to potential resistance at $0.3480, complicating any rebound.