Can't Wait for SORA? Try Dream Machine for AI Video Generation
PLUS: Why India's GenAI Market is Set to Explode!
Howdy fellas!
We are officially entering the twenties!
Yes, this is the 20th edition of The Vision, Debugged, and we hit 600 readers this week!
Thanks to all our amazing subscribers for their consistent support, feedback & curiosity! If you found this useful, please share it with a friend and consider subscribing if you haven't already.
Spark and Trouble are back with yet another round of interesting AI titbits!
Here's a sneak peek into this edition:
Product Lab: Luma AI's Dream Machine
Meet the AI candidate standing for election as an MP in the UK
Nvidia releases Nemotron-4 model family
Explore the LLM landscape in India with home-grown models
Time to jump in!
PS: Got thoughts on our content? Share 'em through the quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.
Product Labs: Decoding Dream Machine
Ever since SORA was announced, we've been itching to try it. Bad news: still waiting. So, when Luma AI's Dream Machine launched, we tried it immediately.
What's the biggest advantage of Dream Machine over SORA, Kling and Veo?
It's available for use right now… for FREE!
Product Labs: Decoding the AI Matrix - Luma AI's Dream Machine (source: Created by authors)
Tap the pic to get a better view
What's in it for you?
Luma AI has rapidly emerged as a key player in consumer AI. Known for their Realtime NeRF on the web since February last year, they raised $43 million from a16z in January and launched Genie, a text-to-3D AI model.
Luma AI's Dream Machine is a text-to-video AI tool that empowers users to craft high-quality, realistic video clips from simple text prompts in mere minutes. It generates 120 frames in about 120 seconds, roughly 1 frame per second of compute. The output is realistic and cinematic, but it is currently limited to short clips of about 5 seconds, and each request takes approximately 120 seconds due to high demand.
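To put those numbers in perspective, here's a quick back-of-the-envelope check. The playback rate of roughly 24 fps is our assumption, not a figure Luma has published:

```python
# Back-of-the-envelope math for Dream Machine's throughput.
# ASSUMPTION: clips play back at ~24 fps; Luma hasn't published this figure.
FRAMES_PER_REQUEST = 120   # frames generated per request (per Luma)
GENERATION_FPS = 1         # ~1 frame generated per second under current load
PLAYBACK_FPS = 24          # assumed playback frame rate

wait_seconds = FRAMES_PER_REQUEST / GENERATION_FPS   # ~120 s of waiting
clip_seconds = FRAMES_PER_REQUEST / PLAYBACK_FPS     # ~5 s of video

print(f"Expected wait: ~{wait_seconds:.0f}s for a ~{clip_seconds:.0f}s clip")
```

So the "~5-second clips" and "~120 seconds per request" figures are really two views of the same 120-frame budget.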
Dream Machine Home Page
Dream Machine has both paid and free options: free accounts allow 30 generations per month, while paid plans offer up to 2,000 generations for around $500. Yes, that will certainly show up on your credit card bill next month.
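If you're weighing the plans, the per-clip economics work out roughly like this (plan figures as quoted above; treat them as approximate):

```python
# Rough cost-per-generation comparison, using the plan figures quoted above.
plans = {
    "Free": {"monthly_cost_usd": 0, "generations": 30},
    "Top paid tier": {"monthly_cost_usd": 500, "generations": 2000},
}

for name, plan in plans.items():
    per_clip = plan["monthly_cost_usd"] / plan["generations"]
    print(f"{name}: ~${per_clip:.2f} per generation")
```

At the top tier, that's roughly 25 cents per clip, before counting the retries you'll inevitably burn on bad generations.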
"Drone Shot of a regular day in Ancient Mesopotamia" with Dream Machine (created by authors)
The awesomeness…
There is an Enhance Prompt feature which, under the hood, we assume adds more specifics and detail to your prompt. It's genuinely useful, since text-to-video creation is still relatively new and not everyone is proficient at prompting.
We experimented with and without Enhance Prompt, and there is a significant difference in video quality. What might be nice is surfacing the final enhanced prompt, to help users learn.
Take a guess which does and does not use the "Enhance Prompt" feature (created by authors)
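We don't know how Luma implements Enhance Prompt, but a minimal sketch of the idea, an LLM rewriting a terse prompt into a richer, cinematography-aware one, might look like this. `rewrite_with_llm` is a stand-in for whatever model call you'd actually use; this is not Luma's code:

```python
# Hypothetical sketch of an "Enhance Prompt" step: expand a terse prompt with
# cinematic detail before sending it to the video model. This is NOT Luma's
# implementation; rewrite_with_llm is a placeholder for any LLM call.
ENHANCE_INSTRUCTION = (
    "Rewrite the user's video prompt with concrete details: subject, setting, "
    "camera movement, lighting, mood, and style. Keep it under 60 words."
)

def rewrite_with_llm(system_prompt: str, user_prompt: str) -> str:
    # Placeholder: call your LLM of choice here (hosted API or local model).
    raise NotImplementedError

def enhance_prompt(prompt: str) -> str:
    return rewrite_with_llm(ENHANCE_INSTRUCTION, prompt)

# enhance_prompt("a dog on a beach") might come back as something like:
# "A golden retriever sprinting along a sunlit beach at golden hour, low
#  tracking shot, gentle waves, cinematic shallow depth of field."
```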
Another really cool feature is that you can provide an image along with a prompt and animate it. Dream Machine not only animates the image but also adds new elements to the video based on the scenario it identifies in the image. So you can start with an AI-generated image from Midjourney or Microsoft Designer and then use Dream Machine to add some magic.
Used Designer to generate an image of an old woman underwater and then uploaded it to Dream Machine
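If you'd rather script that image-to-video flow than click through the UI, the overall shape would be something like the sketch below. `LumaClient` and `generate_video` are hypothetical placeholders to illustrate the workflow (pick or generate a keyframe image, then pass it in with a prompt), not a real SDK:

```python
# Hypothetical workflow sketch: animate an existing image with a text prompt.
# LumaClient and generate_video are placeholders, not a real Luma SDK.
from pathlib import Path

class LumaClient:
    def generate_video(self, image_path: Path, prompt: str) -> bytes:
        # Replace this stub with a real upload + poll against whichever API you use.
        return b""

client = LumaClient()
keyframe = Path("old_woman_underwater.png")  # e.g. generated with Microsoft Designer
clip = client.generate_video(
    keyframe,
    prompt="She slowly opens her eyes as fish drift past, soft volumetric light",
)
Path("underwater_clip.mp4").write_bytes(clip)
```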
And now for the kinks…
Luma Labs is not shy about calling out its limitations, thereby admitting that Dream Machine is still a very early-stage product. But with heavily anticipated big names as competitors, Luma has taken a page from Midjourney's playbook and launched an MVP version.
An MVP, in Lean Startup terms, is the simplest version of your product that still gets real user feedback. It's about learning fast, not building everything at once.
Imagine testing a food delivery app with a landing page before coding the whole thing. By validating core ideas with a basic product, you can avoid wasting time and resources on features users might not even want. This lets you focus on what truly matters and build a successful product.
Also, unlike OpenAI, Google or Microsoft, Luma AI does not have access to a large user base for fast iterations and feedback. Launching Dream Machine at the MVP stage is how it starts collecting that feedback.
Dream Machine still has a few limitations, and Luma Labs has been very transparent in disclosing them:
Morphing: The Dream Machine sometimes struggles with accurately rendering transformations, leading to visual distortions when objects change shape or form.
Movement: The AI can have difficulty simulating realistic movements, resulting in unnatural motion like characters sliding instead of walking.
Text: Generating coherent text within videos is a challenge for the Dream Machine, often resulting in inconsistent or irrelevant text displays.
Janus: This refers to the modelās struggle with maintaining narrative consistency, especially in videos that require a coherent storyline or bidirectional context.
Beyond these limitations in their video generation model, we had a few more nitpicky thoughts:
Every video generated has the Luma watermark. While it is surely a way to distinguish AI-generated content from real content (& a clever way to establish a brand name), it could be annoying to the user.
There is no direct option to download any video - you must go via the browser's "save video" option. Adding a simple "Download" button could simplify the UX.
As the rendering time for Dream Machine is currently quite high, it's a bit dull to sit staring at the screen, or to keep switching back to the tab to see if it's done. A good alternative could be leveraging the Labour Illusion and designing an interactive experience for the wait time.
The Labour Illusion states that people value things more when they see the work behind them. Making users wait for something they requested while showing them how it is being prepared creates the appearance of effort. Users are usually more likely to appreciate the results of that effort.
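A minimal sketch of what that could look like on the client side: instead of a blank spinner, cycle through honest, stage-like status messages while polling for the result. The stages below are invented for illustration; they're not what Luma actually reports:

```python
# Minimal "Labour Illusion" sketch: show staged progress messages while a
# long-running generation job completes. Stage names are illustrative only.
import itertools
import time

STAGES = [
    "Parsing your prompt...",
    "Blocking out the scene...",
    "Rendering frames...",
    "Polishing lighting and motion...",
]

def wait_with_labour_illusion(is_done, poll_interval: float = 5.0) -> None:
    for stage in itertools.cycle(STAGES):
        if is_done():
            print("Your video is ready!")
            return
        print(stage)
        time.sleep(poll_interval)

# Usage (with some job object from your backend):
# wait_with_labour_illusion(lambda: job.status == "completed")
```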
As one viral tweet put it:
"Dream Machine by Luma AI is just 3 days old. Now memes are becoming videos. 10 wild examples: 1. Distracted boyfriend..."
Madni Aghadi (@hey_madni), 8:52 AM, Jun 15, 2024
What's the intrigue?
There is already a host of AI-based video generation tools out there, each offering unique features that cater to different needs.
Runway ML stands out for its realistic videos and precise motion control. However, generated videos are only 4 seconds long (with the announcement of Gen-3 this limit might be waived, but until we see it in action, let's stick to 4). Runway does offer many interesting features, though, such as inpainting.
Pika Labs excels in creative freedom and character animation, despite occasional distortions. While Pika can create interesting outputs, the video quality isn't super high. The resolution is capped at 1024 x 576 pixels and the frame rate is 8 frames per second, making it appear choppy.
Stable Video Diffusion is notable for its speed and open-source nature, though it may lack the advanced features of its counterparts. It comes in two models, capable of generating 14 and 25 frames respectively, at customizable frame rates between 3 and 30 frames per second.
Krea AI can also create videos from prompts and images. It's a little more complex than the others, as you need to add a frame along with the text. Krea can generate videos up to 10 seconds long.
Dream Machine is still in its early stages. It is much closer to Pika and Runway than it is to Sora.
The key challenge for users now is picking the right tool from this buffet of options. Here's our take… the best AI video generation platform depends on your specific project requirements (summed up in the snippet after this list):
Need hyper-realistic motion? Dream Machine is your go-to
Craving ultimate control over every detail? Runway is your captain
Stuck in a creative rut? Let Krea spark your imagination
Prioritizing object consistency? Stable Video is your champion
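For quick reference, here's that rule of thumb captured as a tiny lookup, with specs pulled from the comparison above (as reported at the time of writing; they will age quickly):

```python
# The comparison above as a small lookup table (specs as reported at the time
# of writing; they will age quickly).
TOOLS = {
    "Dream Machine": {"max_clip_s": 5, "strength": "hyper-realistic motion"},
    "Runway ML": {"max_clip_s": 4, "strength": "precise control, inpainting"},
    "Pika Labs": {"output": "1024x576 @ 8 fps", "strength": "creative character animation"},
    "Stable Video Diffusion": {"frames": (14, 25), "strength": "speed, open source, consistency"},
    "Krea AI": {"max_clip_s": 10, "strength": "image + text frames, creative exploration"},
}

def recommend(need: str) -> str:
    picks = {
        "realistic motion": "Dream Machine",
        "fine-grained control": "Runway ML",
        "creative spark": "Krea AI",
        "object consistency": "Stable Video Diffusion",
    }
    return picks.get(need, "Try a few and compare!")

print(recommend("realistic motion"))  # -> Dream Machine
```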
Whatcha Got There?!
Buckle up, tech fam! Every week, our dynamic duo 'Spark' & 'Trouble' share some seriously cool learning resources we stumbled upon.
Spark's Selections
Trouble's Tidbits
Your Wish, Our Command
You Asked, We Answered
Question: All the news about the latest developments in AI focuses on the big tech firms and upcoming startups based in Silicon Valley. What is the landscape of generative AI in India?
Answer: Recognizing the socio-cultural gap in current systems, Indian entrepreneurs and developers are building a GenAI ecosystem suited to Indian needs, providing better contextual outputs.
Major Indian conglomerates such as Reliance (Jio), Tata Consultancy Services, Infosys, and Mahindra & Mahindra are involved in various GenAI initiatives.
Notable projects include BharatGPT, Ola Krutrim, and Tech Mahindra's Project Indus, which supports 40 Indic languages to address language challenges.
The landscape of LLMs in India (source: Inc42 on LinkedIn)
According to Inc42's predictions, India's GenAI market is set to grow significantly, from $1.1 billion in 2023 to over $17 billion by 2030, with a CAGR of 48%.
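Those two figures are consistent with each other; here's the quick sanity check (seven years of compounding from 2023 to 2030):

```python
# Sanity-checking Inc42's projection: $1.1B in 2023 growing at a 48% CAGR.
base_2023_usd_bn = 1.1
cagr = 0.48
years = 2030 - 2023  # 7 years of compounding

projected_2030 = base_2023_usd_bn * (1 + cagr) ** years
print(f"Projected 2030 market: ~${projected_2030:.1f}B")  # ~$17.1B
```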
BharatGPT and similar developments are driving India's progress in conversational AI, paving the way for a digitally empowered future where language barriers are eliminated and information becomes more accessible.
Spark & Trouble's Synergy Check
Well, that's a wrap!