Can't Wait for SORA? Try Dream Machine for AI Video Generation

PLUS: Why India's GenAI Market is Set to Explode!

Howdy fellas!

We are officially entering the twenties 😍
Yes, this is the 20th edition of The Vision Debugged, and we hit 600 readers this week 🙏 🎉


Thanks to all our amazing subscribers for their consistent support, feedback & curiosity! If you found this useful, please share it with a friend and consider subscribing if you haven’t already.

Spark and Trouble are back with yet another host of interesting AI titbits!

Here’s a sneak peek into this edition 👀

  • Product Lab: Luma AI’s Dream Machine

  • Meet the AI candidate standing for election as an MP in the UK

  • Nvidia releases Nemotron-4 model family

  • Explore the LLM landscape in India with home-grown models

Time to jump in!😄

PS: Got thoughts on our content? Share 'em through the quick survey at the end of every edition. It helps us see how our product labs, insights & resources are landing, so we can make them even better.

Product Labs🔬: Decoding Dream Machine

Ever since SORA was announced, we’ve been itching to try it. Bad news: we’re still waiting. So, when Luma AI’s Dream Machine launched, we tried it immediately.

What’s the biggest advantage of Dream Machine over SORA, Kling and Veo?
It’s available now for use…for FREE 😛

Product Labs: Decoding the AI Matrix - Luma AI’s Dream Machine (source: Created by authors)
Tap the pic to get a better view

What’s in it for you?

Luma AI has rapidly emerged as a key player in consumer AI. Known for their Realtime NeRF on the web since February last year, they raised $43 million from a16z in January and launched Genie, a text-to-3D AI model.

Luma AI’s Dream Machine is a text-to-video AI tool that empowers users to craft high-quality, realistic video clips from simple text prompts in mere minutes. It generates 120 frames in roughly 120 seconds, i.e. about 1 frame per second of processing. The output is realistic and cinematic, but it is currently limited to short clips of about 5 seconds, and each request takes approximately 120 seconds due to high demand.

Dream Machine Home Page

Dream Machine has both paid and free options: free accounts allow 30 generations per month, while paid plans offer up to 2,000 generations for around $500. Yes, that will certainly add to your credit card bill next month.
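For a rough sense of scale, here’s a back-of-the-envelope cost calculation using the plan figures quoted above (a sketch only; actual tiers and pricing may differ):

```python
# Rough per-generation cost for the top paid tier, using the figures
# mentioned above: ~2,000 generations for ~$500/month.
free_generations = 30       # free-tier monthly allowance
paid_generations = 2000     # top paid-tier monthly allowance
paid_price_usd = 500        # approximate top-tier monthly price

cost_per_generation = paid_price_usd / paid_generations
print(f"Top-tier cost per generation: ${cost_per_generation:.2f}")  # $0.25
```

At roughly a quarter per clip, heavy experimentation gets pricey fast, which is why that free 30-generation allowance runs out quicker than you’d think.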

“Drone Shot of a regular day in Ancient Mesopotamia” with Dream Machine (created by authors)

The awesomeness…

There is an Enhance Prompt feature which, under the hood, we assume adds more specifics and details to your prompt. It’s essential, since text-to-video creation is still relatively new and not everyone is proficient at prompting.

We experimented with and without Enhance Prompt, and there is a significant difference in video quality. What would be nice is being able to see the final enhanced prompt, to help users learn.

Take a guess which does and does not use the “Enhance Prompt” feature (created by authors)

Another really cool feature is that you can supply an image along with a prompt and animate it. Dream Machine not only animates the image but also adds extra elements to the video based on the scenario it identifies in the image. So you can start with an AI-generated image from Midjourney or Microsoft Designer and then use Dream Machine to add some magic.

Used Designer to generate an image of an old woman underwater and then uploaded it to Dream Machine

And now for the kinks…

Luma Labs is not shy about calling out its limitations, effectively admitting that Dream Machine is still a very early-stage product. But given the anticipated competition from the big names, Dream Machine has taken a page from Midjourney’s playbook and launched an MVP version.

An MVP, in Lean Startup, is the simplest version of your product that still gets real user feedback. It's about learning fast, not building everything at once.

Imagine testing a food delivery app with a landing page before coding the whole thing. By validating core ideas with a basic product, you can avoid wasting time and resources on features users might not even want. This lets you focus on what truly matters and build a successful product.

Also, Dream Machine does not have access to a large user base like OpenAI, Google or Microsoft for fast iteration and feedback. Hence, it is necessary for Luma AI to launch Dream Machine as an MVP to start collecting user feedback.

Dream Machine still has a few limitations, and Luma Labs has been very transparent in disclosing them.

  • Morphing: The Dream Machine sometimes struggles with accurately rendering transformations, leading to visual distortions when objects change shape or form.

  • Movement: The AI can have difficulty simulating realistic movements, resulting in unnatural motion like characters sliding instead of walking.

  • Text: Generating coherent text within videos is a challenge for the Dream Machine, often resulting in inconsistent or irrelevant text displays.

  • Janus: This refers to the model’s struggle with maintaining narrative consistency, especially in videos that require a coherent storyline or bidirectional context.

Beyond these limitations in their video generation model, we had a few more nitpicky thoughts:

  • Every video generated has the Luma watermark. While it is surely a way to distinguish AI-generated content from real content (& a clever way to establish a brand name), it could be annoying to the user.

  • There is no direct option to download any video - you must go via the browser’s save video option. Adding a simple “Download” button could simplify the UX.

  • As the rendering time for Dream Machine is currently quite high, it’s a bit dull to stare at the screen or keep switching back to the tab to check if it’s done. A good alternative could be leveraging the Labour Illusion and coming up with an interactive experience for the wait time.

The Labour Illusion states that people value things more when they see the work behind them. Making users wait for something they requested while showing them how it is being prepared creates the appearance of effort. Users are usually more likely to appreciate the results of that effort.
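As a sketch of what that could look like, a wait screen might cycle through staged status messages while the render job runs (the stage names below are invented for illustration; a real client would poll the actual job status instead of sleeping):

```python
import time

# Hypothetical "Labour Illusion" wait screen: instead of a silent ~120-second
# wait, surface staged work-in-progress messages so the effort feels visible.
STAGES = [
    "Parsing your prompt...",
    "Sketching keyframes...",
    "Simulating motion...",
    "Rendering final video...",
]

def wait_with_labour_illusion(total_wait_s: float = 120) -> str:
    """Show one stage message per quarter of the expected wait time."""
    per_stage = total_wait_s / len(STAGES)
    for stage in STAGES:
        print(stage)
        time.sleep(per_stage)  # in a real UI, poll the render job here instead
    return "video_ready"
```

Even a fake progression like this tends to make the same 120 seconds feel shorter and the result feel more earned.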

What’s the intrigue?

There are a host of AI-based video generation tools even now. In the realm of AI video generation tools, each offers unique features catering to different needs.

Runway ML stands out for its realistic videos and precise motion control. However, generated videos are only 4 seconds long (with the announcement of Gen-3 this limit might be lifted, but until we see it in action, let’s stick to 4). On the plus side, Runway has many interesting features such as inpainting.

Pika Labs excels in creative freedom and character animation, despite occasional distortions. While Pika can create interesting outputs, the video quality isn't super high. The resolution is capped at 1024 x 576 pixels and the frame rate is 8 frames per second, making it appear choppy.

Stable Video Diffusion is notable for its speed and open-source nature, though it may lack the advanced features of its counterparts. It comes in 2 models, capable of generating 14 and 25 frames respectively, at customizable frame rates between 3 and 30 frames per second.
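The clip lengths those specs imply are simple arithmetic: duration = frames ÷ fps. A minimal sketch, using the frame counts and fps range quoted above:

```python
# Clip duration implied by a fixed frame budget and a chosen frame rate.
def clip_duration_s(frames: int, fps: int) -> float:
    return frames / fps

# The 25-frame model at a smooth 24 fps yields barely a second of video,
# while dropping to 5 fps stretches it to 5 seconds, at the cost of choppiness.
print(clip_duration_s(25, 24))
print(clip_duration_s(25, 5))
```

This is the core trade-off with tools like Pika and Stable Video Diffusion: with a fixed frame budget, you buy smoothness with duration, or duration with smoothness.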

Krea AI can also create videos from prompts and images. It’s a little complex compared to the others as you need to add a frame along with the text. Krea can generate videos up to 10 seconds.

Dream Machine is still in its early stages. It is much closer to Pika and Runway than it is to Sora.

The key challenge for users now is picking the right tool from this buffet of options. Here’s our take…the best AI video generation platform depends on your specific project requirements:

  • Need hyper-realistic motion? Dream Machine is your go-to

  • Craving ultimate control over every detail? Runway is your captain

  • Stuck in a creative rut? Let Krea spark your imagination

  • Prioritizing object consistency? Stable Video is your champion

Whatcha Got There?!🫣

Buckle up, tech fam! Every week, our dynamic duo “Spark” & “Trouble”😉 share some seriously cool learning resources we stumbled upon.

Spark’s Selections

😉 Trouble’s Tidbits

Your Wish, Our Command 🙌

You Asked 🙋‍♀️, We Answered ✔️

Question: All the news about the latest developments in AI focuses on the big tech firms and the upcoming startups based in Silicon Valley. What is the landscape of generative AI in India?

Answer: Recognizing the socio-cultural gap in current systems, Indian entrepreneurs and developers are building a GenAI ecosystem suited to Indian needs, providing better contextual outputs.

  • Major Indian conglomerates such as Reliance (Jio), Tata Consultancy Services, Infosys, and Mahindra & Mahindra are involved in various GenAI initiatives.

  • Notable projects include BharatGPT, Ola Krutrim, and Tech Mahindra's Project Indus, which supports 40 Indic languages to address language challenges.

The landscape of LLMs in India (source: by Inc42 on LinkedIn)

According to Inc42’s predictions, India's GenAI market is set to grow significantly, from $1.1 billion in 2023 to over $17 billion by 2030, with a CAGR of 48%.

BharatGPT and similar developments are driving India's progress in conversational AI, paving the way for a digitally empowered future where language barriers are eliminated and information becomes more accessible.

Spark & Trouble’s Synergy Check🧩

Well, that’s a wrap!
Thanks for reading 😊

See you next week with more mind-blowing tech insights 💻

Until then,
Stay Curious🧠 Stay Awesome🤩

PS: Do catch us on LinkedIn - Sandra & Tezan
