AI Creation Bonanza: Pika 1.0, Stable Video Diffusion, Krea AI and Navigating Perpetual Change
Christmas came early for AI creators - exploring the latest toys at our disposal, and advice on dealing with relentless change
🪄 Creative Technology Digest
Your weekly dose on the future of creation & computing.
🍇 This Week's Juicy Topic: The Future of AI Creation & Staying Sane While Navigating Perpetual Change
Ho ho ho! Christmas came early for creators this year — with a huge set of releases across AI photo and video creation tools. First we had text-to-video, then image-to-video and video-to-video, and now we’re seeing real-time processing emerge as another axis of innovation.
Pika 1.0 has rolled out advanced video in-painting and out-painting capabilities, in addition to improved image-to-video. Meanwhile, real-time diffusion is getting a boost with the release of LCM-LoRA and SDXL Turbo, allowing us to create imagery at the speed of thought.
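For the curious, here's a minimal sketch of what single-step generation looks like with SDXL Turbo via Hugging Face's diffusers library. Treat it as an illustration rather than a definitive recipe; the model ID and few-step settings follow the public release, while the device and dtype choices are assumptions about your setup.

```python
# Minimal sketch: single-step text-to-image with SDXL Turbo (diffusers).
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",        # public SDXL Turbo checkpoint
    torch_dtype=torch.float16,       # assumes a CUDA GPU with fp16 support
    variant="fp16",
)
pipe.to("cuda")

# SDXL Turbo is distilled for few-step sampling; guidance is disabled.
image = pipe(
    prompt="a cinematic photo of a fox in a snowy forest",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("fox.png")
```

On a recent GPU this returns an image in a fraction of a second, which is what makes the "speed of thought" framing more than marketing.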
Pika hits that 1.0 milestone. AI video tools are legit dream factories. And now you can both create *and* edit your videos. So beyond just text/image inputs, you can upload videos and edit them with AI. Congrats @pika_labs — excited to see y’all blossom!
🌶️ My hot take: New superpowers tend to raise the floor AND the ceiling. Creators adapt to new capabilities as their baseline and innovate at a higher level of abstraction.
Make no mistake… just because artists are getting a new set of tools doesn’t mean they’re settling for a “one click” solution. Instead, they’re weaving together their newfound superpowers to achieve results in days that would’ve otherwise taken weeks. Case in point: this amazing multi-tool workflow by @8bit_e
Late adopters:
“AI isn’t real art”
“Must be one click”
“Literally no creativity”
The art in question:
What’s also amazing is that these tools bypass the classical VFX & 3D workflow. Just consider that the example below was done in Pika 1.0 without the creator needing to make a 3D model for the props, estimate the pose of the person wearing them, or light and composite the object. It’s all happening “automagically” in pixel space. Kind of nuts to see, knowing how hard this is to pull off in “classical” AR platforms!
Video in-painting in Pika 1.0 is goated. Perfect for mixed reality magic like this 🪄
Moreover: It’s wild to see this completely bypass the classical 3D animation and VFX pipeline. Imagine the hours saved.
Amazing experiment by @Martin_Haerlin
🛠️ Helpful Framework For Navigating Perpetual Change
No matter which vertical you look at, change seems to be the only constant. I often get asked how I keep up with the latest advancements while staying sane. The truth is I don't 😅, but what I will say is the following:
If you want to succeed in harnessing AI's potential, you need to perpetually switch between playground mode & architect mode.
#1 Playground Mode 🏖️
New AI capabilities continually hit the market in raw, unfiltered form. It's critical to have a mode of play where you explore and map out the creative possibilities. I call it "playground mode" because you engage in unstructured experimentation, and thus become open to happy accidents. Challenge yourself to play more, and find those nuggets of gold.
#2 Architect Mode 🏗️
Once you've sufficiently mapped out the possibilities, enter "architect mode," where you take a more structured approach to putting those findings to work. This is also where you weave these new capabilities into a "classical" workflow incorporating non-AI tools. Don't let those nuggets of gold go to waste. Build something with that insight, whether it's content, a prototype, or a full-blown product.
💡Deep Dive: 10 Amazing Things You Can Do With 3D Gaussian Splatting & Volumetric Capture
We’ve covered advances in 3D Gaussian Splatting in the past. But what can you actually do with this reality capture tech? Here are 10 applications of 3D reality capture in an 11-minute video. And if X is more your speed, tap the link below:
In this 11-minute video, we’ll get into:
00:00 The Reality Capture Revolution
01:05 3D Memory Capture
01:54 Reality Bending Visual Effects
03:08 Dynamic 3D Capture (Movements!)
04:36 Advanced Relighting Effects
05:34 Kitbashing in 3D Game Engines
06:38 Sharing Content Everywhere (Web & Mobile)
07:34 Editing Gaussian Splats & E-Commerce
08:26 Virtual Production & Game Dev
09:10 Heritage Conservation
09:33 Big Picture: Connecting Physical & Digital Worlds
10:48 Conclusion & What’s Next for 3D Capture
If you missed our previous deep dive edition — you can catch up below:
Gaussian Splatting: The Next Big Breakthrough in 3D Graphics
Forget NeRFs - Why Gaussian Splatting Is A Game Changer + Amazing Use Cases It Unlocks
creativetech.beehiiv.com/p/gaussian-splatting-next-big-breakthrough-3d-graphics
🎥 Creation Corner:
1. Image-to-video with Stable Video Diffusion (SVD): The team behind Stable Diffusion has made another leap forward, introducing their SVD model and opening new doors for image-to-video transformation. Just look at that temporal coherence!
This is image-to-video with Stable Video Diffusion (SVD). The temporal coherence is impressive!
The model is also well suited for multi-view consistency (i.e., image-to-3D).
Look fwd to diving deeper and sharing more. Meanwhile check out @fofrAI
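If you want to kick the tires yourself, here's a minimal sketch using the StableVideoDiffusionPipeline that shipped in diffusers alongside the model. The checkpoint name matches the public release; the resolution, frame handling, and fps below are assumptions based on the model card defaults.

```python
# Minimal sketch: image-to-video with Stable Video Diffusion (diffusers).
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",  # SVD-XT checkpoint
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")

# A single conditioning image drives the entire generated clip.
image = load_image("input.png").resize((1024, 576))
frames = pipe(image, decode_chunk_size=8).frames[0]
export_to_video(frames, "output.mp4", fps=7)
```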
2. LCM-LoRA and real-time generative AI: LCM-LoRA makes diffusion fast enough for real-time content creation, a vision Krea AI is bringing to life.
Creativity at the speed of thought — LCM-LoRA continues to impress.
2D canvas is cool but I can’t wait till there’s a lightweight 3D engine in tools like @krea_ai.
Way more control to “spatially” describe your vision, then take it all the way with GenAI.
Once you go realtime, you don't go back
Finally had a chance to play with LCM-LoRA and realtime generative AI and it lives up to the hype.
⚙️ @autodesk Maya (left) --> Krea AI (right)
The style strength slider in @krea_ai is awesome, but we really need ControlNet style… twitter.com/i/web/status/1…
Another cool experiment: Streaming live data from Google Earth directly into Krea, transforming the world’s most comprehensive digital twin into a breathtaking artistic canvas.
Real-time generative AI is a game changer.
The rule is simple: your input data controls your output data.
And there are staggering amounts of data out there, my dear friends! 🌐x🌍 = 🎨
🌍 I'm streaming @googleearth into @krea using an LCM-Lora model to turn the most… twitter.com/i/web/status/1…
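To make the "your input data controls your output data" rule concrete, here's a minimal sketch of the underlying pattern: few-step image-to-image with LCM-LoRA in diffusers. To be clear, this is not Krea's actual stack (which isn't public), just an approximation of the technique, where each incoming frame, whether from a canvas, a 3D viewport, or Google Earth, conditions a fast generation.

```python
# Minimal sketch: few-step image-to-image restyling with LCM-LoRA on SDXL.
import torch
from diffusers import AutoPipelineForImage2Image, LCMScheduler
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)  # LCM sampling
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")        # public LCM-LoRA
pipe.to("cuda")

# The input frame conditions the output; `strength` sets how far we depart from it.
source = load_image("frame.png").resize((1024, 1024))
styled = pipe(
    prompt="an impressionist oil painting of a mountain village at dusk",
    image=source,
    num_inference_steps=4,   # LCM needs only a handful of steps
    guidance_scale=1.0,
    strength=0.5,
).images[0]
styled.save("styled.png")
```

Run that in a loop over a live frame source and you have a crude real-time canvas of your own.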
3. Audio-to-audio: We've seen the thrills of video-to-video and the staple magic of image-to-image transformations. Now brace yourselves for audio-to-audio. Oh heck yeah!
In the AI creation space, video-to-video is a lot of fun, and image-to-image is a staple. But audio-to-audio is going to be a whole other level. Here’s ElevenLabs AI doing Elon2Rogan. LOL!
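For a sense of what this looks like programmatically, here's a rough sketch against ElevenLabs' speech-to-speech API. I'm hedging here: the endpoint path, field names, and model ID below are my assumptions from memory, so verify them against the current API reference before relying on this.

```python
# Rough sketch: voice-to-voice conversion via ElevenLabs' speech-to-speech API.
# Endpoint path, field names, and model ID are assumptions; check the live docs.
import requests

VOICE_ID = "your_target_voice_id"   # hypothetical placeholder
API_KEY = "your_api_key"            # hypothetical placeholder
url = f"https://api.elevenlabs.io/v1/speech-to-speech/{VOICE_ID}"

with open("source_speech.mp3", "rb") as f:
    response = requests.post(
        url,
        headers={"xi-api-key": API_KEY},
        files={"audio": f},
        data={"model_id": "eleven_english_sts_v2"},
    )
response.raise_for_status()

# The response body is the converted audio, re-voiced in the target voice.
with open("converted_speech.mp3", "wb") as out:
    out.write(response.content)
```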
4. Emu Video: Pushing the boundaries of AI video models. Emu Video joins Pika and Runway in offering unprecedented video creation and editing capabilities. Whether you're a content creator, a filmmaker, or just an AI enthusiast, this is a development you'll want to watch closely. You can also try it out here: https://emu-video.metademolab.com/
Meta Research is working on Emu Video — an AI video model intended to rival the likes of Pika and Runway.
They compared results with human raters and concluded their “model outperforms commercial solutions such as RunwayML’s Gen2 and Pika Labs.”
Wdyt?
💌 Wrapping up & next up:
AI-vengers assemble!
Thanks for the dope portrait @KamaniMadeThis 🙏🏽
That’s all for this week. The next edition will be focused on Google Gemini — the good and the bad, along with a new YouTube video on multimodality. Got feedback, questions, or just want to chat? Reply to this email or catch me on social media. As always, thanks for reading! I’ll see y’all in the next one 🖖