Mouth to Music Video with this AI Workflow

I wanted to see (and show) how far you can go with just a tiny input, in this case my mouth, to create a "music video". This workflow shows how little you actually need to make something unique and captivating out of something of your own.

Here's what I did:

1. Start with audio: I used Suno to create the beat and music I wanted, something I could freestyle rap to and that matched my idea. BUT you can really use any audio you want, it doesn't matter much.

2. Record video: I recorded an initial video of myself, then edited it in CapCut down to a cropped shot of just my mouth, which became the input for the next step in Runway. That's what the clear square frame represents in the top video. (A scriptable version of this crop step is sketched right after this post.)

3. Plug into Runway and expand: Using Runway's Expand feature, I fed in just the small video of my mouth plus a series of prompts to generate different characters visually connected to the words/story I'm telling, so the "outside" of the mouth is generated from the prompt.

4. Iterate and refine: Once I found results I liked, I ran several more expansions on each generation, widening the shot into a larger video that I could crop and play around with in the final composition.

Combining something of your own + AI lets you make something creative, original and refreshing that's harder to replicate than other content; this is just another example of that. It's something I think we'll start to see a lot more of, so there's no reason not to start today and be early!

What I love about this is its versatility. It can work for so many types of videos (music vids, talking heads, even cinematic). It's quick, the quality is awesome, and it's a lot of fun tbh.

PS: I'm experimenting with workflows like this every day, follow for more creative ideas and inspiration 🔥
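For anyone who'd rather script step 2 than crop by hand, here's a minimal sketch of that crop using ffmpeg driven from Python. The file names, box size and coordinates are placeholders you'd adjust to your own footage; the author used CapCut, so treat this as one possible scriptable equivalent rather than the actual workflow.

```python
# Hypothetical, scriptable stand-in for the CapCut crop step:
# cut a square (the mouth region) out of the source clip with ffmpeg.
# Assumes ffmpeg is on PATH; size/x/y are placeholder values.
import subprocess

def crop_mouth_clip(src: str, dst: str, size: int = 512, x: int = 700, y: int = 900) -> None:
    """Crop a size x size box whose top-left corner is at (x, y), keeping the audio."""
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", src,
            "-filter:v", f"crop={size}:{size}:{x}:{y}",
            "-c:a", "copy",  # pass the recorded audio through untouched
            dst,
        ],
        check=True,
    )

crop_mouth_clip("full_take.mp4", "mouth_only.mp4")
```

Copying the audio stream (-c:a copy) keeps the cropped clip in sync with the original take, which makes lining it up with the track later a little easier.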
Nice approach. This reminds me of an in-painting experimentation phase I went through, using different sizes of regions to inpaint for different strategic results. Another very interesting thing to consider is where the edge of your "expansion" or "inpainting" region TOUCHES or CROSSES just a small portion of another structure or identifiable pattern in the pixels (identifiable to the model). That little "nick" or tiny overlap at the edge of your inpainting or expansion region will, in most cases, be grabbed by the model and used to extend the structure. I never took this approach with video, but it's all analogous. To take advantage of it for a certain kind of look, you do a lot of iterations of erasing small regions that just touch or slightly cross the EDGE of identifiable structures, let the inpainting or expansion generate, then repeat. For an image I might do this 10 to 100 times; for video I'm sure there are other dimensions to the concept. Unfortunately I'm deep in full stack web application code with no time for generation... but at least the site I'm rebuilding is dedicated to AI generation and AI in general. Thanks for sharing!
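To make that loop concrete, here is a minimal sketch of the erase-a-nick-and-regenerate iteration described above. The inpaint() call is a hypothetical stand-in for whatever backend you use (a diffusion inpainting pipeline, a hosted API, etc.), and edge_points is assumed to be a list of pixel coordinates sitting on the edges of structures you want the model to extend.

```python
# Sketch of the "nick the edge, regenerate, repeat" loop described above.
# `inpaint(image, mask, prompt)` is a hypothetical stand-in for your
# inpainting backend of choice.
import random
from PIL import Image, ImageDraw

def edge_nick_iterations(image: Image.Image, edge_points: list[tuple[int, int]],
                         prompt: str, steps: int = 30, nick_size: int = 48) -> Image.Image:
    """Repeatedly erase a small box that just overlaps a structure edge and re-inpaint."""
    for _ in range(steps):
        # Pick a point on (or near) the edge of an identifiable structure.
        x, y = random.choice(edge_points)
        # Build a mask whose box only slightly crosses that edge.
        mask = Image.new("L", image.size, 0)
        ImageDraw.Draw(mask).rectangle(
            [x - nick_size // 2, y - nick_size // 2,
             x + nick_size // 2, y + nick_size // 2],
            fill=255,
        )
        # The model tends to "grab" the overlapped structure and extend it.
        image = inpaint(image, mask, prompt)  # hypothetical backend call
    return image
```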
This is smart !!!
You, Sir, are a genius!!!!!!! This is what filmmaking is right now: not trying to recreate Hollywood, but creating something that isn't. LOVE IT TOOOO MUCH!!!!!!!!!! I salute you!
This is super creative! Please repost it to the generative AI group here on LinkedIn: https://www.linkedin.com/groups/13141390/
Good idea for a workflow, I'ma have to try this.
Great workflow
Dude, this is a great way of using Runway's expand feature. Hadn't even thought about using it in this way. Good call and thanks for sharing those steps 🙌
This is great. What were the biggest challenges with the AI expansion? Have you tried using larger patches for "character consistency"?
"show the world what you're about", keep going Roy Hermann
Bit of stabilisation and reframing to keep the mouth locked in the centre would have probably gotten even better results. Super interesting workflow.
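If anyone wants to try that stabilisation/reframing suggestion outside an editor, here is a rough sketch that re-crops every frame around the mouth using OpenCV template matching. The file names and sizes are placeholders, it assumes the source frames are larger than the output crop and that the mouth starts near the centre of the first frame, and a proper face-landmark tracker would be more robust than plain template matching.

```python
# Rough sketch of "keep the mouth locked in the centre": track the mouth with
# template matching and re-crop every frame around it. File names, template
# size and output size are placeholder assumptions.
import cv2

def recenter_on_mouth(src: str, dst: str, out_size: int = 512) -> None:
    cap = cv2.VideoCapture(src)
    ok, first = cap.read()
    if not ok:
        raise RuntimeError(f"could not read {src}")

    h, w = first.shape[:2]
    # Take the mouth template from the centre of the first frame (adjust as needed).
    cy, cx = h // 2, w // 2
    template = first[cy - 64:cy + 64, cx - 64:cx + 64]

    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"), fps, (out_size, out_size))

    frame = first
    while True:
        # Find where the mouth template matches best in this frame.
        res = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, (tx, ty) = cv2.minMaxLoc(res)
        mx, my = tx + 64, ty + 64  # centre of the matched region

        # Crop a fixed-size window centred on the match, clamped to the frame
        # (assumes the source frame is larger than out_size in both dimensions).
        x0 = min(max(mx - out_size // 2, 0), w - out_size)
        y0 = min(max(my - out_size // 2, 0), h - out_size)
        out.write(frame[y0:y0 + out_size, x0:x0 + out_size])

        ok, frame = cap.read()
        if not ok:
            break

    cap.release()
    out.release()

recenter_on_mouth("mouth_only.mp4", "mouth_centered.mp4")
```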