VOID: Video Object and Interaction Deletion (github.com)
181 points by bobsoap 5 days ago | 54 comments




Would make economic sense for a ton more of "Choose your own Adventure" content

I can imagine watching Bandersnatch and getting rid of the game developer in frame 1. The remaining 90 minutes, his dad having a quiet, stress-free Tuesday.


Very interesting discrepancy in the attached example:

- "removing the kettlebell" led to removing the visual representation of the kettlebell as well as the deformation it makes on the pillow

- "removing the hands" removed the child's hands from the tops, but did not then lead to the tops falling over!

Others like the colliding cars are in some weird gray area between the two.

One should note that as these tools proliferate, we are giving up a lot of artistic expression to these imprecise natural language parsing engines.

arjie 1 day ago

Woah, this is absolutely sick! 10 years ago me would have been surprised something so small can encode all the world knowledge necessary to make this plausible. That they'd make this openly available is a dream.

CogVideoX continues to be an academic powerhouse model. So many papers built on this little thing.

Really weird comments here. It's a VFX technique for cinematography, one of many of that kind (e.g. supporting wire removal). Cinematography in general is about showing something that doesn't exist, unless it's a documentary. Your only reaction is apparently calling censorship. Says a lot about the current Overton window and I think it's something you should reflect on.
taneq 1 day ago

But wouldn't using this for, e.g., removing a support wire result in it making Superman fall down?

Yes but imagine how bad Stalin’s reign of terror would have been with modern cinematographic techniques.
cobalt 1 day ago

Stalin's team was already doctoring photos by hand. The difference is scale and labor cost, not capability. VOID still needs clean reference frames, which archival footage rarely has anyway.
snthpy 1 day ago

Anyone who's ever had a breakup will thank you.

I don't see any demos. Did I miss something? Not interested in running a Colab.

There's always a demo. What there isn't is a benchmark on real workloads, latency numbers, or a straight answer on what it costs per frame.
mkl 1 day ago

You missed the embedded video near the top.

Hard to spot on first scroll honestly. We've done the same thing with our own projects and then wondered why nobody tried the demo.

This will save a lot of money for the prod house, considering each country may have different censorship rules.

[flagged]
the_af 1 day ago

Why China?

I see this being used in many countries, especially in the West.

We've crashed headfirst into a dystopian future. But worry not! The crash can be edited out of reality.


Also lets them quickly censor things the West doesn’t like, as a client state of particular nations in the Middle East

Use a VPN sometime if you doubt that this happens in Western media, including in the US.

ivan 1 day ago

I could be wrong but Netflix is a US company, not Chinese. The censorship concern is real either way though.

So basically, they'll replace a "Coke can" with "Red Bull" or similar depending on who pays for ads in the video? What else are they gonna use it for?

Removing film crew, boom mics, and missed props from a scene would surely be useful to studios. It may even enable some shots that previously would have been impossible due to the positioning of cameras, etc.
gfody 1 day ago

ultimately we could get 4K remasters of old movies/shows where it's currently not worth redoing the FX

I can see this being one of many AI tools for video editors. Combined with a handful of other tricks, an SFX shop should see tremendously higher productivity.

The idea of applying this modern magic to history & art is horrifying. The dream of Minitrue!

Presumably Netflix wants to erase smoking from its back catalog or some other bit of papier-mâché Stalinism.

Oh well, neat bit of auto-regressive theater.

zarmin 1 day ago

d--b 1 day ago

Yay! More tools for faking stuff!

Object removal in images has been solved for years. The hard part in video is temporal consistency - and these demo clips are suspiciously short.