Netflix Releases VOID: Video Object and Interaction Deletion Model on Hugging Face

What VOID Does
VOID removes objects from videos along with all interactions they induce on the scene. That covers not just secondary effects like shadows and reflections, but also physical interactions, such as objects falling when a person is removed.
Technical Requirements
- Requires a GPU with 40GB+ VRAM (e.g., A100)
- Built on CogVideoX-Fun-V1.5-5b-InP
- Fine-tuned for video inpainting with interaction-aware quadmask conditioning
- Quadmask is a 4-value mask that encodes: primary object (remove), overlap regions, affected regions (falling objects, displaced items), and background (keep)
- Resolution: 384x672 (default)
- Max frames: 197
- Scheduler: DDIM
- Precision: BF16 with FP8 quantization for memory efficiency
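As a rough back-of-the-envelope check on the precision line above, the sketch below estimates the storage needed for the weights alone at BF16 and FP8. It assumes the "5b" in the base model name means about 5 billion parameters, and it ignores activations, the VAE, the text encoder, and framework overhead, which presumably account for much of the 40GB+ requirement. The function name is illustrative and not part of the VOID repository.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the storage needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 5e9  # assumption: "5b" in the base model name means ~5 billion parameters

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # BF16 stores 2 bytes per parameter
fp8_gib = weight_memory_gib(NUM_PARAMS, 1)   # FP8 stores 1 byte per parameter

print(f"BF16 weights: {bf16_gib:.1f} GiB")
print(f"FP8 weights:  {fp8_gib:.1f} GiB")
```

Under these assumptions, quantizing the transformer weights to FP8 halves their footprint, which is consistent with the article describing FP8 as a memory-efficiency measure.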
Model Files
- void_pass1.safetensors - Base inpainting model (required)
- void_pass2.safetensors - Warped-noise refinement for temporal consistency (optional)
Pass 1 is sufficient for most videos. Pass 2 adds optical flow-warped latent initialization for improved temporal consistency on longer clips.
Quick Start
The included notebook handles setup, downloads models, runs inference on a sample video, and displays the result.
git clone https://github.com/netflix/void-model.git
cd void-model
CLI Usage
# Install dependencies
pip install -r requirements.txt
# Download the base model
huggingface-cli download alibaba-pai/CogVideoX-Fun-V1.5-5b-InP \
  --local-dir ./CogVideoX-Fun-V1.5-5b-InP
# Download VOID checkpoints
huggingface-cli download netflix/void-model \
  --local-dir .
# Run Pass 1 inference on a sample
python inference/cogvideox_fun/predict_v2v.py \
  --config config/quadmask_cogvideox.py \
  --config.data.data_rootdir="./sample" \
  --config.experiment.run_seqs="lime" \
  --config.experiment.save_path="./outputs" \
  --config.video_model.transformer_path="./void_pass1.safetensors"
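For running Pass 1 over several sequences, the command above could be assembled from Python. This is a minimal sketch: the script path and the `--config.*` overrides are taken from the command above, while `build_pass1_command` is a hypothetical helper, not something the repository provides. Each override is passed as a single `--key=value` argument.

```python
import shlex


def build_pass1_command(data_root: str, sequence: str, save_path: str,
                        checkpoint: str = "./void_pass1.safetensors") -> list[str]:
    """Assemble the Pass 1 inference command as an argument list."""
    return [
        "python", "inference/cogvideox_fun/predict_v2v.py",
        "--config", "config/quadmask_cogvideox.py",
        f"--config.data.data_rootdir={data_root}",
        f"--config.experiment.run_seqs={sequence}",
        f"--config.experiment.save_path={save_path}",
        f"--config.video_model.transformer_path={checkpoint}",
    ]


cmd = build_pass1_command("./sample", "lime", "./outputs")
print(shlex.join(cmd))  # shell-safe rendering of the argument list
# To run it from the repository root: subprocess.run(cmd, check=True)
```

Building an argument list rather than a single string avoids shell quoting problems if a path contains spaces.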
Input Format
Each video needs three files in a folder:
- input_video.mp4 - source video
- quadmask_0.mp4 - 4-value mask (0=remove, 63=overlap, 127=affected, 255=keep)
- prompt.json - {"bg": "description of scene after removal"}
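Before launching inference, it may help to confirm that a folder follows this layout. The sketch below checks for the three files and for a string "bg" field in prompt.json. `check_sequence_folder` is a hypothetical helper, and the example path `./sample/lime` is an assumption based on the sample command's data root and sequence name.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("input_video.mp4", "quadmask_0.mp4", "prompt.json")


def check_sequence_folder(folder: Path) -> list[str]:
    """Return a list of problems found in one input folder (empty if none)."""
    problems = [f"missing {name}" for name in REQUIRED_FILES
                if not (folder / name).is_file()]
    prompt_path = folder / "prompt.json"
    if prompt_path.is_file():
        try:
            prompt = json.loads(prompt_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("prompt.json is not valid JSON")
        else:
            if not isinstance(prompt, dict) or not isinstance(prompt.get("bg"), str):
                problems.append('prompt.json lacks a string "bg" field')
    return problems


# Assumed location of the bundled sample; adjust to your own folder.
print(check_sequence_folder(Path("./sample/lime")))
```

The check only inspects file presence and the prompt structure; it does not decode the videos or verify that the mask uses the four expected values.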
The repo includes a mask generation pipeline (VLM-MASK-REASONER/) that creates quadmasks from raw videos using SAM2 + Gemini.
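To illustrate the quadmask format itself, here is a minimal sketch of how two binary masks could be merged into the four values listed above. It is not the repository's pipeline: it assumes "overlap" means pixels covered by both the primary-object mask and the affected-region mask, and it uses plain nested lists for a single frame instead of video tensors. `compose_quadmask` is a hypothetical name.

```python
REMOVE, OVERLAP, AFFECTED, KEEP = 0, 63, 127, 255  # values from the input format


def compose_quadmask(primary, affected):
    """Combine two same-sized boolean grids into one frame of quadmask values."""
    frame = []
    for primary_row, affected_row in zip(primary, affected):
        row = []
        for is_primary, is_affected in zip(primary_row, affected_row):
            if is_primary and is_affected:
                row.append(OVERLAP)   # assumption: covered by both masks
            elif is_primary:
                row.append(REMOVE)    # object to delete
            elif is_affected:
                row.append(AFFECTED)  # region the removal changes
            else:
                row.append(KEEP)      # untouched background
        frame.append(row)
    return frame


primary = [[True, True, False, False]]
affected = [[False, True, True, False]]
print(compose_quadmask(primary, affected))  # [[0, 63, 127, 255]]
```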
Training Details
- Trained on paired counterfactual videos generated from two sources: HUMOTO (human-object interactions rendered in Blender with physics simulation) and Kubric (object-only interactions using Google Scanned Objects)
- Training was run on 8x A100 80GB GPUs using DeepSpeed ZeRO Stage 2
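ZeRO Stage 2 partitions optimizer states and gradients across the data-parallel GPUs while each GPU keeps a full copy of the weights. A DeepSpeed configuration selecting it might look like the sketch below. Only the ZeRO stage comes from the article; the batch size, accumulation steps, and BF16 setting are placeholder assumptions, and this is not the configuration file shipped with VOID.

```python
import json

# Only the ZeRO stage is taken from the article; every other value is a placeholder.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # assumption
    "gradient_accumulation_steps": 1,     # assumption
    "bf16": {"enabled": True},            # assumption: BF16 training
    "zero_optimization": {"stage": 2},    # ZeRO Stage 2, as stated above
}

print(json.dumps(ds_config, indent=2))
```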
Architecture
- Base: CogVideoX 3D Transformer (5B parameters)
- Input: Video + quadmask + text prompt describing the scene after removal
📖 Read the full source: HN AI Agents