Supercuts [⧉✂︎|>]

Reconstructing Video Through Word-Level ML Segmentation

Type

Video Art, Creative Coding

Technologies

Machine Learning, Python

Supported by

Goldsmiths, University of London

This project explores video segmentation and reassembly using machine learning and generative techniques. Starting with a source video, I use a speech-to-text model to generate a word-level transcript. Each word is then cut into its own video clip using the word's start and end timecodes from the transcript.
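A minimal sketch of the cutting step, assuming the transcript is already available as a list of words with start and end times in seconds. The `Word` class, the `cut_command` and `plan_clips` helpers, and the output file naming are hypothetical and only for illustration; the `ffmpeg` flags re-encode each clip so that cuts do not have to fall on keyframes.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcript entry (hypothetical structure): a word and its timecodes."""
    text: str
    start: float  # seconds from the start of the source video
    end: float    # seconds from the start of the source video


def cut_command(source, word, out_path):
    """Build an ffmpeg command that extracts one word as its own clip."""
    duration = word.end - word.start
    return [
        "ffmpeg", "-y",                    # overwrite existing output files
        "-ss", f"{word.start:.3f}",        # seek to the word's start time
        "-i", source,
        "-t", f"{duration:.3f}",           # keep only the word's duration
        "-c:v", "libx264", "-c:a", "aac",  # re-encode video and audio
        out_path,
    ]


def plan_clips(source, words):
    """Return (output path, ffmpeg command) pairs, one per transcript word."""
    plan = []
    for i, word in enumerate(words):
        out_path = f"clip_{i:04d}_{word.text.lower()}.mp4"
        plan.append((out_path, cut_command(source, word, out_path)))
    return plan
```

Each command can then be run with `subprocess.run(cmd, check=True)`. Building the command list first keeps the plan easy to inspect before any file is written.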

From there, I've experimented with two recomposition strategies:

1. Text-Based Reconstruction: A new sequence of words is generated, and the matching clips are assembled in that order to define an edit of the video.
2. Intensity-Based Sorting: Each clip is assigned a rough loudness value from its audio RMS amplitude, and the clips are sorted and reassembled accordingly.
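Both strategies can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's actual code: clips are represented as `(word, path)` pairs, audio as a list of float samples per clip, and all function names are hypothetical. Two behaviours here are assumptions as well: target words with no matching clip are skipped, and repeated words cycle through the available clips.

```python
import math
from collections import defaultdict


def normalize(word):
    """Lowercase and strip common punctuation so 'Cat,' matches 'cat'."""
    return word.lower().strip(".,!?;:\"'")


def build_index(clips):
    """Map each normalized word to the clip paths where it is spoken."""
    index = defaultdict(list)
    for word, path in clips:
        index[normalize(word)].append(path)
    return index


def reconstruct(target_text, index):
    """Text-based reconstruction: pick one clip per word of the target text."""
    used = defaultdict(int)  # how many times each word has been used so far
    sequence = []
    for raw in target_text.split():
        key = normalize(raw)
        paths = index.get(key)
        if not paths:
            continue  # assumption: skip words the source never says
        sequence.append(paths[used[key] % len(paths)])
        used[key] += 1
    return sequence


def rms(samples):
    """Root-mean-square amplitude of one clip's audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def sort_by_loudness(clip_audio, descending=False):
    """Intensity-based sorting: order clip paths by RMS amplitude."""
    ranked = sorted(clip_audio, key=lambda item: rms(item[1]), reverse=descending)
    return [path for path, _ in ranked]
```

Either function returns an ordered list of clip paths, which can then be concatenated into the final edit with a video tool of choice.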