Supercuts [⧉✂︎|>]
Reconstructing Video Through Word-Level ML Segmentation
This project explores video segmentation and reassembly using machine learning and generative techniques. Starting with a source video, I use a speech-to-text model to generate a word-level transcript. Each word is then isolated into a separate video clip using precise timecodes.
From there, I've experimented with two recomposition strategies. 1. Text-Based Reconstruction — A new sequence of words is generated to define an edit of the video by assembling the matching clips in that order. 2. Intensity-Based Sorting — Using audio RMS amplitude, each clip is assigned a rough loudness value, and clips are sorted and reassembled accordingly.