Microsoft’s AI Copilot in the Edge browser can now generate text summaries of videos, with some limitations. According to Mikhail Parakhin, Microsoft’s CEO of advertising and web services, the feature works only on videos that Microsoft has already pre-processed or that have subtitles.
Parakhin explained that Copilot needs a text transcript of the video to create a summary. He said, "In order for it to work, we need to pre-process the video. If the video has subtitles—we can always fallback on that, if it does not and we didn't preprocess it yet—then it won’t work."
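The fallback order Parakhin describes can be sketched in a few lines of Python. This is only an illustration of the logic in his quote, not Microsoft's implementation: the `Video` record, its field names, and the `pick_transcript` and `can_summarize` helpers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Video:
    # Hypothetical record; the field names are illustrative assumptions.
    url: str
    preprocessed_transcript: Optional[str] = None  # text from earlier pre-processing
    subtitles: Optional[str] = None                # subtitle text shipped with the video


def pick_transcript(video: Video) -> Optional[str]:
    """Return the text a summarizer could work from, or None if there is none."""
    # Prefer a transcript produced by earlier pre-processing.
    if video.preprocessed_transcript:
        return video.preprocessed_transcript
    # Otherwise fall back on the video's subtitles.
    if video.subtitles:
        return video.subtitles
    # Neither source exists, so no summary can be produced.
    return None


def can_summarize(video: Video) -> bool:
    """True when at least one text source is available for the video."""
    return pick_transcript(video) is not None
```

Under these assumptions, a video with subtitles but no pre-processing is still summarizable, while a video with neither is not.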
Copilot can also summarize video meetings and calls in Microsoft 365, but only after the audio has been transcribed. Similarly, Copilot in Microsoft Stream can summarize any video, provided users generate a transcript first.
The feature was demonstrated by designer Pietro Schirano, who posted a screen recording of Copilot summarizing a YouTube video about the GTA VI trailer. The video had subtitles, so Copilot was able to produce a summary with highlights and timestamps in seconds.
However, not all videos have subtitles or transcripts. Parakhin said that most publicly available videos, such as those on YouTube, should work.