
Course Content Accessibility: Subtitles

Timed-Text Subtitle File Format

Subtitle Formats

The subtitles available on DVDs are usually bitmap images. That is, each subtitle frame is stored as a picture of the text and does not contain data in the form of letters, numbers, and punctuation. These work fine with physical DVD players, but online streaming players generally cannot display image-based subtitles. There are two ways of making the subtitles on a DVD available for online streaming:

Burn In: A given set of subtitles on a DVD can be "burned in" or "directly rendered" to the video file while it is being prepared for streaming. The subtitles become a permanent part of the video image and are always visible, with no option to turn them off or choose an alternate set of subtitles.

Convert Images to Timed Text:  The subtitle images on the DVD can be passed through a process of Optical Character Recognition (OCR) to produce a file containing the subtitles as text (letters, numbers, and punctuation), together with time-stamps that indicate when and for how long each subtitle frame is to be visible.

The process of converting image subtitles to timed text can be tedious because the OCR process is often unreliable. Although extensive editing of the text is often required, one advantage of this conversion is that the time-stamps generated for each subtitle frame are highly reliable, so the subtitles will synchronize cleanly with the video when played.

For many popular videos, timed-text files might be available from the internet. These are often of high quality, but the transcription may not be perfect, and the time-stamps might not exactly match the specific version of the video being processed. Any downloaded subtitle file therefore needs to be checked for both transcription accuracy and video synchronization, and may require adjustments.
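As an illustration of one such adjustment, the following Python sketch shifts every time-stamp in a subtitle file by a fixed number of milliseconds. It assumes the HH:MM:SS,mmm time-stamp notation used by the SubRip format described below, and it assumes the mismatch is a constant offset. A file made for a version of the video with a different frame rate or different cuts would need more than this. The function names are illustrative and do not come from any particular tool.

```python
import re

# Matches SubRip-style time-stamps such as 00:02:05,840
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return the matched time-stamp moved by offset_ms milliseconds."""
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    # Clamp at zero so an early subtitle never gets a negative time.
    total_ms = max(total_ms + offset_ms, 0)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def shift_subtitles(text, offset_ms):
    """Shift the timing lines of a subtitle file; leave the text lines alone."""
    shifted = []
    for line in text.splitlines():
        # Only timing lines contain the "-->" separator.
        if "-->" in line:
            line = TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), line)
        shifted.append(line)
    return "\n".join(shifted)


if __name__ == "__main__":
    sample = "00:02:05,840 --> 00:02:06,841\n<i>All right, team,</i>"
    # Delay the subtitles by half a second.
    print(shift_subtitles(sample, 500))
```

A positive offset delays the subtitles and a negative offset makes them appear earlier. The result should still be checked against the video at several points, since a single offset rarely fixes every mismatch.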

A number of timed-text formats are currently in use. The majority of timed-text subtitles obtainable in English on the internet are in a format known as SubRip, with a file extension of .srt. SubRip files give a subtitle frame number, display start and stop times to the millisecond, and subtitle text, with some formatting options using HTML tags. The timing and text lines of a SubRip file look like this (frame numbers not shown):

00:02:05,840 --> 00:02:06,841
<i>All right, team,</i>

00:02:06,920 --> 00:02:08,046
<i>stay in sight of each other.</i>

00:02:08,160 --> 00:02:09,241
<i>Let's make NASA proud today.</i>

00:02:10,680 --> 00:02:12,409
MARTINEZ: <i>How's it looking</i>
<i>over there, Watney?</i>
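To show how this structure can be read by a program, here is a minimal Python sketch that splits SubRip text into individual subtitle frames. It assumes frames are separated by blank lines and treats the frame number as optional, so it accepts both complete files and excerpts like the one above. The names Cue and parse_srt are illustrative and not part of any standard library. Real files can contain irregularities that this sketch does not handle.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# A timing line: start time, the "-->" separator, end time.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Cue:
    index: Optional[int]  # frame number, or None if the block has none
    start_ms: int         # display start, in milliseconds
    end_ms: int           # display stop, in milliseconds
    text: str             # subtitle text, possibly with HTML-style tags


def to_ms(hours, minutes, seconds, millis):
    """Convert the four time-stamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content: str) -> List[Cue]:
    cues = []
    # Subtitle frames are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        index = None
        # An all-digit first line is taken to be the frame number.
        if lines and lines[0].strip().isdigit():
            index = int(lines[0].strip())
            lines = lines[1:]
        if not lines:
            continue
        match = TIMING.match(lines[0].strip())
        if match is None:
            continue  # skip blocks without a valid timing line
        fields = match.groups()
        cues.append(Cue(index, to_ms(*fields[:4]), to_ms(*fields[4:]),
                        "\n".join(lines[1:])))
    return cues


if __name__ == "__main__":
    sample = (
        "00:02:10,680 --> 00:02:12,409\n"
        "MARTINEZ: <i>How's it looking</i>\n"
        "<i>over there, Watney?</i>\n"
    )
    for cue in parse_srt(sample):
        print(cue.start_ms, cue.end_ms, repr(cue.text))
```

Storing times as milliseconds makes later checks simple, such as confirming that each frame ends after it starts or that consecutive frames do not overlap.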