Turn TTML subtitle files into SBV format quickly—upload your .ttml, convert, and download the SBV subtitle file in seconds.
1. Drop in your .ttml (or .dfxp, which uses the same XML grammar) file, or click "Add Files" to pick one. Batch is supported, so you can queue an entire season of captions in one pass. Everything runs in your browser session; the file never leaves your device.
2. The converter reads each `<p>` element with begin / end (or dur) attributes, normalises the timestamps to YouTube's H:MM:SS.mmm form, and writes one cue per blank-line-separated block. No bitrate, frame rate, or container settings are needed: subtitle conversion is purely a text/structure transform.
3. Cue text, inline spans (such as `<span tts:fontStyle="italic">`), and `<br/>` line breaks land in plain text. SBV strips all styling by design: bold, italic, colour, positioning, and region anchors are dropped because YouTube's SBV parser ignores markup. If you need styling preserved, use TTML to WebVTT or TTML to SRT (with basic tags) instead.
4. Download the .sbv file (or a ZIP for batch jobs). Drop it straight into YouTube Studio's "Subtitles" tab under "Upload file → With timing."

TTML (Timed Text Markup Language) is the W3C's XML-based caption format, the workhorse of Netflix, BBC iPlayer, Apple TV, and most broadcast and OTT delivery pipelines via the IMSC1 profile. SBV (sometimes called SubViewer 2.0) is YouTube's stripped-down internal format: timestamps and plain text, no indices, no styling, no XML wrapper. Conversion is mostly a "downgrade": you're flattening rich, regionally-positioned, styled captions into the simplest text-and-timing form a video platform accepts. Typical reasons to do it:
- Some legacy tooling still expects .sbv because that's what YouTube exported in the early CC era. If your in-house tooling still parses SBV, converting upstream TTML keeps the pipeline intact.
- TTML carries positioning (tts:origin, tts:extent, tts:textAlign), colour, font, and region metadata that aren't needed for a web review pass. SBV's plain text is faster to copy-paste into a Google Doc for a producer to mark up.
- Running awk or a quick script across an SBV file is dramatically easier than walking an XML tree. Once you're in SBV, you're one step from a clean transcript.
- Different tools interpret tickRate, frameRate, and SMPTE timestamps differently. Round-tripping TTML → SBV → SRT is a quick way to expose timing drift before delivery.

| Property | TTML (input) | SBV (output) |
|---|---|---|
| Container | XML (`<tt>` root with multiple namespaces) | Plain UTF-8 text, no headers |
| Maintained by | W3C Timed Text WG (TTML2 Rec 2018, IMSC1.3 Rec 2026) | YouTube (internal, undocumented spec) |
| Timestamp form | HH:MM:SS.fff clock-time, HH:MM:SS:FF SMPTE, or media-time offsets | H:MM:SS.mmm clock-time only |
| Styling | Full: colour, font family/size/weight, italics, underline, background, opacity | None (all tags stripped) |
| Positioning / regions | Yes (region, tts:origin, tts:extent) | None |
| Animation | `<animate>` / `<set>` elements | None |
| Speaker / role markers | `<span tts:fontStyle="italic">` or ttm:role="dialogue" | Convention: prefix line with >> and the speaker name |
| Cue index numbers | Implicit via document order | None; cues are separated by a blank line |
| Multi-line cue | `<br/>` inside `<p>` | Literal newline inside the cue block |
| Maximum cue length | No spec limit | Practical: YouTube renders ~40 chars per line, 2 lines max |
| Use case | Broadcast, OTT delivery (Netflix, BBC, Apple TV), accessibility compliance | YouTube uploads, legacy MCN tooling |

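
The SBV column above describes a format simple enough to parse with a few lines of script, which is also what makes transcript extraction easy. The following is a minimal sketch, not any official parser: the function names `parse_sbv` and `transcript` are hypothetical, and it assumes well-formed input in which every cue starts with a `start,end` timing line.

```python
import re

# Timing line such as "0:00:00.599,0:00:04.160" (H:MM:SS.mmm,H:MM:SS.mmm).
TIMING = re.compile(r"^(\d+:\d{2}:\d{2}\.\d{3}),(\d+:\d{2}:\d{2}\.\d{3})$")

def parse_sbv(sbv_text):
    """Split SBV text into (start, end, text) tuples. Hypothetical helper."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", sbv_text.strip()):
        lines = block.splitlines()
        if not lines:
            continue  # empty input
        match = TIMING.match(lines[0].strip())
        if not match:
            continue  # skip blocks that do not start with a timing line
        cues.append((match.group(1), match.group(2), "\n".join(lines[1:])))
    return cues

def transcript(sbv_text):
    """Join all cue text into one paragraph, flattening in-cue line breaks."""
    return " ".join(text.replace("\n", " ") for _, _, text in parse_sbv(sbv_text))

sample = (
    "0:00:00.599,0:00:04.160\nHi, my name is Alice.\n\n"
    "0:00:10.000,0:00:13.000\nLine one\nLine two\n"
)
print(transcript(sample))  # Hi, my name is Alice. Line one Line two
```
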
| TTML fragment | SBV output |
|---|---|
| `<p begin="00:00:00.599" end="00:00:04.160">Hi, my name is Alice.</p>` | 0:00:00.599,0:00:04.160<br>Hi, my name is Alice. |
| `<p begin="00:00:05.000" dur="00:00:02.500"><span tts:fontStyle="italic">whispering</span> Come closer.</p>` | 0:00:05.000,0:00:07.500<br>whispering Come closer. (italic dropped) |
| `<p begin="10s" end="13s">Line one<br/>Line two</p>` | 0:00:10.000,0:00:13.000<br>Line one<br>Line two |
| `<p begin="00:00:15:12" end="00:00:18:00" region="bottom">SMPTE 30fps cue</p>` | 0:00:15.400,0:00:18.000<br>SMPTE 30fps cue (region dropped, frames converted to ms) |

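
The mapping in the examples above can be approximated in a short script. This is only a sketch under stated assumptions, not the converter's actual code: the helper names (`to_ms`, `sbv_time`, `flatten`, `ttml_to_sbv`) are hypothetical, only clock-time values (HH:MM:SS or HH:MM:SS.fff) are handled, and timing inherited from parent elements is ignored.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"  # TTML namespace

def to_ms(clock):
    # Clock-time only; offset and SMPTE forms are out of scope for this sketch.
    hours, minutes, seconds = clock.split(":")
    return round((int(hours) * 3600 + int(minutes) * 60 + float(seconds)) * 1000)

def sbv_time(ms):
    # Format milliseconds as H:MM:SS.mmm.
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"

def flatten(elem):
    # Collect text, turning <br/> into a newline and dropping all other markup.
    parts = [elem.text or ""]
    for child in elem:
        parts.append("\n" if child.tag == TT + "br" else flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)

def ttml_to_sbv(ttml):
    root = ET.fromstring(ttml)
    cues = []
    for p in root.iter(TT + "p"):
        start = to_ms(p.get("begin"))
        # A cue gives either "end" directly or a "dur" relative to "begin".
        end = to_ms(p.get("end")) if p.get("end") else start + to_ms(p.get("dur"))
        cues.append(f"{sbv_time(start)},{sbv_time(end)}\n{flatten(p).strip()}")
    return "\n\n".join(cues) + "\n"

sample = """<tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <body><div>
    <p begin="00:00:00.599" end="00:00:04.160">Hi, my name is Alice.</p>
    <p begin="00:00:05.000" dur="00:00:02.500"><span tts:fontStyle="italic">whispering</span> Come closer.</p>
    <p begin="00:00:10.000" end="00:00:13.000">Line one<br/>Line two</p>
  </div></body>
</tt>"""
print(ttml_to_sbv(sample))
```

The styling attribute on the `<span>` disappears simply because only text content is collected, which mirrors the "all styling stripped" behaviour described earlier.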
Yes. YouTube Studio's "Subtitles → Upload file → With timing" path accepts .sbv directly. YouTube's documented basic subtitle formats are SubRip (.srt), SubViewer (.sbv / .sub), MPsub, LRC, and Videotron Lambda; SBV is one of the original native upload formats. That said, for new workflows YouTube also accepts TTML directly — if you're not bound to legacy tooling, uploading the original TTML preserves styling and positioning that SBV can't carry.
SBV is a minimal text format with no styling grammar. The TTML attributes tts:color, tts:fontStyle, tts:fontWeight, tts:textDecoration, tts:origin, tts:extent, and region references all get discarded during conversion. If you need any of those preserved, TTML to WebVTT is a better target — WebVTT supports basic <b>, <i>, <u>, and positioning hints. SRT preserves italics and bold via inline HTML-like tags. Convert to SBV only when the destination (YouTube, an MCN tool) specifically requires it.
How is a timestamp like 00:00:15:12 handled?

TTML can express times as clock-time (00:00:15.500), media-time offsets (15.5s), or SMPTE timecode (00:00:15:12, where the final field is a frame count). SBV only supports clock-time with milliseconds. The converter reads the TTML document's ttp:frameRate (and ttp:frameRateMultiplier for NTSC rates such as 29.97 fps) to translate frames to milliseconds: frame 12 at 30 fps becomes 400 ms. If ttp:frameRate is missing, the spec default of 30 fps applies, which may cause drift for 24 fps or 25 fps source content. Check the preview before publishing.
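
The frame arithmetic can be sketched as follows. This is an illustrative function under stated assumptions, not the converter's implementation: `ttml_time_to_ms` is a hypothetical name, `frame_rate` is the effective rate (ttp:frameRate multiplied by ttp:frameRateMultiplier, if present), drop-frame counting is not handled, and the `f` and `t` offset metrics are omitted.

```python
import re

def ttml_time_to_ms(value, frame_rate=30.0):
    """Convert a TTML time expression to milliseconds (non-drop-frame sketch)."""
    # Offset-time such as "15.5s", "250ms", "2m" or "1h".
    offset = re.fullmatch(r"(\d+(?:\.\d+)?)(h|ms|m|s)", value)
    if offset:
        scale = {"h": 3_600_000, "m": 60_000, "s": 1000, "ms": 1}[offset.group(2)]
        return round(float(offset.group(1)) * scale)
    # Clock-time, optionally followed by ":FF" frames (SMPTE style).
    clock = re.fullmatch(r"(\d+):(\d{2}):(\d{2}(?:\.\d+)?)(?::(\d+))?", value)
    if not clock:
        raise ValueError(f"unsupported time expression: {value!r}")
    hours, minutes, seconds, frames = clock.groups()
    total = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    if frames is not None:
        total += int(frames) / frame_rate  # frames become a fraction of a second
    return round(total * 1000)

print(ttml_time_to_ms("00:00:15:12", 30))  # 15400
print(ttml_time_to_ms("00:00:15:12", 24))  # 15500
print(ttml_time_to_ms("15.5s"))            # 15500
```

The first two calls show where drift comes from: the same timecode lands 100 ms apart depending on whether the frame count is read at 30 fps or 24 fps.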
DFXP ("Distribution Format Exchange Profile") was the original name for what is now TTML1; the file extensions .ttml, .dfxp, and .xml are used interchangeably for the same underlying XML grammar. YouTube treats .dfxp and .ttml as the same input type. This converter accepts the .ttml extension; rename a .dfxp file to .ttml if needed, or use a DFXP-to-SBV variant if your file uses the .dfxp suffix.
Why is there a >> before speaker names?

That's the YouTube convention for speaker identification in SBV files. There's no formal `<speaker>` element, so caption authors prefix the line with >> followed by the speaker name (>> ALICE: Hi). If your TTML uses ttm:role="dialogue" with a speaker name in `<ttm:agent>`, the converter folds that into the SBV cue text. If your TTML has no speaker metadata, the cue is emitted as plain text with no prefix.
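
One way such folding could work is sketched below. The lookup strategy is an assumption, not the converter's documented behaviour: it resolves a single `ttm:agent` reference on each `<p>` against `<ttm:agent>` definitions in the head, and upper-cases the name to match the >> ALICE: example. The helper names `speaker_names` and `cue_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
TTM = "{http://www.w3.org/ns/ttml#metadata}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def speaker_names(root):
    # Map each agent's xml:id to the text of its first <ttm:name> child.
    names = {}
    for agent in root.iter(TTM + "agent"):
        name = agent.find(TTM + "name")
        if name is not None and name.text:
            names[agent.get(XML_ID)] = name.text.strip()
    return names

def cue_text(p, names):
    # Assumes ttm:agent holds a single ID; cues without one get no prefix.
    text = "".join(p.itertext()).strip()
    speaker = names.get(p.get(TTM + "agent"))
    return f">> {speaker.upper()}: {text}" if speaker else text

doc = """<tt xmlns="http://www.w3.org/ns/ttml" xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
  <head><metadata>
    <ttm:agent xml:id="alice" type="person"><ttm:name type="full">Alice</ttm:name></ttm:agent>
  </metadata></head>
  <body><div>
    <p begin="00:00:00.599" end="00:00:04.160" ttm:agent="alice">Hi, my name is Alice.</p>
    <p begin="00:00:05.000" end="00:00:07.500">Come closer.</p>
  </div></body>
</tt>"""
root = ET.fromstring(doc)
names = speaker_names(root)
lines = [cue_text(p, names) for p in root.iter(TT + "p")]
print(lines)
```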
TTML supports multiple <div xml:lang="..."> blocks in a single document for parallel-language captioning. SBV has no language metadata field — one SBV file is implicitly one language. If your TTML contains multiple language tracks, run the conversion once per language and pick the corresponding xml:lang filter, or split the source TTML upstream. YouTube tags the language at upload time via the file-naming convention or the Studio language picker, not from the SBV content.
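
Splitting by language upstream can be as simple as selecting the matching `<div>` blocks. The sketch below is an assumption about how such a filter might look, with a hypothetical function name; it compares xml:lang by exact string equality (so "en" does not match "en-GB") and assumes language divs are not nested inside one another.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def paragraphs_for_language(ttml, lang):
    """Return the <p> elements found under <div xml:lang=lang> blocks."""
    root = ET.fromstring(ttml)
    selected = []
    for div in root.iter(TT + "div"):
        if div.get(XML_LANG) == lang:
            selected.extend(div.iter(TT + "p"))
    return selected

doc = """<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div xml:lang="en"><p begin="00:00:01.000" end="00:00:02.000">Hello</p></div>
    <div xml:lang="fr"><p begin="00:00:01.000" end="00:00:02.000">Bonjour</p></div>
  </body>
</tt>"""
print([p.text for p in paragraphs_for_language(doc, "fr")])  # ['Bonjour']
```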
Neither format imposes a hard limit, but YouTube renders SBV at roughly 40 characters per line and 2 lines per cue before truncating, with a recommended duration of 1–7 seconds per cue. If your source TTML has long cues meant for a 1920×1080 broadcast frame, you may want to re-segment after conversion — Subtitle Edit and Aegisub both have automatic re-wrap tools that operate on SBV/SRT.
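
If a dedicated subtitle editor is not at hand, a rough re-segmentation can be scripted. This is a sketch only: the 40-character and 2-line limits are the approximate figures quoted above rather than a documented YouTube rule, the function names are hypothetical, and the original duration is divided evenly between the resulting cues, which a real tool would likely refine.

```python
import textwrap

def rewrap(text, width=40, max_lines=2):
    # Wrap to the line width, then group the lines into chunks of max_lines.
    lines = textwrap.wrap(" ".join(text.split()), width=width)
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]

def split_cue(start_ms, end_ms, text, width=40, max_lines=2):
    """Split one long cue into several, sharing its duration evenly."""
    chunks = rewrap(text, width, max_lines)
    if not chunks:
        return []
    step = (end_ms - start_ms) / len(chunks)
    return [
        (round(start_ms + i * step), round(start_ms + (i + 1) * step), chunk)
        for i, chunk in enumerate(chunks)
    ]

long_text = (
    "This caption was written for a wide broadcast frame and is far too long "
    "to fit on two short lines of a YouTube player."
)
for start, end, chunk in split_cue(0, 6000, long_text):
    print(start, end, repr(chunk))
```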
Conversion runs entirely client-side in your browser session — the TTML is parsed, transformed, and written to a downloadable .sbv blob without any server round-trip. No sign-up, no watermark, no email gating. Files are released from memory when you close the tab.
SBV strips so much information that round-tripping back to broadcast-quality TTML is lossy — you can recover the timing and the text, but not the original styling, regions, or speaker roles. If you must, use a dedicated SBV-to-TTML tool and then re-author the styling and positioning in a captioning suite like Subtitle Edit or EZTitles. Don't expect a clean round-trip if the original TTML carried IMSC1 positioning that a streamer's QC pipeline depends on.