Convert TTML to SBV Online

Turn TTML subtitle files into SBV format quickly—upload your .ttml, convert, and download the SBV subtitle file in seconds.

How to Convert TTML to SBV Online

Upload Your TTML File: Drag and drop a .ttml (or .dfxp — same XML grammar) file, or click "Add Files" to pick one. Batch is supported, so you can queue an entire season of captions in one pass. Everything runs in your browser session — the file never leaves your device.
Confirm SBV as the Output: SBV is preselected. The converter parses the TTML XML tree, walks every <p> element with begin / end (or dur) attributes, normalises the timestamps to YouTube's H:MM:SS.mmm form, and writes each cue as its own block, separated from the next by a blank line. No bitrate, frame rate, or container settings are needed: subtitle conversion is purely a text/structure transform.
Review and (Optionally) Edit: Open the preview to scan how multi-line cues, speaker labels, and <br/> line breaks land in plain text. SBV strips all styling by design: bold, italic, colour, positioning, and region anchors are dropped because YouTube's SBV parser ignores markup. If you need styling preserved, use TTML to WebVTT or TTML to SRT (with basic tags) instead.
Convert and Download: Click "Convert" and grab the .sbv file (or a ZIP for batch jobs). Drop it straight into YouTube Studio's "Subtitles" tab under "Upload file → With timing."
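
The parse, walk, normalise, and write steps above can be sketched in a few lines of Python. This is a minimal illustration, not the converter's actual code: the function names (clock_to_ms, ms_to_sbv, cue_text, ttml_to_sbv) are hypothetical, and it assumes clock-time timestamps only and the standard TTML namespace.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"  # standard TTML namespace


def clock_to_ms(value: str) -> int:
    """Parse a clock-time expression such as 00:00:04.160 into milliseconds."""
    hours, minutes, seconds = value.split(":")
    return round((int(hours) * 3600 + int(minutes) * 60 + float(seconds)) * 1000)


def ms_to_sbv(ms: int) -> str:
    """Format milliseconds as H:MM:SS.mmm, the form SBV uses."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def cue_text(element: ET.Element) -> str:
    """Flatten an element: keep text, turn <br/> into newlines, drop styling."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == f"{{{TT_NS}}}br":
            parts.append("\n")
        else:
            parts.append(cue_text(child))  # e.g. a styled <span>
        parts.append(child.tail or "")
    return "".join(parts)


def ttml_to_sbv(ttml: str) -> str:
    """Convert a TTML string to SBV text (clock-time timestamps only)."""
    root = ET.fromstring(ttml)
    blocks = []
    for p in root.iter(f"{{{TT_NS}}}p"):
        begin = clock_to_ms(p.attrib["begin"])
        if "end" in p.attrib:
            end = clock_to_ms(p.attrib["end"])
        else:
            end = begin + clock_to_ms(p.attrib["dur"])  # end = begin + dur
        blocks.append(f"{ms_to_sbv(begin)},{ms_to_sbv(end)}\n{cue_text(p).strip()}")
    return "\n\n".join(blocks) + "\n"
```

Styling attributes never reach the output because only element text is collected; nothing has to be stripped explicitly.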

Why Convert TTML to SBV?

TTML (Timed Text Markup Language) is the W3C's XML-based caption format — the workhorse of Netflix, BBC iPlayer, Apple TV, and most broadcast and OTT delivery pipelines via the IMSC1 profile. SBV (sometimes called SubViewer 2.0) is YouTube's stripped-down internal format: timestamps and plain text, no indices, no styling, no XML wrapper. Conversion is mostly a "downgrade" — you're flattening rich, regionally-positioned, styled captions into the simplest text-and-timing form a video platform accepts. Typical reasons to do it:

Repurpose broadcast masters for YouTube — A captioning house delivers IMSC1 TTML files to a streamer; the social team wants to publish the same episodes on the brand's YouTube channel. SBV (or SRT) is the format YouTube Studio accepts under "Upload file → With timing."
Legacy YouTube workflows and CMS templates — Older multi-channel network (MCN) tools and partner CMS scripts were built around .sbv because that's what YouTube exported in the early CC era. If your in-house tooling still parses SBV, converting upstream TTML keeps the pipeline intact.
Strip styling for re-edit — TTML can carry positional overrides (tts:origin, tts:extent, tts:textAlign), colour, font, and region metadata that aren't needed for a web review pass. SBV's plain text is faster to copy-paste into a Google Doc for a producer to mark up.
Lightweight transcript extraction — SBV is two-line cues with a blank-line separator; running awk or a quick script across an SBV file is dramatically easier than walking an XML tree. Once you're in SBV, you're one step from a clean transcript.
Cross-tool sanity check — Different TTML authoring tools (Caption Maker, EZTitles, Subtitle Edit, MacCaption) interpret tickRate, frameRate, and SMPTE timestamps differently. Round-tripping TTML → SBV → SRT is a quick way to expose timing drift before delivery.
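
As an illustration of the transcript-extraction point above, here is a short Python sketch (the name sbv_to_transcript is hypothetical). It assumes well-formed SBV in which every block starts with a "start,end" timestamp line.

```python
import re


def sbv_to_transcript(sbv: str) -> str:
    """Drop the timestamp line of each SBV block and join the caption text."""
    lines = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", sbv.replace("\r\n", "\n").strip()):
        rows = block.split("\n")
        # rows[0] is "start,end"; everything after it is caption text.
        text = " ".join(row.strip() for row in rows[1:] if row.strip())
        if text:
            lines.append(text)
    return "\n".join(lines)
```

Each cue becomes one transcript line, with multi-line cues joined by a space.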

TTML vs SBV — What Carries Over and What Gets Dropped

Property	TTML (input)	SBV (output)
Container	XML (`<tt>` root with multiple namespaces)	Plain UTF-8 text, no headers
Maintained by	W3C Timed Text WG (TTML2 Rec 2018, IMSC1.3 Rec 2026)	YouTube (internal, undocumented spec)
Timestamp form	`HH:MM:SS.fff` clock-time, `HH:MM:SS:FF` SMPTE, or media-time offsets	`H:MM:SS.mmm` clock-time only
Styling	Full: colour, font family/size/weight, italics, underline, background, opacity	None — all tags stripped
Positioning / regions	Yes (`region`, `tts:origin`, `tts:extent`)	None
Animation	`<animate>` / `<set>` elements	None
Speaker / role markers	`<span tts:fontStyle="italic">` or `ttm:role="dialogue"`	Convention: prefix line with `>>` and the speaker name
Cue index numbers	Implicit via document order	None — cues separated by a blank line
Multi-line cue	`<br/>` inside `<p>`	Literal newline inside the cue block
Maximum cue length	No spec limit	Practical: YouTube renders ~40 chars per line, 2 lines max
Use case	Broadcast, OTT delivery (Netflix, BBC, Apple TV), accessibility compliance	YouTube uploads, legacy MCN tooling

Quick Reference: How a TTML Cue Becomes an SBV Cue

TTML fragment	SBV output
`<p begin="00:00:00.599" end="00:00:04.160">Hi, my name is Alice.</p>`	`0:00:00.599,0:00:04.160` `Hi, my name is Alice.`
`<p begin="00:00:05.000" dur="00:00:02.500"><span tts:fontStyle="italic">whispering</span> Come closer.</p>`	`0:00:05.000,0:00:07.500` `whispering Come closer.` (italic dropped)
`<p begin="10s" end="13s">Line one<br/>Line two</p>`	`0:00:10.000,0:00:13.000` `Line one` `Line two`
`<p begin="00:00:15:12" end="00:00:18:00" region="bottom">SMPTE 30fps cue</p>`	`0:00:15.400,0:00:18.000` `SMPTE 30fps cue` (region dropped, frames converted to ms)
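
The timestamp handling in the table can be approximated with the Python sketch below. The helper names (to_ms, sbv_time) are hypothetical, and the parser is deliberately simplified: it covers clock-time, offsets with the h, m, s, and ms metrics, and non-drop-frame SMPTE-style values, with the frame rate passed in as a parameter.

```python
import re


def to_ms(expr: str, fps: float = 30.0) -> int:
    """Convert a (simplified) TTML time expression to milliseconds."""
    # Offset time: a number followed by a metric, e.g. 15.5s or 250ms.
    offset = re.fullmatch(r"(\d+(?:\.\d+)?)(h|m|s|ms)", expr)
    if offset:
        scale = {"h": 3_600_000, "m": 60_000, "s": 1000, "ms": 1}[offset.group(2)]
        return round(float(offset.group(1)) * scale)
    fields = expr.split(":")
    if len(fields) == 3:  # clock-time HH:MM:SS.fff
        h, m, s = fields
        return round((int(h) * 3600 + int(m) * 60 + float(s)) * 1000)
    if len(fields) == 4:  # SMPTE-style HH:MM:SS:FF, final field is a frame count
        h, m, s, f = (int(x) for x in fields)
        return (h * 3600 + m * 60 + s) * 1000 + round(f * 1000 / fps)
    raise ValueError(f"unsupported time expression: {expr!r}")


def sbv_time(ms: int) -> str:
    """Format milliseconds as H:MM:SS.mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"
```

For example, sbv_time(to_ms("00:00:15:12")) yields 0:00:15.400 at the default 30 fps, matching the last row of the table.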

Frequently Asked Questions

Will YouTube accept the SBV file I get out of this converter?

Yes. YouTube Studio's "Subtitles → Upload file → With timing" path accepts .sbv directly. YouTube's documented basic subtitle formats are SubRip (.srt), SubViewer (.sbv / .sub), MPsub, LRC, and Videotron Lambda; SBV is one of the original native upload formats. That said, for new workflows YouTube also accepts TTML directly — if you're not bound to legacy tooling, uploading the original TTML preserves styling and positioning that SBV can't carry.

Why does my output have no styling, colours, or positioning?

SBV is a minimal text format with no styling grammar. The TTML attributes tts:color, tts:fontStyle, tts:fontWeight, tts:textDecoration, tts:origin, tts:extent, and region references all get discarded during conversion. If you need any of those preserved, TTML to WebVTT is a better target: WebVTT supports basic <b>, <i>, <u>, and positioning hints. SRT preserves italics and bold via inline HTML-like tags. Convert to SBV only when the destination (YouTube, an MCN tool) specifically requires it.

How are SMPTE timestamps like `00:00:15:12` handled?

TTML can express times as clock-time (00:00:15.500), media-time offsets (15.5s), or SMPTE timecode (00:00:15:12 where the final field is a frame count). SBV only supports clock-time with milliseconds. The converter reads the TTML document's ttp:frameRate (and ttp:frameRateMultiplier for fractional NTSC rates such as 29.97 fps) to translate frames to milliseconds; for example, frame 12 at 30 fps becomes 400 ms. If ttp:frameRate is missing, the spec default of 30 fps applies, which skews the sub-second part of each timestamp for 24 fps or 25 fps source content (frame 12 is really 500 ms at 24 fps and 480 ms at 25 fps). Check the preview before publishing.
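
The frame arithmetic can be checked with a small Python sketch. The function names (effective_fps, frames_to_ms) are hypothetical, and drop-frame timecode is out of scope here; the sketch only shows how ttp:frameRate and ttp:frameRateMultiplier combine and what the 30 fps default does to 24 fps or 25 fps material.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

TTP = "http://www.w3.org/ns/ttml#parameter"  # TTML parameter namespace


def effective_fps(root: ET.Element) -> Fraction:
    """frameRate times frameRateMultiplier, defaulting to 30 and '1 1'."""
    base = int(root.get(f"{{{TTP}}}frameRate", "30"))
    numerator, denominator = root.get(f"{{{TTP}}}frameRateMultiplier", "1 1").split()
    return Fraction(base) * Fraction(int(numerator), int(denominator))


def frames_to_ms(frames: int, fps: Fraction) -> int:
    """Convert a frame count to milliseconds at the given frame rate."""
    return round(Fraction(frames * 1000) / fps)
```

Frame 12 comes out as 400 ms at 30 fps, 500 ms at 24 fps, and 480 ms at 25 fps, so a missing ttp:frameRate on 24 fps material shifts that cue by 100 ms.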

What's the difference between TTML and DFXP, and which does this tool accept?

DFXP ("Distribution Format Exchange Profile") was the original name for what is now TTML1; the file extensions .ttml, .dfxp, and .xml are used interchangeably for the same underlying XML grammar. YouTube treats .dfxp and .ttml as the same input type. This converter accepts both .ttml and .dfxp; if a .dfxp file is not picked up by the upload dialog, rename it to .ttml, since the contents are parsed identically.

Why does the SBV output show `>>` before speaker names?

That's the YouTube convention for speaker identification in SBV files — there's no formal <speaker> element, so caption authors prefix the line with >> followed by the speaker name (>> ALICE: Hi). If your TTML uses ttm:role="dialogue" with a speaker name in <ttm:agent>, the converter folds that into the SBV cue text. If your TTML has no speaker metadata, the cue is emitted as plain text with no prefix.
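
One way such folding could work is sketched below in Python. This is an assumption about the mechanism, not the converter's documented behaviour: it resolves a ttm:agent attribute on each <p> to the ttm:name of the matching <ttm:agent> element and upper-cases it, as in the >> ALICE: example. The function name speaker_prefixed_cues is hypothetical.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
TTM = "{http://www.w3.org/ns/ttml#metadata}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def speaker_prefixed_cues(ttml: str) -> list[str]:
    """Return cue texts, prefixed with '>> NAME:' when a speaker is known."""
    root = ET.fromstring(ttml)
    # Map each agent's xml:id to its display name.
    names = {}
    for agent in root.iter(f"{TTM}agent"):
        agent_id = agent.get(XML_ID)
        name = agent.find(f"{TTM}name")
        if agent_id and name is not None and name.text:
            names[agent_id] = name.text.strip()
    cues = []
    for p in root.iter(f"{TT}p"):
        text = "".join(p.itertext()).strip()
        speaker = names.get(p.get(f"{TTM}agent", ""))
        cues.append(f">> {speaker.upper()}: {text}" if speaker else text)
    return cues
```

Cues without a resolvable agent reference fall through unchanged, which matches the no-prefix case described above.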

My TTML has multiple language tracks. Will I get one SBV per language?

TTML supports multiple <div xml:lang="..."> blocks in a single document for parallel-language captioning. SBV has no language metadata field — one SBV file is implicitly one language. If your TTML contains multiple language tracks, run the conversion once per language and pick the corresponding xml:lang filter, or split the source TTML upstream. YouTube tags the language at upload time via the file-naming convention or the Studio language picker, not from the SBV content.
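
Selecting one language before conversion might look like the Python sketch below. The function name cues_for_language is hypothetical, and the sketch assumes each language sits in its own non-nested <div> carrying an exact xml:lang value.

```python
import xml.etree.ElementTree as ET

TT = "{http://www.w3.org/ns/ttml}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def cues_for_language(ttml: str, lang: str) -> list[str]:
    """Collect the text of every <p> inside <div> blocks tagged with lang."""
    root = ET.fromstring(ttml)
    cues = []
    for div in root.iter(f"{TT}div"):
        if div.get(XML_LANG) == lang:  # exact match, e.g. "en" but not "en-GB"
            cues.extend("".join(p.itertext()).strip() for p in div.iter(f"{TT}p"))
    return cues
```

Running it once per language code gives one cue list per output SBV file.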

Is there a maximum cue length or duration I should worry about?

Neither format imposes a hard limit, but YouTube renders SBV at roughly 40 characters per line and 2 lines per cue before truncating, with a recommended duration of 1–7 seconds per cue. If your source TTML has long cues meant for a 1920×1080 broadcast frame, you may want to re-segment after conversion — Subtitle Edit and Aegisub both have automatic re-wrap tools that operate on SBV/SRT.
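
For a rough idea of what re-segmentation involves, the Python sketch below wraps cue text to 40 characters and groups the result into chunks of at most two lines. The function name rewrap is hypothetical, and splitting the cue's timing across the chunks is left out.

```python
import textwrap


def rewrap(text: str, width: int = 40, max_lines: int = 2) -> list[list[str]]:
    """Wrap text to `width` characters and group lines into cue-sized chunks."""
    # Collapse existing line breaks and runs of spaces before wrapping.
    lines = textwrap.wrap(" ".join(text.split()), width=width)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]
```

Each inner list is a candidate cue; a long broadcast cue typically turns into two or more of them, which then need their own start and end times.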

Does this happen in my browser or get uploaded somewhere?

Conversion runs entirely client-side in your browser session — the TTML is parsed, transformed, and written to a downloadable .sbv blob without any server round-trip. No sign-up, no watermark, no email gating. Files are released from memory when you close the tab.

How do I go the other direction — SBV back to TTML for broadcast delivery?

SBV strips so much information that round-tripping back to broadcast-quality TTML is lossy — you can recover the timing and the text, but not the original styling, regions, or speaker roles. If you must, use a dedicated SBV-to-TTML tool and then re-author the styling and positioning in a captioning suite like Subtitle Edit or EZTitles. Don't expect a clean round-trip if the original TTML carried IMSC1 positioning that a streamer's QC pipeline depends on.
