Who has ever heard of the track element?

2023-02-19

I learned about a new-to-me HTML element today and realized it might prove useful in the future for the kinds of bug report videos I tend to capture.

To the point: the <track> element lets you associate text with a media element like audio or video.

An Example

I've previously written about an annoying inconsistency, and while I included a video to demonstrate it, it is not exactly clear what I was doing. I think this could be a great case for time-coded annotations like the ones WebVTT (VTT) provides. I don't think this is outside the intended use of the feature.

How it is Done

  <video src="/2020/static/info-menu-weirdness.webm" controls
         preload="none"
         style="border:1px solid #424242"
         alt="demonstrating the odd toggling behavior and layering in the channel details menu over threads">
    <track default
           srclang="en"
           src="/2023/static/test.vtt">
  </video>

The test.vtt file referenced above looks like this:

WEBVTT description for slack menu confusion demo

00:01.000 --> 00:03.000
open information pane

00:03.000 --> 00:06.000
toggle information pane

00:08.000 --> 00:10.000
back button closes pane

00:10.000 --> 00:13.000
show information pane

00:13.000 --> 00:15.000
open thread

00:15.000 --> 00:17.000
closing thread reveals information pane

00:20.000 --> 00:24.000
open information pane
navigate back to close

00:24.000 --> 00:27.000
open thread

00:27.000 --> 00:30.000
closing thread reveals information pane

00:32.000 --> 00:35.000
open thread
toggle information pane

00:35.000 --> 00:39.000
information pane is closed
thread is not visible

Personally, I think this could be especially useful because I tend to script these sorts of videos in advance. Scripting helps me focus on demonstrating precisely what I intend and minimizes potentially confusing, inessential actions. Those scripts could become the VTT track paired with the video.
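As a rough sketch of how a script like that might be turned into the text of a VTT file, something along these lines could work. The function names (formatTimestamp, scriptToVtt) and the shape of the steps array are made up for illustration; they are not part of any browser API, and the timings are simply whole seconds typed in by hand.

```javascript
// Format a number of seconds as a WebVTT timestamp:
// mm:ss.ttt, or hh:mm:ss.ttt once the time passes one hour.
function formatTimestamp(totalSeconds) {
  const ms = Math.round(totalSeconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  const millis = ms % 1000;
  const base = `${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
  return hours > 0 ? `${pad(hours, 2)}:${base}` : base;
}

// Build the text of a VTT file from a description and a list of
// { start, end, lines } steps (hypothetical shape, times in seconds).
function scriptToVtt(description, steps) {
  const cues = steps.map(
    ({ start, end, lines }) =>
      `${formatTimestamp(start)} --> ${formatTimestamp(end)}\n${lines.join("\n")}`
  );
  // Cues are separated from the header and from each other by a blank line.
  return [`WEBVTT ${description}`, ...cues].join("\n\n") + "\n";
}

const vtt = scriptToVtt("description for slack menu confusion demo", [
  { start: 1, end: 3, lines: ["open information pane"] },
  { start: 3, end: 6, lines: ["toggle information pane"] },
  { start: 20, end: 24, lines: ["open information pane", "navigate back to close"] },
]);
console.log(vtt);
```

Keeping the steps as plain data means the same list can serve as the recording script and as the source for the track.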

Data URI

The one thing I find myself a bit unsure about is my own ability to keep some things in sync. I face the same issue with generated tables on this very site. My solution in most cases is to embed the source information where it is intended to be used. I found it is possible to use a data URI as the src attribute of a track and accomplish something similar.

In practice it ends up looking like this:

  <video src="/2020/static/info-menu-weirdness.webm" controls
         preload="none"
         style="border:1px solid #424242"
         alt="demonstrating the odd toggling behavior and layering in the channel details menu over threads">
    <track default
           srclang="en"
           src="data:text/vtt,WEBVTT%20description%20for%20slack%20menu%20confusion%20demo%0A%0A00%3A01.000%20--%3E%2000%3A03.000%0Aopen%20information%20pane%0A%0A00%3A03.000%20--%3E%2000%3A06.000%0Atoggle%20information%20pane%0A%0A00%3A08.000%20--%3E%2000%3A10.000%0Aback%20button%20closes%20pane%0A%0A00%3A10.000%20--%3E%2000%3A13.000%0Ashow%20information%20pane%0A%0A00%3A13.000%20--%3E%2000%3A15.000%0Aopen%20thread%0A%0A00%3A15.000%20--%3E%2000%3A17.000%0Aclosing%20thread%20reveals%20information%20pane%0A%0A00%3A20.000%20--%3E%2000%3A24.000%0Aopen%20information%20pane%0Anavigate%20back%20to%20close%0A%0A00%3A24.000%20--%3E%2000%3A27.000%0Aopen%20thread%0A%0A00%3A27.000%20--%3E%2000%3A30.000%0Aclosing%20thread%20reveals%20information%20pane%0A%0A00%3A32.000%20--%3E%2000%3A35.000%0Aopen%20thread%0Atoggle%20information%20pane%0A%0A00%3A35.000%20--%3E%2000%3A39.000%0Ainformation%20pane%20is%20closed%0Athread%20is%20not%20visible">
  </video>

It is certainly uglier, but I think it might integrate in the same way I sometimes use a feature of my editor to generate HTML tables from friendlier ASCII tables that I consider the source data. After all, the big ugly string above is basically just the result of calling url-hexify-string in Emacs. The same encoding is available in JavaScript (encodeURIComponent produces the same result for this text, whereas encodeURI would leave the colons unescaped) for those cases where Emacs is unavailable.
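For the JavaScript route, a minimal sketch might look like the following. The helper name vttToDataUri is my own invention for illustration. It uses encodeURIComponent, which percent-encodes the colons and the ">" the same way as the string above.

```javascript
// Wrap the text of a VTT file in a data URI suitable for a track's src.
// vttToDataUri is a hypothetical helper name, not a built-in.
function vttToDataUri(vttText) {
  return "data:text/vtt," + encodeURIComponent(vttText);
}

const sample = "WEBVTT\n\n00:01.000 --> 00:03.000\nopen information pane";
const uri = vttToDataUri(sample);
console.log(uri);
// data:text/vtt,WEBVTT%0A%0A00%3A01.000%20--%3E%2000%3A03.000%0Aopen%20information%20pane
```

Decoding everything after the comma with decodeURIComponent gives back the original text, which is handy for checking that nothing was mangled along the way.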

Thoughts

Way back in 2015 I cheerily wrote about some new browser features and the direction of application development on the web. While I have grown less chipper about the results of web application development, I do find, in instances like this one, that browsers make for pretty good note-taking platforms. It can be hard to reconcile all the things I don't like with these genuinely useful bits scattered throughout.

Hacked Up Demo

I am insufficiently motivated to build this idea out too far, but I am surprised at how easy it is to develop a VTT file alongside the video playback with something as simple as the following. It is extremely naive in its implementation: it grabs the current video time and appends it to the contents of the textarea. I have omitted handling "real" timestamps and only capture whole seconds. Similarly, it just appends to the textarea without validating the contents or doing anything smart when the video is seeked forward or backward. It seems to work surprisingly well though (especially considering it is under 20 lines of JavaScript):