Word Highlighting System
A solution for quick, accurate audio timing and SQL text/timing code generation

 

Fathom's Word Highlighting System synchronizes text with voice-over or native-speaker narration. It works with 11 languages, and more are in development. The web-based user interface automates the process, offering an order-of-magnitude efficiency gain that saves audio engineers and developers time and money. Input text and audio, and output code directly to your SQL tables.


Step by step process

Create Transcript

The Word Highlighting System needs a transcript of the book. A transcript is a text file of the whole book, broken into pages using character sequences, with line breaks to represent the end of sentences, and double line breaks to indicate paragraphs or visual breaks.

Scrub

Scrub creates a single text transcript file by reading the HTML of the pages, removing captions, separating text from HTML tags, using whitespace to separate words, processing visual line breaks, unifying line endings, ensuring punctuation consistency, identifying acronyms and numbers with decimal points, and coalescing unnecessary whitespace.
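A few of the Scrub steps can be sketched in Python. This is a minimal illustration, not the system's actual code: the function name scrub is hypothetical, and caption removal, punctuation normalization, and acronym/decimal detection are left out.

```python
import html
import re


def scrub(page_html: str) -> str:
    """Reduce one page of HTML to plain transcript text (illustrative only)."""
    # Treat <br> and closing block tags as visual line breaks.
    text = re.sub(r"(?i)<br\s*/?>|</p>|</div>", "\n", page_html)
    # Drop every remaining tag, leaving a space so adjacent words stay separated.
    text = re.sub(r"<[^>]+>", " ", text)
    # Decode entities such as &amp; and turn non-breaking spaces into plain ones.
    text = html.unescape(text).replace("\u00a0", " ")
    # Unify line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Coalesce runs of spaces and tabs, then trim each line.
    text = re.sub(r"[ \t]+", " ", text)
    lines = [line.strip() for line in text.split("\n")]
    # Collapse three or more consecutive breaks down to one paragraph break.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
```

Regular expressions are enough for a sketch like this; malformed or deeply nested HTML would likely call for a real parser.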

Rip Transcript

Rip Transcript uses the scrubbed transcript to create a separate file for each set of facing pages. It also unifies line endings and inserts spaces between characters for non-alphabetic languages.
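The splitting step might look roughly like the following Python sketch. The page marker "[[PAGE]]" is a made-up placeholder (the real character sequence is not stated here), the pairing of pages as (1, 2), (3, 4) is an assumption about how facing pages are grouped, and the function returns strings rather than writing files.

```python
PAGE_MARKER = "[[PAGE]]"  # hypothetical; the real page-break sequence is not documented here


def rip(transcript: str, spaced: bool = False) -> list[str]:
    """Split a scrubbed transcript into one string per pair of facing pages."""
    # Unify line endings first.
    transcript = transcript.replace("\r\n", "\n").replace("\r", "\n")
    pages = [p.strip() for p in transcript.split(PAGE_MARKER) if p.strip()]
    # Assumed pairing: pages (1, 2), (3, 4), ... form one spread each.
    spreads = ["\n\n".join(pages[i:i + 2]) for i in range(0, len(pages), 2)]
    if spaced:
        # For scripts written without spaces, separate every character
        # so that each one can be timed as its own "word".
        spreads = [
            "\n".join(" ".join(line.replace(" ", "")) for line in s.split("\n"))
            for s in spreads
        ]
    return spreads
```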

Make Page HTML

Make Page HTML uses the ripped transcript file to number each word uniquely with a CSS identifier and to keep track of changing page numbers. It also places punctuation so that it is not included in the word highlighting. These HTML files are then uploaded to the database manually by an administrator and can be customized by hand as needed.
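As a rough Python illustration of the word-numbering idea, the sketch below wraps each word in a span with a sequential id and leaves punctuation outside the span. The id scheme ("w1", "w2", ...) and the function name are assumptions, and page-number tracking is omitted.

```python
import html
import re

# A word is a run of word characters, optionally with inner apostrophes (don't);
# any other run of characters is punctuation or spacing.
TOKEN = re.compile(r"\w+(?:'\w+)*|\W+")


def make_page_html(text: str, first_id: int = 1) -> tuple[str, int]:
    """Return (html, next unused id number) for one block of transcript text."""
    parts = []
    next_id = first_id
    for token in TOKEN.findall(text):
        if re.match(r"\w", token):
            # Only the word itself goes inside the highlightable span.
            parts.append(f'<span id="w{next_id}">{html.escape(token, quote=False)}</span>')
            next_id += 1
        else:
            # Punctuation and whitespace stay outside, so they are never highlighted.
            parts.append(html.escape(token, quote=False))
    return "".join(parts), next_id
```

Returning the next unused number lets a caller continue the sequence across pages, for example by passing it as first_id for the following spread.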

Force Alignment

Force Alignment uses the ripped transcript files and narrated audio files to generate a set of JSON-encoded files containing timing data. This data consists of start and end times for each identified word, along with its corresponding CSS identifier and the word it matches.
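One way to picture a timing record is the Python sketch below. The field names (css_id, word, start, end) and the use of seconds are assumptions chosen for illustration; the actual JSON layout produced by Force Alignment is not specified here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    css_id: str   # identifier of the word's element in the Page HTML
    word: str     # the transcript word that was matched
    start: float  # assumed unit: seconds from the start of the audio file
    end: float


def load_timings(raw: str) -> list[WordTiming]:
    """Parse a JSON array of timing records and check each interval."""
    timings = [WordTiming(**entry) for entry in json.loads(raw)]
    for t in timings:
        if not 0 <= t.start <= t.end:
            raise ValueError(f"bad interval for {t.css_id}: {t.start}..{t.end}")
    return timings
```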

Create SQL

Create SQL parses the JSON output of the Force Alignment step, converts it into SQL statements, and outputs a single SQL file. This SQL file can then be run against the database to insert new timing data or update existing timing data for a book.
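The conversion could be sketched in Python as follows. The table name word_timing, its columns, and the JSON field names are all hypothetical, and replacing a book's rows with a DELETE followed by INSERTs is only one possible way to handle the "insert or update" case.

```python
import json


def sql_literal(value) -> str:
    """Quote a Python value as a SQL literal (strings get '' escaping)."""
    if isinstance(value, (int, float)):
        return repr(value)
    return "'" + str(value).replace("'", "''") + "'"


def create_sql(book_id: int, alignment_json: str) -> str:
    """Turn a JSON array of timing records into one SQL script."""
    rows = json.loads(alignment_json)
    # Assumed strategy: clear the book's old timings, then insert the new ones.
    statements = [f"DELETE FROM word_timing WHERE book_id = {sql_literal(book_id)};"]
    for row in rows:
        values = ", ".join(
            sql_literal(v)
            for v in (book_id, row["css_id"], row["word"], row["start"], row["end"])
        )
        statements.append(
            "INSERT INTO word_timing (book_id, css_id, word, start_time, end_time) "
            f"VALUES ({values});"
        )
    return "\n".join(statements) + "\n"
```

Because the output is a standalone SQL file, the values have to be written as literals; code that talks to the database directly would normally use parameterized queries instead.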

Renumber Words

Renumber Words is used to “re-sequence” all CSS identifiers in the Page HTML and timing data if the original output is sequentially out of sync. This tool also makes hand-edits to the database easier, because identifiers do not have to be renumbered by hand.
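Re-sequencing can be illustrated with a short Python sketch: walk the Page HTML in document order, assign fresh sequential identifiers, and apply the same mapping to the timing data. The id pattern ("w" followed by digits) and the css_id field name are assumptions.

```python
import re

ID_ATTR = re.compile(r'id="(w\d+)"')  # hypothetical identifier pattern


def renumber(page_html: str, timings: list[dict]) -> tuple[str, list[dict]]:
    """Renumber word ids in document order and remap the timing records."""
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        old = match.group(1)
        # The first id seen becomes w1, the second w2, and so on.
        mapping.setdefault(old, f"w{len(mapping) + 1}")
        return f'id="{mapping[old]}"'

    new_html = ID_ATTR.sub(replace, page_html)
    # Timing records whose id no longer appears in the HTML are dropped.
    new_timings = [
        {**t, "css_id": mapping[t["css_id"]]} for t in timings if t["css_id"] in mapping
    ]
    return new_html, new_timings
```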

Clean

Clean removes all intermediate files. Use it when old, disused intermediate files have accumulated, so that they are not unintentionally incorporated into the result when the transcript steps are repeated.

Apply Fine Tuning

Apply Fine Tuning is used to adjust timing manually if the audio timing is inaccurate. It takes in a JSON file and an audio file and outputs a replacement JSON file.
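What a manual adjustment might involve is sketched below in Python, under stated assumptions: the adjustment shown is a single uniform offset applied to every word, the JSON field names start and end are hypothetical, and the audio file is represented only by its duration in seconds, which bounds the shifted times. The real tool may adjust timings quite differently.

```python
import json


def apply_fine_tuning(alignment_json: str, offset: float, audio_duration: float) -> str:
    """Shift every word's timing by offset seconds and return replacement JSON."""
    tuned = []
    for row in json.loads(alignment_json):
        # Keep each interval inside the audio and never let end precede start.
        start = min(max(row["start"] + offset, 0.0), audio_duration)
        end = min(max(row["end"] + offset, start), audio_duration)
        tuned.append({**row, "start": round(start, 3), "end": round(end, 3)})
    return json.dumps(tuned, ensure_ascii=False, indent=2)
```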