
SSML and HTML Support

SSML (Speech Synthesis Markup Language)
- speak
- voice
- prosody
- say-as
- mark
- s
- p
- sub
- tts:style
- audio
- emphasis
- break
HTML
References
- SSML
- HTML

SSML (Speech Synthesis Markup Language)

SSML consists of XML-like tags, for example: Did you mean the <emphasis level="strong"><prosody pitch="75">green</prosody></emphasis> beans?
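As a rough illustration, the example above can be wrapped in a speak element, checked for well-formedness, and turned into an argument list for the espeak-ng command line program, whose -m option asks it to interpret SSML markup. This is only a sketch: the helper names check_well_formed and espeak_command are hypothetical, and it assumes espeak-ng is installed and on the PATH.

```python
import xml.etree.ElementTree as ET

# The example sentence from the text, wrapped in a <speak> root element.
SSML = (
    "<speak>Did you mean the "
    '<emphasis level="strong"><prosody pitch="75">green</prosody></emphasis>'
    " beans?</speak>"
)


def check_well_formed(ssml):
    """Return the root tag name; raises ET.ParseError if the markup is malformed."""
    return ET.fromstring(ssml).tag


def espeak_command(ssml):
    """Build an argument list for espeak-ng (hypothetical helper).

    The -m option tells espeak-ng to interpret SSML markup in the text.
    """
    return ["espeak-ng", "-m", ssml]
```

Passing the result of espeak_command(SSML) to subprocess.run() should speak the sentence on a machine where espeak-ng is available.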

The following markup tags and attributes are recognised:

speak

xml:base (the value is passed back unchanged as a parameter of the UriCallback() function)
xml:lang

voice

xml:lang
name
age
variant
gender

prosody

rate (x-slow, slow, medium, fast, x-fast or a percentage such as 125%)
volume (silent, x-soft, soft, medium, loud, x-loud, +1dB or -1dB)
pitch (a number, for example “75”)
range (default, x-low, low, medium, high, x-high)
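The prosody values listed above can be checked before the markup is generated. The following is a minimal sketch, not part of eSpeak itself: the prosody() helper is hypothetical, and its validation only mirrors the value lists on this page.

```python
from xml.sax.saxutils import escape, quoteattr

# Named values copied from the attribute list above.
RATES = {"x-slow", "slow", "medium", "fast", "x-fast"}
VOLUMES = {"silent", "x-soft", "soft", "medium", "loud", "x-loud"}
RANGES = {"default", "x-low", "low", "medium", "high", "x-high"}


def prosody(text, rate=None, volume=None, pitch=None, range_=None):
    """Wrap text in a <prosody> element (hypothetical helper, not an eSpeak API)."""
    # rate: a named value or a percentage such as "125%"
    if rate is not None and rate not in RATES and not rate.endswith("%"):
        raise ValueError(f"unsupported rate: {rate!r}")
    # volume: a named value or a relative change such as "+1dB"
    if volume is not None and volume not in VOLUMES and not volume.endswith("dB"):
        raise ValueError(f"unsupported volume: {volume!r}")
    if range_ is not None and range_ not in RANGES:
        raise ValueError(f"unsupported range: {range_!r}")

    attrs = ""
    for name, value in (("rate", rate), ("volume", volume),
                        ("pitch", pitch), ("range", range_)):
        if value is not None:
            attrs += f" {name}={quoteattr(str(value))}"
    return f"<prosody{attrs}>{escape(text)}</prosody>"
```

For example, prosody("green", rate="125%", pitch=75) produces a prosody element with rate and pitch attributes around the word "green".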

say-as

interpret-as="characters"
interpret-as="characters" format="glyphs"
interpret-as="tts:key"
interpret-as="tts:char"
interpret-as="tts:digits"
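A small sketch of generating say-as markup restricted to the combinations listed above. The say_as() helper is hypothetical; the assumption that format="glyphs" is only valid together with interpret-as="characters" is taken from the list, not from eSpeak's source.

```python
from xml.sax.saxutils import escape, quoteattr

# interpret-as values listed on this page.
INTERPRET_AS = {"characters", "tts:key", "tts:char", "tts:digits"}


def say_as(text, interpret_as, fmt=None):
    """Wrap text in a <say-as> element (hypothetical helper, not an eSpeak API)."""
    if interpret_as not in INTERPRET_AS:
        raise ValueError(f"unsupported interpret-as: {interpret_as!r}")
    # Assumption: the only format shown on this page is "glyphs" with "characters".
    if fmt is not None and (interpret_as, fmt) != ("characters", "glyphs"):
        raise ValueError(f"unsupported format {fmt!r} for {interpret_as!r}")

    attrs = f" interpret-as={quoteattr(interpret_as)}"
    if fmt is not None:
        attrs += f" format={quoteattr(fmt)}"
    return f"<say-as{attrs}>{escape(text)}</say-as>"
```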

mark

name

s

xml:lang

p

xml:lang

sub

alias

tts:style

field="punctuation" mode (none, all or some)
field="capital_letters" mode (no, spelling, icon or pitch)

audio

src

emphasis

level (none, reduced, moderate, strong or x-strong)

break

strength
time
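This page does not list the accepted values for the break attributes. The sketch below assumes the values defined by the W3C SSML specification (strength: none, x-weak, weak, medium, strong or x-strong; time: a number followed by ms or s); whether eSpeak honours every one of them is not confirmed here. The break_tag() helper is hypothetical.

```python
import re
from xml.sax.saxutils import quoteattr

# Values from the W3C SSML specification, assumed rather than taken from this page.
STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
TIME_RE = re.compile(r"^\d+(\.\d+)?(ms|s)$")  # e.g. "500ms" or "1.5s"


def break_tag(strength=None, time=None):
    """Return an empty <break/> element (hypothetical helper, not an eSpeak API)."""
    if strength is not None and strength not in STRENGTHS:
        raise ValueError(f"unsupported strength: {strength!r}")
    if time is not None and not TIME_RE.match(time):
        raise ValueError(f"unsupported time: {time!r}")

    attrs = ""
    if strength is not None:
        attrs += f" strength={quoteattr(strength)}"
    if time is not None:
        attrs += f" time={quoteattr(time)}"
    return f"<break{attrs}/>"
```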

HTML

eSpeak can speak HTML text directly, or text containing both SSML and HTML markup.
Any unrecognised tags are ignored.

The following tags cause a sentence break:

br
dd
li
img
td

The following tags cause a paragraph break:

h1
h2
h3
h4
hr

Text between the following tags is ignored:

script
style
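The tag handling described above can be modelled with Python's standard html.parser module. This is only an illustration of the rules on this page, not eSpeak's implementation: the BreakModel class and model() function are hypothetical, and recording a break at the opening tag is an assumption.

```python
from html.parser import HTMLParser

# Tag groups copied from the lists above.
SENTENCE_BREAK = {"br", "dd", "li", "img", "td"}
PARAGRAPH_BREAK = {"h1", "h2", "h3", "h4", "hr"}
IGNORED_CONTENT = {"script", "style"}


class BreakModel(HTMLParser):
    """Record text and break events in document order (illustrative model only)."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED_CONTENT:
            self._skip_depth += 1
        elif self._skip_depth == 0:
            if tag in SENTENCE_BREAK:
                self.events.append(("sentence-break", tag))
            elif tag in PARAGRAPH_BREAK:
                self.events.append(("paragraph-break", tag))
            # Any other tag is unrecognised and therefore ignored.

    def handle_endtag(self, tag):
        if tag in IGNORED_CONTENT and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script> or <style> is dropped; whitespace-only runs are skipped.
        if self._skip_depth == 0 and data.strip():
            self.events.append(("text", data.strip()))


def model(html):
    """Parse an HTML fragment and return the list of recorded events."""
    parser = BreakModel()
    parser.feed(html)
    parser.close()
    return parser.events
```

Calling model() on a fragment returns tuples such as ("paragraph-break", "h1") or ("text", "Title"), which makes it easy to see where pauses would fall under these rules.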

References

SSML

Speech Synthesis Markup Language (SSML) Version 1.0. W3C Recommendation, 3 March 2009. W3C.
Speech Synthesis Markup Language (SSML) Version 1.1. W3C Recommendation, 7 September 2010. W3C.
SSML 1.0 say-as attribute values. W3C NOTE, 26 May 2005. W3C.

HTML

HTML 5.2. W3C Recommendation, 14 December 2017. W3C.
HTML Living Standard. Continually updated. WHATWG.
