SSML and HTML Support
SSML (Speech Synthesis Markup Language)
SSML consists of XML-like tags, for example: Did you mean the <emphasis level="strong"><prosody pitch="75">green</prosody></emphasis> beans?
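As a minimal sketch, the example above can be generated programmatically so that the tags are always well-formed. The helper name `build_ssml` is hypothetical, and the final command line assumes an `espeak-ng` executable whose `-m` option enables markup interpretation; the code only builds the argument list and does not run anything.

```python
import xml.etree.ElementTree as ET


def build_ssml(before, word, after, pitch="75", level="strong"):
    """Return an SSML string that emphasises one word of a sentence.

    build_ssml is a hypothetical helper, not part of eSpeak.
    """
    speak = ET.Element("speak")
    speak.text = before
    emphasis = ET.SubElement(speak, "emphasis", {"level": level})
    prosody = ET.SubElement(emphasis, "prosody", {"pitch": pitch})
    prosody.text = word
    # Text following the closing </emphasis> tag is stored as the element's tail.
    emphasis.tail = after
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Did you mean the ", "green", " beans?")

# Assumed invocation: -m asks the espeak-ng command to interpret SSML markup.
command = ["espeak-ng", "-m", ssml]
```

Building the markup through an XML library rather than string concatenation means special characters such as `&` or `<` in the spoken text are escaped automatically.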
The following markup tags and attributes are recognised:
speak
- xml:base (the value is simply passed back as a parameter to the UriCallback() function)
- xml:lang
voice
- xml:lang
- name
- age
- variant
- gender
prosody
- rate (x-slow, slow, medium, fast, x-fast or a percentage such as 125%)
- volume (silent, x-soft, soft, medium, loud, x-loud, +1dB or -1dB)
- pitch (a number, for example "75")
- range (default, x-low, low, medium, high, x-high)
say-as
- interpret-as="characters"
- interpret-as="characters" format="glyphs"
- interpret-as="tts:key"
- interpret-as="tts:char"
- interpret-as="tts:digits"
mark
s
p
sub
tts:style
- field="punctuation" mode=none,all,some
- field="capital_letters" mode=no,spelling,icon,pitch
audio
emphasis
- level (none, reduced, moderate, strong or x-strong)
break
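The attribute values listed above can be checked before markup is handed to the synthesizer. The following is a rough sketch only: `is_listed_value` and both tables are hypothetical, they cover just the prosody and emphasis forms named in this section, and eSpeak itself may accept values that this check rejects.

```python
import re

# Keyword values taken from the lists in this section.
KEYWORDS = {
    ("prosody", "rate"): {"x-slow", "slow", "medium", "fast", "x-fast"},
    ("prosody", "volume"): {"silent", "x-soft", "soft", "medium", "loud", "x-loud"},
    ("prosody", "range"): {"default", "x-low", "low", "medium", "high", "x-high"},
    ("emphasis", "level"): {"none", "reduced", "moderate", "strong", "x-strong"},
}

# Numeric forms taken from the examples in this section (125%, +1dB, 75).
PATTERNS = {
    ("prosody", "rate"): re.compile(r"\d+(\.\d+)?%"),
    ("prosody", "volume"): re.compile(r"[+-]\d+(\.\d+)?dB"),
    ("prosody", "pitch"): re.compile(r"\d+(\.\d+)?"),
}


def is_listed_value(tag, attribute, value):
    """Return True if value matches one of the forms listed for tag/attribute."""
    key = (tag, attribute)
    if value in KEYWORDS.get(key, ()):
        return True
    pattern = PATTERNS.get(key)
    return bool(pattern and pattern.fullmatch(value))
```

For example, `is_listed_value("prosody", "rate", "125%")` is true, while `is_listed_value("prosody", "rate", "quick")` is false because "quick" is neither a listed keyword nor a percentage.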
HTML
eSpeak can speak HTML text directly, or text containing both SSML and HTML markup.
Any unrecognised tags are ignored.
The following tags cause a sentence break:
The following tags cause a paragraph break:
Text between the following tags is ignored:
References
SSML
- Speech Synthesis Markup Language (SSML) Version 1.0.
W3C Recommendation, 7 September 2004. W3C.
- Speech Synthesis Markup Language (SSML) Version 1.1.
W3C Recommendation, 7 September 2010. W3C.
- SSML 1.0 say-as attribute values.
W3C NOTE, 26 May 2005. W3C.
HTML
- HTML 5.2.
W3C Recommendation, 14 December 2017. W3C.
- HTML Living Standard.
Continually updated. WHATWG.