JavaScript Text to Speech using SpeechSynthesis Interface

    In a world dominated by visual content, the power of auditory communication should not be underestimated. JavaScript, the dynamic programming language that fuels the interactive web, offers a fascinating feature that allows us to give a voice to our websites and applications: the SpeechSynthesis interface.

    This technology enables developers to convert written text into spoken words, bringing a whole new dimension to user experiences.

    To add a text-to-speech feature to a webpage with JavaScript, we use the Web Speech API. It has two parts: speech synthesis, which converts text to speech, and speech recognition, which converts speech to text. We will cover speech-to-text in our next post; in this one, we will learn how to convert text to audio in JavaScript.

    We will be using the following interfaces/properties in this tutorial:

    1. SpeechSynthesis

    2. SpeechSynthesisUtterance

    3. window.speechSynthesis

    Let us look at each of these, one by one.

    JavaScript SpeechSynthesis Interface

    This is the main controller interface for the speech synthesis service, which turns the text provided into speech. It is used to start, stop, pause, and resume speech, and to get the voices supported by the device.

    The following are the methods available in this Interface:

    speak(): Adds the utterance (a SpeechSynthesisUtterance object) to the queue. It is spoken once every utterance queued before it has finished. This is the method we will be using.

    pause(): To pause the current ongoing speech.

    resume(): To resume the paused speech.

    cancel(): To remove all utterances from the queue. If an utterance is being spoken at the time, it stops immediately.

    getVoices(): To get a list of all the voices available on the device.
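
    The methods above can be combined into a small set of playback controls. The following is a minimal sketch rather than a definitive implementation: createPlaybackControls is a hypothetical helper name, and the synth argument is assumed to be window.speechSynthesis (or any object with the same methods and the speaking and paused properties).

```javascript
// Sketch of play/pause/stop controls built on the SpeechSynthesis methods.
// `synth` is assumed to be window.speechSynthesis or a compatible object.
function createPlaybackControls(synth) {
  return {
    // Queue an utterance; it starts once earlier utterances have finished.
    play(utterance) {
      synth.speak(utterance);
    },
    // Resume if paused, pause if speaking; returns the action that was taken.
    toggle() {
      if (synth.paused) {
        synth.resume();
        return "resumed";
      }
      if (synth.speaking) {
        synth.pause();
        return "paused";
      }
      return "idle";
    },
    // Empty the queue and stop the utterance currently being spoken.
    stop() {
      synth.cancel();
    },
  };
}
```

    In a browser, you could call createPlaybackControls(window.speechSynthesis) once and wire play, toggle, and stop to buttons on the page.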

    JavaScript SpeechSynthesisUtterance Interface

    This interface represents a single speech request, or utterance. It holds the text to be spoken along with settings such as the language, volume, pitch of the voice, and rate of speech. Once we have created an object of this interface, we pass it to the SpeechSynthesis object's speak() method to play the speech.

    The following properties are provided by this interface to configure it (our code example sets most of them):

    lang: To get and set the language of speech.

    pitch: To get and set the pitch of the voice at which the utterance will be spoken.

    rate: To get and set the speed at which the utterance will be spoken.

    volume: To get and set the volume.

    text: To get and set the text which has to be spoken.

    voice: To get or set the voice to be used.
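
    As an illustration of how these properties fit together, here is a minimal sketch that fills in an utterance from an options object. The helper names (clamp, pickVoice, configureUtterance) and the default values are assumptions made for this example. The ranges used for clamping are the ones defined for the API: volume from 0 to 1, rate from 0.1 to 10, and pitch from 0 to 2.

```javascript
// Keep a numeric setting inside [min, max]; use `fallback` when it is missing or invalid.
function clamp(value, min, max, fallback) {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Prefer a voice whose language matches exactly, then one with the same base language.
function pickVoice(voices, lang) {
  const wanted = lang.toLowerCase();
  const base = wanted.split(/[-_]/)[0];
  return (
    voices.find((v) => v.lang.toLowerCase() === wanted) ||
    voices.find((v) => v.lang.toLowerCase().split(/[-_]/)[0] === base) ||
    null
  );
}

// Copy the options onto the utterance, clamping numeric values to their valid ranges.
function configureUtterance(utterance, options = {}, voices = []) {
  utterance.text = String(options.text ?? "");
  utterance.lang = options.lang ?? "en-US";
  utterance.volume = clamp(options.volume, 0, 1, 1);
  utterance.rate = clamp(options.rate, 0.1, 10, 1);
  utterance.pitch = clamp(options.pitch, 0, 2, 1);
  const voice = pickVoice(voices, utterance.lang);
  if (voice) utterance.voice = voice; // otherwise the browser picks a default voice
  return utterance;
}
```

    In a browser, the utterance would be a new SpeechSynthesisUtterance() and the voices would come from window.speechSynthesis.getVoices(). In some browsers that list is empty until the voiceschanged event has fired, so it is worth reading the voices again when that event arrives.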

    JavaScript window.speechSynthesis Property

    This property of the JavaScript window object returns a reference to the speech synthesis controller interface, on which we call the speak() method.

    Now, let's jump into the code, to see all this in action.

    NOTE: The Web Speech API is not supported in every browser, but the latest versions of Chrome, Firefox, Safari, and Opera should support speech synthesis.
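
    Because support varies, it is sensible to check for the feature before using it. The following is a small sketch of such a check. getSpeechSynthesis is a hypothetical helper name, and the scope parameter exists only so the function can be tried outside a browser; in a page it defaults to the global object.

```javascript
// Return the speech synthesis controller if the environment provides one, otherwise null.
function getSpeechSynthesis(scope = globalThis) {
  const supported =
    typeof scope === "object" &&
    scope !== null &&
    "speechSynthesis" in scope &&
    "SpeechSynthesisUtterance" in scope;
  return supported ? scope.speechSynthesis : null;
}
```

    If getSpeechSynthesis() returns null, the page can hide or disable its read-aloud controls instead of throwing an error.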

    JavaScript Text to Speech

    Here is the JS code sample, which shows how to use the SpeechSynthesisUtterance and SpeechSynthesis interfaces to initialize and configure the utterance that will be spoken aloud in the browser.

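    // Example text to speak; replace it with your own message.
    const msg = "Hello, world!";

    let speech = new SpeechSynthesisUtterance();
    speech.lang = "en-US";  // language of the utterance
    speech.text = msg;      // text to be spoken
    speech.volume = 1;      // 0 (silent) to 1 (loudest)
    speech.rate = 1;        // 0.1 (slowest) to 10 (fastest)
    speech.pitch = 1;       // 0 (lowest) to 2 (highest)
    // Queue the utterance so that the browser speaks it.
    window.speechSynthesis.speak(speech);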

    The code above sets the main properties of the utterance. I would recommend playing around with the properties and methods to see how they change the speech.
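
    Speech is played asynchronously, so it is often useful to know when an utterance has finished or failed. The following sketch wraps speak() in a Promise using the utterance's end and error events. speakText and its options object are hypothetical names introduced for this example; the synth and Utterance options default to the browser globals and can be replaced with stand-ins for testing.

```javascript
// Speak `text` and return a Promise that settles when the utterance ends or fails.
function speakText(
  text,
  {
    synth = globalThis.speechSynthesis,
    Utterance = globalThis.SpeechSynthesisUtterance,
    lang = "en-US",
  } = {}
) {
  return new Promise((resolve, reject) => {
    if (!synth || !Utterance) {
      reject(new Error("Speech synthesis is not available in this environment"));
      return;
    }
    const utterance = new Utterance(text);
    utterance.lang = lang;
    // `end` fires when the utterance has been fully spoken.
    utterance.onend = () => resolve();
    // `error` fires if the utterance could not be spoken; event.error holds the reason.
    utterance.onerror = (event) =>
      reject(new Error(`Speech synthesis failed: ${event.error}`));
    synth.speak(utterance);
  });
}
```

    For example, speakText("Welcome back") could be followed by .then() to re-enable a play button once the speech has finished.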


    The SpeechSynthesis interface has revolutionized the way we interact with websites and applications, breaking the barriers of traditional visual communication. JavaScript Text to Speech opens up endless possibilities, from enhancing accessibility and inclusivity to creating engaging interactive narratives.

    By harnessing the power of auditory communication, developers can create rich, immersive experiences that captivate users on a whole new level. As we continue to push the boundaries of web development, the SpeechSynthesis interface will undoubtedly play a significant role in shaping the future of user experiences. So go ahead, explore the vast potential of JavaScript Text to Speech, and let your websites and applications speak volumes!

    Frequently Asked Questions (FAQs)

    1. What is the SpeechSynthesis interface?

    The SpeechSynthesis interface is an API provided by modern web browsers that enables JavaScript developers to convert written text into synthesized speech. It allows websites and applications to speak to users by controlling the speech synthesis process.

    2. How does JavaScript Text to Speech work?

    JavaScript Text to Speech works by utilizing the SpeechSynthesis interface, which consists of a set of objects and methods for controlling speech synthesis. Developers can create speech synthesis instances, specify the text to be spoken, set desired voice characteristics, and control the playback.

    3. Which web browsers support the SpeechSynthesis interface?

    The SpeechSynthesis interface is supported by most modern web browsers, including Chrome, Firefox, Safari, and Edge. However, it's essential to check for specific browser versions to ensure compatibility.

    4. Can I customize the speech output using JavaScript Text to Speech?

    Yes, JavaScript Text to Speech allows for customization of the speech output. Developers can choose from the voices available on the device and set the rate, pitch, and volume of each utterance. Finer control, such as pauses and emphasis on specific words or phrases, relies on SSML markup in the text, which browsers do not support consistently.

    5. What are some practical applications of JavaScript Text to Speech?

    JavaScript Text to Speech finds applications in various domains. It can enhance accessibility for visually impaired users, provide audio feedback in interactive applications, create voice-guided navigation systems, and even be used to develop interactive storytelling experiences that engage users through audio narration.
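
    On the customization question above, one way to vary the delivery within a passage is to queue each phrase as its own utterance with its own rate and pitch. This is a rough sketch under stated assumptions, not a definitive approach: splitIntoSentences and queuePhrases are hypothetical helper names, and synth and Utterance are assumed to be window.speechSynthesis and SpeechSynthesisUtterance.

```javascript
// Split text into sentences at ., ! or ? and drop empty pieces.
function splitIntoSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((part) => part.trim()).filter((part) => part.length > 0);
}

// Queue one utterance per phrase. A phrase is either a string or an object
// of the form { text, rate, pitch } for phrases that need their own settings.
function queuePhrases(phrases, synth, Utterance) {
  return phrases.map((phrase) => {
    const spec = typeof phrase === "string" ? { text: phrase } : phrase;
    const utterance = new Utterance(spec.text);
    if (spec.rate !== undefined) utterance.rate = spec.rate;
    if (spec.pitch !== undefined) utterance.pitch = spec.pitch;
    synth.speak(utterance); // utterances are spoken in the order they are queued
    return utterance;
  });
}
```

    For instance, the sentences returned by splitIntoSentences can be passed to queuePhrases as they are, with one of them replaced by an object that lowers the rate for a phrase that should be read more slowly.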

    I like writing content about C/C++, DBMS, Java, Docker, general How-tos, Linux, PHP, Java, Go lang, Cloud, and Web development. I have 10 years of diverse experience in software development. Founder @ Studytonight