Remove requirement for Invocation name when an Audio Streaming skill is playing
Problem:
When an audio streaming skill is actively streaming/playing, users of the skill have to re-open the skill in order to issue additional commands. This creates a very awkward user experience, for example:
"Alexa, ask AnyPod to play This American Life"
<This American Life starts playing>
"Alexa, ask AnyPod to play the oldest episode"
<oldest episode starts playing>
"Alexa, ask AnyPod to fast forward 3 minutes"
<episode fast forwards>
"Alexa, ask AnyPod to <etc>"
--
Solution 1:
Ideally this interaction would go like this:
"Alexa, ask AnyPod to play This American Life"
<This American Life starts playing>
"AnyPod, play the oldest episode"
Having "AnyPod" (the skill's invocation name) substitute in as an alternate wake word would be great. But of course you'll never do that in a million years, until a few years from now, when you actually do. So how about this as a compromise in the meantime:
Solution 2:
"Alexa, ask AnyPod to play This American Life"
<This American Life starts playing>
"Alexa, play the oldest episode"
This will require some clever engineering to determine whether the user said something that maps to a valid intent within the active audio streaming skill's interaction model and, if not, to handle the request the same way it is handled today. That seems reasonably doable.
It does create a potential problem: an audio streaming skill may have an intent that conflicts with the invocation name of another skill, i.e., the user may wish to open another skill but find that the audio streaming skill "hijacked" their request because the other skill's invocation name triggered an intent in the audio streaming skill. But that problem is solvable in a few different ways. One is to put greater restrictions on the intents that can be accessed in this scenario, for example:
a "catch all" intent with nothing but a custom slot would not be allowed
any intent with words matching the invocation name of an existing skill would not be allowed (it would be allowed during normal skill usage, but not in this more restricted context)
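
The routing and the two restrictions above could be sketched roughly as follows. This is only an illustration of the idea, not how Alexa actually resolves requests: the `Intent` and `Skill` classes, the `catch_all` flag, and the `route` function are all hypothetical names, and exact matching against sample utterances stands in for real language understanding.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Intent:
    name: str
    samples: list[str] = field(default_factory=list)  # sample utterances
    catch_all: bool = False  # hypothetical flag: intent is only a free-form custom slot


@dataclass
class Skill:
    invocation_name: str
    intents: list[Intent] = field(default_factory=list)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so phrases compare cleanly."""
    return " ".join(text.lower().split())


def eligible_intents(active: Skill, all_skills: list[Skill]) -> list[Intent]:
    """Intents of the active skill that may be reached without its invocation name."""
    other_names = [normalize(s.invocation_name) for s in all_skills if s is not active]
    eligible = []
    for intent in active.intents:
        if intent.catch_all:
            continue  # restriction 1: no "catch all" intents
        padded = [f" {normalize(sample)} " for sample in intent.samples]
        if any(f" {name} " in sample for name in other_names for sample in padded):
            continue  # restriction 2: no words matching another skill's invocation name
        eligible.append(intent)
    return eligible


def route(utterance: str, active: Skill | None, all_skills: list[Skill]) -> tuple[str, str] | None:
    """Return (invocation name, intent name), or None to handle the request as today."""
    if active is None:
        return None
    spoken = normalize(utterance)
    for intent in eligible_intents(active, all_skills):
        if spoken in (normalize(sample) for sample in intent.samples):
            return (active.invocation_name, intent.name)
    return None


# Hypothetical skills used only for illustration.
anypod = Skill("anypod", [
    Intent("PlayOldestIntent", ["play the oldest episode"]),
    Intent("SearchIntent", catch_all=True),
    Intent("QuizIntent", ["open daily trivia"]),
])
daily_trivia = Skill("daily trivia")
skills = [anypod, daily_trivia]

print(route("play the oldest episode", anypod, skills))
print(route("open daily trivia", anypod, skills))
```

Returning `None` means "fall through to the existing handling", so a request that names another skill is never captured by the streaming skill.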

-
Jonathan H commented
This is still a problem for many developers, and it makes for a poor experience for users. I hope they implement this soon.
-
Jonathan H commented
Don't know why my previous comment was deleted, but this is a really important suggestion and implementation is long overdue. Having to do the full invocation each time makes for a really bad user experience.
-
Akshita commented
While the audio player is running, the user can only move to the next or previous track. The session closes, so any other command for the skill has to be made after reopening the session by saying the invocation name, which creates an awkward user experience.
Below are the use cases in the skill that I want to develop:
1) Let the user purchase a product between ongoing songs.
2) Let the user navigate to the main menu while the audio player is playing.
3) Let the user choose from a playlist while audio is playing in the audio player.
All three of the above are restricted as of now because, once audio starts playing, the session of the skill just closes.
This is a concern for not just me but many developers. There needs to be a better way and a better solution for this.
-
James Cimino commented
I would like to extend this capability to video as well.
-
First Asset HR commented
I am surprised the issue of losing a session after the audio playback is not a top priority.
It is absolutely crucial to keep the conversation with a client going after the audio file!
-
Karthikraj commented
I think, while the Audio Player is streaming, Alexa knows exactly which skill is using the Audio Player.
So, exposing other intents from the skill would be simple. For example, just allow users to say "Alexa, provide menu options", "Alexa, play latest episodes"...
-
Mike Paine commented
Yes, please implement this. It's very awkward to use skills in this fashion. TY
-
Ian D Banks commented
Currently if a user streams audio via the skill there is no friendly way to get back to the skill without the user saying an invocation command.
I'd propose a way to allow a skill to open a new session, allowing speechOutput after the PlaybackNearlyFinished or PlaybackFinished events occur in the AudioPlayer interface.
There are a couple use-cases for this:
1) A podcast skill where it would be beneficial to ask the user if they'd like to listen to the next episode.
2) Give some information about the next episode before it begins to play.
3) Provide a purchase opportunity after an episode completes but before the next begins.
4) Allow the user to interact with some other aspect of the skill other than audio (e.g. "Tell me about this podcast" or "I want to donate to this organization").
-
AJ commented
I commented here before I utilized the built-in intents that come from activating the "Audio Player Interface."
For a skill that needs to be interrupted during audio playback: if the new command is related to starting over, skipping, etc., the Audio Player interface exempts it from needing the invocation name.
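
As a rough illustration of what handling those built-in intents can look like, here is a minimal sketch that builds a response for a few of them. The intent names (`AMAZON.PauseIntent`, `AMAZON.ResumeIntent`, `AMAZON.StartOverIntent`) and the `AudioPlayer.Play` / `AudioPlayer.Stop` directives come from the AudioPlayer interface; the `playback_response` function, the `stream` dict, and the URL and token in it are made up for the example, and the response is simplified.

```python
def playback_response(intent_name: str, stream: dict) -> dict:
    """Build a simplified skill response for a built-in playback intent.

    `stream` is a hypothetical record of what is playing, with the keys
    "url", "token" and "offsetInMilliseconds".
    """
    if intent_name == "AMAZON.PauseIntent":
        directives = [{"type": "AudioPlayer.Stop"}]
    elif intent_name in ("AMAZON.ResumeIntent", "AMAZON.StartOverIntent"):
        # Start over from the beginning, otherwise resume where playback stopped.
        if intent_name == "AMAZON.StartOverIntent":
            offset = 0
        else:
            offset = stream["offsetInMilliseconds"]
        directives = [{
            "type": "AudioPlayer.Play",
            "playBehavior": "REPLACE_ALL",
            "audioItem": {
                "stream": {
                    "url": stream["url"],
                    "token": stream["token"],
                    "offsetInMilliseconds": offset,
                }
            },
        }]
    else:
        directives = []  # intents this sketch does not handle
    return {
        "version": "1.0",
        "response": {"directives": directives, "shouldEndSession": True},
    }


# Made-up stream state for illustration only.
stream = {
    "url": "https://example.com/episode.mp3",
    "token": "episode-1",
    "offsetInMilliseconds": 90000,
}
print(playback_response("AMAZON.StartOverIntent", stream))
```

Custom intents such as "play the oldest episode" are not covered by this exemption, which is the gap the original suggestion is about.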
-
Stevie & Lauren Eeles commented
Certain things like this bug me, but I find it is largely down to the maker of the skill.
For example,
"Alexa, ask unofficial ron swanson for some advice", where "unofficial ron swanson" is the trigger for it. Why? The "unofficial" is unnecessary.
Likewise, the hilarious "ask for a ****" has the trigger "ask for a ****". Just "****" would suffice.
I've not come across a case where Alexa being the trigger word gets in the way. For instance, with Audible, you don't really need to mention Audible at all. You just have to ask for what you want: "Read my book [name]." "Go to chapter [whatever]."
-
AJ commented
Thank you for sharing this!