Amazon's Echo devices will no longer offer the option to process Alexa voice commands locally.
The Echo devices, which give users access to the firm's Alexa personal assistant via voice commands, will stop offering local processing capabilities on March 28, at which point all voice recordings will be sent to the cloud to be processed, as reported by The Verge.
Local processing was only available on three Echo devices, the Echo Dot (4th Gen), Echo Show 10, and Echo Show 15, and only for customers in the US.
Customers were sent an email last week explaining the change, which noted: "As we continue to expand Alexa's capabilities with generative AI features that rely on the processing power of Amazon's secure cloud, we have decided to no longer support this feature."
Local processing was an opt-in feature of the Echo devices for users with privacy concerns, and for simple smart home commands, such as turning lights on and off or adjusting the thermostat, that did not require large amounts of processing power.
The move comes as Amazon is launching Alexa+, a subscription-based AI assistant.
In a blog post, Alexa+ is described by Amazon as being "more conversational, smarter, [and] personalized."
Built on LLMs available through Amazon Bedrock, the new Alexa will be able to understand "half-formed thoughts" and "colloquial expressions."
Amazon also claims that Alexa+ can be proactive, offering the example of "suggesting you start your commute early when there’s heavy traffic, or telling you a gift you wanted to buy is on sale."
In another post, Amazon details the technology upgrades that went into the new Alexa.
There has been much speculation about where AI inferencing workloads will end up, with latency frequently cited as a driver to keep them closer to the Edge. This move suggests that Amazon is abandoning that notion for its Alexa offering, at least at the 'on-device' Edge.
The company notes that it has had to find ways to mitigate latency. "Customers expect Alexa to be fast, yet there’s an inherent tension when balancing accuracy and speed," the company wrote.
"To manage that tradeoff, we built a sophisticated routing system using state-of-the-art models from Amazon Bedrock—including Amazon Nova and Anthropic Claude—instantly matching each customer request with the best model for the task at hand, balancing all the requirements of a crisp, conversational experience."
DCD has contacted Amazon to learn more about how Amazon Echo Alexa requests will be processed, the hardware that will be used, and in which data centers processing will occur.
In a statement, Amazon said: "The Alexa experience is designed to protect our customers’ privacy and keep their data secure, and that’s not changing. We’re focusing on the privacy tools and controls that our customers use most and work well with generative AI experiences that rely on the processing power of Amazon’s secure cloud. Customers can continue to choose from a robust set of tools and controls, including the option to not save their voice recordings at all. We’ll continue learning from customer feedback, and building privacy features on their behalf."
Amazon has a massive fleet of AI infrastructure at its disposal through its Amazon Web Services (AWS) division, including chips from the likes of Nvidia and AMD, as well as its own home-grown Trainium and Inferentia chips. In December 2024, AWS announced Project Rainier, a planned cluster of Trainium2 UltraServers for Anthropic, containing hundreds of thousands of Trainium chips interconnected with third-generation, low-latency, petabit-scale EFA networking.