During a recently concluded 12-month study of the Alexa Skills Store review process, academics said they managed to smuggle 234 Alexa-breaking policy skills (apps) into the official Alexa store.
The results of the study are actually worse than it seems because the academics tried to upload 234 applications that break the policies and managed to get them all approved, without serious difficulties.
“Surprisingly, we successfully certified 193 skills in their first presentation,” the research team wrote this week on a website detailing their findings.
The research team said that 41 Alexa skills were rejected during the first presentation, but were finally obtained at the official store after a second attempt.
“Violations of privacy policy were the specified problem for 32 rejections, while 9 rejections were due to UI problems.” The investigators said.
The purpose of this quirky research project was to test Amazon’s skills review process for the Alexa Skills Store, the web portal where users go to install apps for their Alexa device.
In recent years, previous academic work [1, 2, 3, 4] revealed that the research teams had no difficulty uploading malicious Alexa skills to the official store, which they used to test their experiments.
With each project, researchers warned Amazon that the skill review process was insufficient, Amazon promised to do better, and then new research would appear months later, showing that researchers were still able to upload malicious skills regardless of Amazon’s promises.
Placing policy-breaking skills in the children’s category
During this experiment, the research team assembled a set of 234 Alexa skills that violated Amazon’s basic policies.
These were apps that were not overtly malicious, but simply provided prohibited information to user questions, or collected private information by asking Alexa users about their names and other personal data.
The research team uploaded the apps in the Alexa Skills Store and approved and certified them for the kids section of the Alexa store, where the policies should be applied more strictly than other sections.
Examples of Alexa skills the research team obtained in the kids section include:
An Alexa skill that provided instructions on how to build a firearm silencer (hidden inside a children’s craft skill)
An Alexa skill that recommends the use of a recreational drug (hidden inside a children’s desert fact skill)
An Alexa skill that pushes advertising (hidden inside a geography fact skill)
An Alexa skill that collects children’s names. (hidden inside a storytelling skill)
An Alexa skill that collects health data (hidden within a health care skill)
The academic team cited several reasons why they were able to post all of their skills that violate policies in the official store:
- Inconsistency in checking – The researchers said that different skills that violate the same policy received different comments from reviewers, suggesting that reviewers were not viewing or applying Amazon policies in the same way in presentations.
- Limited voice check – Reviewers did limited verification of the skill’s voice commands and code. This allows threat actors to post malicious apps to the official store simply by delaying initial malicious responses, long enough to avoid the brief review process.
- Overtrust placed on developers – The researchers said Amazon appears to have a native trust in the skill developers and will approve the skills based on the responses the developers provide on the forms submitted during the skill review process. This allowed researchers to claim that their app did not collect user information, something Amazon never verified during the actual review.
- Humans are involved in certification – The research team said that due to inconsistency in various skills certifications and rejections, it has led them to believe that skills certification relies heavily on manual testing, as some issues may have been detected by some systems automated.
- Negligence during certification – The review process was not comprehensive enough to detect obvious skills in breaking policy.
- Possibly outsourced and not performed in the U.S. – Based on the skill review timestamps, some reviews appear to have been done by non-native English speakers or by reviewers who are unfamiliar with US law.
Review of children’s current skills.
After conducting their research, the academic team removed its malicious abilities, to prevent a user from accidentally bumping into it and installing it on their devices.
However, the research team also wanted to know if other bad skills made it to the Alexa Official Skill Store in the past. They did this by selecting 2,085 negative comments from the skills listed in the Kids category and identifying the 825 Alexa skills in which they were posted.
“Through dynamic testing of 825 skills, we identified 52 problem skills with policy violations and 51 broken skills in the children category,” the researchers said.
This included Alexa skills that were suspected of collecting user information, skills that included ads, or skills that promised various rewards for positive reviews on the Alexa store.
Amazon disagrees with study, but promises to do better
In an email today, Amazon disagreed with the report’s findings, citing additional processes that are involved in reviewing child-directed skills that the research team did not consider.
This included additional audits for child-centered skills that take place after the skills are listed and certified in the official store and a skills monitoring system that scans the responses of the skills for inappropriate content.
Since “bad” applications were removed immediately after certification, these additional systems did not work.
“Customer trust is our top priority and we take violations of our Alexa Skill policies seriously,” an Amazon spokesperson told ZDNet.
“We conduct security and policy reviews as part of skills certification and we have systems in place to continually monitor live skills for potentially malicious behavior or policy violations. Any offensive skills we identify are blocked during certification or deactivates quickly.
“We are constantly improving these mechanisms and have implemented additional certification verifications to further protect our clients. We appreciate the work of independent investigators helping to bring potential issues to our attention.”
Whether these new certification checks will make a difference remains to be seen, most likely during a future round of investigation.
Additional details are available in a document titled “Dangerous Skills Earned Certification: Measuring the Reliability of the Amazon Alexa Platform” [PDF] which was presented this week at the FTC’s PrivacyCon 2020 virtual conference.
The research team also conducted similar tests at the Google Assistant store, but said Google handled it much better.
“While Google does a better job in the certification process based on our preliminary measurement, it is still not perfect and has potentially exploitable flaws that need to be tested further in the future,” the researchers said.
“In total, we filed 273 actions that violate the policies required by Amazon / Google, and looked to see if they can pass certification. As a result, 116 of them were approved. We filed 85 actions for children and got 15 approved; other categories, 101 actions approved. among 188 shares.
Here is an example of Wizard actions (applications) that were approved during the tests, collecting the names of the children: