More information about our processes to safeguard speech data

July 11, 2019

We’re focused on building products that work for everyone, and as part of this, we invest significant resources to ensure that our speech technology works for a wide variety of languages, accents and dialects. This enables products like the Google Assistant to understand your request, whether you’re speaking English or Hindi. 

As part of our work to develop speech technology for more languages, we partner with language experts around the world who understand the nuances and accents of a specific language. These language experts review and transcribe a small set of queries to help us better understand those languages. This is a critical part of the process of building speech technology, and is necessary for creating products like the Google Assistant.

We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue and are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.

We apply a wide range of safeguards to protect user privacy throughout the entire review process. Language experts review only around 0.2 percent of all audio snippets. Audio snippets are not associated with user accounts as part of the review process. Reviewers are directed to transcribe only snippets that are directed to Google, and not to transcribe background conversations or other noises.

The Google Assistant only sends audio to Google after your device detects that you’re interacting with the Assistant, for example when you say “Hey Google” or physically trigger the Google Assistant. A clear indicator (such as the flashing dots on top of a Google Home or an on-screen indicator on your Android device) will activate any time the device is communicating with Google in order to fulfill your request. On rare occasions, devices that have the Google Assistant built in may experience what we call a “false accept.” This means that our software interpreted some noise or words in the background as the hotword (like “Ok Google”). We have a number of protections in place to prevent false accepts from occurring in your home.
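To make the gating behavior described above more concrete, here is a small illustrative sketch in Python. It is a toy model only and not the Assistant's actual implementation: the `Device` class, the `hotword_score` input, and the `HOTWORD_THRESHOLD` value are all hypothetical names and numbers chosen for illustration. It shows audio leaving the device only after a trigger, the indicator turning on whenever that happens, and how background sound that scores above the threshold would produce a false accept.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: scores at or above it count as a hotword detection.
HOTWORD_THRESHOLD = 0.8


@dataclass
class Device:
    """Toy model of a device that sends audio only after a trigger."""
    indicator_on: bool = False
    sent_audio: list = field(default_factory=list)

    def handle(self, audio: str, hotword_score: float,
               button_pressed: bool = False) -> bool:
        """Send audio only if the device believes it is being addressed."""
        triggered = button_pressed or hotword_score >= HOTWORD_THRESHOLD
        # The indicator is on exactly when this audio is being sent.
        self.indicator_on = triggered
        if triggered:
            self.sent_audio.append(audio)
        return triggered


device = Device()
device.handle("hey google, what's the weather", hotword_score=0.95)  # real hotword
device.handle("dinner table chatter", hotword_score=0.30)            # never sent
device.handle("okay, cool", hotword_score=0.85)                      # false accept
print(device.sent_audio)
```

In this sketch the third call is the false accept: nobody addressed the device, but the background words scored above the assumed threshold, so the audio was sent and the indicator turned on.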

Building products for everyone is a core part of our DNA at Google. We hold ourselves to high standards of privacy and security in product development, and we hold our partners to the same standards. We also provide you with tools to manage and control the data stored in your account. You can turn off storing audio data to your Google account completely, or choose to have data auto-deleted after 3 months or 18 months. We’re always working to improve how we explain our settings and privacy practices to people, and will be reviewing opportunities to further clarify how data is used to improve speech technology. Visit your account to review or change your settings, and to view (and, if you choose, delete) all the activity that’s stored with your account.
