We all use apps. We know that they capture information about us. But exactly how much information? I have worked as a software engineer at Apple and at a mid-sized tech company. I have seen the good and the bad, and my experience at Apple makes me feel much more comfortable with the system that Apple and Google have proposed for COVID-19 exposure notification. Here's why.
Apple respects user privacy
When I worked on Apple Watch, one of my tasks was to record how many times the Weather and Stocks apps were launched and report it to Apple. Recording how many times each application starts is simple. But reporting that data to Apple is much more complex.
Apple emphasizes that its developers must be aware of customer security and privacy at all times. There are some basic rules; the two most relevant here are:
- Collect information only for a legitimate business purpose
- Do not collect more information than you need for that purpose.
That second rule could use a little expansion. Even when you collect general-purpose data (how often do people check the weather?), you can't accidentally collect anything that could identify the user, such as the city they're looking up. I didn't realize how strictly Apple enforces these rules until I was assigned to record user data.
Once I had recorded how many times the Weather and Stocks apps launched, I set up Apple's internal framework to report the data back to the company. My first revelation was that the framework strongly encourages you to transmit numbers, not strings (words). By not reporting strings, your code can't accidentally log the user's name or email address. You're specifically warned not to record file paths, which may include the user's name (such as /Users/David/Documents/MySpreadsheet.numbers). You're also not allowed to play tricks like encoding letters as numbers so you can send strings anyway (A = 65, B = 66, and so on).
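To make the constraint concrete, here is a minimal sketch of what a numbers-only reporting API might look like. Apple's internal framework isn't public, so every name below is hypothetical; the point is simply that an API that accepts only numeric values can't accidentally carry a name, an email address, or a file path.

```swift
import Foundation

// Hypothetical metric identifiers -- numbers, not strings.
enum AppLaunchMetric: Int {
    case weatherLaunched = 1
    case stocksLaunched = 2
}

struct MetricsReporter {
    // The API accepts only numeric values, so a call site can't accidentally
    // smuggle in a username, an email address, or a file path.
    func report(_ metric: AppLaunchMetric, count: Int) {
        // A real framework would queue this and upload it in aggregate;
        // printing just shows the shape of the data that leaves the device.
        print("metric=\(metric.rawValue) count=\(count)")
    }
}

let reporter = MetricsReporter()
reporter.report(.weatherLaunched, count: 3)   // fine: numbers only
// reporter.report("/Users/David/Documents/MySpreadsheet.numbers", count: 1)
//   ^ won't compile: there is no string overload
```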
Later, I learned that I couldn't check my code into Apple's source control system until a privacy review committee had inspected and approved it. This was not as daunting as it sounds. A few senior engineers wanted a written justification for the data I was recording and its business purpose. They also reviewed my code to make sure I wasn't accidentally recording more than intended.
Once I was approved to use Apple's data reporting framework, I was allowed to check my code into source control. If I had tried to check it in without approval, the build server would have refused to compile it.
When the next beta version of watchOS came out, I could see in our dashboard how many times the Weather and Stocks apps were launched each day, broken down by operating system version. But nothing else. Mission accomplished, privacy maintained.
TechCo largely ignores user privacy
I also wrote iPhone apps for a mid-sized tech company that will remain nameless, though you've probably heard of it; it has several thousand employees and billions of dollars in revenue. Call it TechCo, in part because its approach to user privacy is, unfortunately, all too common in the industry. It cared much less about user privacy than Apple does.
The app I worked on recorded all user interactions and reported that data to a central server. Every time the user took an action, the app captured what screen they were on and what button they tapped. No attempt was made to minimize the data captured or to anonymize it. Each record sent back included the user's IP address, username, real name, language and region, timestamp, iPhone model, and much more.
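For contrast with Apple's numbers-only approach, here is a purely illustrative reconstruction of the kind of record the app sent for every tap. The field names are invented, but the categories match what I describe above.

```swift
import Foundation

// Every tap produced a record roughly like this, with no minimization and no
// anonymization. (Field names invented; categories from the text above.)
struct TechCoAnalyticsEvent: Codable {
    let screen: String        // which screen the user was on
    let button: String        // which button they tapped
    let ipAddress: String     // personally identifying
    let username: String      // personally identifying
    let realName: String      // personally identifying
    let locale: String        // language and region
    let timestamp: Date
    let deviceModel: String   // e.g. "iPhone11,2"
}
```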
Note that this behavior wasn't malicious in any way. The company's goal was not to surveil its users. Instead, the marketing department just wanted to know which features were most popular and how they were used. Most importantly, marketers wanted to know where people fell out of the "funnel."
When you buy something online, the buying process is called a funnel. First, you look at a product, say a pair of sneakers. You add the sneakers to your shopping cart and tap the buy button. Then you enter your name, address, and credit card number, and finally you tap Buy.
At each stage of the process, people drop out. They decide they don't really want to spend $100 on new sneakers, or their kids run in to show them something, or their spouse announces that dinner is ready. Whatever the reason, they forget about the sneakers and never complete the purchase. It's called a funnel because it narrows like one, with fewer people making it through each successive stage to completion.
Companies spend a lot of time figuring out why people drop out at each stage of the funnel. Reducing the number of stages reduces the number of opportunities to drop out. For example, remembering your name and address from a previous order and filling them in automatically means you don't have to re-enter that information, reducing the chance that you'll abandon the process at that point. The ultimate reduction is Amazon's patented 1-Click ordering: tap a single button, and those sneakers are on their way to you.
TechCo's marketing department wanted more data about why people fell out of the funnel, which it would then use to adjust the funnel and sell more products. Unfortunately, no one thought about user privacy when collecting that data.
Most of the data was collected not by code we wrote ourselves, but by third-party libraries we added to our app. Google Firebase is the most popular library for collecting user data, but there are dozens of others. We had half a dozen of these libraries in our app. Although they provided roughly similar features, each collected some unique data that marketing wanted, so we had to include them all.
The data was stored in a large database that any engineer could search. That was useful for verifying that our code worked as intended: I could launch our app, tap through a few screens, and then look up my account in the database to make sure my actions had been recorded correctly. However, the database wasn't designed to compartmentalize access: anyone who had access could see everything in it. You could easily look up the actions of any of our users. I could see their real names and IP addresses, when they logged in and out, what actions they took, and what products they paid for.
Some of the more experienced engineers and I knew this was poor security, and we told TechCo management that it should be improved. Test data should be accessible to all engineers, but production user data should not be. Real names and IP addresses should be stored in a separate, secure database; the general database should contain only anonymized user IDs. And data that isn't needed for a specific business purpose shouldn't be collected at all.
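What we were asking for looks roughly like this sketch: keep the identifying fields in a locked-down store, and let the general analytics database carry only a random, pseudonymous ID. The types and names here are hypothetical, not TechCo's actual schema.

```swift
import Foundation

// Identifying fields live in a locked-down store that only a small group can query.
struct UserIdentity: Codable {
    let pseudonymousID: UUID   // random ID that is meaningless on its own
    let realName: String
    let ipAddress: String
}

// The general analytics database, searchable by any engineer, carries only the
// pseudonymous ID, so browsing events doesn't reveal who the user is.
struct AnalyticsEvent: Codable {
    let pseudonymousID: UUID
    let screen: String
    let action: String
    let timestamp: Date
}
```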
But marketing preferred the kitchen-sink approach, vacuuming up all available data. From a functional point of view, the marketers weren't being completely unreasonable, because that extra data let them go back and answer questions about usage patterns they hadn't thought of when the app was written. But just because something can be done doesn't mean it should be done. Our security complaints were ignored, and we eventually stopped complaining.
The app hadn't been released outside the US when I worked on it. It probably wouldn't be legal under the European General Data Protection Regulation (GDPR; see Geoff Duncan's article, "The European General Data Protection Regulation Makes Privacy Global," 2 May 2018), so I assume it will be modified before TechCo launches it in Europe. The app also doesn't comply with the California Consumer Privacy Act (CCPA), which is intended to let California residents know what data is collected about them and to control its use in certain ways. So it may have to change quite a bit to accommodate GDPR and CCPA soon.
Privacy is built into the COVID-19 exposure notification proposal
With those two stories in mind, consider the COVID-19 exposure notification technology proposed by Apple and Google. This proposal is not about explicit contact tracing: it does not identify you or anyone with whom you came into contact.
(My explanation below is based on published descriptions, such as Glenn Fleishman's article, "Apple and Google Partner for Privacy-Preserving COVID-19 Contact Tracing and Notification," 10 April 2020. Apple and Google have continued to modify elements of the project; read the comments on that article for important updates. Glenn has also received information from the Apple/Google partnership and has vetted this account.)
The current draft of the proposal has a very Apple-like, privacy-conscious feel. Participation in both recording and broadcasting information is opt-in, as is the choice to report a positive COVID-19 diagnosis. Your phone does not transmit any personal information about you. Instead, it broadcasts a Bluetooth beacon with a unique ID that cannot be traced back to you. The ID is derived from a random diagnosis encryption key that is generated every 24 hours and stored only on your phone. Even that ID isn't trackable: it changes every 15 minutes, so it can't be used on its own to identify your phone. Only the last 14 keys (14 days' worth) are kept.
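A simplified sketch of the idea follows. This is not the actual cryptography in the Apple/Google specification, which defines its own key-derivation scheme; it just illustrates how a rotating beacon ID can be derived from a daily key that never leaves the phone, so that observers can't link one interval's ID to the next.

```swift
import Foundation
import CryptoKit

// A random key generated every 24 hours; it stays on the phone unless you
// report a positive diagnosis.
let dailyKey = SymmetricKey(size: .bits128)

// Derive the beacon ID for a given 15-minute interval. Here it's a truncated
// HMAC of the interval number; the real specification uses a different scheme.
func rollingIdentifier(dailyKey: SymmetricKey, intervalNumber: UInt32) -> Data {
    var interval = intervalNumber.littleEndian
    let message = Data(bytes: &interval, count: MemoryLayout<UInt32>.size)
    let mac = HMAC<SHA256>.authenticationCode(for: message, using: dailyKey)
    return Data(mac).prefix(16)   // 16-byte payload to broadcast over Bluetooth
}

// The interval number changes every 15 minutes, so the broadcast ID does too.
let currentInterval = UInt32(Date().timeIntervalSince1970 / (15 * 60))
let beaconID = rollingIdentifier(dailyKey: dailyKey, intervalNumber: currentInterval)
print(beaconID.map { String(format: "%02x", $0) }.joined())
```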
Your phone records all the IDs it picks up from other nearby phones, but not the locations where it picked them up. The list of Bluetooth IDs it has seen is stored on your phone; it is not sent to a central server. (Apple and Google recently confirmed that they will not approve any app that uses this exposure notification system and also records location.)
If you ever test positive for COVID-19, you use a public health authority app that can interact with the Apple and Google framework to report your diagnosis. You will likely need to enter a code or other information to validate the diagnosis and prevent the apps from being used for false reports, which would cause unnecessary problems and undermine confidence in the system.
When the app confirms your diagnosis, it triggers your phone to upload up to the last 14 days of daily encryption keys to servers controlled by Apple and Google, although fewer may be uploaded depending on when the exposure might have occurred.
If you have the service enabled, your phone regularly downloads the daily diagnosis keys that confirmed-positive people's devices have published. Your phone then performs cryptographic operations to see if it can match the IDs derived from each key against the Bluetooth IDs it captured during the period covered by that key. If there's a match, you were nearby, and you will receive a notification. (Proximity is a tricky question, given Bluetooth's range and how devices estimate how close to each other they were.) Even without an app installed, you will receive a message from the smartphone's operating system; with an app, you will receive more detailed instructions.
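The matching step can be sketched the same way, under the same simplified derivation as above. Every function name here is illustrative, not the actual framework API; the real system does this work inside the operating system.

```swift
import Foundation
import CryptoKit

// Same simplified derivation as in the earlier sketch.
func rollingIdentifier(dailyKey: SymmetricKey, intervalNumber: UInt32) -> Data {
    var interval = intervalNumber.littleEndian
    let message = Data(bytes: &interval, count: MemoryLayout<UInt32>.size)
    return Data(HMAC<SHA256>.authenticationCode(for: message, using: dailyKey)).prefix(16)
}

// Stand-ins for the local beacon store and the key download; in a real system
// these would come from Bluetooth scanning and the diagnosis-key server.
func loadLocallyStoredBeaconIDs() -> Set<Data> { [] }
func downloadPublishedDiagnosisKeys() -> [SymmetricKey] { [] }

let observedIDs = loadLocallyStoredBeaconIDs()
let diagnosisKeys = downloadPublishedDiagnosisKeys()

// For each published key, re-derive the 96 fifteen-minute IDs it would have
// broadcast that day and check whether any of them were overheard locally.
// (Assume the day's starting interval is published alongside each key.)
let dayStartInterval = UInt32(Date().timeIntervalSince1970 / (15 * 60)) / 96 * 96
var exposed = false
for key in diagnosisKeys {
    for offset in 0..<96 {
        let id = rollingIdentifier(dailyKey: key, intervalNumber: dayStartInterval + UInt32(offset))
        if observedIDs.contains(id) { exposed = true }
    }
}
print(exposed ? "Possible exposure: notify the user" : "No matches found")
```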
At no time does the server know anyone's name or location, just a set of randomly generated encryption keys. It doesn't even receive the exact Bluetooth beacons, which could otherwise allow someone to identify you from captures made in public spaces. In fact, your phone never sends data to the server unless you tell the app that you tested positive for COVID-19. Even if a hacker or an overly enthusiastic government agency took over the server, it couldn't identify users. And because your phone discards all keys more than 14 days old, even a compromised phone would reveal little information over the long run.
In reality, there would be more than one server, and the process is more complicated, but this overview shows how Apple and Google are building in privacy from the start to avoid the kinds of mistakes TechCo made.
Apple claims to respect user privacy, and my experience suggests that the claim is genuine. I am much more willing to trust a system developed by Apple than one created by almost any other company or government. It's not that another company or government would necessarily set out to abuse users' privacy; it's just that, outside of Apple, many organizations lack an understanding of what it means to build in privacy from the start, or have competing interests that undermine their efforts to do the right thing.