DaDaDa 2016 - Legally fine but creepy
Ethics in data analytics beyond legal compliance: through case studies (Target pregnancy prediction, stalker apps, network monitoring), the talk argues that transparency and a willingness to say "no" to customers matter more than hiding behind consent and terms & conditions.
Abstract
This presentation examines ethical responsibility in data analytics beyond legal compliance, challenging data professionals to develop self-imposed ethical standards rather than hiding behind terms and conditions. Using German data protection laws (BDSG and TMG) and the upcoming 2018 GDPR as a legal foundation, the speaker argues that meeting minimum legal requirements through user consent does not protect organizations from public backlash or brand damage when data practices feel invasive or exploitative. Through five interactive case studies—Target's pregnancy prediction algorithm, the "Girls Around Me" stalker app, casual neighbor Googling, network performance data collection, and the PokéFit Pokémon GO tracker—the presentation engages audiences in real-time ethical voting to reveal the subjective nature of "creepiness" judgments. Key themes include the distinction between personal, pseudonymous, and anonymous data (with GDPR eliminating pseudonymous data's privileged consent-free status), the problem of context collapse when public data gets aggregated for unintended purposes, and the importance of transparency in showing users what inferences are being made about them. The speaker emphasizes that data scientists must cultivate individual judgment and courage to refuse ethically questionable projects even when technically legal, comparing data collection boundaries to social norms—just as houseguests shouldn't raid your refrigerator despite being invited in, companies shouldn't exploit every data point users technically consented to sharing. Historical context from 1890 privacy laws addressing instantaneous photography to Norse tent positioning reveals recurring patterns where technological advances create new privacy tensions requiring evolved social frameworks. The presentation concludes that common sense, transparency, user empowerment, and willingness to say "no" to customers matter more than parsing legal technicalities, particularly for corporate clients concerned about brand reputation in an era where viral backlash can destroy business value regardless of legal defensibility.
About the speaker
Ralf Klüber is a data analytics professional at P3, a consulting and engineering company specializing in automotive, aviation, and telecommunications sectors. P3 is a 15-year-old organization with 300 million euros in revenue, structured as a group of 21 entrepreneurial companies operating as an "entrepreneurial sandbox" for innovation. The company is known for its practical engineering approach—staying with clients until prototypes are running rather than just delivering presentations—and for conducting mobile network performance testing, including work for Connect magazine in Germany. Klüber leads data analytics initiatives at P3, including development of network performance measurement applications and experimental projects like PokéFit, and has direct experience refusing customer requests on ethical grounds despite their technical and legal feasibility.
Transcript summary
A speaker from P3, a consulting and engineering company, presents a non-technical talk about responsibility and ethics in data analytics. Through historical context, legal frameworks (German data protection laws and upcoming GDPR), and interactive case studies, the presentation challenges the audience to think beyond legal compliance toward self-imposed ethical standards. The central argument: you cannot hide behind the law when public backlash strikes—transparency, common sense, and willingness to say "no" to customers matter more than technical legal compliance.
Opening Disclaimer and Context
The speaker opens with a disclaimer: "I have no single formula in my presentation and I have no single line of code, so if you want to leave, leave now." This is a "light talk" where the audience can relax and look at pictures—it's more food for thought than technical content. The speaker is the only one standing between the audience and lunch, creating a conversational, informal tone.
Quick Company Introduction
P3 is a consulting company—mainly engineers—15 years old, 300 million in revenue. It's actually a group of 21 companies. "If someone comes up with a great idea, we build another company, so it's kind of an entrepreneurial sandbox. If you have something to provide as value, just raise your voice and you get stuff and money if needed." They're active in three major markets: Automotive, Aviation, and Telecommunications (roughly 90% of revenue), with the rest split among smaller areas like energy and storage.
What P3 Does
They're also active in digitalization, Industry 4.0, and advanced analytics—"the buzzwords which everyone kind of throws out here." Typical work includes management support, network measurement, and testing. "Who has heard about P3 before? Just raise your hand, just check if it's still alive." The speaker mentions the Connect magazine in Germany—a special interest paper—which tests mobile networks (who has the best network, who is most colorful, most sharp), and "this one is performed by P3 as an example." Typically their customers know them but not their competitors, which is an advantage. Another principle: "We're not leaving when the first slides are ready, but we typically leave when the first prototype is running."
No Formulas, No Code—But Questions Welcome
"That was basically it. Anyone wants to leave because of no formulas? No? Good. We talk about responsibility." The speaker encourages questions throughout rather than saving them for the end: "If you have 20 questions at the end, everyone will be looking at you like 'another question?' and I want to have dinner, so if you have questions, do it now."
The Changing World: 2005 vs. Today
The speaker shows a picture: "Who knows where this picture is from? Anyone saw it before?" It's from 2005, showing an early adopter. "Look at that guy, he's an early adopter. It was 2005, kind of the announcement—Who Wants to Be a Millionaire?" Then the same picture, years later—everything has changed. "Things have totally changed, especially when we're talking about using data, aggregating data, using information you can collect. Everyone can collect on those kinds of devices."
FIFA 15 Terms and Conditions
"If you ever try to look into the terms and conditions of FIFA 15, for example, do that. If you play or your friends or your children play FIFA 15, give it a try." The world has really changed—everything is digital, and users really leave traces everywhere they go with their smart devices. "And we do something with those traces. That's our job, you probably know, and I do." Things can get creepy.
Defining "Creepy"
What's creepy? "There's no clear definition, but if you see it, you have this kind of hairs go up and you say 'okay, let's think about it.'" The speaker wants the audience to think about what they've done in the past and what they probably do in their work—"in terms of analysis you do, think of how creepy they are. And think what you can do as an analyst to kind of still stay in a way that you feel comfortable."
Legal Boundaries: German Data Protection Law
When talking about creepiness, we need to talk about legal boundaries first. In Germany, there are two major laws which are relevant: Bundesdatenschutzgesetz (BDSG) and Telemediengesetz (TMG). There are similar concepts in EU law (covered on the next slide). "We don't need to go to each and every point here, but the key thing is that the left-hand side—BDSG—is everything which is kind of the offline world, so this is the old economy. The TMG is basically everything which has to do with web pages, applications, devices, smart watches, whatever you're thinking."
The Key Rule: Consent
The key thing, summarized in one line: "If you have the consent of the user to collect data, then you can pretty much do anything. That's why I urge you to read the terms and conditions of FIFA 15 or any other app which is free in the app store, because if the app is free, you pay." The common understanding is that you need the user's consent twice, confirming that they are willing to share their data with you. In the app environment, the first consent is installing the app. "This is the first consent." The second consent is what typically comes up in each and every app—the terms and conditions, "the 90 pages when you do an iOS 10 update."
The Coming Change: GDPR 2018
There's something changing in the near future: May 2018. "There is a thing which comes up which has really high fines if you work against it, so companies are really keen to follow that thing. This is the only kind of motivation you really can give." There's also something different from the past: in Germany at least, there's a concept—you have personal data, anonymous data, and in between you have pseudonymous data, which is something like "you can still tell that this is the same person who was in Munich on Monday and in Frankfurt on Wednesday, but you don't know who it is. That's pseudonymous."
Pseudonymous Data Privilege Ending
That concept is not widely known, at least not outside of Germany. "And it has a privilege at the moment, so even for this pseudonymous data, you don't need consent of the user. And this changes in 2018. So this changes a lot of business models in the future." The industry is hardly following these changes.
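To make the personal/pseudonymous/anonymous distinction concrete, here is a minimal sketch (not from the talk; all names and the key handling are illustrative assumptions) of one common way pseudonymization is done in practice: a keyed hash replaces the direct identifier, so records from the same person stay linkable while the person cannot be named without the key.

```python
import hmac
import hashlib

# Illustrative only: a secret key held by the data controller. Whoever holds
# this key could re-link pseudonyms to identifiers, which is why pseudonymous
# data still counts as personal data under the GDPR.
SECRET_KEY = b"replace-with-a-securely-stored-secret"

def pseudonymize(user_id: str) -> str:
    """Replace a direct identifier with a stable pseudonym (keyed SHA-256)."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Two sightings of the same (hypothetical) user remain linkable:
# "the same person was in Munich on Monday and Frankfurt on Wednesday".
monday = {"pseudonym": pseudonymize("alice@example.com"), "city": "Munich"}
wednesday = {"pseudonym": pseudonymize("alice@example.com"), "city": "Frankfurt"}
assert monday["pseudonym"] == wednesday["pseudonym"]

# Truly anonymous data would drop the identifier entirely, making that
# linkage impossible; personal data would keep the raw identifier.
anonymous = {"city": "Munich"}
```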
Self-Regulation Doesn't Help Enough
Next to legal requirements, there's also self-regulation (in German: "Selbstverpflichtung")—the industry agrees on "these are the rules we want to follow," and each body has its own set. The key point comes from Peter Schaar, who wrote a 2012 blog post on publishing pictures: "The rules don't help as long as you don't restrain yourself to something you feel [comfortable with]."
The Pattern: Law Isn't Enough
When talking about responsibility, a pattern emerges. The speaker shows a diagram: even if you follow the law, you cannot hide behind the law. "If you have the storm, breaking news because you created an application and something went crazy there, then you are in the focus." There's a gradual slide from "you're still legal" to the point where it starts becoming not okay. "So you need to think about: am I still here in that black box or am I already in a way where I think I can do something useful and not harm anyone?"
Privacy Is Not New
When we talk about privacy, one might think this is a new concept. It's not. Very interesting: "It's the tents of the Norse tribes, who moved around—they already had the concept that the entrances of those tents were not facing each other. Some sort of privacy." Another example: "These are actually people sitting on toilets and talking. The Romans had really no issue there. So that changed over the course of the years."
1890: The Right to Privacy
The first thing that really comes close to what we have now is the right to privacy articulated in 1890. "And the main driver was instantaneous photography, where the paparazzi came up and people wanted to protect themselves and their privacy from that new technology." You can see that the pattern around pictures is still current—think of Google Glass. "Would you go with Google Glass?" Technology is changing things, and the data which can be collected is something you need to use carefully and wisely.
Case Study 1: Target's Pregnancy Prediction
Now for three, four, five different examples of analyses ranging from creepy to not creepy. The speaker asks the audience to vote after each example: "If you think it's creepy, hold your arm to the left side. If you think it's okay, hold your arm to the right side." The first example: Target (something like Kaufland in Germany, where you can buy food and everyday goods) looked at what their customers buy. One young data engineer had the idea: "Let's create a pregnancy score."
How the Pregnancy Score Worked
Women in their first trimester buy lotions and oils to keep their body in shape, they take additional supplements, and so on. Target found a very good way to identify pregnant women. They also knew they were at the edge, because they didn't want to send a full-fledged flyer of diapers—"they thought it's crazy, right? We never talked to that person, and now how do you find out?" So next to the diapers they put a sewing machine and some chocolates to disguise their knowledge. "They have to hide it."
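The talk does not describe how Target's score was actually computed, so purely as an illustration of the kind of purchase-history scoring being discussed, here is a toy sketch; the products, weights, and threshold are invented for this example.

```python
# Toy "pregnancy score": weight a handful of scored products and compare the
# sum against a cutoff. All weights and the threshold are invented; this is
# not Target's (unpublished) model.
PRODUCT_WEIGHTS = {
    "unscented_lotion": 2.0,
    "calcium_supplement": 1.5,
    "magnesium_supplement": 1.5,
    "large_tote_bag": 1.0,
    "cotton_balls": 0.5,
}
SCORE_THRESHOLD = 3.0  # invented cutoff for "likely pregnant"

def pregnancy_score(purchases: list[str]) -> float:
    """Sum the weights of scored products found in a purchase history."""
    return sum(PRODUCT_WEIGHTS.get(item, 0.0) for item in purchases)

basket = ["unscented_lotion", "calcium_supplement", "cotton_balls", "bread"]
score = pregnancy_score(basket)
if score >= SCORE_THRESHOLD:
    # This inference step is exactly what the talk flags as ethically loaded:
    # the customer consented to a loyalty card, not to being profiled.
    print(f"score={score:.1f}: flag for baby-product marketing")
else:
    print(f"score={score:.1f}: no flag")
```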
The Father's Complaint
The story goes—it was actually published at the time—that one father went back to them and said, "Guys, what are you doing? My 17-year-old daughter got that shitty flyer and she's not pregnant." Two weeks later, he had to come back and say, "Sorry, there was something going on in my home that I didn't know." Vote: Creepy or not creepy? The audience seems split. "Okay, that's interesting."
Case Study 2: Girls Around Me App
Another one: an app called "Girls Around Me." It used publicly available information, like Foursquare check-ins, to identify venues where a lot of girls were checking in rather than boys. "So this app was a big hit." They used only data which is publicly available. "And yeah..." The developers had to pull it from the store and also change [their approach]. The speaker shows pictures of the app interface. Vote: Creepy or not creepy? "I thought about that. Still a lot of people think it's not creepy. I found that interesting."
Case Study 3: Googling Your Neighbors
Another one: "Who has not done this, right? You share the car—your daughter, you have a six-year-old daughter who goes to school and you share a carpool and you're like 'okay, what are those guys which I put my daughter in the car? Who is that? Who's my neighbor?'" There are several other examples—you can find out the price point of many services like this. "I think this is actually kind of common at the moment. I do that. I have not Googled you yet—I do afterwards. But if you Google me, this is not me—you need to do that as well. So it's always about shares." Vote: Creepy or not creepy? The audience response suggests it's considered less creepy.
Case Study 4: Network Performance Data Collection
"So this is what we do." They collect information on smartphones about network performance—this is where they come from as a company. They have an application which sits on your iPhone or Android and basically shows you information about data speed, signal strength, very technical details about mobile networks. It shows this to you because "we think that mobile network performance is like weather. It could be me sitting here in this building and have perfect performance and you sitting there on the other side of the building and it could be shitty. Same could be for weather—that side of the building could be sunshine, that side could be rain."
Global Virtual Drive Test
"We do have basically a virtual drive test from every country except West Sahara and North Korea." What do they do with that? "We show it to the user on one side—that is your personal data—and on the right-hand side we derive aggregate information about the O2s, Vodafones, etc. of the world and use that information to judge, to help them also improve. So this is what we do with our data." Vote: Creepy or not creepy? The audience seems to consider it acceptable. "Sure? Creepy or not creepy?" Most vote not creepy.
Visualizing Personal Performance
"So you get to visualize your personal performance. You can see how good it is, how bad it is. You can dip in more detail on a daily basis and see 'okay, yesterday was better than today' and that kind of stuff. So you can search the bar which has the perfect coverage and also use the app." Users are downloading it, installing it, using it.
Case Study 5: PokéFit
"And they—we did another one. I did the creepy thing on that one as well." The speaker references the Pokémon GO hype on July 6 of this year (2016). "We thought 'okay, we need to use that' and created an application which was called PokéFit. It basically created out of a Pokémon GO session a fitness tracker. So we had information on top of your Pokémon GO session—'okay, this is the number of calories I've burned, this is the number of miles I walked, this is the time I spent'—and for sure, the same data gets collected." Vote: Creepy or not creepy?
PokéFit Results
"Yeah, it was not so successful. People installed it and then uninstalled it afterwards. It's still kind of—data is coming in—and this is how it works." The speaker shows how it functions. "Basically, if you collect data, you need to understand what you do with the data. I think this is more important. You need to protect the users—that's very important. You need to look through 'do I harm those people which deliver the data to you?' And as long as you do aggregated views, that's, from my understanding, kind of on the right side of things."
Summary: Law Isn't Enough
"As a short summary: hiding behind the law is not enough. You need to make your own decisions, from my understanding, and also speak up. I have had customer requests which I said no to. 'I won't do that.' I think it's still within the law—I could still do it because I have the consent of the user—but I think it's not worth taking the risk, especially if you work with corporate companies." They're getting kind of settled down, they don't want to risk their brand and [their reputation].
How to Avoid Problems
So how do you avoid that? "I think you need to be very transparent. You tell the user what you're doing with the data." You get some criticism—"if you look in the store and look at PokéFit, you see some critical reviews, sure. But if you are transparent about it, then your answer is 'okay, then don't install it if you don't like it. That's the game.'"
The Guest Analogy
"The key thing is: even if you invite someone to your home and he's a guest, he should not go to the refrigerator and eat your stuff, right? Even if you bought it for him. So this is where you need to slowly find the right way. So it's about morals, it's about sense, it's about common sense—a rare thing. And if you feel uncomfortable, speak up. I think that's the important thing."
Hiring and Closing
"So you want to speak up when it's too much? We're hiring. So I'll spend Saturday this week not only because it's fun but also I want to search for talents, and we currently have openings here. So get in touch with those guys or with me after the talk. And that was it for my end. Thank you."
Key Insights: Ethics Beyond Compliance
This presentation delivers several crucial lessons for data scientists and analysts working with personal data. First, legal compliance provides a floor, not a ceiling—meeting minimum legal requirements doesn't protect you from public backlash, brand damage, or ethical responsibility. The Target pregnancy case demonstrates this perfectly: technically legal (users consented via terms and conditions), but the execution reveals knowledge users didn't realize they were sharing, creating the "creepy" factor.
Second, the concept of pseudonymous data matters more than many realize, especially as GDPR eliminates its privileged status. Being able to track "the same person was in Munich on Monday and Frankfurt on Wednesday" without knowing their identity seems privacy-preserving but enables powerful inferences and behavioral profiling. The 2018 regulatory change forcing consent for pseudonymous data fundamentally alters business models built on this distinction.
Third, transparency and user empowerment serve as ethical guideposts. The speaker's network performance app succeeds ethically because it shows users their own data first, provides value directly to them, and then uses aggregated data for broader insights. This contrasts with Target's pregnancy prediction, where users had no visibility into the inferences being made about them or ability to control secondary uses of their shopping data.
Fourth, context collapse creates ethical problems even with public data. The "Girls Around Me" app used only publicly available Foursquare check-ins but combined them in ways that violated contextual integrity—users shared location to connect with friends, not to enable stranger tracking. Public data in one context becomes invasive when aggregated and repurposed, regardless of technical legality.
Fifth, self-regulation requires individual courage. The speaker's willingness to refuse customer requests that feel wrong, even when legal, exemplifies professional ethics in practice. This becomes especially important when working for corporate clients concerned about brand reputation—sometimes protecting the client means saying no to technically feasible but ethically questionable projects.
Sixth, the historical context of privacy concerns reveals technology-driven pattern repetition. From Norse tent positioning to 1890 laws addressing instantaneous photography to current debates about Google Glass and smartphone tracking, each technological leap creates new privacy tensions requiring new social norms and legal frameworks. Understanding this pattern helps anticipate future ethical challenges.
Finally, the "common sense" principle acknowledges that formal rules and frameworks ultimately depend on individual judgment. The refrigerator analogy—even if you invite someone into your home, they shouldn't raid your fridge—captures the intuition that consent has limits and contextual boundaries that can't all be codified in terms and conditions. Data scientists need to cultivate ethical intuitions that go beyond checking legal compliance boxes.
The interactive voting format brilliantly engages the audience in active ethical reasoning rather than passive listening, revealing how subjective "creepiness" judgments can be while still identifying patterns (pregnancy prediction and stalking apps generally seen as creepier than network monitoring or casual Googling). This variability underscores why the speaker emphasizes personal responsibility—no universal algorithm exists for ethical data use, requiring thoughtful case-by-case evaluation balanced against transparency, user benefit, and potential harm.