A blog about security, privacy, algorithms, and email in the enterprise. 

Data Privacy, Cataphora, and the Maginot Line between Transparent and Intrusive

Data is a trending topic right now, and data privacy is one of its trendiest subsets. To wit: Charles Duhigg’s investigative report on Target’s data mining for the New York Times spawned a series of follow-ups; in March, The Atlantic profiled NYU Law professor Helen Nissenbaum and her flow-based privacy framework; and the FTC just published a privacy report endorsing privacy-by-design and the “Do Not Track” button.

The demarcation line between what should be public and what should be private is a dynamic and jagged (some might say gerrymandered) one that depends on a piece of data’s original context vs the contexts in which it is eventually used. It seems perfectly reasonable for Foursquare to publish its users’ locations but less reasonable for a third-party dating application like Girls Around Me to provide these locations, along with Facebook profile photos, to its users. It seems reasonable that an online money management service like Mint serves up ads tailored to users’ credit ratings, but less reasonable that banks determine applicants’ loan rates based on their Facebook friends’ credit ratings.

Because we’re storing and analyzing corporate email, user privacy is something that we have to get right. Of course, an employer’s definition of “right” might be different from the employee’s, so we’ve been trying to figure out a definition that will please both. Companies are legally permitted to access their employees’ email, and this usually takes the form of monitoring for explicit or inappropriate language. As long as employees are aware of the monitoring, this sort of vocab dinging seems reasonable. But what about sentiment analysis, and the inferred knowledge of employees’ states of mind it provides? Invaluable to the company, I think, but potentially detrimental, and sometimes errantly so, to the employee. Does explicit consent justify armchair psychology and any actions that result? Even if employees are fully and duly informed of all monitoring and tracking practices, I’m not sure.
Take, for example, Cataphora.

Cataphora makes “behavioral modeling and monitoring” software that analyzes employees’ digital and mobile actions from legal, risk, compliance, HR, and brand management perspectives. The copy on its website doesn’t even try to address employees; there are callouts on its news page to articles with titles like “In Defense of Employer Monitoring” and “Finding Office Buck-Passers, Heroes, and Shirkers.” If employers are not monitoring employees’ digital activity, Cataphora CEO Elizabeth Charnock argues, they are making themselves vulnerable to leaks, blow-ups, and YouTube frittering-induced productivity slumps. In a blog post entitled “Getting Big Brother Right,” Rick Janowski brought up as a use case an employee on the verge of a breakdown due to non-work-related factors. Cataphora could identify and alert management to the employee’s mental state, allowing them to “provide a safety net for someone who might be prone temporarily to making bad decisions or being less diligent than they normally would be.” Aka remove him from fiscal and legal harm’s way before it’s too late. Ooh, Carnival Cruise is having a flash sale! I hear Alaska’s great this time of year!

You could argue that behavioral mining software is just one of the many new “transparent” office measures, which manifest physically in concepts like open and free range offices (a different desk every day!), and culturally in social enterprise platforms like Yammer, Rypple, and Trello. There’s been a push, lately, to besmirch the traditional office, with its many doors and walls and silos. Which is all very well and fine, but there is a point where public property ends and person begins. Perhaps the central tower is too zoomed in to see it.



Big Data: Making Complex Things Simpler

[Image via ICT4Accountability]

Yesterday and today, I attended the first-ever Big Data class at MIT Sloan. The lecturers were Erik Brynjolfsson and Alex 'Sandy' Pentland. I'd previously heard Erik speak at MIT in October (when I first heard in-depth about the components of Big Data), and I've since read his book Race Against the Machine: How the Digital Revolution is Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and the Economy (highly recommended). I had high expectations, and they were ultimately exceeded.

It will take me a bit of time to catch up on everything I wrote down in 50 pages of notes. Due to a combination of no hotel wifi, Amex fraud false positives and Verizon order complexity, I had only my T-Mobile BlackBerry for connectivity on the first morning of the conference. For the first time in a long time, I took notes on paper throughout the class. This gave me mixed feelings. It was certainly nice to create diagrams easily, use different fonts and means of emphasis, create my own notation for action items and areas to research. However, now I'm left with a fragile notebook that I'm paranoid about losing and hours of transcription ahead.

Following a quick and dirty data mine of my own notes, here are some of the most interesting topics, insights, theories, and quotes from the day:


  • Balancing experienced gut vs data
  • Learning to discriminate correlation vs. causality
  • Effectiveness of different communication media for communicating and learning
  • Social graph patterns for creative vs cohesive groups
  • How to continuously run experiments and use Overall Evaluation Criteria
  • Stages of Organizational Evolution: Hubris, Measurement, Semmelweis Reflex, Fundamental Understanding.
  • Big Data Out in the Wild
  • Techniques for Building Viral Adoption
  • Email Analytics: Productivity and Information Diffusion
  • Privacy Legislative Issues in the US and Europe
  • Personal Data as Asset Class
  • The Matrix of Change

Insights and theories: 

  • Companies born on the web, such as Amazon, Facebook, and Google perform hundreds of experiments per day.
  • The Hawthorne Effect: letting people know that they are being experimented on changes their behavior.
  • Social metrics: betweenness, centrality, constraints, geodesic distance
  • Behavioral demographics (where you go, who you hang out with) are a more precise form of defining identity than iris scans or fingerprints.
  • Researchers are able to diagnose depression just by observing cell phone usage. 
  • The Panopticon: who needs a physical surveillance tower in the smart phone age?
  • Twyman's Law: Any statistic that appears interesting is almost certainly a mistake.
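
For anyone who wants to poke at the social metrics in the list above, here is a minimal sketch of two of them, geodesic distance and betweenness, in plain Python. Everything in it is my own illustration and not material from the class: the five-person "who emails whom" graph is made up, the function names are hypothetical, and betweenness is computed with Brandes' algorithm on an unweighted, undirected graph without normalization.

```python
from collections import deque

# Hypothetical undirected "who emails whom" graph; the names are made up.
GRAPH = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat", "eve"},
    "eve": {"dan"},
}


def geodesic_distances(graph, source):
    """Breadth-first search: shortest hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def betweenness(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint, so halve.
    return {v: total / 2 for v, total in score.items()}


if __name__ == "__main__":
    print(geodesic_distances(GRAPH, "ann"))
    print(betweenness(GRAPH))
```

In this toy graph, "cat" is the bridge between the ann/bob/cat cluster and the dan/eve chain, so it gets the highest betweenness (4.0), with "dan" next (3.0) and everyone else at zero. On real email data, a graph library would be a better choice than hand-rolled loops.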

Vox populi:

  • "A wealth of information creates a poverty of attention." -Herbert Simon
  • "When physicists have data that is too noisy, they build a better tool for finer resolution." - Erik Brynjolfsson
  • "Big data is a mental prosthetic." - Erik Brynjolfsson
  • "People are bundles of habits formed by the people around them." - Pentland (more so than a person's friends or peers).
  • "Go get the data! Don't argue about designs."
  • "To have a great idea, have a lot of them" -Edison
  • Lord Lever's Quandary: "Half of my marketing budget is wasted. I just don't know which half."
  • "Where you spend your time is who you are." - Pentland
  • "In Hong Kong, you'll buy everything but your house on your phone." -Pentland referring to the all knowing Octopus card.
  • "70% of all workers are information workers." unattributed.
  • "People care about privacy, but if you offer them an Amazon Gift Card, they will turn it over." -Pentland
  • "Gender predicts information diffusion, but not productivity" -Erik Brynjolfsson, from data on email analytics
  • "People being rational is an abominable model, but all economics is based upon it." -Pentland
  • "We are not concerned about data privacy. We don't give data to anyone except the government." Manager from China Mobile

I also learned that, as CEO, mine is the HiPPO (Highest Paid Person's Opinion, a term coined by Ronny Kohavi of Microsoft), which is fraught with danger for the organization.