Tuesday, July 22, 2014

How Transparent Data Markets Could Better Protect Your Data and Your Rights

Originally published 04/03/2014 on Maus Strategic Consulting.com 

As buzzwords go, Big Data has proven to have a lot of staying power. Despite its arguably vague definition* and numerous privacy and accountability concerns, Big Data investment and vendor revenue not only remain strong but continue to grow. Big Data has had more than its share of controversy.
Aside from general privacy concerns, there have been incidents like the Edward Snowden leaks, Target's breach of millions of customers' credit card information, and the credit bureau Experian accidentally selling U.S. bulk consumer credit data to a Vietnamese identity thief through a third party. Indeed, as of the summer of 2013, the FTC had brought over 40 lawsuits against companies for failing to provide adequate security for consumer data, and those are only the cases that the government knows about.
Keep in mind that due to legal liabilities, companies have every reason to hide breaches from consumers if they think that they can get away with it. Then there are the arguments that the use of Big Data undermines democracy, unfairly manipulates consumers, enforces discrimination, etc.

(*For purposes of this article, "Big Data" refers to massive volumes of data useful for a variety of analyses. It is chiefly used here with regard to data about individuals, such as social media data and personal records.)
Image and data courtesy of Wikibon

"You want some, uh, Facebook Chat records with that, or maybe some psychological profiles?"
There are also quite a number of potential benefits that could be gained from analysis of Big Data, from improving epidemic readiness, to more effective education, to preventing bullying.

Even if one considers the use of Big Data to be a net gain, though, the current market for selling and exchanging data has some room for improvement in terms of security, accountability, and trust, as even many of the data brokers themselves would agree.

It seems unlikely that any strong U.S. regulations will hinder private data collection, at least in the near future.
Firstly, there is substantial money to be made from Big Data, and with money comes lobbyists to influence legislation.
Even more fundamentally though, there's the fact that politicians  themselves collect and use this data extensively to tailor advertising for their political campaigns.

Data Exchanges
However, there may be an approach that better aligns the interests of consumers, Big Data collectors, and Big Data users: transparent, privately owned, but publicly regulated markets for the data.

Imagine something like an Amazon, Alibaba, or New York Mercantile Exchange, focused on the purchase and licensing of Big Data. Suppliers could increase their markets, buyers could increase their options, and all transactions would be public record.  Much like how the SEC regulates financial markets, a government agency (possibly the FTC) would monitor all transactions and perform due diligence on participants to prevent fraud and ensure that standards are met.

The future of Big Data commerce?
Everyone would gain from this system.

Because the transaction records would be publicly available, consumer and civil liberty watchdog groups could better keep track of how data is being used, guarding against questionable practices. The openness would also create clear lines of accountability for the use and safekeeping of data, so that all parties would be fully aware of where any potential liability rests, aiding any future investigations. Standardized security and background screenings on all participants in transactions would increase data security, compared to the current patchwork of oversight.

Sellers of Big Data would gain expanded markets for their data, especially the sellers of data with more niche demand. Speaking of niches, they would also have a better view of the market overall, allowing current vendors and potential entrepreneurs to better identify gaps that they might fill.
It could also reduce their costs.  Standardized background checks on purchasers carried out by the exchanges and government agencies would eliminate redundant security screenings between companies and reduce the risk of their liability for leaks. Additionally, standardized systems which automate the purchasing process of data would reduce their need for sales staff.

Purchasers and users of Big Data would similarly gain expanded visibility of their options, which would likely drive costs down overall and help them to find the data that best suits their needs. Once their backgrounds have been screened for participation in the exchange, they wouldn't need to go through a new screening with every transaction, reducing delays.

Transparent transactions and open markets increase overall market efficiency, improving profitability for all involved. However, not only would we collectively benefit from more effective economic transactions, but we would also reap more of the rewards that Big Data research has to offer, such as more efficient public institutions and businesses.

Issues to Consider
There are a few potential objections to be considered and pitfalls to be avoided.

Some consumer advocates might be concerned that the data exchanges would further incentivize data collection through intrusive methods. However, much of the data is already being collected. The exchanges would simply make these transactions more transparent so that they would be more open to thorough, systematic scrutiny by both the public and the government. Essentially, this allows the public to better watch the watchers and prevent particularly egregious abuse.

There are a few reasons that the Big Data watchers might be hesitant to let the public watch their transactions, and thus might be hesitant to utilize the exchanges.
Firstly, they might fear a customer backlash if consumers knew how their information was being collected and used. This concern has some basis. According to a 2012 poll, 85% of American Internet users would be angry “If I found out that Facebook was sending me ads for political candidates based on my profile information that I had set to private" (which, in fact, Facebook does). Another poll, by Ernst and Young in 2013, found that 70% of online consumers are "never happy" for companies to share their personal data and 63% say that being asked to share personal data would stop them from signing up to a new service or product.
However, practices such as these are already going on extensively, and there is nothing in particular to stop consumers from informing themselves about them. So, consumers are either already informed or semi-informed, in which case they either do or don't care, or they are uninformed and likely to remain so, regardless of any new developments on the subject.
By definition, the chronically uninformed are extremely unlikely to change their behavior regardless of what information is available on how their data is being used, including any information about the data exchanges.  The question then becomes how the informed and semi-informed would change their behavior.
Image and data courtesy of Pew Research Center.
Historically, American views of data collection have proven interestingly fluid. According to a Pew poll, when George W. Bush was in office, three-quarters of Republicans were in favor of NSA surveillance and three-fifths of Democrats opposed it. On the other hand, in 2013, with Barack Obama in office, just a narrow majority of Republicans supported NSA surveillance, while almost two-thirds of Democrats supported it.
Thus, not only can attitudes on data collection shift, but (according to an admittedly uncertain reading) it would seem that the shift corresponds strongly with how the individuals view the people associated with the data collection.
This would suggest that if the collectors, holders, and users of Big Data are seen as trustworthy parties bargaining in good faith, much of the public is more likely to be comfortable with their use of the data. If the brokers are seen to be operating openly and publicly, it is likely to improve consumer trust.
Furthermore, openly revealing just how common data commerce is will normalize it to the population at large. So, by appearing less secretive, data collectors and vendors will reduce their apparent need to be secretive.

Another reason that organizations might hesitate to make their transactions public would be that they are revealing critical information to competitors about their strategic intent. This concern could be particularly acute with political campaigns. The issue could be avoided, however, through some misdirection or obscuration on the part of the transacting organizations. Organizations could make red herring purchases, or buy in bulk to hide what they are actually interested in. Considering who would be writing the legislation, political campaigns might be given special consideration, such as allowing individual campaigns to sub-license data  from political parties or PACs without revealing precisely which data they are actually using.

There is also the risk that the big players in Big Data commerce (Google, Facebook, Acxiom, etc.) might not like the opening of these new markets, fearing that it could open the door to competing vendors. Thus, they might be tempted to boycott the exchanges to prevent them from picking up steam.
On the other hand, the major vendors have a great deal to gain from the exchanges. Firstly, they have the most data to sell. Secondly, they would benefit immensely from the increased trust in the industry that openness and transparency might allow.

The inverse risk would be that the larger Big Data vendors might try to use the exchanges and the regulations to strangle smaller vendors with regulatory costs, or leverage their market power to ensure favorable treatment by the exchanges, thus consolidating their hold on the market and stifling innovation. To prevent this, security auditing requirements and other expenses should be scaled based on the size of the organization's transactions, and a certain degree of neutrality in the exchanges should be mandated to ensure balance.

For various reasons, some organizations might prefer to keep a few or even all of their transactions secret from the public. Thus they might wish to make these transactions outside of the exchanges.
One could argue that their intention to avoid public scrutiny deserves some public scrutiny.
Considering the threat to private security, the matter merits some thought as to whether this should be allowed at all. Not only could the transactions be insecure or even illegal themselves, but their very existence might undermine the public trust in Big Data that the exchanges would help to establish. Thus, it may be worth considering a requirement that all Big Data transactions above a certain threshold be publicly disclosed (using proper measures to prevent circumvention), or even a requirement that all transactions take place through the exchanges.

One complication that remains is how intelligence, defense, and law enforcement agencies and their contractors should fit into the overall picture. They would certainly argue that they have a special need for their Big Data activities to be secret. Whether or not they merit an exception, and how they should be handled, is beyond the scope of this article, however.

