Dark Data in Blockchain Ecosystem

by speculative data

This is a speculative piece on the blockchain ecosystem. I acknowledge the environmental impact that cryptocurrencies currently have. My intent is to focus on blockchain and proceed with faith that blockchain technologies will evolve over time.

Currently, major data generators like TIFAAANG (Twitter, Instagram, Facebook, Apple, Amazon, Alibaba, Netflix, Google) are the biggest dark data holders. Beyond them, local grocery stores, multinational companies, governments at every level, schools, universities, hospitals, and many others all generate data of different scales and scopes. The global scale of such data could be around 7.5 septillion gigabytes per day. Only a tiny fraction of it is ever utilized; the remaining unused data is dark data.

Collectively, online human behaviours have created massive amounts of activity, fortunes, jobs, opportunities, knowledge, and leisure. Tech companies of all sizes have used our online data for their profit-maximizing activities.

Numbers are the crystal ball of the modern world. Companies are monetizing massive amounts of data to advertise, connect, and share. In fact, new businesses are emerging just to collect data and sell it to bigger companies. Others are being built on Artificial Intelligence to help businesses share and analyze data securely. Data therefore has value, and businesses of any size will gladly agree: the more data they have, the more they will use it for their own benefit. Even USPS is monitoring your social media posts.

Dark Data is a mixture of raw and processed data that organizations and agencies collect and store. You may have heard of big data, which refers to massive amounts of information. Inside such a large data set, only a portion is structured and used, such as age, name, location, and phone number. Semi-structured data is loosely organized, such as tweets organized by hashtags or emails by inbox. The remaining, unstructured data, which the IT industry also calls Dark Data, is information that is warehoused without a clearly defined framework or model, such as sounds, digital surveillance photos, and sensor data. Dark data has potential value for any business because it may hold unfiltered information that can be monetized if utilized properly.

A leading global consulting firm, Deloitte, states:

“There are different types of dark data. Take a message, such as a tweet. A tweet is ‘dark’ as it needs to have the language extracted so that a computer can analyse what is written in it. The metadata around the tweet is ‘dark’ also - the time of day sent, the @user, the #hashtag, the device, the location. Analysing the text in the tweet gives you an insight into what is being said, who said it, how happy or angry the sender is. The tweet may contain images or audio, which when analysed using image recognition tools can extract content such as descriptions or terms…”

So a tweet is not just a tweet; it can be of value depending on who wants it and how they plan on using it.
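To make the Deloitte description concrete, here is a minimal sketch of pulling a few of those "dark" fields out of a raw tweet. The function name, the regular expressions, and the sample tweet are all assumptions invented for illustration; real analysis pipelines use far more robust parsing, plus sentiment and image tools that are not shown here.

```python
import re

def extract_tweet_fields(text):
    """Pull mentions, hashtags, and plain words out of a raw tweet string."""
    mentions = re.findall(r"@(\w+)", text)
    hashtags = re.findall(r"#(\w+)", text)
    # Remove mentions and hashtags, then split what is left into words.
    stripped = re.sub(r"[@#]\w+", " ", text)
    words = re.findall(r"[A-Za-z']+", stripped)
    return {"mentions": mentions, "hashtags": hashtags, "words": words}

# Hypothetical tweet, invented for illustration only.
tweet = "Loving the new phone @acme_support #happy #tech"
fields = extract_tweet_fields(tweet)
print(fields)
```

Even this toy version shows how one short message breaks down into several separately valuable pieces: who was addressed, which topics were tagged, and what was actually said.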

I believe that Dark Data, right now, is waiting for something. While it may or may not be used by organizations, it should not be ignored. Dark Data is waiting for a technology to take it to the next level, either to add value or to be exploited for profit in ways that cause untaxed externalities. I believe blockchain is a technology with the potential to add value as well as to secure such data with proper ownership. In simple terms, blockchain is a decentralized, distributed database system that removes the need to trust central parties.

Trust is something that the Cypherpunks did not have for the central banks and their control over monetary policy. The Cypherpunks are an eclectic crew of hackers, hobbyists, technologists, and rebels sharing a core belief “that the internet would soon become an important battleground for human freedom”. The 2008 market crash seemed to confirm their suspicions: central banks and governments spent massive amounts of money to bail out failing banks that had defrauded the American people.

For curious readers who wish to know more about the Cypherpunk movement, the book Digital Cash: The Unknown History of the Anarchists, Utopians, and Technologists Who Created Cryptocurrency by Finn Brunton is a good source of information. Brunton, an NYU professor, tells the story of the pioneers of cryptocurrency and cryptography, and of the technical experiments, attempts, and failures on the way to the first digital currency and the technologies involved. Digital Cash also explores the wider questions of what makes digital objects valuable and how we learn to trust and use different technologies along the way. Another resource for learning more is the FAQ called THE CYPHERNOMICON.

This brings us to the creation of Bitcoin by an anonymous group or individual that goes by Satoshi Nakamoto. Bitcoin was created amidst the 2008 market crash and has been the topic of many controversies. However, it wasn’t just “Bitcoin” that was revolutionary; it was what was inside. What was this significant, intangible, unknown, complex technology inside Bitcoin in 2009? Blockchain. Big banks make fortunes from data, yet they failed to understand cryptocurrency and blockchain, and flatly dismissed their existence for years.

 Source: reddit — https://www.reddit.com/r/Bitcoin/comments/ma5lmd/bitcoin_before_and_after_headlines/

That changed once some interns presented the various use cases of blockchain and explained why cryptocurrencies are here to stay. Even the CIA is on board and has probably added an explainer video. Still, one has to wonder whether the dynamics in workplaces around new technologies are akin to those in Dilbert’s workplace…

Source: https://dilbert.com/strip/2018-04-20

Why Blockchain and what is it?

Investopedia defines blockchain as:

“One key difference between a typical database and a blockchain is the way the data is structured. A blockchain collects information together in groups, also known as blocks, that hold sets of information.”
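The grouping described in the quote can be sketched in a few lines. This is a toy illustration, not any real blockchain's format: the field names (`index`, `records`, `prev_hash`) and the sample records are assumptions, and the detail that each block stores the hash of the one before it comes from common blockchain designs rather than from the quote itself.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(block):
    """Hash a block's full contents in a stable (sorted-key) encoding."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(groups):
    """Turn groups of records into blocks, each pointing at the previous one."""
    chain = []
    prev = GENESIS_HASH
    for index, records in enumerate(groups):
        block = {"index": index, "records": records, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain

def is_valid(chain):
    """Check that every block still points at the true hash of its predecessor."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True

# Hypothetical records, invented for illustration.
chain = build_chain([["a pays b 5"], ["b pays c 2", "c pays a 1"], ["c pays b 3"]])
valid_before = is_valid(chain)

chain[0]["records"][0] = "a pays b 500"  # tamper with an early block
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Because every block depends on the hash of the one before it, changing an old record breaks the link to every later block, which is what makes this structure different from an ordinary editable database table.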

Blockchain is essentially a decentralized ledger that does not need a central authority or third-party intermediary to control it. Certain blockchains can do without a central entity because they rely on a Proof-of-Work (PoW) algorithm, a mechanism that allows the decentralized network to come to agreement on transactions. However, proof-of-work (commonly known as crypto-mining) has a severe environmental impact because of the electricity consumed in solving its computational puzzles. There could be millions of miners around the world, which magnifies the overall environmental impact.
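A rough sketch of why proof-of-work is expensive: a miner must try one number after another until a hash happens to meet a target. The version below is a simplified stand-in under stated assumptions; the function names, the "leading zeros" rule, and the difficulty of 3 are chosen for illustration and do not reproduce Bitcoin's actual block format or difficulty encoding.

```python
import hashlib

def mine(data, difficulty):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # no shortcut: just try the next number

def verify(data, nonce, difficulty):
    """Checking a claimed solution takes a single hash."""
    digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
    return digest.startswith("0" * difficulty)

# Hypothetical block contents and a deliberately low difficulty.
nonce, digest = mine("block of transactions", 3)
print(nonce, digest, verify("block of transactions", nonce, 3))
```

Finding the nonce takes thousands of attempts even at this toy difficulty, while verifying it takes one. Each extra required zero multiplies the expected work by sixteen, which is where the electricity use discussed above comes from.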

The original blockchain was created inside Bitcoin. However, the past decade has seen the rise of more advanced blockchains such as Ethereum. Some of Ethereum’s early creators also recognized the wider potential of blockchain and began pursuing other blockchain-related projects. Dr. Gavin Wood, a computer programmer, co-founder of Ethereum, and inventor of Solidity (the programming language used on the Ethereum blockchain), left Ethereum to create another blockchain called Polkadot. At the time of writing, the Ethereum blockchain uses the Proof-of-Work protocol, while Polkadot uses Proof-of-Stake (PoS). Proof-of-Stake uses an election process to choose who validates blocks, and provides an alternative to the heavy computational power required by proof-of-work consensus algorithms.
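The idea of electing a validator by stake can be sketched as a weighted lottery. This is only an assumption-laden illustration: the names, stake amounts, and `pick_validator` function are hypothetical, and production systems such as Polkadot use considerably more elaborate election and penalty rules than a single random draw.

```python
import random

def pick_validator(stakes, rng):
    """Choose one validator with probability proportional to its stake."""
    names = sorted(stakes)  # fixed order so results are reproducible
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical participants and stake sizes, invented for illustration.
stakes = {"alice": 60, "bob": 30, "carol": 10}
rng = random.Random(42)  # seeded so the sketch is repeatable

counts = {name: 0 for name in stakes}
for _ in range(10_000):
    counts[pick_validator(stakes, rng)] += 1
print(counts)
```

Over many rounds, each participant is chosen roughly in proportion to what they have staked. No puzzle has to be solved, so the selection costs almost no computation compared with the proof-of-work search.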

There has been much ongoing development and research surrounding blockchain since its creation in 2009. In the United States, a crucial framework was published in October 2020, when Attorney General William P. Barr announced the release of “Cryptocurrency: An Enforcement Framework,” a publication produced by the Attorney General’s Cyber-Digital Task Force. The 80+ page document published by the Department of Justice (DOJ) is significant in a few ways. One, by releasing a framework at all, the DOJ shows America and the world that it is focusing on crypto. Two, as Coindesk indicates, “the document also suggested the U.S. government would enforce its laws regardless of where exchanges – referred to as virtual asset service providers, or VASPs – are based. In other words, these exchanges should comply with U.S. laws – even for their non-U.S. customers”. Three, there is a section (page 20, II. Law and Regulations) that lists all of the laws the DOJ can use to counter malicious activities. This is a stepping stone in the right direction, given how many cryptocurrency-exchange platforms are emerging and how the use cases of cryptocurrencies are evolving.

Another hurdle that blockchain has to overcome is its image problem. Cryptocurrencies are notorious for bad actors such as scammers and hackers. It seems that almost every month a new crypto-scam or pyramid scheme appears. However, it is important to clarify that implementing blockchain for business purposes is not the same as hyping and marketing cryptocurrencies. Blockchains serve as the infrastructure for creating cryptocurrencies; without blockchains there are no cryptocurrencies.

Koray Caliskan, author of “Data Money: The socio-technical infrastructure of cryptocurrency Blockchains,” defines cryptocurrency as “data money.” Blockchain allocates the rights to move data in the digital space. The paper describes a blockchain as a digital actor-network platform that makes it possible to define and distribute these data transfer rights.

Why Blockchain: My proposed concept is to develop blockchain as the infrastructure for a permanent data lockbox for all dark data. It would be a technological base for resolving the privacy paradox. For example, Professor Vincent Mitchell and Bernadette Kamleitner argue, “One way to increase the sense of ownership would be to create a system where individuals are able to sell their personal data or, as has been promoted by California governor Gavin Newsom, to institute a ‘data dividend’ whereby some percentage of the revenue generated from users’ personal data is returned to the user.” Blockchain could therefore offer an alternative direction for advancing data rights.

Some of the newly proposed New York privacy bills require businesses to act as “Data Fiduciaries.” New York State Senator Kevin Thomas stated, “Fiduciaries, like an attorney or a doctor hold onto your information. They don’t share it, unless there is a need for the purpose for which they collected it. That’s not what’s going on here with these data companies, and these data brokers. They’re sharing it and we’re getting targeted.” However, one of the main problems emerging from these kinds of laws is that they can conflict with other states’ rules and regulations.

Lina Khan has argued that the proposed NY bill is incompatible with existing law in Delaware, where many of the tech giants are incorporated and which requires companies to maximize returns for shareholders. This causes the interests of stockholders and users to diverge. Directors of companies may be put in the untenable position of having to violate their fiduciary duties to stockholders under Delaware law in order to fulfill their fiduciary duties to New York users. This is just New York State, and many other states are proposing slightly different bills. At this rate, more complexities for businesses and the general public are bound to arise, and an effort at simplification may lead to federal legislation that is more palatable to lobbyists but offers less protection to citizens.

However, the question remains of who should have ownership of an individual’s data: people, businesses, or government? In the absence of data regulation and data protection, how can technology be the bridge for all stakeholders?

Conclusion

Individual concerns about dark data will increase in the coming days and years, and dark data accumulation will multiply as well. A few state governments, such as New York and California, are trying to address data privacy and data rights. But how governments will handle big data issues alongside individuals’ rights to own and manage their personal data remains in the dark because proper rules are lacking. There is also no internationally accepted framework for governing data. New startups attempting to tackle dark data may not be effective without such regulations.

Blockchain ecosystems could be a viable technological intervention for protecting people’s privacy and data rights. This sort of new technology could be developed through public-private collaboration in order to manage dark data. For example, the major data generators of TIFAAANG (Twitter, Instagram, Facebook, Apple, Amazon, Alibaba, Netflix, Google) and other corporations could use “blockchain ecosystems” to keep individuals’ data secured and to compensate them for it. Meanwhile, the government could (and should) introduce new rules for blockchain data management and privacy, because the technology has the potential to build trust among all players in the ecosystem: government, businesses, and people.


speculative data is a collection of user data generated from a location within the crop circles. They are not a tool of A.I but rather an agency of A.T (Alien Technology).

1
Dark Connections

The internet is made up of interconnected pieces of data about its users. Every website has trackers installed in it, mostly belonging to Google or Facebook, that keep tabs on the people using it. This data is neither protected nor encrypted, and is often fully accessible to anyone with the means to access it. Though these companies store our data and use it to sell their products to us, they are in no way responsible for it. This entire system is almost never made explicit and stays shrouded in the background of its utility. This section aims to connect the dots that exist in the dark underbelly of the internet, which we have a vague idea about but which are not necessarily clear.
Making these connections can make the online experience feel scary and unsafe, but it already is. Although governments and large corporations are often seen as the problem, the truth is that they are far less interested in you or me than someone who knows us personally and has an agenda that involves us. This section shines a light on the dark patterns that enable your data to be collected and potentially mobilized against your interest.

2
Digital Forensics

In order to combat the practice of dark data, one can exploit the loopholes in its architecture. But in order to do this, we need to at least comprehend the full extent of the information that is collected about us. It is now possible for us to demand the data that is collected about us, though this option is not obvious to most people. Resources like APIs, Google Takeout, and OSINT tools allow us to conduct small-scale investigations into where our data lives and what data exists about us. This section is a collection of attempts by the authors to gain access to and interpret their own data that exists online.
However, awareness of the data does not guarantee control over it. Google may give us a copy of the data that exists about us on its servers through its Google Takeout service, but this does not mean that we now own this data. Google can still use it however it likes; it has not been deleted from Google’s databases. We are being given only an illusion of control, and this is intentional. Digital Forensics can only grant us a window into this massive machine, the machinations of which may still remain unclear. This section explores these windows and what they teach us both about ourselves and about the technology that we use.

3
Data Futures

What is the future of dark data? People are increasingly aware that information about them is collected online. Governments are making efforts to regulate Big Tech and protect the privacy of citizens. How can we imagine better ways to exist in the system? How can we protect ourselves from its repercussions? This section speculates how dark data is changing as a practice. It discusses ways in which people can take action and re-examine their browsing methods. The ideas discussed here think about how technology can be used to propose solutions to the problem it has created.
It is important to consider that the practice of data collection and exploitation is ongoing. There is no easy way out of these cycles. However, we would like to believe that sparking deliberate thought and action to help you orient yourself in this Wild West landscape can make the process of coming to terms with dark data easier.

4
About

This digital edition was compiled from scholarship, research, and creative practice in spring 2021 to fulfill the requirements for PSAM 5752 Dark Data, a course at Parsons School of Design.

Editors

  • Sarah Nichols
  • Apurv Rayate

Art Directors

  • Nishra Ranpura
  • Pavithra Chandrasekhar

Technology Directors

  • Ege Uz
  • Olivier Brückner

Faculty

  • David Carroll
  • Melanie Crean

Contributors

This site needs no privacy policy because we did not install any tracking code and this site does not store any cookies.