Digital Profiling and Fingerprinting

by Nicholas D’Angelo

The field of digital privacy and data security is constantly changing. In the past year we have experienced a global pandemic that has pushed us to live our lives online. With the increase in traffic, the need for online privacy and data security has surged. The ad tech industry exists to track and profit from user data, so as more security practices become commonplace, the industry will continue to find new means of subversion and surveillance. A newer method of tracking and recording user data is emerging because it manages to subvert most of the current means of protection. That method has been dubbed fingerprinting, and for good reason.

As data privacy has gained more media attention, users have become more aware of the extent of data collection and online surveillance. Users have been searching for ways to improve their browser security, resorting to VPNs (virtual private networks), ad blockers, and password managers to secure themselves. These products all share the goal of protecting their users’ devices and privacy online. VPNs in particular claim to anonymize and protect you when browsing the internet, as if they were a one-stop shop for security. While a VPN is a good starting point, it really only obscures its user’s IP address, not what that user does on the sites they visit. Regardless, the spread of these technologies suggests that more people are interested in protecting their data, whether from the prying eyes of governments, hackers, internet service providers, or the ad tech industry.

Understanding the different ways to protect your information online is more important now than ever. Because there are so many of them, they can be confusing, and some are marketed as doing things they cannot.

A VPN is one of the most common tools for digital security, essentially allowing a user to send their data through a protected connection that also masks their IP address. VPNs can be used to create a private connection on a public wireless network, protecting you from bad actors and vulnerabilities on an insecure network. By replacing the user’s IP address with that of the server selected in the VPN, it can give the user access to content that would otherwise be inaccessible because of their location. VPNs are incredibly useful for getting an unfiltered or uncensored browsing experience, because they let the user reach content and information that could be restricted where they are. By using a VPN you could view news coverage from around the world, access blacklisted websites, or use social media platforms that have been banned. In practice, there are cases where a VPN cannot function properly because its IP address has been blocked by a third party, preventing it from connecting to the server that would tunnel your browser traffic.

What a VPN won’t do is hide your activity from the VPN provider or from the websites you visit, or protect you from viruses and other malware. Your Internet Service Provider can still see that you are connected to a VPN, though not the contents of the encrypted tunnel. Only antivirus software will protect your device from the effects of malware and viruses; it usually scans suspicious files and monitors applications. While having antivirus software is advisable, it won’t do much to protect your activity online or stop the collection of your data. If you’re trying to avoid targeted advertisements when browsing, ad blockers are often built into browsers or available as extensions that block or hide advertisements on a webpage. By stopping ads and pop-ups, ad blockers can give the user a more secure experience when browsing, preventing the user from being redirected to unsafe websites.

Apart from ensuring their connection to the network is safe, users can further secure their accounts with password managers. Popular browsers now have these built in. A password manager essentially stores your passwords, and many cross-reference them against published leaks to check whether your information has been compromised. This is an extra layer of security and should not be used as a replacement for complex, unique passwords.

All of these add up to make a user’s browsing experience more secure, but they will not protect the user from some methods of tracking. Even with all of these efforts, users can still be tracked through a myriad of small details. Depending on who is tracking your actions online, these details, when put together, can make up a highly distinctive identifier.
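To get a feel for how quickly small details add up, the arithmetic can be sketched in Python. The shares below are invented for illustration and are not measurements, and the sketch assumes the details are independent of one another, which real attributes usually are not.

```python
import math

# Hypothetical share of visitors who report the same value for each detail.
# These numbers are made up for the example; they are not measurements.
share_with_same_value = {
    "timezone": 0.10,
    "language": 0.25,
    "screen resolution": 0.05,
    "installed fonts": 0.001,
}


def surprisal_bits(share: float) -> float:
    """Bits of identifying information revealed by a value this common."""
    return -math.log2(share)


# Assuming the details are independent, their bits simply add up.
total_bits = sum(surprisal_bits(s) for s in share_with_same_value.values())

# Roughly one visitor in 2**total_bits would share every single value.
one_in = 2 ** total_bits
```

Under these made-up numbers, four unremarkable details already narrow a visitor down to roughly one in 800,000.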

Not all browsers are created equal. Some browsers prioritize user security over the functionality of a website’s UI or graphic design. Most browsers can be configured to be more secure by turning off cookies, restricting JavaScript, and limiting device permissions. Regardless of the amount of security, fingerprinting methods can construct some kind of profile of you based on a multitude of small details. The software and operating system on your computer, your apps and the APIs they connect to, any unusual fonts, account settings, and even the resolution of your window can be gathered. The browser’s version, permissions, time zone, cache, and plugins can all be recorded and logged. The majority of tracking data points come from JavaScript-derived characteristics, specifically the navigator object and the canvas fingerprint. The canvas element is an HTML5 feature used to create graphics and illustrations in the browser. A script can also use it to draw a hidden image whose pixel data, which varies slightly with the device’s graphics hardware, drivers, and fonts, is converted into an identifiable series of characters called a hash, all while being invisible to the user.
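As a rough illustration of how scattered details collapse into a single identifier, the Python sketch below hashes a handful of attributes into one string. The attribute names and values are invented for the example, and SHA-256 is just one reasonable choice of hash; real fingerprinting scripts run in the browser and gather far more signals than this.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Collapse a set of browser/device attributes into one hex identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values, invented for illustration only.
visitor = {
    "timezone": "America/New_York",
    "language": "en-US",
    "screen": "1440x900",
    "fonts": ["Arial", "Futura", "Helvetica"],
    "plugins": ["PDF Viewer"],
}

# The IP address is not part of the input, so switching VPN servers
# between visits leaves the identifier untouched.
first_visit = fingerprint(visitor)
later_visit = fingerprint(dict(visitor))

# Changing a single detail (here, one extra font) produces a
# completely different identifier.
changed = dict(visitor, fonts=visitor["fonts"] + ["Comic Sans MS"])
changed_visit = fingerprint(changed)
```

Because the same inputs always produce the same hash, a site does not need a cookie or an IP address to recognize a returning visitor whose details have not changed.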

The JavaScript navigator object lets scripts on a web page read basic information about your browser and device. Together with related objects and APIs, it can reveal the number of processor cores, battery level (in browsers that expose it), connected media devices, language settings, time zone, and your screen’s resolution, but not the device’s MAC (hardware) address. While JavaScript is used to generate a large number of data points, some methods of fingerprinting don’t rely on the browser or device at all, instead looking at the patterns and habits of the user. These data points can be derived from mouse movements, keyboard use, and patterns of visits to specific websites or social media platforms.

What can you do?

I would argue that people who are trying to be more secure are likely to take extra precautions and use services that claim to protect them. This can be both beneficial and detrimental when it comes to browser fingerprinting. By changing settings they could make data collection harder, but they could also be making their profile more unique. One of the most effective and simple defenses against fingerprinting is to install a ‘tracker blocker’ (essentially a form of ad blocker that prevents known tracking scripts from running). This will unfortunately disable some functionality on many websites. Many data collection methods use JavaScript to track users, which is unfortunate because JavaScript is deeply ingrained in how browsers and websites function. More browsers are beginning to add support for Tor networking. The Tor Browser attempts to make every user’s fingerprint appear as standard as possible, and its stricter security settings block JavaScript. Tor also functions somewhat like a VPN, directing the user’s traffic through multiple different servers to make tracking online activity much harder.
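The trade-off between changing settings and blending in can be sketched with a toy population in Python. The visitors and the "standard" profile below are hypothetical, invented purely to illustrate the idea of an anonymity set (the number of people who share your exact profile); they are not taken from any real browser.

```python
from collections import Counter

# Hypothetical visitors: (time zone, screen size, tracking-protection setting).
visitors = [
    ("UTC-5", "1920x1080", "default"),
    ("UTC-5", "1920x1080", "default"),
    ("UTC-5", "1920x1080", "default"),
    ("UTC-5", "1366x768", "default"),
    ("UTC-5", "1366x768", "default"),
    ("UTC-5", "1920x1080", "custom"),  # the one user who changed a setting
]


def anonymity_set_sizes(population):
    """For each visitor, count how many visitors share their exact profile."""
    counts = Counter(population)
    return [counts[profile] for profile in population]


def standardize(profile):
    """Report one agreed-upon profile for everyone (a made-up standard)."""
    return ("UTC", "1000x1000", "default")


before = anonymity_set_sizes(visitors)
after = anonymity_set_sizes([standardize(v) for v in visitors])
```

Here `before` is `[3, 3, 3, 2, 2, 1]`: the user with the custom setting is the only one with that profile and stands out. After standardizing, `after` is `[6, 6, 6, 6, 6, 6]`, so every visitor hides in the same crowd.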

Wide net being cast

Browser fingerprinting and tracking allow for the invasion of user privacy, even when that user is attempting to traverse the web anonymously. This method can be used to monitor and collect data on individuals, regardless of that user’s intent online. It shows that data collection and processing are taking place on a large scale without the consent or awareness of users. This data can be gathered and sold to advertising agencies or third parties to fuel real-time marketing analytics. It can be used to create extremely targeted advertisements and to feed predictive algorithms that can categorize and misrepresent people. Data collection, analysis, and profiling are being used not only to predict user behaviour but also to influence it. This was exemplified by the actions of Cambridge Analytica around the 2016 United States election, and it will likely continue unless there is more legislative action on data rights and data privacy. The longer users wait to prioritize data rights and rights to digital privacy, the harder it will be to undo the collection that is already occurring. Users should be pushing for more representation and legislation for their data, to give them the ability to control how their data is collected and used. Until then, users can protect themselves from browser fingerprinting by disabling JavaScript, making their systems appear as standard as possible with tracker blockers and Tor networking, and being mindful of their usage patterns when browsing.

Data Visualization

These data visualizations are an example of how complex and unique a digital fingerprint can be. They represent some of the most common fields a user is tested for when browsing the internet. Everything from unique fonts to connected media devices is recorded and logged, and each point on the visualization is another data point that can be used to identify or track a user. Different hierarchies hold different sets of tested information, with the majority of the visual made up of JavaScript properties and the detected fonts installed on a user’s computer. The first two data visualizations include the data points created by checking for custom fonts, while the third includes only the header for that section. I created the different versions to show the data points generated without custom fonts, as the number of queries from that section made it hard to perceive the other data points in the visualization. I also chose to represent the data radially to evoke the visual of an actual fingerprint.

1
Dark Connections

The internet is made up of interconnected pieces of data about its users. Nearly every website has trackers installed in it, mostly belonging to Google or Facebook, that keep tabs on the people using it. This data is neither protected nor encrypted, and is often fully accessible to anyone with the means to reach it. Though these companies store our data and use it to sell their products to us, they are in no way responsible for it. This entire system is rarely made explicit and stays shrouded in the background of its utility. This section aims to connect the dots that exist in the dark underbelly of the internet, which we have a vague idea about but which are not necessarily clear.
Making these connections can make the online experience feel scary and unsafe, but it already is. Although governments and large corporations are often seen as the problem, the truth is that they are far less interested in you or me than someone who knows us personally and has an agenda that involves us. This section shines a light on the dark patterns that enable your data to be collected and potentially mobilized against your interest.

2
Digital Forensics

In order to combat the practice of dark data, one can exploit the loopholes in its architecture. But to do this, we need to at least comprehend the full extent of the information that is collected about us. It is now possible for us to demand the data that is collected about us, though this option is not obvious to most people. Resources like APIs, Google Takeout, and OSINT tools allow us to conduct small-scale investigations into where our data lives and what data exists about us. This section is a collection of attempts by the authors to gain access to and interpret their own data that exists online.
However, awareness of the data does not guarantee control over it. Google may give us a copy of the data it holds about us through its Google Takeout service, but this does not mean that we now own this data. Google can still use it however it likes; it has not been deleted from Google’s databases. We are being given only an illusion of control, and this is intentional. Digital forensics can only grant us a window into this massive machine, the machinations of which may remain unclear. This section explores these windows and what they teach us both about ourselves and about the technology that we use.

3
Data Futures

What is the future of dark data? People are increasingly aware that information about them is collected online. Governments are making efforts to regulate Big Tech and protect the privacy of citizens. How can we imagine better ways to exist in the system? How can we protect ourselves from its repercussions? This section speculates on how dark data is changing as a practice. It discusses ways in which people can take action and re-examine their browsing methods. The ideas discussed here consider how technology can be used to propose solutions to the problem it has created.
It is important to consider that the practice of data collection and exploitation is ongoing. There is no easy way out of these cycles. However, we would like to believe that sparking deliberate thought and action to help you orient yourself in this Wild West landscape can make the process of coming to terms with dark data easier.

4
About

This digital edition was compiled from scholarship, research, and creative practice in spring 2021 to fulfill the requirements for PSAM 5752 Dark Data, a course at Parsons School of Design.

Editors

  • Sarah Nichols
  • Apurv Rayate

Art Directors

  • Nishra Ranpura
  • Pavithra Chandrasekhar

Technology Directors

  • Ege Uz
  • Olivier Brückner

Faculty

  • David Carroll
  • Melanie Crean

Contributors

This site needs no privacy policy because we did not install any tracking code and this site does not store any cookies.