The Deep Web – The Area of the Internet You Don’t Know About
I’ve referenced Dark Social before. In short, that is the social media activity that most analytics systems can’t trace, and most of it is linked to mobile usage. In this instance, when I speak of the Deep Web, there is a truly deeper and at times more dangerous connotation. The internet that most of us know (YouTube, Facebook, Amazon, etc.) really only makes up at most 1% of the actual Web. Researchers are still trying to calculate a more exact figure but, much like deep space, the Web continues to grow.
For a moment, though, let’s think of the Web in terms of the ocean instead of outer space. What we generally navigate is only the bare surface of what the Web actually is. It is a space that we often navigate via search engines. Yahoo, Bing, Google and the like open the doors to a vast array of web pages, but they are far more limited than you may think. Their actual search capabilities still leave TRILLIONS of web pages untouched. Here’s an excerpt from CNN that gives some more insight into the matter.
Though the Deep Web is little understood, the concept is quite simple. Think about it in terms of search engines. To give you results, Google, Yahoo and Microsoft’s Bing constantly index pages. They do that by following the links between sites, crawling the Web’s threads like a spider. But that only lets them gather static pages, like the one you’re on right now.
“When the web crawler arrives at a [database], it typically cannot follow links into the deeper content behind the search box,” said Nigel Hamilton, who ran Turbo10, a now-defunct search engine that explored the Deep Web.
Google and others also don’t capture pages behind private networks or standalone pages that connect to nothing at all. These are all part of the Deep Web.
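To make the crawler idea in that excerpt a little more concrete, here is a minimal sketch in Python. It is not how any real search engine works; it just follows links across a made-up, in-memory "web" to show which pages a link-following spider can and cannot reach. The page names and the TOY_WEB dictionary are hypothetical, invented purely for illustration.

```python
from collections import deque

# Hypothetical toy web (an assumption for illustration):
# each page maps to the list of pages it links to.
TOY_WEB = {
    "home": ["about", "catalog-search"],
    "about": ["home"],
    "catalog-search": [],      # a search box: results only exist after a query
    "record-1": ["record-2"],  # database records, linked only from query results
    "record-2": [],
    "orphan": [],              # standalone page that nothing links to
}


def crawl(web, seed):
    """Follow links breadth-first and return every page reachable from seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


indexed = crawl(TOY_WEB, "home")       # what the spider can reach
unindexed = set(TOY_WEB) - indexed     # what stays below the surface

print("Indexed:", sorted(indexed))
print("Not indexed:", sorted(unindexed))
```

In this toy example the spider reaches the search page itself but never the records behind it, and it never finds the page that nothing links to. Those leftover pages play the role of the Deep Web.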
Institutional databases and similar sources create material that is public but generally can’t be reached by standard means. It’s estimated that half of the Deep Web is a sea of reports, databases and study pieces from sources ranging from the Patent and Trademark Office to NASA. Businesses with their own intranet systems make up another 10-15%. These are spaces you generally would need to pay to access.
That leaves the most dangerous corners of the Web, which are reached through Tor, an anonymity network. Because Tor redirects and bounces signals through a series of relays to avoid tracking, this is where much illegal online activity takes place. Weapons and drug sales, pirated media and all types of generally banned services and materials can be found here.
While most of you will never see most of what I’ve mentioned today, there is a growing professional push to gain access to more of the Web. Various universities are diving into the Deep Web, looking to regularly search its contents. Stanford, for example, has built a prototype engine called the Hidden Web Exposer, or HiWE. Others that are publicly accessible are Infoplease, PubMed and the University of California’s Infomine.
There is also a browser bundle to reach Tor. It’s out there, but it is not something I’d share with the masses. As with most great technological advances, we are always playing catch-up. The Web is so much bigger than Retweets, SEO and Keywords when you step back and look at the bigger picture. Just keep in mind how vast the space we engage in daily really is, and be responsible.