Thread: Google Censorship - How It Works

  1. #1
    Join Date
    Jan 2005
    Location
    America
    Posts
    30,749

    Google Censorship - How It Works
    An anticensorware investigation by Seth Finkelstein

    http://www.sethf.com/anticensorware/...censorship.php

    (Gold9472: I saw this on www.waynemadsenreport.com today. As people of this board know, this story was published on Friday. When I do a search for 9/11 related stories, I first go to google, and type in "9/11" in the search field. That story was on the front page of Google News (for 9/11 related articles). Now, it's no longer there.)

    Abstract: This report describes the system by which results in the Google search engine are suppressed.

    Google Exclusion, introduction
    Google is arguably the world's most popular search engine. Contrary to a naive impression, however, the results of a search are in some cases affected by government-related factors. That is, search results that would otherwise be shown are deliberately excluded. The suppression may be local to a country, or global to all Google results.

    This removal of results was first documented in the report Localized Google search result exclusions by Benjamin Edelman and Jonathan Zittrain, which investigated certain web material banned in various countries. Later, this author, Seth Finkelstein, discussed a global removal arising from intimidation generated from the United Kingdom town of Chester, in Chester's Guide to Molesting Google.

    My discussion here is not meant to criticize Google's behavior in any way. Much of it is a reaction to government law or government-backed pressure, where accommodation is understandable if nothing else. Rather, documenting and explaining what happens can inform public understanding and lead to more informed resistance against the distortion of search results created by censorship campaigns.

    How it works
    A Google search is not simply a raw dump of a database query to the user's screen. The retrieval of the data is just one step. There is much post-processing afterwards, in terms of presentation and customization.

    When Google "removes" material, often it is still in the Google index itself. But the post-processing has removed it from any results shown to the user. This system can be applied, for quality reasons, to remove sites which "spam" the search engine. And that is, by volume, certainly the overwhelming application of the mechanism. But it can also be directed against sites which have been prohibited for government-based reasons.
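    The two-step idea described above (retrieve first, filter afterwards) can be sketched roughly as follows. This is only an illustration of the concept, not Google's actual code: the `Hit` type, the function names, the exclusion sets, and all the URLs are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    url: str
    title: str


# Hypothetical index contents; the URLs are placeholders, not real results.
INDEX = [
    Hit("http://example.org/a", "Page A"),
    Hit("http://spam.example/x", "Spam page"),
    Hit("http://banned.example/y", "Prohibited page"),
]

# Hypothetical exclusion lists: one for quality reasons, one for legal reasons.
SPAM = {"http://spam.example/x"}
PROHIBITED = {"http://banned.example/y"}


def retrieve(index):
    """Step 1: raw retrieval. Every matching entry in the index comes back."""
    return list(index)


def post_process(hits, excluded):
    """Step 2: drop excluded URLs before display. The index is not modified."""
    return [h for h in hits if h.url not in excluded]


raw = retrieve(INDEX)
shown = post_process(raw, SPAM | PROHIBITED)

print(len(raw), "hits in the index,", len(shown), "shown to the user")
```

    The point of the sketch is that the "removed" pages are still counted in `raw`; only the list handed to the user is shorter.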

    Sometimes the fact that the "removed" material is still in the index can be inferred.

    Global censorship
    For the case of Chester, which concerned a single "removed" page, the internal indexing of the target page could be established by comparison with a search for the same material on another search engine.

    Consider a Google search for the word "lesbian" on the site torkyarkisto.marhost.com. It returns a page titled "The Kurt Cobain Quiz", with a count of:

    Results 1 - 1 of about 2

    The "about" qualifier there represents many factors, but sometimes encompasses blacklisted pages. This can be seen here by comparing to an AltaVista search for the word "lesbian" on the site torkyarkisto.marhost.com.

    There are two pages visible in that case, the "Quiz" page, and the "Chester" page which caused all the trouble in the first place.

    Since we know the "Chester" page was once in the Google index, it must be the other page referred to in "about 2". QED.
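    The inference above amounts to a set comparison between two engines. A minimal sketch, assuming the page labels "Quiz" and "Chester" stand in for the real URLs and that the `infer_hidden` helper is purely illustrative:

```python
def infer_hidden(reported_total, shown_here, shown_elsewhere):
    """Return pages that another engine shows but this one does not,
    provided the reported total exceeds the number of results displayed."""
    shown = set(shown_here)
    candidates = set(shown_elsewhere) - shown
    unexplained = reported_total - len(shown)
    if unexplained <= 0:
        # The count is fully accounted for; nothing can be inferred.
        return []
    return sorted(candidates)


# Labels taken from the example above: Google shows one page "of about 2",
# while the other engine shows two pages for the same query.
google_shown = ["Quiz"]
altavista_shown = ["Chester", "Quiz"]

hidden = infer_hidden(2, google_shown, altavista_shown)
print(hidden)
```

    Since "about" counts are approximate, a result like this is a hint rather than a proof; it only becomes convincing when, as here, the candidate page is independently known to have been indexed.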

    Local censorship
    In this situation, comparing results from the different country-specific Google searches is often revealing. The tests are often best done using Google's "allinurl:" syntax, which searches for URLs containing the given components. Note that the separate components can appear anywhere in the URL, so "allinurl:stormfront.org" matches "stormfront" and "org" anywhere in the URL, not just the string "stormfront.org" as might be naively thought. Stormfront.org is a notorious racist site, often banned in various contexts.
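    To illustrate the "components anywhere in the URL" behaviour, here is a rough approximation. The tokenising rule (split on anything that is not a letter or digit) is my assumption, not documented Google behaviour, and the second and third URLs are made-up examples.

```python
import re


def components(text):
    """Split text into lowercase alphanumeric components."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def allinurl_match(query, url):
    """True if every component of the query appears somewhere in the URL."""
    url_tokens = set(components(url))
    return all(c in url_tokens for c in components(query))


# The expected case: both components sit together in the host name.
print(allinurl_match("stormfront.org", "http://www.stormfront.org/"))

# Also a match: "org" is in the host and "stormfront" is in the path.
print(allinurl_match("stormfront.org", "http://example.org/stormfront/page.html"))

# Not a match: there is no "org" component anywhere in this URL.
print(allinurl_match("stormfront.org", "http://www.stormfront.com/"))
```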

    Consider the following US search:
    http://www.google.com/search?num=100...stormfront.org
    This returned: Results 1 - 27 of about 50,700.

    Now compare with the German counterpart (Google.DE):
    http://www.google.de/search?num=100&...stormfront.org
    This returned: Results 1 - 9 of about 50,700.

    Immediate observation: The rightmost (total) number is identical. So the same results are in the Google database; the German site is simply not displaying them. How is it determining which results to display?

    Note the hosts of the "stormfront.org" URLs that are visible on the German page:

    irc.stormfront.org:8000/
    www4.stormfront.org:81/
    lists.stormfront.org:81/

    What do these all have in common?
    They all have a port number after the host name.
    The exclusion pattern evidently isn't matching the ":number" part of the URL.
    It's matching a pattern of "*.stormfront.org/" in the host, as in the following, which are displayed in the US search but not in the German search:

    www.stormfront.org/
    kids.stormfront.org/
    women.stormfront.org/
    nna.stormfront.org/
    www4.stormfront.org/

    Even more interesting, the German page has a broken URL listed at the bottom: http/www.stormfront.org/quotes.htm. That's not a valid URL, so it seems to escape the host check.

    Thus, the suppression again appears to be implemented as a post-processing step using very simple patterns of prohibited results.
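    A simple pattern of the kind inferred above would reproduce exactly these observations. The regular expression below is a hypothetical reconstruction of "*.stormfront.org/" applied to the leading host part of a displayed URL; the real rule is not known, and `is_suppressed` is an illustrative name.

```python
import re

# Hypothetical reconstruction: anything up to the first slash that ends in
# ".stormfront.org", immediately followed by the slash.
EXCLUDE = re.compile(r"^[^/]*\.stormfront\.org/")


def is_suppressed(displayed_url):
    """True if the displayed URL matches the assumed exclusion pattern."""
    return EXCLUDE.match(displayed_url) is not None


# Displayed URLs taken from the searches described above.
candidates = [
    "www.stormfront.org/",
    "kids.stormfront.org/",
    "www4.stormfront.org/",
    "irc.stormfront.org:8000/",
    "www4.stormfront.org:81/",
    "lists.stormfront.org:81/",
    "http/www.stormfront.org/quotes.htm",
]

for url in candidates:
    print("hidden " if is_suppressed(url) else "visible", url)
```

    The URLs with a port slip through because ":8000" or ":81" sits between ".org" and the slash, and the broken URL slips through because its first segment is just "http". A rule that parsed out the host name and ignored the port would not have these gaps, which is what suggests the real check is a plain string pattern.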

    The same behavior is observed in a German "stormfront.org" images search.
    This returned: Results 1 - 6 of about 1,410.
    Versus a US "stormfront.org" images search.
    This returned: Results 1 - 18 of about 1,410.
    (Note the identical right-hand numbers; hosts matching the "*.stormfront.org/" pattern are suppressed in the German results.)

    And also in a German "stormfront.org" directory search.
    This returned: Results 1 - 8 of about 15.
    Versus a US "stormfront.org" directory search.
    This returned: Results 1 - 10 of about 15.
    (Note again the identical right-hand numbers; hosts matching the "*.stormfront.org/" pattern are suppressed in the German results.)
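    The three comparisons above follow the same recipe: read the "Results X - Y of about Z" line from each country's page, then check whether the totals agree while the displayed ranges differ. A small sketch of that check, using the figures quoted in this report; the function names are mine, and the word "of" is treated as optional when parsing.

```python
import re

COUNT_RE = re.compile(
    r"Results\s+(\d+)\s*-\s*(\d+)\s+(?:of\s+)?about\s+([\d,]+)"
)


def parse_counts(line):
    """Return (number of results displayed, reported total) from a count line."""
    m = COUNT_RE.search(line)
    if m is None:
        raise ValueError("not a result-count line: " + repr(line))
    first, last, total = m.groups()
    return int(last) - int(first) + 1, int(total.replace(",", ""))


def compare(us_line, de_line):
    """Compare a US count line with its German counterpart."""
    us_shown, us_total = parse_counts(us_line)
    de_shown, de_total = parse_counts(de_line)
    return {
        "same_total": us_total == de_total,
        "fewer_shown_in_de": us_shown - de_shown,
    }


web = compare("Results 1 - 27 of about 50,700.", "Results 1 - 9 of about 50,700.")
images = compare("Results 1 - 18 of about 1,410.", "Results 1 - 6 of about 1,410.")
directory = compare("Results 1 - 10 of about 15.", "Results 1 - 8 of about 15.")

for name, result in [("web", web), ("images", images), ("directory", directory)]:
    print(name, result)
```

    An identical total combined with a shorter displayed list is the signature to look for: it indicates that the results exist in the database and are filtered only at display time.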

    Conclusion
    Contrary to earlier utopian theories of the Internet, it takes very little effort for governments to cause certain information simply to vanish for a huge number of people.
    No One Knows Everything. Only Together May We Find The Truth JG


  2. #2
    jetsetlemming Guest
    When you first said something about this, my first thought was "websites shut down". It looks like it's real, however. Any info on the owner/operators of Google?
