[iDC] Alternatives to black-box page-rank algorithm (was conference summary part 2: the internet as playground and factory)

Fri Nov 20 09:51:33 UTC 2009

Hi Frank,

I fully support your quest to find ways of establishing 'civil'
control over information-gathering giants like Google.  I think it is
important to check all the possibilities.  Your proposal to use a
'trusted advisory committee', and the analogy to court orders that
'allow trade secrets in litigation to be reviewed', was surprising to
my engineer's mind.  Still, I want to put forward one potential
problem with such a solution: deciding why a particular result
appeared on a Google search page means debugging the Google algorithm
together with the state of its database, which is huge and changes
constantly.  I would imagine that this is already a hard task for
Google's own programmers.  It is possible that they need to do some
up-front work to make it feasible, e.g. saving a part of the engine
state at the time of the query (and this may be part of the reason
why they don't want to discuss it).  It would be nearly impossible
for outsiders to do the same with the level of certainty required by
a legal proceeding.  But it is entirely possible that we could force
Google to make such an investigation possible.

At the current stage I think we could also appeal to Google's "Don't
be evil" corporate motto.  They used it to gain people's trust - now
it is time for the people to check whether Google sticks to it.

On Thu, Nov 19, 2009 at 8:33 PM, Frank Pasquale
<frank.pasquale at gmail.com> wrote:
> Zbigniew Lukasiak raises important issues below.  I don't think there will
> be many commercial alternatives developing, for reasons I give here:
>
> http://madisonian.net/2009/03/18/seven-reasons-to-doubt-competition-in-the-general-search-engine-market/
>
>
>
> So we have to respond to the dominant search engine (i.e., the
> Googlement) we've got. Gaming is a serious problem.  My short answer is that a
> trusted advisory committee within the Federal Trade Commission (the US’s
> national privacy and consumer protection regulator) could help courts and
> agencies adjudicate coming controversies over search engine practices,
> without revealing rankings to the public.  Such a committee, like the FISA
> court, would not practice “total transparency”—it would practice “qualified
> transparency,” only releasing relevant methods “in camera” to entities that
> have a bona fide complaint.  Such a committee would extend to the
> administrative realm an old judicial practice called “protective orders,”
> which allow trade secrets in litigation to be reviewed.
>
>
>
> This institution might provide one method for developing what Christopher
> Kelty calls a “recursive public”–one that is “vitally concerned with the
> material and practical maintenance and modification of the technical, legal,
> practical, and conceptual means of its own existence as a public.”
> Questioning the power of a dominant intermediary like Google is not just a
> prerogative of the anxious.  Rather, monitoring is a prerequisite for
> assuring a level playing field online.
>
>
>
> However, even if we think that type of institutional solution is not
> practical, it’s still valuable to consistently remind people of the
> weaknesses of “algorithmic authority,” as I do here:
>
> http://balkin.blogspot.com/2009/11/assessing-algorithmic-authority.html
>
>
>
> I think that sort of consciousness-raising is important because, at the
> conference, one participant at the closing session said that media studies
> was in a primitive state, closer to “alchemy” than a real science like
> physics. We need to bear in mind the power of internet intermediaries before
> treating the web as a natural phenomenon to be studied and understood using
> the models of natural science.
>
>
>
> Search engines are referees in the millions of contests for attention that
> take place on the web each day.  There are myriad entities that want to be
> the top result in response to a query like “sneakers,” “best restaurant in
> New York City,” or “best employer to work for.” The top and right hand sides
> of many search engine pages are open for paid placement; but even there the
> highest bidder may not get a prime spot because a good search engine strives
> to keep even these sections relevant to searchers.  The unpaid, organic
> results are determined by search engines' proprietary algorithms, though
> users often fail to distinguish between unpaid and paid placements.
>
>
>
> Given the secrecy of search engines’ ranking algorithms and carriers’
> network management practices, it is very difficult for an entity to
> determine whether it has a “stealth marketing problem” online—i.e., a
> competitor that is somehow leveraging payments or business partnerships with
> intermediaries in order to gain greater relative exposure.  Recognizing this
> problem, the FTC has taken some tentative steps toward recognizing the
> potential for consumer deception and cultural distortion here.  In 2002, the
> agency sent a letter to various search engine firms recommending that they
> clearly and conspicuously distinguish paid placements from other results.
> But neither the FTC nor other potential regulators has followed up such
> guidance with systematic monitoring.
>
>
>
>  In order for the FTC to determine whether its guidance is actually being
> followed, it will need to develop sophisticated methods of understanding how
> organic results are determined.  Without such an understanding, it will be
> impossible to distinguish between paid and organic content.  This monitoring
> needs to happen in real-time, rather than after a dispute arises, for many
> reasons.  First, data retention may be spotty. Second, the history of
> regulation of high technology industries indicates that government lag in
> understanding how critical infrastructure functions can effectively neuter
> even a strong regulatory regime. Just as Danny Weitzner has called for an
> “independent panel of technical, legal and business experts to help [the
> FTC] review, on an ongoing basis, the privacy practices of Google,” the
> agency needs to develop the capacity for understanding the ranking practices
> of Google and its competitors.  This capacity could, in turn, enable
> litigants to submit focused queries to a nonbiased third party that could
> quickly give critical information to courts mired in discovery disputes in
> search-related lawsuits.
>
>
>
> I hope this counts as a practical response that respects Google’s war
> against spammers.  As Elizabeth Van Couvering has argued, search engines
> often operate using a “war schema . . .  as they assume the role of guardian
> or protector of something precious—in this case, access to the Web” (Is
> Relevance Relevant? Market, Science, and War: Discourses of Search Engine
> Quality, 12 J. Computer-Mediated Comm. 866, 880 (2007)).   The public should
> have some idea how the internet is shaped by search engines.  And where, as
> in the case of books, the problem of spamming should be less acute than that
> on the web as a whole, more transparency may well be appropriate.
>
>
>
> All best,
>
> --Frank
>
>
>
> PS: France's "Commission Nationale De L'Informatique et des Libertes"
> (CNIL) appears to have taken some important steps regarding privacy, but I'd
> love to hear from French list members whether it's actually an
> institutional model for assuring that "the development of information
> technology remains at the service of citizens and does not breach human
> identity, human rights, privacy or personal or public liberties."
>
>
>
>
>
> On Thu, Nov 19, 2009 at 3:08 AM, Zbigniew Lukasiak <zzbbyy at gmail.com> wrote:
>>
>> Hi there,
>>
>> I was not at the conference and I don't know if this point was
>> raised; if it was, please forgive me.
>>
>> On Wed, Nov 18, 2009 at 6:28 AM, nathan jurgenson
>> <nathanjurgenson at gmail.com> wrote:
>> > Frank Pasquale forcefully called on Google to be more transparent. Given
>> > what was discussed above, as well as Google’s central status in our
>> > day-to-day knowledge-seeking life, Pasquale leaves us with questions to
>> > ponder: should its page-rank algorithm be public? Should Google be allowed
>> > to up-rank or down-rank links based on their relationship to the company?
>> > Should Google be able to simply remove pages from its listings? Should
>> > Google be forced to let us know when they do these things? ~nathan
>>
>> I am also more and more afraid of the Kafkaesque world of Google's
>> government of our information sources - but they do have a valid point
>> for the secrecy of page-rank: this is about defending against those
>> that try to game the system.  If the page-rank algorithm were public it
>> would be analysed, effective ways to game it would be found, and we
>> would drown under a deluge of spam.  There are still people and
>> companies that try to analyse the black box - but at least their
>> actions cannot be very effective.
>>
>> If we are to be constructive in our criticism of Google for the black-box
>> algorithm we should also propose some alternative.   Most probably
>> there is no alternative that Google could unilaterally deploy - most
>> probably this would require a complex web of law, social norms and
>> technical changes.  This would be an interesting project.
>>
>> Cheers,
>> Zbigniew Lukasiak
>> http://brudnopis.blogspot.com/
>> http://perlalchemy.blogspot.com/
>> _______________________________________________
>> iDC -- mailing list of the Institute for Distributed Creativity
>> (distributedcreativity.org)
>> iDC at mailman.thing.net
>> https://mailman.thing.net/mailman/listinfo/idc
>

-- 
Zbigniew Lukasiak
http://brudnopis.blogspot.com/
http://perlalchemy.blogspot.com/