[iDC] “Forge lock in” is not the problem. It’s the choices that people make, that are the problem.

Samuel Rose samuel.rose at gmail.com
Tue Oct 27 15:12:24 UTC 2009


At http://esr.ibiblio.org/?p=1282 Eric S. Raymond writes:

  "  The worst problem with almost all current (code) hosting sites is
that they’re data jails. You can put data (the source code revision
history, mailing list address lists, bug reports) into them, but
getting a complete snapshot of that data back out often ranges from
painful to impossible.

    Why is this an issue? Very practically, because hosting sites,
even well-established ones, sometimes go off the air. Any prudent
project lead should be thinking about how to recover if that happens,
and how to take periodic backups of critical project data. But more
generally, it’s your data. You should own it. If you can’t push a
button and get a snapshot of your project state out of the site
whenever you want, you don’t own it.

    When berlios.de crashed on me, I was lucky; I had been preparing
to migrate GPSD off the site due to deteriorating performance; I had a
Subversion dump file that was less than two weeks old. I was able to
bring that up to date by translating commits from an unofficial git
mirror. I was doubly lucky in that the Mailman adminstrative (sic)
pages remained accessible even when the project webspace and
repositories had been 404 for two days.

    But actually retrieving my mailing-list data was a hideous process
that involved screen-scraping HTML by hand, and I had no hope at all
of retrieving the bug tracker state.

    This anecdote illustrates the most serious manifestations of the
data-jail problem. Third-generation version-control (hg, git, bzr,
etc.) systems pretty much solve it for code repositories; every
checkout is a mirror. But most projects have two other critical data
collections: their mailing-list state and their bug-tracker state.
And, on all sites I know of in late 2009, those are seriously jailed.

    This is a problem that goes straight to the design of the software
subsystems used by these sites. Some are generic: of these, the most
frequent single offender is 2.x versions of Mailman, the most widely
used mailing-list manager (the Mailman maintainers claim to have fixed
this in 3.0). Bug-trackers tend to be tightly tied to individual
hosting engines, and are even harder to dig data out of."
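
As a rough illustration of the "every checkout is a mirror" point in the quote above, here is a minimal Python sketch of a periodic repository backup. It assumes git is installed; the function names, the remote URL and the backup path are hypothetical and do not come from Eric's post.

```python
import subprocess
from pathlib import Path


def mirror_commands(remote_url: str, backup_dir: str) -> list[list[str]]:
    """Return the git commands needed to create or refresh a bare mirror."""
    target = Path(backup_dir)
    if target.exists():
        # An existing mirror only needs to fetch new refs and prune deleted ones.
        return [["git", "--git-dir", str(target), "remote", "update", "--prune"]]
    # First run: --mirror copies every ref into a bare repository.
    return [["git", "clone", "--mirror", remote_url, str(target)]]


def run_backup(remote_url: str, backup_dir: str) -> None:
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in mirror_commands(remote_url, backup_dir):
        subprocess.run(cmd, check=True)
```

Calling something like run_backup("https://example.org/project.git", "/backups/project.git") from a scheduled job would keep a full copy of the history off the hosting site. This covers only the code repository, not the mailing-list or bug-tracker state.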

Eric acknowledges that distributed revision control solves the problem
of the code repository being a “data jail”. My opinion is that the
other problems are solved by extremely low-cost hosting of your own
email lists (many shared hosting providers offer GNU Mailman lists for
5-10 per month), plus hosting your own distributed bug tracking via
tools like http://bugseverywhere.org/be/show/HomePage
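
To make the "own your list data" idea concrete, here is a minimal sketch in Python, assuming you can get at your list archive as an mbox file (which self-hosting makes much more likely). The function names, file paths and JSON field names are hypothetical, chosen only for illustration.

```python
import json
import mailbox


def _header(msg, name):
    """Return a header as a plain string, or None if it is absent."""
    value = msg.get(name)
    return None if value is None else str(value)


def export_mbox(mbox_path: str, json_path: str) -> int:
    """Dump headers and plain-text bodies from an mbox archive to JSON."""
    records = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            if msg.is_multipart():
                # Keep only the first text/plain part of multipart messages.
                parts = [p for p in msg.walk()
                         if p.get_content_type() == "text/plain"]
                payload = parts[0].get_payload(decode=True) if parts else b""
            else:
                payload = msg.get_payload(decode=True)
            records.append({
                "message_id": _header(msg, "Message-ID"),
                "from": _header(msg, "From"),
                "date": _header(msg, "Date"),
                "subject": _header(msg, "Subject"),
                "in_reply_to": _header(msg, "In-Reply-To"),
                "body": (payload or b"").decode("utf-8", errors="replace"),
            })
    finally:
        box.close()
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2)
    return len(records)
```

A call such as export_mbox("mylist.mbox", "mylist.json") yields a snapshot that other tools can read without screen-scraping. Attachments and HTML-only messages are ignored in this sketch.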

It’s my opinion that the building blocks (and more than 90% of the
solutions) exist to route around the blockages caused by “forge
lock-in”. The distribution of communication could be done via
http://openmicroblogging.org/protocol/0.1/ which could allow people to
post to development discussions from almost anywhere online, and have
the messages tracked and linked to via microblog. This would eliminate
the need for email discussion of development altogether (a change I
would fully welcome). This could also synchronize with discussions
happening in IRC channels (where most developers actually discuss
development these days anyway). Tools already exist to connect IRC
with asynchronous online discussion.

The conclusions that I draw:

   1. “Forge” sites have made themselves obsolete as anything other
than a convenient mirror for project release files.
   2. It’s more important to me to focus my time and energy on ways to
route around the blockage than to decry the blockage, especially when
it’s now 100% possible and affordable for people with even meager
resources to route around it.
   3. Most importantly: many of the problems that are a concern (not
just “forge lock-in”, but also data lock-in from social media
websites) can be easily solved now with distributed solutions. The
first question you should ask is: “how will my activities and data
interoperate with others?”, and NOT “how will this best work for me?”
If you concentrate on how your approach will interoperate best with
others, there is still room to address how it will work best for you.
But if you only concentrate on how your approach will work best for
you, you’ll miss opportunities by ignoring interoperability, and set
yourself up for later problems of “data jails”, “data lock-in” etc.
Invest now in infrastructure that allows for expansion, connection
with others in a plurality of ways, and as much distribution of
infrastructure as possible. This investment will return to you
exponentially, as you have chosen infrastructure that is permanently
adaptable to new standards, new pressures etc.

For anyone, not just developers, but also people discussing problems
such as “crowdsourcing” and “locked social media” (discussed
frequently at https://mailman.thing.net/mailman/listinfo/idc): the
choices that *you* make are what lock you in to the systems that you
see as constraining you. There are other choices that you can make
now, and they are worth the investment, but many people are not
making them.

The same is true for how you structure collaboration, how you
participate in collaborative processes, and how your “surplus labor”
is used. In all cases, before jumping in and using services run by
companies whose primary and sometimes even legally required focus is
to seek monetary profit, spend the time looking at what the plausible
alternatives are, and design for interoperability and adaptability.
Give your money and resources to the people who already offer
solutions that won’t lock you in and that will let you adapt over
time. Don’t always opt for the instant gratification choice (there
are people out there who can and will capitalize on your need for
instant gratification) and then complain after the fact that the
choice you made has made it harder for you to adapt and sustain your
activities over time. If you change your focus from “what is best for
me now” to “how can I make this activity as interoperable and flexibly
adaptable as possible”, you will not face the dilemma of dealing with
“lock-in” or the leeching of “surplus labor”.

Sam Rose
Social Synergy
Tel:+1(517) 639-1552
Cel: +1-(517)-974-6451
skype: samuelrose
email: samuel.rose at gmail.com

"The universe is not required to be in perfect harmony with human
ambition." - Carl Sagan
