Category archives: Architecture

A lab for the State to democratize blockchains?

Blockchain technology will transform our society at least as radically as the Web did. The State needs a research and development laboratory dedicated to blockchains:

to accelerate the digital transformation of the professions that finance and secure public policies,
to democratize its general-interest applications, so that this technology does not exclusively strengthen private powers (financial ones in particular),
and to inform our political leaders and senior civil servants about the threats and opportunities this technology brings, as well as about its real capabilities and limits.

The Caisse des Dépôts et Consignations recently announced the creation of a working group on this topic (see also its press release). Developing this kind of laboratory would be an exciting adventure, and one to which I would like to contribute my professional skills.

The blockchain disrupts the regulation of social relations

The “blockchain” is the technology underlying cryptocurrencies such as Bitcoin. Bitcoin’s monetary nature, more or less untraceable and anonymous, is of merely anecdotal importance. Its underlying technology, however, is revolutionary.

A blockchain is a single worldwide computer which, instead of being installed in a huge specialized warehouse, is installed in a “distributed” manner on the millions of computers and smartphones of its users, who all run it together, simultaneously, coordinating with one another automatically over the Internet. Like any computer, a “blockchain” computer runs software applications. The first blockchains, such as Bitcoin’s, mainly made it possible to run software for financial transactions between users (“Alice transfers 1 bitcoin to Bob”). But more recent blockchains, such as Ethereum’s, can run multi-user applications as complete as those we have become used to on our computers and smartphones. Unlike an ordinary computer or smartphone, a blockchain cannot, in practice, be switched off or hacked: it is not a computing infrastructure in the hands of one person, one group of individuals or one operator, but a single virtual computer that is ultra-secure because it runs simultaneously on millions of computers. If a few of the participating computers start to “cheat” or fail, they are automatically detected by the others and their claims are rejected.
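
To make the last point more concrete, here is a toy sketch in JavaScript (Node.js) of how participants can detect a tampered record: each block stores a hash of the previous one, so anyone can recompute the hashes and reject a modified copy. It is only an illustration under simplifying assumptions: the function names (hashBlock, buildChain, isValid) are made up for this example, and real blockchains such as Bitcoin or Ethereum add consensus rules, digital signatures and much more.

```javascript
const { createHash } = require('crypto');

// Hash of a block's content, which includes the hash of the previous block.
function hashBlock(previousHash, payload) {
  return createHash('sha256').update(previousHash + '|' + payload).digest('hex');
}

// Build a chain in which each block records the hash of the block before it.
function buildChain(payloads) {
  const chain = [];
  let previousHash = 'genesis';
  for (const payload of payloads) {
    const hash = hashBlock(previousHash, payload);
    chain.push({ previousHash, payload, hash });
    previousHash = hash;
  }
  return chain;
}

// Any participant can recompute every hash and reject a copy that does not add up.
function isValid(chain) {
  let previousHash = 'genesis';
  for (const block of chain) {
    if (block.previousHash !== previousHash) return false;
    if (block.hash !== hashBlock(previousHash, block.payload)) return false;
    previousHash = block.hash;
  }
  return true;
}

const honest = buildChain([
  'Alice transfers 1 bitcoin to Bob',
  'Bob transfers 1 bitcoin to Carol',
]);

// A "cheating" participant rewrites the first transaction in its own copy.
const tampered = honest.map((block) => ({ ...block }));
tampered[0].payload = 'Alice transfers 100 bitcoins to Mallory';

console.log(isValid(honest));   // true
console.log(isValid(tampered)); // false
```

The cheater’s copy is rejected because the stored hash of the first block no longer matches its content.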

Since a blockchain is a multi-user computer, its programs are called “smart contracts”, or algorithmic contracts, to which users may or may not subscribe as they wish. A blockchain user is a subscriber to an algorithmic contract. Unlike traditional contracts, algorithmic contracts come with a mathematical guarantee of execution: because they are software running on a computer that no one can switch off or compromise, these contracts will be executed exactly as written (including any design bugs), with no way out.

The blockchain is therefore a social technology that makes it possible to conclude and execute contracts between people under security conditions such that there is no need to trust the contracting parties, a trusted third party, a judge or an arbitrator in order to obtain the intended execution. Such technologies are called “trustless”, meaning that they make it possible to do without any trusted third party.

In short, blockchain technology, which underlies Bitcoin and other cryptocurrencies, redefines the way certain social relations are regulated. The blockchain enables contractual relations (including property contracts) so secure that they make unnecessary the trusted third parties and human arbitration usually required to guarantee a contract’s execution. Human decisions come into play when the contract is designed and when it is subscribed to. But they no longer come into play when it is executed.

History repeats itself, and now is the time to get started

Looking ahead, here is a scenario for the evolution of blockchains that follows the same historical logic as the democratization of the Internet and the Web. By 2018, blockchain technology, so far reserved for cryptocurrency specialists and other crypto-anarchists, will mature to the point where the whole population can use it easily. From 2018 onward, the blockchain will undergo a democratization on the same scale as that of the Web in the second half of the 1990s.

History of the Web	History of blockchains
1963 = idea of a global network of computers	1988 = concept of cryptocurrency (David Chaum)
1973 = definition of TCP/IP	1998 = definition of the first distributed cryptocurrencies (b-money and bit-gold)
1983 = adoption of TCP/IP and of the concept of the Internet, first DNS server	2008 = invention of bitcoin
1990 = invention of the World Wide Web	2015 = invention of Ethereum, the first blockchain with a full programming language
1993 = start of the democratization of the Web with the first multimedia Web browser, “NCSA MOSAIC”	2018 = start of the democratization of blockchains with the first mobile applications for using algorithmic contracts

Applications of the blockchain

Blockchains will be applied in every field where economic security matters and where trusted third parties are traditionally present. They will redefine the role of financial and contractual intermediaries (a phenomenon known as disintermediation).

Let us continue our forecasting exercise:

The immediate applications, within 2 to 5 years, are the following:

private banking blockchains for securing financial transactions (cf. the laboratory created in 2015 by the banks Barclays, Goldman Sachs, BBVA, UBS, Credit Suisse, JP Morgan, Royal Bank of Scotland, Commonwealth Bank of Australia and State Street)
securing registers, certificates and administrative records: land registry management (cf. the intentions expressed in Ghana and Honduras), digital transformation of the notarial professions, securing legal deeds, securing and opening up company registers, lists of accredited organizations, diplomas and skills certificates, etc.
securing and automating administrative transactions and the associated payments (Chèque Emploi Service Universel, family allowances, administrative filings, filing and collection of taxes and duties, …)
partial identification of migrants and refugees, stronger European police cooperation, the fight against tax evasion, …
emergence of collaborative property (“smart property”) in the wake of AirBNB, Uber and the like.

The medium-term applications, within 5 to 10 years, could be the following:

democratization of micro-payments and of individual wallets on blockchains, an “explosion” in the number of contracts and the development of new uses around collaborative consumption, an acceleration of “uberization”; just as every new Internet user used to have a “Web page” or a “Facebook wall”, every Internet user will publish on a blockchain his or her collection of subscriptions and contracts and will manage the transactions of daily life there, in greater or lesser compliance with local laws,
large-scale development of collaborative property (example: I own this object from such-and-such a time to such-and-such a time and only if certain conditions are met, and you own it the rest of the time), new forms of service contracts and collaborative work (for example solving algorithmic problems in exchange for payment, or taking part in collaborative oracles organized around the Schelling point principle),
new forms of collecting donations and voluntary taxes, new ways of redistributing wealth (cf. basic income, smart basic income), democratization of specialized alternative currencies,
new instruments for financing public policies (example: smart Social Impact Bonds),
securing open data to turn it into a public information infrastructure usable for everyday transactions,
development and improved reliability of semi-automated prediction markets that draw both on a crowd of Internet users providing predictions (crowdsourcing) and on the growing capabilities of automated prediction and learning (machine learning),
emergence of the first D.A.O.s (Distributed Autonomous Organizations), organizations governed without human intervention: banks without bankers, insurance without insurers, mutual societies without board members, private companies without managers or boards, associations without boards of directors, political parties without political bureaus, software complexes and narrow artificial intelligences.

The long-term applications, within 10 to 20 years, will give the blockchain revolution its full scope. In particular, society will be marked by new feelings of identity and belonging to communities founded on algorithmic contracts. After the tribe or the family, then the nation and the private company, algorithmic contracts will create an additional layer of social regulation through transnational contractual communities, a concept explored by several science-fiction authors in the “cyberpunk” or “steampunk” vein, for example the “phyles” of Neal Stephenson (The Diamond Age).

Impact on economic players

In the short term, blockchain technology wipes out a growing share of the added value of some or all trusted third parties, by making them unnecessary and obsolete. The phenomenon is comparable to the way the Web wiped out part of the added value of distributors of cultural content, which led to the crisis in the record industry and the press. Blockchains turn contractual trust into a weakly differentiating commodity, which every economic operator can obtain at low cost, without being able to build a lasting competitive advantage on it.

By commoditizing the technology of operators and trusted third parties, the blockchain destroys the rents of trust “intermediaries” and shifts the sources of added value toward a more elaborate “technology layer”: that of blockchain applications and uses. In doing so, it creates a “disruptive innovation” phenomenon that erodes the market share of traditional “trust” operators (banks, insurers, mutual societies, regulated trust professions, even public administrations) and will gradually push them toward niches with higher added value but lower volume, until they disappear. The exact scope of the industries concerned remains to be determined.

The disruptive innovations made possible by the blockchain allow new players to compete with these traditional operators in their “low-end” segments, which the incumbents will gradually abandon to them as blockchain applications become more mature, simpler and more practical, and therefore more widespread. The new blockchain players will offer services “without trusted third parties” that are so simple, practical and inexpensive that they will open them up to customer segments that are very numerous, very large and so far excluded from the market for traditional applications of contractual trust.

We should, for example, see the emergence of banking services without banks, credit without credit institutions, insurance without insurers, mutual insurance without mutual societies, notarial services without notaries and bailiff services without bailiffs, for simple, practical and inexpensive contracts so far out of reach for individuals. The beginnings of this phenomenon can already be seen in the emergence of peer-to-peer lending.

Excesses in the financialization or uberization of the economy are to be feared. But solidarity-based social innovations on a very large scale are also to be hoped for.

In the longer term, public administration and public services are directly concerned. Indeed, public administration also has a raison d’être of an economic nature: offering citizens public services that respect the general interest and carry a high level of trust, the trust inspired by the power of the State and its sovereign capacity for violence (police, justice, prison, army). Suppose that citizens can start obtaining similar services by subscribing to algorithmic contracts, and that the mathematical proofs of execution of these contracts, offered by their algorithmic nature, inspire more trust in them than the State does (which is more reliable: mathematics or States?). Then they will gradually turn to this new way of regulating their relations, a better guarantor of the general interest than the State. Their status as subscribers could become more important in their eyes than their status as citizens. Until recently, the State held a kind of monopoly on defending the general interest, a monopoly already largely eroded by the growth of the non-profit sector and, to a lesser extent, of private philanthropy. But from now on nothing will prevent a group of people from writing an algorithmic contract that defends the vision of the general interest they subscribe to (their politics), subscribing to it, and finding in it a practical substitute for public services, with all the mathematical trust conferred by the algorithmic nature of these contracts.

What if citizens, rather than hoping to change the world through democratic debate and voting, started wanting to change worlds by subscribing to collective contracts reflecting their values and political vision? What if the State, taking the lead, offered blockchain applications and algorithmic contracts embodying the values and principles of our democracy and promoting the general interest, while leaving citizens the possibility of expressing their individual needs and aspirations in them? And what if we put blockchains at the service of the State, the general interest and social innovation?

A/B split testing with Plone

I have a deep interest in the lean startup method. One of the favorite tools of lean startup practitioners is A/B split testing. My favorite software package is Plone. Can Plone be used for A/B split testing without having to develop a specific Python product? The answer is probably yes.

Here is my recipe for a starter toward A/B split testing with Plone:

take a fresh Plone
add a PloneFormGen
add a “Thank you” page for each and every option you want to test; note the IDs of the pages (e.g. page “optionA” and page “optionB”); the user will be redirected to one of these pages
add a text field to the form (a multiple-line text field, not a one-line string field)
override the default value of this field with the following TALES expression:

python:[random.seed(str(request.AUTHENTICATED_USER) + request.REMOTE_ADDR), random.choice([i.getId() for i in here.aq_parent.aq_inner.listFolderContents(contentFilter={"portal_type": "FormThanksPage"})])][1]
make the text field a hidden and server-side field
override the form’s custom validation action with the following expression:

redirect_to:request/form/page

where “page” is the ID of the text field you set up above.
add a Data Recorder to the form so that the value of the “page” field gets recorded

What do we have now? We have a form with a button. When the user clicks on the button, she is randomly redirected toward one of the several “Thank You” pages that you have defined. The redirection is based on the IP address of the user and on her username if she is authenticated. The redirections are uniformly distributed across your destination pages. And they are recorded by the Data Recorder.
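
The idea of deriving the destination page from the visitor’s identity can also be illustrated outside Plone. The sketch below is written in JavaScript and is not the TALES expression above: it swaps random.seed and random.choice for a simple FNV-1a-style hash, and the names fnv1a and pickVariant, as well as the sample username and IP address, are made up for this example.

```javascript
// FNV-1a-style 32-bit hash computed over the UTF-16 code units of a string.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Hypothetical helper: the same (username, IP address) pair always maps to the same variant.
function pickVariant(username, remoteAddr, variants) {
  return variants[fnv1a(username + remoteAddr) % variants.length];
}

const variants = ['optionA', 'optionB'];
const firstVisit = pickVariant('Anonymous User', '203.0.113.7', variants);
const secondVisit = pickVariant('Anonymous User', '203.0.113.7', variants);

console.log(firstVisit === secondVisit); // true: the assignment is stable for a given visitor
```

Because the choice depends only on the visitor’s identity, a returning visitor keeps landing on the same page; how evenly visitors are spread across pages depends on the hash and on the actual traffic.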

Your A/B split test is not complete and several further steps must be taken before this is a fully operational solution, but that was an enjoyable hack for me to make. Have fun with it and tell me how you would proceed with split testing and continuous deployment using Plone!

Let’s aggregate civic service, volunteering and skills-based sponsorship missions

I am currently watching the video of the Digital4Change conference with M. Yunus and M. Hirsch, held at HEC last week.

And Martin Hirsch’s first talk leads me to believe that there is a technological initiative with a strong social impact that could be launched by a geek who happens to be a little bored (there must be one out there, right?). The social goal would be to make it easier for young (and not so young) people to get involved in civic service (volontariat: full-time, with a stipend), volunteering (bénévolat: in one’s spare time, unpaid) or skills-based sponsorship (mécénat de compétences: part-time or full-time, during working hours) with non-profit organizations serving the general interest. What I suggest is to aggregate and publish as linked data all the civic service missions published by the various civic service agencies, starting with the French national agency. There is even already an RSS 2.0 feed of its missions.

Then there is Unis Cités, the French pioneer. No RSS feed?

In France, we also have volunteering missions. I am thinking of Passerelles & Compétences and many other similar organizations.

Abroad, there are dozens of equivalent sites. I am thinking in particular of idealist.org (which also has an RSS feed of missions) and others.

It would be really great to publish interoperable feeds for all these volunteering databases and to publish an aggregated view of them on a central site, with multi-criteria search and alerts (as an RSS feed or by email).

That said, it is a bit lame of me to suggest this idea since I probably won’t take the time to get such an initiative started technically. But I would be delighted to take part in it and to include the mission feeds of open skills-based sponsorship platforms (Mecenova, Wecena, and why not makesense and others, …). Does this inspire any hacker passing by?

SVG as an alternative to Flash, here comes bliotux

As a follow-up to my article on SMIL-animated SVG for accessible textbooks, here is a copy of the README file of wecena.bliotux. I currently have 4 full-time wecena volunteers making accessible textbooks for children with cognitive disabilities (mainly dyspraxia) under the supervision of an INSERM medical research lab and of a dyspraxia-related non-profit organization, Dyspraxique Mais Fantastique. They currently use Didapages, a Flash-powered proprietary authoring tool, to make these would-be accessible textbooks. But we are not satisfied with this tool and I wanted to propose an open-standards, free-software alternative. So I wrote wecena.bliotux as a proof of concept of such an alternative technological framework.

Beyond dyspraxia and children with disabilities, I think bliotux may be of some use to any developer looking for an alternative to Flash as a technology for making highly graphical, interactive and animated offline or online applications. The source code is available in the wecena subversion repository (until I create a dedicated repository). Here is a full copy of the README file:

wecena.bliotux

This software package is a framework for building web applications to which the following buzzwords apply:

web apps: run in your web browser
offline apps: no web server, no Internet connection required
rich applications: highly graphical user interfaces, using SVG
animated applications: pages can include (interactive) animations using (SMIL-powered) animated SVG templates
interactive: interaction/behaviour is defined in a simple Javascript file corresponding to a given page
with persistence of user data and application state: using local storage with persistence engines such as Google Gears (or HTML5 localstorage when it’s mature enough in Firefox)
template-based: pages sharing a common layout/structure are based on template files
document-oriented: a simple data structure in a data.js file defines the data used to populate the corresponding SVG template for any given page
free software: distributed under the Affero GPL License (even though I am not 100% sure of the exact meaning of the Affero version for offline applications BTW…)
based on open standards: SVG now (Daisy Profile for SMIL+SVG, CSS and WAI-ARIA in the roadmap) rather than based on proprietary technologies such as Microsoft Silverlight or Adobe Flash
highly accessible even though using JavaScript (see open standards…)
as cross-browser compatible as possible: apps should run on any web browser as long as it offers some support for SVG and Javascript; and bliotux users should not have to care much about browser compatibility.

The original aim of this package is to build a non-Flash framework for managing interactive animations so that accessible
textbooks can be made for children with cognitive disabilities (mainly dyspraxia).
But it could be used to produce any set of interactive animations
such as books, websites or I don’t know what.
You imagine.
You experiment.
You tell me what it may be useful for!

The following JavaScript libraries are used:

jQuery
jQuery.SVG (RaphaelJS might have been a better choice).
jQuery.jStore (PersistJS might have been an acceptable choice).
Google Gears (as a dependency of jStore because the implementation of HTML localstorage by Firefox has a bug)
doctestjs (because I would be so cool if only I could figure out how to use Javascript doctests for this project…)

Disclaimer with regard to JavaScript as a programming language:
Ahem… Javascript was selected because we wanted one and only one language to be used both for making
bliotux-powered templates and pages and for executing them.
And their execution should not require any
prior installation of software: the web browser should be the only thing required.
And Javascript seems to be the only open-standards-oriented way to offer rich interactivity to SVG in web browsers.
Too bad.

How to use wecena.bliotux ?

At the moment wecena.bliotux is nothing but a proof-of-concept.
More will come in case the project I’m working on selects this technology
as a viable alternative to the Flash-based proprietary product we are
currently using in order to make accessible textbooks for children
with cognitive disabilities.

Download and install bliotux

It’s in a subversion repository.
There is some subversion documentation available in
case you don’t know how to download software from a subversion repository. Bliotux is stored
in the wecena repository but it will get its own repository some day.

Create a template

Bliotux pages are based on templates.
Let’s create a first template.

Name your template

Choose a name for your template. In this example, the name is simpleOperation
because it is a template page for textbooks
for children learning additions and other simple mathematical operations.

Name a template folder accordingly.
For instance, I have wecena.bliotux/templates/simpleOperation/

Define the layout of your template

This part is the job of a graphics designer.

The layout of a template is defined by an SVG file.
(Download, install and) use any SVG editor to create such a file.
I personally use Inkscape, which is free software.

Your SVG template should be named layout.svg and should be stored under the template folder.
Here it goes:
wecena.bliotux/templates/simpleOperation/layout.svg

The next version of Inkscape should allow you to use its new timeline-based animation editor capabilities to add
animation to your template.
At the moment, you will have to have an XML developer edit the source code of your SVG
template and add animation (animated SVG) instructions “by hand” if needed.

Here is a clue about how to possibly accelerate the development of such SVG animations without waiting for the
next version of Inkscape :

Download and install Open Office Impress
Make a (duplicate) sketch of your layout in Impress
Add the desired animation effects to it using the rich set of animation features Impress offers
Save your animated Impress presentation in its native .ODP format
Open this file using an archive handler (such as winzip under windows): Open Office files are nothing but ZIP archives containing XML and graphics
Edit the source code of the main XML file this .ODP archive contains.
Ask your XML developer to copy, paste and adapt the animation instructions therein into your layout.svg file. (The animation instructions can easily be located: they use the anim: namespace).
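
As a rough aid for the last step, the sketch below (JavaScript, Node.js) lists the names of the elements that use the anim: prefix in a piece of XML text. It assumes the XML has already been extracted from the .ODP archive; the function name findAnimElements and the sample markup are made up for illustration, and a real file will be much larger and richer.

```javascript
// Returns the names of the elements written with the anim: prefix, in document order.
// Closing tags (</anim:...>) are not matched because the pattern requires "<anim:".
function findAnimElements(xml) {
  const names = [];
  for (const match of xml.matchAll(/<anim:([A-Za-z][\w-]*)/g)) {
    names.push(match[1]);
  }
  return names;
}

// Made-up excerpt in the style of an Impress presentation file.
const sample = [
  '<draw:page draw:name="page1">',
  '  <anim:par presentation:node-type="timing-root">',
  '    <anim:set smil:begin="0s" smil:attributeName="visibility" smil:to="visible"/>',
  '  </anim:par>',
  '</draw:page>',
].join('\n');

console.log(findAnimElements(sample)); // [ 'par', 'set' ]
```

A regular expression is enough to locate the instructions; adapting them to animated SVG still has to be done by hand.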

Define the interactivity of your template

This part is the job of a Javascript developer.

This is the hardest part if you are not a developer.
It should be easy if you have any experience in web development.

In the case of a children’s textbook for teaching additions and other simple mathematical operations,
we’d like our “simpleOperation” template to display a simplified virtual keyboard with numbers.
When the child clicks on a number, this number is added to a “result” text element in the template layout.
So we need to know how to use an SVG element (the number we want to click on) as an interactive button
which will display some text result as the content of another SVG element.

The interactivity of your template is first prepared in your layout.svg file.
Using Inkscape XML Editor (Ctrl + Shift + X), you add event attributes
to the SVG elements you want to add some interactivity to. This involves accessing
the XML source code of the SVG file, which you should not be afraid of thanks to
Inkscape XML Editor.

For instance, let’s say you have an SVG group of elements which you want to
act as a button. You select this group using Inkscape. You press Ctrl+Shift+X. The
XML Editor opens. There you see the group of elements as a <g … > element.
You then want to add interactivity to this group. You have to add an onclick attribute.
The value of this attribute should be “clickButton(evt)”. This means that whenever the
user clicks on this button with the mouse, a MouseEvent event called “evt” will be fired and
some Javascript function called “clickButton” will have to handle this event so that
something special happens.

Now you have injected some interactivity attributes into the XML source code of
the SVG file of your template. This source code now includes things like this :
<g onclick="clickButton(evt)" ...

Let’s develop this clickButton Javascript function so that you define what should
happen whenever the button is clicked. This definition is written in a Javascript file
you have to name “interaction.js” and which sits under the template folder:
wecena.bliotux/templates/simpleOperation/interaction.js

For instance, this file could contain the code below (see included examples, too, if needed) :

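function clickButton(evt){
    // Tell the user that the click was registered
    alert('You clicked the button !');
    // Update the SVG currently displayed in the browser
    $('.whereResultShouldBeDisplayed', svg.root()).html('Clicked !');
    $('.someSVGElementsWhichShouldBeEmptiedWhenButtonGetsClicked', svg.root()).html('');
    // Persist the new content so that it is restored next time the page is displayed
    storageSave('.whereResultShouldBeDisplayed', 'Clicked !');
    storageSave('.someSVGElementsWhichShouldBeEmptiedWhenButtonGetsClicked', '');
}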

If you are as unfamiliar with Javascript as I am, you need some more explanation here.
What does this function say?

It says that it takes an input parameter called “evt”. But it won’t use it in this case.

It first displays a popup alert window with a message (‘You clicked…’)

Then it changes the content of the SVG displayed in the web browser. It writes the text ‘Clicked !’ in
every SVG (or HTML BTW) element which has an attribute called “class” (the same attribute which can be used
for CSS files) including the value “whereResultShouldBeDisplayed”.

For instance, let’s say you have this text element in your layout.svg file:

<text
  id="text4790"
  y="386.98224"
  x="454.43787">
  <tspan
    y="386.98224"
    x="454.43787"
    id="tspan4786"
    class="whereResultShouldBeDisplayed someOtherClass">Not clicked yet.</tspan>
</text>

Then, once the user clicks the button, your interaction.js file will have this text element changed into this:

<text
  id="text4790"
  y="386.98224"
  x="454.43787">
  <tspan
    y="386.98224"
    x="454.43787"
    id="tspan4786"
    class="whereResultShouldBeDisplayed someOtherClass">Clicked !</tspan>
</text>

Can you see the difference ?

For more information about how Javascript can have the web browser manipulate
the content of the page at runtime, please see jQuery API documentation. Just remember to
apply jQuery selectors to the root of the SVG document (svg.root()) and you should be fine.

There is also this call to storageSave in your interactivity function. What does it mean ?

storageSave is a function defined by bliotux.
It takes 2 input parameters: a key and its value.
It will have this pair of (key, value) made persistent in the local web browser.
Even if the browser (and possibly the computer) is closed (shut down), this (key, value) pair is still available
and can be later retrieved using another bliotux function: storageLoad(key).
Next time the same page is displayed, any SVG element which corresponds to key (as a jQuery selector) will have
its content filled with value.

In this example, storing the text "Clicked !" as the value of the key .whereResultShouldBeDisplayed means 2 things:

this text "Clicked !" can be further retrieved with any Javascript call to storageLoad('.whereResultShouldBeDisplayed')
next time this page is displayed using the same web browser, the "Clicked !" text will be added to all SVG elements which have the whereResultShouldBeDisplayed class attribute in their source code.

As a result of this, the state of each page can be made persistent
so that when the user returns to a page he has already interacted with,
this page displays the exact same info/aspect/behaviour as before.
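
To illustrate the (key, value) persistence described above, here is a minimal sketch in plain JavaScript. It is not the code of bliotux, which relies on jStore: createMemoryBackend, createStorage and the 'bliotux:' key prefix are assumptions made for this example, and the backend is an in-memory stand-in exposing getItem and setItem so that the sketch runs outside a browser.

```javascript
// In-memory stand-in exposing getItem/setItem, the two methods this sketch needs.
function createMemoryBackend() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); },
  };
}

// Hypothetical wrapper: keys are prefixed with the page name so that each page keeps its own state.
function createStorage(backend, pageName) {
  const prefix = 'bliotux:' + pageName + ':';
  return {
    storageSave(key, value) { backend.setItem(prefix + key, value); },
    storageLoad(key) { return backend.getItem(prefix + key); },
  };
}

const backend = createMemoryBackend();
const pageOne = createStorage(backend, 'pageOne');
const pageTwo = createStorage(backend, 'pageTwo');

pageOne.storageSave('.whereResultShouldBeDisplayed', 'Clicked !');

console.log(pageOne.storageLoad('.whereResultShouldBeDisplayed')); // Clicked !
console.log(pageTwo.storageLoad('.whereResultShouldBeDisplayed')); // null
```

In a browser, a persistent object with the same two methods could presumably be passed in place of the in-memory backend, which is what would make the state survive a restart.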

Now you have your interaction.js file which defines the full interactivity of your template document.

Create a page

Creating a page is much easier than creating the template a page is based on.
But it requires writing some (extremely simple) code using any text editor (Windows notepad…).
Any brave user should be able to do so.

You have a full bliotux template, including an SVG layout (possibly including animation) and Javascript interactivity.
Now let’s create a page based on this template.

Name the folder with the page name

In this example, let’s name a first page Sesamath_CP_page-094_exercice-001
after the name of a French free (as in free speech) textbook vendor.
In order to do so, we create this folder:
wecena.bliotux/pages/Sesamath_CP_page-094_exercice-001/

When we want to access this page, we’ll have to direct our web browser to a URL such as
file:///home/jean/wecena.bliotux/index.xhtml?page=Sesamath_CP_page-094_exercice-001
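
How index.xhtml works out which page to load is not detailed in this README. A minimal sketch of one way to do it, using the standard URL API, could look as follows; the helper names pageNameFromUrl and pageFolder are hypothetical, and the folder layout is the one described above.

```javascript
// Extracts the value of the "page" query parameter from a URL.
function pageNameFromUrl(href) {
  return new URL(href).searchParams.get('page');
}

// Builds the relative path of the folder holding the page, following the layout used in this README.
function pageFolder(pageName) {
  return 'pages/' + pageName + '/';
}

const href = 'file:///home/jean/wecena.bliotux/index.xhtml?page=Sesamath_CP_page-094_exercice-001';
const pageName = pageNameFromUrl(href);

console.log(pageName);             // Sesamath_CP_page-094_exercice-001
console.log(pageFolder(pageName)); // pages/Sesamath_CP_page-094_exercice-001/
```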

Define the template this page uses

Which template will this page use ?
The answer comes as a Javascript file we have to create:
wecena.bliotux/pages/Sesamath_CP_page-094_exercice-001/data.js

This file contains the declaration of variables describing this page.
The variable called template defines the template to be used for this page:

var template = 'simpleOperation';

Populate the template

The next variable in this data.js file defines the data which will get injected into the template so that
the page is built:

var data = {
  '.pageCentaine':'',
  '.pageDizaine':'9',
  '.pageUnite':'4',
  '.exerciceCentaine':'',
  '.exerciceDizaine':'',
  '.exerciceUnite':'2',
  '.operande1Centaine':'',
  '.operande1Dizaine':'',
  '.operande1Unite':'7',
  '.operateur':'-',
  '.operande2Centaine':'',
  '.operande2Dizaine':'',
  '.operande2Unite':'5',
  '.resultatCentaine':'',
  '.resultatDizaine':'',
  '.resultatUnite':'',
};

This data associative array lists (key, value) pairs which define which content should be injected where.
The key (for instance
.pageCentaine
) is a jQuery selector to be applied to the root of the SVG template.
The value is some SVG code which is to be inserted as the content of any SVG element matching the key.

Rather than using
id
attributes as selectors (
#pageCentaine
), it seems preferable to use
class
attributes (
.pageCentaine
) which carry the meaning (semantics) of the corresponding SVG element and can be reused
several times in the same template (whereas IDs should be unique, I suppose).
Either way, the SVG template should be edited so that the corresponding
class
attributes are present where needed.

Include some page-specific graphics

Using the mechanism of templates and the data.js file, you may have your SVG template include some areas where
pages could have specific bitmap (JPEG, PNG) files displayed.
This is just a matter of including such a JPEG file in the
layout.svg
file,
giving the corresponding SVG element an appropriate class attribute (using Inkscape XML editor for instance)
and then defining in
data.js
the name of the picture file to insert in this area of your layout for this specific page.

But you can also have given pages include full SVG files.
For instance, the left part of
simpleOperation/layout.svg
is meant to display a funny but didactic illustration
where characters (such as Tux the penguin) invite the child to perform the mathematical operation at hand.
Such an illustration could contain page-specific animations.
Adding an animated GIF file would not be enough.
The full power of SVG for animations may be required.
In such cases, you can define an svgParts variable in the data.js file of the page :

var svgParts = {
  '#illustration': 'illustration.svg'
}

This variable says : “Hey, bliotux, please look at my template
and find the SVG element with
illustration
as the value of its
id
attribute.
Then replace this full SVG element with the first
g
element (SVG group) you will find
in the
illustration.svg
file sitting under this page folder. Thanks.”

That’s it

You can access and test your page at a URL which should look a bit like that (the exact path depends on the folder hierarchy
on your hard drive):
file:///home/jean/wecena.bliotux/index.xhtml?page=Sesamath_CP_page-094_exercice-001

Side note: I now realize I can’t use doctestjs for this document, so it is of little use to me here.
It would have been much more useful if only I had figured out a way to have some Javascript code generate
a template document in the filesystem during the doctest, so that I could further test bliotux on it
using doctestjs. Maybe later…

SMIL-animated SVG for accessible textbooks

Dyspraxia is a serious learning disability affecting 250,000 children in elementary schools in France. Not that French children are particularly disadvantaged: it simply seems to be a very widespread kind of disability, and the proportion of dyspraxic children should be roughly the same from country to country. In order to overcome this obstacle, the nonprofit organization I currently work for is leading the way toward adapting the ergonomics of existing paper textbooks and helping textbook publishers create the accessible (and digital) textbook of the future. Maybe you have heard of similar initiatives?

Their first attempts were made using a French e-learning authoring tool called Didapages. Up to version 1.1 it was free for non-commercial use. Version 2 is much more commercially oriented. And closed-source. And only runs on Windows. And despite its ease of use for educators and non-IT specialists, it has several drawbacks and limitations, partly due to the technology it uses, Flash, and partly because its developer does not think he can build a sustainable business model using free software licensing. Too bad. I am looking for an alternative solution, as is part of its user community.

Free software packages such as Xerte, eXe, Scenari, Docebo and others look attractive. But none is the ideal solution: either they are also based on Flash, or their community is almost non-existent and their development may have stopped some time ago. Educators are not developers. And the crowd of educators might be missing the critical mass of developers needed for a thriving free software community to develop around any e-learning authoring tool. The bells and whistles of proprietary products have much more appeal to the average teacher.

From a technology perspective, I had a look at open standards for accessible, animated and interactive content. W3C, please show me the way. The relevant standards seem to be:

HTML 5 for content, with its Javascript-animated “canvas” element for sprite-based animations (for bitmaps graphics) ;
SMIL for animated documents and for limited interactivity, possibly also combined/extended with Ecmascript for more interactivity ;
CSS for styling, possibly some day with Webkit-like CSS animation, but this option does not excite me much; CSS animation may require Javascript or SMIL ;
SVG for graphics : there is such a thing as SVG Animation, and Ecmascript can be embedded in an SVG file in order to provide more interactivity and to overcome some current interactivity limitations of SMIL ; SVG is for vector graphics but could also embed (and animate) bitmap graphics (used as sprites).

The advantage of SMIL and SMIL-animated SVG over Flash seems to be that SMIL is a declarative technology. This “document” model allows less dependency on scripting and more flexibility through earlier or later transformations (with templating, XSLT or content management engines). This allows the animation and, to a lesser extent, the interactivity of educational content to be a native part of the content itself rather than an afterthought. It facilitates later and looser coupling with other technologies. It allows more RESTfulness (restafari !). It does not cause cancer. Well, I don’t know. It tastes good. (note to myself : consider discarding this whole paragraph) :)

Flash applets, on the other hand, can be made somewhat accessible but this may not be an easy task for the average Flash developer, and SMIL sounds like a much more accessibility-friendly technology. There even is a DAISY profile for SMIL documents. I should have a deeper look into these profiles.

But interactivity with specific application logic seems to require a bit of scripting anyway, doesn’t it ? Here comes Ecmascript with SMIL, which should probably be limited to a minimum. Can you always provide accessibility-safe fallback mechanisms for a SMIL document if you introduce scripting for interactivity ? I am not sure. I will have to figure this out. Maybe the DAISY SMIL profile tells me more about this.

After a first glance at these standards, and as a non-expert in animated content, it seems to me that there ARE available and mature open standards which cover most of the concerns related to accessible digital textbooks. There should be no need to develop any addiction to Flash authoring systems.

But the problem is that these standards are still “emerging”. They were proposed several years ago, are slowly maturing, and their support in modern web browsers is only starting to become a reality. The most advanced support for SMIL-animated SVG comes with Opera. It is also said to be available in Firefox 3.6, as far as I understood. I’ll test this stuff with Opera until Firefox 3.6 comes to Ubuntu. The lack of consistent support for SMIL and SVG animation can be overcome with free software SDKs or Javascript libraries which take SMIL elements as input and generate equivalent Javascript instructions as output. For instance, the RaphaelJS Javascript library allows browsers to support animated SVG even if such support is not built in. As far as I understand, the Ample SDK allows SMIL animations to be supported by non-SMILable browsers, too.

The main problem is not web browser support, though. The main problem is that there are almost no (free software) authoring tools for such animation and interactivity technologies. Limsee2 is a code editor/development environment for SMIL (does it support SVG animation ?) but its INRIA authors stopped working on it some time ago. And there seems to be no real community behind it. Limsee3 is not a later version of Limsee2 (despite the name). It is a WYSIWYG SMIL authoring tool but it does not seem to support SVG animation (does it ?). And its development will probably also stop as soon as the governmental subsidies behind the corresponding research project end. Yet another research package soon to be dying on a lab’s shelves ?

This sends me back to my above observation about the non-existence of a sufficiently-big or proficient-enough community of educators who can use AND develop such advanced authoring tools with accessibility in mind. Too bad…

Madswatter and Ajax animator are very early prototypes of animation authoring environments. There are other free software attempts currently aiming at a proper animation editor: clash/geesas (which is a fork of pencil) and moing… Maybe you’ve heard of other projects ? Inkscape has some plans for introducing SMIL authoring capabilities. There even is a mockup of the user interface for the timeline-based authoring of animations. This is work in progress. Well, maybe this is more than just work on blueprints : the Inkscape roadmap mentions simple and limited animation authoring as a feature for their next release (version 0.48) ! The 0.49 version should focus on much more support for animated SVG. Exciting ! This topic is hot right now. Itches are starting to be scratched a lot !

That being said, I realize I already have a tool for authoring animations. It’s Open Office Impress. And the Impress wiki tells me that its animations are based on SMIL ! When I look at the XML file saved by Impress (inside its zipped ODP archive), I can indeed see SMIL element names and attribute names mixed with Open Office specific elements and attributes, even though the resulting document may not be SMIL compliant, strictly speaking. A limited effort (XSLT or a custom extension) may be enough to produce real SMIL documents.

Instead of using e-learning-specific authoring tools (think Xerte, eXe, …), what if the future editing software for educational content were tools I (or any educator) already have on my desk : Inkscape for the creation of bits of animated graphics and/or Open Office Impress for the layout and animation of the overall animated document? In Inkscape, the “properties” window of any object even reveals some event fields for Ecmascript/Javascript instructions (onclick, onmouseover, etc.). Too bad Impress can’t properly import SVG content. But maybe this is not required. In the end, e-learning specific tools would be required anyway for packaging the resulting animated and interactive content for Learning Management Systems such as Moodle. Such content packages would need to be made SCORM or AICC compatible so that they expose their navigational and educational structure to these platforms via a standard API. I read that SCORM is not ideal as such an API from an accessibility perspective because it heavily relies on Javascript (it is a Javascript API). But does the use of a scripting language always prevent accessibility ? I don’t know. SCORM may be nice for portability from LMS to LMS. But not so nice for accessibility.

At the moment, I feel like the ideal authoring chain of tools for educational content / textbooks would be as follows :

Inkscape in order to create the graphics, layout and animation of individual educational “applets” : crosswords, coloring books, simulations, geometry tools, … the result being saved as an animated (and partially SMIL-interactive) SVG file, with event hooks defined so that we can go to the next step
an ECMAscript code editor (I am not into this emacs thing… Eclipse anyone ?) in order to transform this animated SVG file into an animated AND interactive SVG piece of content
Open Office Impress in order to create the layout, structure and general content of your course/manual/textbook chapter/whatever, inserting the SVG file and adding further animations as well as individual multimedia items (sound clips, videos, hyperlinks), the result being saved as a SMIL/HTML document
More scripting edition of this document if needed (but would it be needed at this stage ? I can’t tell)
CSS styling would be made ready for the document at this stage or earlier (can Open Office make any use of existing CSS stylesheets or would it always mix them into its own content format ?)
a SCORM packager such as Reload Editor would import this content and allow the author to specify the SCORM relevant bits of information, the result being saved as a Moodle-ready package
Your favorite Moodle-like LMS platform would serve the content to users, possibly running on their laptop in an offline fashion

This whole chain of tools would probably benefit from being powered by a web content management system (Plone ? Drupal ?) so that the assembly line is smoother and allows widespread collaboration, with workflows, access control and so on. No need to get stuck back to the Dreamweaver era of the I-am-waiting-for-the-Dreamweaver-guy-to-update-my-textbook.

Now it’s your turn. What do you think ?

The wecena code is free software

“Long live free wecena!”, as the saying goes. A quick note to let those who are interested know that I have released the code which runs wecena.com. In other words, this free software is now (publicly) distributed under the GNU Affero General Public License v.3.

The code in question is a set of extension products for the Plone web content management system. Some of these products are specific to the way wecena works (the wecena_core and wecena_integration products). Others are more generic and may be useful outside wecena. I am thinking in particular of wecena_dynamicroles, which makes Plone’s security system more flexible, and of wecena_ldapuser, which synchronizes Plone users bidirectionally with the entries of an LDAP directory.

Your python/Zope/Plone expertise is more than welcome if you want to play with these products and lend a hand along the way!

How to get visual performance profiles from plone doctests ?

I am developing a couple of Plone 3.x products. They have some tests, including a huge functional doctest which takes a long time to run (about a couple of hours !) but covers some of my most interesting use cases. I wanted to use these tests to get some insight into possible performance bottlenecks and other optimization hot spots in my code. The result of my effort was a very nice visual chart showing these bottlenecks and hot spots.

[update: added another visualization package, see at the end of the post]

Here is how I had to proceed (note that I am more of a foolish and cowardly hacker than an expert, and I decline any responsibility for the consequences of following my howto !) :

1. Give your python a suitable profiler

Plone 3.x requires zope 2.10 which in turn requires python 2.4. More recent versions are not supported AFAICS. Problem: python2.4 does not have a reliable performance profiling module. Its “hotshot” module is both slow (when loading statistics) and badly bugged : it crashes when you have it load some of the profiles it can generate. You have to add a better profiler to your python environment, namely cProfile (which is shipped with python 2.5).

I am a terrible sysadmin and I don’t really understand (or care about) how python manages its paths and accesses its libraries. So I did this :

download and unzip the source tarball of python 2.5 so that you get cProfile source code
locate relevant files referring to lsprof (the old name of cProfile), using a grep -R lsprof * on the source directory
I personally located the following files (leaving the cProfile test files aside) : Lib/cProfile.py, Modules/_lsprof.c and Modules/rotatingtree.* (.c and .h)
download and unzip the source tarball of python 2.4
copy the located cProfile files from their python 2.5 location to the proper dirs into the source code of your fresh python 2.4
update python 2.4’s setup.py file so that the line below is added just after the hotshot one : exts.append( Extension('_lsprof', ['_lsprof.c', 'rotatingtree.c']) )
did I mention I am so bad at hacking things that I don’t even provide a patch for the operations above ?
compile python 2.4 using a ./configure then make

At this point, you must have an executable python interpreter version 2.4 which includes cProfile. You can check by launching this python and trying a import cProfile which should not fail.

I then replaced my system python2.4 by running sudo make altinstall, but I also had to manually tweak my system files so that this new python2.4 gets properly called (I am using Ubuntu 8.10 Intrepid, BTW) :

cd /usr/bin

sudo mv ./python2.4 ./python2.4.5

sudo ln -s /usr/local/bin/python2.4

Now, a plain command line call to python2.4 should give you an interpreter prompt which lets you import cProfile if you dare. I suffered some collateral damage here : the python prompt lost its ability to recall previous lines by pressing the Up/Down arrows. And I had to re-install reportlab from source (some of my products depend on pisa, which depends on reportlab). Does anyone know how to restore this Up/Down arrow capability ?

2. Recreate your buildout using this new python version

So that zope gets recompiled using your new python version :

rm -Rf parts bin develop-eggs

python2.4 bootstrap.py

bin/buildout

3. Patch zope testrunner so that it supports cProfile instead of only supporting hotshot

I got a bit confused because my buildout contains 2 zope testrunners. It took me some time to figure out which was which : the one which is used by the zope instance your buildout creates is the one which is shipped with zope 2.10 and is located at parts/zope2/lib/python/zope/testing/. The other one I have is in the zope.testing egg. I don’t know how and why I got such an egg. Anyway, this egg supports both hotshot and cProfile whereas zope 2.10 testrunner doesn’t. So I hacked the weaker/older zope 2.10 testrunner with some inspiration from zope.testing so that cProfile can be used when running tests. Here is the diff you can use for enhancing parts/zope2/lib/python/zope/testing/testrunner.py. Oops, left version is the modified one, right version is the original one.

38,69d37
< before_tests_hooks = []
< after_tests_hooks = []
< available_profilers = {}
<
< try:
<     import cProfile
<     import pstats
< except ImportError:
<     pass
< else:
<     class CProfiler(object):
<         """cProfiler"""
<         def __init__(self, filepath):
<             self.filepath = filepath
<             self.profiler = cProfile.Profile()
<             self.enable = self.profiler.enable
<             self.disable = self.profiler.disable
<
<         def finish(self):
<             self.profiler.dump_stats(self.filepath)
<
<         def loadStats(self, prof_glob):
<             stats = None
<             for file_name in glob.glob(prof_glob):
<                 if stats is None:
<                     stats = pstats.Stats(file_name)
<                 else:
<                     stats.add(file_name)
<             return stats
<
<     available_profilers['cProfile'] = CProfiler
<
75,98c43
<     pass
< else:
<     class HotshotProfiler(object):
<         """hotshot interface"""
<
<         def __init__(self, filepath):
<             self.profiler = hotshot.Profile(filepath)
<             self.enable = self.profiler.start
<             self.disable = self.profiler.stop
<
<         def finish(self):
<             self.profiler.finish()
<
<         def loadStats(self, prof_glob):
<             stats = None
<             for file_name in glob.glob(prof_glob):
<                 loaded = hotshot.stats.load(file_name)
<                 if stats is None:
<                     stats = loaded
<                 else:
<                     stats.add(loaded)
<             return stats
<
<     available_profilers['hotshot'] = HotshotProfiler
---
>     hotshot = None
288c233
<     if len(available_profilers) == 0 and options.profile:
---
>     if hotshot is None and options.profile:
320,324c265,266
<         if available_profilers.has_key('cProfile'): prof = available_profilers['cProfile'](file_path)
<         else: prof = available_profilers['hotshot'](file_path)
<         before_tests_hooks.append(prof.enable)
<         after_tests_hooks.append(prof.disable)
<
---
>         prof = hotshot.Profile(file_path)
>         prof.start()
335c277,278
<             prof.finish()
---
>             prof.stop()
>             prof.close()
342c285,292
<         stats=prof.loadStats(prof_glob)
---
>         stats = None
>         for file_name in glob.glob(prof_glob):
>             loaded = hotshot.stats.load(file_name)
>             if stats is None:
>                 stats = loaded
>             else:
>                 stats.add(loaded)
>
459d408
<                 [hook() for hook in before_tests_hooks]
461d409
<                 [hook() for hook in after_tests_hooks]
656,659c604
<     [hook() for hook in before_tests_hooks]
<     results = run_tests(options, tests, layer_name, failures, errors)
<     [hook() for hook in after_tests_hooks]
<     return results
---
>     return run_tests(options, tests, layer_name, failures, errors)

Oh, BTW, this diff also lets you filter out the profiling of the setup and teardown steps of your tests which are of poor value compared to actual tests. Thanks to Daniel Nouri for this.
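The hook idea behind this filtering can be sketched in isolation. The snippet below is only an illustration written for a current Python 3, not the patched testrunner itself: set_up, run_tests, tear_down and run_layer are hypothetical stand-ins, and only the before_tests_hooks / after_tests_hooks names come from the diff above. The profiler is switched on just before the tests and off just after them, so setup and teardown never reach the profile.

```python
import cProfile
import io
import pstats

before_tests_hooks = []
after_tests_hooks = []


def set_up():
    """Hypothetical expensive layer setup that we do not want profiled."""
    return [n * n for n in range(1000)]


def run_tests():
    """Hypothetical test body: the only part we want in the profile."""
    return sum(n * n for n in range(1000))


def tear_down():
    """Hypothetical layer teardown, also kept out of the profile."""
    return None


def run_layer():
    """Run setup, tests and teardown, firing the hooks around the tests only."""
    set_up()
    for hook in before_tests_hooks:
        hook()
    try:
        result = run_tests()
    finally:
        for hook in after_tests_hooks:
            hook()
    tear_down()
    return result


profiler = cProfile.Profile()
before_tests_hooks.append(profiler.enable)
after_tests_hooks.append(profiler.disable)
run_layer()

# Render the collected statistics as text so we can see what was recorded.
out = io.StringIO()
pstats.Stats(profiler, stream=out).print_stats()
report = out.getvalue()
print("run_tests profiled:", "(run_tests)" in report)
print("set_up profiled:", "(set_up)" in report)
```

Keeping the hooks in plain lists means the code that runs the tests does not need to know which profiler, if any, is plugged in.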

At this point, you should have given your zope instance the capability of profiling tests using cProfile. You can check this by asking zope for a debug prompt with bin/instance debug. The prompt you get should let you safely import cProfile.

4. Profile your test

Say you have a product called Products.DearProduct with some tests. Profile them :

bin/instance test -s Products.DearProduct --profile

At this point, you should get a tests_profile.*.prof file saved in the current dir. It contains the performance profile cProfile generated, using the pstats format. You can manually load and analyze this data. Or have a limited GUI show you what it’s like. Or you can go for the nicer, more insightful version which follows.
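For the manual route, here is a minimal sketch of loading and analyzing such files with the standard pstats module. It assumes a current Python 3 (on the Python 2.4 setup described here, details such as io.StringIO would differ). The slow_sum function, the temporary directory and the tests_profile.demo.prof file name are made up for the demo; with real data you would point the glob at your own tests_profile.*.prof files and use part of your product's name as the filter pattern.

```python
import cProfile
import glob
import io
import os
import pstats
import tempfile


def slow_sum(n):
    """Hypothetical stand-in for product code exercised by a test."""
    return sum(i * i for i in range(n))


def load_profiles(prof_glob, stream):
    """Merge every profile file matching prof_glob into one Stats object."""
    file_names = sorted(glob.glob(prof_glob))
    if not file_names:
        raise ValueError("no profile file matches %r" % prof_glob)
    return pstats.Stats(*file_names, stream=stream)


def report(prof_glob, pattern, limit=10):
    """Return the top entries, by cumulative time, whose location matches pattern."""
    out = io.StringIO()
    stats = load_profiles(prof_glob, out)
    # Restrictions are applied in order: regex filter first, then line count.
    stats.sort_stats("cumulative").print_stats(pattern, limit)
    return out.getvalue()


# Demo: write a small profile to a temporary directory, then analyze it.
profile_dir = tempfile.mkdtemp()
profiler = cProfile.Profile()
profiler.enable()
slow_sum(10000)
profiler.disable()
profiler.dump_stats(os.path.join(profile_dir, "tests_profile.demo.prof"))

text = report(os.path.join(profile_dir, "tests_profile.*.prof"), "slow_sum")
print(text)
```

The regular expression is matched against each entry's "file:line(function)" label, so a pattern taken from your product's path is a cheap way to leave framework code out of the printed report.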

5. Visualize and analyze the performance profile you generated

Thanks to Ingeniweb folks, I heard of gprof2dot and xdot. Download them (the scripts, not the folks). Use them to generate and display a very nice graph :

chmod 744 gprof2dot.py

chmod 744 xdot.py

./gprof2dot.py -f pstats -o profile.dot tests_profile.*.prof

./xdot.py profile.dot

Note the * you may replace with the ID of the profile generated above. Or you can use the fancy but dangerous one-liner below which runs the tests, generates the profile, generates the corresponding graph, displays the results of tests and displays the graph for analysis :

rm -f tests_profile.*.prof && rm -f profile.pstats && rm -f profile.dot && bin/single-instance test -s Products.MyDearProduct --profile > /tmp/test.txt ; ./gprof2dot.py -f pstats -o profile.dot tests_profile.*.prof && less /tmp/test.txt ; ./xdot.py profile.dot

At this point, you should be staring at a nice colored graph which represents the flow of your tests and the methods which may be performance bottlenecks. And you should be hoping that it was worth the effort.

[Here starts the update]

After a moment of contemplation, I tried to analyze the graph of my tests and did not feel extremely happy with this graph visualization. It does show me that the slowness of functional doctests is mostly due to the testing framework (zope.testbrowser, etc.). This slowness “hides” the optimization opportunities in my code. And I don’t know how to exclude some products from being profiled or from appearing in the profile stats (I would have liked to filter out zope.testbrowser and other Plone-specific things). But all hope is not lost, here comes kcachegrind:

sudo apt-get install kcachegrind

sudo easy_install pyprof2calltree

pyprof2calltree -o output.calltree.stats -i tests_profile.*.prof -k

Using kcachegrind with the help of pyprof2calltree, I was able to focus on my product methods and identify those methods which deserve some caching. Added some @memoize decorators and reran the profiled tests so that I could enjoy the performance improvement… Happy I am, happy thou shalt be.
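To illustrate what such caching does, here is a minimal hand-rolled memoize sketch for a current Python 3. It is a stand-in, not the decorator used above: expensive_lookup and the calls list are hypothetical, and the cache is a plain in-memory dictionary keyed on positional arguments, which must therefore be hashable.

```python
import functools


def memoize(func):
    """Cache results of func, keyed on its (hashable) positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


calls = []  # records every real (uncached) execution


@memoize
def expensive_lookup(key):
    """Hypothetical slow method identified in the profile."""
    calls.append(key)
    return key.upper()


first = expensive_lookup("plone")
second = expensive_lookup("plone")  # served from the cache, no new call
print(first, second, len(calls))
```

A cache like this never expires, so it only suits values which do not change during the lifetime of the process; anything depending on content that can be edited needs a cache key or invalidation strategy of its own.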

What do you think ?

Call for public-interest IT projects

Do you know of an IT project which could help make the world a better place? Save the planet? Create an Internet innovation of public utility? Or simply make life easier for your nonprofit? Advance a great cause, or a very small one? Advance science? Then answer this call, because I think I can boost that project by recruiting IT patrons for it.

As part of my new company, I offer my professional services to any public-interest IT project: I provide (at zero cost, see below) my skills as a manager of innovative IT projects, as well as access to the skills of a great many other software engineers, during their working time. You want software engineering skills to make the world a better place? Here they are!

Note that I set, a priori, no restriction of theme or field: fighting poverty, scientific research, environmental protection, health, disability, child protection, etc. It does not matter, as long as the project truly serves the general interest and public utility (see below).

The conditions to meet

For my company to be able to step in, your IT project absolutely must:

be “of general interest” (d’intérêt général), that is, be carried by an organization which is entitled, in France, to issue tax receipts in exchange for the donations it receives (mécénat)
not be a very small project: it must require, from the patrons, at least 1 full-time engineer
be carried by a team which is already active: I can provide between 2 and 5 times the time you already spend on the project, as volunteers or employees; if you are not already working on the project, I cannot do anything (0 times 2 equals 0!)
be a project which is really worth it: have a genuine social impact, direct or indirect, a clearly measurable and motivating usefulness, answer a societal challenge on a small or large scale, be a source, lever or driver of change for society…
not require a significant physical presence outside the Paris region (I am starting small and close to home, even though I am a fan of remote work and conference calls); in short, be located rather close to Paris

What is a public-interest IT project?

An IT project is of general interest if it is carried by an organization which benefits from the French tax regime for patronage (mécénat). Ah, a mystery: what on earth is that? The French law of August 2003 on patronage remains little known, but it represents a significant source of income for organizations of general interest. Several types of organizations meet this criterion. To keep it simple, it can be a nonprofit association under the French law of 1901:

which is not for profit: it does not collect VAT, does not pay corporate income tax, has volunteer and disinterested board members and officers, and does not compete with commercial companies, or else does so at prices much lower than the market, mainly for a disadvantaged public and without “commercial practices” (advertising, …); ask an accountant for advice if needed
and whose purpose is philanthropic, educational, social, humanitarian, sporting, family-related, cultural, artistic, environmental, literary, scientific…
and whose activities do not benefit a restricted circle of people (unlike trade unions or a school’s alumni associations, for example…)

If need be, a law-1901 association can easily be created to carry the project (bylaws and registration at the préfecture) and meet the conditions of general interest. There is no condition on the age or size of the association. Nor is it necessarily required to obtain an administrative approval (as would be the case for associations “reconnues d’utilité publique”, a recognition which is very hard to obtain nowadays).

To learn more about the notion of general interest, I invite you to consult the patronage website of the French Ministry of Culture as well as the explanations of the Association pour le Développement du Mécénat Industriel et Commercial (ADMICAL).

How can I help, in practice?

If you already devote time to your project, I can multiply that effort.

Example: together with 4 other volunteers, each of you devotes at least one day a week to your project (that is, one full-time equivalent, 5 working days a week). I can then provide you, in addition, with the equivalent of 2 full-time engineers (10 working days a week), or even more if your project is very simple to manage.

This help will take the form of:

permanent support from my company: at least half a day of assistance and advice per week, depending on the size of your project, plus a service representing and following up your project with the sponsoring companies,
individual contributions from a large number (50, 100, 200…?) of IT professionals, engineers, technicians or consultants, for variable and sometimes short periods (for example one week), during their working time,
the possibility of strengthening your volunteer team through later contributions from some of these professionals in their free time (possibly building an open-source-style community if your project lends itself to it),
access to a secure web-based information system to manage your project, your contributors and your relationships with patrons, and to automate all the administrative paperwork which goes with it (contracts, patronage agreement, tax receipts, …)

How does it work?

I am currently creating a social-purpose company whose goal is to give social innovators the same IT resources as those available to the most modern companies. My business relies on the patronage of IT services companies (SSII) which are committing to “sustainable development” (or, more precisely, “corporate social responsibility”) approaches. They wish to practice skills-based IT patronage through me: donating the working time of their engineers and consultants in the form of a free service delivered and managed via the Web. I call this “doing wecena” (Wecena is the name of my company!).

The funding of this help is indirectly provided, 100%, by the French State, thanks to the law on corporate patronage. The State grants a significant tax reduction to any company which decides to give concrete help to an organization of general interest (donation of money, donation in kind, donation of skills and working time…). The IT services companies I meet are ready to embark on the adventure by offering their engineers the chance to move your project forward during those periods known as “inter-contrat” (also called intercontrat or “stand-by period”): periods of a few days to a few months which start when an engineer finishes a project for one client and has not yet been assigned to another project for a new client.

This imposes an important constraint on the management of your project: the engineers delivering the service will take over from one another at a very fast pace; some will only be present for 48 hours while others will be available for 2 or 3 months in the year. The average length of an individual contribution lies somewhere between one week and one month (depending on the contributor’s trade and the state of the IT market, and also on the patron’s policy). It is my company’s role to help you manage this constraint. Note that this constraint also has some advantages: if your project is simple enough and can be split into small tasks (with the help of suitable management methods and tools, which I provide), you will have the opportunity to present your cause to a multitude of contributors, whom you can then recruit as potential volunteers once their wecena assignment is over. This is the case, for example, of projects involving introductory computer training, running computer workshops for disadvantaged people, or multiple PC or local network installations… For more complex projects (development, consulting, …), your involvement is greater and wecena cannot represent more than twice the time you already devote to the project.

Some example projects

To help you get an idea of the kind of projects that can benefit from wecena, here are a few examples of projects I have already presented to sponsors:

design and development of innovative software that makes the keyboard and mouse easier to use for people with a motor disability
improvement of the IT infrastructure of an NGO working against social exclusion: replacement of a fleet of workstations, system administration work on file and application servers, …
deployment of a financial reporting package for project-based services, for an association receiving substantial public subsidies
overhaul of Web applications for document management, contact and relationship management and membership management, for an Internet-based association working in the field of family and child protection
creation of a blog by a public letter-writer from a Franco-African NGO to raise French students’ awareness of North-South development issues
assistance with moving a healthcare facility management system to the Web, for an association in the health and social care sector
introductory computing sessions and training on in-house software for retired volunteers of a humanitarian association

These are just a few examples to set the tone. None of these projects has started yet.

Warning

My company is still in a start-up and experimentation phase. I currently cannot guarantee either that your particular project will be selected by a sponsor (the most solid and most ambitious projects will of course stand a better chance) or even that I can start supporting you right away. Indeed, the help I can offer is itself a project (starting a company…): I believe in it enormously, since I left my previous employer to launch into this adventure, and I devote all my time and skills to it. That said, starting this kind of innovative social enterprise takes time and involves a share of risk, of uncertainty, in short of adventure… The first project I support could start in late 2008 (if the stars align as planned) or in early 2009 at the latest (if I am less lucky). The sponsors I meet are already on a war footing and have already started reviewing the IT projects I present to them. Some have already stated their preference and are getting into battle order… Fingers crossed, I hope a first project could start shortly after the start of the 2008 school year.

To join the adventure…

Do you know a team that runs a public-interest IT project and needs engineers’ time to go further and faster? Forward them the address of this article!

Does your project meet the conditions described above? To make sure, ask the question in a comment below or contact me directly by email at the following address: projets (at) wecena (dot) com, or at my blogger address: sig (at) akasig (dot) org. My company’s Web site should not open before the first project starts. In the meantime, this is where things happen. Do you have advice to give me, opinions or contacts to share, or suggestions to make? They are welcome: I invite you to use this blog’s comment feature for those too.

Plone + Freemind = eternal love ?

Congratulations to Plone and Freemind, two great open source software packages, which recently got married and promptly released a newborn “Plone Freemind v.1.0” extension product for Plone. I have been really fond of Plone and Freemind for several years now. It’s good news to learn that Freemind mindmaps can now be published and managed via a Plone site… even though I have yet to imagine some valuable use for this! :)

Pierre Levy vs Tim Berners-Lee, round 0.1

Yesterday, I attended a research seminar at the “Université de Paris 8”. Pierre Levy is a philosopher and professor and head of the collective intelligence chair at the University of Ottawa, Canada. He presented the latest developments in his work on IEML, which stands for Information Economy Meta Language. Things are taking shape on this side and this presentation gave me the opportunity to better understand how IEML compares to the technologies of the Semantic Web (SW).

IEML: not another layer on top of the SW cake

IEML is proposed as an alternative to SW ontologies. In the SW, the basic technology is the URI (Uniform Resource Identifier), which uniquely (and hopefully permanently) identifies a concept (“resource”). Triples then combine these URIs into assertions, which in turn form a graph of meaning called an ontology. IEML introduces identifiers which are not URIs. The main difference between URIs and IEML identifiers is that IEML identifiers are semantically rich: they carry meaning. From a given IEML identifier, one could derive some (or ideally all?) of the semantics of the concept it identifies. Indeed, these identifiers are composed of 6 semantic primitives. These 6 primitives are Emptiness, Virtual, Actual, Sign, Being and Thing (E, V, A, S, B and T) and were chosen to be as universal as possible, i.e. not dependent on any specific culture or natural language. The IEML grammar is a way to combine these primitives and logically build concepts with them (also using the notion of triple-based graphs). These primitives are comparable to the 4 bases of DNA (A, C, T and G) that are combined into a complex polymer (DNA): with a limited alphabet, IEML can express an astronomically huge number of concepts in the same way the 4-letter alphabet of DNA can express a huge number of phenotypes.

Meaningfulness of identifiers

When I realized that the meaningful IEML identifiers are similar in their role to URIs, my first reaction was horror. I have struggled for years against “old-school” IT workers who tend to rely on database keys for deriving properties of records. In a former life in the IT department of a big industrial corporation, I was highly paid to design and impose a meaningless unique person identifier in order to uniquely and permanently identify the 200 000 employees and contractors of that multinational company in its corporate directory. The main advantage of meaningless identifiers is probably that they can be permanent: you don’t have to change the identifier of an object (of a person for instance) when some property of this object changes over time (the color of the person’s hair, or Miss Dupont getting married and becoming Mrs Durand while still keeping the same corporate identifier).

The same is true for URIs whenever it is feasible: if a given resource is to change over time, its URI should not be dependent on its variable property (http://someone.com/blond/big/MissDurand having to change into http://someone.com/white/big/MissesDupont is a bad thing).

The same may not be true when concepts (not people) are to be identified. Concepts are supposed to be permanent and abstract things with IEML (as in the SW I guess). If some meaningful semantic component of a given concept changes then… it’s no longer the same concept (even though we may keep using the same word in a natural language in order to identify this derived concept).

In the old days, IT workers used to introduce meaning in identifiers so that (database) records could more easily be managed by humans, especially during tasks like visually classifying or sorting records in a table or getting an immediate overview of what a given record is about. But this came to be seen as bad practice once the cost of storage (having specific fields for properties that used to be stored as part of a DB key) and the cost of computation (getting a GUI for querying/filtering a DB based on properties) got lower. More often than not, the meaningful key was not permanent, and this introduced perverse effects, including having to assign a new key to a given record when some property changed, or managing human errors when the properties “as seen in the key” were no longer in sync with the “real” properties of the record according to some field.
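To make the contrast concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Person` class, the key format and the use of a random UUID as the meaningless identifier are mine, not a description of any real corporate directory.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical directory record, for illustration only."""
    name: str
    hair: str
    # Meaningless identifier: random, carries no property of the person.
    opaque_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def meaningful_key(self) -> str:
        # Meaningful identifier: built from properties that may change.
        return f"{self.hair}/{self.name}"


person = Person(name="Miss Dupont", hair="blond")
old_opaque = person.opaque_id
old_meaningful = person.meaningful_key()

# Some properties change over time...
person.name = "Mrs Durand"
person.hair = "white"

# ...the opaque identifier is untouched, the meaningful key is now stale.
print(person.opaque_id == old_opaque)             # True
print(person.meaningful_key() == old_meaningful)  # False
```

Anything that stored the old meaningful key (links, foreign keys, bookmarks) now points at nothing, which is exactly the perverse effect described above.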

That’s probably part of the rationale behind the best practices in URI design and web architecture: a URI should be as permanent as possible, I guess, so that it does not change when the properties of the resource it identifies change over time. Web architectures are thus made more robust over time.

With IEML, we are back to the ol’ times of meaningful identifiers. Is it such a bad thing? Probably not, because the power of IEML lies in the meaningfulness of these identifiers, which allows all sorts of computational operations on the concepts. Anyway, that’s probably one of the biggest basic differences between IEML and the SW ontologies.

Matching concepts with IEML

Another aspect of IEML struck me yesterday: IEML gives no magic solution to the problem of mapping (or matching) concepts together. In the SW universe, there is this recurring issue of getting two experts or ontologies to agree on the equivalence of 2 resources/concepts: are they really the same concept expressed with distinct but equivalent URIs? Or are they distinct concepts? How can semantic ambiguities be resolved? Unless we get a solution to this issue, the grand graph of semantic data can’t be universally unified and people get isolated on semantic islands which are nothing more than badly interconnected domain ontologies. This is called the problem of semantic integration, ontology mapping, ontology matching or ontology alignment.

A couple of years ago, I hoped that IEML would solve this issue. IEML being such a regular and to-be-universal language, one could project any concept onto the IEML semantic space and obtain the coordinates (identifier) of this concept in this space. A second person or expert or ontology could also project its own concepts. Then it would just be a matter of calculating the distance between these points in the IEML space (IEML provides ways of calculating such distances). And if the distance was below some threshold, the 2 concepts could be considered equivalent for a given pragmatic purpose.
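The matching step I had in mind can be sketched in a few lines of Python. This is a toy under loud assumptions: concepts are represented as plain tuples of numbers and the distance is Euclidean, neither of which is claimed to be how IEML coordinates or IEML distances actually work; the function name and the values are hypothetical.

```python
import math


def equivalent(coords_a, coords_b, threshold):
    """Treat two concepts as equivalent when their coordinates are closer
    than `threshold`. Euclidean distance is only a stand-in for whatever
    distance the semantic space really defines."""
    return math.dist(coords_a, coords_b) < threshold


# Hypothetical coordinates produced by two different experts.
concept_from_expert_1 = (0.0, 0.0)
concept_from_expert_2 = (3.0, 4.0)   # Euclidean distance to the first: 5.0

print(equivalent(concept_from_expert_1, concept_from_expert_2, threshold=6.0))  # True
print(equivalent(concept_from_expert_1, concept_from_expert_2, threshold=5.0))  # False
```

The threshold is the “given pragmatic purpose”: the same pair of concepts can be equivalent for one use and distinct for another.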

But yesterday, I realized that the art of projecting concepts into the IEML space (i.e. assigning an identifier to a concept) is very subjective. Even though a Pierre Levy could propose a 3000-concept dictionary that assigns IEML coordinates (identifiers) to concepts that are also identified by a short natural language sentence (like in a classic dictionary), this would not prevent a Tim Berners-Lee from coming up with a very different dictionary that assigns different coordinates to the same described concepts. Thus the distance between a Pierre-Levy-based IEML word and a TBL-based IEML word would be… meaningless.

In the SW, there is a basic assumption that anyone may come up with a different URI for the same concept, and the URIs have to be associated via a “same as” property so that they are said to refer to the very same concept. When you get to bunches of URIs (2 ontologies for instance), you then have to match the URIs which refer to the same concepts. You have to align these ontologies. This can be a very tedious, manual and tricky process. The SW does not unify concepts. It only provides a syntax to represent and handle them. Humans still have to interpret them and match them together when they want to communicate with each other and agree on the meaning that these ontologies carry.
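Once humans have declared the “same as” links, the mechanical part of alignment is easy, as the following Python sketch suggests. It is an illustration only: the URIs are invented, `align` is a hypothetical helper, and the grouping uses a plain union-find rather than any Semantic Web library.

```python
def align(same_as_pairs):
    """Group URIs that are transitively linked by 'same as' statements."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Hypothetical URIs from three ontologies, with manually asserted links.
pairs = [
    ("http://example.org/a/Banana", "http://example.com/b/Banane"),
    ("http://example.com/b/Banane", "http://example.net/c/MusaFruit"),
    ("http://example.org/a/Apple", "http://example.com/b/Pomme"),
]
groups = align(pairs)
print(len(groups))  # 2
```

The hard and costly part is deciding which pairs to feed in, and no amount of code removes that human judgment.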

The same is more or less true with IEML. With IEML, identifiers are not arbitrarily defined (meaningful identifiers) whereas SW URIs are almost arbitrarily defined (meaningless identifiers). But the meaningful IEML identifiers only carry human meaning if they refer to the same (or similar) human/IEML dictionary.

Hence it seems to me that IEML is only valuable if some consensus exists about how to translate human concepts into the IEML space. It is only valuable to the extent that there is some universally accepted IEML dictionary, at least for basic concepts (primitives and simple combinations of IEML primitives). The same is true in the universe of SW technologies, and there are some attempts at building “top ontologies” that are proposed as shared references for ontology builders to align their own ontologies with. But the alignment process, even if theoretically made easier by the existence of these top ontologies, is still tricky, tedious and costly. And critical mass has not been reached in sharing the use of such top ontologies. There is no top consensus to refer to.

Pierre Levy proposes a dictionary of about 3000 IEML words (identifiers) that represent almost all possible low-level combinations of IEML primitives. He invites people to enhance or extend his dictionary, or to come up with their own dictionaries. Let’s assume that only minor changes are made to the basic Pierre Levy dictionary. Let’s assume that several conflicting dictionary extensions are made for more precise concepts (higher-level combinations of IEML primitives). Given that these conflicting extensions still share a basic foundation (the basic Pierre Levy dictionary), would the process of comparing and possibly matching IEML-expressed concepts be made easier? Even though IEML does not give any automagical solution to the problem of ontology mapping, I wonder whether it makes things easier or not.

In other words, is IEML a superior alternative to SW ontologies ?

Apples and bananas

Yesterday, someone asked: “If someone assigns IEML coordinates to the concept of bananas, how will these coordinates compare to the concept of apples?” The answer did not satisfy me because it was along the lines of: “IEML may not be the right tool for comparing bananas to apples.” I don’t see why it would be more suitable for comparing competencies to achievements than for comparing bananas to apples. Or I misunderstood the answer. Anyway…

Pierre Levy put much effort into describing the properties of his abstract IEML space so that programmers could start writing libraries for handling and processing IEML coordinates and operations. There is even a programming language being developed that allows semantic functions and operations to be applied to IEML graphs and allows quantities (economic values, energy potentials, distances) to flow along IEML-based semantic graphs. Hence the name Information Economy.

So there are (or will soon be) tools and services for surviving in the IEML space. But I strongly feel that there is a lack of tools for moving back and forth between the world of humans and the IEML space. How would you say “bananas” in IEML ? Assuming this concept is not already in a consensual dictionary.

As far as I understand, the process of assigning IEML coordinates to the concept of “bananas” is somewhat similar to the process of guessing the “right” (or best?) Chinese ideogram for bananas. I don’t speak Chinese at all. But I imagine one would have to combine existing ideograms that would best describe what a banana is. For instance, “bananas” could be written with a combination of the ideograms that mean “fruits of a herbaceous plant cultivated throughout the tropics that grow in hanging clusters”. It could also be written with a combination of the ideograms that mean “fruits of the plants of the genus Musa that are native to the tropical region of Southeast Asia and Australia”. Distinct definitions of bananas could refer to distinct combinations of existing IEML concepts (fruits + herbaceous plant + hanging clusters + tropics, or fruits + plants + genus Musa + Southeast Asia + Australia). Would the resulting IEML coordinates be far away from each other? Could a machine infer that these concepts are closely related, if not practically equivalent to each other? How dependent would the resulting distance be on conflicts or errors in underlying IEML dictionaries?

I ended the day with this question in my mind: how robust is the IEML translation process to human conflicts, disagreements and errors? Is it more robust than the process of building and aligning SW ontologies? Its robustness seems to me to be the main determining factor in the feasibility of the new collective-intelligence-based civilization Pierre Levy promises. If only there were a paper comparing this process to what the SW already provides, I guess people would realize the value of IEML.

Web scraping, web mashing

5 Ways to Mix, Rip, and Mash Your Data introduces promising web and desktop applications that extract structured data feeds from web sites and mix them together into something possibly useful to you. Think of things like getting filtered Monster job ads as a convenient RSS feed, along with job ads from your other favorite job sites. This reminds me of my Python hacks for automating web crawling and web scraping. Sometimes, I wish I could find time to work a bit further on that…

Web 2.0 architectures with Java

Here are two things to go beyond Web Services (with ReSTfulness in mind):

the “Web-Oriented Architecture” (WOA) concept is a lightweight version of the “Service-Oriented Architecture” concept, in a more Web 2.0 fashion.
restlets are a Java framework for Web 2.0 applications; they replace the servlet API and facilitate the composition of “mashups”, or Web applications relying on information collected from other live Web applications; there is even an open source reference implementation that is about to reach maturity and is actively developed

Take-away: beyond theory (ReST), there are now concepts (WOA) and tools (restlets) for building composite Web applications without requiring SOAP, WSDL and the whole bunch of bloated WS-* standards that come with them.

WikiCalc: Web 2.0 spreadsheets wikified

WikiCalc is a nice piece of GPLed software that publishes wiki pages structured like Excel spreadsheets: one can view and edit tables, modify calculation formulas in cells, manage their formatting through the web browser, etc. It brings to spreadsheets the inherent advantages of many wikis: ease of use for Web publication, ease of modification, revision tracking for undoing unwanted changes by other users, RSS views on recent changes made to the page. It brings to wikis the inherent advantages of spreadsheets: live calculations, nice formatting, compliance with the corporate way of thinking and managing things (will we see a WikiSlides with bullet points and animations some day?). More than this, WikiCalc lets spreadsheets grab input data from external web sites and do live calculations from it: some formulas generate HTTP requests to web services in order to retrieve the latest value of a stock quote, weather forecasts, and so on. Last but not least, the flexible architecture of WikiCalc allows offline use, still via the user’s browser, and a synchronization mechanism lets the online version get updated once the connection is restored.

A nice 10 min long WikiCalc screencast with audio is available here.

In a former life, I was managing a team of web project managers in a multinational industrial corporation. As my boss wanted simple-to-update weekly/monthly status reports about every project, we tried using one wiki page per project to publish and update those reports. It was tedious and not nicely formatted for a corporate environment. I imagine that a nice immediate use of WikiCalc would be to let small project teams update project status reports on an intranet, including nicely formatted timelines and budget indicators. It would keep the update effort at a minimal and convenient level and would preserve the wiki flexibility of linking to the project documentation and resources.

We already knew structured wiki pages for managing forms or category schemes. WikiCalc introduces spreadsheet structures while preserving the open and unstructured spirit of wikis. The next step for future wikis would be to allow semantic structures to be managed the wiki way, as in some early semantic wiki prototypes. [update: see Danny Ayers’ blog entries on how WikiCalc could relate to the Semantic Web vision]

Connecting people through the semantic web

The European research project Vikef aims to develop technologies for putting people in touch with one another using Semantic Web technologies. Main intended application: connecting professionals at trade shows and scientists at conferences. Launched in April 2004, the project will end in March 2007. The project is led notably by Xerox, the Fraunhofer institute and Telefonica.
My spontaneous reaction is both to be delighted by such a project and to worry about the usability of the applications that will come out of it: will users be asked to model their own interests? That does not seem very realistic to me. I’ll believe it when I see it!

Inventing an automatic coaching system for mobile phones

[This is a summary of one of my professional achievements. I use it to advertise myself in the hope of winning over future partners. More information on this in the account of my career.]

In 2005, the MobiLife IT research project, run jointly by 22 European companies and universities, has a mobile phone application that lets athletes view their training context: heart rate, place, time… As a research engineer, I am asked to invent a system that exploits this kind of data to offer the user personalized, context-dependent recommendations. I propose a user scenario to the partners, it is accepted, and I then supervise its implementation. I implement part of the system on the server side (J2EE) and on the phone side (J2ME). The application thus becomes able to learn the athlete’s training habits, good or bad, to predict their next choice of exercise, to compare it with what an expert coach would recommend in the same conditions and, on this basis, to alert the athlete through short personalized video clips on their phone: “Careful, it is late and, after 2 running exercises on the treadmill, you usually tend to push too hard on the next exercise; you should rather switch to the bike for a 10-minute exercise of medium difficulty”. The system can be transposed to countless mobile situations: diet coaching, continuing education, environmentally friendly habits, tourist guides,… At an open day of the Motorola labs, I organize the demonstration of this application in front of 40 European journalists and analysts.

Criticisms of the semantic web

The Semantic Web is the target of many criticisms. This technology-driven vision is mainly criticized for its lack of pragmatism. Here are references to two articles that illustrate these criticisms.

Clay Shirky argues that managing ontologies is no picnic and that “social tagging”/“folksonomy” technologies are an alternative far better suited to the Internet than the Semantic Web vision of ontology specialists. In my view, the proposed alternative (social tagging) is good, but the anti-ontology criticism is exaggerated, because I do not think Tim Berners-Lee’s vision of the semantic web is as focused on top-down ontological modelling of knowledge as people like to claim. In short, for me, the way forward would be something like “semantic social tagging”. This article is a good starting point for discovering folksonomies, as compared with ontological knowledge modelling.

On a more comical note, there is a loose adaptation of a sketch by the English comedians “Monty Python” that pokes fun at the top-down approach of semantic web specialists; criticism comes easily now that Tim Berners-Lee has been knighted by the Queen of England… It remains thoroughly amusing and interesting nonetheless.

By the way, here is a simple way to picture the semantic web: imagine the web becoming not just one big hypertext document but also one big database. With the semantic web, applications (agents or not) will be able to “freely” process data produced by other applications.
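The “web as one big database” picture can be sketched in Python as follows. This is only a toy: the triples, the `ex:` names and the `query` helper are invented for illustration, and real Semantic Web data would use URIs, RDF and a proper query language rather than an in-memory list.

```python
# Hypothetical statements, as if published by several unrelated applications.
TRIPLES = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:alice", "ex:knows", "ex:bob"),
]


def query(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    wanted = (subject, predicate, obj)
    return [
        triple for triple in triples
        if all(w is None or w == value for w, value in zip(wanted, triple))
    ]


# Another application can now ask: who works for ex:acme?
employees = [s for s, _, _ in query(TRIPLES, predicate="ex:worksFor", obj="ex:acme")]
print(employees)  # ['ex:alice', 'ex:bob']
```

The application asking the question does not need to know which application produced each statement, which is the whole point of the picture.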

From flat text to structured data

This article shows an example of how to build structured data sets from flat text. The example given is the detection of existing relationships between (ex-)members of the British government by data-mining Wikipedia. This relates to “bubble-up folksonomies”. These folks at the BBC are smart!

How to ReSTfully Ajax

Here are some pointers for learning more about the Ajax programming model and how to properly design your Ajax application :

Ajax is said to be the cross-platform successor to Java… huh… (David, thank you for this pointer)
Ajax should be ReSTfully considered before use
Is Ajax ReSTless and dirty ?

While I am mentioning the Representational State Transfer (ReST) architecture style, here are some additional and valuable resources on this topic:

Do we need a ReST toolkit for application developers? What would it look like?
NetKernel is claimed to be such a ReSTful toolkit
A ReSTful toolkit for application developers would be a low cost disruption to the heavyweight SOA products that are (by far) overshooting the market of IT departments in big corporations ; it nicely fits the Christensen model of disruptive innovations

Comparator

Comparator is a small Plone product I recently hacked together for my own pleasure. It’s called comparator until it gets a nicer name, if ever. I distribute it here under the GNU General Public License. It allows users to select any existing content type (object class) and to calculate a personalized comparison of the instances of this class. For example, if you choose to compare “News Items”, you select the news item properties you want to base your comparison upon (title, creation date, description, …). You give marks to each value of these properties (a somewhat tedious process at the moment, with much room for improvement in the future). Comparator then lets you give relative weights to these properties so that the given marks are processed and the compared instances are ranked globally.

It’s a kind of basic building block for a comparison framework, for building Plone applications that compare stuff (any kind of stuff that exists within your portal, including semantically aggregated stuff). Let’s say that your Plone portal is full of descriptions of beers (with many details about all kinds of beers). Then adding a comparator to your portal will let your users give weights to every beer property and rank all the beers according to their personal tastes.
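The ranking idea can be sketched in plain Python. This is not the Comparator code itself: the `rank` function, the data layout and the choice of a weighted average are assumptions made for the sake of the example, and the beers are invented.

```python
def rank(items, marks, weights):
    """Rank items by the weighted average of the marks given to their values.

    items:   item name -> {property: value}
    marks:   property -> {value: mark given by the user}
    weights: property -> relative weight given by the user
    """
    total_weight = sum(weights.values())
    scores = {}
    for name, properties in items.items():
        weighted = sum(
            weight * marks[prop].get(properties.get(prop), 0)  # unmarked value counts as 0
            for prop, weight in weights.items()
        )
        scores[name] = weighted / total_weight
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical portal content and one user's personal tastes.
beers = {
    "Stout X": {"color": "dark", "strength": "strong"},
    "Lager Y": {"color": "pale", "strength": "light"},
}
marks = {
    "color": {"dark": 5, "pale": 2},
    "strength": {"strong": 1, "light": 4},
}
weights = {"color": 3, "strength": 1}

ranking = rank(beers, marks, weights)
print(ranking)  # [('Stout X', 4.0), ('Lager Y', 2.5)]
```

A user who cares more about strength than color only has to change `weights` to get a different ranking from the same marks.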

Comparator is based on Archetypes and was built from a UML diagram with ArchgenXML. Comparator fits well in my vision of semantic aggregation. I hope you can see how. Comments welcome!

Daisy vs. Plone, feature fighting

A Gouri-friend of mine recently pointed me to Daisy, a “CMS wiki/structured/XML/faceted” thing, he said. I answered that it may be a nice product but not attractive enough for me to spend much time analyzing it at the moment. Nevertheless, as Gouri asked, let’s browse Daisy’s features and try to compare them with their Plone equivalents (given that I have never tried Daisy).

The Daisy project encompasses two major parts: a featureful document repository

Plone is based on an object-oriented repository (Zope’s ZODB) rather than a document oriented repository.

and a web-based, wiki-like frontend.

Plone has its own web-based frontend. Wiki features are provided by an additional product (Zwiki).

If you have different frontend needs than those covered by the standard Daisy frontend, you can still benefit hugely from building upon its repository part.

Plone’s frontend is easily customizable, either with your own CSS, by inheriting from existing ZPT skins or with a WYSIWYG skin module such as CPSSkin.

Daisy is a Java-based application

Plone is Python-based.

, and is based on the work of many valuable open source packages, without which Daisy would not have been possible. All third-party libraries or products we redistribute are unmodified (unforked) copies.

Same for Plone. Daisy seems to be based on Cocoon. Plone is based on Zope.

Some of the main features of the document repository are:
* Storage and retrieval of documents.

Documents are one of the numerous object classes available in Plone. The basic object in Plone is… an object, which is not fully extensible by itself unless it was designed to be so. Plone content types are more user-oriented than generic documents (they implement specialized behaviours such as security rules, workflows, displays, …). They will be made very extensible when the next versions of the underlying “Archetypes” layer are released (these include a through-the-web schema management feature that allows web users to extend any existing content type).

* Documents can consists of multiple content parts and fields, document types define what parts and fields a document should have.

Plone’s perspective is different because of its object orientation. Another Zope product called Silva is more similar to Daisy’s document orientation.

Fields can be of different data types (string, date, decimal, boolean, …) and can have a list of values to choose from.

Same for Archetypes based content types in Plone.

Parts can contain arbitrary binary data, but the document type can limit the allowed mime types. So a document (or more correctly a part of a document) could contain XML, an image, a PDF document, … Part upload and download is handled in a streaming manner, so the size of parts is only limitted by the available space on your filesystem (and for uploading, a configurable upload limit).

I imagine that Daisy allows the upload and download of documents of any structure, with no constraint. In Plone, you are constrained by the object model of your content types. As said above, this model can be extended at run time (schema management) but, at the moment, the usual way is to define your model at design time and then comply with it at run time. At run time (even without schema management), you can still add custom metadata or upload additional attached files if your content type supports attached files.

* Versioning of the content parts and fields. Each version can have a state of ‘published’ or ‘draft’. The most recent version which has the state published is the ‘live’ version, ie the version that is displayed by default (depends on the behaviour of the frontend application of course).

The default behaviour of Plone does not include real versioning but document workflows. This means that a given piece of content can be in the state ‘draft’ or ‘published’ and go from one state to another according to a pre-defined workflow (with security conditions, event triggering and so on). But a given object has only one version by default.
There are, however, additional Plone products that make Plone support versioning. These products are to be merged into a future Plone distribution because versioning has been a long-awaited feature. Note that, at the moment, you can have several versions of a document to support multi-language sites (one version per language).
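
To make the Daisy rule concrete, here is a minimal sketch in plain Python (not Daisy or Plone code) of how a ‘live’ version could be picked from a version history. The `Version` class and the `live_version` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    number: int  # ever-increasing version number
    state: str   # "draft" or "published"
    body: str


def live_version(versions: List[Version]) -> Optional[Version]:
    """Return the most recent version in state 'published', or None."""
    published = [v for v in versions if v.state == "published"]
    return max(published, key=lambda v: v.number, default=None)


history = [
    Version(1, "published", "first text"),
    Version(2, "published", "second text"),
    Version(3, "draft", "work in progress"),
]

live = live_version(history)
print(live.number if live else "nothing published yet")  # prints 2
```

In a default Plone setup there would be no such list: the object itself carries a single workflow state.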

* Documents can be marked as ‘retired’, which makes them appear as deleted; they won’t show up unless explicitly requested. Documents can also be deleted permanently.

Plone’s workflow mechanism is much more advanced. A default workflow includes a similar retired state. But the admin can define new workflows and modify the default one, always referring to the user role. Plone’s security model is quite advanced and is the underlying layer of every Plone functionality.
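
As a rough illustration of what a role-guarded workflow means, here is a sketch in plain Python. It is not Plone’s workflow engine: the `WORKFLOW` table, its states, transitions and role names are assumptions chosen for the example.

```python
# Hypothetical workflow table:
# (current state, transition) -> (target state, roles allowed to fire it)
WORKFLOW = {
    ("draft", "submit"): ("pending", {"Owner", "Manager"}),
    ("pending", "publish"): ("published", {"Reviewer", "Manager"}),
    ("pending", "reject"): ("draft", {"Reviewer", "Manager"}),
    ("published", "retire"): ("retired", {"Manager"}),
}


def do_transition(state, transition, user_roles):
    """Return the new state, or raise if the transition is unknown or forbidden."""
    try:
        target, allowed = WORKFLOW[(state, transition)]
    except KeyError:
        raise ValueError(f"no transition {transition!r} from state {state!r}")
    if not allowed & set(user_roles):
        raise PermissionError(f"{transition!r} requires one of {sorted(allowed)}")
    return target


print(do_transition("draft", "submit", ["Owner"]))       # prints pending
print(do_transition("published", "retire", ["Manager"]))  # prints retired
```

An administrator who wants a different ‘retired’ behaviour would, in this picture, only edit the table, not the code.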

* The repository doesn’t care much what kind of data is stored in its parts, but if it is “HTML-as-well-formed-XML”, some additional features are provided:
o link-extraction is performed, which allows to search for referers of a document.
o a summary (first 300 characters) is extracted to display in search results
o (these features could potentially be supported for other formats also)

There is no such thing in Plone. Maybe in Silva? Plone’s reference engine allows you to define associations between objects. These associations are indexed by Plone’s search engine (“catalog”) and can be searched.

* all documents are stored in one “big bag”, there are no directories.

Physically, the ZODB repository can take many forms (RDBMS, …). The default ZODB repository is a single flat file that can get quite big: Data.fs

Each document is identified by a unique ID (an ever-increasing sequence number starting at 1), and has a name (which does not need to be unique).

Each object has an ID but it is not globally unique at the moment. It is unfortunately stored in a hierarchical structure (Zope’s tree). Some Zope/Plone developers have wished for “placeless content” to be implemented. But Daisy must still be superior to Plone in that field.

Hierarchical structure is provided by the frontend by the possibility to create hierarchical navigation trees.

Zope’s tree is the most important structure for objects in a Plone site. It is too important. You can still create navigation trees with shortcuts. But in fact, the usual solution for maximum flexibility in navigation trees is to use the “Topic” content type. Topics are folder-like objects that contain a dynamic list of links to objects matching the Topic’s pre-defined query. Topics are like persistent searches displayed as folders. As an example, a Topic may display the list of all the “Photo”-typed objects that are in the “draft” state in a specific part (tree branch) of the site, etc.
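
The idea of a Topic as a persistent search can be sketched in a few lines of plain Python. This is not the Plone catalog API: `Item`, `Topic` and the list standing in for the catalog are hypothetical, and a real Topic supports many more criteria.

```python
from dataclasses import dataclass


@dataclass
class Item:
    path: str
    portal_type: str
    review_state: str


@dataclass
class Topic:
    """A stored query: content type, workflow state and tree branch."""
    title: str
    portal_type: str
    review_state: str
    path_prefix: str

    def contents(self, catalog):
        # The query is re-run on every display, so the listing stays current.
        return [
            item for item in catalog
            if item.portal_type == self.portal_type
            and item.review_state == self.review_state
            and item.path.startswith(self.path_prefix)
        ]


catalog = [
    Item("/site/gallery/sunset", "Photo", "draft"),
    Item("/site/gallery/beach", "Photo", "published"),
    Item("/site/news/launch", "News Item", "draft"),
    Item("/site/archive/old-photo", "Photo", "draft"),
]

draft_photos = Topic("Draft photos", "Photo", "draft", "/site/gallery/")
print([item.path for item in draft_photos.contents(catalog)])
# prints ['/site/gallery/sunset']
```

Adding a new draft photo under the gallery branch makes it show up in the Topic without anyone editing the Topic itself.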

* Documents can be combined in so-called “collections”. Collections are sets of the documents. One document can belong to multiple collections, in other words, collections can overlap.

Topics too? I regret that Plone does not easily offer a default way to display a whole set of objects in just one page. As an example, I would have liked to display a “book” of all the contents in my Plone site as if it were just one single object (so that I can print it…). But there are some additional Plone products (extensions) that support similar functionality. I often use “Content Panels” to build a page by defining its global layout (columns and rows) and by filling it with “views” from existing Plone objects (especially Topics). Content Panels mixed with Topics allow great flexibility in your site. But this flexibility has some limits too.

* possibility to take exclusive locks on documents for a limited or unlimited time. Checking for concurrent modifications (optimistic locking) happens automatically.

See versioning above.
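
For readers unfamiliar with the term, optimistic locking can be sketched as follows. This is a generic illustration in plain Python under my own assumptions, not Daisy’s implementation: `Document`, `ConflictError` and the `revision` counter are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a save is based on an outdated revision."""


class Document:
    def __init__(self, body):
        self.body = body
        self.revision = 1

    def save(self, new_body, based_on_revision):
        # Refuse the write if someone else saved since this editor loaded the document.
        if based_on_revision != self.revision:
            raise ConflictError(
                f"document is at revision {self.revision}, "
                f"edit was based on revision {based_on_revision}"
            )
        self.body = new_body
        self.revision += 1


doc = Document("original")
alice_rev = doc.revision  # Alice opens the document
bob_rev = doc.revision    # Bob opens it at the same time

doc.save("Alice's edit", alice_rev)  # succeeds, revision becomes 2
try:
    doc.save("Bob's edit", bob_rev)  # based on revision 1, now stale
except ConflictError as err:
    print("conflict:", err)
```

An exclusive lock, by contrast, would have prevented Bob from opening the document for editing in the first place.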

* documents are automatically full-text indexed (Jakarta Lucene based). Currently supports plain text, XML, PDF (through PDFBox), MS-Word, Excel and Powerpoint (through Jakarta POI), and OpenOffice Writer.

Same for Plone except that Plone’s search engine is not Lucene and I don’t know if Plone can read OpenOffice Writer documents. Note that you will require additional modules depending on your platform in order to read Microsoft files.

* repository data is stored in a relational database. Our main development happens on MySQL/InnoDB, but the provisions are there to add support for new databases, for example PostgreSQL support is now included.

Everything is in the ZODB, which is stored as a single file by default. It can also be stored in a relational database (but this is usually pointless). You can also transparently mix several repositories in the same Plone instance. Furthermore, instead of having Plone write directly to the ZODB’s file, you can configure Plone to go through a ZEO client-server setup so that several Plone instances can share a common database (load balancing). Even better, there is a commercial product, ZRS, that allows you to transparently replicate ZODBs so that several Plone instances set up with ZEO can use several redundant ZODBs (no single point of failure).

The part content is stored in normal files on the file system (to offload the database). The usage of these familiar, open technologies, combined with the fact that the daisywiki frontend stores plain HTML, makes that your valuable content is easily accessible with minimal “vendor” lock-in.

Everything’s in the ZODB. This can be seen as lock-in, but it is not really, because 1/ the product is open source and you can script a full export with Python with minimal effort, 2/ there are default WebDAV + FTP services that can be combined with Plone’s Marshall extension (soon to be included in Plone’s default distribution) to export your content from your Plone site. Even better, you can also upload your structured semantic content with Marshall plus additional hacks, as I mentioned somewhere else.

* a high-level, sql-like query language provides flexible querying without knowing the details of the underlying SQL database schema. The query language also allows to combine full-text (Lucene) and metadata (SQL) searches. Search results are filtered to only contain documents the user is allowed to access (see also access control). The content of parts (if HTML-as-well-formed-XML) can also be selected as part of a query, which is useful to retrieve eg the content of an “abstract” part of a set of documents.

No such thing in Plone as far as I know. You may have to Pythonize, my friend… That said, Plone’s tree gives a URL to every object so that you can access any part of the site, though not with a granularity similar to what Daisy presumably offers. See Silva for more document orientation.

* Access control: instead of attaching an ACL to each individual document, there is a global ACL which allows you to specify the access rules for sets of documents by selecting those documents based on expressions. This makes it possible, for example, to define access control rules for all documents of a certain type, or for all documents in a certain collection.

Access control is based on Plone’s tree, with inheritance (similar in some ways to the Windows security model). I suppose Plone’s access control is more sophisticated and maintainable than Daisy’s, but explaining why would require more investigation.
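
Tree-based inheritance of access rights can be sketched like this in plain Python. It only illustrates the principle, not Zope’s actual security machinery: the `LOCAL_ROLES` table, the paths and the user names are all made up for the example.

```python
import posixpath

# Hypothetical local role assignments: folder path -> {user: roles granted there}
LOCAL_ROLES = {
    "/site": {"alice": {"Manager"}},
    "/site/docs": {"bob": {"Editor"}},
    "/site/docs/private": {"carol": {"Reader"}},
}


def effective_roles(path, user):
    """Collect the roles granted on the object and on every ancestor folder."""
    roles = set()
    while True:
        roles |= LOCAL_ROLES.get(path, {}).get(user, set())
        if path in ("/", ""):
            return roles
        path = posixpath.dirname(path)  # climb one level up the tree


print(effective_roles("/site/docs/private/report", "bob"))    # prints {'Editor'}
print(effective_roles("/site/docs/private/report", "alice"))  # prints {'Manager'}
print(effective_roles("/site/docs/page", "carol"))            # prints set()
```

Daisy’s approach, as described above, would instead evaluate global rules against each document’s type or collection, regardless of where the document sits.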

* The full functionality of the repository is available via an HTTP+XML protocol, thus providing language and platform independent access. The documentation of the HTTP interface includes examples on how the repository can be updated using command-line tools like wget and curl.

Unfortunately, Plone is not ReST enough at the moment. But there is some hope the situation will change with Zope 3 (Zope’s next major release, which is coming soon). Note that Zope (and therefore Plone) supports XML-RPC over HTTP as a generic web service protocol. But this is nowhere near real ReSTful web services…

* A high-level, easy to use Java API, available both as an “in-JVM” implementation for embedded scenarios or services running in the daisy server VM, as well as an implementation that communicates transparently using the HTTP+XML protocol.

Say Python and XML/RPC here.
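
To give an idea of what XML-RPC traffic looks like, here is a small sketch using the `xmlrpc.client` module from the standard library of current Python versions. It only builds and decodes a request body, without contacting any server. The method name `get_title` and the path argument are hypothetical, not a documented Zope or Plone method.

```python
import xmlrpc.client

# Build the XML body an XML-RPC client would POST to the server.
# "get_title" and "/site/docs/page" are made-up names for illustration.
request_body = xmlrpc.client.dumps(("/site/docs/page",), methodname="get_title")
print(request_body)

# Decode it again, as the receiving side would.
params, method = xmlrpc.client.loads(request_body)
print(method, params)
```

Against a live server, `xmlrpc.client.ServerProxy` would send this kind of request for you; which URL to use and which methods are callable depends entirely on the site.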

* For various repository events, such as document creation and update, events are broadcasted via JMS (currently we include OpenJMS). The content of the events are XML messages. Internally, this is used for updating the full-text index, notification-mail sending and clearing of remote caches. Logging all JMS events gives a full audit log of all updates that happened to the repository.

No such mechanism as far as I know. But Plone of course offers fully detailed audit logs of any of its events.

* Repository extensions can provide additional services, included are:
o a notification email sender (which also includes the management of the subscriptions), allowing subscribing to individual documents, collections of documents or all documents.

No such generic feature by default in Plone. You can add scripts to send notifications on any workflow transition. But you need to write one or two lines of Python. And the management of subscriptions is not implemented by default. But folder-like objects support RSS syndication so that you can aggregate Plone’s new objects in your favorite news aggregator.

o a navigation tree management component and a publisher component, which plays hand-in-hand with our frontend (see further on)

I’ll see further on… :)

* A JMX console allows some monitoring and maintenance operations, such as optimization or rebuilding of the fulltext index, monitoring memory usage, document cache size, or database connection pool status.

You have several places to look at for this monitoring within Zope/Plone (no centralized monitoring). An additional Plone product helps in centralizing maintenance operations. There is still room for improvement here.

The “Daisywiki” frontend
The frontend is called the “Daisywiki” because, just like wikis, it provides a mixed browsing/editing environment with a low entry barrier. However, it also differs hugely from the original wikis, in that it uses wysiwyg editing, has a powerful navigation component, and inherits all the features of the underlying daisy repository such as different document types and powerful querying.

Well, then we can just say the same for Plone and rename its skins the Plonewiki frontend… Supports Wysiwyg editing too, with customizable navigation tree, etc.

* wysiwyg HTML editing
o supports recent Internet Explorer and Mozilla/Firefox (gecko) browsers, with fallback to a textarea on other browsers. The editor is a customized version of HTMLArea (through plugins, not a fork).

Same for Plone (except it is not an extension of HTMLArea but of a similar product).

o We don’t allow for arbitrary HTML, but limit it to a small, structural subset of HTML, so that it’s future-safe, output medium independent, secure and easily transformable. It is possible to have special paragraph types such as ‘note’ or ‘warning’. The stored HTML is always well-formed XML, and nicely laid out. Thanks to a powerful (server-side) cleanup engine, the stored HTML is exactly the same whether edited with IE or Mozilla, which makes source-based diffs possible.

No such validity control within Plone. In fact, the structure of a Plone document is always valid because it is managed by Plone according to a specific object model. But a given object may contain an HTML part (a document’s body, for example) that may not be valid. If your documents are to have a recurrent inner structure, then you are invited to make this structure an extension of an object class so that it is no longer handled as a document structure. See what I mean?

o insertion of images by browsing the repository or upload of new images (images are also stored as documents in the repository, so can also be versioned, have metadata, access control, etc)

Same with Plone except for versioning. Note that Plone’s Photo content type supports automatic server-side resizing of images.

o easy insertion of document links by searching for a document

Sometimes yes, sometimes no. It depends on the type of link you are creating.

o a heartbeat keeps the session alive while editing

I don’t know how it works here.

o an exclusive lock is automatically taken on the document, with an expire time of 15 minutes, and the lock is automatically refreshed by the heartbeat

I have never tried the Plone extension for versioning so I can’t say. I know that you can use the WebDAV interface to edit a Plone object with your favorite text processing package if you want. And I suppose this interface properly handles this kind of issue. But I have never tried.

o editing screens are built dynamically for the document type of the document being edited.

Of course.

* Version overview page, from which the state of versions can be changed (between published and draft), and diffs can be requested.
* Nice version diffs, including highlighting of actual changes in changed lines (ignoring re-wrapping).

You can easily move any object through its associated workflow (from one state to another, through transitions). But no versioning. Note that you can use Plone’s wiki extension, which supports diffs and some versioning features. But this is not available for every Plone content type.

* Support for includes, i.e. the inclusion of one document in the other (includes are handled recursively).

No.
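
For those wondering what recursive includes with loop detection involve, here is a minimal sketch in plain Python. The `{include:name}` syntax, the `DOCS` store and the `render` function are assumptions made for this illustration and have nothing to do with Daisy’s actual markup.

```python
import re

# Hypothetical include syntax and document store, for illustration only.
INCLUDE = re.compile(r"\{include:(\w+)\}")
DOCS = {
    "home": "Welcome. {include:news} {include:footer}",
    "news": "Latest news. {include:footer}",
    "footer": "(c) Example",
    "loop_a": "A then {include:loop_b}",
    "loop_b": "B then {include:loop_a}",
}


def render(name, stack=()):
    """Expand includes recursively; refuse documents that include themselves."""
    if name in stack:
        chain = " -> ".join(stack + (name,))
        raise ValueError(f"recursive include: {chain}")

    def expand(match):
        # Remember the chain of documents currently being rendered.
        return render(match.group(1), stack + (name,))

    return INCLUDE.sub(expand, DOCS[name])


print(render("home"))  # prints Welcome. Latest news. (c) Example (c) Example
try:
    render("loop_a")
except ValueError as err:
    print(err)  # prints recursive include: loop_a -> loop_b -> loop_a
```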

* Support for embedding queries in pages.

You can use Topics (persistent queries). You can embed them in Content Panels.

* A hierarchical navigation tree manager. As many navigation trees as you want can be created.

One and only one navigation tree by default. But Topics can be nested. So you can have one main navigation tree plus one or more alternatives built with Topics (but these alternatives are limited in some ways).

Navigation trees are defined as XML and stored in the repository as documents, thus access control (for authoring them, read access is public), versioning etc. apply. One navigation tree can import another one. The nodes in the navigation tree can be listed explicitly, but also dynamically inserted using queries. When a navigation tree is generated, the nodes are filtered according to the access control rules for the requesting user. Navigation trees can be requested in “full” or “contextualized”, this last one meaning that only the nodes going to a certain document are expanded. The navigation tree manager produces XML, the visual rendering is up to XSL stylesheets.

This is nice. Plone cannot do that easily. But what Plone can do is still done with respect to its security model and access control, of course.

* A navigation tree editor widget allows easy editing of the navigation trees without knowledge of XML. The navigation tree editor works entirely client-side (Mozilla/Firefox and Internet Explorer), without annoying server-side roundtrips to move nodes around, and full undo support.

Yummy.

* Powerful document-publishing engine, supporting:
o processing of includes (works recursive, with detection of recursive includes)
o processing of embedded queries
o document type specific styling (XSLT-based), also works nicely combined with includes, i.e. each included document will be styled with its own stylesheet depending on its document type.

* PDF publishing (using Apache FOP), with all the same features as the HTML publishing, thus also document type specific styling.

Plone’s document-like content types offer PDF views too.

* search pages:
o fulltext search
o searching using Daisy’s query language
o display of referers (“incoming links”)

Fulltext search is available. No query language for the user. Display of referrers is only available for content types that are either wiki pages or have been given the ability to include references from other objects.

* Multiple-site support, allows to have multiple perspectives on top of the same daisy repository. Each site can have a different navigation tree, and is associated with a default collection. Newly created documents are automatically added to this default collection, and searches are limited to this default collection (unless requested otherwise).

It might be possible with Plone but I am not sure when this would be useful.

* XSLT-based skinning, with reusable ‘common’ stylesheets (in most cases you’ll only need to adjust one ‘layout’ xslt, unless you want to customise heavily). Skins are configurable on a per-site basis.

Plone’s skins use the Zope Page Templates technology. This is a very nice and simple HTML templating technology. Plone’s skins make extensive use of CSS and in fact most of the layout and look-and-feel of a site is now in CSS objects. These skins are managed as objects, with inheritance, overriding of skins and other sophisticated mechanisms to configure them.

* User self-registration (with the possibility to configure which roles are assigned to users after self-registration) and password reminder.

The same is available in Plone.

* Comments can be added to documents.

Available too.

* Internationalization: the whole front-end is localizable through resource bundles.

Idem.

* Management pages for managing:
o the repository schema (the document types)
o the users
o the collections
o access control

Idem.

* The frontend currently doesn’t perform any caching, all pages are published dynamically, since this also depends on the access rights of the current user. For publishing of high-traffic, public (i.e. all public access as the same user), read-only sites, it is probably best to develop a custom publishing application.

Zope includes caching mechanisms that take care of access rights. For very high-traffic public sites, a Squid frontend is usually recommended.

* Built on top of Apache Cocoon (an XML-oriented web publishing and application framework), using Cocoon Forms, Apples (for stateful flow scenarios), and the repository client API.

By default, Zope uses its own embedded web server. But the usual setup for production-grade sites is to put an Apache reverse-proxy in front of it.

My conclusion: Daisy looks like a nice product when you have a very document-oriented project, with complex documents whose structures vary a lot from one document to another; its equivalent in Zope’s world would be Silva. But Plone is much more appropriate for everyday CMS sites. Its object orientation offers both great flexibility for the developer and more ease of use for the Joe Six-Pack webmaster. Plone still lacks some important technical features for its future, namely ReSTful web service interfaces and a placeless content paradigm. Versioning is expected soon.

This article was written in one go, late at night, and reviewed once thanks to Gouri. It may be wrong or badly lacking information on some points. So your comments are most welcome!
