Archives for the category Informatique (Computing)

Very long-term backup fabbed with a reprap ?

Will your personal data still be readable 2,000 years from now ? The Long Now Foundation blogs about a 3-inch nickel disk that can reliably hold a large amount of printed data for at least 2,000 years. Data is printed on it in a tiny font : a 750-power optical microscope is required to read the pages !

On the other side of the blogosphere, the reprap community is considering adding an ElectroChemical Milling (ECM) tool head to their home DIY 3D printers :

With this tool head, it could machine any conductive material, regardless of how hard it is or how high its melting point.

Maybe someday, personal very-long-term backups will be printed at home ?

At the moment, industrial ECM/EDM machines can « achieve a one micron positional accuracy and wire EDM walls as thin as 0.010” (.254mm) » or (ECM) make holes/traces as small as 0.2 mm wide. I guess some progress is required before a 750-power optical microscope is needed to read data printed with this technology. But maybe, sometime within the next 2,000 years, fabbers will achieve ultra-precision ? I’d be curious to know which technique was used by the Long Now Foundation project and how difficult it would be to port this technique to the wonderful world of fabbers.

Rapid prototyping with microcontrollers ?

I have no clue about micro-electronics and embedded systems. I am a Web application architect and developer, working with very high-level programming languages such as Python (or Perl or Java). I hardly remember assembly language from my childhood experiments with an Apple IIe and have almost never touched C or C++. But I have been dreaming lately of rapid-prototyping some advanced non-Web application in an embedded system using my programming skills. So I thought I could share bits of my ignorance here. Please bear with me and give me some hints to help me find my way out of the darkness ! :)

Microcontrollers are now gaining capabilities that are comparable to those of the microprocessors of early personal computers. The two most popular microcontroller (uC) series are Microchip PIC uCs and Atmel AVR uCs. For instance, the PIC18F25J10-I/SO costs around 3 to 5 euros per unit at Radio Spares (I am in France: think RS in the UK or Allied Electronics in the USA). It has the following characteristics: 40 MHz, RS-232 capabilities (serial port), a « C compiler optimized architecture », 48 kB of program memory (Flash mem) and around 4 or 5 kB of data memory (SRAM + EEPROM).

There are nice peripherals available, too. For instance this Texas Instruments CC2500 2.4GHz RF data transceiver (= transmitter + receiver) at around 2 to 3 euros per unit, or current sensors at approximately the same price. In fact, the possibilities for peripherals are limitless…

For free software hackers, there was a Linux version for such chips : uClinux. But is it still an active project ? I think I read that the common Linux kernel now includes everything that is required for it to run in embedded systems. What about GNU utilities ? I know there are things like busybox on bigger but still embedded processors (phones). Anything equivalent on microcontrollers ?

There are simulators that will… let you pretend your desktop computer has a microcontroller inside, or sort of. :)

There is at least one C library for microcontrollers. C is considered a « high-level programming language » in the embedded world ! That is to say that assembly language has been the norm. Some higher-level languages can be used with microcontrollers, including some exotic-to-me Pascal-like languages like XPlo or PMP, or Java-like but living-dead things like Virgil and… what about my beloved Python ?

There are at least 2 projects aiming at allowing Python programming on microcontrollers. pyastra is a « Python assembler translator » that can be used with some PIC12, PIC14 and PIC16 uCs. But it looks dead. PyMite looks sexier but not much more active :

PyMite is a flyweight Python interpreter written from scratch to execute on 8-bit and larger microcontrollers with resources as limited as 64 KiB of program memory (flash) and 4 KiB of RAM. PyMite supports a subset of the Python 2.5 syntax and can execute a subset of the Python 2.5 bytecodes. PyMite can also be compiled, tested and executed on a desktop computer.

At the moment, it seems like Python programming on microcontrollers is a dead end. It is not worth investing time and effort in unless you also want to have to maintain a Python compiler… The same may be true for Java, not to mention Perl. In fact, it seems to me that the object-oriented crowds are too far from microcontroller applications to generate enough interest in initiatives such as PyMite, at the moment. Oh, and I am knowingly ignoring C++ which I did not investigate, having no experience in C++.

So what is left in terms of (open source) programming languages that would be of higher level than C ? The best guess I can make is Great Cow Basic, which is a free software Basic (procedural) language. Example programs look nice to me. It has been active recently. And it supports most of the chips I would consider experimenting with.

Next steps for me, I guess, would be to pick a PIC simulator and an IDE for Great Cow Basic (any eclipse plugin ?). Then I will probably have to figure out how a Basic program can be executed on a simulated PIC, and how a PIC simulator can be useful without all of the electronics that would surround it in any real setup. I’ll look into that when I have time to pursue my investigations and experiments in this micro-world.

And piclist is a great site for beginners.

3D scannerless scanning for fabbers

For several weeks (or more), I have been dreaming of the day I’ll get my hands on a Reprap (self-parts-printing 3D desktop printer, a DIY fabber). I have been lucky enough to have a good friend promise me he would give his free time for assembling such a printer for me as long as I pay for the parts. 3 days of work are required to assemble the parts which you can order via the web in case you don’t already have access to such a reprap, which is my case. I will try to wait for the next major release of Reprap, namely Mendel 2.0 (current version = Darwin 1.0) unless I can’t resist temptation long enough…

Anyway, I have mainly been dreaming of possible applications of fabbers. Their use is extremely competitive (and disruptively innovative) as soon as you want to print customized 3D shapes which can’t be bought from the mass-manufacturing market. For instance, a reprap is cool when you want to print a chocolate 3D version of your face (see the Fab@Home project) or a miniature plastic representation of your home or anything that has a shape which is very specific to your case (not to mention the future goal of printing 90% of complex systems such as robots, portable electronic devices including phones and… fabber-assembling robots…). And this is where 3D scanning is a must : with a 3D scanner, you can scan an existing object and build a 3D model from it which you can then modify and print at the scale you want.

So my dreams led me to this question : I could get a fabber some time soon but how to also get a desktop 3D scanner ? Some people have already started hacking home 3D scanners. But I had also heard of techniques that allow users to build 3D models from existing objects using either a single picture of the object, 2 pictures, several images or even a small movie. Some techniques require that the parameters of the camera(s) are known (position, angles, distance, …). Some techniques require 2 cameras in a fixed and known setup (stereophotography). Some techniques require that the camera is fixed and the object lies on a turntable. I really know nothing about computer vision and the world of 3D techniques so I was happy to learn new words such as « close-range photogrammetry », « videogrammetry », « structure from motion », « matchmoving », « motion tracking » (which is the same as matchmoving) or « 3D reconstruction ». After some Web wandering, I identified several open source (of course) software packages that could offer some workable path from existing physical objects to 3D models of them using plain cameras or video cameras.

The idea would be the following :

- you take an existing, very personal object, for instance your head !
- with a common digital camera, you take pictures of your head from several angles
- you load these pictures into your favorite 3D reconstruction free software package
- it creates a 3D model of your head which you can then export to a 3D editor for possible adjustments (think Blender)
- you export your corrected 3D model into the reprap software stuff
- your reprap fabs your head out of plastic (or chocolate ?)

Here are the software projects I identified :

From a single image :
- jSVR, Single View Reconstruction, a semi-automatic process for identifying and exporting three-dimensional information from a single un-calibrated image, dead project ?
Using a turntable :
- Archimedes, Shape Reconstruction from Pictures, A Generalized Voxel Coloring Implementation, almost inactive ?
From stereo images :
- BLIX, Blender Extensions allow 3D measurement using pairs of calibrated camera views and image-based modeling using pairs of calibrated camera view, probably abandonware :-(
- reconststereo, 3D reconstruction using Stereo Vision, System Prototype to make 3D reconstruction solution using stereo images, Rest In Peace
- Stereo: Photo-metrology, Easy to use software and versatile methods of digitizing 3D objects, Active and growing !
- EStereo/StereoPlus (see also this page), allowing a user to load images and compute their disparity image and visualize/manipulate their 3D reconstruction
From a movie or a sequence of pictures :
- e-Foto, a free GNU/GPL educational digital photogrammetric workstation, but is it suitable for close-range photogrammetry ?
- Voodoo Camera Tracker, a tool for the integration of virtual and real scenes, estimates camera parameters and reconstructs a 3D scene from image sequences ; oops, this is not free software but freeware only
- Octave vision, Algorithms for the recovery of structure and motion, using Octave, a one-shot development, no future…
- Tracking / Structure from Motion, another piece of student homework
- libmv, a structure from motion library, which plans to one day take raw video footage or photographs, and produce full camera calibration information and dense 3D models, very promising but being rewritten at the moment (August 2008)
- GPU KLT, a high-performance research implementation
Using the shadow of a stick (!) :
- Scanning with Shadows (see also this site), wave a stick in front of a light source to cast a shadow on the object of interest, and figure out its 3D shape by observing the distortion of the shadow
Don’t know which technique is used :
- OpenCV (see also this site), Intel’s Open Computer Vision library may some day contain some 3D reconstruction capabilities
- Voxelization, a .NET based framework, designed for helping in development of different volume reconstruction, 3D voxel visualization and color consistency algorithms in multi view dynamic scenes, dead project ?
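As a side note on the stereo entries in the list above: a disparity image, such as the one EStereo is described as computing, can be turned into depth with the standard pinhole stereo relation Z = f × B / d. Here is a minimal Python sketch of that relation, assuming two rectified, parallel cameras. The function name and all numeric values are made up for illustration and come from none of the listed packages.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (in metres) of a point seen by two rectified, parallel cameras.

    Standard pinhole stereo relation: Z = f * B / d.
    This helper is hypothetical and not taken from any listed package.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means a point at infinity)")
    return focal_length_px * baseline_m / disparity_px


# Made-up setup: 800-pixel focal length, cameras 10 cm apart.
for d in (80.0, 40.0, 20.0):
    z = depth_from_disparity(d, focal_length_px=800.0, baseline_m=0.10)
    print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

The smaller the disparity, the farther the point, which is why close-range objects (a head on a table) are the easy case for this kind of setup.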

My personal conclusion :

I haven’t tested any of these packages. At the moment, there seems to be no easy-to-use free software package that would compare to commercial stuff such as Photomodeler or ImageModeler or research works such as Microsoft Photosynth. However these techniques and algorithms seem to be mature enough to become available as open source packages soon, especially given the emerging interest in 3D scanning for fabbers ! The most promising free packages for scannerless 3D scanning for fabbers are probably Stereo and libmv.

What do you think ?

Alitheia Core from SQO-OSS for measuring code quality

A research project funded by the European Commission (SQO-OSS) distributes, under an open source license of course, a piece of software that analyzes the quality of a program's source code. This software is called Alitheia.

Alitheia crawls code repositories of the Subversion/CVS kind (notably those of SourceForge). Plugins provide measurements of the code (number of lines of code, number of comment lines, etc.). Alitheia modules compute statistics from these measurements in order to estimate the overall quality of the analyzed product. Alitheia comes either as a Web application or, soon, as an Eclipse plugin.

The practical value of Alitheia currently seems limited to me: few metrics are available in the online demo version, the Eclipse version is not available yet, and measurements are made at the level of each source file and do not yet seem to be aggregated at the level of the project itself (you can find out how many comment lines there are in a given file but not in the whole project). For now, the most amusing feature seems to be the measurement of each developer's « productivity ».
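To make the per-file versus project-level distinction concrete, here is a minimal Python sketch of the kind of measurement described above. It is not Alitheia's code or API: the function names are hypothetical, only `#` comment lines are recognized, and the file contents are made up.

```python
def measure_source(text):
    """Count code lines and comment lines in one Python-style source text."""
    code = comments = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored
        if stripped.startswith("#"):
            comments += 1
        else:
            code += 1
    return {"code": code, "comments": comments}


def aggregate(per_file):
    """Sum the per-file measurements into project-level totals."""
    totals = {"code": 0, "comments": 0}
    for metrics in per_file.values():
        for name, value in metrics.items():
            totals[name] += value
    return totals


# Made-up file contents standing in for a checked-out repository.
project = {
    "a.py": "# helper\nx = 1\n\ny = 2\n",
    "b.py": "# main\n# entry point\nprint('hi')\n",
}
per_file = {name: measure_source(text) for name, text in project.items()}
print(per_file)
print(aggregate(per_file))
```

The aggregation step is the part that seems to be missing from the demo: once per-file numbers exist, project-level totals are a simple sum.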

In the long run, this software looks very promising to me. Its value will mostly depend on the richness of the available measurement plugins, on the existence of a public site for comparing, for example, the flagship projects of SourceForge and Tigris with one another, and on Alitheia's ability to produce meaningful aggregated indicators. As for measurement plugins, I hope we will get not only plugins measuring characteristics of the code but also (and perhaps above all) plugins measuring the quality of the project's community: frequency and delay of answers on the mailing lists, traffic on the support IRC channel, number and quality of add-on plugins and modules, lifetime of a release, etc. To be continued!

(via Le Monde Informatique)

Call for public-interest IT projects

Do you know an IT project that could help make the world a better place? Save the planet? Create an Internet innovation of public utility? Or just make life easier for your non-profit? Advance a great cause or a tiny one? Advance science? Then answer this call, because I think I can boost this project by recruiting corporate donors of IT skills for it.

As part of my new company, I offer my professional services to any public-interest IT project: I provide (at zero cost, see below) my skills as a manager of innovative IT projects as well as access to the skills of a great many other software engineers, on their working time. You want software engineering skills to make the world a better place? Here they are!

Note that I set, a priori, no restriction on theme or field: fight against poverty, scientific research, environmental protection, health, disability, child protection, etc. It doesn't matter, as long as the project truly serves the general interest and public utility (see below).

Conditions to meet

For my company to be able to step in, your IT project absolutely must:

- be « of general interest » (« d’intérêt général »), i.e. be run by an organization entitled, in France, to issue tax receipts in exchange for the donations it receives (corporate philanthropy, or « mécénat »)
- not be a tiny project: it must require at least 1 full-time engineer from the donors
- be run by a team that is already active: I can provide between 2 and 5 times the time you already spend on the project, as volunteers or employees; if you are not already working on the project, I can't do anything (0 times 2 equals 0!)
- be a project that is truly worth it: have a real social impact, direct or indirect, a clearly measurable and motivating usefulness, address a societal challenge on a small or large scale, be a source, lever or driver of change for society…
- not require significant physical presence outside the Paris region (I am starting small and close to home, even though I am a fan of remote work and conference calls); in short, be located near Paris

What is a public-interest IT project?

An IT project is of general interest if it is run by an organization that benefits from the French tax regime for corporate philanthropy (« mécénat »). Ah, a mystery: what on earth is that? The French law of August 2003 on corporate philanthropy remains little known, but it represents an important source of revenue for public-interest organizations. Several types of organizations meet this criterion. To keep it simple, it can be a non-profit association under the French law of 1901 (« association loi 1901 »):

- that is non-profit: it does not remit VAT, does not pay corporate income tax, has volunteer and disinterested board members and officers, and does not compete with commercial businesses, or else does so at prices far below the market, mainly for a disadvantaged public and without « commercial practices » (advertising, …); ask an accountant for advice if needed
- and whose purpose is philanthropic, educational, social, humanitarian, sports-related, family-related, cultural, artistic, environmental, literary, scientific…
- and whose activities do not benefit a restricted circle of people (unlike trade unions or a school's alumni association, for example…)

If needed, an association loi 1901 can easily be created to carry the project (bylaws and registration at the préfecture) and meet the general-interest conditions. There is no condition on the age or size of the association. Nor is it necessarily required to obtain an administrative approval (as would be the case for associations « reconnues d’utilité publique », a recognition that is very hard to obtain these days).

To learn more about the notion of general interest, I invite you to consult the mécénat website of the French Ministry of Culture as well as the explanations of the Association pour le Développement du Mécénat Industriel et Commercial (ADMICAL).

How can I help, in practice?

If you already devote time to your project, I can multiply that effort.

Example: with 4 other volunteers, you each devote at least one day per week to your project (i.e. one full-time equivalent, 5 working days per week); I can then provide you, in addition, with the equivalent of 2 full-time engineers (10 working days per week), or even more if your project is very simple to manage.

This help will take the form of:

- ongoing support from my company: at least half a day of assistance and advice per week, depending on the size of your project; plus representation of your project to the donor companies and follow-up with them,
- individual contributions from a large number (50, 100, 200…?) of IT professionals, engineers, technicians or consultants, for variable and sometimes short periods (one week, for example), on their working time,
- the possibility of strengthening your volunteer team through later contributions by some of these people on their free time (possibly building an open-source-style community if your project lends itself to it),
- access to a secure Web-based information system to manage your project, your contributors and your relations with donors, and to automate all the administrative paperwork that comes with it (contracts, philanthropy agreement, tax receipts, …)

How does it work?

I am currently creating a social-purpose company whose goal is to give social innovators the same IT resources as the most modern companies have. My business relies on the philanthropy of IT services companies (SSII) that are committing to « sustainable development » (or, more precisely, « corporate social responsibility ») initiatives. They wish to practice skills-based philanthropy in IT (« mécénat de compétences ») through me: donating the working time of their engineers and consultants in the form of a free service delivery managed via the Web. I call this « doing wecena » (Wecena is the name of my company!).

The funding of this help is indirectly provided at 100% by the French State, thanks to the law on corporate philanthropy. The State grants a substantial tax reduction to any company that decides to concretely help a public-interest organization (donation of money, in-kind donation, donation of skills and working time…). The donor SSIIs I meet are ready to embark on the adventure by offering their engineers the chance to move your project forward during those periods known as « inter-contrat » (or intercontrat, or « stand-by period », or …): periods of a few days to a few months that begin when an engineer finishes a project for one client and has not yet been assigned to another project for a new client.

This imposes an important constraint on the management of your project: the engineers delivering the service will take over from one another at a very fast pace; some will be present for only 48 hours while others will be available 2 or 3 months a year. The average length of an individual assignment lies somewhere between one week and one month (depending on the person's job and the state of the IT market, and also on the donor's policy). It is my company's role to help you manage this constraint. Note that this constraint also has some advantages: if your project is simple enough and can be « split » into small tasks (with the help of suitable management methods and tools, which I provide), you will get the chance to present your cause to a multitude of contributors whom you can then recruit as potential volunteers once their wecena assignment is over. This is for instance the case for projects involving introductory computer training, running computer workshops for disadvantaged people, or multiple PC or local network installation jobs… For more complex projects (development, consulting, …), your involvement is greater and wecena cannot represent more than 2 times the time you already devote to the project.

A few example projects

To help you get an idea of the kind of projects that can benefit from wecena, here are a few examples of projects I have already presented to donors:

- design and implementation of innovative software to make keyboard and mouse use easier for people with a motor disability
- improvement of the IT infrastructure of an NGO fighting social exclusion: replacement of a fleet of workstations, system administration work on file and application servers, …
- deployment of a financial reporting software package for project-based service delivery, for an association receiving substantial public subsidies
- overhaul of Web applications for document management, contact and relationship management and membership management, for an Internet association in the field of family and child protection
- creation of a blog by a public letter-writer (« écrivain public ») of a French-African NGO to raise French students' awareness of North-South development issues
- assistance with moving a healthcare facility management system to the Web, for an association in the health and social sector
- introductory computer courses and training on in-house software for retired volunteers of a humanitarian association

These are just a few examples to set the tone. None of these projects has started yet.

A word of warning

My company is still in a start-up and experimentation phase. I currently cannot guarantee either that your particular project will be selected by a donor (the most solid and most ambitious projects will of course stand a better chance) or even that I can start supporting you right away. The help I can bring you is itself a project (starting a company…): I believe in it enormously, since I left my previous employer to embark on this adventure, and I devote all my time and skills to it. That said, starting this kind of innovative social enterprise takes time and also involves a share of risk, of uncertainty, in short of adventure… The first project I support could start in late 2008 (if the stars align as planned) or in early 2009 at the latest (if I am less lucky). The donors I meet are already on a war footing and have already begun examining the IT projects I present to them. Some have already expressed their preference and are getting into battle order… Fingers crossed, I hope a first project could start shortly after the start of the 2008 school year.

To join the adventure…

Do you know a team that runs a public-interest IT project and needs software engineers' time to go further and faster? Forward them the address of this article!

Does your project meet the conditions presented above? To make sure, ask the question in a comment below or contact me directly by email at the following address: projets (at) wecena (dot) com or else at my blogger address: sig (at) akasig (dot) org. My company's website should not open its doors before the first project starts. In the meantime, this is where it happens. Do you have advice to give me, opinions or contacts to share, or suggestions to make? They will be welcome: I also invite you to use the comments feature of this blog.

Plone + Freemind = eternal love ?

Congratulations to Plone and Freemind, two great open source software packages, which recently celebrated their wedding and promptly released a newborn « Plone Freemind v.1.0 » extension product for Plone. I have been really fond of Plone and Freemind for several years now. It’s good news to learn that Freemind mindmaps can now be published and managed via a Plone site… even though I have yet to imagine some valuable use for this ! :)

Skills-based philanthropy in IT (mécénat de compétences): a market with a future!

A study commissioned by Admical, the association for the development of corporate philanthropy, confirms it: my small company for skills-based philanthropy in IT is indeed in a growing market. A few figures taken from the study [Edit: with my comments, not in italics]:

- 64% of corporate philanthropy comes from the services sector; « my » donors are IT services companies,
- 47% of donors are active in the solidarity sector; most of the associations I represent are in this sector,
- 45% of companies with more than 200 employees practice skills-based philanthropy (vs. 31% in the previous study); my offer is a skills-based philanthropy scheme in IT, for IT services companies

Pierre Levy vs Tim Berners-Lee, round 0.1

Yesterday, I attended a research seminar at the « Université de Paris 8 ». Pierre Levy is a philosopher and professor and head of the collective intelligence chair at the University of Ottawa, Canada. He presented the latest developments in his work on IEML, which stands for Information Economy Meta Language. Things are taking shape on this side and this presentation gave me the opportunity to better understand how IEML compares to the technologies of the Semantic Web (SW).

IEML: not another layer on top of the SW cake

IEML is proposed as an alternative to SW ontologies. In the SW, the basic technology is the URI (Uniform Resource Identifier), which uniquely (and hopefully permanently) identifies a concept (a « resource »). Triples then combine these URIs into assertions which then form a graph of meaning that is called an ontology. IEML introduces identifiers which are not URIs. The main difference between URIs and IEML identifiers is that IEML identifiers are semantically rich: they carry meaning. From a given IEML identifier, one could derive some (or ideally all?) of the semantics of the concept it identifies. Indeed these identifiers are composed of 6 semantic primitives. These 6 primitives are Emptiness, Virtual, Actual, Sign, Being, Thing (E, V, A, S, B and T) and were chosen to be as universal as possible, i.e. not dependent on any specific culture or natural language. The IEML grammar is a way to combine these primitives and logically build concepts with them (also using the notion of triples-based graphs). These primitives are comparable to the 4 bases of DNA (A, C, T and G) that are combined into a complex polymer (DNA) : with a limited alphabet, IEML can express an astronomically huge number of concepts in the same way the 4-letter alphabet of DNA can express a huge number of phenotypes.
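To get a feel for the « limited alphabet, huge number of concepts » argument, here is a rough Python sketch that simply counts fixed-length sequences over a 6-symbol and a 4-symbol alphabet. This is only an order-of-magnitude illustration under a strong assumption: it ignores the IEML grammar entirely and treats identifiers as plain strings, which the real language does not do. The function name and the chosen lengths are arbitrary.

```python
IEML_PRIMITIVES = 6  # number of semantic primitives mentioned above
DNA_BASES = 4        # A, C, T and G


def count_sequences(alphabet_size, length):
    """Number of distinct sequences of `length` symbols over the alphabet.

    Assumption: any sequence is allowed, which ignores IEML's grammar.
    """
    return alphabet_size ** length


# Arbitrary lengths, chosen only to show how fast the count grows.
for length in (3, 9, 27):
    print(length,
          count_sequences(IEML_PRIMITIVES, length),
          count_sequences(DNA_BASES, length))
```

Whatever the real grammar allows or forbids, the exponential growth is what makes a tiny alphabet sufficient.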

Meaningfulness of identifiers

When I realized that the meaningful IEML identifiers are similar in their role to URIs, my first reaction was horror. I have struggled for years against « old-school » IT workers who tend to rely on database keys for deriving properties of records. In a former life in the IT department of a big industrial corporation, I was highly paid to design and impose a meaningless unique person identifier in order to uniquely and permanently identify the 200 000 employees and contractors of that multinational company in its corporate directory. The main advantage of meaningless identifiers is probably that they can be permanent: you don’t have to change the identifier of an object (of a person for instance) when some property of this object changes over time (the color of the person’s hair, or Miss Dupont getting married and becoming Mrs Durand while still keeping the same corporate identifier).

The same is true for URIs whenever it is feasible: if a given resource is to change over time, its URI should not be dependent on its variable property (http://someone.com/blond/big/MissDurand having to change into http://someone.com/white/big/MissesDupont is a bad thing).

The same may not be true when concepts (not people) are to be identified. Concepts are supposed to be permanent and abstract things with IEML (as in the SW I guess). If some meaningful semantic component of a given concept changes then… it’s no longer the same concept (even though we may keep using the same word in a natural language in order to identify this derived concept).

In the old days, IT workers used to introduce meaning in identifiers so that (database) records could more easily be managed by humans, especially during tasks like visually classifying or sorting records in a table or getting an immediate overview of what a given record is about. But this came to be seen as bad practice once the cost of storage (having specific fields for properties that used to be stored as part of a DB key) and the cost of computation (getting a GUI for querying/filtering a DB based on properties) got lower. More often than not, the meaningful key was not permanent, and this introduced perverse effects, including having to assign a new key to a given record when some property changed, or managing human errors when the properties « as seen in the key » were no longer in sync with the « real » properties of the record according to some field.
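The contrast between the two kinds of keys can be sketched in a few lines of Python. This is a minimal illustration, not the design of any real corporate directory: the `Person` class, its fields and the key format are all hypothetical, and a random UUID stands in for the meaningless identifier.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    hair: str
    # Meaningless identifier: assigned once, never derived from a property.
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

    def meaningful_key(self):
        """Key built from properties: readable, but only valid until they change."""
        return f"{self.hair}/{self.name}"


person = Person(name="Miss Dupont", hair="blond")
uid_before, key_before = person.uid, person.meaningful_key()

# Properties change over time...
person.name = "Mrs Durand"
person.hair = "white"

print(person.uid == uid_before)               # is the opaque identifier unchanged?
print(person.meaningful_key() == key_before)  # does the property-based key still match?
```

Anything that stored the old property-based key (another table, a bookmark, a paper form) is now out of sync, which is exactly the perverse effect described above.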

That’s probably part of the rationale behind the best practices in URI design and web architecture: a URI should be as permanent as possible, I guess, so that it does not have to change when the properties of the resource it identifies change over time. Thus web architectures are made more robust over time.

With IEML, we are back to the ol’ times of meaningful identifiers. Is it such a bad thing ? Probably not, because the power of IEML lies in the meaningfulness of these identifiers, which allows all sorts of computational operations on the concepts. Anyway, that’s probably one of the biggest basic differences between IEML and the SW ontologies.

Matching concepts with IEML

Another aspect of IEML struck me yesterday: IEML gives no magic solution to the problem of mapping (or matching) concepts together. In the SW universe, there is this recurring issue of getting two experts or ontologies to agree on the equivalence of 2 resources/concepts: are they really the same concept expressed with distinct but equivalent URIs ? or are they distinct concepts ? How can semantic ambiguities be resolved ? Unless we get a solution to this issue, the grand graph of semantic data can’t be universally unified and people get isolated in semantic islands which are nothing more than badly interconnected domain ontologies. This is called the problem of semantic integration, ontology mapping, ontology matching or ontology alignment.

A couple of years ago, I hoped that IEML would solve this issue. IEML being such a regular and to-be-universal language, one could project any concept onto the IEML semantic space and obtain the coordinates (identifier) of this concept in this space. A second person or expert or ontology could also project its own concepts. Then it would just be a matter of calculating the distance between these points in the IEML space (IEML provides ways of calculating such distances). And if the distance was below some threshold, the 2 concepts could then be considered equivalent for a given pragmatic purpose.

But yesterday, I realized that the art of projecting concepts into the IEML space (i.e. assigning an identifier to a concept) is very subjective. Even though a Pierre Levy could propose a 3000-concept dictionary that assigns IEML coordinates (identifiers) to concepts that are also identified by a short natural language sentence (like in a classic dictionary), this would not prevent a Tim Berners-Lee from coming up with a very different dictionary that assigns different coordinates to the same described concepts. Thus the distance between a Pierre-Levy-based IEML word and a TBL-based IEML word would be … meaningless.

In the SW, there is a basic assumption that anyone may come up with a different URI for the same concept, and the URIs have to be associated via a « same as » property so that they are said to refer to the very same concept. When you get to bunches of URIs (2 ontologies for instance), you then have to match the URIs which refer to the same concepts. You have to align these ontologies. This can be a very tedious, manual and tricky process. The SW does not unify concepts. It only provides a syntax to represent and handle them. Humans still have to interpret them and match them together when they want to communicate with each other and agree on the meaning that these ontologies carry.

The same is more or less true with IEML. With IEML, identifiers are not arbitrarily defined (meaningful identifiers) whereas SW URIs are almost arbitrarily defined (meaningless identifiers). But the meaningful IEML identifiers only carry human meaning if they refer to the same (or similar) human/IEML dictionary.

Hence it seems to me that IEML is only valuable if some consensus exists about how to translate human concepts into the IEML space. It is only valuable to the extent that there is some universally accepted IEML dictionary, at least for basic concepts (primitives and simple combinations of IEML primitives). The same is true in the universe of SW technologies, and there are some attempts at building « top ontologies » that are proposed as shared reference points for ontology builders to align their own ontologies with. But the alignment process, even if theoretically made easier by the existence of these top ontologies, is still tricky, tedious and costly. And critical mass has not been reached in the shared use of such top ontologies. There is no top consensus to refer to.

Pierre Levy proposes a dictionary of about 3000 IEML words (identifiers) that represent almost all possible low-level combinations of IEML primitives. He invites people to enhance or extend his dictionary, or to come up with their own dictionaries. Let’s assume that only minor changes are made to the basic Pierre Levy dictionary. Let’s assume that several conflicting dictionary extensions are made for more precise concepts (higher-level combinations of IEML primitives). Given the fact that these conflicting extensions still share a basic foundation (the basic Pierre Levy dictionary), would the process of comparing and possibly matching IEML-expressed concepts be made easier ? Even though IEML does not give any automagical solution to the problem of ontology mapping, I wonder whether it makes things easier or not.

In other words, is IEML a superior alternative to SW ontologies ?

Apples and bananas

Yesterday, someone asked: « If someone assigns IEML coordinates to the concept of bananas, how will these coordinates compare to the concept of apples ? » The answer did not satisfy me because it was along the lines of : « IEML may not be the right tool for comparing bananas to apples. ». I don’t see why it would be more suitable for comparing competencies to achievements than for comparing bananas to apples. Or I misunderstood the answer. Anyway…

Pierre Levy put much effort into describing the properties of his abstract IEML space so that IT programmers could start programming libraries for handling and processing IEML coordinates and operations. There is even a programming language being developed that allows semantic functions and operations to be applied to IEML graphs and lets quantities (economic values, energy potentials, distances) flow along IEML-based semantic graphs. Hence the name of Information Economy.

So there are (or will soon be) tools and services for surviving in the IEML space. But I strongly feel that there is a lack of tools for moving back and forth between the world of humans and the IEML space. How would you say « bananas » in IEML ? Assuming this concept is not already in a consensual dictionary.

As far as I understand, the process of assigning IEML coordinates to the concept of « bananas » is somehow similar to the process of guessing the « right » (or best?) Chinese ideogram for bananas. I don’t speak Chinese at all. But I imagine one would have to combine existing ideograms that would best describe what a banana is. For instance, « bananas » could be written with a combination of the ideograms that mean « fruits of herbaceous plant cultivated throughout the tropics and grow in hanging clusters ». It could also be written with a combination of the ideograms that mean « fruits of the plants of the genus Musa that are native to the tropical region of Southeast Asia and Australia ». Distinct definitions of bananas could refer to distinct combinations of existing IEML concepts (fruits + herbaceous plant + hanging clusters + tropics or fruits + plants + genus Musa + Southeast Asia + Australia). Would the resulting IEML coordinates be far away from each other ? Could a machine infer that these concepts are closely related if not practically equivalent to each other ? How dependent would the resulting distance be on conflicts or errors in underlying IEML dictionaries ?
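To make the worry concrete, here is a toy comparison of the two banana definitions treated as plain sets of component concepts. It is a sketch under strong assumptions: the Jaccard distance and the 0.5 threshold are stand-ins chosen for the example, and they are not the distance that IEML itself defines.

```python
def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means the two sets are identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# The two combinations of concepts listed in the text.
banana_1 = frozenset({"fruits", "herbaceous plant", "hanging clusters", "tropics"})
banana_2 = frozenset({"fruits", "plants", "genus Musa", "Southeast Asia", "Australia"})

THRESHOLD = 0.5  # arbitrary value, for illustration only

distance = jaccard_distance(banana_1, banana_2)
equivalent = distance <= THRESHOLD

print(distance, equivalent)
```

The two definitions share a single component out of eight distinct ones, so this naive measure puts them far apart even though both describe bananas. A useful IEML distance would have to know that « herbaceous plant » and « plants », or « tropics » and « Southeast Asia », are themselves close.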

I ended the day with this question in my mind: How robust is the IEML translation process to human conflicts, disagreements and errors ? Is it more robust than the process of building and aligning SW ontologies ? Its robustness seems to me to be the main determining factor in the feasibility of the new collective-intelligence-based civilization Pierre Levy promises. If only there were a paper comparing this process to what the SW already provides, I guess people would realize the value of IEML.

The web design rap

Having trouble getting to grips with the basics of web site design? Then learn by heart the lyrics of the rap by « Poetic Prophet », the search engine optimization rapper. He’s a good guy (he uses Firefox). Yo.

And here is what it looks like in his training sessions (the only regret is that not everyone stands up and raises their arms).

(via The E-Learning Curve)

Speaking despite a disability: the example of Steria

Steria is one of those IT services companies (SSII) that pioneered corporate philanthropy. Through its foundation, it supports IT projects of general interest, both with funding and through the volunteer « sponsorship » of an employee, encouraged by the company. One example of a supported project: a piece of software (open source of course! but I could not find its site…) that lets children with motor disabilities express themselves through speech synthesis and pictograms.

No to personal data aggregators

I like this open letter from Philippe Coueignoux to the European Commission about Google’s acquisition of doubleclick. On the technical side, the message is this: personalizing advertising does not require collecting, retaining or transferring personal data, in short aggregating it. For consumers to benefit from personalized ads, one can do otherwise (and better) than create Big Brothers in the style of Google + doubleclick.

Ah, if only certain « tech companies » (you know who I mean) managed to understand this… There is a market to be taken here and yet everyone tries to follow Google and get their slice of the Big Brothership cake… :-( Thank you Philippe Coueignoux! I don’t know what his own alternative is worth, but his point of view is worth gold. After the rise of « green business » (green products), maybe we will finally see the rise of « privacy business »?

Personalization via Slashdot

Back from vacation, my first reflex was of course to catch up on my Dilbert strips. Then I started going through Slashdot. For August and July, I am surprised to see quite a few news items presenting applications of personalization technologies:

OpenAds, an open source ad server, with targeting and privacy protection (see also this CNet article)!
a therapy recommender that works from what your cancerous tumor looks like,
a (personalized?) RSS feed filter (it is not the first one)

My new project

Christian and Etum sensed it. :-) I am setting off on a new adventure: creating a social enterprise whose purpose is to create Internet Innovations of Public Utility.

Two years ago, I tried to identify who, in France, could be a creator of Internet Innovations of Public Utility. I did not come back from my expedition empty-handed… but almost. For the past two years, I have worked on creating Internet (and mobility) innovation for commercial purposes (research on Web 2.0ish mobile applications). Now I am going to try adding the « public utility » or « general interest » ingredient to the sauce and see whether it sets as a professional (and commercial) activity.

Concretely, I left my job last month and I am preparing my project. It was decided quickly: a layoff plan was announced in the spring and one of my colleagues, whose skills are close to mine, was in the firing line. At the same time, I had my idea and the promise of a small nest egg if I volunteered to leave in place of that colleague. Everyone agreed and off I went.

I started by acquiring some knowledge I was missing (notably in legal matters), testing my idea with prospects, mobilizing a few suppliers and designing a bit of software tooling. My first commercial contacts are rather positive but nothing is settled until something is done or signed! So I remain cautious.

In my project, there are plenty of healthy ingredients: a big helping of open source and IT services, a base of corporate citizenship and sustainable development policies, a spicy sauce of Economy of Communion, maybe a pinch of cooperativism and as much social innovation and general interest as possible.

My paranoid genes make me a bit uneasy about telling you everything here today: enthusiasm and extraversion oblige, I would love to tell you everything, but I am a bit afraid that if I say too much, my idea will be hijacked before I know it. That is probably silly. It is all the sillier since I would like my idea to be taken up by others! But I don’t know how yet, so I need to think about it a bit more before unpacking everything any old way.

In the meantime, I invite you to dream. I have a genie in my bottle. It can make your wildest Internet innovations of public utility come true. Just make the wish by posting a comment below. Which Internet use, service or technology should be spread around the world to make it better, to help solve certain problems of society? Which are the cruelest problems of society for which no Internet technology exists (yet) solely for lack of an obvious commercial interest? Who are the social entrepreneurs who could multiply their power for social change if only someone forged them a few good modern tools? Which social issue is closest to your heart, so that my genie can devote a little of its magic to it?

eXtreme Consulting?

Can agile methods such as eXtreme Programming (XP) be applied to consulting activities? What could eXtreme Consulting (XC) mean? Do you need to analyze the whole big picture before starting the delivery of good recommendations?

In XP iterations, users tell user stories which are prioritized and then transformed by rotating pairs of programmers into tested features. These features enable new uses of technology.

In XC iterations, I guess there would be decision makers telling decision making stories. These stories would be prioritized and then transformed by rotating pairs of consultants into argued and accepted recommendations. These recommendations would enable new decisions.

What about the agility of decision makers themselves, people who are to lead changes in their scope of responsibility? Couldn’t they follow similar methods and benefit from eXtreme Change Making (XCM)? In XCM iterations, there would be change leaders telling change leadership stories. These stories would be prioritized and then transformed by rotating pairs of change makers into tested change commitments from stakeholders in the organization. These commitments would enable changes in the organization, its rules and processes.

Had you ever heard of XC or XCM before reading this? What do you think? Why would such things be of any interest?

A call for agility

Just imagine… An IT project… A team of IT people… And a staff turnover rate worse than anyone’s nightmares about the worst offshore IT services companies in India: people rarely stay more than 3 weeks to a month on the project!

This is the methodological challenge of solidarity missions for contractors between assignments (« inter contrat »): contractors who take turns in a project team working for an NGO until a « real » assignment is found for them. Badr Chentouf stressed the importance of this methodological challenge in a recent comment. How do you make such a project team productive when most of its members stay only a very short time? How do you limit the « entry cost » onto the project? How do you shape the learning curve?

In such a context, even the most agile methods could seem hopelessly rigid and inadequate, couldn’t they? eXtreme Programming was not designed to handle this kind of situation, was it?

And even if, in open source communities, one can step in briefly to propose a patch or fix a bug, the community rests on permanent pillars who have followed the project for many years and ensure that the Brownian motion of contributors translates into real progress in the medium term.

The most ambitious knowledge management tools propose to share the knowledge of a company’s experts before they retire. But it still takes long weeks of interviews, modeling and fine-tuning before there is a system that is useful to the expert’s successors. What can be done in 15 days? Incrementally…

So what to do? How to organize the work and manage its continuity? How to coordinate it? How to transfer knowledge to newcomers in record time?

Add two or three interns to the team who are there for six months and guarantee continuity of knowledge? Use methods like XP with an emphasis on « pair programming » and rotating pairs? On the contrary, go all in on formal modeling with heavy doses of UML? Model the project’s knowledge in an overengineered knowledge management system (a « corporate memory »)? Conversely, swear only by wikis? Rely on a benevolent but volunteer dictator who is not on site and acts as a « gatekeeper » for the code produced? Invent a new agile method that makes its cousins green with envy?

I don’t have the complete solution, but if we found it, it would put the best technologies within reach of the ambitions of the most innovative social entrepreneurs. What lines of thought could you share?

Data mining vs. terrorists: terrorists win and citizens lose

Bruce Schneier is the kind of world-class computer security expert I love: he knows what he is talking about and does not overestimate the capabilities of computer technologies, however fancy they are. With an extremely simple mathematical explanation, he shows how dangerous, expensive and inefficient data mining technologies can be for identifying terrorist threats.
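The kind of arithmetic involved is a base-rate calculation, which can be sketched in a few lines of Python. All the figures below are made-up assumptions chosen for illustration; they are not the numbers used in Schneier's essay.

```python
# All values are hypothetical, picked only to show the shape of the argument.
population = 300_000_000      # people whose data is mined
actual_threats = 1_000        # real plotters hidden in that population
detection_rate = 0.99         # share of real plotters the system flags
false_positive_rate = 0.01    # share of innocent people flagged by mistake

true_alarms = actual_threats * detection_rate
false_alarms = (population - actual_threats) * false_positive_rate

# Probability that a flagged person is a real threat.
precision = true_alarms / (true_alarms + false_alarms)

print(f"true alarms:  {true_alarms:,.0f}")
print(f"false alarms: {false_alarms:,.0f}")
print(f"share of alarms that are real: {precision:.4%}")
```

Even with a detector this optimistic, innocent people flagged by mistake outnumber real threats by thousands to one, and each false alarm costs investigation time and intrudes on someone's life.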

Web scraping, web mashing

5 Ways to Mix, Rip, and Mash Your Data introduces promising web and desktop applications that extract structured data feeds from web sites and mix them together into something possibly useful to you. Think of things like getting filtered Monster job ads as a convenient RSS feed, along with job ads from your other favorite job sites. This reminds me of my Python hacks for automating web crawling and web scraping. Sometimes, I wish I could find time to work a bit further on that…

Solidarity missions for contractors between assignments

Christian, Bader, Jef and Jjay gave me a few leads for improving my idea: how can IT companies be convinced to get involved in technology projects with a solidarity purpose? Thanks to all 4!

One big risk is that this kind of thing « ends up in the PR department », as Christian points out. IT services companies (SSII) probably have « no desire whatsoever to change the world »; in any case, it is not their vocation. Their obvious concern seems to be « short-term profit ». But why don’t they put their contractors who are between assignments (« inter-contrat ») to work on projects that are profitable for them in the longer term (internal projects, open source contributions, …) rather than leaving them to rot in a corner until a salesperson manages to place them with a client? From one company to another and from one person to another, time between assignments is experienced more or less well, with sometimes comical situations. In any case, it is a source of problems for IT services companies and for their employees.

On the other hand, there may be accessible levers for changing this situation and, at the same stroke, meeting the technology needs of social innovators.

The notion of social entrepreneurship, or of ethics, is very fashionable among all the companies that have a mediocre image on that front (that includes banks and IT services companies, IMHO).

What to do? Here are your suggestions:

get the CLIENTS of IT services companies to pay attention to these initiatives (in their decision process), and as if by chance everything unblocks

[Maybe create] project-contest-games [:] selecting the best projects and providing funding + logistical help, that can work.

[In any case,] ideas cannot come from the inside [and the solution must make it possible to] identify a financial return within a few months

It is not the IT services companies that need convincing but first of all the people who work in them (preferably companies of a respectable size, in my opinion). If they are motivated they can get their management moving, and you can help them find the arguments for that.

I feel like extracting from these suggestions a few elements for a requirements list: the solution must…

bring an economic carrot for the IT services company, short-term profit, perhaps by involving certain clients
rely fully on employees’ motivation, and harness it through suitable forms of facilitation
be economically viable (a social enterprise, hence an enterprise too)

What if we bought contractors between assignments from their IT services company at a symbolic percentage of their usual daily rate? That would give their company the economic incentive: « As long as I know I can have this contractor back as soon as I want to place them with a client, why not sell them at 1% of their usual price to a ‘social enterprise’ client? If, on top of that, it polishes the company’s image a little and motivates some employees, that is an added gain! »

And what if this symbolic amount were raised by the employees who are motivated to take part in the operation and change the world at their own scale? For every 10 to 20 employees on assignment (depending on the period and the company), there is, say, 1 between assignments. With a subscription/membership fee of a few tens or hundreds of euros per year per person, the amount needed to fund a solidarity mission is raised. « Today I am at a client’s. But tomorrow it could be me between assignments. So, since I would like some of my colleagues and me to have a real impact on the environment/the poorest/democracy/the development of countries of the South/the priority of my choice thanks to what we do best (technology), I buy with them the right to take part in such a mission during my next period between assignments ».
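A quick back-of-the-envelope check of this subscription idea, written as a small Python calculation. Only the 1% share and the 10-to-20 ratio come from the text; the daily rate and the number of working days are hypothetical values picked for the example.

```python
daily_rate_eur = 500            # hypothetical usual daily rate of a contractor
discount_share = 0.01           # the symbolic 1% mentioned in the text
working_days_per_year = 220     # assumption: one full year spent on the bench
employees_per_bench_seat = 15   # midpoint of the 10-to-20 ratio in the text

# Cost of buying one full year of bench time at the symbolic rate.
yearly_cost_eur = daily_rate_eur * discount_share * working_days_per_year

# Yearly fee if the employees on assignment share that cost equally.
fee_per_employee_eur = yearly_cost_eur / employees_per_bench_seat

print(f"yearly cost of one bench seat: {yearly_cost_eur:.0f} EUR")
print(f"yearly fee per contributing employee: {fee_per_employee_eur:.2f} EUR")
```

Under these assumptions the fee lands at a few tens of euros per person per year, which is consistent with the order of magnitude suggested above.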

This solution would therefore consist in creating a provider of solidarity missions for contractors between assignments. The customers are contractors who want to use a future period between assignments to try to change the world at their own scale (rather than being bored stiff dodging hot potatoes and hanging around the branch office or headquarters). The products are missions of high social/environmental quality led by professionals of the sector, people in the field who can quickly make the contractor feel the social, environmental or other problems to be addressed. The other suppliers are the IT services companies that like the idea of making a little extra by selling some of their idle contractors on a second, ultra-discount market.

How should the questions that this kind of proposal could raise be answered? What would give a contractor enough desire and confidence to buy in advance, with colleagues, the right to take part in a technology solidarity mission in the field of their choice? This idea probably has a completely crazy side, but what good could be made of it that is a bit closer to reality? What does it inspire in you? Your turn!

How to install dozens of linux boxes with FAI?

[updated: the version of the python script was an obsolete one, I updated it, and changed the title of the post for more clarity]

I have 40 old computers (donation from a corporation) that are to be dispatched among small social work non-profit organizations and needy people in several French cities and probably also in Senegal. How to install a customized and usable version of linux on all of them despite the hardware heterogeneity of that collection of PCs and our lack of time? How to allow them to be reinstalled remotely without requiring any computer person to be present on site? I want the linux distribution to be Ubuntu, with a specific list of packages and configuration parameters. Some PCs have 1 hard drive of 9 GB or more, some others have up to 3 hard drives of sometimes 4GB, etc. The solution I found is to use FAI (Fully Automatic Installation) with a couple of custom enhancements such as a Python script that calculates the optimal partition tables for every PC.

Here are some notes about how I proceeded. If you want to contribute to similar projects (Information Technology and innovation for small non-profit organizations working in the field of social work in France or Africa), please drop me a comment here or by email at sig at akasig dot org.

Requirements and architecture

The way FAI works is as follows. The computer to install boots locally either from a CD-ROM, from a floppy disk or via a local networking protocol such as PXE or BOOTP. It then connects to a central installation server, which serves it instructions about how to install itself. It then downloads and installs packages from official repositories (e.g. Ubuntu repositories) or from the installation server if that server hosts a mirror of the distribution repository. It is a package-based installer and differs from file-based installers such as System Imager (which relies on rsync).

Therefore, the main requirement is to have a server for centralizing the installation process. For testing purposes, I used my home PC with its DSL line and its Ubuntu Dapper distribution. But the production server is hosted in a data center and runs Debian.

For booting, the usual FAI way is to use a local DHCP server for retrieving information such as the address of the installation server. But in my case, I want to allow computers to (re-)install themselves from the premises and local area networks of non-profit organizations or even from individuals’ homes. I obviously can’t control the DHCP servers that would usually serve this critical installation information. Therefore, I had to work around this by using some special FAI options when creating the bootable CD-ROMs (see below).

Another issue I had to tackle is that FAI supports only a limited amount of hardware heterogeneity. For instance, if your computers don’t have the exact same amount of hard drive space, that’s usually not a problem for FAI. It comes with configuration mechanisms that handle that quite smoothly. But in my case, I have unknown computers to install, with various numbers and sizes of hard drives for instance. Therefore, I had to let computers calculate by themselves the optimal partitioning scheme for any hard drive setup. I did that with the help of a constraint programming library for Python. I also had to make sure that Python would be available on the computer at that stage of the installation process.
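To give an idea of what such a calculation involves, here is a minimal brute-force sketch in plain Python. It is not the script used for this project and it does not use the constraint programming library mentioned above: it simply enumerates every way of placing three partitions on the available disks, rejects layouts that break the minimum sizes, and keeps the one that leaves the most room for /home. The mount points, minimum sizes and the `plan_partitions` name are all assumptions made for the example.

```python
from itertools import product

# Hypothetical minimum sizes, in MB, for the partitions to create.
MIN_SIZES_MB = {"/": 3000, "swap": 512, "/home": 1000}


def plan_partitions(disk_sizes_mb):
    """Return (mount -> disk index, size of /home in MB) for the best layout found."""
    mounts = list(MIN_SIZES_MB)
    best, best_home = None, -1
    # Try every assignment of mount points to disks.
    for assignment in product(range(len(disk_sizes_mb)), repeat=len(mounts)):
        used = [0] * len(disk_sizes_mb)
        for mount, disk in zip(mounts, assignment):
            used[disk] += MIN_SIZES_MB[mount]
        # Constraint: minimum sizes must fit on every disk.
        if any(u > size for u, size in zip(used, disk_sizes_mb)):
            continue
        # Objective: /home takes all the space left on its disk.
        home_disk = assignment[mounts.index("/home")]
        home_size = MIN_SIZES_MB["/home"] + disk_sizes_mb[home_disk] - used[home_disk]
        if home_size > best_home:
            best, best_home = dict(zip(mounts, assignment)), home_size
    if best is None:
        raise ValueError("no layout satisfies the minimum sizes")
    return best, best_home


single_disk = plan_partitions([9000])             # one 9 GB drive
three_disks = plan_partitions([4000, 4000, 4000])  # three 4 GB drives
print(single_disk)
print(three_disks)
```

A constraint programming library expresses the same constraints declaratively and searches far more cleverly, which matters once there are more partitions and more rules than in this toy case.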

Finally, I had to work around some access control constraints of FAI so that I could write the calculated optimal partitioning scheme to the computer to install. Indeed, when the computer to install first connects to the installation server, it mounts its root partition via NFS in read-only mode. And it doesn’t have access to the hard drive(s) yet. The solution I adopted is to write the optimal partitioning configuration to the FAI RAM disk (on the computer to install) and to pre-define a symlink from the NFS-mounted root (on the installation server) to that configuration file so that FAI knows where to find it once it has been calculated (details below).

Other modifications I had to make include correcting some shebang lines in scripts that used sh whereas they should have been using bash in the case of an Ubuntu server environment. I also had to correct the path in a GRUB post-installation script to adapt it to Ubuntu. Finally, I had to find the proper collection of FAI « classes » to define in order for Ubuntu to work properly.
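The shebang correction is easy to automate. The following Python helper is a hypothetical sketch (the `use_bash_shebang` name and the demo file are made up, and I edited my own scripts by hand): it rewrites a leading `#!/bin/sh` line to `#!/bin/bash` and leaves everything else untouched.

```python
import re
import tempfile
from pathlib import Path

# Matches '#!/bin/sh' (optionally with a space after '#!'), but not '#!/bin/shfoo'.
SH_SHEBANG = re.compile(r"^#!\s*/bin/sh\b")


def use_bash_shebang(script: Path) -> bool:
    """Rewrite a leading sh shebang to bash; return True if the file was changed."""
    text = script.read_text()
    first, sep, rest = text.partition("\n")
    if not SH_SHEBANG.match(first):
        return False
    script.write_text(SH_SHEBANG.sub("#!/bin/bash", first, count=1) + sep + rest)
    return True


# Demonstration on a throwaway file rather than on real FAI scripts.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo-hook"
    demo.write_text("#!/bin/sh -e\necho hello\n")
    changed = use_bash_shebang(demo)
    result = demo.read_text()

print(changed)
print(result)
```

Options that follow the interpreter path, such as `-e`, are preserved by the substitution.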

I did not invent all of these tweaks and hacks (except the partitioning one). All of them were suggested by the extremely supportive FAI community via their #fai IRC channel on irc.oftc.net (special thanks to MrFai, sanso and h01ger). And I could not have gotten into FAI without the great (but partly outdated!) FAI guide.

Now come the more detailed notes about how to (hopefully) reproduce the steps and tweaks described above.

A bit of FAI magic

On my Ubuntu dapper, the FAI package was a rather old one (v.2.10). Therefore I retrieved the more recent 3.1ubuntu1 package from the edgy repository and installed it manually. The first thing to do was then to go to /etc/fai and check every configuration file for possible updates to make. In /etc/fai/NFSROOT, for instance, I added python as a package to install in the virtual root partition that will be mounted via NFS by the computers to install. I also made sure that my NFS service would allow the target computers to connect and that the iptables firewall would not block these connections either. Then, I was ready for a sudo faisetup that created this virtual root partition under /srv/fai.

Once the NFS root hierarchy had been created, I manually added to it the Python constraint programming library required by my partitioning hack. I downloaded the source tarball and pre-compiled it on my installation server with python setup.py build (probably useless). Then I manually copied the .py and .pyc files to the proper site-packages directory of the NFS root (to /srv/fai/nfsroot/usr/lib/python2.4/site-packages/).

In order to create the bootable CD-ROMs that would allow computers to start using their local DHCP server but still to know how to connect to the central installation server, I had to use the following command line and options:

sudo make-fai-bootfloppy -B -Ieth0 -l -f /tmp/fai_floppy.img -i /tmp/fai_cdrom.iso -d f -F -v nfsroot=192.168.0.100:/srv/fai/nfsroot ip=:192.168.0.100:::::dhcp FAI_ACTION=install

It creates an ISO image of that bootable CDROM (/tmp/fai_cdrom.iso) that is then to be burnt. It also tells the path to the installation server. I had first tried without the -l option, which asks for LILO to be used instead of GRUB, but I could not figure out how to stop GRUB from ignoring the required nfsroot option. That option always disappeared from the kernel options GRUB specifies for booting. Therefore, I decided to use LILO instead. I also had trouble mixing the use of DHCP with my nfsroot option and had to use the -d f option, which is supposed to tell the computer to boot with a fixed IP address, whereas it will actually refer to my ip= option, which tells it to boot with DHCP but to note that the installation server is at a given IP address. A bit tricky, isn’t it… Anyway, it worked and you just have to replace 192.168.0.100 by the IP address of your installation server and everything should be fine (let’s be optimistic…). As an alternative, you should refer to the man page of fai-cd, which is another FAI command for creating a bootable CDROM. fai-cd may even be the recommended option rather than make-fai-bootfloppy, but I did not try it because it has not yet been properly documented in the FAI guide.

Then, I added my partitioning script as a FAI hook that gets called just before the « partition » FAI task, and only for computers that are assigned to my custom FAI class. To do so, I saved my script under /srv/fai/config/hooks with the filename partition.MYCLASS (where MYCLASS is the name you choose for the class of computers that will use this partitioning script). Note that you should remove the .txt extension from the filename once you download it from this site.
When called, the script creates a new file called MYCLASS that contains the FAI syntax specifying how partitions are to be created on disks (this is what FAI calls a disk_config file). But since the script is called at install time from a computer that has mounted its root partition read-only via NFS, I had to let the script save this MYCLASS file under /tmp/, which is a writeable RAM disk at that point. For FAI to be aware of that file during its partitioning task, I first had to create a symlink at /srv/fai/config/disk_config/MYCLASS pointing to /tmp/MYCLASS (ln -s /tmp/MYCLASS /srv/fai/config/disk_config/MYCLASS). After some discussion with FAI folks on IRC, I understood this is not the optimal solution. Ideally, I should use the FAI mkrw script instead: it would create an appropriate writeable path on the RAM disk and the generated file would be stored there. Anyway, the symlink option also works, though it’s less elegant.
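The symlink trick can be replayed in a scratch directory to see why it works: the link is created ahead of time and stays dangling until the hook writes its target. This is only a sketch under stated assumptions: the two scratch directories stand in for /srv/fai/config/disk_config and /tmp, and the file content is a placeholder rather than real disk_config syntax.

```shell
#!/bin/sh
# Sketch of the symlink trick described above, replayed in a scratch directory
# so that it can be run anywhere without touching a real FAI setup.

scratch=$(mktemp -d)
config_dir="$scratch/disk_config"   # stand-in for the read-only config space
ram_dir="$scratch/tmp"              # stand-in for the writeable RAM disk
mkdir -p "$config_dir" "$ram_dir"

# Done once, on the server: the link dangles until the hook has run.
ln -s "$ram_dir/MYCLASS" "$config_dir/MYCLASS"

# Done at install time by the partition.MYCLASS hook: write the generated
# text to the RAM disk (placeholder text, not real FAI syntax).
printf '%s\n' '# generated disk_config would go here' > "$ram_dir/MYCLASS"

# The partitioning task then reads the file through the config path.
cat "$config_dir/MYCLASS"
```

Nothing is ever written under the read-only directory at install time; only the link target on the RAM disk changes.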

Beyond creating this customized disk_config file for MYCLASS computers, I also modified and re-used the simple example files that are provided under the FAI examples directory. I created a FRENCH class by copying the GERMAN class provided there and modifying it so that it sets KEYMAP=fr-latin9. I used the FAIBASE class file with one small change: TIMEZONE=Europe/Paris. In the /srv/fai/class/50-host-classes script that defines default classes, I added the following classes to the last case: FRENCH, FAI_BOOTPART (in the desperate hope that it would add to the GRUB menu an option for booting the computer using FAI from the hard drive, for re-installation without a CD), NTP and NETWORK (being unsure whether these were required for the NTP service to be installed by default and to receive proper configuration parameters). In /srv/fai/config/debconf/, I created a FRENCH debconf file by re-using the GERMAN one given as an example. In /srv/fai/config/package_config/, I also copied the GERMAN file and modified it into one called FRENCH that contains the names of the Ubuntu packages related to French setups.
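Deriving the FRENCH class from the GERMAN one amounts to a one-line substitution. The sketch below is only an illustration: the file names GERMAN.var and FRENCH.var, the helper name derive_french_class and the sample German keymap value are assumptions, not copied from the FAI examples.

```shell
#!/bin/sh
# Sketch: derive a FRENCH class file from a GERMAN one by swapping the keymap.
# File names and the sample GERMAN content are assumptions for illustration.

derive_french_class() {
    src=$1
    dst=$2
    # Rewrite only the KEYMAP assignment; every other line is copied unchanged.
    sed 's/^KEYMAP=.*/KEYMAP=fr-latin9/' "$src" > "$dst"
}

scratch_class=$(mktemp -d)
printf 'KEYMAP=de-latin1-nodeadkeys\n' > "$scratch_class/GERMAN.var"
derive_french_class "$scratch_class/GERMAN.var" "$scratch_class/FRENCH.var"
cat "$scratch_class/FRENCH.var"
```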

As explained above, I also had to modify several FAI scripts in order to fully adapt them to the Ubuntu environment. These modifications were suggested on the FAI mailing list and forwarded to me by MrFai. They consisted of changing the #! /bin/sh shebang lines into #! /bin/bash lines in the following scripts: class/10-base-class, class/50-host-class, hooks/instsoft.FAIBASE, scripts/FAIBASE/10-misc, scripts/FAIBASE/30-interface, scripts/LAST/50-misc and also the mount2dir script. Last but not least, they included modifying the config/files/boot/grub/menu.lst/postinst file so that it refers to /sbin/update-grub instead of /usr/sbin/update-grub. I suppose these changes will soon be available directly in the downloadable FAI packages once they get propagated there.
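The shebang change can be applied mechanically rather than by hand. Here is a minimal sketch, assuming the first line is exactly a /bin/sh shebang without extra options; use_bash_shebang is a hypothetical helper and the demo file is a throwaway, not one of the FAI scripts listed above.

```shell
#!/bin/sh
# Sketch: switch a script's first line from /bin/sh to /bin/bash.
# use_bash_shebang is a hypothetical helper, not part of FAI.

use_bash_shebang() {
    script=$1
    tmp=$(mktemp)
    # Only line 1 is touched, and only if it is a bare /bin/sh shebang
    # (with or without spaces after "#!").
    sed '1s|^#! */bin/sh[[:space:]]*$|#! /bin/bash|' "$script" > "$tmp"
    # Copy the content back instead of moving the file, so mode bits are kept.
    cat "$tmp" > "$script"
    rm -f "$tmp"
}

# Demonstration on a throwaway file.
demo_script=$(mktemp)
printf '#! /bin/sh\necho hello\n' > "$demo_script"
use_bash_shebang "$demo_script"
head -n 1 "$demo_script"
```

In a real setup the helper would be called once per script in the list above, from the FAI configuration directory.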

That’s it. With all of these operations, you should be able to install a fresh (and French) Ubuntu on any computer using the CD you have burnt. Or at least I could install a couple of them.

More things to do

But even then, we are not done with what’s required for our distributed infrastructure to be remotely maintainable. Here is my to-do list:

many of the hacks and tricks indicated above should probably not be done directly under the /srv hierarchy but under something like /usr/lib/fai/; otherwise some of them may get lost the next time you recreate your nfsroot using the fai-setup script; there is probably some cleaning up to do here
check that the FAI_BOOTPART class was really taken into account because, at the moment, I could not see any FAI option in the GRUB menu of the installed computers
add bcfg2 with custom parameters to the classes to install so that the configuration can be properly managed remotely
check once again that the way NFS is offered to these remote computers will not create any security issue
create a new script that will select the flavor of the distribution to install depending on the amount of RAM in the PC. For instance, with less than 256 MB of RAM, it would be preferable to install a basic Ubuntu (without its Gnome desktop) and use another window manager
set up the default user because the one provided by the DEMO class does not suit my needs
add some more intelligence to the partitioning script so that it checks whether a suitable home partition already exists and, if so, asks FAI to preserve it instead of recreating it (and losing the data it contains)
set up a proper SSH account on the server so that FAI can save its log files on it once the installation is done
activate the option that will let FAI save on the installation server the detailed hardware information it could read from the PC
create a unique and permanent identifier to be stored on the machine and on the server so that we can track PCs; as a first step, the MAC address may be usable, but in the future, assigning a permanent UID to the whole list of hardware characteristics could be better if done smartly
check that the default Xorg options do not put old screens at risk (resolution and speed)
bring a bit of graphical customization in order to brand the desktops
add openvpn to the PCs so that we can connect to them remotely even when they are behind NAT routers
configure the authentication so that it is made against the central database (MySQL) that would also be used for the identity management of our Plone sites, with an nss_update mechanism that will allow authentication to succeed even when the central server is not reachable (caches the credentials on the PC)
to make the initial installation easier, I should probably stop using bootable CDs and get back to the orthodox FAI way of booting from the network and using DHCPd to deliver instructions about the location of the installation server; however, I first have to figure out how to let the computers’ GRUB menus offer a boot option that does not require DHCPd to deliver those instructions and that lets them use the central installation server somewhere over the Internet
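The RAM-based choice from the to-do list could start out as something like the sketch below. It is only an illustration: the class names UBUNTU_LIGHT and UBUNTU_DESKTOP and both function names are hypothetical, and only the 256 MB threshold comes from the list above.

```shell
#!/bin/sh
# Sketch for the RAM-based flavor selection from the to-do list.
# UBUNTU_LIGHT and UBUNTU_DESKTOP are hypothetical class names.

select_flavor() {
    ram_mb=$1
    if [ "$ram_mb" -lt 256 ]; then
        echo UBUNTU_LIGHT      # basic Ubuntu with a lighter window manager
    else
        echo UBUNTU_DESKTOP    # regular Ubuntu with its Gnome desktop
    fi
}

# On Linux, MemTotal in /proc/meminfo is reported in kB.
detect_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Only try the detection where /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    select_flavor "$(detect_ram_mb)"
fi
```

Note that MemTotal is usually a little lower than the installed RAM, so a PC with exactly 256 MB would land in the light flavor; the threshold may need some margin.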

If you are interested in helping some social non-profit ventures with the maintenance and configuration of their PCs and/or have some clues about how to take action on some items of this to-do list, please don’t hesitate to get in touch with me or to leave a comment here. Your help would be very welcome!

IT research with a social purpose

I have a somewhat hazy project idea running around in my head and I would like your advice to make it a bit more realistic. The idea is to invite « high-tech » IT companies (IT services firms, known in France as SSII, software vendors, hardware manufacturers) to invest some resources (consultants between assignments, membership fees, …) in technological and social innovation projects carried out jointly with researchers and with leaders of non-profit organizations that are recognized in their field (social work, environment, humanitarian aid, disability, seniors…). What could make this kind of thing attractive to IT services firms, for example? What could make them want to get involved in philanthropic IT research projects? Projects that aim to « change the world » on a very large scale, but on very specialized points, by means of ICT.

For example, I have thought of various benefits that an IT services firm could draw from this kind of initiative. But I find it hard to identify which ones would be most valuable in the eyes of the executives of such companies.

media coverage and a qualitative improvement of the company’s brand image through association with « big brands » of the social sector (Croix-Rouge, Restos du Coeur, ATD, …), of the environment (WWF, Greenpeace, fondation Ushuaïa, …) or of humanitarian aid (Croix-Rouge, médecins du monde, …)?
media coverage and a quantitative increase in the company’s public profile?
employee retention: reduced turnover, motivation « through the values » of the company?
easier recruitment of young graduates?
easier relations with external stakeholders (trade unions, public authorities, some shareholders…)?
training in interpersonal communication, with employees acquiring « soft skills » in the field, in contact with people in difficulty (working-class communities, people with disabilities, seniors, …)?
employees acquiring new technical skills in contact with leading researchers also involved in these projects (university research centers, private research along the lines of Google Labs or others)?
easier relations with some customers (public-sector customers: the State, local authorities; or social-economy customers: mutual societies, cooperatives, associations, foundations)?
personal satisfaction of the company’s executives (philanthropy)?
any other idea of a benefit for the company?

I know that many of you work alongside me in the IT sector, so your opinions (even naive ones) would be precious to me. Which of the points above seem the most promising for convincing executives of high-tech companies to embark on this kind of adventure? Any suggestions? What do you think?

Jean Millerat's bytes for good

Innover, Servir, Entreprendre !