Inventory tracking database

=Introduction=

In light of my failure with FGDB and the recent decision to pare down the number of different interpreters running on the freegeek.org server, I'm trying to work out how we may use our existing infrastructure to construct an inventory system.

What we have already:


 * 1) Publicly-accessible server
 * 2) MySQL database
 * 3) httpd with mod_php
 * 4) Web framework (MediaWiki)

=Overview=

I propose to build an inventory database and use it to track our stuff. I further propose to use MediaWiki's user system to track our volunteers. This will mean, among other things, that every volunteer must have a wiki account, which in turn will require a full-time-ish wiki admin. Also I hope to implement Kerberos or a similar SSO system so that FGSEA volunteers' logins work in all FGSEA areas (save dynamo).

I intend to implement the new system as a MySQL database on dynamo. We can create arbitrary clients for this database with ODBC. Additionally I hope to make inventory reports accessible on this wiki via Special: pages. http://www.mediawiki.org/wiki/Writing_a_new_special_page

Special pages are technically not editable, as they are dynamically constructed from a database call. Unlike normal wiki pages, Special pages can also be read-restricted, in case we feel the need to keep our inventory secret for some reason.

Entries to the inventory database should be fairly automated, as follows: we'll build a custom bootable volume (CD, hard disk, USB stick, whatever) using a recent Ubuntu kernel and a minimal userland which, among other things, runs lshw at startup. If the machine is connected to a network, the startup script will then upload the resulting XML file, either to freegeek.org or to a local server, via curl. The XML file can then be parsed into a data table by MySQL (http://dev.mysql.com/doc/refman/5.5/en/load-xml.html), but this is a separate step that can be performed in a batch, possibly as a cron job if we can come up with some method of validating the XML. Does that last part matter? I'm not sure.
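On the validation question: before a cron job hands an uploaded file to MySQL's LOAD XML, a minimal well-formedness check would at least keep garbage out of the batch. Here is a Python sketch; the accepted root element names are an assumption about lshw's output and may need adjusting.

```python
import xml.etree.ElementTree as ET

def is_valid_lshw_xml(text):
    """Return True if text parses as XML with a plausible lshw root element.

    Well-formedness is the minimum bar before MySQL's LOAD XML sees the
    file; the accepted root names ('list', 'node') are an assumption
    about lshw's output format.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    return root.tag in ("list", "node")

# A cron job could run this over each uploaded file and quarantine rejects
# instead of importing them.
```

This doesn't validate against the XSD (xmllint could do that), but it answers the cheap half of the question.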

Tools and components:

client end:
 * 1) debootstrap - for building boot images
 * 2) lshw
 * 3) curl, wget, nc or similar

server end: This bit is already set up :)

=Project parts and necessary infrastructure=

==Chunks of the project==
This project can be conceptually broken up into a number of semi-independent pieces. That is, the person working on one piece shouldn't have to know much of anything specific about the other pieces. I will list them out here and if I can get people to adopt chunks, then I will put their names on each section so folks know who to contact for help.

==Project management==
I'm afraid this will probably have to be me, User:koanhead, unless someone else wants to do it. It shouldn't be too difficult to find someone more experienced at management than I am.

See the FGSEA github here: https://github.com/organizations/freegeek-seattle

==Rolling up a custom boot volume==
This is the part I'm currently working on. See Custom boot volume for more details.

This boot volume will need a kernel that has modules enabled and that probes hardware on startup. In fact a vanilla Ubuntu distro would probably work, with a few simple additions:
 * 1) It needs lshw and curl installed
 * 2) It needs a startup script in rc.S or similar that runs lshw -xml, captures the output, and then sends that data to the database.
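The startup script boils down to two steps, capture and upload. Here is a Python sketch of those steps (the server URL is hypothetical, and the use of an HTTP POST is an assumption; in practice this could equally be a couple of lines of shell with curl):

```python
import subprocess
import urllib.request

def capture(cmd):
    """Run a command and return its stdout as bytes.

    In production cmd would be ["lshw", "-xml"]; check=True makes a
    failed probe abort loudly instead of uploading an empty file.
    """
    return subprocess.run(cmd, capture_output=True, check=True).stdout

def upload(data, url):
    """POST captured XML to the collection server (URL is hypothetical)."""
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/xml"})
    return urllib.request.urlopen(req)

# At boot, something like:
#   upload(capture(["lshw", "-xml"]), "http://inventory.local/upload")
```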

==Setting up the XML -> MySQL parser==
This will involve figuring out how to make lshw's tree-structured XML output fit into MySQL, and designing the tables. The actual parsing can be done by a MySQL function; the preparatory work is most of the job.

This prep work involves translating the XML Schema file in the lshw source (doc/lshw.xsd) into a usable, normalized MySQL database. If we can find an automated tool to do this, great; it might need to be done by hand. Below is an excerpt of the XML Schema file in question:

     <xs:element name="clock" type="measured"/>
     <xs:element name="width" type="measured"/>
     <xs:element name="configuration" type="configinfo"/>
     <xs:element name="capabilities" type="capinfo"/>
     <xs:element name="resources" type="resinfo"/>
     <xs:element name="node" type="nodeinfo" maxOccurs="unbounded"/>
   </xs:sequence>
   <xs:attribute name="id" type="xs:string"/>
   <xs:attribute name="handle" type="xs:string"/>
   <xs:attribute name="class" type="xs:string"/>
   <xs:attribute name="claimed" type="xs:boolean"/>
   <xs:attribute name="disabled" type="xs:boolean"/>
 </xs:complexType>

 <xs:complexType name="measured">
   <xs:simpleContent>
     <xs:extension base="xs:decimal">
       <xs:attribute name="units" type="xs:string"/>
     </xs:extension>
   </xs:simpleContent>
 </xs:complexType>

 <xs:complexType name="configinfo">
   <xs:sequence>
     <xs:element name="setting" type="configentry" maxOccurs="unbounded"/>
   </xs:sequence>
 </xs:complexType>

 <xs:complexType name="configentry">
   <xs:simpleContent>
     <xs:extension base="xs:string">
       <xs:attribute name="id" type="xs:string"/>
       <xs:attribute name="value" type="xs:string"/>
     </xs:extension>
   </xs:simpleContent>
 </xs:complexType>

 <xs:complexType name="capinfo">
   <xs:sequence>
     <xs:element name="capability" type="capentry" maxOccurs="unbounded"/>
   </xs:sequence>
 </xs:complexType>

 <xs:complexType name="capentry">
   <xs:simpleContent>
     <xs:extension base="xs:string">
       <xs:attribute name="id" type="xs:string"/>
     </xs:extension>
   </xs:simpleContent>
 </xs:complexType>

 <xs:complexType name="resinfo">
   <xs:sequence>
     <xs:element name="resource" type="resentry" maxOccurs="unbounded"/>
   </xs:sequence>
 </xs:complexType>

 <xs:complexType name="resentry">
   <xs:simpleContent>
     <xs:extension base="xs:string">
       <xs:attribute name="type" type="xs:string"/>
       <xs:attribute name="value" type="xs:string"/>
     </xs:extension>
   </xs:simpleContent>
 </xs:complexType>
 </xs:schema>

This schema describes a base type "nodeinfo" such that nodes of this type may contain nodes of the defined subordinate types. The absence of minOccurs or any other "must-occur" constraint means that:


 * 1) There's no obvious primary key per type
 * 2) There's no obvious way to identify which types will only occur sometimes

By "obvious" I mean "easy to program into an interpreting script".

This makes it difficult to auto-generate an appropriate relational schema, as does the fact that the XML schema provided makes no provision for normalizing its data. Since the XML schema is built into the lshw tool, we can't modify it without rebuilding the tool; and if we were going to do that, we might as well make it emit SQL anyway.

XSLT is the standard way to transform XML documents, and xsltproc is an XSLT processor available on most GNU/Linux systems. It could be used both for the schema translation (using an XSLT file like the one described here: http://www.annlewkowicz.com/2008/01/create-ddl-from-xsd-file-part-ii.html ) and for translating the XML files themselves.

Mike Dierken has short-cut this process by parsing the XSD into a MySQL schema for us. Note that MySQL has no CREATE SEQUENCE statement, so the sequence definitions below would need to become AUTO_INCREMENT columns (or some trigger-based equivalent). Here it is:

 create table cap_info (
   id       varchar(255) not null,
   content  varchar(255),
   constraint pk_cap_info primary key (id)
 );

 create table config_info (
   id       bigint not null,
   constraint pk_config_info primary key (id)
 );

 create table measured (
   id       bigint not null,
   units    varchar(255),
   content  double,
   constraint pk_measured primary key (id)
 );

 create table node (
   id                bigint not null,
   description       varchar(255),
   product           varchar(255),
   vendor            varchar(255),
   physid            varchar(255),
   version           varchar(255),
   serial            varchar(255),
   businfo           varchar(255),
   slot              varchar(255),
   size_id           bigint,
   capacity_id       bigint,
   clock_id          bigint,
   width_id          bigint,
   configuration_id  bigint,
   capabilities_id   varchar(255),
   resources_id      bigint,
   parent_id         bigint,
   constraint pk_node primary key (id)
 );

 create table res_info (
   id       bigint not null,
   constraint pk_res_info primary key (id)
 );

 create sequence cap_info_seq;
 create sequence config_info_seq;
 create sequence measured_seq;
 create sequence node_seq;
 create sequence res_info_seq;

 alter table node add constraint fk_node_size_1 foreign key (size_id) references measured (id) on delete restrict on update restrict;
 create index ix_node_size_1 on node (size_id);
 alter table node add constraint fk_node_capacity_2 foreign key (capacity_id) references measured (id) on delete restrict on update restrict;
 create index ix_node_capacity_2 on node (capacity_id);
 alter table node add constraint fk_node_clock_3 foreign key (clock_id) references measured (id) on delete restrict on update restrict;
 create index ix_node_clock_3 on node (clock_id);
 alter table node add constraint fk_node_width_4 foreign key (width_id) references measured (id) on delete restrict on update restrict;
 create index ix_node_width_4 on node (width_id);
 alter table node add constraint fk_node_configuration_5 foreign key (configuration_id) references config_info (id) on delete restrict on update restrict;
 create index ix_node_configuration_5 on node (configuration_id);
 alter table node add constraint fk_node_capabilities_6 foreign key (capabilities_id) references cap_info (id) on delete restrict on update restrict;
 create index ix_node_capabilities_6 on node (capabilities_id);
 alter table node add constraint fk_node_resources_7 foreign key (resources_id) references res_info (id) on delete restrict on update restrict;
 create index ix_node_resources_7 on node (resources_id);
 alter table node add constraint fk_node_parent_8 foreign key (parent_id) references node (id) on delete restrict on update restrict;
 create index ix_node_parent_8 on node (parent_id);
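To make the tree-to-table mapping concrete, here is a sketch of how lshw's nested node elements could collapse into rows for the node table, using the self-referential parent_id column as the tree link. The row ids are assigned locally for illustration; in production they would come from the node sequence or an AUTO_INCREMENT column, and the remaining columns would be filled from the corresponding child elements.

```python
import xml.etree.ElementTree as ET

def flatten(xml_text):
    """Collapse nested <node> elements into (id, parent_id, node_id) rows.

    Mirrors the node table's parent_id foreign key: each row points at
    the row for its enclosing <node>, with None for the root.
    """
    root = ET.fromstring(xml_text)
    rows, counter = [], [0]

    def walk(elem, parent_pk):
        counter[0] += 1
        pk = counter[0]
        rows.append((pk, parent_pk, elem.get("id")))
        for child in elem:
            if child.tag == "node":
                walk(child, pk)

    walk(root, None)
    return rows

sample = '<node id="machine"><node id="core"><node id="cpu"/></node></node>'
# flatten(sample) yields one row per node, each linked to its parent row
```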

This role could also make the "People" tables that relate information on volunteers, vendors, etc. to inventory and MediaWiki user data.

==Building the MediaWiki extension to display these database reports as Special: pages==
Again, most of the work here is done for you, especially if you are already a MediaWiki hacker. If you are not, there are a few special objects and data structures to know about. This will require basic object-oriented PHP knowledge, and access to whichever group owns the MediaWiki code on the server (the test server, that is):
 * http://www.mediawiki.org/wiki/Manual:Database_access
 * http://www.mediawiki.org/wiki/Writing_a_new_special_page

There might be more stuff that I've forgotten about. Hopefully someone will come along and add / change things.

=Concerns=


 * 1) Doing automatic uploads with hard-coded passwords to our live webserver is obviously a Bad Idea. It's probably advisable to set up a secondary, local server to receive the files and then upload them in a batch. The local server can even do the XML->MySQL conversion, I suppose.
 * 2) Converting XML to MySQL tables is not a trivial task, and lshw's XML output may not be suited to it. Not knowing yet how the import proceeds, I'm not sure how lshw's varying output will mesh with SQL tables' fixed number of fields. http://ezix.org/project/wiki/HardwareLiSter#Attributes has more information on how lshw works and gives an idea of what the XML output looks like. The problem seems generalizable to: how do you collapse a tree into one or more RDB tables?
 * 3) This is a HUGE amount of work for one guy to do. I estimate this is a 100+ man-hour job for someone with my level of expertise. I don't have 100 hours to devote to this. A little help here? => Team of three now with Blibbet and SUSY.
 * 4) Those working on this project, or at least whoever plugs in the code to make the Special: pages, will need Developer status on the wiki. So far only Mark and Will have the power to grant this. The need is not immediate- a lot of groundwork needs to be laid first.
 * 5) We can use MediaWiki directly, or just use Google Docs... yeah, Google Docs isn't free, but I can make a wiki convention that works for us.
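Regarding concern 1: the secondary local server can be very small. Here is a sketch of a receiver that accepts POSTed lshw XML and spools it to disk for the later batch import, using only the Python standard library. The spool directory name and upload path are assumptions, and a real deployment would want at least some sanity limits on upload size.

```python
import http.server
import os
import threading

UPLOAD_DIR = "incoming"   # hypothetical spool directory for uploaded XML


class UploadHandler(http.server.BaseHTTPRequestHandler):
    """Accept POSTed XML and write each upload to the spool directory."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        os.makedirs(UPLOAD_DIR, exist_ok=True)
        # Name files by arrival order; a real server would use something
        # collision-proof (timestamp + client address, say).
        name = os.path.join(
            UPLOAD_DIR, "upload-%d.xml" % len(os.listdir(UPLOAD_DIR)))
        with open(name, "wb") as f:
            f.write(body)
        self.send_response(201)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet


def serve(port=0):
    """Start the receiver on a background thread; returns (server, port)."""
    srv = http.server.HTTPServer(("127.0.0.1", port), UploadHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv, srv.server_address[1]
```

Since this machine never faces the public network, the hard-coded-credentials problem moves to the single batch upload from it to freegeek.org, which a human can supervise.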