README for BML - "Binary Markup Language" 

Version: v0.4,  2001-08-12
$Id: README.html,v 1.1 2001/05/01 16:26:40 anderst Exp $

  Authors:
  Anders W. Tell, Financial Toolsmiths AB, Maintainer
 

Design

BML Design document.

Goals:

Primary
  • A binary stream-representation of XML 1.0 information-items, all or a subset.
  • As simple a stream-representation as possible (KISS).
  • Support all XML Schema datatypes.
  • Use native and IEEE datatypes to facilitate fast stream to/from memory conversions.
  • Fast parsing and conversion to in-memory trees, W3C DOM or others.
  • Simple and small codebase for encoders/decoders.
  • The core BML files should be compatible with Java Micro Edition CLDC 1.0.

Secondary
  • Small binary-stream size compared to the XML text-stream-representation, at least on average.
  • However, GZIP sizes are a secondary goal compared to performance and ease of use.
  • Extensible in terms of tokens and datatypes.
  • Allow encapsulation of OMG IIOP messages in a self-describing stream-representation.
  • Include a small and fast tree API.

Status (v0.40)

  • All four representations are now supported.
  • A small XML parser has been added.
  • Only a few datatypes are currently supported, but most of the remaining ones are scheduled to be added in the next version.
  • The codebase should be really usable when version 0.6 is released.

Changes

Major changes are collected and described in the Changelog.
 

TODO

A TODO list is kept in the TODO file.
 

Space considerations

Streamspace saving techniques

  • Tag compression
  • Native datatypes
  • ... there is more in there ...
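
Tag compression can be pictured roughly as follows. This is only a sketch and not the actual BML encoding: the token values DEFINE_TAG and USE_TAG, the 2-byte id width and the class name TagTableSketch are all hypothetical, made up for this illustration. The idea shown is that a tag name is written in full once and is referred to by a small id afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of tag compression; NOT the actual BML wire format. */
class TagTableSketch {
    // Token values invented for this illustration only.
    static final int DEFINE_TAG = 0x01;
    static final int USE_TAG = 0x02;

    // Maps each tag name seen so far to the id assigned to it.
    private final Map<String, Integer> ids = new HashMap<String, Integer>();

    /** Writes the name in full the first time, and only its 2-byte id afterwards. */
    void writeTag(DataOutputStream out, String name) throws IOException {
        Integer id = ids.get(name);
        if (id == null) {
            id = Integer.valueOf(ids.size());
            ids.put(name, id);
            out.writeByte(DEFINE_TAG);
            out.writeShort(id.intValue());
            out.writeUTF(name);            // 2-byte length, then modified UTF-8
        } else {
            out.writeByte(USE_TAG);
            out.writeShort(id.intValue());
        }
    }

    /** Encodes the given tag names in order and returns the number of bytes produced. */
    static int encodedSize(String[] names) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            TagTableSketch table = new TagTableSketch();
            for (int i = 0; i < names.length; i++) {
                table.writeTag(out, names[i]);
            }
            out.flush();
            return bytes.size();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = { "purchaseOrder", "lineItem", "lineItem", "lineItem" };
        System.out.println("encoded bytes: " + encodedSize(names));
    }
}
```

With these four names the sketch produces 37 bytes, while writing every name in full with writeUTF would take 45. The saving grows with the number of repeated tags, which is the usual case in XML documents.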


Performance testing

The framework for testing performance is described here.

Stream performance enhancement techniques

  • Binary tokens
    • No need to discover or look for where the next token is.
    • No need to handle multibyte character encodings in tokenization.
  • Native datatypes
    • No need to convert characters to native datatypes, which is a very expensive operation.
    • Matches the native datatype representation of many implementation languages and machines
      (follows CDR en/decoding rules without alignment, which could be added later).
    • Strings are preceded by a length indicator which improves reading and memory handling.
    • Explicit datatyping reduces the amount of syntax and error handling code.
  • Relatively small and simple grammar
    • Smaller code for parsing and reading.
    • Fewer bugs since the codebase is simpler and easier to understand.
  • Produces, on average, a smaller stream size
    • Fewer bytes to read.
  • Reusable information
    • Recurring information may be encoded only once and reused in other parts.
    • Namespace handling is simpler since no prefixes are used. Elements, attributes and datatypes are all automatically bundled with namespace and localname.
    • Some information is packaged in a way that a tree implementation may reuse it without doing extensive / expensive information restructuring and conversion.
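
The points about native datatypes and length-prefixed strings can be sketched with the standard java.io.DataOutputStream and DataInputStream classes. The field layout below (one double followed by a 4-byte length and UTF-8 bytes) and the names NativeValuesSketch and Item are assumptions made for this illustration; they are not taken from the BML format. Note also that these classes always use big-endian byte order, which may or may not match what BML does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical sketch; the field layout is invented, NOT the BML wire format. */
class NativeValuesSketch {

    /** Simple holder for the two decoded values. */
    static final class Item {
        final double price;
        final String label;

        Item(double price, String label) {
            this.price = price;
            this.label = label;
        }
    }

    /** Writes a double in IEEE 754 form and a string preceded by its byte length. */
    static byte[] encode(double price, String label) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeDouble(price);                  // 8 bytes, no text formatting
            byte[] text = label.getBytes("UTF-8");
            out.writeInt(text.length);               // length indicator comes first
            out.write(text);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new IllegalStateException(e);
        }
    }

    /** Reads the values back without any character-to-number parsing. */
    static Item decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            double price = in.readDouble();          // no Double.parseDouble() needed
            int length = in.readInt();
            byte[] text = new byte[length];          // exact buffer size known up front
            in.readFully(text);
            return new Item(price, new String(text, "UTF-8"));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Item item = decode(encode(19.95, "widget"));
        System.out.println(item.label + " costs " + item.price);
    }
}
```

Because the reader knows the string length before it reads the characters, it can allocate the buffer once and never has to scan for a terminator, and the double is copied bit for bit instead of being parsed from text.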

Requirements

  • JDK 1.3.x    (JDK 1.1.x, JDK 1.2.x should work, not tested though)
  • xerces.jar  - for SAX and DOM conversions
  • junit.jar    v3.5 - for testing

Applications

XML files to BML stream compressor:
  java -client org.openebxml.comp.bml.ext.appl.BMLCompressor -options filename

Creates a file named 'filename.bml'.

BML stream to XML stream decompressor:
  java -client org.openebxml.comp.bml.ext.appl.BMLDecompressor -options filename

Creates a file named 'filename.xml'.

BML Assembler:
  java -client org.openebxml.comp.bml.ext.appl.BMLTokens filename

Parses a BML-encoded stream and prints the tokens to the screen.
 
 
Examples

Examples are found in the 'examples' directory.
NOTE: Many more examples are scheduled to be created.
