Thursday, March 18, 2010

m$ access mdb files in linux

looks like the only game in town for reading an mdb database file from microsoft access is mdbtools. several other tools are built off of it, which can help to give a quick gui look at what a file contains. gmdb2 is easy enough to use, but i couldn't get oobase (from openoffice) set up before my attention span expired. here's a little python snippet that dumps out the data:
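import subprocess
import sys

# path to the .mdb file is the last command-line argument
mdb = sys.argv[-1]
# -1 prints one table name per line
out = subprocess.check_output(['mdb-tables', '-S', '-1', mdb], text=True)
tables = [t for t in out.split('\n') if t]
s = ''
for table in tables:
    # table name as a header line, followed by that table's rows as csv
    s += table + '\n'
    # argument lists (no shell) keep paths and table names with spaces intact
    s += subprocess.check_output(['mdb-export', mdb, table], text=True) + '\n'
print(s)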
it seems quite clear that the jet 4 mdb format is grossly inefficient for small databases. this snippet spits out csv text totalling about 13 kB, compared to the original mdb that weighs in at 430 kB. poking around inside the mdb confirms that almost all of it is 0s (fortunately easy to see, since i understand the jet standard does not require allocated file space to be wiped clean).
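
one way to put a number on the "almost all of it is 0s" observation is to count the zero bytes directly. this is only a rough sketch: it assumes "0s" means 0x00 bytes, and the function name zero_fraction is made up for illustration, not part of mdbtools.

```python
def zero_fraction(path, chunk_size=1 << 20):
    """Return the fraction of bytes in the file at `path` that are 0x00."""
    zeros = 0
    total = 0
    with open(path, 'rb') as f:
        while True:
            # read in chunks so a large file never has to fit in memory
            chunk = f.read(chunk_size)
            if not chunk:
                break
            zeros += chunk.count(b'\x00')
            total += len(chunk)
    # an empty file has no bytes at all, so report 0.0 instead of dividing by zero
    return zeros / total if total else 0.0
```

calling zero_fraction on the mdb file gives a value between 0.0 and 1.0; a result close to 1.0 would back up the impression that the file is mostly empty space.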
