Friday, June 26, 2009

scanning with hplip

had to reboot my computer and remember how to make the hp officejet scanner/printer scan for me again. it turns out hplip won't run if cupsd and dbus aren't started, and they weren't by default. hp-check is pretty handy for finding these kinds of problems.

i actually use scanimage to do the scanning. i think it's part of the sane-backends package rather than hplip, but it works for me. scanimage --help shows the scanner-specific options. this gives a good 150 dpi scan of a letter-size page:

scanimage --resolution 150 -x 216 -y 279 > output.pnm

the easiest way i've found to make a pdf out of the scanned pages is to convert each page to a png and then to a pdf (both steps with imagemagick's convert), then use

pdftk in*.pdf cat output out.pdf

to assemble the pages. the png conversion compresses pretty well (even without forcing greyscale or b/w), and the resulting pdf is verified to work on windows.
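to tie the steps together, here's a minimal sketch of the whole pipeline as a python script, in the same os.system style as my svnxxdiff script below. the scan2pdf.py name, the page-count argument, and the page%02d file names are made up for illustration; it assumes the same 150 dpi letter-size settings as above:

#!/usr/bin/env python
# scan a stack of pages to pnm, convert each to png and then pdf,
# and assemble the single-page pdfs with pdftk.
import os
import sys

npages = int(sys.argv[1])    # e.g.: scan2pdf.py 3
for i in range(1, npages + 1):
    raw_input('load page %d in the scanner and press enter ' % i)
    os.system('scanimage --resolution 150 -x 216 -y 279 > page%02d.pnm' % i)
    os.system('convert page%02d.pnm page%02d.png' % (i, i))
    os.system('convert page%02d.png page%02d.pdf' % (i, i))
# zero-padded names keep the shell glob in page order
os.system('pdftk page*.pdf cat output out.pdf')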

Tuesday, June 2, 2009

ica and quantitative finance

this site has some interesting papers on quantitative finance. in particular, i think the report on quant education would be an interesting read, even though it is a bit old now. the other paper, 'a first application of independent component analysis to extracting structure from stock returns', is the earliest reference i've seen on applying ica to financial data. a number of people have done it since, with mixed results imho.

but there is a good point to be made here: if you are assuming independence, why look only at correlation? why choose an orthogonal basis orientation by minimizing the L_2 reprojection error? why not look at mutual information, or at higher-order moments and cumulants? if there are components that are uninterpretable, it is self-deceptive to force them to be artificially small, and it will probably lead to overly optimistic estimates of risk.

truth is, i have two goals for modeling log price relative time series: classification and time-windowed average estimation. for classification i want independence, and for the time averages i want to minimize the time-averaged error (not necessarily the time-averaged error^2). not only is the amplitude of the error significant; the autocorrelation of the error time series is, too.
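as a toy illustration of why second-order decorrelation is not independence, here's a minimal numpy sketch. the laplacian 'sources' are synthetic stand-ins for log price relatives, not real data, and this is not the paper's method; it just shows a fourth-order cross-cumulant catching structure that correlation misses:

#!/usr/bin/env python
# toy example: whitening makes components uncorrelated, but a
# fourth-order cross-cumulant shows they can still be dependent.
import numpy as np

np.random.seed(0)
n = 10000
# two independent heavy-tailed sources, mixed by an arbitrary matrix
s = np.random.laplace(size=(2, n))
a = np.array([[1.0, 0.5], [0.3, 1.0]])
x = np.dot(a, s)

# whiten x: the result z has identity covariance by construction
d, e = np.linalg.eigh(np.cov(x))
z = np.dot(np.dot(np.diag(d ** -0.5), e.T), x)
print np.cov(z)    # ~ identity: correlation sees no remaining structure

# fourth-order cross-cumulant cum(z1,z1,z2,z2); for unit-variance,
# uncorrelated components it reduces to E[z1^2 z2^2] - 1, and it is
# zero when z1 and z2 are truly independent
c22 = np.mean(z[0]**2 * z[1]**2) - 1.0 - 2.0 * np.mean(z[0] * z[1])**2
print c22          # well away from zero: the components are not independent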

svn and xxdiff

svn and xxdiff don't play nice, so i wrote a little python script to clean up the args svn gives to an external diff. svnxxdiff:

#!/usr/bin/env python
# svn invokes its external diff as: diff -u -L label1 -L label2 file1 file2
# translate those args into something xxdiff understands
import os
import sys

args = sys.argv
# first -L label becomes --title1 "label"
i = args.index('-L')
args[i] = '--title1'
args[i+1] = '"' + args[i+1] + '"'
# second -L label becomes --title2 "label"
i = args.index('-L')
args[i] = '--title2'
args[i+1] = '"' + args[i+1] + '"'
args[0] = 'xxdiff'
args.remove('-u')    # xxdiff doesn't take diff's -u flag
os.system(' '.join(args))

this alias saves some typing:

alias svndiff 'svn diff --diff-cmd ~/local/bin/svnxxdiff'
EDIT: or i could just use tkdiff, part of the tkcvs package. it's smart enough to compare against the latest repository version when it's given only one file argument.

selling a home

it's a buyer's market, but a couple of websites might help those sellers out there. www.homegain.com has valuation tools and gives tips on staging. http://www.giftnetonline.com is a service that keeps a home in shape to sell while the owner is away.