Friday, February 26, 2010

math-like programming

i couldn't prove it at the time, but i had a feeling that i knew what i was doing....
i think i better understand now why it helps to keep a piece of code running correctly with unit tests while modifying it. it's like a mathematical equation. you can use substitutions and properties of operators and functions to modify the expressions, but only if you maintain equality at every step. maintaining that equality requires an understanding of the underlying math objects, and knowing which way to transform the expressions requires skill and creativity. but the hardest part is to lay down the governing equations to begin with -- to state the paradoxically precise abstraction of reality.
inasmuch as a computer language is an alternate grammar for discrete math, it is to be expected that to manipulate an existing expression while maintaining correctness is easier than to derive an algorithm from scratch (or from incorrect code). sometimes i have correct code that i still want to modify, for example, to generalize or to optimize. i need to think like a compiler and make my modifications in smaller steps, each maintaining correctness with respect to the unit tests, rather than try to leap at once to a large rewrite. the result is faster, easier, and clearer thinking, often with solutions that present themselves along the way.
for example, can i move that assignment from the beginning to the end of the loop? i want to eliminate that variable; first i'll make it redundant. i think these two expressions are equivalent, so i'll put in an assert to test that before replacing the old one. now i have a better way to refactor and i know why to do it.
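a minimal sketch of that last step, in python. the function names and the mean example are made up for illustration; the only point is that the assert checks the old and new expressions against each other before the old one gets dropped.

```python
def mean_old(values):
    """original version: accumulate a total in a loop, divide at the end."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def mean_transitional(values):
    """intermediate step: compute both expressions and assert they agree."""
    total = 0.0
    for v in values:
        total += v
    old_result = total / len(values)
    new_result = sum(values) / len(values)  # candidate replacement
    # temporary assert: it stays in only while the unit tests are being rerun
    assert abs(old_result - new_result) < 1e-12, (old_result, new_result)
    return new_result


def mean_new(values):
    """final version: the loop and its variable are gone."""
    return sum(values) / len(values)
```

each of the three versions passes the same unit tests, so the rewrite never leaves the "equation" unbalanced.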

Thursday, February 25, 2010

numpy.searchsorted

i don't know why i keep forgetting how to find the first or last index of a value inside a numpy array. quick googling will turn up things like (findIn == toFind).all(1).nonzero()[0][0], which works but gets _really_ slow for large arrays since it's searching the whole thing. searchsorted is all i need, 99% of the time, and i don't know why it's not cited more often. i finally remembered (again) this latest time when i found myself reinventing a binary search (which is what searchsorted does).
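a rough sketch of what i mean, assuming the array is 1-d and already sorted (searchsorted gives meaningless answers otherwise). the helper names first_index and last_index are mine, not numpy's.

```python
import numpy as np


def first_index(sorted_arr, value):
    """index of the first occurrence of value in a sorted 1-d array, or -1."""
    i = np.searchsorted(sorted_arr, value, side="left")
    if i < len(sorted_arr) and sorted_arr[i] == value:
        return int(i)
    return -1


def last_index(sorted_arr, value):
    """index of the last occurrence of value in a sorted 1-d array, or -1."""
    i = np.searchsorted(sorted_arr, value, side="right") - 1
    if i >= 0 and sorted_arr[i] == value:
        return int(i)
    return -1


a = np.array([1, 3, 3, 3, 7, 9])
print(first_index(a, 3), last_index(a, 3), first_index(a, 4))
```

side="left" lands on the first match and side="right" lands one past the last match, so the explicit equality check is what distinguishes "found" from "would be inserted here".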

Wednesday, February 24, 2010

myhdl

cool project converts python directly to vhdl or verilog. looks like it's under active development, with demo projects posted. if i ever need to use an fpga or cpld again, i know where to go.
and one of the examples uses the cypress fx2 usb controller chip, with references to other dsp interface projects (at 30 MB/s) including a 33khz adc fpga/fx2 microcontroller demo.

Monday, February 22, 2010

urlbst

might need to check out the urlbst package some time. it can make hyperlinks in my bibliographies to the referenced articles.

Friday, February 19, 2010

data acquisition, serial interfaces

nice little article summarizing the various serial bus standards for ic networks: rs-232, rs-422, rs-485, spi, i2c, microwire, 1-wire, and plain old bit banging. a bit dated at 2002, but still handy.
looks like rs485 is what you need for high-speed transmission over significant distances. spi slows down a lot when you take it off the pcb and looks more like a can bus.
maybe the way to go for low volume is a tiny linux server. lantronix makes the xport pro. digi international makes the digiconnect me (or digi connectme?) that can run picotux. marvell semiconductor makes the sheevaplug. all pretty small and lightweight, and you can just plug an ethernet cable into the darn thing and forget about it.
hmm, the xport pro seems to be highly geared toward network services and the external interface is limited to ~8kB/s serial. digi connect me 9210 might be better for data logging. or maybe gumstix? it's hard to find specs for data acquisition on their website.
mccdaq.com have some ~reasonably priced boards, though their stuff looks like it's either pci boards or clunky usb endpoints. multichannel, though.
another route would be high-speed usb (up to 40MB/s) with the cypress ez-usb fx2lp (like the cy7c68013a). check out www.elrasoft.com/hsusbm.htm and the project listed there by dcarr.

Wednesday, February 17, 2010

mark levin show

i definitely prefer downloading the mp3 files rather than listening through the website's streamer. so here's how to do it for mark levin:
in http://www.marklevinshow.com/rss/ilevin.xml look in the enclosure tags. each one has a url attribute with the full url to the mp3. wget away!
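if i ever want to script it, something like this should pull the urls out of the feed. it's a sketch using only the standard library; the sample feed and its example.com urls are invented for illustration and are not the real feed contents.

```python
import xml.etree.ElementTree as ET


def enclosure_urls(rss_text):
    """return the url attribute of every <enclosure> element in an rss document."""
    root = ET.fromstring(rss_text)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


# made-up feed with the same shape as a podcast rss file
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <item>
      <title>show 1</title>
      <enclosure url="http://example.com/audio/show1.mp3" type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>show 2</title>
      <enclosure url="http://example.com/audio/show2.mp3" type="audio/mpeg" length="1"/>
    </item>
  </channel>
</rss>"""

for url in enclosure_urls(SAMPLE_FEED):
    print(url)
```

the printed list can go straight to wget, one url per line.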

Tuesday, February 16, 2010

checkinstall with non-root

i wanted to use checkinstall as a non-root user, with sudo, and i found a way to make it work. i made a directory called checkinstall/packagename and cd'ed to it (this helps checkinstall get the right package name). then i ran 'checkinstall easy_install packagename' and it worked great, once i had made the following change: after this part of the checkinstall script
echogn "Installing Debian package..."
dpkg -i $DPKG_FLAGS "$DEBPKG" &> ${TMP_DIR}/dpkginstall.log
okfail
i put in this:
# added this to try sudo
if [ $? -gt 0 ]; then
echo "sudo dpkg -i $DPKG_FLAGS \"$DEBPKG\" &> ${TMP_DIR}/dpkginstall.log"
echogn "Installing Debian package with sudo..."
echo
sudo dpkg -i $DPKG_FLAGS "$DEBPKG" &> ${TMP_DIR}/dpkginstall.log
okfail
fi
since i'm installing stuff in my home dir, i have to answer yes when checkinstall asks if i want to see these files (so they can get included in the package and not excluded as config files as checkinstall expects).
now it all works (tested with easy_install) and it shows up in synaptic and everything.
also, i found what i am pretty sure is a bug. pretty obvious, so i'm surprised no one else seems to have posted anywhere about it. i changed the line
grep '^/home' ${TMP_DIR}/newfile > /${TMP_DIR}/unwanted
to
grep '^/home' ${TMP_DIR}/newfiles > /${TMP_DIR}/unwanted

unicode and python

this is a nice reference for dealing with unicode in python. explains things at just the right level.
one of the difficult things about working with unicode in python (as i have rediscovered once again) is that repr(), which gets called when you just ask for the value of an expression at the prompt, shows non-ascii characters as ascii escapes like \u2019, while print will exercise the terminal's capability to print out unicode characters. also, the default encoding (the particular binary representation of the character set, and therefore a different thing from the character set itself) is ascii, so byte strings with non-ascii bytes need an explicit encoding such as utf-8 when converting to unicode.
In [426]: s = unicode('here\xe2\x80\x99s an apostrophe')
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
/home/tippetts/Ubuntu One/crunchSvn/svn/python/optimization/ in ()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
In [427]: s = unicode('here\xe2\x80\x99s an apostrophe','utf-8')
In [428]: s
Out[428]: u'here\u2019s an apostrophe'
In [429]: print s
--------> print(s)
here’s an apostrophe
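the session above is python 2. for comparison, here is a small sketch of the same idea in python 3, where byte strings and text are separate types and the decode step is always explicit. the variable names are mine.

```python
# the same utf-8 bytes as in the session above
raw = b"here\xe2\x80\x99s an apostrophe"

# decoding as ascii fails on the first byte >= 0x80
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# naming the encoding gives back the real text
text = raw.decode("utf-8")

# ascii() gives the escaped, repr-style view; print shows the character itself
escaped = ascii(text)
print(escaped)
print(text)
```

the escaped form and the printed form are the same string; only the display differs.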

samba mounts working

managed to get some samba mounts working from the windows enterprise servers. first had to install the smbfs package (not there by default on ubuntu, though smbclient is). then i made the mount points and chowned them to my uid/gid. put this line in fstab:
//servername/sharename /mount/point cifs users,uid=localusername,gid=localgroup,credentials=/etc/cifspw,domain=windomain
and made the /etc/cifspw file readable by my localgroup:
username=winusername
password=winpassword
and we're in business! local user can mount, etc. i wish i didn't have to make the credentials readable by non-root, but i get suid errors that way and i guess it's alright as long as i use a local group that is perfectly unique to my local user.

Monday, February 15, 2010

xlrd, loadtxt

interesting stuff in the epd:
xlrd: python-excel reader to load an excel spreadsheet
numpy.loadtxt(): reads data from a text file into a numpy array
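a quick sketch of loadtxt, reading from an in-memory string instead of a real file so it is self-contained. the column names and numbers are invented for illustration.

```python
import io

import numpy as np

# made-up comma-separated data; lines starting with '#' are skipped by default
SAMPLE = """# time,price
0.0,10.5
1.0,11.0
2.0,10.75
"""

data = np.loadtxt(io.StringIO(SAMPLE), delimiter=",")
times, prices = data[:, 0], data[:, 1]
print(data.shape, prices.mean())
```

passing a filename instead of the StringIO object works the same way.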
also, the scipy.stats module has many continuous and discrete distributions, with a common interface for them all. i should use this same interface for distributions of optimization objective functions in financial simulations.

Friday, February 12, 2010

easy_install uninstall

sometimes easy_install is a bit too easy. here's a great piece of advice. only tricky bit was that my easy-install.pth was in /usr/local/lib/python2.6/dist-packages/ and i had to use the -mxN options. and, unfortunately, this still leaves all the eggs and crud on the filesystem. this script was helpful for removing stuff in site-packages (/usr/local/lib/python2.6/dist-packages on my ubuntu system), but i still have things in /usr/local/bin...

Thursday, February 11, 2010

enthought tool suite on ubuntu

tried to use easy_install to put the ets on my ubuntu machine. it was not quite straightforward. had to install the libxtst-dev and python-dev packages, which was not clear from the beginning. finally i gave up trying to use easy_install and just grabbed the epd. easy peasy, but i did have to grab a patch to fix a missing ebmlib for editra. (patch -p6 < editra-ebmlib.diff from lib/python2.6/site-packages/wx/tools/Editra/src/, and ignore the setup.py)

Friday, February 5, 2010

python in a webapp

http://pyjs.org/examples/ http://pymw.sourceforge.net/ http://www.appcelerator.com/

camstudio

tried out a few different video capture/screen recording programs for windows recently. the hands down winner is camstudio. easy to use and set up, just make sure to set audio recording to your audio in device and your region to 'region' so you can resize the window. good framerate, decent compression, and very good audio.

google native client

i finally understand why google dropped chrome into the yet-another-browser mix: they are trying to take over the world. and it just might work. i’m about a year behind the official announcement, but i just found out about the native client project.
sounds like what sun was originally trying to do with java, but in this case there is no need to rewrite everything in a new programming language. and there’s no performance hit from a bytecode compiler/virtual machine. everything runs native in a sandbox, with a recompile required to make jumps safe. the only real refactoring will be to take out disallowed system calls. write my code in linux, run it on windoze, mac, etc, anything that has intel hardware, and i will be able to deploy it through a website. now that’s cool, and it could actually swing people into the cloud computing mentality that will give google the home-field advantage. chrome is not just another browser; it will become a new virtual os.
one dude got an unmodified python 2.6 to run by tweaking nacl and glibc to handle dynamic linking. i wonder if that is even necessary if i use cython to embed the interpreter and all modules into a static elf. at any rate, i hope google picks up his patches and charges ahead with this. i was just thinking about how to get around some of the restrictions of google appengine and offload some of the computation to the user’s machine. this would make both possible with minimal effort from me. need to keep an eye on this, and try it out eventually. like maybe when someone confirms scipy and numpy run under native-client. *fingers crossed*

openscad

openscad is a foss 3d cad program with csg and 2d extrusion modeling, based on a scripting rather than gui interface. looks like they made up their own scripting language (ugh) but at least one person out there seems to be driving it with python.

scan to pdf

here is a handy pair of commands that help convert scanned images to a pdf document. in python (after import os):
[os.system('convert -rotate 90 HP00%02d.jpg p%02d.pdf'%(i+1,i)) for i in range(1,15)]
and then, using the pdfjam package:
pdfjoin *.pdf
be careful, tho: pdfjoin will hang if you call it on files that have spaces in their paths (even if those spaces are not in the command line).
(in case i ever need it later) pdfjoin uses pdfpages with this macro:
\includepdf[pages=-,fitpaper=true,trim=0 0 0 0,offset=0 0,turn=true,noautoscale=false]{/var/tmp/5663150830922source1.pdf}
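a slightly tidier sketch of the two commands above, using subprocess with argument lists instead of os.system so no shell is involved. the function names are mine, and it assumes convert (imagemagick) and pdfjoin (pdfjam) are on the path. unlike pdfjoin *.pdf, it passes only the pages it just made.

```python
import subprocess


def convert_commands(first=1, last=14):
    """argument lists mirroring the one-liner: HP00NN.jpg (NN = i+1) -> pNN.pdf (NN = i)."""
    return [
        ["convert", "-rotate", "90", "HP00%02d.jpg" % (i + 1), "p%02d.pdf" % i]
        for i in range(first, last + 1)
    ]


def scan_to_pdf(first=1, last=14):
    """rotate and convert each scan, then join the resulting pages with pdfjoin."""
    pages = []
    for cmd in convert_commands(first, last):
        subprocess.run(cmd, check=True)  # raises if convert fails
        pages.append(cmd[-1])            # output pdf is the last argument
    subprocess.run(["pdfjoin"] + pages, check=True)
```

calling scan_to_pdf() from the directory holding the scans does the whole job; convert_commands() alone just shows what would be run.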

Thursday, February 4, 2010

google wave add-ons

here's a site with extensions for google wave. the thing i really need for a collaboration system is project planning. the one thing i can find on there now is planny. looks very young right now, but maybe i'll check back on both pages later.

unladen swallow

a few guys from google are working on an optimized jit compiler for python called unladen-swallow. they use llvm, and they acknowledge the similarity to pypy. not clear to me how it will be significantly better or different, but if they have google backing maybe it will be out of the lab faster.

Tuesday, February 2, 2010

pandas

pandas is a library for handling financial data with python/numpy. looks new, but with a good following. talks at the new york financial python user group and the london financial python user group.

update:
http://pandas.pydata.org/

Monday, February 1, 2010

active/managed etfs

interesting ref on active/quant managed etf families here. 'One example is the PowerShares FTSE RAFI US 1000 Portfolio (PRF), which applies a fundamental weighting approach to the 1,000 largest U.S. equities. The index provider, Research Affiliates LLC of Pasadena, CA, has become widely known for its fundamental indexing. Chairman Robert Arnott has observed that equally weighting a given index of stocks historically adds about 180 bps/yr of return versus market cap weighting, and fundamental weighting tacks on an additional 80 bps/yr above equal weighting.'