Tuesday, January 29, 2013

python file encryption

just ripped off a stackexchange post to make a quick file encrypt/decrypt tool.

from Crypto.Cipher import AES
from Crypto import Random
import base64
import hashlib

# python 2 + pycrypto: pad/unpad rely on str being a byte string
# couldn't get it to decrypt properly with 32: aes blocks are always
# 16 bytes, and decrypt() also uses BS as the iv length
BS = 16
# append n bytes of value n so the length is a multiple of BS
pad = lambda s: s + (BS - len(s) % BS) * chr(BS - len(s) % BS)
# the last byte says how many padding bytes to strip
unpad = lambda s: s[0:-ord(s[-1])]

def passwdToKey(passwd):
    # first 16 bytes of the sha256 digest, used as an aes-128 key
    return hashlib.sha256(passwd).digest()[:BS]

class AESCipher:
    def __init__(self, key):
        self.key = key

    def encrypt(self, raw):
        raw = pad(raw)
        iv = Random.new().read(AES.block_size)
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        # the random iv is stored in front of the ciphertext
        return base64.b64encode(iv + cipher.encrypt(raw))

    def decrypt(self, enc):
        enc = base64.b64decode(enc)
        iv = enc[:BS]
        cipher = AES.new(self.key, AES.MODE_CBC, iv)
        return unpad(cipher.decrypt(enc[BS:]))

def executeAction(action, fileIn, fileOut, pw):
    aesc = AESCipher(passwdToKey(pw))
    if action == 'encrypt':
        with open(fileIn, 'rb') as f:
            unencrypted = f.read()
        encrypted = aesc.encrypt(unencrypted)
        # verify encryption
        assert aesc.decrypt(encrypted) == unencrypted, 'invalid encryption'
        with open(fileOut, 'wb') as f:
            f.write(encrypted)
    elif action == 'decrypt':
        # read the input before opening the output, so fileIn == fileOut is safe
        with open(fileIn, 'rb') as f:
            decrypted = aesc.decrypt(f.read())
        with open(fileOut, 'wb') as f:
            f.write(decrypted)
    else:
        raise Exception('Unknown command: ' + action)

def main():
    # args: encrypt|decrypt fileIn fileOut password
    import sys
    args = sys.argv[1:]
    action, fileIn, fileOut, pw = args[-4:]
    executeAction(action, fileIn, fileOut, pw)

if __name__ == '__main__':
    main()
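
for reference, here's a rough python 3 sketch of just the padding and key-derivation pieces, stdlib only (no pycrypto), to show what the pad/unpad lambdas are doing with bytes. the names BLOCK_SIZE, pad, unpad and passwd_to_key are mine, and encoding the password as utf-8 is an assumption; the script above just hashes the raw python 2 string.

```python
import hashlib

BLOCK_SIZE = 16  # AES block size in bytes (assumed name, same role as BS above)


def pad(data: bytes) -> bytes:
    """Append n bytes of value n so the length is a multiple of BLOCK_SIZE."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def unpad(data: bytes) -> bytes:
    """Undo pad(): the last byte says how many bytes to drop."""
    return data[:-data[-1]]


def passwd_to_key(passwd: str) -> bytes:
    """First 16 bytes of SHA-256(password); utf-8 encoding is an assumption."""
    return hashlib.sha256(passwd.encode("utf-8")).digest()[:BLOCK_SIZE]


# quick round trip: 5 bytes of data get 11 bytes of padding
padded = pad(b"hello")
assert len(padded) == 16 and padded[-1] == 11
assert unpad(padded) == b"hello"
```

note that input that is already a multiple of 16 bytes still gets a full extra block of padding, which is what lets unpad work without knowing the original length.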

Thursday, January 10, 2013

scan to pdf

some scanners use jpgs for each page when scanning to pdf. i've found it works better to scan to a mtiff (multi-page tiff) and then convert that with:
tiff2pdf -o output.pdf input.tif

i got pretty good results in terms of size and readability with 300dpi, medium quality, black and white.

i've tried encrypting the pdf, and it all works, except that on the kindle it won't let me do any annotations on an encrypted file, even when I try to allow it explicitly:
pdftk in.pdf output out.pdf user_pw foo owner_pw bar allow AllFeatures

all annotations, etc., work fine on the desktop, so i think it's an issue with the kindle acroread. at least i can work with it after decrypting, kindle or elsewhere:
pdftk encrypted.pdf input_pw foobar output decrypted.pdf
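
if i ever script this, something like the following python sketch would do. it only wraps the three commands above; the function names are made up, and it assumes tiff2pdf and pdftk are on the PATH. the helpers just build argument lists, and run() is the only part that actually calls the tools.

```python
import subprocess


def tiff_to_pdf_cmd(tiff_path, pdf_path):
    """Command for: tiff2pdf -o output.pdf input.tif"""
    return ["tiff2pdf", "-o", pdf_path, tiff_path]


def encrypt_pdf_cmd(src, dst, user_pw, owner_pw):
    """Command for: pdftk in.pdf output out.pdf user_pw ... owner_pw ... allow AllFeatures"""
    return ["pdftk", src, "output", dst,
            "user_pw", user_pw, "owner_pw", owner_pw,
            "allow", "AllFeatures"]


def decrypt_pdf_cmd(src, dst, password):
    """Command for: pdftk encrypted.pdf input_pw ... output decrypted.pdf"""
    return ["pdftk", src, "input_pw", password, "output", dst]


def run(cmd):
    """Run one command; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(cmd)
```

e.g. run(tiff_to_pdf_cmd('input.tif', 'output.pdf')) followed by run(encrypt_pdf_cmd('output.pdf', 'out.pdf', 'foo', 'bar')).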

Monday, January 7, 2013

quant finance links

http://www.sierrachart.com/index.php?l=doc/developers.php
came across sierrachart while trying to research restrictions on the use of google finance data (based on http://www.google.com/intl/en/googlefinance/disclaimer/?ei=v2zjUPjvIeOZwQPGhAE ). sierrachart looks like it's written by a loose coalition of general nerds often for their own use, and sold for a fee to non-nerds. they advertise the google finance data importer openly, so i guess google doesn't mind? written in c++, so maybe easy to integrate with other quant libs?

http://www.derivitec.com/
started by a couple of equity derivative quants about a year ago, they sell risk models that go into spreadsheets and compute on the cloud with microsoft azure. head dude wrote a book ( http://www.amazon.com/The-Value-Uncertainty-Dealing-Derivatives/dp/1848167725/ref=sr_1_1?ie=UTF8&qid=1339681713&sr=8-1 ) (not out yet) about how to do risk quantification, including practical considerations. one of the few refs i've seen to model error risk and model parameter uncertainty.

i wanted to see if google would allow people to share the results of computation that used their finance data. the disclaimer statement seems ridiculously restrictive (can't even 'download or save' it? erm, then why is your server giving it to me?). for-fee data services like thomson-reuters ( http://thomsonreuters.com/products_services/financial/financial_products/a-z/datascope_select/#tab1 ) and xignite ( http://www.xignite.com/Product/XigniteBondsRealTime/ ) are very pricey. www.kibot.com is cheaper, but still hundreds of $ for each data type. sierrachart and ninjatrader both advertise their capability to download and extract data from finance.google.com and both show up on the first page of a google search. hmmm.

incidentally, i found (maybe re-found) possibly useful indicator historical data available for free from the world bank (eg, http://data.worldbank.org/data-catalog/world-development-indicators?cid=GPD_WDI ).

also, openquant might be interesting:
http://www.smartquant.com/openquant.php
okay, apparently that one is a few hundred bucks/month. this one is foss:
http://code.google.com/p/openquant/

otc derivatives data

the cftc is now in charge of collecting data from off-exchange trades of swaps, etc., thanks to dodd-frank. right now they are doing credit and rates derivatives; equity, forex, and commodities will come later.

i can't find any data on the cftc site except highly aggregated volume data. apparently anyone can apply to become a repository, as long as they can meet the requirements. so far, the dtcc is the best source i can find. they offer rss and csv downloads:
https://rtdata.dtcc.com/gtr/dashboard.do

this could get very interesting, and could change the way investment banks run their business for good.

Tuesday, January 1, 2013

amazon book indie publishing links

salesrankexpress sends search requests over to http://sre.novelrank.com/
http://www.salesrankexpress.com/
looks promising. we'll see how well it works.

http://www.novelrank.com/
see above.

http://rankforest.com/
data limited to last 30 days unless you upgrade.

TitleZ is currently free for beta testing. might charge later.
http://www.titlez.com/
looks like it has the best features, but it's not working atm and the docs are really old (2006!).

i looked at tictap.com but wasn't very impressed.

info on correlating amazon sales rank with book sales per day:
http://www.fonerbooks.com/surfing.htm

he mentions using salesrankexpress.com to track ranking.

amazon's affiliate program has some data available through their api:
https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html