Tuesday, June 29, 2010

another python debugger

pudb. text-based but looks easy to use.

a bit less temporary /tmp

hat-tip to pycuda, i got my ubuntu machine to relax about wiping my temp dir on every reboot:

On Debian (and possibly Ubuntu?), edit the file /etc/default/rcS and change

TMPTIME=0

to the number of days that you'd like to keep files in /tmp around. "30" works for me.

now i won't be so paranoid about installing kernel updates right away.

gpu programming with python

theano seems to have an advanced api, but it might be too advanced since it uses its own optimization and other magic goodies, and it seems so focused on its own objects that it's like a metaprogramming language. still, dangerously interesting. uses cuda, so nvidia only. here's some free advice from their tutorial:
Only computations with float32 data-type can be accelerated. Better support for float64 is expected in upcoming hardware but float64 computations are still relatively slow (Jan 2010).
Matrix multiplication, convolution, and large element-wise operations can be accelerated a lot (5-50x) when arguments are large enough to keep 30 processors busy.
Indexing, dimension-shuffling and constant-time reshaping will be equally fast on GPU as on CPU.
Summation over rows/columns of tensors can be a little slower on the GPU than on the CPU
Copying of large quantities of data to and from a device is relatively slow, and often cancels most of the advantage of one or two accelerated functions on that data. Getting GPU performance largely hinges on making data transfer to the device pay off.
the same website has a link to cudamat, which might be a more cooperative if lower-level way to go. target seems to be basic matrix and element-wise ops. actively developed.
pystream was developed as another cuda wrapper until about mid 2008, then abandoned when the company went off to develop gpulib, a cuda api for idl and (?) matlab.
pycuda handles the background stuff, but you still have to feed it c code (though there are tools for run-time code generation). looks like it's actively developed, though, with an impressive list of users, and like the others it does play nicely with numpy arrays. i think this is the place to start.
pygpu uses pycg and pyglew to generate cg code directly from python. so you write python and it will run on nvidia or ati hardware, under both linux and windows. unfortunately, neither the homepage nor the google code page shows any signs of activity in the last few years. pycg (developed by the same guy) seems to have trickled off in late 2007, though ubuntu packages were uploaded to launchpad just a year ago. too bad, this looked like it might have been a good one.
the gpu stuff on scikits oddly seems intended for actual graphics stuff.
pyopencl ? here's a faq page contrasting cuda and opencl.
pycublas maybe just does matrix mult.

bond spread data

fred (research.stlouisfed.org) has a lot of economic indicator historical data, including moody's bond yields, but only for aaa and baa. i'd like to find something for more junky bonds....
moodys.com certainly would have these data, along with others i'd like to see. but registration is required and i don't know if that stuff is available for free.

Monday, June 28, 2010

Best Practices in Estimating the Cost of Capital: Survey and Synthesis

Robert F. Bruner, Kenneth M. Eades, Robert S. Harris, and Robert C. Higgins
nice peek at the popularity of various financial analysis techniques. just a bit dated now, as it came out in 1998, but still worth a look.

tea party != conservative populism

jeffrey friedman makes some good points about why the tea party leaders should not assume (or maybe even aim for) a conservative populism. here's the most sobering stat for me:
An April Rasmussen survey found that only 60 percent of Americans now believe that capitalism is better than socialism. Among those under 30, socialism and capitalism are nearly tied at 33 percent and 37 percent.

oil spill -> green pork

charles krauthammer lets fly a critique of obama's face-the-nation. seems a bit harsh for ck, but he does an effective job of deconstructing the basic argument.

obama vs. science

jonah goldberg makes an interesting connection between the anti-science claims against bush and the drilling moratorium. a minor point, perhaps, but one to put on the record.

portfolioscience

here's a company that sells software for portfolio optimization and risk analysis. maybe i should fill out their form and check out the demo some time, just to see how they do it and which metrics seem to be featured. in particular i'm curious about the riskapi efficient frontier optimizer.
aorda is another one, started circa 2006 by a ufl prof who was one of the early proponents of cvar. free download for crippleware, but have to register first. pay versions are very highly priced: commercial license is $10k/year.

why defend bp?

why are some on the right defending bp? rich lowry has a great editorial up on nro making the point that we don't have a dog in this fight while some republicans are determined to get bitten.

Wednesday, June 23, 2010

15.535

looking at the assignments and exams, i think i understand the concepts just fine, although the intricacies of interpreting how management might be manipulating their accounting numbers are not easy.
2
http://mit.edu/wysockip/www has useful stuff but doesn't have all the stuff from class anymore
peg ratios, often cited
3
cash flows over firm's life cycle
trend analysis: cfo vs ebx
red flags: growing discrepancy between net income and cash flows
undervalued liabilities, overcapitalization
investment activity
key: proceeds from exercise of stock options. good?
firm type: growth options vs assets in place
tech, growth: not much depreciation, financing primarily related to equity
airlines: cfo large compared to net income, even in loss years; large depreciation, investing; debt financing
retailers: walmart has large difference between cfo and net income
4
problems with residual income valuation
p/e or m/b with real options?
5
abnormal earnings with dcf (discounted cash flows)
what do analysts use? refs asquith et al., 2001
earnings multiple 99%
p/e 97
relative p/e 35
revenue multiple 15
price-to-book 25
cf multiple 13
dcf 13
eva 2
'model' 4
estimate price multiples for comparable firms avg/median/etc. why not use distro?
if current earnings are not good prediction for future: forward p/e or pro forma earnings (remove non-recurring) or price to operating cash flow
other p/e: peg, p/cf, levered, (debt+equity)/ebitda
m/b market to book
stock screener links
profitability: roa (return on assets)
roa decomposed into profit margin and asset turnover
roe (return on common equity)
roe decomposed into profit margin, turnover, leverage
short term liquidity
current ratio = current assets/current liabilities: short-term debt paying ability
quick ratio = (current assets-inventory)/current liabilities: acid test ratio
long-term solvency
long term debt ratio = long term debt/(long term debt+shareholders' equity)
d/e = long term debt/shareholders' equity
total liabilities/total assets
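
a minimal sketch of the ratio definitions above, in plain python. the function names and argument names are made up for illustration, and roa here uses plain net income in the numerator, which is only one of several textbook conventions.

```python
def dupont(net_income, sales, total_assets, common_equity):
    """Decompose ROA and ROE into margin, turnover and leverage."""
    profit_margin = net_income / sales
    asset_turnover = sales / total_assets
    leverage = total_assets / common_equity
    roa = profit_margin * asset_turnover      # = net_income / total_assets
    roe = roa * leverage                      # = net_income / common_equity
    return {
        "profit_margin": profit_margin,
        "asset_turnover": asset_turnover,
        "leverage": leverage,
        "roa": roa,
        "roe": roe,
    }


def liquidity(current_assets, inventory, current_liabilities):
    """Short-term liquidity: current ratio and quick (acid test) ratio."""
    return {
        "current_ratio": current_assets / current_liabilities,
        "quick_ratio": (current_assets - inventory) / current_liabilities,
    }


def solvency(long_term_debt, equity, total_liabilities, total_assets):
    """Long-term solvency ratios as listed in the notes."""
    return {
        "lt_debt_ratio": long_term_debt / (long_term_debt + equity),
        "debt_to_equity": long_term_debt / equity,
        "liabilities_to_assets": total_liabilities / total_assets,
    }
```

the long term debt ratio is a simple function of d/e: it equals (d/e)/(1 + d/e), so the two always move together.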
7
forecast eps goes down the last 6 months before release due to expectations management
8
detecting earnings management
ratio of volatility (stddev/mean) of accrual income measures to underlying volatility of sales and cfo
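
a rough sketch of that volatility ratio, assuming each series is just a list of per-period numbers. the function names are hypothetical, and what counts as a suspiciously low ratio is a judgment call that the notes don't pin down.

```python
import statistics


def coeff_var(series):
    """Volatility as stddev/mean (sample standard deviation)."""
    return statistics.stdev(series) / abs(statistics.mean(series))


def smoothing_ratio(accrual_income, underlying):
    """Volatility of an accrual income measure relative to the volatility
    of an underlying series such as sales or cfo."""
    return coeff_var(accrual_income) / coeff_var(underlying)
```

a ratio well below 1 would mean reported income is much smoother than the underlying sales or cash flows, which is the pattern to look into.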
12
risk assessment
turnover: accounts receivable turnover, inventory turnover, fixed asset turnover, accounts payable turnover, days payable outstanding
short-term liquidity: current ratio, quick ratio (acid test), operating cash flow to current liabilities
long-term solvency (maybe a good way to value bonds?): debt/equity, long-term debt ratio (simple function of d/e), liabilities/assets
interest coverage ratio, in terms of both income and expenses or cash flow
refs modigliani-miller theorem without explaining: debt and equity financing are equivalent
absolute metrics: interest coverage, current ratio
13
cost of capital
equity cost of capital (discount rate)
capm: estimate beta (key issue) period typically 5 years; bloomberg, analysts, yahoo finance, etc
http://research.stlouisfed.org/fred/data/irates.html for risk-free rate and other data
fama-french 3-factor model extends capm with size, b/m (higher b/m->higher returns)
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html for rates, other data
long run averages: r_m-r_f (market-riskfree) 7.95% per year, r_smb (size premium) 3.32%, r_hml 5.05%
international
segmented/integrated capm: bekaert and harvey 1995
world capm holds if country stock market is integrated: http://www.msci.com/equity/index.html
otherwise, use r_country
'institutional investor' magazine ranks country credit risk 0-100
impressive fit to data: r_country = alpha + beta*rank
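
a small sketch of the three cost-of-equity formulas in this lecture, using the long-run premia quoted above. all rates are in percent per year; the betas, factor loadings, and regression coefficients are inputs that would have to be estimated, and the argument names are made up here.

```python
# long-run annual averages from the notes, in percent per year
MARKET_PREMIUM = 7.95   # r_m - r_f
SMB_PREMIUM = 3.32      # size premium
HML_PREMIUM = 5.05      # book-to-market premium


def capm(r_f, beta):
    """CAPM cost of equity: risk-free rate plus beta times market premium."""
    return r_f + beta * MARKET_PREMIUM


def fama_french(r_f, beta, s, h):
    """Three-factor version: adds size (s) and book-to-market (h) loadings."""
    return capm(r_f, beta) + s * SMB_PREMIUM + h * HML_PREMIUM


def country_rate(alpha, slope, rank):
    """Linear fit on the 0-100 country credit rank: r_country = alpha + slope*rank."""
    return alpha + slope * rank
```

for example, with a 4% risk-free rate and beta of 1, capm gives 11.95%; adding loadings of 0.5 on both extra factors gives 16.135%.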
15
expected return depends on systematic risk
alpha = abnormal return = actual return - capm, for example
multiples valuation key assumption: earnings and book equity are comparable
drift strategies
returns over last 6-12 months predict next 6-12 months
post earnings announcement drift from under-reaction to news
red flag: again, gap between reported income and cfo
quality of earnings ratio: (earnings-cfo)/avg total assets
'widely accepted' evidence on fundamental trading strategies
e/p, b/m, cf/p: high->high future abnormal stock returns
var(cf)/p: high->low future abnormal stock returns
v/p (firm value from abnormal earnings model/price): high->low returns
short term reversal: high return this month->low next month
medium term momentum: high return past 6-12 months->high return next 6-12
accrual anomaly: high accounting accruals this quarter->low returns next quarter and beyond
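
the quality of earnings ratio above is simple enough to write down directly. a minimal sketch, assuming average total assets means the mean of beginning and end of period values (the notes don't say); the function name is made up.

```python
def quality_of_earnings(earnings, cfo, assets_begin, assets_end):
    """(earnings - cfo) / average total assets.

    A larger positive value means more of reported earnings comes from
    accruals rather than cash, which is the red flag in the notes.
    """
    avg_assets = (assets_begin + assets_end) / 2.0
    return (earnings - cfo) / avg_assets
```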
16
bankruptcy detection
http://www.ibbotson.com/content/cc_1v11.asp cost of capital:$15/beta
altman z-score fit from manufacturing firm data
linear function of ratios
moody's, s&p use similar models to z-score to rate corp bonds
http://riskcalc.moodysrms.com/us/research/crm/45768.pdf
http://riskcalc.moodysrms.com/us/research/defrate.asp
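
a sketch of the z-score as a linear function of ratios. the coefficients and cutoffs are the commonly quoted ones from altman's original 1968 model for public manufacturing firms (the last coefficient is often printed as 0.999); they are not given in the class notes, and the argument names are made up.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_equity, sales, total_assets, total_liabilities):
    """Original Altman Z-score for public manufacturing firms."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Commonly quoted cutoffs: below 1.81 distress, above 2.99 safe."""
    if z < 1.81:
        return "distress"
    if z > 2.99:
        return "safe"
    return "gray"
```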
17
mergers and acquisitions
'old' purchase method: goodwill asset created and amortized over 40 years
pooling of interests no longer permitted for valuing
18
employee stock options
20
off balance sheet activities
enron background
21
pension plans
defined benefit plans cause accounting problems
22
international financial analysis
insider (code law) codified system
close interplay gov, banks, unions, big firms
continental euro, japan
less public disclosure
outsider (common law)
us, uk, english-speaking
us vs uk differences
23
sarbanes-oxley and review
sarbox 2002
identify comparable firms
multex (?) via yahoo for quick industry benchmarks
will change: accounting rules, tech, market integration, contracting methods
won't change: thought process, economics

guild wars

skills at the end of nightfall:
n: meekness, well of dark
p: harrier's toss, never surrender, stand your ground

Wednesday, June 16, 2010

memory profiling with python

tough to find good memory profiling for python. heapy-pe and the other (didn't bother to remember the name; pysizer?) turned out to be no good to me with numpy arrays which (surprise!) tend to be the biggest data structures i deal with. here are a couple of others to try some time:
meliae is new and more cli-oriented, but looks easy enough to try (and script).
dowser spawned off of cherrypy, but i think it works for any python code with the web server as sort of a gui (i think unlike dozer, which targets wsgi apps. or maybe dozer is just a wsgi version of the 'gui'?).

here's an example of objgraph to analyse memory usage.
i think these are more garbage collector approaches, rather than hook-and-trace, so maybe more likely to work with libs like numpy.
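
to make the garbage collector approach concrete, here is a stdlib-only sketch of the idea: count live objects by type, then diff two snapshots. this is not objgraph's api, the function name is made up, and gc.get_objects() only reports objects the collector tracks and says nothing about buffer sizes, so whether it helps with numpy arrays is something to check.

```python
import gc
from collections import Counter


def snapshot():
    """Count the objects currently tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


# usage: take a snapshot, run the suspect code, take another, and diff.
# Counter subtraction keeps only the types whose count went up.
#
#   before = snapshot()
#   run_suspect_code()          # hypothetical function under investigation
#   growth = snapshot() - before
#   print(growth.most_common(10))
```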

memory_profiler also comes recommended and looks interesting. pure python, so portable and hackable.

Tuesday, June 15, 2010

r-cran-fimport

ubuntu has a package for downloading free econometrics data: provides import function to access (free) data from Economagic, the US Federal Reserve, Forecasts.Org, Yahoo and other web sources. worth a look to see some sources that would be good to look at. the group of people who wrote this also have a link to a brief discussion of portfolio risk surfaces over the convex hull of achievable sets. interesting... i was thinking something along the same lines, and it's a little gratifying that working pros seem already to be doing something similar.

__get__ method for fun and profit

just learned (or maybe relearned) something cool about python: the __get__ special method gets called when an object that defines it (a descriptor) sits as an attribute on a class and is looked up through that class or one of its instances. not only are there potential uses for this, it also holds the key to understanding the 'self' and 'class' special arg in methods, since plain functions have a __get__ of their own that does the binding. this is something that confused me a couple of times before, such as passing references to instance vs. class methods from outside the class to be used inside the instance.
so, for example, i could allow instances of one of my classes to know how and where it's getting passed around, and something about the context when something is asked of it. maybe a quick and dirty memory leak tracker, when i know beforehand which objects are the big boys but i don't know who's pointing at them.
or maybe a little internal usage auditor, when i'm considering the impact of a refactor.
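
a minimal sketch of both ideas: a descriptor that records who looked it up, and the way a plain function's __get__ produces a bound method. the class names (Audited, Holder) and the greet function are made up for illustration.

```python
class Audited:
    """Descriptor that logs every lookup made through its owner class."""

    def __init__(self, payload):
        self.payload = payload
        self.log = []

    def __get__(self, instance, owner):
        # instance is None when the lookup goes through the class itself
        who = type(instance).__name__ if instance is not None else None
        self.log.append((who, owner.__name__))
        return self.payload


class Holder:
    # the descriptor has to live on the class; stored in an instance
    # __dict__ it would be returned as-is and __get__ would never run
    big = Audited([1, 2, 3])


def greet(self):
    return "hi from " + type(self).__name__


h = Holder()
h.big                                    # triggers Audited.__get__(h, Holder)
Holder.big                               # triggers Audited.__get__(None, Holder)
audit_log = Holder.__dict__["big"].log   # reading __dict__ bypasses __get__

# functions are descriptors too: this is what h.method does behind the scenes
bound = greet.__get__(h, Holder)
```

calling bound() here is the same as calling greet(h), which is all that 'self' really is.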

Monday, June 14, 2010

valuation books

couple of books recommended by people in the valuation business. one comment about duffie is that he was kind of a disappointment as a consultant, since he likes to stay more in the theoretical than the practical. not sure if this is the best book from him, but it's fairly recent. (search for 'dynamic')
Investment valuation : tools and techniques for determining the value of any asset
Damodaran, Aswath.
interesting that he says most analysis/justification for valuation is on discounted cash flows (as it seems to be in the book), but most valuation in practice is with ratios in relative valuation. contingent claim valuation is a more recent perspective, looking at opportunities available to a firm and pricing them like options. i was disappointed at how little there is on bond and commodity valuation, especially given the promise in the subtitle. interesting chapter on evidence of market efficiency.
Credit risk : pricing, management and measurement
Duffie, Darrell.
financial statement analysis and security valuation, 3rd ed
stephen h. penman
658.15
more on valuation, including slightly less than simple forecasting and detecting financial statement manipulation
dynamics of markets: econophysics and finance
joseph l. mccauley
658.15:519.217
empirical refutation of common modeling assumptions
value at risk: the new benchmark for managing financial risk, 3rd ed
philippe jorion
658.155
different types of risk, some 'industry-standard' real-life practical-experience rules of thumb

Tuesday, June 8, 2010

mplayer dump

tried to record a lecture in realmedia format with mplayer, but it keeps dying in the middle. cache seems to help it get farther, but it still chokes. maybe i just need to give it a _big_ cache, or use the -cache-min option:
mplayer -audiofile-cache 8192 -cache 8192 -dumpstream -dumpfile out.rm rtsp://etc

encrypted pdf files

managed to get a pdf file with restricted permissions using pdftk (pdf toolkit). the man page was a bit unclear on this, since if you just put in an owner password it will not actually restrict the file and no passwd is requested (or needed) to open it. this will do the trick, with all restrictions:
pdftk in.pdf output out.pdf user_pw foo
or, to make 2 levels of access:
pdftk in.pdf output out.pdf owner_pw baz user_pw foo
now the restrictions will be in place if you put in foo as the passwd, but they will not if you put in baz.

redirecting stdin, stdout, stderr from python

saw a nice comment in a post on python daemon forks that explains how to redirect std* pipes, even when they are accessed from c. i've been frustrated trying to figure that out before.

More reliable i/o stream redirection. Just reassigning to the sys streams is not 100% effective if you are importing modules that write to stdin and stdout from C code. Perhaps the modules shouldn't do that, but this code will make sure that all stdin and stdout will go where you expect it to.

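import os, sys

# open() replaces the python 2 file() builtin; unbuffered (0) needs binary mode
out_log = open('/out/log/file/name', 'a+')
err_log = open('/err/log/file/name', 'ab+', 0)
dev_null = open('/dev/null', 'r')
# flush python-level buffers before swapping the underlying descriptors
sys.stdout.flush()
sys.stderr.flush()
# repoint the os-level descriptors, so writes from c code follow too
os.dup2(out_log.fileno(), sys.stdout.fileno())
os.dup2(err_log.fileno(), sys.stderr.fileno())
os.dup2(dev_null.fileno(), sys.stdin.fileno())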
(and another poster suggests closing sys.std* before duping.) cool. i need to remember this next time i wrap somebody's code that thinks it's a good idea to barf on the terminal without a --quiet option.
also, the daemon implementation looks pretty clean, with extra tidbits sprinkled into the comments, and the author explains the reasons for doing things. i don't think i will need the double fork for my udev script, but the first fork will be necessary.
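
for the wrap-somebody's-noisy-code case, the same dup2 trick can be made temporary. a sketch, assuming a posix system; redirect_fd is a made-up name, not something from the post.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def redirect_fd(path, fd=1):
    """Temporarily point file descriptor fd (1 = stdout, 2 = stderr) at path."""
    sys.stdout.flush()
    sys.stderr.flush()
    saved = os.dup(fd)                    # remember where fd pointed
    target = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.dup2(target, fd)               # c-level writes to fd now land in path
        yield
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(saved, fd)                # put the original target back
        os.close(saved)
        os.close(target)
```

usage would be something like `with redirect_fd('/tmp/noisy.log'): noisy_call()`, where noisy_call stands for whatever library function writes to the terminal.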

printing presentations from evince

sometimes i need to print presentation slides with just big text, and i want to do it n-up so it doesn't waste paper. i've found that the best way to do it with evince is to set 'layout' to two-sided: long edge (standard), pages per side: 6, page ordering: left to right, top to bottom, and 'paper' orientation: landscape. 12 slides per sheet and still very readable.
also, it often helps to concatenate pdfs together so the nup has multiple presentations without a page break:
pdftk in1.pdf in2.pdf in3.pdf cat output out.pdf

Monday, June 7, 2010

ocw

mit's opencourseware has quite a few classes up from course 15 (sloan school of management). worth a look.
15.010, maybe 15.060, 15.062, maybe 15.063, maybe 15.223/15.224, 15.351, 15.352, 15.356, 15.358, 15.369, maybe 15.391, 15.394, 15.402, 15.414, 15.431, 15.433, 15.616, 15.617, 15.628, 15.963, and 15.997 look interesting.
15.501, 15.511, 15.514, 15.515, 15.516, 15.518, 15.521, and 15.535 look like particularly interesting classes on finance.
i should also peek at 15.070, 15.075, 15.081, 15.084J, 15.085J, 15.093, 15.094J, 15.098,15.099, just to make sure i haven't missed any of the math there.

Friday, June 4, 2010

udev rules

finally got a udev rule in /etc/udev/rules.d/ so it will run my script on a hotplug event. first i had to run udevadm info --attribute-walk --name /dev/sdc1 to dump out a list of potential attributes to search. since this spits out info on the whole device chain, i found out that the section to look at was /devices/pci0000:00/0000:00:1d.7/usb1/1-7/1-7:1.0 (i tried matching on a lower level, but my script was executed twice with different env vars). then i put a rule line into /etc/udev/rules.d/91-myhotplug.rules (91 because i want it to run after everything else) like this:
ACTION=="add", SUBSYSTEM=="usb", ATTRS{modalias}=="usb:v0...50", RUN+="/home/user/svn/bin/hotplugscript"
had to put in a sleep 6 before doing anything in the newly mounted fs because it seems to take about 5 s to settle down. (probably should loop-and-test in production-level code.) oh, and one other handy thing i googled up: use blockdev --flushbufs /dev/sdc1, for example, to flush one block device. sync just flushes them all. i have the whole bash script in brackets with a trailing & to make it fork off and return immediately; that's important because otherwise i found it hangs or dies at the sleep and nothing after that gets executed. and i did find that a umount after business is done in the udev script causes no problems.
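
the loop-and-test idea, sketched in python rather than bash: poll a condition until it holds or a timeout expires, instead of sleeping a fixed 6 s. wait_for and the mount point in the usage comment are made up.

```python
import os
import time


def wait_for(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns true or timeout seconds pass.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# usage (hypothetical mount point):
#   if wait_for(lambda: os.path.ismount("/media/usbdisk")):
#       ...do the real work...
```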

Thursday, June 3, 2010

financial accounting info on google finance

google finance has easy access to the balance sheet, cash flow statement, and income statement of publicly traded companies for the last 5 quarters and 4 years, ready to scrape from an html table in a standard format, all in one page. for example, all this info for citigroup is at
http://www.google.com/finance?q=NYSE:C&fstype=ii
probably would be a bit more reliable if there were one place i could grab xbrl files, but right now i can only find links to individual companies' websites. not very scriptable.
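
a sketch of what scraping an html table could look like with only the standard library. the SAMPLE markup and its numbers are invented for illustration; the real page layout would need to be inspected first, and parse_table is a made-up name.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# invented sample, not real data
SAMPLE = ("<table><tr><th>item</th><th>2009</th></tr>"
          "<tr><td>Revenue</td><td>1,234.50</td></tr></table>")
```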
also, the summary page (eg, http://www.google.com/finance?q=NYSE:C) has a 'key stats and ratios' sidebar. i wonder if any of these are really metrics that the pros use. yahoo has insider and institutional trades at http://finance.yahoo.com/q/it?s=C although dailyfinance shows total holdings for insider trading at http://www.dailyfinance.com/company/citigroup-incorporated/c/nys/insider-transactions
yahoo has convenient pages for analyst estimates, showing average, high, and low earnings estimates from a number of analysts for current and next quarters and years; estimate and actual eps histories; eps estimate recent revisions; and estimates for revenue and growth for the company, industry, sector, and s&p 500 for comparison. for example, dell's page is at http://finance.yahoo.com/q/ae?s=dell (pretty good start for fundamentals valuation). the analyst opinion page at http://finance.yahoo.com/q/ao?s=DELL also has price targets and quantitative recommendation poll numbers (in the chart near the bottom; i think the 'mean recommendation' listed above is basically the dot product of the votes in each category with arange(1,6), since the scale runs from 1 = strong buy to 5 = sell, divided by the total votes). not sure i would completely base my decision making on free advice, but it could be worthwhile for a sanity check right before clicking the 'trade' button. looks like it only has these analyst opinions for individual companies; no etfs, etc. but the etfs have holdings info, so i could go from there to get some projections on stocks.
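
the guess about the mean recommendation is just a weighted average, so it is easy to check against the page. a sketch in plain python (same thing as a numpy dot product divided by the vote total); the scores are passed in because the scale is an assumption, and the vote counts in the example are invented.

```python
def mean_recommendation(votes, scores):
    """Weighted average of category scores, weighted by analyst votes."""
    if len(votes) != len(scores):
        raise ValueError("need one score per vote category")
    total = sum(votes)
    return sum(v * s for v, s in zip(votes, scores)) / total


# invented vote counts for strong buy .. sell, scored 1..5 (assumed scale)
example = mean_recommendation([5, 10, 3, 1, 1], [1, 2, 3, 4, 5])
```

with those made-up counts the result is 43/20 = 2.15, which would sit between 'buy' and 'hold' on a 1 to 5 scale.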
now i need good ways to project near/medium term trends in the bonds, precious metals, commodities, and fx markets.