Some small statistical studies of financial markets that precede large-scale data mining

Here I publish my notes on preliminary statistical research into financial markets.
Since any serious data mining is very expensive, it is worth doing a quick study first. Ideally the results should be visualized, and if one can see patterns with the naked eye, further scrutiny is reasonable.

Unfortunately, nearly all statistical (empirical) studies share a definite flaw: they cannot be verified, since only the conclusions and not the source data are presented. I do not want to follow this vicious tradition, so I make both the source data and (where applicable) the program routines available. However, Yahoo and the other providers from which I take the financial data will likely not be happy about the redistribution of these data. So I put them on the web password-protected; if you want to check my statistical analysis, just let me know and I will mail you the passwords.



Note 1. Looking for seasonality: sell in May and go away
Source data (from finance.yahoo.com).
Monthly returns for IBM, Coca-Cola, Johnson & Johnson and the DJ30 are calculated as in "DJ30 - from Jan 1980 to Nov 2010 - yahoo.xls".
Then the data are imported into SPSS and boxplots are generated.
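
For readers without SPSS, the same preparation can be sketched in Python. This is only a minimal sketch and not the routine used for this note. It assumes simple (not logarithmic) returns computed from consecutive monthly closing prices, and it prints the five numbers that a boxplot draws for each calendar month. The function names and the input layout (year, month, close) are made up for illustration.

```python
from collections import defaultdict
from statistics import median, quantiles


def monthly_returns(closes):
    """closes: list of (year, month, close) tuples sorted by date.

    Returns a list of (month, simple_return), one entry for every
    month after the first. Simple returns are an assumption here.
    """
    result = []
    for (_, _, previous), (_, month, current) in zip(closes, closes[1:]):
        result.append((month, current / previous - 1.0))
    return result


def boxplot_stats_by_month(returns):
    """Group returns by calendar month and compute, for each month,
    (minimum, first quartile, median, third quartile, maximum)."""
    groups = defaultdict(list)
    for month, value in returns:
        groups[month].append(value)

    stats = {}
    for month, values in sorted(groups.items()):
        values.sort()
        if len(values) >= 2:
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
        else:
            # A single observation: the box collapses to one point.
            q1 = q3 = values[0]
        stats[month] = (values[0], q1, median(values), q3, values[-1])
    return stats


if __name__ == "__main__":
    # Toy prices, not real market data.
    toy = [(2000, 1, 100.0), (2000, 2, 110.0), (2000, 3, 99.0)]
    for month, row in boxplot_stats_by_month(monthly_returns(toy)).items():
        print(month, row)
```

If the boxes for May to October sit visibly lower than those for November to April, that is the kind of naked-eye pattern that would justify a closer look.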


Note 2. How an advanced private investor can leech historical data from Yahoo Finance
Yahoo Finance generously allows one to download historical market data for stocks free of charge.
However, what if a private investor would like to run a large data-mining study that needs data on thousands of securities?
It is hardly feasible to download the data manually with a web browser.
In this short note I briefly explain how I solved this problem for myself.
Leecher-Script in PHP
List of Reuters Tickers
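
The general idea of such a bulk download can also be sketched in Python. This is not a translation of the PHP script above, only a minimal illustration of the loop: take a list of tickers, build one URL per ticker, save each response to its own file, and pause between requests. The URL template is a placeholder that the caller must supply; no real Yahoo endpoint is assumed here, and the function and parameter names are hypothetical. Tickers are assumed to be usable as file names.

```python
import time
import urllib.request
from pathlib import Path
from urllib.parse import quote


def fetch_url(url, timeout=30):
    """Download one URL and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(tickers, url_template, out_dir,
                 fetch=fetch_url, delay_seconds=1.0):
    """Save one file per ticker into out_dir.

    url_template must contain "{ticker}"; it is a placeholder
    supplied by the caller, not a documented provider URL.
    Returns a list of (ticker, error message) for failed downloads.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    failed = []
    for ticker in tickers:
        target = out_path / f"{ticker}.csv"
        if target.exists():
            continue  # already saved by an earlier run
        url = url_template.format(ticker=quote(ticker))
        try:
            target.write_bytes(fetch(url))
        except OSError as exc:  # covers network and HTTP errors
            failed.append((ticker, str(exc)))
        time.sleep(delay_seconds)  # do not hammer the server
    return failed
```

Because existing files are skipped, an interrupted run over thousands of tickers can simply be restarted, and the returned list shows which tickers need another attempt.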





Remarks, comments and suggestions are welcome!