Archive for February, 2009

Last Two Weeks
This last mini vacation gave me an opportunity to spend quality time learning data mining and machine learning. Just as importantly, it reaffirmed my belief in applying science and technology, rather than “common sense”, to solving complicated problems.

Programming Collective Intelligence has beautiful examples of how science can be applied to real-world problems, but it offers only an introduction to the very complicated, very diverse field of machine learning.

Another Piece of the Puzzle
Data mining, software engineering, Python, Erlang and financial risk management are just pieces of a very complicated puzzle: algorithmic trading. On Monday, at my new job, another piece of the puzzle will fall into place: the FIX Protocol. For the next few weeks I’ll be busy learning FIX, but I know that working with it and getting re-exposed to the industry will help me make better decisions about which technologies to use and which to avoid.

Going Forward
I still plan on studying and experimenting in my free time.  My roadmap includes: Introduction to Data Mining, which I want to supplement with Orange; Mnesia, a distributed database used in conjunction with Erlang, or possibly MonetDB, which should be more suitable for data mining and tick databases; and finally some tinkering with Marketcetera and market simulators.

Parting Thoughts
This is a roadmap without a timeline. One thing I should have learned by now is that learning takes time, and I have to mentally prepare myself for the long haul.  I don’t know if this combination of technology will yield the results I expect, but not knowing is exciting; the experimenting is half the fun.

KSE’s website has been redesigned.  What were they thinking? Aleem, any comments? 

I can’t make sense of their new daily market feed either, but I did notice a new data portal where they’re selling real-time, historical, level-1 and level-2 market data directly.  They haven’t posted any prices yet, but this looks like a step in the right direction.

I’ve read only six chapters of Programming Collective Intelligence, and I’ve already learned more from it than from two AI courses at the University of Houston.  The way the subject matter is presented in university courses is dreadful and obfuscated: filled with mathematical theory, with no practical examples to make understanding AI more tractable.

This book covers a broad range of topics: optimization, searching, classification and decision trees, all of which the author, Toby Segaran, explains in plain English, with very readable code examples that illustrate the application of these concepts to real-world problems.

Toby has done a great job explaining the source code included in this book; the examples are very easy to understand.  I’ve never programmed in Python before, but I find myself learning it from his examples.

In all, I highly recommend this book to all developers interested in machine learning or data mining.
