There is a lot of controversy today about the potentially unfair market advantage of premium subscribers to Thomson Reuters, who receive consumer sentiment reports 2 seconds before other subscribers and 5 minutes ahead of non-subscribers.  This is just the latest in a stream of unpopular revelations about the advantages high frequency traders have over private investors.  While it is always fun to watch the folks on live-cable-news shows hotly debate ‘fairness’, the reality is that the computerized trading game is fundamentally changing – advantages from speed of information and speed of trading are diminishing, leaving trading strategy quality as the main path to profits.  And building a better trading strategy is a big data problem.

Spy Trading

Figure 1: In 10 milliseconds, 100,000 shares traded after premium subscribers received early access to market sentiment data in May of this year (source: Nanex, reported by CNBC)

Advantages From Speed are Diminishing

High frequency traders have been able to do a lot with nanoseconds – it has been worth it for them to spend millions on tiny advantages in information speed – both in purchasing premium data services and in building the networks that let them access data as fast as possible.

Each breaking news story that casts high frequency traders as the bad guys increases legislative pressure, and the restrictions are starting to pile up – for example, volume restrictions to protect against sharp market moves and possible new rules mandating a minimum longevity of market orders.

High profits drew a lot of capital-heavy players into high frequency trading (HFT) in the last few years, and those players have built even more high-speed networks.  So differentiation on speed of trading alone is going away – competition is up and margins are down, and increased legislation, along with the threat of more, is hastening this effect.

So if speed of information and transaction is not a sustainable advantage, where will these firms look to strengthen competitive differentiation?

May the Best Strategy Win

If you look at how value is created in HFT, it depends not only on how quickly an algorithm can act and react (network speed), but fundamentally on how competitive the algorithm is and how quickly it is deployed. This is where the competitive battlefield is moving, and these are big data problems.

The Best Strategy Depends on a Big Data Approach

Anyone who develops the mathematical algorithms that are used in HFT will tell you that a model that is tested against more data is going to be a better model.  The big winners in this space have moved from testing their trading algorithms against weeks or months of market data to years of data.  And they have added new sources of data like social media and news feeds so ‘sentiment analysis’ can help inform automated trading decisions.  So the same algorithm might be tested against 100 or even 1000 times more data than it used to be.  This is not an easy thing to do and it is the next obvious area of competitive differentiation.
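To put a rough number on why more test data helps, here is a minimal Python sketch. It assumes per-trade returns are independent draws from a fixed distribution, which real market data is not, and the function names and parameter values are hypothetical, chosen only for illustration. Under that assumption, the uncertainty in a strategy's estimated average return shrinks with the square root of the sample size, so 100 times more data gives an estimate roughly 10 times tighter.

```python
import math
import random
import statistics


def simulate_returns(n, mean=0.0002, vol=0.01, seed=42):
    """Generate n synthetic per-trade returns (illustrative values only)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, vol) for _ in range(n)]


def standard_error(returns):
    """Standard error of the mean return: sample stdev / sqrt(n)."""
    return statistics.stdev(returns) / math.sqrt(len(returns))


if __name__ == "__main__":
    small = standard_error(simulate_returns(1_000))
    large = standard_error(simulate_returns(100_000))
    # With 100x more samples the ratio should be close to 10.
    print(f"1,000 trades:   standard error = {small:.6f}")
    print(f"100,000 trades: standard error = {large:.6f}")
    print(f"ratio: {small / large:.1f}")
```

This is a toy model of the statistics only. It says nothing about regime changes or correlated returns, which are among the reasons practitioners want years of history rather than weeks.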

I say ‘obvious’ because it is already happening. “A[n]… arms race is occurring with big data today. If you ignore it, it’s like trying to conduct a trade on a rotary phone while others have their computers in the data center”[1]

So what happens when you try to take your current infrastructure and get it to process 100 to 1000 times more data in the same or less time?  Right, it just can’t do it. If you cut the job up into problems 1/100th or 1/1000th the size and batch them through, you have a lot of moving pieces to stitch back together, which is tricky and time consuming. So, that’s not the answer.
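A small Python sketch shows where the stitching pain comes from. It uses a rolling mean as a stand-in for any calculation that looks back over a window of ticks; the function names are hypothetical and this is not how any particular trading system works. Each chunk needs the tail of the previous one carried over, or the results at every chunk boundary go missing.

```python
def rolling_mean(prices, window):
    """Rolling mean over a full series; one value per complete window."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def rolling_mean_chunked(prices, window, chunk_size):
    """Same result computed chunk by chunk, with explicit boundary handling."""
    out = []
    carry = []  # the last (window - 1) prices seen so far
    for start in range(0, len(prices), chunk_size):
        chunk = carry + prices[start : start + chunk_size]
        out.extend(rolling_mean(chunk, window))
        # Keep the tail so the next chunk can complete windows that
        # straddle the boundary.
        carry = chunk[-(window - 1):] if window > 1 else []
    return out


if __name__ == "__main__":
    prices = [float(i * i % 7) for i in range(25)]
    whole = rolling_mean(prices, window=4)
    pieces = rolling_mean_chunked(prices, window=4, chunk_size=6)
    print(whole == pieces)
```

Even in this tiny case the chunked version needs extra state and an edge-case guard. Multiply that by every windowed calculation in a strategy and by hundreds of chunks, and the bookkeeping becomes a project of its own.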

You might think flash storage would solve the problem, but it turns out that, in this case, flash is not so flashy. Any form of large cache in the storage subsystem is fundamentally the wrong architecture for historical data held in big column-oriented databases – i.e. exchange ‘tick’ data. When you’re working with historical data, you are constantly reading through all or most of it, sequentially.  As soon as you have a big data problem – like multi-year tick data – a flash device falls over the instant any non-warm data is accessed.
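The access-pattern argument can be illustrated with a toy simulation in Python. It assumes a plain least-recently-used (LRU) cache and repeated full sequential scans; it is a generic model with a hypothetical function name, not a model of any specific flash product. When the data set is larger than the cache, every block is evicted before the scan comes back around to it, so the cache never gets a hit.

```python
from collections import OrderedDict


def lru_hit_rate(num_blocks, cache_blocks, passes=3):
    """Simulate repeated sequential scans through an LRU cache.

    Returns the fraction of block reads served from the cache.
    """
    cache = OrderedDict()
    hits = 0
    reads = 0
    for _ in range(passes):
        for block in range(num_blocks):
            reads += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / reads


if __name__ == "__main__":
    # Data set fits in the cache: every pass after the first is all hits.
    print(lru_hit_rate(num_blocks=100, cache_blocks=200))
    # Data set is 5x the cache: the hit rate is 0.0.
    print(lru_hit_rate(num_blocks=1000, cache_blocks=200))
```

Real caches use smarter policies than plain LRU, so treat this as a picture of the worst case rather than a benchmark.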

Don’t take my word for it – check the STAC-M3 benchmark results, where DataDirect Networks (DDN) consistently demonstrates that the massively parallel DDN SFA architecture is often faster than flash and is up to 8x faster than traditional storage.

Having benchmarks is especially important in this industry where ‘serious’ players will almost never talk about what they are using.  However, we can summarize the basic characteristics and experience of our customers as long as they are anonymized sufficiently to not be too recognizable.

For example, take the large US prop trading firm that took their NAS architecture as far as it could go and still needed 5x more performance to bring many new strategies to bear quickly and fail out fading or unsuccessful strategies in as close to real time as possible.  Or the international bank that finally had to throw out both a top enterprise storage vendor and a top server vendor for failing to deliver the speeds needed to feed hundreds of program trading clients. These and many more have found the performance they need today and moved into the big data area of algorithmic trading with DDN.

Check out DDN for blind case studies – there will be several over the next few months highlighting how Hedge Funds, Proprietary Trading firms and the trading arms of large banks are adopting a big data approach based on DDN to build a FAIR market advantage based on getting more successful strategies into the market, faster.

[1] Posted May 6, 2013 by Ana Andreescu in Big Data, Thought Leadership Building a Fair Market Advantage

  • Laura Shepard
  • Senior Director of Marketing
  • Date: July 18, 2013