DataSift Announces Mega-Round. Apple Buys Topsy for $200 Million. Here’s What You Need to Know

Posted on Dec 3, 2013 | 30 comments


I’m super proud to announce that DataSift has just completed a $42 million financing round coming at the end of a year where its revenue grew several hundred percent year-over-year. Considering our revenue is SaaS revenue this achievement is even more remarkable.


The timing of this announcement couldn’t have been better if we had tried. Yesterday it was announced that Apple had acquired one of our competitors, Topsy, for more than $200 million. As this astute journalist pointed out, DataSift “likely would have cost a lot more to acquire.”

What gives? Why all the fuss about the Twitter firehose?

I started announcing my Twitter thesis back in 2011 (still serves as a useful read today). I stated that Twitter provided

  • Identity
  • Object Communications (now often called “the Internet of Things”)
  • Predictive Data
  • Augmented Data

And before that you might enjoy this longer analysis on why I invested in DataSift in the first place, which was written 2.5 years ago and still rings true today, stating the unique Twitter attributes that are disruptive:

  • Real time
  • Open
  • Asymmetric
  • Social
  • Viral
  • Location Aware
  • Referral Traffic
  • Explicit Indicator (intent)
  • Implicit Indicator (what can you infer about me)

If you want details on the bullets, they’re in the posts above.

Put simply, the amount of public, real-time information that is now being created by hundreds of millions of users and soon billions of objects will change the way every major business, organization or government must operate.

It isn’t simply that a signal is created when a leader in the US calls off negotiations with Iran on Twitter, or when a leader from Iran publicly rebuffs that Tweet. It is the unseen. It is the oil pipeline explosion in Nigeria that is Tweeted before people even know a disruption may happen. It is the fact that somebody who follows hate groups on Twitter, and not commensurate opposing views, is about to be part of a selection group to be considered in an important trial. Those are the obvious cases.

But what if you’re a credit card company and you want to know where to find your next customers? Wouldn’t it make sense to look for graduation Tweets from high school or college? If you’re TrueCar, wouldn’t you want to identify Tweets based on your geography and look for keywords like “crashed my car,” “totalled,” or “thinking about buying a new car. should I go Audi or BMW?”
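To make the TrueCar example concrete, here is a minimal sketch in Python of filtering posts by geography and intent keywords. It is not DataSift’s query language or anybody’s production code. The `Post` type, the phrase list, the helper names and the rough Los Angeles bounding box are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical phrases suggesting someone may soon be shopping for a car.
INTENT_PHRASES = ("crashed my car", "totalled", "buying a new car")


@dataclass
class Post:
    text: str
    lat: float
    lon: float


def in_bounding_box(post, south, west, north, east):
    """Return True if the post's coordinates fall inside the box.

    Simplification: does not handle boxes that cross the antimeridian.
    """
    return south <= post.lat <= north and west <= post.lon <= east


def matches_intent(post, phrases=INTENT_PHRASES):
    """Case-insensitive substring match against any of the phrases."""
    text = post.text.lower()
    return any(phrase in text for phrase in phrases)


def find_leads(posts, box):
    """Keep posts that are inside the box and mention an intent phrase."""
    return [p for p in posts if in_bounding_box(p, *box) and matches_intent(p)]


# Rough, illustrative box around Los Angeles: (south, west, north, east).
LA_BOX = (33.7, -118.7, 34.4, -118.1)

sample = [
    Post("Just crashed my car on the 405, ugh", 34.05, -118.25),
    Post("Thinking about buying a new car. Should I go Audi or BMW?", 40.71, -74.00),
    Post("Great tacos tonight", 34.02, -118.40),
]

leads = find_leads(sample, LA_BOX)
```

Only the first sample post survives: the second matches a phrase but sits outside the box, and the third is local but shows no intent. A real system would need far better matching than substrings, but the shape of the filter is the same.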

How can businesses not incorporate information into their marketing and sales funnels? How can governments not track hooligans, terrorists or criminals who give off public information?

Some startups I talk with mistakenly believe you can poll the Twitter API directly to get the feed but the Twitter API isn’t full fidelity, doesn’t have the full historical data corpus and isn’t real time.

But here’s the thing my partners & I love the most about DataSift and why we never would have considered selling for anything like $200 million.

Twitter is just the beginning.

DataSift is a real-time data processing platform that can be used with any data source including your internal data. It’s one thing to have “big data” initiatives with terabytes of data stored to query at any moment. But in a world where time is critical to decision-making and much of the data is flowing through public & private systems and perhaps not even in your data store yet – we believe real-time processing of data will become as valuable as big data storage itself.

Already 2/3rds of our customers are ingesting 2 or more data sources including Facebook, Tumblr, WordPress, Bit.ly and so on and we do private implementations with the likes of Yammer and others.

For technical teams we have a scripting language that allows teams to build complex queries from multiple data sources and ingest them in one single API stream. For marketers or business professionals we have built a visual query builder that allows you to select data sources and run human language queries against the data, and we will do the data extraction for you (and auto-generate the query language if your tech team wants to maintain or edit it).
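For readers who want a feel for what “multiple data sources in one single stream” means, here is a rough Python sketch. It does not use DataSift’s scripting language or API. The source names, the `(timestamp, text)` record shape and the `unified_stream` helper are hypothetical, and it assumes each source already delivers its items in timestamp order.

```python
import heapq


def tag_source(name, items):
    """Attach the source name to each (timestamp, text) item."""
    for timestamp, text in items:
        yield (timestamp, name, text)


def unified_stream(sources, keyword):
    """Merge time-ordered sources into one stream, keeping keyword matches.

    heapq.merge is lazy, so items are pulled from each source only as needed.
    """
    merged = heapq.merge(*(tag_source(n, items) for n, items in sources.items()))
    needle = keyword.lower()
    for timestamp, name, text in merged:
        if needle in text.lower():
            yield {"timestamp": timestamp, "source": name, "text": text}


# Made-up sample data; timestamps are plain integers for simplicity.
sources = {
    "twitter": [(1, "new phone launch today"), (4, "lunch time")],
    "tumblr": [(2, "Phone review posted"), (3, "cat pictures")],
}

results = list(unified_stream(sources, "phone"))
```

The consumer sees one ordered feed and never has to care which network an item came from, which is the point of a single API stream.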

No other vendor in the market offers a single API, a scripting language and a visual query builder, and it’s these and other feature sets that have driven DataSift’s astronomical growth.

And from an investment perspective we remain incredibly long DataSift. Upfront is an early-stage investor. We normally look to invest our first money below a $20 million valuation and when deals get to lofty prices we normally bow out to later-stage investors who have deeper pockets.

Not so DataSift. We co-led the A-round with IA Ventures. We co-led the next round with IA Ventures without even asking other VCs to participate so we did an A-1 round. We knew we had a winner. In the B round we invested the maximum amount we could alongside the lead – Scale Venture Partners. And even in this growth equity round led by Insight Partners we asked for our full prorata investment and took as much as we were allowed to.

Obviously I can’t predict the future and it’s up to the great team at DataSift to continue to execute as well as they have to date. What I am certain of is that real-time processing of big data (both public & private) is going to build some multi-billion dollar companies. And I believe we have as good a shot as anybody.

If you want to read the company’s take on their funding their official announcement is here.

It’s also worth noting what a great win this has been for the UK as our tech & product teams are still based outside of London and we continue to grow those operations under the guidance of Nick Halstead and Tim Barker. In short order that team will top 100 professionals as will our US operations headquartered out of San Francisco.

Huge congrats to everybody at DataSift whom I’ve enjoyed working with so much over the past 2.5 years. Nick – the incredible visionary behind the company and our technology. Rob – the CEO who came on pre-revenue and built out an amazing organization. Tim, my former co-founder and long-time colleague & friend who joined as global head of products. Pier, who has built a world-class sales organization and processes. Ming, who is the hero of so many customers whose primary reference to other customers is, “make sure you get a Ming.” Steve. Andrew. Lorenzo. And a host of many other people I’m leaving out.

I’d also like to express my gratitude to the great friends, investors and board members on one of the most active boards I’ve been involved with. Roger Ehrenberg. Rory O’Driscoll. Chris Smart. You’ve been marvelous.

Now can we please do at least one board meeting in LA?!?

 

  • http://arnoldwaldstein.com/ awaldstein

    Nice….congrats!

  • benjamindblack

    Good work

  • Ali Khoshgozaran

    Congrats Mark and the DataSift team. So this was the big announcement you were talking about :)
    DataSift is empowering our real-time search and discovery engine and we can’t wait for more datasources integrated and more features being added.
    Also totally agree with your long term stance on importance of real-time big data. I would just add that we see first hand how cross referencing real-time data coming from multiple real-time silos makes the information so much more valuable and actionable.

  • Jordan Thaeler

    What I really find interesting is that DataSift raised $7.5M before revenue hit the books (if Mark is correct in stating that Rob joined pre-revenue). I don’t think any enterprise company could raise money today without substantial revenue and even more likely EBITDA. It’s definitely thought-provoking to see VC become growth equity.

  • http://www.startupmanagement.org/ William Mougayar

    Nicely done. I like the transparency you offered in revealing a bit on the round participation splits among the various VCs. I’m curious as to why that is always a guarded secret.

    That said, there is a big need in continuing to make sense of the increasing mountain of social data. Does DataSift’s business come primarily from other product developers (like Sysomos) or are big co’s also buying their data directly.

    And curious if they plan to aggregate Disqus as I didn’t see it on the list.

  • Anders

    Congrats! I believe we have just scraped the surface of what can be done with real time big data analysis! Lets grab a coffee some day.
    Best,
    Anders Fredriksson
    Co-founder
    Shpare

  • http://bothsidesofthetable.com msuster

    thanks, ali. I checked out your website. I’d love to hear more of your views and meet the company. I DM’d you my details. Hope to chat soon.

  • http://bothsidesofthetable.com msuster

    VC didn’t become growth equity. We invested $5 million in an A round, which is a standard VC funding round. The company raised $1 million from seed investors alongside us. And had raised $1.5 million from angels prior to us.

    And yes, it was pre revenue. We invested in Nick Halstead’s product vision and technical implementation and the hope that Rob Bailey would join the company (he joined right after but we talked to him before we invested).

  • http://bothsidesofthetable.com msuster

    Disqus initially signed an exclusive deal with our competitor who offered them a rev share. I don’t know why they did this as it would have been better for them to agree to have more vendors selling their data. But I have to assume our competitor paid them more than market rates or they felt they could only work with one partner at a time for some reason.

    Regardless, we’re customer driven. Our customers demand: Facebook, WordPress, Tumblr and increasingly international sources. Disqus is a wonderful commenting system and a great team. So far our customers aren’t asking us for their data. If that changes we’ll aggressively pursue.

  • Ali Khoshgozaran

    Thanks Mark. Got your message. Will be in touch shortly.

  • http://mattamyers.tumblr.com/ Matt A. Myers

    It’s smart to keep your thoughts open to not only Twitter as a data source, much like USV’s investment in Coinbase isn’t solely for Bitcoin – though has been mentioned it could facilitate transactions of other similar currencies (or perhaps non-similar?).

    Congrats to everyone involved. Growth is exciting.

  • http://www.yanado.com/ Ivan Mojsilovic

    Congrats, I guess DataSift will be your first >1B exit so you can shut the mouths of those who don’t understand why you are so great (and I only read your blog!).

  • http://www.yanado.com/ Ivan Mojsilovic

    And kudos for Zemanta because it’s used on your blog!

  • http://LeanStartPad.com/ Jeff ‘SKI’ Kinsey

    Congratulations Mark on “seeing” the value early.

  • Eric Holmen

    Incredible timing! Luck favors the well prepared.

  • Ahmad al-As’ad

    Excellent timing. Congrats to you Mark and to the DataSift team.

  • iyerland

    I love your blog and the writing of your thought process, even when I don’t agree (as in this case). Companies that make tools to mine data are becoming (in my opinion) a commodity. If you go to any data conference, it’s amazing what tools are continually being put out there and it’s a very competitive space. It’s the companies that collect the data in the first place that will capture most of the value, so I’d invest in the next big data collection engine rather than the companies that are built on top of them (and yes, I’m biased).

  • iyerland

    btw, I should also say congrats and I have no doubt you’ll make a ton of money here regardless

  • phil_hendrix

    Mark – you and Ali would enjoy my report “Tuning into Consumers’ Digital Signals” (pdf @ http://bit.ly/pKrIWl). Highlights @Datasift along with others (@GeoIQ, @PlaceIQ, @Factual, etc.). Discusses emergence of real-time, location-awareness, etc.

  • laurayecies

    Congratulations to you and the entire team. This is especially impressive when you point back to the original thesis on both Twitter and the first investment. I do have a question though, you wrote that DataSift is the only vendor to offer the “single API” etc – I believe GNIP has a strong offering here http://gnip.com/products/realtime/data-collector/. BTW – I don’t see this as a weakness – if the market is as interesting as you write there will clearly be active competition as DataSift has with GNIP.

  • http://hirethoughts.blogspot.com/ Donna Brewington White

    Great Q William.

  • http://hirethoughts.blogspot.com/ Donna Brewington White

    This post is effervescent! My phone is crackling as I read it. Congratulations to you Mark and all involved. Thanks for sharing all this! Not that you could have helped yourself!

  • http://intelassets.com David McFeeters-Krone

    Mark,
    This is a free event on what the NGA is trying to do with big data. I thought you or DataSift might want to know.

    National Geospatial-Intelligence Agency (NGA) is hosting an Operations Technology Day, geared towards academia and industry, on 11 December 2013, in Springfield, VA. The event theme will be “Living in the Data”. For details, please see the link below or I can fill you in (though I am not connected with the event).

    https://www1.nga.mil/Partners/BusinessOpportunities/Pages/default.aspx

  • Jordan Thaeler

    Looks like you guys guessed right!

  • http://www.startupmanagement.org/ William Mougayar

    Thanks for the explanation.

  • http://www.rossjaklik.com/ Ross Jaklik

    Indeed, Twitter is just the beginning. Data mining for marketing (DMFM) will be a standard practice even among small and medium sized businesses in the very near future!

  • Pradeep Javangula

    Enterprise search companies have been implementing this idea of content acquisition and aggregation from multiple sources for a long time. Think Verity, FAST, Autonomy, Semio et al. who have had massive content aggregation, classification and semantic analysis infrastructure in the back-end. The thing that is different now, potentially, is that the incremental value of a tweet or a facebook post is infinitesimally small and evanescent; and a search-like interface is not the right paradigm to retrieve these. It isn’t entirely clear how much DataSift or Topsy have invested in the knowledge extraction from data sources of this nature, for analytics or business intelligence. Folks like Splunk have been doing this too in a specific realm, and have done well because it is highly useful and was an uncharted territory. Horizontal infrastructure plays are good to think about, but in my view have serious limits without focus. Specific question is – what business problem is DataSift solving for the enterprise and where is that budget coming from? If the answer is marketing, then they are too late and there are way too many players already in it. The application of data aggregation and knowledge extraction in specific vertical industries holds much greater promise – think Palantir. It takes a lot to build these sorts of platforms. Topsy acquisition is not in my view something to sneeze at, they know what they could and could not do – and Apple is getting core tech and team to help the content intelligence parts of the platform. Clearly, Google is the undisputed king of this infrastructure. Without a business model focus, this may not be all that hot.

  • tablespill

    cool

  • Noa

    Great news – and well deserved – for DataSift!
    One minor comment – you say “Some startups I talk with mistakenly believe you can poll the Twitter API directly to get the feed but the Twitter API isn’t full fidelity, doesn’t have the full historical data corpus and isn’t real time.”
    However, Twitter does have a streaming API which is very low latency – pretty much “real time”. I believe startups who are doing data analysis typically use that rather than the REST API.

  • Bob Solomon

    Congratulations! Mark, sorry for the late comment, but I strongly agree with you on real-time data analysis. I could never understand why anyone would want to analyze old information. In 1994, when I was brought on to head product development at CCA, we spun out a group to build one of the earliest data warehouse engines. We based the technology on what the team (an incredibly talented team) had learned working with one of the most powerful databases of its time, MODEL204. This time we created a flexible and portable engine optimized for OLAP. At the time, our competitors were also focused on eliminating the information silos in a company by populating a centralized data warehouse that contained integrated enterprise data – but it was “old” data. We focused on providing a real-time data warehouse that was populated using our real-time data replicator. The analysts of the day thought that it was impossible and that we were nuts. But it worked!

    Unfortunately, the board grew fearful of the competition and forced me to “kill” the data warehouse engine project. We sold it to Redbrick and we spun out a new company that I headed, Praxis International, that focused on the database replication engine. We quickly sold $1m worth of product and raised $6m from Goldman Sachs. We grew to $5m in yearly revenue and sold the company (a longer story). All of this was based on the premise that companies/people required real-time information analysis. I guess we were a bit ahead of the market.