Treasure Data: Next Generation Data Warehousing Platform (ngDW)
Blog by Tim Guleri, MD, Sierra Ventures
Today is a great day in Sierra Ventures’ three-decade history of backing Big Data companies. Our latest investment was announced today: Treasure Data, with a Series A led by Sierra Ventures. I am proud to partner with the founders, Hiro Yoshikawa and Kaz Ohta, in pursuing their vision to build the next generation of big data processing platforms.
A bit of history
Long before data was “big” and the issues around loading, transforming, storing and analyzing data became as mainstream as they are today, a startup company called Teradata approached Sierra Ventures for venture funding. Back then (the 1980s), the core issue was having a data processing platform separate from your transactional system, so that enterprises could perform decision support at scale. Teradata rode that wave. We took Teradata public (NASDAQ: TDC) in 1987, and today it continues to be a powerhouse in Data Warehousing with operations in 42 countries. This was Data Warehousing’s first wave, or DW 1.0.
At the beginning of the 2000s, with the advent of the Web, the demands on Data Warehousing began to change. In 2005, when I was trying to find out what was next in database technologies, I remember talking to a CIO of a Fortune 50 company who was complaining about the “Teradata Tax”: they were paying $20M a year to Teradata and still not getting what they needed. Their corporate data was growing so fast that Teradata’s solution was not cost effective. At that time I was a Series A investor in another fast-growth open source company called Sourcefire (NASDAQ: FIRE), and I went looking for an open source disrupter in Data Warehousing.
The “Big Data” Cold Call
I googled “open source Data Warehousing” and read about a company called Greenplum that had taken the core of Postgres and built a Shared-Nothing MPP architecture optimized for Data Warehousing. I picked up the phone to call Greenplum, and the guy who answered was founder Scott Yara. I went on to lead the Series B in Greenplum and was the largest investor in the company (building on the hard work and patience of my good friend and great investor Leo Spiegel from Mission Ventures, who led the Series A). From 2005 to 2010 Scott and I partnered in many ways, starting with adding Bill Cook as CEO to the team, adding three 1M+ customers to the customer list, including T-Mobile, MySpace, and Zions Bank, and building a world-class management team. In my journey at Greenplum, I believe I had the pleasure of working with one of the best entrepreneurs of this decade: Scott Yara. Scott not only possesses a brilliant technical mind with the ability to “look around the corner” and build products that markets want, but he does it with humility, grace and his signature infectious laugh.
Greenplum went on to be one of the most successful Data Warehousing 2.0 companies and was bought by EMC in 2010. It was an excellent reward for the founders and management team and a great venture return for Sierra. Greenplum (and Scott) continue their Big Data journey at Pivotal, the Big Data company owned by EMC, and are doing many wonderful things there to push Big Data processing to meet next-generation needs.
The need for Next Generation Data Warehousing (ngDW)
Despite the no-brainer economics for customers of buying Data Warehousing 1.0 or 2.0 technologies, given the business impact these systems deliver, one thing always remained elusive: Time to Value. No matter how much a customer paid or how skilled a team it assembled, it took six to nine months after the purchase for the customer to start seeing value.
The hidden reality of all big data projects is that 70% of the cost and time goes into preparing the data before one can analyze it. Preparation of data includes data ingestion at scale, data cleansing, schema generation, and data storage/normalization.
The other key reason for delay in Big Data projects is the procurement of infrastructure, and the constant need to keep adding memory/hardware to keep up with the growing needs of processing data at scale.
The final hurdle is buying or getting a Business Analytics or Business Intelligence infrastructure to work on the collected data in order to produce actionable reports or analytics.
So, as I started looking at this space again to make my next “big data” investment, I was resolved to find a company that, at its core, solved these problems.
Enter Treasure Data!
I was introduced via email to Hiro and Kaz by another entrepreneur for whom I have a great deal of respect. When I met Hiro and Kaz I had a “Yara” moment. Our initial meeting took me back to 2005, when I had first met Scott Yara and he had explained the power of Greenplum. The Treasure Data guys, like Scott, were humble and brilliant, just the type of personalities I love working with. They had built a next-generation multi-tenant cloud service with an integrated data ingestion architecture whereby you could ingest data first and define the schema later. Once ingested, the data could be manipulated, compressed and stored. The resulting data is presented through an elegant API-driven interface to any existing BI infrastructure. Truly a next-generation data pipeline architecture. And the business results were no less spectacular. Since launching the service in Q4 2012, Treasure Data had already closed 70 enterprise customers. I was hooked! As I continued to work with them, we went on several sales calls together. A few of the sales calls were to Sierra’s CIO Advisory Board members who had expressed interest in meeting them. Along the way, we developed a shared vision of where we could take this company over the next decade.
As Amazon Web Services (AWS) has shown, customer self-service and superior economic value always trump large, expensive sales forces, brigades of consulting bodies and top-down selling motions, a game that has been perfected by the IBMs, SAPs and Teradatas of the world. This game has permanently changed. It is my strong belief that the next billion-dollar enterprise software company will have bottom-up buying, rapid time to value and high-impact business value. That’s our mission at Treasure Data in the Data Warehousing category, and with Hiro and Kaz at the helm, I’m looking forward to building a phenomenal company that is the panacea for the Enterprise “Big Data” pain.