This week I’m at MongoDB’s annual user conference, MongoDB World. The big news already happened in March with the release of MongoDB 3.0 and its WiredTiger storage engine. Now we’re left with relatively minor announcements, along with bigger questions about the future of the company and its relationship with the community around its open source database.
But first the news. The headliners this year are a benchmark saying MongoDB’s database is better than Couchbase and Cassandra (whose vendors presumably have their own benchmarks saying theirs is better) and BI integration that was already available from third-party vendors.
That the “big news” is a snoozer is no surprise. After all, the WiredTiger release was only four months ago. Moreover, MongoDB has had a near-complete turnover in management, which is hard to do without some distraction from product development. That said, interesting new features are coming in MongoDB 3.2:
- Data-at-rest encryption. Like Hadoop, MongoDB is bringing data encryption into the core product. According to Kelly Stirman, VP of product marketing and strategy at MongoDB, this will be configured as a separate storage engine based on WiredTiger. The two share a common codebase.
- Document validation. This is basically like constraints or a table structure in an RDBMS, and I forecast it when I covered the MongoDB 3.0 release. Now you can make sure at write time that your documents match the schema for which they are intended. If you've worked on a complicated MongoDB project, this is nice to have, though it is somewhat funny because it wasn't long ago that schemalessness was praised as a primary advantage of NoSQL. According to Stirman, this will be open-sourced.
- Dynamic lookups in the aggregation framework. These are basically left outer joins and nothing more. Stirman said the company didn’t want to use the word “join” because people might think it supports other types of joins. The feature will be part of the aggregation framework, which is open source.
- BI connectors for Tableau, BusinessObjects, Qlik, and Cognos. I see some potential here because you might avoid some ETL and analyze a replica of your operational store, but I'm a bit skeptical that SQL tools like Tableau will be all that effective for analyzing data in MongoDB. Technologically I thought this was a snoozer until I talked to Stirman. He explained the big differentiating feature was that MongoDB’s new connectors move more of the processing to the database, whereas most of the existing connectors do a lot of filtering and aggregation on the client. Having seen this with the rather crappy Hadoop connectors, I can say this is a real thing that matters. More interesting is the way this positions MongoDB.
- Mongo Scout schema visualizer. This tool is independent of the MongoDB Management Service (MMS).
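To make two of the items above more concrete, here is a minimal sketch in plain Python of what write-time document validation and a left-outer-join style lookup mean. It is not MongoDB's syntax or API and does not talk to a database; the function names `validate_on_write` and `left_outer_lookup`, the collections, and the field names are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]


def validate_on_write(collection: List[Document], doc: Document,
                      rules: Dict[str, Callable[[Any], bool]]) -> None:
    """Hypothetical helper: store doc only if every rule passes."""
    for field, check in rules.items():
        if field not in doc or not check(doc[field]):
            # The write is rejected instead of storing a malformed document.
            raise ValueError(f"document rejected: bad or missing field {field!r}")
    collection.append(doc)


def left_outer_lookup(left: List[Document], right: List[Document],
                      local_field: str, foreign_field: str,
                      as_field: str) -> List[Document]:
    """Hypothetical helper: left outer join of two lists of documents.

    Every document on the left side is kept. Its matches from the right
    side are attached as a list, which is empty when nothing matches.
    """
    joined = []
    for doc in left:
        matches = [r for r in right
                   if r.get(foreign_field) == doc.get(local_field)]
        joined.append({**doc, as_field: matches})
    return joined


# Assumed example data: an "orders" collection with two simple rules.
orders: List[Document] = []
rules = {
    "customer_id": lambda v: isinstance(v, int),
    "total": lambda v: isinstance(v, (int, float)) and v >= 0,
}

validate_on_write(orders, {"customer_id": 1, "total": 25.0}, rules)
validate_on_write(orders, {"customer_id": 3, "total": 9.5}, rules)

rejected = False
try:
    # Wrong type for customer_id and no total, so this write fails.
    validate_on_write(orders, {"customer_id": "oops"}, rules)
except ValueError:
    rejected = True

customers = [{"_id": 1, "name": "Ada"}, {"_id": 2, "name": "Grace"}]
result = left_outer_lookup(orders, customers, "customer_id", "_id", "customer")
```

In this sketch the order for the nonexistent customer 3 still appears in `result`, just with an empty `customer` list. That is the left-outer behavior: unmatched documents on the left are kept, while the unmatched customer on the right (Grace) never shows up.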
Unfortunately, there was nothing big on the MongoDB Management Service, which was the darling of the 2014 show. I was hoping by now for some Elastic-like vision of point-and-click deployment across your personal MongoDB cloud. Sadly, that is not to be yet.
Stirman did talk about a future where MongoDB might support the likes of additional in-memory options, write-behind caching, a full set of joins, multidocument transactions, and another storage engine, perhaps columnar with different types of indexes for analytics -- all of which would be much bigger stories than what MongoDB announced today.
Management and sales turnover
MongoDB probably won't go public or be acquired anytime soon. I spoke to the company’s new CEO, Dev Ittycheria, at the VIP party before the conference, and he described MongoDB’s investors as “patient.” Further, he mentioned that CTO and co-founder Eliot Horowitz is practically the only member of management whom he kept and that virtually the entire sales force has been turned over. You usually can’t do that and issue an IPO in the same year.
I'd spoken with a few former MongoDB sales team members, and they said their key dissatisfaction was that the value proposition of MongoDB the company had been to “support” the development of MongoDB. As one former MongoDB salesperson put it, “[Big companies] aren't going to pay you because they love you.”
I asked Ittycheria what the new strategy was. He said that previously the sales team was in the typical startup chaos, or as CTO Horowitz put it, “The development team was always ahead of the business and now they're finally catching up.” Ittycheria outlined a tiered plan: focus on the new “commercial offerings” that are kept out of the community version, work with customers who are scaling up and want to build a relationship, and avoid trying to milk community members who probably would never pay for the enterprise version.
A fractured community
During his keynote Ittycheria talked up the community in terms of downloads, students, attendees, partners, and customers. He mentioned people building a career around the database. It left me thinking, “Is that all community is to you?”
In fact, working with the community is a growing problem for MongoDB. My fellow Durhamite and CEO of Percona, Peter Zaitsev, has set up his own rebel base open community event outside the conference. I can say from experience that these forks are usually due to immaturity. When it happened to JBoss, it was partly the immaturity of the people leaving the company, who didn't really understand the upside they gave up or the legal constraints around them. It was also immaturity on the JBoss side in terms of its legal framework, its misunderstanding of risks, and how it handled criticism. It did result in changes. MongoDB's early fractures are nowhere near as serious, but they are important.
At the core is a question, now that the company has grabbed WiredTiger and built an “open core” model around MongoDB (Ittycheria made a Freudian slip at one point during his keynote and nearly said “monetized” rather than “monitored”): Do you lock things down or leave inroads for others? Today this is about the storage engine. Zaitsev reported that both Percona and its recently acquired Tokutek (a company that offers high-performance editions of MongoDB and MySQL) were locked out of the conference. Will they have the functionality they need?
To be fair, the MongoDB communications team asked me unprompted if I was interested in talking to Facebook’s RocksDB folks. There is a bit of a contrast in that RocksDB is unlikely to compete with MongoDB, whereas Percona plans to offer support even for WiredTiger. Yet other companies have successfully navigated to “co-opetition,” as Zaitsev put it.
When I asked Stirman about all of this, he said the Percona thing happened in only the last month. Facebook is clearly not a MongoDB competitor, whereas Percona is. “I’d say that we’re still sorting that out and it really depends on the situation. RocksDB with Facebook, it is unlikely that Facebook is going to get into the business of selling support and certification for RocksDB for MongoDB, but if there were a lot of interest in the community that is something we could potentially offer.
“With TokuMXse [Tokutek’s storage engine for MongoDB 3.0], that’s different. That is Percona’s business model. I think in the market right now you would either buy from Percona or from MongoDB based on which storage engine you want to use.”
Zaitsev characterized MongoDB as “reluctant open source” and said it is “open source as in ‘here is your GitHub,’ but from a community standpoint it behaves more like Oracle, a traditional big enterprise company.” Yet even Oracle went so far as to sponsor meetups when Percona toured.
Stirman characterizes this more as a situation that evolved too quickly for MongoDB to adapt. “The Percona situation emerged in the past month or so. We’re still establishing how that’s going to work. But the precedent for Percona with Oracle or MySQL, they contribute a lot of code back to the codebase. There is competition with Oracle on the commercial end, but on the community end there is a lot of cooperation where both contribute to the codebase. I expect something similar will emerge with MongoDB.”
However, given that much of what is being released was developed privately but will be “blessed” upon the community GitHub after the fact, some of Zaitsev’s criticism rings true. This isn’t being run very inclusively.
Can open core and community coexist?
I asked Ittycheria where the company will draw the line between the open core and enterprise versions. Is it going to get a bit absurd like in Elasticsearch where LDAP integration -- something like 500 lines of code -- is considered “enterprise?” Ittycheria suggested that the line would be drawn around the idea of massive deployment and management rather than more core features.
Herein lies the question. Is it possible to run these JBoss-style open source startups (where the company holds and manages the code) without ugly forks in the community? Can MongoDB pursue its core business interests, monetize the project, and still let an ecosystem thrive around it?
To be honest, as critical as I am of the Apache way of doing things, I haven't seen this done yet outside of a more “foundation” model. What happens if the ecosystem shuts down, the community version becomes a limited beta, and you can't actually contribute anything and see it get into the “supported” version you use? Do you eventually defeat the entire benefit of open source other than price (and maybe not needing a source code escrow, which is really a function of price)?
I think JBoss was largely unsuccessful at navigating these waters; there were forks that didn't need to happen. The businessman might ask, “Who cares?” But I happen to know that these problems cut the value of the company and made JBoss look for acquisition a bit earlier than it might have. There were unforeseen consequences of the CDN (Core Developers Network)/JBoss split. I might have sailed the Caribbean had it been better handled. If MongoDB fails this test, it may see the price of its eventual acquisition or often-rumored IPO cut significantly.