
As a guy who was lurking around for basically the whole duration of CouchDB development, I would argue that Damien did not fall prey to second-system syndrome. He's just too much of a pragmatist for something like that. The soundness of Damien's engineering views and approaches is hugely similar to that of Linus Torvalds.

The truth in my eyes is that CouchDB is misunderstood, the same way Lotus Domino is misunderstood. And if the majority of users misunderstand what it is good at, then it is not going to be used in an optimal fashion.

My guess is that Damien is going to build a database that does the shit people expected CouchDB to do. And I believe that having Damien on it is a hedge that ensures the goods are going to be delivered.



Would you mind explaining why Lotus Domino is misunderstood? This is a genuine question, by the way, not the kind that asks for debate/arguments. I have friends who build Lotus Domino apps, and since I don't know much about the platform, perhaps you can share the story behind your statement.


Oh sure.

There are a few key points I could analyze, but I will just briefly touch on the CAP theorem. The CAP theorem states that a distributed datastore can guarantee at most two properties from a pool of three (Consistency, Availability, Partition tolerance). CA-type databases are the vast majority of datastores in use out there (everything SQL), while Domino and CouchDB (CouchDB is sort of a Free Software Domino) are AP-type databases.

What does that mean? Well, first it means that a lot of the design patterns commonly used in CA CRUD apps go out the door and a different approach is required.

Let me list some:

1. No JOINs. You can either store the referenced data directly in the referring document OR you can save a key reference and do another query to get the related data. It may not seem like a lot, but once datasets start growing, this can be quite a pain.

2. No ad-hoc querying. There is no concept of "let me open psql and prod the data a bit", not in a production-scale database at least. This becomes a point of contention when customers want a way or a tool that enables them to create arbitrary reports. Usually this can be worked out with a bit of patience and foresight (let me build you another view), but humans suck at that kind of behavior.

3. They are not (really) scalable. Replication in Couch and Domino is not really intended as a performance measure. It is more of a failover and data portability measure. Also, in this kind of distributed database, you need to pay attention to a sort of "write jurisdiction". Couch touts "advanced conflict resolution mechanisms", which is true until the same field is modified in two different replicas. It is impossible to merge this kind of conflict without losing data, and a human has to decide what gets to stay. The issue with this is that people seeing a "replication conflict" don't think "Oh, we had a race condition, let me resolve this" - they think "Oh, it's the stupid, crappy, goddamn database again, we really should move to SQL".

4. No schema. A document is a bucket; you throw in whatever you please. This may be a good thing or a bad thing. Depends. But if you are too enthusiastic about it, you might wake up to quite a headache one morning in a couple of years.

5. (Couch specific) Map/Reduce: Perhaps originally it was intended to become a cluster-level querying mechanism, but the truth is that, although it is a very smart indexing mechanism, people associated it with Google's MapReduce (which was a big, big buzzword back in 2006), which led to a lot of disappointment on the users' side.

6. (Domino specific) IBM doesn't really know what to do with the Domino platform. They tried to kill it, but failed at it. Then forgot to market it. Then remembered to market it, but forgot to develop it. Then remembered to develop it, but failed at it. Then bolted some Java abomination on top of it. It's the kind of MBA stuff nobody really understands. Why is this important? Because Domino is a pretty good application platform for developing "access" type applications for businesses, and IBM hates that part, because this kind of stuff is IBM GCS turf, and why would they let you develop an internal app for $1,000 when they can rob you of $50K?
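
To make point 3 a bit more concrete, here is a minimal Python sketch of a field-wise three-way merge between two replicas. This is a hypothetical application-level helper (the name `merge_revisions` and the sample documents are made up for illustration), not the actual API or built-in behavior of CouchDB or Domino. It only shows why edits to different fields can be merged mechanically, while two edits to the same field cannot.

```python
from typing import Any

# Sentinel for "field not present", so that None stays a legal field value.
_MISSING = object()


def merge_revisions(
    base: dict[str, Any], ours: dict[str, Any], theirs: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Field-wise three-way merge (illustrative assumption, not a real DB API).

    base   -- the last revision both replicas agreed on
    ours   -- the document as edited on replica A
    theirs -- the document as edited on replica B

    Returns (merged_document, conflicting_field_names).
    """
    merged: dict[str, Any] = {}
    conflicts: list[str] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b = base.get(key, _MISSING)
        o = ours.get(key, _MISSING)
        t = theirs.get(key, _MISSING)
        if o == t:
            value = o          # both replicas agree (or neither touched it)
        elif o == b:
            value = t          # only replica B changed this field
        elif t == b:
            value = o          # only replica A changed this field
        else:
            conflicts.append(key)  # both changed it differently: a human decides
            continue
        if value is not _MISSING:  # a field deleted on one side stays deleted
            merged[key] = value
    return merged, conflicts


if __name__ == "__main__":
    base = {"title": "Report", "status": "draft"}
    replica_a = {"title": "Report", "status": "review"}
    replica_b = {"title": "Report", "status": "final"}
    merged, conflicts = merge_revisions(base, replica_a, replica_b)
    print("merged:", merged)
    print("needs a human:", conflicts)
```

In the demo both replicas changed `status`, so the field is reported as a conflict instead of being silently overwritten. That is the "replication conflict" somebody eventually has to look at.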

The main point is that with Couch/Domino you can do pretty much everything you can do with relational data stores, but it will look different. It will feel different, and it means some compromises you might not expect (or at least are not used to).
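
As a rough illustration of "it will look different" (points 1, 2 and 5 above), here is a small Python sketch that uses a plain dict as a stand-in for a document store. Everything in it is an assumption made for the example: the document layout, the `build_view` helper and the `orders_by_customer` map function are hypothetical and are not CouchDB's or Domino's real API. The idea is that instead of a JOIN you build a view ahead of time with a map function, and then answer the question with two key lookups.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable

Doc = dict[str, Any]
MapFn = Callable[[str, Doc], Iterable[tuple[str, Any]]]


def build_view(docs: dict[str, Doc], map_fn: MapFn) -> dict[str, list[Any]]:
    """Run map_fn over every document once and group emitted values by key.

    This plays the role of a precomputed view (an index built in advance).
    """
    index: dict[str, list[Any]] = defaultdict(list)
    for doc_id, doc in docs.items():
        for key, value in map_fn(doc_id, doc):
            index[key].append(value)
    return dict(index)


def orders_by_customer(doc_id: str, doc: Doc) -> Iterable[tuple[str, Any]]:
    """Map function: emit one (customer_id, order summary) row per order."""
    if doc.get("type") == "order":
        yield doc["customer_id"], {"order": doc_id, "total": doc["total"]}


def customer_with_orders(
    docs: dict[str, Doc], view: dict[str, list[Any]], customer_id: str
) -> Doc:
    """The 'join': one lookup for the customer, a second one in the view."""
    customer = docs[customer_id]
    orders = view.get(customer_id, [])
    return {
        "name": customer["name"],
        "orders": orders,
        "total": sum(order["total"] for order in orders),
    }


if __name__ == "__main__":
    docs = {
        "customer:1": {"type": "customer", "name": "Acme"},
        "order:1": {"type": "order", "customer_id": "customer:1", "total": 120},
        "order:2": {"type": "order", "customer_id": "customer:1", "total": 80},
    }
    view = build_view(docs, orders_by_customer)
    print(customer_with_orders(docs, view, "customer:1"))
```

The catch is the one from point 2: the view only answers the question it was built for. A new report usually means writing another map function and building another view, not typing a new query.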

Honestly, all things considered, after years of experience, there are indeed very few problems that call exactly for a Couch/Domino type of database. However, I certainly see how a DB of this kind should find its place in each and every major information system.



