Translation work
Original text
NoSQL DB comparison and analysis
MongoDB
- Implementation : C++
- Main point : Keeps some SQL-like properties (queries, indexes, etc.)
- License : AGPL
- Protocol : Custom, binary (BSON)
- Detailed features
- Master/slave replication (auto failover with replica sets)
- Sharding built-in
- Queries are JavaScript expressions
- Run arbitrary JavaScript functions server-side
- Better update-in-place than CouchDB
- Uses memory mapped files for data storage
- Performance over features
- Journaling (with --journal) is best turned on
- On 32-bit systems, limited to 2.5 GB
- An empty database takes up 192 MB
- GridFS to store big data + metadata (not actually an FS)
- Has geospatial indexing
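- Main use cases
- If you need a wide variety of queries
- If you prefer indexes to map/reduce functions
- If you want good performance on a big DB
- If you would like to use CouchDB, but your data changes so much that it would fill up disks
- Example uses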
- Most things that you would do with MySQL or PostgreSQL, in cases where having predefined columns really holds you back.
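
The points above about varied queries over documents without predefined columns can be illustrated without a server. The following is a minimal sketch, in plain Python, of matching schemaless documents against query documents written in the style of MongoDB's `$gte`, `$lt`, and `$in` operators. The `matches` and `find` helpers and the sample `users` data are hypothetical and exist only for illustration; this is not the MongoDB driver API, and a real server would rely on indexes rather than a linear scan.

```python
import operator

# Comparison operators named after MongoDB's query operators.
# This is a toy, in-memory matcher, not the MongoDB driver API.
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, choices: value in choices,
}


def matches(document, query):
    """Return True if `document` satisfies every condition in `query`."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gte": 18, "$lt": 65}}.
            for op_name, operand in condition.items():
                # In this sketch a missing field never matches an operator.
                if value is None or not OPERATORS[op_name](value, operand):
                    return False
        elif value != condition:
            # Plain form, e.g. {"city": "Seoul"}: exact equality.
            return False
    return True


def find(collection, query):
    """Filter a list of dict documents, loosely like a find() call."""
    return [doc for doc in collection if matches(doc, query)]


# Documents in one collection need not share the same fields.
users = [
    {"name": "kim", "age": 31, "city": "Seoul"},
    {"name": "lee", "age": 17},
    {"name": "park", "age": 45, "city": "Busan", "tags": ["admin"]},
]

adults = find(users, {"age": {"$gte": 18}})
print([u["name"] for u in adults])
```

Note that the second document has no `city` field and the third has an extra `tags` field, yet all three live in the same collection. That is the situation the "predefined columns hold you back" remark refers to.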
CouchDB
- Implementation : Erlang
- Main point : DB consistency, ease of use
- License : Apache
- Protocol : HTTP/REST
- Detailed features
- Bi-directional replication
- continuous or ad-hoc
- with conflict detection
- thus, master-master replication
- MVCC - write operations do not block reads
- Previous versions of documents are available
- Crash-only (reliable) design
- Needs compacting from time to time
- Views: embedded map/reduce
- Formatting views: lists & shows
- Server-side document validation possible
- Authentication possible
- Real-time updates via _changes
- Attachment handling
- thus, CouchApps (standalone JS apps)
- jQuery library included
- Main use cases
- For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
- Example uses
- CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
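
The MVCC and conflict-detection points above can be sketched as a small in-memory store in which every update must name the revision it was based on. This is a toy model under stated assumptions: the `TinyDocStore` class, the `Conflict` exception, and the revision format (a counter plus a truncated SHA-256 digest) are hypothetical simplifications, not CouchDB's actual implementation or API. On a real server the same situation shows up as an HTTP 409 response when a stale `_rev` is sent.

```python
import hashlib
import json


class Conflict(Exception):
    """Raised when an update is based on a stale revision."""


class TinyDocStore:
    """Toy in-memory store that keeps every revision of every document."""

    def __init__(self):
        self._history = {}  # doc_id -> list of (rev, body), oldest first

    def put(self, doc_id, body, rev=None):
        versions = self._history.get(doc_id, [])
        current_rev = versions[-1][0] if versions else None
        if rev != current_rev:
            # The writer did not see the latest version: reject the write
            # instead of silently overwriting someone else's change.
            raise Conflict(f"{doc_id}: expected rev {current_rev}, got {rev}")
        payload = json.dumps(body, sort_keys=True).encode()
        digest = hashlib.sha256(payload).hexdigest()[:8]
        new_rev = f"{len(versions) + 1}-{digest}"
        versions.append((new_rev, dict(body)))
        self._history[doc_id] = versions
        return new_rev

    def get(self, doc_id, rev=None):
        """Return the latest body, or the body stored under `rev`."""
        versions = self._history[doc_id]
        if rev is None:
            return versions[-1][1]
        for stored_rev, body in versions:
            if stored_rev == rev:
                return body
        raise KeyError(rev)


store = TinyDocStore()
rev1 = store.put("customer-1", {"name": "Kim", "tier": "basic"})
rev2 = store.put("customer-1", {"name": "Kim", "tier": "gold"}, rev=rev1)

try:
    # A second writer still holding rev1 is detected as a conflict.
    store.put("customer-1", {"name": "Kim", "tier": "silver"}, rev=rev1)
    conflict_detected = False
except Conflict:
    conflict_detected = True

print(conflict_detected)
print(store.get("customer-1", rev=rev1)["tier"])
```

Because old revisions are only appended, readers are never blocked by a writer, and the history grows until something trims it. The sketch never trims; the "needs compacting from time to time" item above is the real-world counterpart of that growth.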
HBase
- Implementation : Java
- Main point : Billions of rows x millions of columns
- License : Apache
- Protocol : HTTP/REST (also Thrift)
- Detailed features
- Modeled after Google's BigTable
- Uses Hadoop's HDFS as storage
- Map/reduce with Hadoop
- Query predicate push down via server-side scan and get filters
- Optimizations for real-time queries
- A high performance Thrift gateway
- HTTP supports XML, Protobuf, and binary
- Cascading, Hive, and Pig source and sink modules
- JRuby-based (JIRB) shell
- Rolling restart for configuration changes and minor upgrades
- Random access performance is like MySQL
- A cluster consists of several different types of nodes
- Main use cases
- Hadoop is probably still the best way to run Map/Reduce jobs on huge datasets. Best if you use the Hadoop/HDFS stack already.
- Example uses
- Analysing log data
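
The scan-with-filter item above, combined with the log-analysis use case, can be sketched as a range scan over sorted row keys where the filter runs inside the scan. This is a toy model, not the HBase client API: the `table` contents, the `host#timestamp` row-key layout, and the `scan` and `is_error` helpers are all hypothetical. It assumes the usual scan convention of an inclusive start row and an exclusive stop row.

```python
from bisect import bisect_left

# Toy table: row key -> {"family:qualifier": value}. Keys are hypothetical
# and use "host#timestamp" so that one host's log lines sort together.
table = {
    "web01#20110301T1200": {"log:status": "200", "log:path": "/"},
    "web01#20110301T1201": {"log:status": "500", "log:path": "/login"},
    "web02#20110301T1200": {"log:status": "200", "log:path": "/"},
    "web02#20110301T1205": {"log:status": "404", "log:path": "/x"},
}
sorted_keys = sorted(table)


def scan(start_row, stop_row, row_filter=None):
    """Yield (key, row) for start_row <= key < stop_row, in key order.

    `row_filter` runs inside the scan, standing in for a server-side filter,
    so rejected rows are never handed back to the caller.
    """
    index = bisect_left(sorted_keys, start_row)
    while index < len(sorted_keys) and sorted_keys[index] < stop_row:
        key = sorted_keys[index]
        row = table[key]
        if row_filter is None or row_filter(row):
            yield key, row
        index += 1


def is_error(row):
    """Keep only rows whose HTTP status is 4xx or 5xx."""
    return row.get("log:status", "").startswith(("4", "5"))


# "web01#" .. "web01$" covers every key with the "web01#" prefix,
# because "$" is the character right after "#" in ASCII.
web01_errors = list(scan("web01#", "web01$", is_error))
print(web01_errors)
```

The design point is that the row key decides what can be scanned cheaply: putting the host first makes "all lines for one host" a contiguous range, while a query by status alone would still have to touch every row.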
Cassandra
- Implementation : Java
- Main point : Best of BigTable and Dynamo
- License : Apache
- Protocol : Custom, binary (Thrift)
- Detailed features
- Tunable trade-offs for distribution and replication (N, R, W)
- Querying by column, range of keys
- BigTable-like features: columns, column families
- Has secondary indices
- Writes are much faster than reads
- Map/reduce possible with Apache Hadoop
- All nodes are similar, as opposed to Hadoop/HBase
- Main use cases
- When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache's stuff.")
- Example uses
- Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.)
- Writes are faster than reads, so one natural niche is real-time data analysis
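
The (N, R, W) trade-off listed above can be made concrete with a little arithmetic. Taking N as the number of replicas of each row, W as the number of replicas that must acknowledge a write, and R as the number that must answer a read, a read is guaranteed to touch at least one replica holding the latest acknowledged write when R + W > N. The sketch below assumes that Dynamo-style reading of the three letters; the helper names are hypothetical, and the labels reuse Cassandra's ONE, QUORUM, and ALL consistency-level names purely as descriptions.

```python
def quorum(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def read_sees_latest_write(n, r, w):
    """True when every set of r replicas overlaps every set of w replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


n = 3  # replicas per row (an assumed replication factor)
settings = {
    "write ONE, read ONE": (1, 1),
    "write QUORUM, read QUORUM": (quorum(n), quorum(n)),
    "write ALL, read ONE": (n, 1),
}
for label, (w, r) in settings.items():
    print(label, "->", read_sees_latest_write(n, r, w))
```

Lower W makes writes cheaper and more available at the cost of possibly stale reads, which fits the write-heavy logging use case above; raising R or W buys consistency back at the price of latency and tolerance to node failures.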