replication - Concurrent writes to Cassandra replicas - Is duplication possible?
I have a 2-machine cluster running Cassandra 1.2.6. The keyspace I am using has a replication factor of 2. My application demands that I write to both replicas in parallel instead of letting Cassandra handle the replication, and I am hoping Cassandra will not duplicate the key/value on the replica nodes.
For example:
- I have nodes node1 and node2. I have a keyspace with replication factor 2 configured on them, and a column family to which I push key/value pairs.
- I use a Python client (pycassa) to write to the cluster.
- A key, "keyx", hashes to node1 and node2. (I find out which servers a key hashes to through the nodetool command `$nodetool getendpoints keyspacename columnfamilyname keyhexstring`.)
- I use the client to write (keyx, value) concurrently to nodes node1 and node2. (In each connection pool I give a specific server name; see the sketch after this list.)
- When writing, I wait for one write to succeed (to the master). (Consistency level ONE.)
- Now, I monitor through the `$nodetool status` command the amount of disk space the cluster uses.
- I write around 100 keys, each having a 2 MB value.
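A minimal sketch of that write path, assuming the placeholder names `keyspacename` and `columnfamilyname` from above and that both nodes accept Thrift connections on the default port 9160 (host names and payload size are illustrative):

```python
# Sketch only: pin one pycassa ConnectionPool to each replica and write the
# same key/value to both concurrently, waiting for consistency level ONE.
import threading

from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily
from pycassa.cassandra.ttypes import ConsistencyLevel

KEYSPACE = 'keyspacename'             # placeholder keyspace (RF = 2)
COLUMN_FAMILY = 'columnfamilyname'    # placeholder column family
NODES = ['node1:9160', 'node2:9160']  # assumed Thrift endpoints

# One pool per node so each write really goes to a specific replica.
pools = [ConnectionPool(KEYSPACE, server_list=[node]) for node in NODES]
cfs = [ColumnFamily(pool, COLUMN_FAMILY) for pool in pools]


def write_to_node(cf, key, value):
    # Only one replica has to acknowledge the write (consistency level ONE).
    cf.insert(key, {'data': value},
              write_consistency_level=ConsistencyLevel.ONE)


value = 'x' * (2 * 1024 * 1024)  # ~2 MB payload, as in the test
for i in range(100):
    key = 'key%d' % i
    threads = [threading.Thread(target=write_to_node, args=(cf, key, value))
               for cf in cfs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```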
Ideally this should store around 400 MB on disk; the overhead of storing the keys should be marginal compared to the value sizes I am using.
Observations:
- If I do not write to both nodes the key hashes to, Cassandra internally handles the replication and the data size is around 400 MB. (200 MB on each node for 100 keys with 2 MB values.)
- If I do write to both nodes the key hashes to, Cassandra writes more than the expected amount of data to disk: as much as 15% more. In our tests Cassandra wrote ~460 MB instead of 400 MB.
My question is: is this behavior (the 15% overhead) expected? Is there some configuration I need to tweak so that Cassandra handles concurrent writes to replicas correctly?
Thanks!
There are 2 possible causes of the 15% extra space that I can think of.
One is that each replica may store 2 copies of the column temporarily. If you write a column twice in Cassandra at different times, the 2 copies may go into separate memtables and so end up in separate SSTables on disk. At some point later, when the SSTables are merged through the compaction process, the older value is discarded, freeing up the space. In your test, run `nodetool compact` to force a compaction and see if the space usage goes down.
Another possible cause depends on how you did the test when you didn't write to both nodes. If you did it at consistency level ONE, it is possible some of the writes were dropped by the other replica, so it doesn't have all the keys yet. You can make sure by running `nodetool repair`. If so, the space used in your first observation may not include all the keys.
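For example, a small sketch of both checks driven through `nodetool` (host names and the keyspace/column family names are the placeholders from the question, and `nodetool` is assumed to be on the PATH):

```python
# Sketch only: force a compaction on each node, then run a repair, then
# re-check the reported load per node.
import subprocess

KEYSPACE = 'keyspacename'           # placeholder keyspace
COLUMN_FAMILY = 'columnfamilyname'  # placeholder column family
NODES = ['node1', 'node2']          # assumed host names

for node in NODES:
    # Merge SSTables so older duplicate copies of a column are discarded.
    subprocess.check_call(
        ['nodetool', '-h', node, 'compact', KEYSPACE, COLUMN_FAMILY])

# Make sure both replicas actually hold every key before comparing disk usage.
subprocess.check_call(
    ['nodetool', '-h', NODES[0], 'repair', KEYSPACE, COLUMN_FAMILY])

# Re-check the per-node load after compaction/repair.
subprocess.check_call(['nodetool', '-h', NODES[0], 'status'])
```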
You should be aware that writing to both replicas at consistency level ONE does not guarantee that each replica holds a copy. The node receiving the data does not have to store it to return success for the write, even if it is a replica. It may be overloaded (in your workload, likely due to not enough I/O to write the data out) and drop the write, while succeeding in writing a different replica. This would cause less space to be used in your second observation, but it probably isn't happening in your test since you have a relatively small amount of data.
If you need to guarantee you have 2 copies, you should write at consistency level ALL and write only once.
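A minimal sketch of that single write, reusing the placeholder names from above and assuming one pycassa pool that can reach either node:

```python
# Sketch only: one write acknowledged by every replica (RF = 2 here), so both
# copies are durably accepted before the call returns.
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily
from pycassa.cassandra.ttypes import ConsistencyLevel

pool = ConnectionPool('keyspacename', server_list=['node1:9160', 'node2:9160'])
cf = ColumnFamily(pool, 'columnfamilyname')

# Consistency level ALL: the insert succeeds only if all replicas acknowledge.
cf.insert('keyx', {'data': 'x' * (2 * 1024 * 1024)},
          write_consistency_level=ConsistencyLevel.ALL)
```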