Python: talk about key-value databases

April 28, 2009

by Bob Ippolito: http://blip.tv/file/1949416/

MySQL: benchmarking many writes

April 28, 2009

Purpose:

  • Benchmark MySQL writes so they can be compared with writes to key-value databases.

Basic info:

  • 2.53 GHz Core 2 Duo MacBook Pro
  • 4 GB RAM
  • Ruby client

Code can be found: here

Results for 100,000 rows, each with a 16-character value:

Write 100000 rows with string-length: 16
Thread ID: 659670
Total: 9.095328

%self     total     self     wait    child    calls  name
66.58      6.06     6.06     0.00     0.00   100000  Mysql#query (ruby_runtime:0}
20.30      9.10     1.85     0.00     7.25        1  Integer#times (ruby_runtime:0}
13.11      1.19     1.19     0.00     0.00   100000  Object#insert_statement (/Users/didip/projects/ruby/mysql-profile/write_profile.rb:27}
0.00      9.10     0.00     0.00     9.10        1  Object#write_many_profile (/Users/didip/projects/ruby/mysql-profile/write_profile.rb:37}

Results for 1,000,000 rows, each with a 16-character value:

Write 1000000 rows with string-length: 16
Thread ID: 659670
Total: 88.175784

%self     total     self     wait    child    calls  name
66.33     58.49    58.49     0.00     0.00  1000000  Mysql#query (ruby_runtime:0}
20.48     88.18    18.06     0.00    70.12        1  Integer#times (ruby_runtime:0}
13.19     11.63    11.63     0.00     0.00  1000000  Object#insert_statement (/Users/didip/projects/ruby/mysql-profile/write_profile.rb:27}
0.00     88.18     0.00     0.00    88.18        1  Object#write_many_profile (/Users/didip/projects/ruby/mysql-profile/write_profile.rb:37}
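To make the two reports easier to compare, here is a small Python sketch that turns the totals above into throughput figures. The numbers are copied from the reports; the names `runs`, `rows_per_second`, and `summarize` are made up for this illustration and are not part of the benchmark code.

```python
# Totals and Mysql#query "self" times copied from the two ruby-prof reports.
runs = {
    100000: {"total": 9.095328, "query": 6.06},
    1000000: {"total": 88.175784, "query": 58.49},
}


def rows_per_second(rows, total_seconds):
    """Average insert throughput for one run."""
    return rows / total_seconds


def summarize(runs):
    """Map row count -> (rows/sec, microseconds per row, share spent in Mysql#query)."""
    summary = {}
    for rows, timing in runs.items():
        summary[rows] = (
            rows_per_second(rows, timing["total"]),
            timing["total"] / rows * 1e6,
            timing["query"] / timing["total"],
        )
    return summary


for rows, (rps, micros, share) in summarize(runs).items():
    print("{:>9} rows: {:,.0f} rows/s, {:.1f} us/row, {:.0%} in Mysql#query".format(
        rows, rps, micros, share))
```

Both runs land at roughly 11,000 rows per second, with about two thirds of the time inside Mysql#query, so on this machine the cost grows close to linearly with the row count.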

Python: Object to JSON

April 27, 2009

I want to serialize Python objects to JSON, including complex ones (not just objects that contain simple attributes).

Apparently, John Paulett needed that too. Here’s his code. The best part: it uses cjson! Bingo.
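To show what "complex" means here, below is a minimal sketch using only the standard library's json module. It is not John Paulett's code and it does not use cjson; the classes `Person` and `Address` and the helpers `encode_object` and `to_json` are hypothetical names invented for the example. It only handles encoding, not turning the JSON back into objects.

```python
import json


class Address:
    def __init__(self, city, zip_code):
        self.city = city
        self.zip_code = zip_code


class Person:
    def __init__(self, name, address, tags):
        self.name = name
        self.address = address  # nested object
        self.tags = tags        # a set, which json cannot encode natively


def encode_object(obj):
    """Fallback hook: json.dumps calls this for values it cannot encode itself."""
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    if hasattr(obj, "__dict__"):
        # Keep the class name so a reader can tell what the dict used to be.
        data = {"__class__": type(obj).__name__}
        data.update(vars(obj))
        return data
    raise TypeError("cannot serialize %s" % type(obj).__name__)


def to_json(obj):
    return json.dumps(obj, default=encode_object, sort_keys=True)


person = Person("Ada", Address("London", "N1"), {"math", "engines"})
print(to_json(person))
```

The hook is applied recursively, so the nested `Address` and the set inside `Person` are both converted without any extra code.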

Python: using Memcache Client

April 23, 2009

For some reason, it’s always hard to find an example of how to create a memcache client object in Python.

This time, I will remember:

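import memcache  # module provided by the python-memcached package
# Client takes a list of "host:port" strings; 11211 is memcached's default port.
memc = memcache.Client(['127.0.0.1:11211'])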

Ruby: ActiveRecord

April 23, 2009

All the possible ways of cleaning up database connections via ActiveRecord:

http://coderrr.wordpress.com/2009/01/12/rails-22-activerecord-connection-cleanup/

Ruby: Moneta

April 21, 2009

In short, Moneta is to Ruby what Shove is to Python: a single interface to various dictionary/hash-like stores, which means all those sexy key-value databases.

I’m currently benchmarking Moneta’s Tokyo, Redis, and Memcache backends against each other.

Results so far:

  • Surprisingly, the Tokyo implementation is significantly faster, even compared with the Memcache implementation. Why?
  • Similar to the LightCloud benchmark, the size of the value does not affect the speed of storing or getting.
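The benchmark itself is in Ruby, but the shape of the test is easy to sketch in Python: time writes and reads through one dictionary-style interface, for several value sizes. This is only an illustration under assumptions. The names `time_store`, `compare`, and `backends` are hypothetical, and a plain `dict` stands in for a real backend so the sketch runs without any server.

```python
import time


def time_store(store, n_items, value_size):
    """Time n_items writes, then n_items reads, on any dict-style store."""
    value = "x" * value_size
    keys = ["key:%d" % i for i in range(n_items)]

    start = time.perf_counter()
    for key in keys:
        store[key] = value
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for key in keys:
        _ = store[key]
    read_seconds = time.perf_counter() - start
    return write_seconds, read_seconds


def compare(backends, n_items=10000, value_sizes=(16, 1024)):
    """Run time_store for every backend factory and every value size."""
    results = {}
    for name, make_store in backends.items():
        for size in value_sizes:
            results[(name, size)] = time_store(make_store(), n_items, size)
    return results


# dict is a stand-in; a real run would map names to factories that return
# a dict-like wrapper around each key-value client.
backends = {"in-memory dict": dict}
for (name, size), (w, r) in compare(backends).items():
    print("%s, %d-byte values: write %.4fs, read %.4fs" % (name, size, w, r))
```

Because every backend goes through the same `store[key]` calls, any timing difference comes from the backend and its wrapper rather than from the test loop.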

Programmer Competency Matrix

April 19, 2009

http://www.indiangeek.net/wp-content/uploads/Programmer%20competency%20matrix.htm

JSON vs Thrift vs PBuffer

April 18, 2009

Initially I wanted to write some simple profile tests to find out, but then I found this question on StackOverflow, which led to these:

  1. http://bouncybouncy.net/ramblings/posts/thrift_and_protocol_buffers/
  2. http://bouncybouncy.net/ramblings/posts/more_on_json_vs_thrift_and_protocol_buffers/
  3. http://bouncybouncy.net/ramblings/posts/json_vs_thrift_and_protocol_buffers_round_2/

Conclusion:

JSON wins since it is cross-platform and fast enough. Protocol Buffers are still interesting, though, since the binary payload is small.

Now it’s time to profile all the available JSON libraries in the Python world.
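A minimal sketch of what such a profile could look like, using timeit. Only the standard library's json is guaranteed to be present; simplejson and ujson are third-party modules that are skipped when not installed. The names `CANDIDATES`, `available`, and `profile` are invented for this sketch, and libraries that do not expose `dumps`/`loads` would need a small adapter that is not shown.

```python
import importlib
import timeit

# Modules assumed to expose dumps()/loads(); missing ones are skipped.
CANDIDATES = ["json", "simplejson", "ujson"]


def available(names):
    """Import whichever candidate modules are installed."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.import_module(name)
        except ImportError:
            pass
    return found


def profile(modules, payload, repeat=200):
    """Return {module name: (seconds to encode, seconds to decode)}."""
    results = {}
    for name, mod in modules.items():
        encoded = mod.dumps(payload)
        results[name] = (
            timeit.timeit(lambda: mod.dumps(payload), number=repeat),
            timeit.timeit(lambda: mod.loads(encoded), number=repeat),
        )
    return results


payload = {"id": 1, "name": "x" * 16, "scores": list(range(100)), "nested": {"ok": True}}
for name, (enc, dec) in profile(available(CANDIDATES), payload).items():
    print("%-12s encode %.4fs  decode %.4fs" % (name, enc, dec))
```

The payload here is arbitrary; a fair comparison would use data shaped like what the application actually stores.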

Python: Drinking the Tokyo Kool-Aid

April 16, 2009

After reading about what LightCloud can do, it was only natural to create objects that serialize to Tokyo.

And that is exactly what I did. The project (called Hail) is still in its infancy, but the profile tests already answer some of my questions and curiosity about LightCloud (and Tokyo).

One obvious weakness I need to tackle: serializing is too slow.

Questions that got answered:

  • Slowness is not caused by the size of the object; it is caused by the number of items.
  • LightCloud does execute a lot of function calls. Most of them are really fast, though.
  • EDIT: Tokyo is fast, especially compared with Memcache, but LightCloud is not. Tokyo is not as fast as I thought… but that is not my final verdict; I should run a profile test against a raw Tokyo Tyrant node. On top of that, LightCloud's overhead is not negligible.
  • Serializing with cjson is faster than with cPickle. That’s surprising.
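The first point, item count versus object size, can be checked with a small experiment: serialize the same number of bytes once as a single large value and once as many small items. The sketch below is an assumption-laden stand-in, not Hail's profile test. It uses the standard library's json and pickle rather than cjson and cPickle, and the names `TOTAL_BYTES`, `time_dumps`, and `compare_shapes` are hypothetical.

```python
import json
import pickle
import timeit

TOTAL_BYTES = 160000

# Same total amount of value data, shaped two ways.
one_big_item = {"key": "x" * TOTAL_BYTES}
many_small_items = {"key:%d" % i: "x" * 16 for i in range(TOTAL_BYTES // 16)}


def time_dumps(dumps, obj, repeat=20):
    """Seconds spent serializing obj `repeat` times with the given dumps function."""
    return timeit.timeit(lambda: dumps(obj), number=repeat)


def compare_shapes():
    """Return {serializer: {shape: seconds}} for both payload shapes."""
    results = {}
    for name, dumps in (("json", json.dumps), ("pickle", pickle.dumps)):
        results[name] = {
            "one_big_item": time_dumps(dumps, one_big_item),
            "many_small_items": time_dumps(dumps, many_small_items),
        }
    return results


for name, shapes in compare_shapes().items():
    for shape, seconds in shapes.items():
        print("%-6s %-16s %.4fs" % (name, shape, seconds))
```

Timings from these modern modules will not match the cjson and cPickle numbers, so treat the output as a way to see the effect rather than as a replacement for the original measurements.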

Next, I should test getting items from both memcache and tokyo. I’m expecting it to be really fast.

Git Cheat Sheet

April 11, 2009

A couple of notes for myself and readers about Git:

  • Git does not allow you to add an empty directory.
  • Do not forget to make an initial commit when starting a new repository. Otherwise you will get this when you push: error: src refspec master does not match any. Reference: here.
  • A couple of settings need to be configured before running git pull:
    git config branch.master.merge 'refs/heads/master' and 
    git config branch.master.remote 'origin'
    
  • How to ignore files? Read here.
  • The best Git GUI on OS X is: GitX.
  • Git cheat sheet by GitHub: [link]

If you are an SVN user:

  • git checkout is not what you think. The functionality that you probably want is git clone.
  • git checkout is used to switch branches or to re-checkout a file.
  • git commit is not what you think. The command only commits locally. To push to the ‘central’ repo, run git commit followed by git push.

Where Am I?

You are currently viewing the archives for April, 2009 at RAPD.
