Reddit Web Server

December 11, 2009

As of late November to December 2009, it appears that Reddit is using the Tornado web server. Interesting.

You can get the same type of information by using curl (curl -I http://www.reddit.com).
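The same check can be sketched in Python. This is a minimal example that assumes Python 3's urllib.request; the function name server_header is made up for illustration. It sends a HEAD request, which is what curl -I does, and returns the Server response header.

```python
import urllib.request


def server_header(url, timeout=10):
    """Send a HEAD request (like `curl -I`) and return the Server header, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Server")


# Example call (needs network access):
# print(server_header("http://www.reddit.com"))
```

Whatever the site reports today may differ from what it reported in 2009, and a server is free to omit or fake this header.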

Source:

6diagrams.com: Create your own forum.

October 4, 2009

I’ve been building this little thing for a while now. That little thing is forum software.

It does everything you would expect from phpBB, Reddit, or Hacker News, and more. With 6diagrams you can bookmark an interesting topic without digging through the piles of links in your browser’s bookmarks. It also, of course, remembers your own posts.

To learn more of what it can do, please follow this link.

Why?

Three people have already asked me this question: why are you building something that looks like a Reddit or Hacker News rip-off? Well… in my defense, Reddit, Hacker News (and Google) do a great job of displaying a list of interesting things effectively. It’s no coincidence that I’m following the path of Reddit and Hacker News, because I want to follow the path of success.

Besides that UX decision, I built this because I’m starting to have difficulties with blogging. The standards of blogging keep getting higher. Blogging takes more than just a well-written article these days. I have to stay consistent with the blog’s general theme. If I want to write about a slightly controversial topic, I had better do a lot of research and write concisely. When my post is mildly interesting, I have to be vigilant about responding to visitors within 24 hours. It’s tough (I believe this is one of the many keys to Twitter’s success: it’s very personal).

That is too much of a burden for a programmer who has probably only one hour a week to write a blog post. Sometimes I just want to voice my opinion regardless of its accuracy (I never police my real-life friends when it comes to factual accuracy, and I’m hoping to experience the same thing online). I want to write something that’s interesting enough to start a conversation, for social interaction and personal knowledge.

That’s why I defaulted back to the forum format. Forums are great: they don’t force people to use their real names. Participation in a thread can extend over weeks, so I have time to respond to the conversation (just like a mailing list).

Most forums have informal, laid-back conversations about common things that people like, such as OSS projects, anime, coffee grinders, the Mitsubishi Evolution, etc. There’s no pressure to be right; everyone is simply sharing and hanging out.

That’s why I built 6diagrams.

Technical

6diagrams is open source forum software. The project page can be found here. It is built using Python, Pylons, MySQL, TinyMCE, and Tokyo Tyrant.

For those who stopped by and created accounts, I hope you find 6diagrams enjoyable.

Pylons: production environments

September 20, 2009

First of all, there are already several wiki posts at pylonshq.com; below are some of them:

I believe there’s no single best solution among these strategies. This post is intended to share my experience with some of them. I’m definitely not set on a particular one and will change my mind once I gain a better understanding.

My testing methodology

I’m currently testing multiple deployment configurations on the same app, which gives me the opportunity to blog about it. I use Apache Benchmark (ab) on the client side, and top and ps afx on the server side. My machine is a single Linode 1440 instance.

My ab runs are:

ab -n 500 -c 50 -k http://rootapp.com/

ab -n 1200 -c 50 -k http://rootapp.com/

ab -n 1500 -c 50 -k http://rootapp.com/

ab -n 800 -c 800 -k http://rootapp.com/

The results varied only slightly, with some subtle but interesting differences. I’ll explain those below.

Note: I am testing these configurations on a dynamic, AJAX-y web application, so reporting hard numbers is not very useful.

CherryPy vs Paste HTTP Server behind NGINX

One thing that I noticed immediately is that CherryPy performs better than Paste’s HTTP server. On both, running multiple processes does not help overall performance much, but it significantly reduces the number of failed requests. When run with multiple processes, CherryPy consistently has the fewest failed requests.

For my setup, 7-8 processes is the sweet spot. With more than that, top tells me that the later processes are underutilized.

Setting up CherryPy in your production.ini is painless:

use = egg:PasteScript#cherrypy
numthreads = 20
request_queue_size = 512
host = %(http_host)s
port = %(http_port)s

Comparing just the two, CherryPy is easily the winner.

Lighttpd and SCGI

This gist is a basic configuration to get SCGI up and running on lighttpd, while the following is the setup for your production.ini:

use = egg:Flup#scgi_thread
host = %(http_host)s
port = %(http_port)s

Given the same ab configuration as the CherryPy and Paste counterparts, lighttpd and SCGI are consistently capable of handling 30 requests/second, about 8-10 requests/second more than CherryPy. Even though this setup is faster, I noticed that memory consumption was still climbing after 2 weeks. I haven’t spent much time investigating why. The reason I didn’t choose this path is mostly that I simply like NGINX better.

If only the SCGI module for NGINX weren’t so experimental.

NGINX and FastCGI

This gist is a basic configuration to get FastCGI running on NGINX, while the following is the setup for your production.ini:

use = egg:Flup#fcgi_thread
host = %(http_host)s
port = %(http_port)s

With this configuration, I consistently get about 25 requests/second. It’s a bit behind the lighttpd and SCGI configuration. Interestingly, when run under ab -n 1500 -c 50 -k, this configuration hung NGINX, requiring a restart. It only happened once, though.

Again, when load balanced properly (depending on your app), any one of these configurations would work well. Hopefully this post helps others get up to speed on Pylons deployment.

Python vs Ruby, slightly more in-depth

July 13, 2009 § 54 Comments

Disclaimers

Edit (2009/08/04): It is not obvious to many readers that this blog and its articles are meant as opinion pieces. It is also not obvious that I like both languages. Just to be clear, I do like both languages.

Introduction

Not too long ago on Reddit, someone posted Python vs Ruby. This kind of comparison would be more interesting if the author took some time to actually highlight the merits of both languages.

So, let me take a stab at it.

Python has map, reduce, lambda, and list comprehensions. Ruby has select, collect, reject, inject, and blocks (and lambda).

Edit (2009/08/04): Thanks to everyone who reminded me that Ruby has lambda too.

These techniques allow programmers to perform operations on lists (or dicts/hashes) effectively. Some of these are not optimized for speed, so do not expect much of a speed gain.

Both have a set type, a collection of distinct values. Sets and lists/arrays can be converted in both directions.
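A short sketch of the Python side of that list, with the rough Ruby counterpart named in each comment. The sample data is made up; the import is needed on Python 3, where reduce lives in functools.

```python
from functools import reduce  # a builtin on Python 2, in functools on Python 3

numbers = [1, 2, 3, 4, 5, 6]

# map + lambda: transform every element (roughly Ruby's collect)
squares = list(map(lambda n: n * n, numbers))

# list comprehension with a condition (roughly Ruby's select)
evens = [n for n in numbers if n % 2 == 0]

# the inverted condition (roughly Ruby's reject)
odds = [n for n in numbers if n % 2 != 0]

# reduce: fold the list into a single value (roughly Ruby's inject)
total = reduce(lambda acc, n: acc + n, numbers, 0)

# list -> set drops duplicates, set -> list converts back
unique = set([3, 1, 3, 2, 1])
back_to_list = sorted(unique)
```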

Edit (2009/08/08): I forgot about generator expressions in Python. They look more or less like list comprehensions, but they work as iterators instead of holding all the values in an in-memory list.

Edit (2009/08/04): I mention lambda as one of the tools that help me manipulate lists, not to point out that Python is cool for having lambda (Ruby is cool for having lambda too).
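To make the difference concrete, here is a minimal comparison. The range size is arbitrary and only there to make the memory difference visible.

```python
import sys

# List comprehension: builds the whole list in memory up front.
squares_list = [n * n for n in range(100000)]

# Generator expression: same syntax with parentheses, values are produced lazily.
squares_gen = (n * n for n in range(100000))

# The generator object stays small no matter how many values it will yield.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)  # consumes the generator; it cannot be reused
```

One catch: a generator can only be iterated once, so it is a poor fit when the values are needed more than once.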

Python does not have multi-statement anonymous functions, but that may be out of scope for this section, because I can simply use lambda.

See:

Reflection and Meta Programming and Monkey Patching

Edit (2009/08/04): Thanks to readers for pointing out that these are not the same thing, although they do serve the same purpose for my use cases.

Both languages support reflection (and meta-programming and monkey patching). That means you have access to the inner workings of an object. Python gives you a lot of access via __these_kind_of_methods__ (I never knew what these are called), while Ruby gives you access to everything inside an object.

Example of Python monkey patching: you can swap out object.__class__ with a completely different class. You can also add extra methods to object.__dict__.
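Both tricks can be sketched in a few lines. The Duck and Robot classes and the wave function are made up for illustration.

```python
import types


class Duck(object):
    def speak(self):
        return "Quack"


class Robot(object):
    def speak(self):
        return "Beep"


donald = Duck()
before = donald.speak()

# Swap the class of a live object: method lookup now goes through Robot.
donald.__class__ = Robot
after = donald.speak()


def wave(self):
    return "%s waves" % self.__class__.__name__


# Attach an extra method to this one instance through its __dict__.
donald.__dict__["wave"] = types.MethodType(wave, donald)
waved = donald.wave()
```

The swap only works between compatible classes (two plain user-defined classes, as here), and the method added through __dict__ exists on that one instance only.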

You can manipulate Python classes at run time, but not built-in classes such as int or basestring. In Ruby, by contrast, you can manipulate everything, including replacing or adding methods inside Integer or String. Even though attributes are private by default, Ruby does not try to stop me from accessing them (use send).
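A small sketch of that limit on the Python side. The Account class and the is_blank name are hypothetical, borrowed from the Rails blank? idea.

```python
class Account(object):
    pass


# A user-defined class can be patched at run time.
Account.is_blank = lambda self: True
account_patched = Account().is_blank()

# A built-in type refuses the same treatment and raises TypeError.
try:
    int.is_blank = lambda self: False
    int_patched = True
except TypeError as error:
    int_patched = False
    reason = str(error)
```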

As many of you might already know, Rails monkey-patches the global Object (e.g. object.blank?). I’m glad that Django and Pylons do NOT do that.

eval()/exec() are simply evil (they annoy me) in both languages. They make debugging more difficult.

See:

File manipulation

Manipulating files in Python is horrible. The whole os, os.path, shutil, filecmp, tempfile business is convoluted and inconsistent. IMHO, Ruby wins big time.

Documentation

It’s not that easy to read Python documentation because it’s written like a narrative, whereas Ruby documentation follows the Javadoc style (my personal favorite). Use apidock.com for an even better RTFM experience.

Edit (2009/08/04): Yes. That is my personal preference. If that’s not clear.

Testing

Ruby wins over a lot of TDD practitioners. There is a plethora of Ruby modules created to make the testing experience truly wonderful. See: RSpec, Shoulda, Factory Girl, Selenium.

Python mocking libraries are still not trivial to use. Testing is an area where Python can learn from Ruby (yes, Selenium also supports Python).

Edit (2009/08/04): Thanks for telling me about windmill!

Visitor Pattern

The visitor pattern is a technique for decoupling logic from objects. Oftentimes there is logic that needs to be shared among objects that do not share the same parent. Decorators are Python’s implementation of the visitor pattern, while in Ruby this can be done by including a module/mixin.

Edit (2009/08/04): Example of why I think a decorator is a visitor pattern (see @InputEvaluator below):

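class InputEvaluator(object):

  def __init__(self, func):
    self.func = func

  def __call__(self, *args, **kwargs):
    # add functionality before self.func is executed
    result = self.func(*args, **kwargs)
    # add functionality after self.func is executed
    # While I'm at it, I can manipulate things inside self.func.__class__, or __name__
    return result

# Usage: greet is now an InputEvaluator instance, so greet("world") goes through __call__
@InputEvaluator
def greet(name):
  return "Hello, %s" % name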

They are not PHP

Neither has GOTO, and both are general-purpose languages. They have real objects, and objects can persist longer than the life cycle of an HTTP request.

As general-purpose languages, both have an interactive console (plus a debugger), useful for testing features that I forgot. PHP5 does have a CLI, but seriously…

Although, I have to say PHP’s require_once is nice. That leads to the one thing I have to gripe about in Python: circular imports.

Circular Import

A lot of Pythonistas say that if a programmer has a circular import, then s/he usually has a bad design. That’s likely to be correct. But in those rare cases where the design is good, circular imports become a huge pain in the neck. A good example of this would be:

Two SQLAlchemy model classes, each with a classmethod that calls the other class. It is a perfectly legitimate use case, but now both model classes have to be put in the same file because of the circular import (to NOT have to do this, create a method that calls the other classmethod). I believe this hurts code maintainability.
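One common workaround, not necessarily the one suggested in the comments, is to move the import inside the method so that it runs at call time rather than at module load time. The sketch below uses two made-up plain classes instead of real SQLAlchemy models, and writes them to a temporary directory only so that the example stays self-contained in one script.

```python
import os
import sys
import tempfile
import textwrap

# Hypothetical module author_model.py: imports book_model only when describe() runs.
AUTHOR_SOURCE = textwrap.dedent('''
    class Author(object):
        @classmethod
        def describe(cls):
            from book_model import Book  # deferred import
            return "Author writes %s" % Book.label()

        @classmethod
        def label(cls):
            return "Author"
''')

# Hypothetical module book_model.py: the mirror image of the above.
BOOK_SOURCE = textwrap.dedent('''
    class Book(object):
        @classmethod
        def describe(cls):
            from author_model import Author  # deferred import
            return "Book by %s" % Author.label()

        @classmethod
        def label(cls):
            return "Book"
''')

module_dir = tempfile.mkdtemp()
for filename, source in (("author_model.py", AUTHOR_SOURCE),
                         ("book_model.py", BOOK_SOURCE)):
    with open(os.path.join(module_dir, filename), "w") as handle:
        handle.write(source)

sys.path.insert(0, module_dir)
from author_model import Author  # neither module imports the other at load time
from book_model import Book

author_text = Author.describe()
book_text = Book.describe()
```

The price is an import statement hidden inside a method, which some people find just as ugly as putting both classes in one file.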

Edit (2009/08/04): See commentary’s input on how to avoid this situation.

HTTP and other basic networking

Python comes with webbrowser, urllib2, smtplib, httplib, SocketServer, BaseHTTPServer, and more, while Ruby only has net/http.

With all those tools, building things like a web spider is trivial in Python.

Edit (2009/08/04): This section is just about standard library. I don’t have enough material to elaborate on this. Thus, it’s fair to criticize this section.

Threading

Both are terrible at threading. Python has the GIL, which limits its threading performance, while Ruby’s threading leaks memory. (I think 1.9 addresses this issue. Can anyone confirm this?)

Edit (2009/08/04): Yes! yes! yes! for those who said that Jython and JRuby do not have these problems.

Daemonize

Neither has daemonize as part of its standard library, although it’s very easy to roll my own.

Java

Jython and JRuby exist, and both make using Java significantly more productive.

  • JRuby is actually really nice and has a “real” threading implementation.
  • Using Jython to manipulate Swing objects is a surprisingly happy experience.

Modules (for Web Apps)

Both have so many useful modules for building web applications.

Python has: Django, Pylons, web.py, Beautiful Soup, SQLAlchemy, Paste, Werkzeug, Routes (totally “inspired” by Rails), Shove, Pygments, a dozen or so template languages (my favorite is Mako), 4 different JSON modules (cjson was faster than simplejson when looping 10,000 times, though I don’t actually know if this is the best way to benchmark the two), and various performance improvement modules (psyco, pyrex, cython).

Ruby has: Rails, Merb, Sinatra, Hpricot, DataMapper, Mongrel, ActiveSupport, Moneta, erb, json, RubyInline.

Edit (2009/08/04): If it’s not obvious, I am making direct comparison between Python and Ruby here. Yes, I have used all these modules (except Merb and DataMapper. They look awesome though.)

They are both totally interchangeable for building web applications. Both still need JavaScript to make awesome-looking web applications.

Edit (2009/08/04): I need to elaborate on this point: there are modules in both Python and Ruby whose sole purpose is to generate JavaScript code. For front-end web development, I prefer to solve JavaScript problems in JavaScript. Go jQuery!

IMHO, outside the web app realm, Python is better positioned. See: Pyglet, wxWidgets, SciPy, etc.

Getting Paid

Python surprisingly lacks a mature library that handles online payments (are Python people not worried about paying customers?). I would appreciate it if anyone can point me to a good payment API in Python.

Whereas Ruby has Payment and ActiveMerchant.

Edit (2009/08/04): See comments below for Python payment module.

Big Companies Backing

I believe Python is winning here. Google, YouTube, Yelp, NASA, Honeywell, etc. use Python. On the other hand, yellowpages.com, AboutUs, and these guys use Ruby. I heard that Amazon Fresh uses RoR; can anyone confirm?

Edit (2009/08/05): Some have suggested that Apple is leaning towards the Ruby camp, especially with the MacRuby project (link).

Conclusion

These languages are interchangeable for building web applications. Neither is more awesome. They get the job done and they make programmers happy.

Thanks HN visitors for giving thoughtful comments! I’ll try my best to keep up with you guys in updating this article.

[fixing layout]

SQLAlchemy: does not call __init__

June 14, 2009

When performing a query(), SQLAlchemy does not execute the __init__ of the corresponding ORM objects.

Thus, if you have some logic inside __init__, it won’t get executed.

To get the desired behavior, put that logic inside a method that takes no arguments (other than self), then attach @orm.reconstructor to it.
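The mechanism can be imitated with the standard library alone. The sketch below is not SQLAlchemy code: load_from_row is a hypothetical stand-in for the ORM loader, which creates instances through __new__ and restores attributes directly, so __init__ never runs. In a real mapped class, the init_on_load method would be the one decorated with @orm.reconstructor.

```python
class User(object):

    def __init__(self, name):
        self.name = name
        self.init_on_load()

    # In a real mapped class this method would carry @orm.reconstructor.
    def init_on_load(self):
        # Derived state that must exist on new AND loaded objects.
        self.display_name = self.name.title()


def load_from_row(cls, row):
    """Hypothetical stand-in for an ORM loader; not part of SQLAlchemy."""
    instance = cls.__new__(cls)      # creates the object, __init__ is NOT called
    instance.__dict__.update(row)    # column values are restored directly
    instance.init_on_load()          # the step the reconstructor hook provides
    return instance


created = User("alice")                        # goes through __init__
loaded = load_from_row(User, {"name": "bob"})  # bypasses __init__
```

Calling the shared method from __init__ as well keeps freshly created and loaded objects in the same state.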

Reference:

Yes! Syntax Highlighting for Mako in TextMate!

May 31, 2009 § 1 Comment

For the longest time I kept staring at ‘plain text mode’ while editing Mako templates. No más!

A TextMate bundle for Mako templates already exists.

To install:

  • cd ~/Library/Application\ Support/TextMate/Bundles/
  • svn co http://svn.makotemplates.org/contrib/textmate/Mako.tmbundle

After reloading your bundles, syntax highlighting is available under HTML (Mako).

Reference:


									

HTML Quickie: Long String Overflowing Your Div

May 25, 2009 § 6 Comments

Do you have that problem? I do.

EDIT(05/31/2009): Even better solutions:

Hyphenator.js: http://code.google.com/p/hyphenator/

OR:

General CSS solution:

overflow: scroll

/EDIT

The HTML solution:

use <wbr>. (But you have to figure out yourself where to put the tag)

The IE-specific CSS solution:

word-wrap: break-word

PHP solution:

wordwrap('your_very_long_string_here', 15, "\n", true) // Break after 15 characters; the 4th argument forces a break inside a long word

Python solution:

import textwrap

textwrap.fill('your_very_long_string_here', 15) # Break after 15 characters
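A slightly fuller sketch of the Python route. Since a plain newline does not cause a line break in rendered HTML, one option (an assumption about how you would use it, not something from the references) is to join the pieces with <wbr>, which also answers the "where do I put the tag" problem from the HTML solution.

```python
import textwrap

long_string = "your_very_long_string_here"

# wrap() returns the pieces; a word longer than the width is split
# because break_long_words defaults to True.
pieces = textwrap.wrap(long_string, 15)

# fill() is the same thing joined with newlines.
filled = textwrap.fill(long_string, 15)

# For HTML output, mark the break opportunities with <wbr> instead.
html_ready = "<wbr>".join(pieces)
```

Here pieces comes out as ['your_very_long_', 'string_here']. Remember to HTML-escape the pieces before joining if the string comes from users.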

References:

Pylons Quickie: Mako Output

May 24, 2009

In Pylons, Mako HTML-escapes all output by default.

To turn this behavior off, change the default_filters setting of TemplateLookup in environment.py.

Resources:

Pylons Quickie: I want to use distance_of_time_in_words

May 17, 2009

distance_of_time_in_words is a useful function that converts a boring-looking datetime into something more attractive (and SEO friendly) like: 30 seconds ago.

Pylons gains this functionality via WebHelpers, which is blatantly inspired by Rails.

But, out of the box, this function is not available in my templates. To enable it, I need to add this line to helpers.py:

from webhelpers.date import distance_of_time_in_words

Resource:

WebHelpers Documentation

Pylons Cheat Sheet

May 13, 2009

http://workaround.org/pylons/pylons-cheatsheet.html

Where Am I?

You are currently browsing entries tagged with python at RAPD.
