A special thanks to Andrea "MEgrez" Talon, who cleared up some doubts I had about Node.js.
Why Threads
First, what's a thread? A thread is a unit of parallel execution inside a process. Why were threads created? The answer is funny: to work around fork's poor performance. There were many attempts to improve on fork: Apache's web server used preforking, and since Apache 2.0 it can use threads.
Felix von Leitner wrote a very interesting document about various techniques to improve server performance.
Why Threads are Hard
Threads are hard, and even good programmers find it difficult to write multithreaded applications. Why?
- Synchronization: managing synchronization among threads correctly is fundamental. You must protect every critical section. Doing it in Java is easier (it has monitors), but in C/C++ it's harder.
- Debugging: many of the most unusual software bugs come from multithreading errors. Also, just think how hard it can be to reproduce an error condition in a multithreaded program: you have to recreate the crash condition, with the threads executing in the same sequence and with the same timing. How hard could that be?
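To make the synchronization point concrete, here is a minimal sketch of a critical section protected by a lock. It is written in Python rather than the Java or C/C++ mentioned above, and the names `Counter` and `run_workers` are made up for this illustration.

```python
import threading


class Counter:
    """A shared counter whose increment is a critical section (hypothetical example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below must not interleave with another
        # thread's update, so the lock is held for the whole operation.
        with self._lock:
            self.value += 1


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Start n_threads threads that each increment the counter n_increments times."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_workers(Counter()))  # 40000
```

Without the lock, the final value would depend on how the threads happen to interleave, which is exactly the kind of bug that is hard to reproduce.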
Enter Node.js
Some servers are really fast and easy to write. They're single-threaded servers that manage multiple connections using the select system call. These servers work with many non-blocking sockets. When an event happens on a socket, select reports it and the event loop calls the function you wrote to handle that single connection. There are no critical sections, because connections are handled one at a time. Code written directly against select is not portable, but a library such as libevent takes care of that. A server written this way has two advantages:
- It will be fast, because it uses neither threads nor forks
- Memory usage stays constant over time
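The event-loop pattern described above can be sketched with Python's standard `selectors` module, which wraps select and its platform-specific relatives. This is only an illustration of the idea, not how Node.js or libevent are implemented; `echo_upper`, `serve`, `read_all` and `demo` are hypothetical names, and `socket.socketpair()` stands in for real client connections so the example runs without opening a port.

```python
import selectors
import socket


def echo_upper(conn, sel):
    """Callback for one readable connection: reply with the data upper-cased."""
    data = conn.recv(1024)
    if data:
        # Small reply; a real server would buffer bytes it could not send yet.
        conn.send(data.upper())
    else:
        # Empty read means the client closed its side.
        sel.unregister(conn)
        conn.close()


def serve(sel):
    """Single-threaded event loop: runs until no connection is registered."""
    while len(sel.get_map()) > 0:
        for key, _events in sel.select():
            callback = key.data
            callback(key.fileobj, sel)


def read_all(sock):
    """Read from sock until the peer closes the connection."""
    chunks = []
    while chunk := sock.recv(1024):
        chunks.append(chunk)
    return b"".join(chunks)


def demo():
    sel = selectors.DefaultSelector()
    clients = []
    for _ in range(3):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ, data=echo_upper)
        clients.append(client_side)

    # Every client sends one request, then signals it has nothing more to send.
    for i, client in enumerate(clients):
        client.sendall(f"hello {i}".encode())
        client.shutdown(socket.SHUT_WR)

    serve(sel)  # one thread handles all three connections
    replies = [read_all(client) for client in clients]
    for client in clients:
        client.close()
    sel.close()
    return replies


print(demo())  # [b'HELLO 0', b'HELLO 1', b'HELLO 2']
```

Note that no lock appears anywhere: each callback runs to completion before the next one starts, so there is no shared state to protect.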
Node.js isn't the only event-based I/O framework. There are also Twisted and gevent for Python enthusiasts. But there seems to be more hype around Node.js.
So, the meaning of this post?
Threads, invented mainly for performance reasons, can't compete against simpler solutions like Node.js when it comes to writing servers. If I had to write a new server, I'd probably use Node.js or Twisted instead of a custom implementation using Java threads or something similar.
But pay attention! Node.js is single-threaded, so it runs only one piece of request-handling code at a time. For small requests (an HTML page, a message, coordinates) this approach is hard to beat. But if a request keeps that single thread busy for a long time (e.g. sending a file of many hundreds of MB without yielding back to the event loop), clients are served one at a time, and each one waits until all the clients before it have been served. This means that, in a scenario where 5 clients each ask for a file that takes 10 seconds to download:
- the first client will be served immediately
- the second after 10 seconds
- the third after 20 seconds
- the fourth after 30 seconds
- the last after 40 seconds
You see: in this situation a multithreaded server is the better choice.
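The waiting times listed above follow from simple arithmetic, sketched here in Python. The function name `sequential_start_times` is made up for this example, and it assumes requests are served strictly one after another with no overlap.

```python
def sequential_start_times(n_clients, seconds_per_download):
    """Seconds each client waits before its download starts,
    assuming downloads are served strictly one after another."""
    return [i * seconds_per_download for i in range(n_clients)]


starts = sequential_start_times(5, 10)
print(starts)           # [0, 10, 20, 30, 40]
print(starts[-1] + 10)  # 50: the last client finishes 50 seconds after asking
```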