select VS poll VS epoll
Comparison
Chronology
select -> poll -> epoll
Limit on the number of file descriptors
Portability
poll is a POSIX standard interface
epoll is Linux-specific
data structure of file descriptors
select VS poll
stackoverflow What are the differences between poll and select?
A
The select() call has you create three bitmasks to mark which sockets and file descriptors you want to watch for reading, writing, and errors, and then the operating system marks which ones in fact have had some kind of activity; poll() has you create a list of descriptor IDs, and the operating system marks each of them with the kind of event that occurred.
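As a rough illustration of the two calling conventions, here is a minimal sketch using Python's standard `select` module. Note that Python wraps the C bitmasks in plain lists, so this shows the shape of each interface, not the raw `fd_set` handling. The helper names `wait_with_select`, `wait_with_poll`, and `demo` are made up for this example, and `select.poll` is assumed to be available (it is not on Windows).

```python
import select
import socket


def wait_with_select(sock, timeout=1.0):
    # select(): three separate collections, one each for read, write, error.
    readable, writable, errored = select.select([sock], [], [sock], timeout)
    return sock in readable


def wait_with_poll(sock, timeout_ms=1000):
    # poll(): one entry per descriptor, tagged with the events of interest.
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    events = poller.poll(timeout_ms)  # list of (fd, event_mask) pairs
    return any(fd == sock.fileno() and bool(mask & select.POLLIN)
               for fd, mask in events)


def demo():
    a, b = socket.socketpair()
    try:
        b.sendall(b"ping")  # make `a` readable
        return wait_with_select(a), wait_with_poll(a)
    finally:
        a.close()
        b.close()
```

With select() the caller gets back three filtered collections; with poll() each returned pair names the descriptor and carries the event mask that fired.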
The select() method is rather clunky and inefficient.
1、There are typically more than a thousand potential file descriptors available to a process. If a long-running process has only a few descriptors open, but at least one of them has been assigned a high number, then the bitmask passed to select() has to be large enough to accommodate that highest descriptor, so whole ranges of hundreds of bits will be unset that the operating system has to loop across on every select() call just to discover that they are unset.
2、Once select() returns, the caller has to loop over all three bitmasks to determine what events took place. In very many typical applications only one or two file descriptors will get new traffic at any given moment, yet all three bitmasks must be read all the way to the end to discover which descriptors those are.
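NOTE:
In other words, a large number of file descriptors may be monitored while only a few of them actually have events pending, yet every monitored descriptor has to be scanned to find those few. The cost is therefore O(N) in the number of monitored descriptors.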
3、Because the operating system signals you about activity by rewriting the bitmasks, they are ruined and are no longer marked with the list of file descriptors you want to listen to. You either have to rebuild the whole bitmask from some other list that you keep in memory, or you have to keep a duplicate copy of each bitmask and memcpy() the block of data over on top of the ruined bitmasks after each select() call.
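The three points above can be made concrete with a small pure-Python model. This is not the real select() API or kernel code: `fake_select` is a hypothetical stand-in that follows the description in the quoted answer, and an ordinary integer plays the role of the C `fd_set` bitmask. All function names here are invented for illustration.

```python
def make_mask(fds):
    """Build an integer bitmask with one bit set per descriptor number."""
    mask = 0
    for fd in fds:
        mask |= 1 << fd
    return mask


def fake_select(nfds, read_mask, ready_fds):
    """Hypothetical stand-in for the kernel side of select().

    Visits every bit position below nfds, set or not, and returns
    (result_mask, positions_visited). The result replaces the caller's mask.
    """
    result = 0
    visited = 0
    for fd in range(nfds):
        visited += 1
        if read_mask & (1 << fd) and fd in ready_fds:
            result |= 1 << fd
    return result, visited


def ready_descriptors(nfds, mask):
    """Caller-side scan: test every bit below nfds to find the set ones."""
    return [fd for fd in range(nfds) if mask & (1 << fd)]


def one_round(watched, ready_fds):
    master = make_mask(watched)   # the copy the caller must keep around
    nfds = max(watched) + 1       # highest descriptor + 1, as select() expects
    working, visited = fake_select(nfds, master, ready_fds)
    return master, working, visited, ready_descriptors(nfds, working)
```

Calling `one_round([3, 900], {900})` watches only two descriptors, but the model visits 901 bit positions, the caller scans the same range again to find descriptor 900, and the returned working mask no longer contains descriptor 3, so the next round has to start again from `master`.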
So the poll() approach works much better because you can keep re-using the same data structure.
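The re-use point can be sketched with Python's `select.poll` object, assuming a platform where it exists: the descriptor is registered once and the same object is polled on every round, with nothing to rebuild in between. The function name `poll_rounds` is made up for this example.

```python
import select
import socket


def poll_rounds(rounds=3):
    a, b = socket.socketpair()
    poller = select.poll()
    poller.register(a.fileno(), select.POLLIN)  # registered exactly once
    hits = 0
    try:
        for _ in range(rounds):
            b.sendall(b"x")              # new traffic for this round
            events = poller.poll(1000)   # same registration, reused as-is
            if any(fd == a.fileno() for fd, _mask in events):
                hits += 1
            a.recv(1)                    # drain so the next round starts idle
        return hits
    finally:
        a.close()
        b.close()
```

Each round reports the descriptor as readable without any re-registration, which is the convenience the answer attributes to poll().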
In fact, poll() has inspired yet another mechanism in modern Linux kernels: epoll(), which improves even more upon the mechanism to allow yet another leap in scalability, as today's servers often want to handle tens of thousands of connections at once. This is a good introduction to the effort:
http://scotdoyle.com/python-epoll-howto.html
While this link has some nice graphs showing the benefits of epoll() (you will note that select() is by this point considered so inefficient and old-fashioned that it does not even get a line on these graphs!):
http://lse.sourceforge.net/epoll/index.html
Update: Here is another Stack Overflow question, whose answer gives even more detail about the differences:
Caveats of select/poll vs. epoll reactors in Twisted
poll VS epoll
The epoll API performs a similar task to poll(2): monitoring multiple file descriptors to see if I/O is possible on any of them. The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers of watched file descriptors.
NOTE:
The passage above already summarizes the advantages of epoll over poll:
1、It supports "edge-triggered" operation
2、It scales well to large numbers of watched file descriptors
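To make the edge-triggered versus level-triggered distinction concrete, here is a minimal sketch using Python's `select.epoll`, which is only available on Linux. One write is made and never read, and the descriptor is then waited on several times. The helper name `count_notifications` is invented for this example.

```python
import select
import socket


def count_notifications(edge_triggered, waits=3):
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        flags = select.EPOLLIN
        if edge_triggered:
            flags |= select.EPOLLET   # ask for edge-triggered delivery
        ep.register(a.fileno(), flags)
        b.sendall(b"data")            # a single write that is never read
        hits = 0
        for _ in range(waits):
            if ep.poll(0.2):          # timeout is in seconds
                hits += 1
        return hits
    finally:
        ep.close()
        a.close()
        b.close()
```

In level-triggered mode every wait reports the descriptor because unread data is still pending. In edge-triggered mode only the first wait reports it, and nothing more arrives until new data is written.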
stackoverflow poll vs. epoll insight [duplicate]
stackoverflow Why is epoll faster than select?
cnblogs 为什么人们总是认为epoll 效率比select高!!!!!!
NOTE:
This article compares them in terms of how they are implemented.