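I have been reading Java Concurrency in Practice lately. It is an excellent book. The Chinese translation is not bad but not great either: in the first few chapters of Part I some passages say the opposite of the original, so it is worth reading alongside the English edition. My interest in concurrent programming started with learning Erlang. With multi-core arriving, some people say concurrency will be the key technology of the next ten years. Multithreaded programming before Java 5 was complicated, and since I never worked on that kind of application I knew little about it. Once JDK 5 introduced the mouth-watering concurrent package (java.util.concurrent), concurrent programming in Java started to get interesting.
Chapter 6 of the book uses a web server as its running example and presents several versions: single-threaded, thread-per-request, and one built on the thread pool provided by JDK 5. I benchmarked each version with ab, the tool that ships with Apache, on a Red Hat 9 machine with a Pentium 4 and 2 GB of RAM.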
ab -n 50000 -c 1000 http://localhost/index.html >benchmark
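The single-threaded version handles requests strictly one after another. Under this load (50,000 requests at a concurrency level of 1000) it stopped responding almost immediately, so I left it out of the comparison. Here is our own version that starts a new thread for each request: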
package net.rubyeye.concurrency.chapter6;

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerTaskWebServer {

    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(80);
        while (true) {
            final Socket connection = server.accept();
            // one new thread per accepted connection
            Runnable task = new Runnable() {
                public void run() {
                    try {
                        handleRequest(connection);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            };
            new Thread(task).start();
        }
    }

    public static void handleRequest(Socket socket) throws IOException {
        try {
            InetAddress client = socket.getInetAddress();
            // and print it to gui
            s(client.getHostName() + " connected to server.\n");
            // Read the http request from the client from the socket interface
            // into a buffer.
            BufferedReader input = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));
            // Prepare a outputstream from us to the client,
            // this will be used sending back our response
            // (header + requested file) to the client.
            DataOutputStream output = new DataOutputStream(socket.getOutputStream());
            // as the name suggest this method handles the http request, see
            // further down.
            // abstraction rules
            http_handler(input, output);
            socket.close();
        } catch (Exception e) {
            // catch any errors, and print them
            s("\nError:" + e.getMessage());
        }
    }
    // go back in loop, wait for next request

    // our implementation of the hypertext transfer protocol
    // its very basic and stripped down
    private static void http_handler(BufferedReader input, DataOutputStream output) {
        int method = 0; // 1 get, 2 head, 0 not supported
        String http = new String(); // a bunch of strings to hold
        String path = new String(); // the various things, what http v, what path,
        String file = new String(); // what file
        String user_agent = new String(); // what user_agent
        try {
            // This is the two types of request we can handle
            // GET /index.html HTTP/1.0
            // HEAD /index.html HTTP/1.0
            String tmp = input.readLine(); // read from the stream
            String tmp2 = new String(tmp);
            // convert it to uppercase; Strings are immutable, so the result
            // has to be assigned back (the original call discarded it)
            tmp = tmp.toUpperCase();
            if (tmp.startsWith("GET")) { // compare it is it GET
                method = 1;
            } // if we set it to method 1
            if (tmp.startsWith("HEAD")) { // same here is it HEAD
                method = 2;
            } // set method to 2
            if (method == 0) { // not supported
                try {
                    output.writeBytes(construct_http_header(501, 0));
                    output.close();
                    return;
                } catch (Exception e3) { // if some error happened catch it
                    s("error:" + e3.getMessage());
                } // and display error
            }
            // tmp contains "GET /index.html HTTP/1.0 ."
            // find first space
            // find next space
            // copy whats between minus slash, then you get "index.html"
            // it's a bit of dirty code, but bear with me
            int start = 0;
            int end = 0;
            for (int a = 0; a < tmp2.length(); a++) {
                if (tmp2.charAt(a) == ' ' && start != 0) {
                    end = a;
                    break;
                }
                if (tmp2.charAt(a) == ' ' && start == 0) {
                    start = a;
                }
            }
            path = tmp2.substring(start + 2, end); // fill in the path
        } catch (Exception e) {
            s("errorr" + e.getMessage());
        } // catch any exception

        // path do now have the filename to what to the file it wants to open
        s("\nClient requested:" + new File(path).getAbsolutePath() + "\n");
        FileInputStream requestedfile = null;
        try {
            // NOTE that there are several security consideration when passing
            // the untrusted string "path" to FileInputStream.
            // You can access all files the current user has read access to!!!
            // current user is the user running the javaprogram.
            // you can do this by passing "../" in the url or specify absoulute
            // path
            // or change drive (win)

            // try to open the file,
            requestedfile = new FileInputStream(path);
        } catch (Exception e) {
            try {
                // if you could not open the file send a 404
                output.writeBytes(construct_http_header(404, 0));
                // close the stream
                output.close();
            } catch (Exception e2) {
            }
            s("error" + e.getMessage()); // print error to gui
            // added: stop here instead of falling into the block below with a
            // null file and a closed stream
            return;
        }

        // happy day scenario
        try {
            int type_is = 0;
            // find out what the filename ends with,
            // so you can construct a the right content type
            if (path.endsWith(".zip") || path.endsWith(".exe") || path.endsWith(".tar")) {
                type_is = 3;
            }
            if (path.endsWith(".jpg") || path.endsWith(".jpeg")) {
                type_is = 1;
            }
            if (path.endsWith(".gif")) {
                type_is = 2;
            }
            // write out the header, 200 ->everything is ok we are all happy.
            // (type_is is computed but not used: the literal 5 selects the
            // default branch, so every response is sent as text/html)
            output.writeBytes(construct_http_header(200, 5));
            // if it was a HEAD request, we don't print any BODY
            if (method == 1) { // 1 is GET 2 is head and skips the body
                while (true) {
                    // read the file from filestream, and print out through the
                    // client-outputstream on a byte per byte base.
                    int b = requestedfile.read();
                    if (b == -1) {
                        break; // end of file
                    }
                    output.write(b);
                }
            }
            // clean up the files, close open handles
            output.close();
            requestedfile.close();
        } catch (Exception e) {
        }
    }

    private static void s(String s) {
        // System.out.println(s);
    }

    // this method makes the HTTP header for the response
    // the headers job is to tell the browser the result of the request
    // among if it was successful or not.
    private static String construct_http_header(int return_code, int file_type) {
        String s = "HTTP/1.0 ";
        // you probably have seen these if you have been surfing the web a while
        switch (return_code) {
        case 200:
            s = s + "200 OK";
            break;
        case 400:
            s = s + "400 Bad Request";
            break;
        case 403:
            s = s + "403 Forbidden";
            break;
        case 404:
            s = s + "404 Not Found";
            break;
        case 500:
            s = s + "500 Internal Server Error";
            break;
        case 501:
            s = s + "501 Not Implemented";
            break;
        }
        s = s + "\r\n"; // other header fields,
        s = s + "Connection: close\r\n"; // we can't handle persistent connections
        s = s + "Server: SimpleHTTPtutorial v0\r\n"; // server name
        // Construct the right Content-Type for the header.
        // This is so the browser knows what to do with the
        // file, you may know the browser dosen't look on the file
        // extension, it is the servers job to let the browser know
        // what kind of file is being transmitted. You may have experienced
        // if the server is miss configured it may result in
        // pictures displayed as text!
        switch (file_type) {
        // plenty of types for you to fill in
        case 0:
            break;
        case 1:
            s = s + "Content-Type: image/jpeg\r\n";
            break;
        case 2:
            s = s + "Content-Type: image/gif\r\n";
            break; // added: the original fell through and appended further Content-Type lines
        case 3:
            s = s + "Content-Type: application/x-zip-compressed\r\n";
            break; // added for the same reason
        default:
            s = s + "Content-Type: text/html\r\n";
            break;
        }
        // so on and so on
        s = s + "\r\n"; // this marks the end of the httpheader
        // and the start of the body
        // ok return our newly created header!
        return s;
    }
}
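The loop in http_handler that hunts for the two spaces in the request line can be written more compactly. The following is only a sketch, not the code that was benchmarked: the class RequestLine and the methods parsePath and isSafe are hypothetical names introduced here. It splits a line such as "GET /index.html HTTP/1.0" on whitespace, and it rejects paths that contain "..", start with a slash, or name a drive, on the assumption that only files below the server's working directory should be served (the risk the NOTE comment in http_handler describes).

```java
class RequestLine {
    /**
     * Returns the requested path without its leading slash, or null when the
     * line is malformed or the method is neither GET nor HEAD.
     */
    public static String parsePath(String requestLine) {
        if (requestLine == null) {
            return null;
        }
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2) {
            return null;
        }
        String method = parts[0].toUpperCase();
        if (!method.equals("GET") && !method.equals("HEAD")) {
            return null;
        }
        String target = parts[1];
        if (!target.startsWith("/")) {
            return null;
        }
        return target.substring(1);
    }

    /** Rejects empty paths, absolute paths, drive prefixes and ".." segments. */
    public static boolean isSafe(String path) {
        if (path == null || path.length() == 0) {
            return false;
        }
        if (path.startsWith("/") || path.startsWith("\\") || path.contains(":")) {
            return false;
        }
        for (String segment : path.split("[/\\\\]")) {
            if (segment.equals("..")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String path = parsePath("GET /index.html HTTP/1.0");
        System.out.println(path + " safe=" + isSafe(path));
        String evil = parsePath("GET /../etc/passwd HTTP/1.0");
        System.out.println(evil + " safe=" + isSafe(evil));
    }
}
```

A handler would call parsePath first, answer 501 when it returns null, and answer 403 or 404 when isSafe returns false, before ever opening the file.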
Benchmark results for the thread-per-request version:
Concurrency Level: 1000
Time taken for tests: 111.869356 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Total transferred: 4950000 bytes
HTML transferred: 250000 bytes
Requests per second: 446.95 [#/sec] (mean)
Time per request: 2237.387 [ms] (mean)
Time per request: 2.237 [ms] (mean, across all concurrent requests)
Transfer rate: 43.20 [Kbytes/sec] received
Now modify the program above to use the thread pool provided by JDK 5:
    // additional imports needed at the top of the file:
    // java.util.concurrent.Executor and java.util.concurrent.Executors

    private static final int NTHREADS = 5;
    private static Executor exec;

    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(80);
        // pool size comes from the first command-line argument, default NTHREADS
        if (args.length == 0)
            exec = Executors.newFixedThreadPool(NTHREADS);
        else
            exec = Executors.newFixedThreadPool(Integer.parseInt(args[0]));
        while (true) {
            final Socket connection = server.accept();
            Runnable task = new Runnable() {
                public void run() {
                    try {
                        handleRequest(connection);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            };
            // hand the task to the pool instead of starting a new thread
            exec.execute(task);
        }
    }
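The fixed pool in this listing caps the number of worker threads no matter how many connections arrive, which is the main difference from the thread-per-request version. A minimal sketch of that effect follows. It is not part of the benchmarked server: the class PoolThreadCount and the method distinctThreads are hypothetical names, and the task count of 1000 is an arbitrary choice. It runs many trivial tasks on a pool and counts how many distinct threads executed them, once for Executors.newFixedThreadPool(5) and once for Executors.newCachedThreadPool(), the other standard factory in java.util.concurrent.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolThreadCount {
    /** Runs `tasks` trivial jobs on the pool and returns how many distinct threads ran them. */
    public static int distinctThreads(ExecutorService pool, int tasks) {
        final Set<String> names = Collections.synchronizedSet(new HashSet<String>());
        for (int i = 0; i < tasks; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    // record which worker thread picked up this task
                    names.add(Thread.currentThread().getName());
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names.size();
    }

    public static void main(String[] args) {
        int fixed = distinctThreads(Executors.newFixedThreadPool(5), 1000);
        int cached = distinctThreads(Executors.newCachedThreadPool(), 1000);
        System.out.println("fixed pool used " + fixed + " threads");
        System.out.println("cached pool used " + cached + " threads");
    }
}
```

The fixed pool never reports more than 5 threads. The number for the cached pool varies from run to run, because it creates a thread whenever no idle one is available and reuses idle threads otherwise.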
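The pool size defaults to 5. After repeated runs, a pool size of around 5 turned out to give the best results. Benchmark results with the thread pool: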
Concurrency Level: 1000
Time taken for tests: 51.648142 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Total transferred: 4978908 bytes
HTML transferred: 251460 bytes
Requests per second: 968.09 [#/sec] (mean)
Time per request: 1032.963 [ms] (mean)
Time per request: 1.033 [ms] (mean, across all concurrent requests)
Transfer rate: 94.14 [Kbytes/sec] received
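The derived figures in the two ab reports follow from the raw numbers. As a sketch (the class AbMetrics and its method names are hypothetical), requests per second is the completed request count divided by the total time, and the mean time per request is the concurrency level multiplied by the total time and divided by the request count. Plugging in the reported values reproduces the numbers in both reports and gives the speed-up of the pooled version.

```java
class AbMetrics {
    /** Completed requests divided by wall-clock seconds. */
    public static double requestsPerSecond(int completed, double seconds) {
        return completed / seconds;
    }

    /** Mean latency in milliseconds as seen by one of `concurrency` clients. */
    public static double meanTimePerRequestMs(int concurrency, int completed, double seconds) {
        return concurrency * seconds * 1000.0 / completed;
    }

    public static void main(String[] args) {
        double perTask = requestsPerSecond(50000, 111.869356); // thread per request
        double pooled = requestsPerSecond(50000, 51.648142);   // fixed pool of 5
        System.out.printf("thread-per-request: %.2f req/s, %.3f ms%n",
                perTask, meanTimePerRequestMs(1000, 50000, 111.869356));
        System.out.printf("thread pool:        %.2f req/s, %.3f ms%n",
                pooled, meanTimePerRequestMs(1000, 50000, 51.648142));
        System.out.printf("speed-up:           %.2fx%n", pooled / perTask);
    }
}
```

With these inputs the pooled server handles roughly 2.17 times as many requests per second as the thread-per-request one.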
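Compared with the thread-per-request result, a thread pool written by experts really does make a big difference. Once the number of connections goes above 100,000, the gap between the two versions becomes even more obvious. This version uses a fixed-size pool. What about a cached thread pool? Replacing newFixedThreadPool with newCachedThreadPool gives results close to the best fixed-pool result. A cached thread pool is a better fit for a workload like this one, with short-lived connections and high concurrency.
Then I wondered: would a simple web server written in Erlang outperform the thread-pool version? Let's try: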
%% httpd.erl - MicroHttpd
-module(httpd).
-export([start/0,start/1,start/2,process/2]).
-import(regexp,[split/2]).
-define(defPort,80).
-define(docRoot,".").

start() -> start(?defPort,?docRoot).
start(Port) -> start(Port,?docRoot).
start(Port,DocRoot) ->
    case gen_tcp:listen(Port, [binary,{packet, 0},{active, false}]) of
        {ok, LSock} -> server_loop(LSock,DocRoot);
        {error, Reason} -> exit({Port,Reason})
    end.

%% main server loop - wait for next connection, spawn child to process it
server_loop(LSock,DocRoot) ->
    case gen_tcp:accept(LSock) of
        {ok, Sock} ->
            spawn(?MODULE,process,[Sock,DocRoot]),
            server_loop(LSock,DocRoot);
        {error, Reason} ->
            exit({accept,Reason})
    end.

%% process current connection
process(Sock,DocRoot) ->
    Req = do_recv(Sock),
    {ok,[Cmd|[Name|[Vers|_]]]} = split(Req,"[ \r\n]"),
    FileName = DocRoot ++ Name,
    LogReq = Cmd ++ " " ++ Name ++ " " ++ Vers,
    Resp = case file:read_file(FileName) of
        {ok, Data} ->
            io:format("~p ~p ok~n",[LogReq,FileName]),
            Data;
        {error, Reason} ->
            io:format("~p ~p failed ~p~n",[LogReq,FileName,Reason]),
            error_response(LogReq,file:format_error(Reason))
    end,
    do_send(Sock,Resp),
    gen_tcp:close(Sock).

%% construct HTML for failure message
error_response(LogReq,Reason) ->
    "<html><head><title>Request Failed</title></head><body>\n" ++
    "<h1>Request Failed</h1>\n" ++
    "Your request to " ++ LogReq ++
    " failed due to: " ++ Reason ++
    "\n</body></html>\n".

%% send a line of text to the socket
do_send(Sock,Msg) ->
    case gen_tcp:send(Sock, Msg) of
        ok -> ok;
        {error, Reason} -> exit(Reason)
    end.

%% receive data from the socket
do_recv(Sock) ->
    case gen_tcp:recv(Sock, 0) of
        {ok, Bin} -> binary_to_list(Bin);
        {error, closed} -> exit(closed);
        {error, Reason} -> exit(Reason)
    end.
Run it:
erl -noshell +P 50000 -s httpd start
The +P flag raises the maximum number of processes the system may create to 50,000; the default is a little over 30,000. Benchmark results:
Concurrency Level: 1000
Time taken for tests: 106.35735 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Total transferred: 250000 bytes
HTML transferred: 0 bytes
Requests per second: 471.54 [#/sec] (mean)
Time per request: 2120.715 [ms] (mean)
Time per request: 2.121 [ms] (mean, across all concurrent requests)
Transfer rate: 2.30 [Kbytes/sec] received
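The result was a big disappointment. It is about the same as our hand-written thread-per-request Java version and far behind the thread-pool version, although at lower concurrency it is somewhat faster than the Java versions. This indirectly supports the conclusion of this discussion: Erlang's strength is high concurrency, not raw performance. Of course, none of the three can match a multithreaded web server written in C. I tested the example from 《unix/linux编程实践》 (a Unix/Linux programming textbook), and it was far faster than the other three. The concurrency it can support is limited, though, because it crashed once the number of threads created by the system exceeded 5000. If you develop on JDK 5, you should take full advantage of the new concurrency package. Unfortunately our company is still stuck on 1.4.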
Reposted from 庄周梦蝶; originally published 2007-08-29.