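Tomcat7 request line source code analysis
This article tries to answer the following questions about Tomcat:
- How many layers of buffers Tomcat has underneath, and how data is read up through them, layer by layer, to the application
- How Tomcat parses the request line
To follow how Tomcat reads, you first need to know Tomcat's NIO threading model; without that background this article is hard to follow.
First, a diagram of how Tomcat's buffers relate to each other: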

The rough flow inside Tomcat is as follows:

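SocketBuffer
The SocketBuffer (SocketBufferHandler in the code) is Tomcat's buffer at the NIO level and the lowest of its buffer layers. Its size, and whether it uses a direct buffer, can be set through the Connector configuration. The constructor also shows that, when tuning Tomcat, direct can be set to true.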
public SocketBufferHandler(int readBufferSize, int writeBufferSize,
        boolean direct) {
    this.direct = direct;
    if (direct) {
        readBuffer = ByteBuffer.allocateDirect(readBufferSize);
        writeBuffer = ByteBuffer.allocateDirect(writeBufferSize);
    } else {
        readBuffer = ByteBuffer.allocate(readBufferSize);
        writeBuffer = ByteBuffer.allocate(writeBufferSize);
    }
}
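readBufferSize defaults to 8192 bytes (8 KB). If the body of your POST request is larger than this, it cannot be consumed in a single read (see the summary at the end).
If off-heap memory (DirectByteBuffer) is configured, Tomcat releases it proactively: it uses reflection to look up the cleaner method of DirectByteBuffer, invokes that method reflectively to obtain the cleaner object, and when the buffer is freed it runs the cleaner, which is the on-heap object that references the off-heap memory.
As this shows, Tomcat's low-level socket buffer is released once it has been used, with no reuse. Here you have to admire Netty's optimizations: it has two layers, a memory pool and a buffer object pool.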
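The heap versus direct choice made in the constructor above can be tried in isolation. The following is a minimal sketch, not Tomcat code: the class name BufferKindDemo and its allocate helper are made up for illustration, and only standard java.nio calls are used.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; it only mirrors the allocation choice
// made in the SocketBufferHandler constructor.
class BufferKindDemo {

    // direct == true  -> off-heap buffer (ByteBuffer.allocateDirect)
    // direct == false -> heap buffer backed by a byte[]
    static ByteBuffer allocate(int size, boolean direct) {
        return direct ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(8192, false);
        ByteBuffer offHeap = allocate(8192, true);
        System.out.println("heap:     direct=" + heap.isDirect()
                + ", hasArray=" + heap.hasArray());
        System.out.println("off-heap: direct=" + offHeap.isDirect()
                + ", capacity=" + offHeap.capacity());
    }
}
```

A heap buffer exposes its backing array, which is what calls such as byteBuffer.array() later in this article rely on; a direct buffer lives outside the Java heap, which is why its release needs the cleaner described above.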
Http11Processor
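After Tomcat receives an HTTP request from a client, reading the request message is handled by the service method of Http11Processor.
Http11Processor instances are created by the ConnectionHandler. Tomcat reuses its key classes to avoid the cost of frequent creation and destruction, so a processor is first popped from recycledProcessors: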
if (processor == null) {
    processor = recycledProcessors.pop();
    if (getLog().isDebugEnabled()) {
        getLog().debug(sm.getString("abstractConnectionHandler.processorPop",
                processor));
    }
}
if (processor == null) {
    processor = getProtocol().createProcessor();
    register(processor);
}
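The pop-or-create pattern in the snippet above can be sketched on its own. This is only an illustration of the idea under simplifying assumptions: RecyclePool, acquire, release and createdCount are hypothetical names, and the stack Tomcat actually uses for recycledProcessors is not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical pool, loosely modelled on the recycledProcessors idea.
class RecyclePool<T> {
    private final Deque<T> recycled = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    RecyclePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse a recycled instance if one exists, otherwise create a new one.
    synchronized T acquire() {
        T item = recycled.pollFirst(); // null when nothing has been recycled
        if (item == null) {
            item = factory.get();
            created++;
        }
        return item;
    }

    // Hand an instance back so that a later acquire() can reuse it.
    synchronized void release(T item) {
        recycled.addFirst(item);
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        RecyclePool<StringBuilder> pool = new RecyclePool<>(StringBuilder::new);
        StringBuilder first = pool.acquire();
        pool.release(first);
        StringBuilder second = pool.acquire();
        System.out.println("reused=" + (first == second)
                + ", created=" + pool.createdCount());
    }
}
```

The second acquire returns the very object that was released, so only one instance is ever created in this run.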
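Creating the Http11Processor
Creating an Http11Processor requires the sizes of Tomcat's read buffers, such as the maximum size of the request headers and the limits that apply to the request body.
maxHttpHeaderSize defaults to 8 KB.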
public Http11Processor(int maxHttpHeaderSize, AbstractEndpoint<?> endpoint, int maxTrailerSize,
        Set<String> allowedTrailerHeaders, int maxExtensionSize, int maxSwallowSize,
        Map<String,UpgradeProtocol> httpUpgradeProtocols, boolean sendReasonPhrase) {
    super(endpoint);
    userDataHelper = new UserDataHelper(log);
    inputBuffer = new Http11InputBuffer(request, maxHttpHeaderSize);
    request.setInputBuffer(inputBuffer);
    outputBuffer = new Http11OutputBuffer(response, maxHttpHeaderSize, sendReasonPhrase);
    response.setOutputBuffer(outputBuffer);
    // Create and add the identity filters.
    // Tomcat reads the body through filters; IdentityInputFilter reads a non-chunked body
    inputBuffer.addFilter(new IdentityInputFilter(maxSwallowSize));
    outputBuffer.addFilter(new IdentityOutputFilter());
    // Create and add the chunked filters.
    inputBuffer.addFilter(new ChunkedInputFilter(maxTrailerSize, allowedTrailerHeaders,
            maxExtensionSize, maxSwallowSize));
    outputBuffer.addFilter(new ChunkedOutputFilter());
    // Create and add the void filters.
    inputBuffer.addFilter(new VoidInputFilter());
    outputBuffer.addFilter(new VoidOutputFilter());
    // Create and add buffered input filter
    inputBuffer.addFilter(new BufferedInputFilter());
    // Create and add the chunked filters.
    //inputBuffer.addFilter(new GzipInputFilter());
    outputBuffer.addFilter(new GzipOutputFilter());
    pluggableFilterIndex = inputBuffer.getFilters().length;
    this.httpUpgradeProtocols = httpUpgradeProtocols;
}
Creating the Http11InputBuffer
public Http11InputBuffer(Request request, int headerBufferSize) {
    this.request = request;
    headers = request.getMimeHeaders();
    this.headerBufferSize = headerBufferSize;
    filterLibrary = new InputFilter[0];
    activeFilters = new InputFilter[0];
    lastActiveFilter = -1;
    parsingHeader = true;
    parsingRequestLine = true;
    parsingRequestLinePhase = 0;
    parsingRequestLineEol = false;
    parsingRequestLineStart = 0;
    parsingRequestLineQPos = -1;
    headerParsePos = HeaderParsePosition.HEADER_START;
    swallowInput = true;
    inputStreamInputBuffer = new SocketInputBuffer();
}
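When the Http11InputBuffer is created, headerBufferSize is fixed. There is also an inputStreamInputBuffer, which is used when reading the HTTP body; it will be covered in the analysis of the HTTP body.
Once a usable Http11Processor has been obtained, its core method, service, is called. The service method is fairly long, so we only look at a few key points.
1 Initialize the read and write buffers
// Initialize the read buffer; inputBuffer is the Http11InputBuffer
inputBuffer.init(socketWrapper);
// Initialize the write buffer
outputBuffer.init(socketWrapper);
The code of Http11InputBuffer.init is as follows: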
void init(SocketWrapperBase<?> socketWrapper) {
    wrapper = socketWrapper;
    wrapper.setAppReadBufHandler(this);
    int bufLength = headerBufferSize +
            wrapper.getSocketBufferHandler().getReadBuffer().capacity();
    if (byteBuffer == null || byteBuffer.capacity() < bufLength) {
        byteBuffer = ByteBuffer.allocate(bufLength);
        byteBuffer.position(0).limit(0);
    }
}
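The init method creates a read buffer, byteBuffer, inside the Http11InputBuffer. This byteBuffer is used later when reading the request line, the headers and the body; it is one of Tomcat's core ByteBuffers.
Initial size of byteBuffer: headerBufferSize + the size of the socket buffer
headerBufferSize defaults to 8*1024
socket buffer size defaults to 8192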
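With the two defaults above, the sizing rule in init works out to 16384 bytes. A small sketch of that arithmetic follows; AppBufferSizing and newAppBuffer are hypothetical names, and the two sizes are simply the defaults quoted in this article.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that applies the same sizing rule as Http11InputBuffer.init.
class AppBufferSizing {

    // Capacity = header budget + capacity of the socket read buffer.
    static ByteBuffer newAppBuffer(int headerBufferSize, int socketReadBufferCapacity) {
        ByteBuffer buf = ByteBuffer.allocate(headerBufferSize + socketReadBufferCapacity);
        // position == limit == 0: nothing to read yet, as in init
        buf.position(0).limit(0);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = newAppBuffer(8 * 1024, 8192);
        // prints: capacity=16384, remaining=0
        System.out.println("capacity=" + buf.capacity() + ", remaining=" + buf.remaining());
    }
}
```

Because position and limit are both 0, the very first parse attempt finds nothing to read and has to trigger a fill.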
With buffer initialization covered, the next step is parsing the HTTP message itself. An HTTP request consists of three parts:
request line + request header + request body

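The first thing parsed is the HTTP request line.
// Parse the request line; inputBuffer is the Http11InputBuffer mentioned above
if (!inputBuffer.parseRequestLine(keptAlive)) {
    // A complete request line has not been read yet; in that case parsingRequestLinePhase is 1
    if (inputBuffer.getParsingRequestLinePhase() == -1) {
        return SocketState.UPGRADING;
    } else if (handleIncompleteRequestLineRead()) {
        // No complete request line was read, so we have to wait and keep reading, i.e. re-register for the read event
        break;
    }
}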
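The key idea in the snippet above is that parsing can stop when the data runs out and pick up again when more bytes arrive. Here is a minimal sketch of that idea for the method token only. ResumableMethodReader and feed are hypothetical names, and the sketch keeps its progress in a StringBuilder, whereas Tomcat keeps offsets into its byteBuffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical resumable reader for just the method token of a request line.
class ResumableMethodReader {
    private final StringBuilder method = new StringBuilder();
    private boolean done = false;

    // Returns true once the method is complete (SP or HT seen);
    // false means "need more bytes", and the state is kept for the next call.
    boolean feed(ByteBuffer chunk) {
        while (!done && chunk.hasRemaining()) {
            byte b = chunk.get();
            if (b == ' ' || b == '\t') {
                done = true;
            } else {
                method.append((char) b);
            }
        }
        return done;
    }

    String method() {
        return method.toString();
    }

    public static void main(String[] args) {
        ResumableMethodReader reader = new ResumableMethodReader();
        // The request line arrives in two pieces.
        boolean first = reader.feed(ByteBuffer.wrap("GE".getBytes(StandardCharsets.ISO_8859_1)));
        boolean second = reader.feed(ByteBuffer.wrap("T /x".getBytes(StandardCharsets.ISO_8859_1)));
        // prints: false true GET
        System.out.println(first + " " + second + " " + reader.method());
    }
}
```

The first call returns false, which corresponds to the case where the caller has to wait for the next read event before calling again.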
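Next, look at inputBuffer.parseRequestLine. It keeps a read marker, parsingRequestLinePhase, whose value says which part of the request line is being read.
parsingRequestLinePhase = 0
The initial value. At this point byteBuffer is empty, i.e. position == limit == 0, which triggers the first read; that read is analyzed later.
The parser first has to determine parsingRequestLineStart, because the data may begin with line feeds or carriage returns. If there are none, the first byte is the start of this request in the buffer. It then sets parsingRequestLinePhase to 2.
parsingRequestLinePhase = 2
In phase 2 the parser reads the method until it reaches the first space, sets the method on the request, and sets parsingRequestLinePhase = 3.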
if (byteBuffer.position() >= byteBuffer.limit()) {
    if (!fill(false)) // request line parsing
        return false;
}
// Spec says method name is a token followed by a single SP but
// also be tolerant of multiple SP and/or HT.
int pos = byteBuffer.position();
byte chr = byteBuffer.get();
if (chr == Constants.SP || chr == Constants.HT) {
    space = true;
    // A space was read, so the length of the method is now known: pos - parsingRequestLineStart. The method can now be set on the request.
    request.method().setBytes(byteBuffer.array(), parsingRequestLineStart,
            pos - parsingRequestLineStart);
} else if (!HttpParser.isToken(chr)) {
    byteBuffer.position(byteBuffer.position() - 1);
    throw new IllegalArgumentException(sm.getString("iib.invalidmethod"));
}
parsingRequestLinePhase = 3
Phase 3 computes the offset at which the request URL starts. It reads forward until it hits a character that is not a space, then sets parsingRequestLinePhase = 4.
// Skip the spaces so that parsingRequestLineStart points at the first byte of the URL.
if (parsingRequestLinePhase == 3) {
    // Spec says single SP but also be tolerant of multiple SP and/or HT
    boolean space = true;
    while (space) {
        // Read new bytes if needed
        if (byteBuffer.position() >= byteBuffer.limit()) {
            if (!fill(false)) // request line parsing
                return false;
        }
        byte chr = byteBuffer.get();
        if (!(chr == Constants.SP || chr == Constants.HT)) {
            space = false;
            byteBuffer.position(byteBuffer.position() - 1);
        }
    }
    parsingRequestLineStart = byteBuffer.position();
    parsingRequestLinePhase = 4;
}
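parsingRequestLinePhase = 4
This phase works out two things. Because the URL of a GET request may be followed by parameters, it needs the offset and length of the URL as well as the offset (parsingRequestLineQPos) and length of the query string, plus end, the offset at which the URL ends. The start offset of the URL was already computed in phase 3.
If the URL contains a ?, the URL itself is
url length = parsingRequestLineQPos - parsingRequestLineStart
and the query string is
queryStr length = end - parsingRequestLineQPos - 1
Without a ?, the URL length is simply end - parsingRequestLineStart.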
// There is a query string
if (parsingRequestLineQPos >= 0) {
    request.queryString().setBytes(byteBuffer.array(), parsingRequestLineQPos + 1,
            end - parsingRequestLineQPos - 1);
    request.requestURI().setBytes(byteBuffer.array(), parsingRequestLineStart,
            parsingRequestLineQPos - parsingRequestLineStart);
} else {
    request.requestURI().setBytes(byteBuffer.array(), parsingRequestLineStart,
            end - parsingRequestLineStart);
}
parsingRequestLinePhase = 5;
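parsingRequestLinePhase = 5
Like phase 3, phase 5 skips the spaces that follow, in case there are several. It also computes parsingRequestLineStart, the offset of the last part of the request line (the protocol version), and sets parsingRequestLinePhase = 6.
parsingRequestLinePhase = 6
Reading starts at parsingRequestLineStart. When a carriage return (CR) is read, its position is recorded as end; when a line feed (LF) is read, the line is considered finished and the parsingRequestLineEol flag is set.
The length of the protocol is:
end - parsingRequestLineStart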
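The phases can be put together in one pass over a buffer. The following is a simplified sketch and not Tomcat's parser: it assumes the whole request line is already in the buffer (so there is no fill and no resuming), it skips token validation, and RequestLineSketch with its fields and helpers is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical one-pass parser that follows the same phases by offset arithmetic.
class RequestLineSketch {
    String method;
    String uri;
    String query;    // null when the URL has no '?'
    String protocol;

    static boolean isBlank(byte b) {
        return b == ' ' || b == '\t';
    }

    // Decode buf[from, to) without moving the buffer position.
    static String text(ByteBuffer buf, int from, int to) {
        byte[] bytes = new byte[to - from];
        for (int i = from; i < to; i++) {
            bytes[i - from] = buf.get(i);
        }
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    static RequestLineSketch parse(ByteBuffer buf) {
        RequestLineSketch r = new RequestLineSketch();
        int pos = buf.position();
        int limit = buf.limit();

        // Phases 0-1: skip CR/LF that may precede the request line.
        while (pos < limit && (buf.get(pos) == '\r' || buf.get(pos) == '\n')) pos++;
        int start = pos;

        // Phase 2: the method runs up to the first SP or HT.
        while (pos < limit && !isBlank(buf.get(pos))) pos++;
        r.method = text(buf, start, pos);

        // Phase 3: tolerate several SP/HT before the URL.
        while (pos < limit && isBlank(buf.get(pos))) pos++;
        start = pos;

        // Phase 4: the URL runs to the next SP/HT; remember the first '?'.
        int qPos = -1;
        while (pos < limit && !isBlank(buf.get(pos))) {
            if (buf.get(pos) == '?' && qPos == -1) qPos = pos;
            pos++;
        }
        int end = pos;
        if (qPos >= 0) {
            r.uri = text(buf, start, qPos);      // length = qPos - start
            r.query = text(buf, qPos + 1, end);  // length = end - qPos - 1
        } else {
            r.uri = text(buf, start, end);       // length = end - start
        }

        // Phase 5: skip blanks before the protocol.
        while (pos < limit && isBlank(buf.get(pos))) pos++;
        start = pos;

        // Phase 6: the protocol ends at CR (or LF); LF ends the line.
        while (pos < limit && buf.get(pos) != '\r' && buf.get(pos) != '\n') pos++;
        r.protocol = text(buf, start, pos);
        while (pos < limit && buf.get(pos) != '\n') pos++;
        if (pos < limit) pos++;                  // consume the LF
        buf.position(pos);                       // headers start here
        return r;
    }

    public static void main(String[] args) {
        String raw = "\r\nGET  /index.jsp?a=1&b=2 HTTP/1.1\r\nHost: example\r\n\r\n";
        RequestLineSketch r = parse(ByteBuffer.wrap(raw.getBytes(StandardCharsets.ISO_8859_1)));
        // prints: GET | /index.jsp | a=1&b=2 | HTTP/1.1
        System.out.println(r.method + " | " + r.uri + " | " + r.query + " | " + r.protocol);
    }
}
```

The sample input deliberately starts with a blank line and uses two spaces after the method, the two cases that phases 0 and 3 exist to tolerate.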
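That completes how Tomcat reads the HTTP request line. Everything above assumes byteBuffer already holds data; how that data got there, and how many copies it went through, is still a mystery. Let's uncover it.
Each of the phases above checks whether byteBuffer still has data to read:
/*
 * position == limit means everything has been read, so fill must be called
 * to refill the buffer. The argument is false, i.e. a non-blocking read.
 * So when is a read blocking? When we call getInputStream(), the read is
 * blocking.
 */
if (byteBuffer.position() >= byteBuffer.limit()) {
    if (!fill(false)) // request line parsing
        return false;
}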
fill calls the read method of NioSocketWrapper:
@Override
public int read(boolean block, ByteBuffer to) throws IOException {
    // First read from Tomcat's low-level socket buffer. If it still holds unread data, there is no need to go to the OS read buffer.
    int nRead = populateReadBuffer(to);
    if (nRead > 0) {
        return nRead;
        /*
         * Since more bytes may have arrived since the buffer was last
         * filled, it is an option at this point to perform a
         * non-blocking read. However correctly handling the case if
         * that read returns end of stream adds complexity. Therefore,
         * at the moment, the preference is for simplicity.
         */
    }
    // Reaching this point means the read buffer of Tomcat's socketBufferHandler has been fully consumed,
    // The socket read buffer capacity is socket.appReadBufSize
    // so the data has to come from the OS buffer.
    int limit = socketBufferHandler.getReadBuffer().capacity();
    /*
     * On the first read, to is empty, so to.remaining() is socket buffer capacity + header size.
     * to.remaining() >= limit is therefore true on the first read, and the data goes straight from the OS into byteBuffer.
     */
    if (to.remaining() >= limit) {
        // The writable space in to is at least the capacity of the socketBufferHandler read buffer.
        // Cap this read at limit, i.e. the size of the socket buffer.
        to.limit(to.position() + limit);
        // Really read from the OS buffer into the application buffer, to:
        // reading straight from the OS into to avoids one copy.
        nRead = fillReadBuffer(block, to);
        updateLastRead();
    } else {
        // Fill the read buffer as best we can.
        // Read into the read buffer of Tomcat's socketBufferHandler first.
        nRead = fillReadBuffer(block);
        updateLastRead();
        // Fill as much of the remaining byte array as possible with the
        // data that was just read
        if (nRead > 0) {
            nRead = populateReadBuffer(to);
        }
    }
    return nRead;
}
populateReadBuffer copies the content of Tomcat's low-level socket buffer (the NIO buffer) into the byteBuffer of Http11Processor.
protected int populateReadBuffer(ByteBuffer to) {
    // Is there enough data in the read buffer to satisfy this request?
    // Copy what data there is in the read buffer to the byte array
    // The read buffer has just been written to and must be prepared for reading, i.e. flipped:
    // limit = position, position = 0;
    socketBufferHandler.configureReadBufferForRead();
    // Copy the socketBufferHandler buffer into to
    int nRead = transfer(socketBufferHandler.getReadBuffer(), to);
    if (log.isDebugEnabled()) {
        log.debug("Socket: [" + this + "], Read from buffer: [" + nRead + "]");
    }
    return nRead;
}
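From the analysis above, to is the byteBuffer of Http11Processor, whose capacity is header size + socket buffer size. On the first read, to is empty, so to.remaining() is socket buffer capacity + header size - 0; therefore to.remaining() >= limit is true on the first read and the data goes straight from the OS into byteBuffer. Only when the space left in byteBuffer is smaller than the socket buffer is the data first read into the socket buffer. At that point the socket buffer is larger than the space left in byteBuffer, so a single read can pull more from the OS, and the data is then copied from the socket buffer into byteBuffer. The point of this is presumably to save a read system call.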
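The decision between the direct path and the buffered path can be reproduced with plain java.nio. This is a sketch under stated assumptions, not Tomcat's implementation: TwoLevelRead and its members are hypothetical, an in-memory channel stands in for the OS socket, there is no non-blocking handling, and tiny sizes (8-byte socket buffer, 16-byte application buffer) replace the 8192 and 16384 byte defaults.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical stand-in for the read path; the channel plays the role of the OS socket.
class TwoLevelRead {
    private final ByteBuffer socketBuf;   // intermediate buffer, kept in write mode
    private final ReadableByteChannel channel;
    int directReads = 0;                  // reads that went straight into 'to'
    int bufferedReads = 0;                // reads that went through socketBuf

    TwoLevelRead(ReadableByteChannel channel, int socketBufSize) {
        this.channel = channel;
        this.socketBuf = ByteBuffer.allocate(socketBufSize);
    }

    // Copy as much as fits from socketBuf into 'to' (the role of populateReadBuffer).
    private int drain(ByteBuffer to) {
        socketBuf.flip();                 // limit = position, position = 0
        int n = Math.min(socketBuf.remaining(), to.remaining());
        for (int i = 0; i < n; i++) {
            to.put(socketBuf.get());
        }
        socketBuf.compact();              // back to write mode, unread bytes kept
        return n;
    }

    // 'to' must be in write mode: position = next free slot, limit = capacity.
    int read(ByteBuffer to) {
        try {
            int n = drain(to);
            if (n > 0) {
                return n;                 // leftovers served without touching the channel
            }
            int limit = socketBuf.capacity();
            if (to.remaining() >= limit) {
                to.limit(to.position() + limit);  // cap this read at one socket buffer
                directReads++;
                return channel.read(to);          // no intermediate copy
            }
            bufferedReads++;
            n = channel.read(socketBuf);
            return n > 0 ? drain(to) : n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(new byte[20]));
        TwoLevelRead r = new TwoLevelRead(ch, 8);
        ByteBuffer app = ByteBuffer.allocate(16);
        int n1 = r.read(app);                       // 16 free >= 8: direct
        app.limit(app.capacity());
        int n2 = r.read(app);                       // 8 free >= 8: direct
        app.limit(app.capacity());
        app.position(12);                           // pretend only 4 bytes of room remain
        int n3 = r.read(app);                       // 4 free < 8: buffered
        System.out.println(n1 + " " + n2 + " " + n3
                + " direct=" + r.directReads + " buffered=" + r.bufferedReads);
    }
}
```

In this run the first two reads land directly in the application buffer and only the third goes through the intermediate buffer, which is the same ordering the paragraph above describes for byteBuffer and the socket buffer.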
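To summarize:
- If request line + headers + body is less than 8k, a single low-level system I/O read is enough; everything after that is served from byteBuffer.
- If request line + headers + body is more than 8k and less than 16k, two system I/O reads are needed, which can fill the byteBuffer of Http11Processor completely. The first 8k is read while Tomcat is parsing; the body part is read when we call getParameter or getInputStream ourselves.
- If request line + headers + body is more than 16k, at least three system I/O reads are needed. The first two go into byteBuffer; the rest goes through the socket buffer. The later analysis of body parsing can confirm this.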
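That wraps up parsing of the HTTP request line. Parsing the headers and the body will need a separate article; this one is already too long for anyone to read.
Note: if anything in this analysis is wrong, please point it out. Discussion is welcome.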