Issues with CoreFoundation chunked requests (iPhoneOS 2.x / OS 10.5.x) and Apache / mod_proxy

A comment on my previous article pointed out that the “c” of “chunked” in the Transfer-Encoding: chunked header is similarly capitalised in Mac OS 10.5.5, which means that the flaw in Apache’s mod_proxy is exposed by more than just the iPhone.

So if you’re trying to do chunked HTTP POST requests using the CFNetwork stack, make sure your Apache configuration can deal with the chunked header:

RequestHeader edit Transfer-Encoding Chunked chunked early

Memory issues with NSMutableURLRequest’s setHTTPBody: method in iPhoneOS 2.1

The iPhone developers at my work have been tearing their hair out over the last few days, trying to resolve the last few memory issues in our iPhone application before we send it off for approval by Apple. One of the problem areas they’ve noticed is when a photo is uploaded to our service via HTTP POST (both from the camera and otherwise).

Not sure who had the bright idea, but one of the developers decided to try passing in the HTTP body as an NSInputStream (via -[NSMutableURLRequest setHTTPBodyStream:]), rather than as NSMutableData (via -[NSMutableURLRequest setHTTPBody:]). Magically, this seems to have solved the memory leak issues.

Unfortunately, it created another issue.

You might remember that 3 months ago I wrote an article on how much of a pain in the arse it was to accept HTTP POST requests from user agents specifying a Transfer-Encoding HTTP header value of chunked (resulting in a POST request with no Content-Length header). In that article, I proposed a solution using Apache 2 and mod_proxy’s proxy-sendcl option to get things working again. This worked fine for our J2ME clients, but when our iPhone application started blowing chunks at us, our server crapped out with the dreaded 500 error I thought I’d fixed for good:

Chunked Transfer-Encoding is not supported

After whipping out Wireshark, we realised that there was a tiny difference between what our J2ME client was doing and what the iPhone was doing.

This is the header the J2ME app was sending:

Transfer-Encoding: chunked

And this is the header the iPhone was sending:

Transfer-Encoding: Chunked

As much as I’d like to think the different casing of chunked and Chunked wouldn’t affect the behaviour of mod_proxy, it seems it does. Fortunately, we can work around this problem too by using Apache’s mod_headers module. This allows us to do the following:

RequestHeader edit Transfer-Encoding Chunked chunked early
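
For what it’s worth, the HTTP/1.1 spec says transfer-coding values are case-insensitive, so anything inspecting this header should treat Chunked and chunked the same way. Here’s a minimal Ruby sketch of that kind of comparison. The name chunked_transfer? is a hypothetical helper of my own, not something provided by Apache or Rails:

```ruby
# Hypothetical helper (not part of Apache, Rails or any library):
# returns true when the last transfer-coding listed is "chunked",
# ignoring case and surrounding whitespace.
def chunked_transfer?(header_value)
  return false if header_value.nil?

  # A Transfer-Encoding header may list several codings, comma separated.
  codings = header_value.split(",").map { |coding| coding.strip.downcase }
  codings.last == "chunked"
end
```

With a check like this, the J2ME header and the iPhone header are indistinguishable, which is how I’d have expected mod_proxy to behave.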

When combined with the solution from my previous article, this leaves us with the following complete solution:

Apache configuration to reconstitute “chunked” HTTP requests
ProxyRequests Off

<Proxy http://localhost:81>
  Order deny,allow
  Allow from all
</Proxy>

Listen 80

<VirtualHost *:80>

  RequestHeader edit Transfer-Encoding Chunked chunked early

  SetEnv proxy-sendcl 1
  ProxyPass / http://localhost:81/
  ProxyPassReverse / http://localhost:81/
  ProxyPreserveHost On
  ProxyVia Full

  <Directory proxy:*>
    Order deny,allow
    Allow from all
  </Directory>

</VirtualHost>

Listen 81

<VirtualHost *:81>
  ServerName ooboontoo
  DocumentRoot /path/to/my/rails/root/public
  RailsEnv development
</VirtualHost>

“Transfer-Encoding: chunked”, or, Chunky HTTP!

Providing a web service for a bunch of browsers is a relatively straightforward affair. It’s really only once you jump out of the back-end and into the front-end side of things that issues like browser incompatibilities start to become a problem. Thankfully, I feel like I’ve got my head wrapped around what’s involved in providing solutions to these problems.

And then mobile phones came along.

The service I’m working on at the moment is consumed by a bunch of clients, including but not limited to web browsers, WAP browsers, the Flash player, iPhones and J2ME devices. It’s the last one that’s causing headaches at the moment.

You see, despite the fact that HTTP/1.1 is about 9 years old, not all web servers support the features it introduced. The particular one I’m talking about is chunked transfer encoding, but I’m sure there are many others.

To give you a general idea, the HTTP implementations on many mobile handsets will decide to use a chunked transfer encoding if the payload of a PUT/POST request is over an arbitrary threshold. This causes issues with servers like Nginx, Lighttpd and Thin, since most of those assume that an incoming HTTP request with a payload will also include a Content-Length header.

Well guess what? As of 9 years ago, that hasn’t been the case.

Take a look at this request:

HTTP request specifying “chunked” transfer encoding
POST /search.json HTTP/1.1
User-Agent: curl/7.16.4 (i486-pc-linux-gnu) libcurl/7.16.4 OpenSSL/0.9.8e zlib/1.2.3.3 libidn/1.0
Host: ooboontoo
Accept: */*
Transfer-Encoding: chunked
Content-Type: multipart/form-data; boundary=----------------------------ab5090ac7869


92
------------------------------ab5090ac7869
Content-Disposition: form-data; name="query"

zoooom
------------------------------ab5090ac7869--


0

Notice anything? If not, try to find a Content-Length header. Pre-HTTP/1.1 you’d expect to get an HTTP 411 error (Length Required), but the HTTP/1.1 specification is pretty clear about what applications are obliged to do:

All HTTP/1.1 applications that receive entities MUST accept the “chunked” transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.

That’s why I find it so surprising that hacks are involved in allowing mobile clients to POST/PUT to what I’d traditionally thought of as HTTP/1.1 compliant web servers.
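
If the wire format in the request above looks unfamiliar: each chunk is prefixed with its size in hexadecimal (the 92 is 146 bytes), and a zero-sized chunk ends the body. Here’s a rough Ruby sketch of that framing. The encode_chunked name and the default chunk size are my own assumptions for illustration, not anything a particular client uses:

```ruby
CRLF = "\r\n"

# Hypothetical helper: frames +body+ as an HTTP/1.1 chunked message body.
# The default +chunk_size+ is an arbitrary choice for this sketch.
def encode_chunked(body, chunk_size = 1024)
  data = body.b # work in bytes rather than characters
  out = "".b
  offset = 0

  while offset < data.bytesize
    chunk = data.byteslice(offset, chunk_size)
    # Each chunk is: size in hex, CRLF, the chunk data, CRLF.
    out << chunk.bytesize.to_s(16) << CRLF << chunk << CRLF
    offset += chunk.bytesize
  end

  # A zero-sized chunk followed by a blank line terminates the body.
  out << "0" << CRLF << CRLF
end
```

Notice that the sender never needs to know the total length up front, which is exactly why there’s no Content-Length header to be found.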

But anyway, you want to see the solution right?

Well, our initial solution involved Gerald Kaszuba writing a little web server in Python which went by the name of “Dechunker”. You can imagine what it did, but we quickly found that while it was the simplest way to avoid the problem, it also meant that over time we would end up needing to implement the functionality that was already available in most other web servers. Servers like Apache2 and Lighttpd have become incredibly hardened over the years, and we’re not going to achieve that overnight.

So I then took another look at Apache2, knowing that some modules do support chunked transfer encoding while others don’t. What I discovered was that Apache’s mod_proxy module could be used in front of anything that doesn’t support chunked encoding, since it can be configured to “dechunk” requests before passing them to a backend.

It looks a little something like this:

Apache configuration to reconstitute “chunked” HTTP requests
ProxyRequests Off

<Proxy http://localhost:81>
  Order deny,allow
  Allow from all
</Proxy>

<VirtualHost *:80>
  SetEnv proxy-sendcl 1
  ProxyPass / http://localhost:81/
  ProxyPassReverse / http://localhost:81/
  ProxyPreserveHost On
  ProxyVia Full

  <Directory proxy:*>
    Order deny,allow
    Allow from all
  </Directory>

</VirtualHost>

Listen 81

<VirtualHost *:81>
  ServerName ooboontoo
  DocumentRoot /path/to/my/rails/root/public
  RailsEnv development
</VirtualHost>

As you can see I’ve got Apache2 listening on port 80, which uses the proxy-sendcl environment variable available in mod_proxy to repack the HTTP body and add a Content-Length header to the request. This request is then passed back to a virtual host running on port 81, which is configured to use Phusion Passenger.
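
To make the repacking step a bit more concrete, here’s a rough Ruby sketch of what “dechunking” involves: read a hex size line, read that many bytes, and stop at the zero-sized chunk. This is not how mod_proxy is implemented. The dechunk helper is hypothetical, and the sketch ignores trailers and does very little error handling:

```ruby
require "stringio"

# Hypothetical helper: reads a chunked body from +io+ and returns the
# reassembled bytes. Trailers after the final chunk are ignored.
def dechunk(io)
  body = "".b

  loop do
    size_line = io.gets("\r\n") or raise ArgumentError, "truncated chunk header"
    # The size is hexadecimal; anything after ";" is a chunk extension.
    size = Integer(size_line.strip.split(";", 2).first.to_s, 16)
    break if size.zero?

    chunk = io.read(size)
    raise ArgumentError, "truncated chunk data" unless chunk && chunk.bytesize == size
    body << chunk
    io.read(2) # consume the CRLF that follows the chunk data
  end

  body
end
```

The byte size of the returned string is the value that would go into the Content-Length header before the request is handed to the backend.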

Turns out it’s pretty simple, and from what I’ve seen there hasn’t been any negative performance impact from proxying all requests. It’s not a permanent solution, and as soon as Phusion Passenger fixes the chunked encoding bug, I’ll drop mod_proxy from our configuration.

ActiveRecord Model Serialisation

I’ve been working on a JSON API for mobile clients recently, and in doing so I’ve realised how much you need to repeat serialisation options throughout Rails applications despite options generally being model specific.

This little patch solves that problem by allowing you to decorate your Rails models with model-wide serialisation options, like so:

ActiveRecord model with class-level serialisation options defined
class Article < ActiveRecord::Base
  has_many :comments
  serialization_options :include => :comments
end

This means that whenever you call to_json or to_xml on an instance of Article, you’ll get the comment association thrown in for you. You’ll find you can clean up your controllers and remove the explicit calls to to_json, which previously would have looked like this:

Defining serialisation options every time to_json is called
respond_to do |format|
  format.json { render :json => @article.to_json(:include => :comments) }
end

But can now be changed to this:

No longer any need to specify the options
respond_to do |format|
  format.json { render :json => @article }
end

While it’s very simple, some people might find it useful. If you do, chuck this in your /lib directory and require it in RAILS_ROOT/config/environment.rb.

Allow class-level definition of serialisation options
# Adds a class-level macro for declaring default serialisation options.
module SerializationOptions
  def serialization_options(options = {})
    # Define an inheritable accessor and store a copy of the options on the model class.
    class_inheritable_accessor :serialization_options
    self.serialization_options = options.dup
  end
end

ActiveRecord::Base.send(:extend, SerializationOptions)

# Merge the class-level defaults into the serializer; options passed at call time win.
class ActiveRecord::Serialization::Serializer
  alias_method :old_initialize, :initialize
  def initialize(record, options = {})
    if record.respond_to? :serialization_options
      options = record.serialization_options.merge(options)
    end
    old_initialize(record, options)
  end
end
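
The order of the merge in the patch matters: the class-level options act as defaults, and anything passed explicitly to to_json or to_xml overrides them. Here’s a plain-Ruby sketch of the same pattern, with no Rails involved. DefaultSerializationOptions, FakeArticle and effective_options are hypothetical names made up for this illustration:

```ruby
# Hypothetical stand-in for the macro in the patch, without ActiveRecord.
module DefaultSerializationOptions
  # Called with a hash, stores class-level defaults; called without
  # arguments, returns the stored defaults (or an empty hash).
  def serialization_defaults(options = nil)
    @serialization_defaults = options.dup if options
    @serialization_defaults || {}
  end
end

# Hypothetical model used only to demonstrate the merge order.
class FakeArticle
  extend DefaultSerializationOptions
  serialization_defaults :include => :comments

  # Mirrors the serializer patch: defaults first, per-call options win.
  def effective_options(options = {})
    self.class.serialization_defaults.merge(options)
  end
end
```

Calling effective_options with no arguments returns the class-level defaults, while passing :include => :author replaces the default :include for that call only.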

Andy Clarke Announces the “CSS Eleven” at Web Directions South 2007

The opening session at Web Directions South was given to Andy Clarke, who proceeded to wrap it up with an announcement of a group put together to tackle the recent issues regarding submission of proposals and recommendations to the W3C. The eleven involved are:

  • Cameron Adams
  • Jina Bolton
  • Mark Boulton
  • Dan Cederholm
  • Andy Clarke
  • Jeff Croft
  • Aaron Gustafson
  • Jon Hicks
  • Roger Johansson
  • Richard Rutter
  • Jonathan Snook

From what was explained, their aim is to work through the CSS specifications and give feedback and examples for some of the more difficult issues, and then provide a body of work to the W3C and/or browser vendors with the hope that it moves things along a little faster than is currently the case. More details are sure to appear on the CSS Eleven website.

Here’s a snap of his slide: