Friday, July 08, 2005

Project: WebDAV more

I was looking for an example of gzip support for web applications, and came across this page:
http://www.saddi.com/software/py-lib/

Middleware

Maybe I should have browsed through the WSGI sample code right at the start, but reading gzipMiddleWare made something click about how WSGI and middleware actually work. In Java I had been dealing with pre-filters and post-processors for HTTP requests as separate entities, so it did not occur to me that a single piece of middleware really does both.
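To make the "does both" point concrete, here is a minimal sketch, not taken from the gzip middleware itself. The names simple_app, UpperCaseMiddleware and call_app are made up for illustration, and call_app is only a toy stand-in for a real server. The same wrapper adjusts the request on the way in and transforms the response on the way out.

```python
def simple_app(environ, start_response):
    # Hypothetical application used only for this illustration.
    body = b"Hello from " + environ["PATH_INFO"].encode("latin-1")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


class UpperCaseMiddleware:
    """Hypothetical middleware: a WSGI app that wraps another WSGI app."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Pre-processing: normalise the request before the app sees it
        # (here, strip a trailing slash from the path).
        environ["PATH_INFO"] = environ.get("PATH_INFO", "/").rstrip("/") or "/"
        # Post-processing: transform each chunk the app produces.
        for chunk in self.app(environ, start_response):
            yield chunk.upper()


def call_app(app, path):
    """Toy stand-in for a server: returns (status, body) for one request."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], body


status, body = call_app(UpperCaseMiddleware(simple_app), "/docs/")
print(status, body)
```

The application never knows it was wrapped, and the server never knows it is talking to middleware rather than to the application directly.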

It also made me take another look at the Python Tutorial sections on generators and iterators, since they are used here and I am new to them. A WSGI application returns an iterable of strings: a generator (one that yields), an iterable object, or a sequence such as a list. If the application returns a single bare string (as I did initially), it is treated, inefficiently, as an iterable of individual characters.
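A small sketch of why the bare string is wasteful. The three applications and the count_chunks helper are hypothetical and only count how many pieces a server would have to write. The example uses Python 3 byte strings, where iterating a bare bytes object yields integers rather than one-character strings, but the number of pieces is the same.

```python
def count_chunks(app):
    """Call a WSGI app with a dummy environ and count the pieces it returns."""
    def start_response(status, headers, exc_info=None):
        pass  # a real server would record status and headers here

    return sum(1 for _ in app({}, start_response))


def app_returning_list(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world"]      # one chunk


def app_returning_bare_string(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return b"Hello, world"        # iterated one character at a time


def app_yielding(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Hello, "              # a generator: one chunk per yield
    yield b"world"


print(count_chunks(app_returning_list))         # 1
print(count_chunks(app_returning_bare_string))  # 12
print(count_chunks(app_yielding))               # 2
```

Wrapping the string in a one-element list is enough to avoid the per-character iteration.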


GZipping: Transfer or Content Encoding

The gzipping middleware got me slightly confused, since PEP 333 says:
"(Note: applications and middleware must not apply any kind of Transfer-Encoding to their output, such as chunking or gzipping; as "hop-by-hop" operations, these encodings are the province of the actual web server/gateway. See Other HTTP Features below, for more details.)".

As far as I can tell, Transfer-Encoding in practice mostly means "chunking", while "gzipping" is normally applied as a Content-Encoding, which is end-to-end rather than hop-by-hop and so could be supported by the application? Also, if gzip content-encoding is in effect, the Content-MD5 digest is computed from the gzipped content rather than the original content, and it ignores any Transfer-Encoding. Hope I got it right.


In which case, if gzip is in effect, the content may need to be buffered so that the MD5 and Content-Length of the compressed data can be computed before anything is sent. Could this adversely affect performance, given the size of the files that might be sent?
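A rough sketch of what that buffering looks like, assuming the whole response fits in memory. The function name gzip_buffered is made up, and this is not how the middleware linked above does it. Both Content-Length and Content-MD5 describe the compressed bytes, so neither can be known until compression has finished.

```python
import base64
import gzip
import hashlib
import io


def gzip_buffered(chunks):
    """Compress all chunks into memory, then derive headers from the result."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
        for chunk in chunks:
            gz.write(chunk)
    compressed = buffer.getvalue()

    # Content-MD5 is the base64 of the raw 16-byte MD5 digest,
    # taken over the body after content-coding has been applied.
    digest = base64.b64encode(hashlib.md5(compressed).digest()).decode("ascii")

    headers = [
        ("Content-Encoding", "gzip"),
        ("Content-Length", str(len(compressed))),
        ("Content-MD5", digest),
    ]
    return headers, compressed


headers, body = gzip_buffered([b"hello ", b"world"])
print(headers)
```

The cost is that memory use grows with the size of the compressed file. Leaving out Content-Length and Content-MD5 would allow the compressed chunks to be streamed as they are produced instead.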



Project

On the project, I read more of the WebDAV specification, and it builds nicely on the planned PyFileServer implementation. Locking and support for arbitrary dead properties may require a database.

There are a few other Python DAV servers mentioned on http://www.webdav.org/projects/, but they are older (since 2000). In any case a WSGI WebDAV server would be great :)

1 comment:

Ian Bicking said...

I'm glad middleware clicked for you -- it's really what makes WSGI interesting.

For dead properties, the "database" can be as simple as something using shelve, or maybe even better simple XML files (less locking involved if each file gets its own metadata file). Or a pickle, or something using rfc822 headers, or whatever. It's not to say that a database would be useless; if the properties were queryable a database would be quite nice. But it should be optional (or maybe just put off entirely).