Eric Gazoni's Blog

Daily thoughts for computer scientists

Category: Python

Installing Python 2.4 under Ubuntu 11.10 (Oneiric Ocelot)

As I’m currently merging the 2.x and 3.x branches of openpyxl, I now need to run the tests against every Python release known to man. I decided to install a new Ubuntu VM to run them, but I struggled with the 2.4 installation.

Tried pythonbrew, which was a very disappointing experience as nothing worked as it was supposed to.

Then I tried to build from source myself, but the GCC version shipped with Ubuntu 11.10 is apparently no longer able to compile the original Python 2.4 sources.

Here is how I did it :

  1. download RPM version for Fedora here : http://www.python.org/download/releases/2.4/rpms/ (direct link)
  2. install alien (sudo apt-get install alien)
  3. type the following command
$ sudo alien -i python2.4-2.4-1pydotorg.i386.rpm

Now you can type

$ python2.4

and everything should work !
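Once several interpreters are installed side by side, it helps to check which ones are actually reachable before launching a test run. Here is a minimal sketch of such a check, written for a recent Python 3; the list of interpreter names is only an assumption, adapt it to whatever you installed:

```python
import shutil

# Hypothetical list of interpreter names to look for on PATH.
CANDIDATES = ('python2.4', 'python2.5', 'python2.6', 'python2.7', 'python3')

def find_interpreters(names=CANDIDATES):
    """Map each interpreter name to its full path, or None when it is missing."""
    return {name: shutil.which(name) for name in names}

if __name__ == '__main__':
    for name, path in find_interpreters().items():
        print(name, '->', path or 'not installed')
```

If python2.4 shows up as "not installed" after the alien step, the package probably put the binary somewhere outside your PATH.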

openpyxl 1.5.6

Small compatibility release this time, no big features added, see the changelog for yourself:

  • [iter_worksheet] add support for calculated strings (they have a special data type for that ?)
  • [strings] make sure we always use unicode strings everywhere
  • [iter] fixed max row and column detection for iter reader
  • [styles] fixed custom number format detection under OOCalc
However, a large effort has been made to support the whole Python 2.4 – 2.7 range. Compatibility with 2.4 and 2.5 had been broken in past releases; it should now be back to normal.
Once again, thanks to all the contributors for their help 😉

openpyxl 1.5.5

After a long (long) break (sorry, I eventually bought a PS3, I knew this could happen), we come back with a new version of openpyxl !

This has been requested by a few people, so here is a small changelog since the last version (1.5.4):

  • Commented out the ‘scheme’ element as it appears to prevent the font name from being applied in Excel.
  • [cell] added Decimal type as a numeric type (fixes #78)
  • [writer] added write support for wrapped text (fixes #65)
  • [cell] fixed the numeric regexp (fixes #77)
  • [cell] added encoding support for data input (fixes #76)
  • [excel(reader)] Added try except around loading workbook properties, so that workbooks with no properties get the default properties
  • [date_time] Modified the re that deals with W3CDTF date format to allow dates without an ending Z (fixes #73)
  • [cell] Altered the re for numeric in order to block out numbers with leading zeroes (fixes #70)
  • [worksheet(writer)] fixed long number bug. using repr on a long number causes it to return the number with L appended. Now using str() on long numbers
  • [dump] watching file descriptors to avoid a ‘too many open files’ error when dumping a large number of worksheets

So, a lot of bug fixes, such as file descriptors, numerical regexps and leading zeroes, and a few improvements, like encoding support.
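To illustrate the leading-zeroes and Decimal items, here is a minimal sketch of the kind of check involved, written for Python 3. The pattern and the `looks_numeric` name are made up for this example and are not copied from openpyxl; the idea is simply that a string like "0123" (a zip code, a part number) should stay text, while "123" or "0.5" can be stored as a number:

```python
import re
from decimal import Decimal

# Hypothetical pattern: optional sign, then a lone zero or a digit run that
# does not start with zero, then an optional fraction and exponent.
NUMERIC_RE = re.compile(r'[-+]?(0|[1-9]\d*)(\.\d+)?([eE][-+]?\d+)?')

def looks_numeric(value):
    """Return True when value should be stored as a number rather than text."""
    if isinstance(value, bool):
        return False  # bool is a subclass of int, keep it out of the numerics
    if isinstance(value, (int, float, Decimal)):
        return True
    return isinstance(value, str) and NUMERIC_RE.fullmatch(value) is not None
```

With this sketch, `looks_numeric('0123')` is False and `looks_numeric(Decimal('1.5'))` is True.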

I’m also very happy to see an ever-increasing number of contributors and of course users 🙂 Thanks again everybody !

PS: special thanks to Yaroslav Halchenko for the late night licence checks 😉

openpyxl 1.5.0 released

It has been almost one full year since I started this project, and it has now reached a state where it is well suited for production, and intensively tested by an increasing number of people around the globe.

The recent additions I’m the most proud of are the optimized reader and writer, which become stable in this version. It took around 6 months to get them working properly (the reader was actually the hardest), and now they’re here !

You can read and write workbooks of any size, with low and almost constant memory consumption (which is not the case with Excel actually).
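For the curious, the general idea behind a constant-memory reader is to stream the worksheet XML and forget each row as soon as it has been handed over. The snippet below is only a sketch of that technique using the standard library (Python 3), not openpyxl's actual code; the `iter_rows` name and the sample data are made up for the example:

```python
import io
import xml.etree.ElementTree as ET

# Namespace used by worksheet parts in xlsx files.
NS = '{http://schemas.openxmlformats.org/spreadsheetml/2006/main}'

def iter_rows(sheet_xml):
    """Yield one list of raw cell values per <row>, freeing each row after use."""
    for _event, elem in ET.iterparse(sheet_xml, events=('end',)):
        if elem.tag == NS + 'row':
            yield [cell.findtext(NS + 'v') for cell in elem.iter(NS + 'c')]
            elem.clear()  # drop the parsed cells so memory stays roughly flat

# Tiny hand-written worksheet, standing in for a real sheet1.xml.
SAMPLE = (
    b'<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    b'<sheetData>'
    b'<row r="1"><c r="A1"><v>1</v></c><c r="B1"><v>2</v></c></row>'
    b'<row r="2"><c r="A2"><v>3</v></c></row>'
    b'</sheetData></worksheet>'
)

rows = list(iter_rows(io.BytesIO(SAMPLE)))
```

Because the rows come out of a generator, the caller decides how many of them are alive at once, which is what keeps the footprint low on large files.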

I’m also really happy that people keep sending patches and asking for features, so the project continues to live, even when I’m not fully available to work on it.

You can get the latest version of openpyxl either with easy_install:


easy_install openpyxl

or from the official website.

SSL Secured Piston Webservice

On FreeBSD, there are a few gotchas when working with Apache + SSL + Piston.

Here are my findings:

  • Enabling SSL in Apache 2.0

As most SSL-related functions are enclosed in <IfDefine SSL> blocks, adding

apache2_enable="YES"
apache2_flags="-D SSL"

to /etc/rc.conf will enable them.

  • Disabling _default_ SSL Virtualhost

There’s a _default_ virtualhost defined in the ssl.conf file, and activated at the same time as the rest of the SSL config.

I didn’t find a “clean” way to disable it, and it was conflicting with my own virtualhost, so I wrapped it between <IfDefine SSLVH> tags and it did the trick 🙂

  • Generating SSL keys

I followed a guide found on google (in French). Extremely useful.

Copied them to /usr/local/etc/apache2/ssl.key/ and /usr/local/etc/apache2/ssl.crt/

  • Updating my virtualhosts to fetch HTTPS requests

As I disabled the _default_ virtualhost, I needed to make a copy of my existing (port 80) virtualhost, and merge it with what was defined in the _default_ one.

<VirtualHost *:443>

  ServerName servername.com

  SSLEngine On
  SSLOptions +FakeBasicAuth +ExportCertData +StrictRequire

  SSLCertificateKeyFile /usr/local/etc/apache2/ssl.key/server.key
  SSLCertificateFile /usr/local/etc/apache2/ssl.crt/server.crt

  SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL
  <FilesMatch "\.(cgi|shtml|phtml|php3?)$">
    SSLOptions +StdEnvVars
  </FilesMatch>
  <Directory "/usr/local/www/cgi-bin">
    SSLOptions +StdEnvVars
  </Directory>

  SetEnvIf User-Agent ".*MSIE.*" \
           nokeepalive ssl-unclean-shutdown \
           downgrade-1.0 force-response-1.0

  CustomLog /var/log/httpd-ssl_request.log \
            "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

  [...]
</VirtualHost>

  • Open port 443 on the firewall

Almost forgot this one 🙂

Piston + HTTP Authentication + mod_FastCGI

I’ve spent much more time than necessary on this issue, so if it can help someone, I’m posting a working config for Django Piston + Apache’s mod_fastcgi + HTTP Basic Authentication.

Works with :

  • Apache 2.0.63
  • Django 1.2.3
  • Piston 0.2.2

I’ve updated the default config file found on the official Django website:

FastCGIExternalServer /project/dir/mysite.fcgi -socket /project/dir/mysite.sock -pass-header Authorization

<VirtualHost 123.456.78.9>
ServerName servername.com
DocumentRoot /project/dir
Alias /media /project/dir/media
RewriteEngine On
RewriteRule ^/(media.*)$ /$1 [QSA,L,PT]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ /mysite.fcgi/$1 [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},QSA,L]
</VirtualHost>

tl;dr: -pass-header Authorization and [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},QSA,L]
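When debugging this kind of setup, it helps to know exactly what should arrive on the Django side: the `Authorization` header sent by the client, which the config above forwards as `HTTP_AUTHORIZATION`. Here is a small Python 3 sketch that builds and decodes such a header by hand; the function names and the credentials are made up for the example and have nothing to do with Piston's own code:

```python
import base64

def make_basic_auth(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode('{}:{}'.format(username, password).encode('utf-8'))
    return 'Basic ' + token.decode('ascii')

def parse_basic_auth(header):
    """Return (username, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(' ')
    if scheme.lower() != 'basic':
        raise ValueError('not a Basic Authorization header')
    decoded = base64.b64decode(token).decode('utf-8')
    # Split on the first colon only: the password itself may contain colons.
    username, _, password = decoded.partition(':')
    return username, password

header = make_basic_auth('alice', 'secret')  # hypothetical credentials
```

If the header never reaches your view, you can log what the application receives and compare it with the output of `make_basic_auth` to see whether it was dropped or mangled on the way.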

openpyxl turns 1.1

After two weeks of intense activity around openpyxl, I’m releasing version 1.1 today. This new version brings support for dates and number formats.

Several bugs have been fixed, thanks to the careful testing of two new contributors, Jonathan Peirce and Yaroslav Halchenko, both working on the PsychoPy project.

Thanks guys for boosting my morale, and for providing valuable advice and patches !

Many thanks go to Marko Loparic for his support and enthusiasm 😉

You can get the sources for the latest version here http://bitbucket.org/ericgazoni/openpyxl. I expect a lot of bug reports with this new version, as it is stable but not extensively tested yet, and the number of possibilities has seriously increased with the introduction of number formatting.

Keep in mind that the memory footprint is still high; reducing it is the target for milestone 1.2. It should perform reasonably well if your needs are moderate (<100,000 cells), but if you want to add more data, then it might start consuming RAM pretty quickly. This holds for both writing and reading.

Memory consumption is almost linear, and a 15MB workbook results in 450MB in RAM.

There is also a new mailing list for the project: http://groups.google.com/group/openpyxl-users. It’s pretty empty for now, but feel free to ask questions there, I’ll be reading it regularly.

Bug reports will be better handled if they are filed on the project bug tracker: http://bitbucket.org/ericgazoni/openpyxl/issues/new.

Happy coding !

openpyxl reaches 1.0 mark

After a few more efforts, I am pleased to announce the release of the first version of openpyxl.

The reader and the writer are working and tested for strings and numbers.

I have been able to read and write simple Excel 2007 xlsx files from Python and open them with Excel.

You can clone the repository using Mercurial:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

or download the release in zip format.

Edit: 1.0 release is really outdated, you might want to get a more recent version here.

The (sparse for now) documentation can be found on the wiki.

Reader usage (using the “empty_book.xlsx” file from the previous example)

from openpyxl.reader.excel import load_workbook

wb = load_workbook(filename = r'empty_book.xlsx')

sheet_ranges = wb.get_sheet_by_name(name = 'range names')

print sheet_ranges.cell('D18').value # should display D18

Code is published under the MIT licence, so you can use it for whatever you need, and I’d be very happy if you dropped me an email to tell me you use it 🙂

If you don’t find it useful, spot a bug, or want to suggest an enhancement, you can say so by filing a ticket on the tracker.

Features that will be added in the next version are listed here, so if you need something in this list, please be patient or send me a message to tell me to hurry 😉

openpyxl: simple writer done

I’ve been very busy on openpyxl the last few days, and I managed to get a working writer for basic data types (strings, numerics).

For the impatient, you can clone my bitbucket repository:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

It’s still a work in progress, so expect some quirks here and there, and if that happens, please file a new issue here.

If you like it, you can also drop a comment below or send me an email (see Contact page).

Usage is pretty simple as you can see:

from openpyxl.workbook import Workbook
from openpyxl.writer.excel import ExcelWriter

from openpyxl.cell import get_column_letter

wb = Workbook()

ew = ExcelWriter(workbook = wb)

dest_filename = r'empty_book.xlsx'

ws = wb.worksheets[0]

ws.title = "range names"

for col_idx in xrange(1, 40):
    col = get_column_letter(col_idx)
    for row in xrange(1, 600):
        ws.cell('%s%s'%(col, row)).value = '%s%s' % (col, row)

ws = wb.create_sheet()

ws.title = 'Pi'

ws.cell('F5').value = 3.14

ew.save(filename = dest_filename)
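A word on `get_column_letter`, used in the loop above: it turns a 1-based column index into Excel's column letters. The following is a stand-alone sketch of that mapping in Python 3, not the library's own code (the `column_letter` name is made up for the example). It is a base-26 conversion without a zero digit, hence the `index - 1` at each step:

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError('column index starts at 1')
    letters = ''
    while index > 0:
        # Shift to 0-based before dividing, since there is no "zero" letter.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord('A') + remainder) + letters
    return letters
```

In the example above, `xrange(1, 40)` covers indexes 1 to 39, that is columns A to AM.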

Next features are:

  1. a working reader (so that I can read back files generated by the writer)
  2. dates support
  3. calculations
  4. formatting

IronPython and WPF

Last week I came across a few websites dealing with the dynamic generation of Winforms in IronPython.

I’m not much into code-generated UIs, because it’s easy to get two or three controls on a form, but as soon as you have a dozen, it can be a nightmare to lay them out properly only with code. For example, it might need several tries to get a decent width for your text boxes, or a pleasing height for your lists. When using a WYSIWYG UI editor, at least you’re playing with the real thing, and save a lot of time on the design process.

On the other side, I’m not much into the Visual Studio way of doing UIs (aka “mouse click hell”), where it’s so tempting to put your logic behind the form, because that’s the way it expects you to do it.

The best way of designing forms I know is how Qt does it:

  1. design your interface in a WYSIWYG, drag-n-drop designer
  2. save it in a programming language agnostic format (Qt uses XML)
  3. translate it into a module in your favorite programming language, through a specialized compiler
  4. import it in your application
  5. now you can plug it to your application logic

I wanted to use the same flow in .NET, but that was not possible … until the introduction of WPF and the XAML format.
