Eric Gazoni's Blog

Daily thoughts for computer scientists

Category: Open source

Installing Python 2.4 under Ubuntu 11.10 (Oneiric Ocelot)

As I’m currently merging 2.x and 3.x branches of openpyxl, I now need to perform tests over all existing Python releases known to man. I decided to install a new Ubuntu VM to perform the the tests, but I struggled during 2.4 installation.

Tried pythonbrew, which was a very disappointing experience as nothing worked as it was supposed to.

Then I tried to build from source myself, but here again the GCC version available on ubuntu 11.10 is apparently no longer able to compile original python sources.

Here is how I did it :

  1. download RPM version for Fedora here : http://www.python.org/download/releases/2.4/rpms/ (direct link)
  2. install alien (sudo apt-get install alien)
  3. type the following command
$ sudo alien -i python2.4-2.4-1pydotorg.i386.rpm

Now you can type

$ python2.4

and everything should work !

openpyxl 1.5.6

Small compatibility release this time, no big features added, see the changelog by yourself:

  • [iter_worksheet] add support for calculated strings (they have a special data type for that ?)
  • [strings] make sure we always use unicode strings everywhere
  • [iter] fixed max row and column detection for iter reader
  • [styles] fixed custom number format detection under OOCalc
However, a large effort has been made on supporting the whole python 2.4 – 2.7 range. In the past, 2.4 and 2.5 compatibilities were damaged, now it should be restored back to normal.
Once again, thanks to all the contributors for their help 😉

openpyxl 1.5.5

After a long (long) break (sorry, I eventually bought a PS3, I knew this could happen), we come back with a new version of openpyxl !

This has been requested by a few people so here is a small changelog since the last (1.5.4 version):

  • Commented out the ‘scheme’ element as it appears to prevent the font name from being applied in Excel.
  • [cell] added Decimal type as a numeric type (fixes #78)
  • [writer] added write support for wrapped text (fixes #65)
  • [cell] fixed the numeric regexp (fixes #77)
  • [cell] added encoding support for data input (fixes #76)
  • [excel(reader)] Added try except around loading workbook properties, so that workbooks with no properties get the default properties
  • [date_time] Modified the re that deals with W3CDTF date format to allow dates without an ending Z (fixes #73)
  • [cell] Altered the re for numeric in order to block out numbers with leading zeroes (fixes #70)
  • [worksheet(writer)] fixed long number bug. using repr on a long number causes it to return the number with L appended. Now using str() on long numbers
  • [dump] watching file descriptors to avoid a ‘too many open files’ error when dumping a large number of worksheets

So, a lot of big fixes, such as file descriptors, numerical regexps and leading zeroes, and a few improvements, like encoding support.

I’m also very happy to see an ever-increasing number of contributors and of course users 🙂 Thanks again everybody !

PS: special thanks to Yaroslav Halchenko for the late night licence checks 😉

openpyxl 1.5.0 released

It has been almost one full year since I started this project, and it has now reached a state where it is well suited for production, and intensively tested by an increasing number of people around the globe.

The most recent additions I’m the most proud of are the optimized reader and writer, that become stable in this version. It took around 6 months to get them working properly (the reader was the hardest actually), and now they’re here !

You can read and write workbooks of any size, with low and almost constant memory consumption (which is not the case with Excel actually).

I’m also really happy that people keep sending patches and asking for features, so the project continues to live, even when I’m not fully available to work on it.

You can get the latest version of openpyxl either with easy_install:


easy_install openpyxl

or from the official website.

openpyxl starts being used

When I started working on openpyxl a few months ago, I didn’t know it would catch that much activity around it. I’m very happy to see that it can apparently help  so many people 🙂

I’ll try to follow-up on the bug fixes and new features as far as my time permits, and will usually answer emails within the day. Thank you everyone for using the library, even though it is still far from being perfect 😉 Keep posting bugs on the tracker or ideas and requests on the mailing list !

openpyxl turns 1.1

After two weeks of intense activity around openpyxl, I’m releasing version 1.1 today. This new version brings support for dates and number formats.

Several bugs have been fixed, thanks to the careful testing of two new contributors, Jonathan Peirce and Yaroslav Halchenko, both working on the PsychoPy project.

Thanks guys for boosting my morale, providing valuable advises and patches !

Many thanks goes to Marko Loparic for his support and enthusiasm 😉

You can get the sources for the latest version here http://bitbucket.org/ericgazoni/openpyxl. I expect a lot of bug reports with this new version, as it is stable but not extensively tested yet, and that the number of possibilities have seriously increased with the introduction of number formatting.

Keep in mind that the memory footprint is still high, but that it is the target for milestone 1.2. It should perform reasonably well if your needs are moderate (<100.000 cells), but if you want to add more data, then it might start consuming RAM pretty quickly. This holds for writing and reading.

Memory consumption is almost linear, and a 15MB workbook results in 450MB in RAM.

There is also a new mailing list for the project: http://groups.google.com/group/openpyxl-users. It’s pretty empty for now, but feel free to ask questions there, I’ll be reading it regularly.

Bug reports will be better handled if they are filed on the project bug tracker: http://bitbucket.org/ericgazoni/openpyxl/issues/new.

Happy coding !

openpyxl reaches 1.0 mark

After a few more efforts, I am pleased to announce the release of the first version of openpyxl.

The reader and the writer are working and tested for strings and numbers.

I have been able to read and write simple Excel 2007 xlsx files from Python and open them with Excel.

You can clone the repository using Mercurial:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

or download the release in zip format.

Edit: 1.0 release is really outdated, you might want to get a more recent version here.

The (sparse for now) documentation can be found on the wiki.

Reader usage (using the “empty_book.xlsx” file from the previous example)

from openpyxl.reader.excel import load_workbook

wb = load_workbook(filename = r'empty_book.xlsx')

sheet_ranges = wb.get_sheet_by_name(name = 'range names')

print sheet_ranges.cell('D18').value # should display D18

Code is published under the MIT licence, so you can use it for whatever use you need, and I’d be very happy if  you drop me an email if  you use it 🙂

If you don’t find it useful, spot a bug, or want to suggest an enhancement, you can do so by filling a ticket on the tracker.

Features that will be added in the next version are listed here, so if you need something in this list, please be patient or send me a message to tell me to hurry 😉

openpyxl: simple writer done

I’ve been very busy on openpyxl the last few days, and I managed to get a working writer for basic data types (strings, numerics).

For the impatient, you can clone my bitbucket repository:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

It’s still a work in progress, so expect some quirks here and there, and if that happens, please file a new issue here.

If you like it, you can also drop a comment below or send me an email (see Contact page).

Usage is pretty simple as you can see:

from openpyxl.workbook import Workbook
from openpyxl.writer.excel import ExcelWriter

from openpyxl.cell import get_column_letter

wb = Workbook()

ew = ExcelWriter(workbook = wb)

dest_filename = r'empty_book.xlsx'

ws = wb.worksheets[0]

ws.title = &quot;range names&quot;

for col_idx in xrange(1, 40):
    col = get_column_letter(col_idx)
    for row in xrange(1, 600):
        ws.cell('%s%s'%(col, row)).value = '%s%s' % (col, row)

ws = wb.create_sheet()

ws.title = 'Pi'

ws.cell('F5').value = 3.14

ew.save(filename = dest_filename)

Next features are:

  1. a working reader (so that I can read back files generated by the writer)
  2. dates support
  3. calculations
  4. formatting

Myth of the genius programmer

This session held at Google I/O last year brings so many important ideas to become a better programmer that is should definitely be shown in CS classes (and in some IT shops :p).

Thumbs up guys !

openpyxl: my python xlsx library

Update: openpyxl 1.0 is now out !

At a customer, we read a lot of Excel files. We’ve tried the conventional approaches, that are xlrd and xlwt, pyinex, and COM automation.

That’s COM that we mainly use, because it’s able to deal with every Excel file format, from the ancient Excel 5 to most recent Excel 2007 Office Open XML format.
However, we experience from time to time stability issues (Excel is a complex beast, sometimes you don’t fully understand why it is angry).

We then looked for a native reader for .xlsx format, to get rid of the Excel part of the equation, but unfortunately, there are only two small read-only libraries for now:

Finally, I thought that I was the only guy who needed a native .xslx writer, and decided to stick with COM for now.
I wouldn’t be doing this project now without a tweet from Tarek Ziadé, who was also looking for such a library. That meant that we were at least two in need for the same thing, so I simply decided to write it.

Trust me, the Office Open XML format is open, but it’s also a bit twisted, so I spent a few days gathering documentation, and I finally landed on the PHPExcel library, that was already doing what I needed, but in PHP.

So now, I’m busy porting the PHPExcel library under Python, which is really easy, because of the similarities between both languages, but I can also benefit from all the nice things that come with Python, so the code is much simpler.

You can follow my progress on bitbucket: http://bitbucket.org/ericgazoni/openpyxl/