
Category: Open source

Goodbye BitBucket

Around 2008-2009, I had the chance to work with Python pioneers from Logilab while on a customer assignment.
They are great supporters of the Python ecosystem and used (at least then) Mercurial for all their code versioning needs.
Coming from a CVS/Subversion heritage, I quickly found Mercurial to be a vastly superior solution and embraced it enthusiastically.

As a result, in 2010, when I started to work on openpyxl, Mercurial was the obvious choice for me. At that time, git (boosted by the GitHub platform launched in 2008) was already growing fast and was the “popular” option. However, its arcane CLI was a joke in comparison to the beginner-friendly, no-nonsense experience Mercurial was providing.

Finding hosting for the project was not easy: while the git experience itself was no fun, GitHub was really setting the standard for code hosting platforms. After a few experiments (including running my own repository, exploring GNU Savannah, Google Code, …) I finally settled on BitBucket.

The rest is part of openpyxl’s history, and while I often got told “WhY U No UsIng GiHUb??!”, Mercurial was the VCS I knew best, and as the main developer, it seemed to me that my efficiency was the main driver.

When Charlie joined the project, and later took it over completely, he shared the idea that the main developer’s experience matters most, and remained attached to hg, and to BitBucket as a result.

This era now comes to an end with Atlassian’s decision to sunset support of Mercurial in their hosting offering, which Charlie describes, and I agree, as “an absolute disgrace”.

I’m no longer actively involved in the project, and while I keep it on my radar, I don’t yet know what Charlie’s plans are. From what I can see, Heptapod (founded by former Logilab and Mercurial core dev Pierre-Yves David and his pals at Octobus) could offer a way out. Combining Mercurial with GitLab would definitely be a great alternative, bridging the best hosting experience with the most robust VCS.

I’m really looking forward to Heptapod gaining the traction it deserves (while I also agree that GitHub got a significant head start, and developers seem to put hype before practicality nowadays).

Regardless of where openpyxl ends up living, I fully trust Charlie to make the right call for the project, and will update this post once he has made a decision.


Installing Python 2.4 under Ubuntu 11.10 (Oneiric Ocelot)

As I’m currently merging the 2.x and 3.x branches of openpyxl, I now need to run tests against every existing Python release known to man. I decided to install a new Ubuntu VM to perform the tests, but I struggled with the 2.4 installation.

I first tried pythonbrew, which was a very disappointing experience as nothing worked as it was supposed to.

Then I tried to build from source myself, but here again the GCC version available on Ubuntu 11.10 is apparently no longer able to compile the original Python 2.4 sources.

Here is how I did it :

  1. download the RPM version for Fedora here : http://www.python.org/download/releases/2.4/rpms/ (direct link)
  2. install alien (sudo apt-get install alien)
  3. type the following command to convert the RPM and install it
$ sudo alien -i python2.4-2.4-1pydotorg.i386.rpm

Now you can type

$ python2.4

and everything should work !


openpyxl 1.5.6

Small compatibility release this time, no big features added, see the changelog for yourself:

  • [iter_worksheet] add support for calculated strings (they have a special data type for that ?)
  • [strings] make sure we always use unicode strings everywhere
  • [iter] fixed max row and column detection for iter reader
  • [styles] fixed custom number format detection under OOCalc
However, a large effort has been made on supporting the whole Python 2.4 – 2.7 range. In the past, 2.4 and 2.5 compatibility was broken; it should now be back to normal.
Once again, thanks to all the contributors for their help 😉

openpyxl 1.5.5

After a long (long) break (sorry, I eventually bought a PS3, I knew this could happen), we come back with a new version of openpyxl !

This has been requested by a few people, so here is a small changelog since the last version (1.5.4):

  • Commented out the ‘scheme’ element as it appears to prevent the font name from being applied in Excel.
  • [cell] added Decimal type as a numeric type (fixes #78)
  • [writer] added write support for wrapped text (fixes #65)
  • [cell] fixed the numeric regexp (fixes #77)
  • [cell] added encoding support for data input (fixes #76)
  • [excel(reader)] Added try except around loading workbook properties, so that workbooks with no properties get the default properties
  • [date_time] Modified the re that deals with W3CDTF date format to allow dates without an ending Z (fixes #73)
  • [cell] Altered the re for numeric in order to block out numbers with leading zeroes (fixes #70)
  • [worksheet(writer)] fixed long number bug. using repr on a long number causes it to return the number with L appended. Now using str() on long numbers
  • [dump] watching file descriptors to avoid a ‘too many open files’ error when dumping a large number of worksheets

So, a lot of bug fixes, such as file descriptors, numeric regexps and leading zeroes, and a few improvements, like encoding support.
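To illustrate the leading-zero fix mentioned in the changelog, here is a minimal sketch of a numeric check that rejects values such as "007", so that they stay strings instead of being turned into numbers. The pattern and the `looks_numeric` name are made up for this example; this is not the actual regular expression shipped in openpyxl.

```python
import re

# Hypothetical pattern (an assumption, not openpyxl's own regexp):
# an optional minus sign, then either a single 0 or a digit run that
# does not start with 0, then an optional decimal part.
NUMERIC = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?")


def looks_numeric(text):
    """Return True if text is a plain number without leading zeroes."""
    return NUMERIC.fullmatch(text) is not None


for sample in ("42", "-3.14", "0.5", "007", "01234", "abc"):
    print(sample, looks_numeric(sample))
```

Values like ZIP codes or part numbers ("01234") are typically the ones that suffer when a leading zero is silently dropped, which is why a check along these lines refuses them.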

I’m also very happy to see an ever-increasing number of contributors and of course users 🙂 Thanks again everybody !

PS: special thanks to Yaroslav Halchenko for the late night licence checks 😉


openpyxl 1.5.0 released

It has been almost one full year since I started this project, and it has now reached a state where it is well suited for production, and intensively tested by an increasing number of people around the globe.

The recent additions I’m the most proud of are the optimized reader and writer, which become stable in this version. It took around 6 months to get them working properly (the reader was the hardest actually), and now they’re here !

You can read and write workbooks of any size, with low and almost constant memory consumption (which is not the case with Excel actually).

I’m also really happy that people keep sending patches and asking for features, so the project continues to live, even when I’m not fully available to work on it.

You can get the latest version of openpyxl either with easy_install:


easy_install openpyxl

or from the official website.


openpyxl starts being used

When I started working on openpyxl a few months ago, I didn’t know it would attract that much activity. I’m very happy to see that it can apparently help so many people 🙂

I’ll try to follow up on the bug fixes and new features as far as my time permits, and will usually answer emails within the day. Thank you everyone for using the library, even though it is still far from being perfect 😉 Keep posting bugs on the tracker or ideas and requests on the mailing list !


openpyxl turns 1.1

After two weeks of intense activity around openpyxl, I’m releasing version 1.1 today. This new version brings support for dates and number formats.

Several bugs have been fixed, thanks to the careful testing of two new contributors, Jonathan Peirce and Yaroslav Halchenko, both working on the PsychoPy project.

Thanks guys for boosting my morale, and for providing valuable advice and patches !

Many thanks go to Marko Loparic for his support and enthusiasm 😉

You can get the sources for the latest version here http://bitbucket.org/ericgazoni/openpyxl. I expect a lot of bug reports with this new version, as it is stable but not extensively tested yet, and the number of possibilities has seriously increased with the introduction of number formatting.

Keep in mind that the memory footprint is still high, and that reducing it is the target for milestone 1.2. It should perform reasonably well if your needs are moderate (fewer than 100,000 cells), but if you want to add more data, then it might start consuming RAM pretty quickly. This holds for writing and reading.

Memory consumption is almost linear, and a 15MB workbook results in 450MB in RAM.
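As a rough back-of-the-envelope sketch, the figures above give a ratio of about 30 MB of RAM per MB of workbook. The snippet below simply extrapolates from that single data point, assuming the relationship stays linear; the function name and constants are chosen for this example only, and real numbers will vary with the content of the workbook.

```python
# Single data point quoted above: a 15 MB workbook needs about 450 MB of RAM.
REFERENCE_FILE_MB = 15.0
REFERENCE_RAM_MB = 450.0


def estimate_ram_mb(workbook_mb):
    """Linear extrapolation (an assumption) of RAM usage from file size."""
    ratio = REFERENCE_RAM_MB / REFERENCE_FILE_MB  # about 30 MB of RAM per MB
    return workbook_mb * ratio


print(estimate_ram_mb(15))
print(estimate_ram_mb(50))
```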

There is also a new mailing list for the project: http://groups.google.com/group/openpyxl-users. It’s pretty empty for now, but feel free to ask questions there, I’ll be reading it regularly.

Bug reports will be better handled if they are filed on the project bug tracker: http://bitbucket.org/ericgazoni/openpyxl/issues/new.

Happy coding !


openpyxl reaches 1.0 mark

After a few more efforts, I am pleased to announce the release of the first version of openpyxl.

The reader and the writer are working and tested for strings and numbers.

I have been able to read and write simple Excel 2007 xlsx files from Python and open them with Excel.

You can clone the repository using Mercurial:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

or download the release in zip format.

Edit: 1.0 release is really outdated, you might want to get a more recent version here.

The (sparse for now) documentation can be found on the wiki.

Reader usage (using the “empty_book.xlsx” file from the previous example)

from openpyxl.reader.excel import load_workbook

wb = load_workbook(filename = r'empty_book.xlsx')

sheet_ranges = wb.get_sheet_by_name(name = 'range names')

print sheet_ranges.cell('D18').value # should display D18

Code is published under the MIT licence, so you can use it for whatever you need, and I’d be very happy if you dropped me an email if you use it 🙂

If you don’t find it useful, spot a bug, or want to suggest an enhancement, you can do so by filing a ticket on the tracker.

Features that will be added in the next version are listed here, so if you need something in this list, please be patient or send me a message to tell me to hurry 😉


openpyxl: simple writer done

I’ve been very busy on openpyxl the last few days, and I managed to get a working writer for basic data types (strings, numerics).

For the impatient, you can clone my bitbucket repository:

hg clone https://ericgazoni@bitbucket.org/ericgazoni/openpyxl

It’s still a work in progress, so expect some quirks here and there, and if that happens, please file a new issue here.

If you like it, you can also drop a comment below or send me an email (see Contact page).

Usage is pretty simple as you can see:

from openpyxl.workbook import Workbook
from openpyxl.writer.excel import ExcelWriter

from openpyxl.cell import get_column_letter

wb = Workbook()

ew = ExcelWriter(workbook = wb)

dest_filename = r'empty_book.xlsx'

ws = wb.worksheets[0]

ws.title = "range names"

for col_idx in xrange(1, 40):
    col = get_column_letter(col_idx)
    for row in xrange(1, 600):
        ws.cell('%s%s'%(col, row)).value = '%s%s' % (col, row)

ws = wb.create_sheet()

ws.title = 'Pi'

ws.cell('F5').value = 3.14

ew.save(filename = dest_filename)
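As an aside, the get_column_letter helper used in the loop above turns a 1-based column index into its spreadsheet letters. Here is a minimal sketch of how such a conversion can work, assuming a plain bijective base-26 scheme. The `column_letter` name is made up for this example and this is not necessarily how openpyxl implements it.

```python
def column_letter(col_idx):
    """Convert a 1-based column index to letters (1 -> 'A', 27 -> 'AA')."""
    if col_idx < 1:
        raise ValueError("column index must be >= 1")
    letters = []
    while col_idx > 0:
        # Shift to 0-based before dividing: there is no "zero" letter,
        # so 26 must map to 'Z' and not to 'A0'.
        col_idx, remainder = divmod(col_idx - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))


for idx in (1, 26, 27, 39):
    print(idx, column_letter(idx))
```

With the range used in the example above (1 to 39), the generated columns would run from A to AM.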

Next features are:

  1. a working reader (so that I can read back files generated by the writer)
  2. dates support
  3. calculations
  4. formatting

Myth of the genius programmer

This session held at Google I/O last year brings so many important ideas on becoming a better programmer that it should definitely be shown in CS classes (and in some IT shops :p).

Thumbs up guys !

Video: http://www.youtube.com/watch?v=0SARbwvhupQ
