ChangeLog: October 24, 2019 – The Python 3 and Django 1.11 Migration

Hi, Christian Hammond here. Welcome back to ChangeLog, where we cover the latest going on in Review Board, RBCommons, and Beanbag in general. It’s been a while since we’ve posted one of these, but we’re aiming to bring this back as a weekly series.

There’s a lot to talk about.

The Short Version

  • We delayed Review Board 4.0 to bring Django 1.11 and Python 3 support. This will not be released by Python 2’s End Of Life date.
  • Python 3 and Django 1.11 support has been in the works for a long time, but is far more complex than it may seem.

The Longer Version: Where are we today

We’ve had four main focuses this year:

  • Getting Review Board 4.0 beta 1 ready to launch. This is way behind our original schedule, but intentionally so — we’ll go over the reasons why in a minute.
  • Bringing integrations options to RBCommons and improving team management, billing, and signup.
  • Building out new functionality for Power Pack (PDF diffing, and in-progress authentication improvements).
  • Growing our business (the prior two tasks were part of that), supporting customers, and assisting some with large projects of their own.

This along with a multitude of smaller tasks taking our time throughout the year (and, on a personal note, dealing with the aftermath of the Paradise, CA “Camp Fire.”)

Most of you are here to find out about Review Board, so let’s dive into that.

The “World Update”: Review Board 4.0, Python 3, and Django 1.11

Our original goal for Review Board 4.0 was to bring support for multi-commit review requests and to ship that. This was a large project, as Review Board was originally written before DVCS was common (Subversion and Perforce were the primarily open source and enterprise solutions).

The problem though, is Python 2 End Of Life is coming up fast, and if we kept on schedule with Review Board 4.0, it would be a long time until we’d support Python 3. So we decided to delay 4.0 in order to get it ready for Python 3.

Note that we also still need to support Python 2.7, since many companies are still heavily tied to 2.7 (older distros, custom extensions and scripts).

So what’s hard about supporting Python 3? Welllll….

The Python Compatibility Problem

Python 3 is leaps and bounds better than Python 2 in most ways, but porting a large and complex product from Python 2 to 3 is even harder than you think.

Most Python users know of the major differences:

  • Modules have moved and functions have been renamed
  • The default string type has changed from byte strings to Unicode
  • Some operators have changed
  • There’s new syntax additions
  • etc.

We thought we prepared years ago to make this move easy. We used Unicode strings in every file. We used the six module to help with using the right modules, types, and functions.

In the end, it was harder than we expected. Our biggest challenge was definitely the Byte Strings vs. Unicode Strings differences. We thought we were in good shape for this, but we weren’t close.

Review Board does a lot of text processing. We’re parsing uploaded diffs, pulling source files from repositories, matching those up and applying the patch to the source, and generating side-by-side diffs. Much of this logic is over 10 years old, even so, we put a substantial amount of work in preparing for Python 3 years ago, so we were shocked by how much we got this wrong.

The problem comes down to the differences in how Python 2 and 3 would handle shoving two different string types together, which is very easy to do accidentally. Python 2 would roll with it, as in many cases the string types were compatible (and would automatically encode/decode so long as the content was basically ASCII). In Python 3, they outright break — which is a better approach, but hard to transition to — and it led us down a rabbit hole of problems.

Missing b'...' prefixes, changes in string return types from Python functions, functions that are no longer compatible with both string types, and very difficult decisions to make regarding compatibility with third-party SCMs.

All the little regressions and inconsistencies added up. Here’s a few more examples:

  • Using Python’s pickle library was a mess. Defaults have changed and pickled data became incompatible (due to module reference and string type changes). We had to build in compatibility layers here and test them thoroughly.
  • Anything that even subtly/unintentionally relied on dictionary sort orders would break.
  • Getting data in/out of processes, streams, and many other objects and functions is now way more sensitive to encoding issues, and required a lot of careful rewrites and testing.
  • Many functions (in Python and third-party modules) that used to return lists now return generators, due to their reliance on other functions that changed return types, and this can cause all sorts of subtle behavioral changes.

We’ve spent a lot of time tracking down issues that might immediately crash or might affect data several stages down, and sometimes be traced to something outside our codebase.

Fixing some Python issues means upgrading third-party modules, which can introduce their own new set of changes, regressions, and new rounds of work.

And the biggest source of that was Django.

The Django Problem

Django 1.6 (the version we’ve been using) breaks on Python 3.6+. This meant we had to upgrade to a newer version, something we’ve been putting off.

Django 1.7 introduced built-in support for database migrations. Quite nice for many projects, but the design was sub-optimal for applications like ours (upgrades could take hours or days longer than our approach) and was fundamentally incompatible with our own Django Evolution.

Many other core components of Django (the foundation for the administration UI, template rendering, URL management, forms, and all sorts of other common and obscure parts) have also largely changed over time in some pretty important ways.

We needed a solution for all of this, and we knew that would take time. To compensate we had to:

  1. Rewrite Django Evolution completely to coordinate evolutions and migrations and support modern Django: ~2 years of work, off and on, with hard problems to ponder
  2. Rewrite our administration UI completely to disconnect from Django’s (ensuring we don’t break again when we move to Django 2.x): estimated ~3 months of steady work, still in progress
  3. Build compatibility layers to help keep older code working and help with new code: ~2 months
  4. Just porting in general, updating dependencies, etc: ~3 months

We’re not done, but we’re close.

The Extensions Problem

Review Board is built to allow third-parties to extend its functionality, modify behavior, and link up with other in-house services. We try to be careful not to break extension functionality, and over time we’ve gotten more strict about providing compatibility and an upgrade path for functionality we want to deprecate or change.

The Python 3 and Django 1.11 upgrade is going to affect just about everybody who’s writing an extension. Some of the work we’ve done on adding compatibility layers will help with this process, but it’s going to mean a longer upgrade cycle for some companies.

We knew from the beginning that we’re going to have our hands full helping these companies out (as part of our support contracts), and that it’d be in our best interest to delay the release and break everything all at once instead of spreading out the breakages across multiple releases, and to also give ourselves time to figure out how best to minimize those breakages.

Most projects moving to Python 3 or to newer versions of Django don’t have to worry about this domino effect in the same way.

Timetable?

I hesitate to say when beta 1 will be done (been wrong before), but I can say that the last major piece of development is wrapping up. We’ll be kicking the tires on it, and want to get a beta out as soon as it’s stable enough.

We may not enable Python 3 builds for beta 1, focusing instead on Django 1.11 testing (Python 3 support is still going to be in development during this time), however we’re working with select customers on real-world testing against Python 3.

Community Questions

Every week, we’d like to address some questions, concerns, suggestions, etc. from the community. If you have any questions for next week, please reach out to us.

Q. The community forum seems quiet at times. Why is that?

A. Support requests from companies with support contracts are conducted over a separate support tracker. We always prioritize these support requests over any other work, and increasingly more companies are moving to this method of support for Review Board.

Q. Will Review Board 4.0 ship with Python 3 before the Python 2 End Of Life date?

A. No, it’s going to miss that date. Python 2.7 is still going to be required for now.

Q. There’s years of open review requests on reviews.reviewboard.org. Why is that?

A. We work with university students every semester to help prepare them for their jobs in the tech industry. They spend their semester building features for Review Board, most of which are prototypes. These make up the majority of the review requests on there. Others are contributions that might be outdated, might have been missed, might be incomplete, or might just not have been reviewed yet.

Next Week

We’ll be going over our new CSS component standard for the project and dive deeper into the new administration UI.

Again, if you have any questions, or anything you’re curious about and want us to cover, please reach out on our community forum.

Christian Hammond

President/CEO of Beanbag. Developer of Review Board and RBCommons. Lover of sushi and bees. Not at the same time.

2 thoughts on “ChangeLog: October 24, 2019 – The Python 3 and Django 1.11 Migration

Comments are closed.