Posts tagged ‘python’

XML in Python – What If You Need XPath?

One of the lovely things about Python is that there are so many free libraries to choose from. But sometimes that’s a bad thing, because people love to reinvent the wheel, thinking that they can make it somehow rounder and more efficient. This of course results in a lot of dead code and modules that haven’t been updated since the Stone Age.

Recently at work I found myself looking at a new package to replace one really old dead one: PyXML. We’ve got a good chunk of code that reads and writes data to an XML file basically as a flat-file database for when we (or our customers) are not using PostgreSQL, Oracle, etc.

Normally that wouldn’t be such a big deal, except that we have one requirement: XPath.

There are an awful lot of good XML parsers out there in the world. CPython now even comes with one built in: ElementTree. …And cElementTree (the compiled version of ElementTree), which has the same API, just a whole lot faster – unless you’re using PyPy, which we’re not – but I digress.

The problem with ElementTree however is that it doesn’t fully support XPath. In fact, it barely does at all. It’d be nice if it did, but it doesn’t much, so there you are.

The following is a run-down of my research into a few other Python libraries that do fully support the XPath standard. I wish I could release the benchmark code that my performance evaluations are based on, but it is built on real use cases of proprietary code. That also means the numbers don’t necessarily represent the on-paper, perfect-world performance of these libraries, but rather more useful real-world use cases. These were run under CPython 2.7.3 on a Windows 7 64-bit Intel Xeon workstation.
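Since the real benchmark can’t be published, here is only a rough sketch of how “NX” figures like the ones below can be measured: time the same workload under two parsers and divide. The sample document, the workload functions and their names are all hypothetical stand-ins, not the proprietary benchmark, and both parsers here come from the standard library (minidom as a DOM baseline, ElementTree as the candidate).

```python
import timeit
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for a flat-file XML "database".
SAMPLE = "<db>" + "".join(
    '<row id="%d"><name>item%d</name></row>' % (i, i) for i in range(200)
) + "</db>"


def dom_workload():
    """Parse with minidom and collect every <name> text (DOM baseline)."""
    doc = xml.dom.minidom.parseString(SAMPLE)
    return [n.firstChild.data for n in doc.getElementsByTagName("name")]


def etree_workload():
    """Parse with ElementTree and collect every <name> text."""
    root = ET.fromstring(SAMPLE)
    return [e.text for e in root.findall("./row/name")]


def speedup(baseline, candidate, repeat=3, number=20):
    """Return how many times faster `candidate` ran than `baseline`."""
    base = min(timeit.repeat(baseline, repeat=repeat, number=number))
    cand = min(timeit.repeat(candidate, repeat=repeat, number=number))
    return base / cand


ratio = speedup(dom_workload, etree_workload)
print("ElementTree vs. minidom: %.1fX" % ratio)
```

Taking the minimum of several repeats is a common way to reduce noise from whatever else the machine is doing; the ratio you get will depend heavily on the document and the queries.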

PyXML: The Baseline

You’d think that PyXML, having last been updated in 2004, would be long dead. And it is. But if you don’t mind fixing some minor things in this all-Python XML library, it actually does still work. You just have to find-and-replace the two places where “as” is used as a variable name, since Python now protects keywords with a vengeance (“as” became a fully reserved word in Python 2.6). Simply changing the name to “_as” is enough to fix the problem, and then you can continue to use the long-dead PyXML on Python 2.7.3. Since this is the library that our code used in the past for Python XPath XML support, it’s what I used as a baseline to compare other libraries to. We also used Python’s minidom with PyXML, which is not exactly known for speed…
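The keyword clash is easy to confirm from the interpreter. This small sketch uses only the standard keyword module; the safe_name helper is a hypothetical illustration of the rename, not anything PyXML ships.

```python
import keyword


def safe_name(name):
    """Prefix an identifier with "_" if it collides with a reserved word."""
    return "_" + name if keyword.iskeyword(name) else name


print(keyword.iskeyword("as"))   # is "as" reserved on this interpreter?
print(safe_name("as"))           # the rename used for the PyXML fix
print(safe_name("node"))         # ordinary names are left alone
```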

Fourthought 4Suite: 2X

This is another dead project, having last been updated in 2006 as far as I can tell. It’s written by the same company that wrote most of what is in PyXML. (Not to be confused with the similarly named Forethought, the company that brought you PowerPoint.) Unfortunately their corporate website seems to be down, meaning that they’re likely just as dead as their 4Suite package. Fortunately the great thing about places like SourceForge, besides the whole open-source thing, is that they’re also a great repository for dead packages and code.

The advantage of 4Suite is that because it was written by the same people as PyXML, 4Suite can be used with minimal code changes. It contains the PyXML API with very few differences. It just adds a whole lot more. But the one big difference is that you don’t use Python’s minidom, you use 4Suite’s cDomlette. Cute. And yes, it’s compiled code. And at least on CPython, it runs faster. It’s a little more than twice as fast as PyXML.

libxml2: 2.5X

Finally, a living, breathing project! Based on a Python wrapping of the Gnome XML parser written in C, libxml2 is a breath of fresh air. Err … sort of. There’s a nice object-oriented wrapper written in Python. Which would be good … if it were documented. But darned if I can find any API documentation. And since the libxml2 wrapper changes the API dramatically from the original C code that it came from, it takes a bit of figuring out to use. It’s also slow. Oh, sure, two and a half times the performance of PyXML seems great. It’s even better than 4Suite. Barely. But if you read through the rest, you’ll see it’s not so impressive after all, and there’s actually a very useful alternative right under its nose.

libxml2.libxml2mod: 6X

Yes, that’s right, it’s still technically the same libxml2 module as above. But if you load the libxml2mod.pyd file directly and skip the object-oriented Pythonic wrapper, going straight to the literal Gnome libxml2 APIs, you’ll have a lot more programming work (as the API is a lot more effort to code to) with a much better performance of six times the speed of PyXML. And it fully supports XPath. Who could ask for more?

Well, I could, actually. I don’t know if it’s the distribution that I got, or if it’s just not fully wrapped, or what, but there were some pieces of the Gnome C-code’s API missing from the Python libxml2mod.pyd file. The largest omission to me was that XPath’s compile operation was completely missing. Since compiling can be vital to improving the performance of evaluating the same query across multiple nodes, it makes the 6X speed improvement even more impressive, as I was forced to do things the slow way, without compile. Which can of course be done. But it just makes you wonder, because the Gnome library definitely has this API, so it’s a mystery why the libxml2mod.pyd file didn’t.

PyQt4: -2.5X

If you’re using Qt4 as your Python GUI, you might as well use the Qt4 XML parser … right?

Well, maybe not.

Now don’t get me wrong. I love Qt.

Or at least, I loved the Qt that Trolltech put out.

But ever since Nokia bought Qt, it’s gone downhill. Fast. And this is a perfect example, right here.

The Qt4 XML parser is the darndest, most complicated pile of API I’ve ever run across. Oh, it’s highly flexible. In theory. And it fully supports XPath. … In theory. (I certainly haven’t tested every last feature.) But darned if I didn’t run into all sorts of mess just trying to convert the benchmark code to use Qt4’s XML parser. It was even worse when a bug (I don’t know if it’s in Qt4 or PyQt4) prevented me from evaluating to a QString, like you’re supposed to be able to do. So simple property lookups required the full QXmlResultItems overkill: I resolve the first result item from my results class instance, get the model index from that item, then use the model pointer from the index, together with the index, to resolve it into a string. Instead of just getting the first string, like I’d wanted and like it should have been able to do. And not only is the API a mess (a highly flexible mess, but still a mess all the same), but it’s also two and a half times slower than PyXML. I honestly didn’t even think that it would be possible to write an XML parser for CPython that is slower than a pure-Python implementation, one that uses a DOM no less. Surely the C++ compiled-code PyQt4 would have a much faster XML parser than PyXML, right?

Well, apparently not! As my benchmarks showed.

It was slow. Really slow.

Three-legged horse at the racetrack slow!

So I would highly suggest, to anyone using Qt4, DO NOT USE QT4’s XML PARSER! It’s that bad. To code for, and in performance. Find yourself another library for your XML needs. Trust me, you’ll be much happier that way.

I can only hope now that Digia owns Qt that some of these horrendous trainwrecks that have plagued Qt4 can finally be sorted out over time. Not likely to be seen in Qt5 though, as that’s still Nokia’s aborted afterbirth. Digia probably won’t get things straightened out until Qt6. And goodness knows how many years away that could end up being! :(

It’s hard to believe that with these lovely landmines in Qt that I still love it. But the thing is, as bad as some parts of Qt are, no one has ever come close to doing anything better as an all-around solution to platform independent computer programming. I just wish the original integrity of Trolltech had even remotely carried on to Nokia. I just hope that Digia can give back some of the polish that Qt once had.

ElementTree: 3X

So I know, I already said that ElementTree doesn’t really support XPath properly yet. I really wish that it did. It’d be nice if I could just use the libraries built into Python for everything, and a good XML parser seems like a no-brainer. But for whatever reason, XPath is not really a part of ElementTree. They have added the beginnings of support for XPath-style expression strings in the ElementTree find/findall queries, but a full implementation of XPath it is not. It doesn’t even support the full XPath expression syntax there.
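To give a feel for the part that does work, here is a minimal sketch of the kind of query find/findall will accept. The sample document is made up for illustration; the calls themselves are standard ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """
<db>
  <row id="1"><name>alpha</name></row>
  <row id="2"><name>beta</name></row>
</db>
"""

root = ET.fromstring(XML)

# Child paths and simple attribute predicates are in the supported subset.
names = [e.text for e in root.findall("./row/name")]
second = root.find("./row[@id='2']/name").text
# findtext() returns the text of the first match directly.
first = root.findtext("./row/name")

print(names, second, first)
```

The supported syntax is listed in the ElementTree documentation. Expressions beyond that list, such as XPath function calls, are where one of the other libraries becomes necessary.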

Still, at least for enough of our use case, I was able to code for ElementTree. Converting the code from a full on XPath PyXML implementation to ElementTree and its lame partial implementation of XPath-based queries wasn’t as much work as it could have been. It’s nowhere near as much work as, say, converting the code to PyQt4, or even to libxml2. Which was pleasantly surprising. It’s a nice simple API, so I can see why people love it.

And the performance? It’s about three times faster than PyXML, making it a fair improvement. For a pure-Python implementation it’s actually quite amazing to squeeze that much out. But then, there’s a reason people don’t use DOM anymore. But the real treat comes next.

cElementTree: 18X

And here we have a real winner! Also included in Python, it’s the same API as ElementTree, just a very well written compiled-code implementation wrapped for Python. The same code that ran my ElementTree port also ran cElementTree with only the library name changing. Exactly like it should.
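The library swap can be sketched as a guarded import. The alias ET is just a common convention, and the fallback branch is an addition for interpreters that don’t ship cElementTree (it was removed from later CPython releases, where the plain ElementTree module picks up the compiled accelerator on its own).

```python
try:
    # Compiled implementation, shipped with CPython 2.5 through 3.8.
    import xml.etree.cElementTree as ET
except ImportError:
    # Same API; newer CPython versions accelerate this module automatically.
    import xml.etree.ElementTree as ET

root = ET.fromstring("<db><row>1</row><row>2</row></db>")
values = [row.text for row in root.findall("row")]
print(values)
```

Everything after the import is identical either way, which is the whole appeal.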

And the results were astounding. The real-use-case benchmark of XML parsing was a whopping eighteen times faster than our old PyXML code. Ding-dong, the DOM is dead!

Of course the problem is, all of our existing code is written for DOM using PyXML, so it’ll take a while to convert all of that to cElementTree.

As a side note, if any PyPy enthusiasts want to know why CPython programmers can’t convert to PyPy just yet (maybe not ever), here’s the reason why. A well-wrapped compiled code library runs like a champ in CPython. As a result, a lot of us big data/number crunchers have lots of compiled code in our Python projects. And since PyPy only just barely runs compiled code, and runs it far slower than it runs native Python code, this leaves a lot of us CPython folks out in the cold. If you want the serious data crunchers to switch to PyPy then you have to start taking that compiled code lag more seriously or else we’re never going to be able to join you in your fancy little JIT Python interpreter’s dance.

Conclusion

So if you’re a Python 2.7 programmer looking for the best XPath XML parser ever, well, if you’re staying true to XPath at least, I’d say go with libxml2.

However, if you can swing it (and you’ll need to really evaluate your code to determine this) you might be able to get away with the rather unfinished XPath implementation in cElementTree. In which case you won’t need to install any third-party package for XML parsing and you’ll get blinding performance out the asterisk. (And obviously, if you’re coming at a whole new XML parser, and you don’t need XPath at all, then go with cElementTree since it’s what everyone in Python land is using and it’s got great performance.)

Hopefully the all-Python ElementTree runs just as great on PyPy, giving the world a pretty well rounded solution.

If any ElementTree authors catch this, hey, could you please work on supporting XPath a little more seriously?

And finally, dear god of all things software, whatever you do, avoid Qt4’s XML parser like the plague! Unfortunately I can’t speak to Qt5 yet as there’s still a lot of untested theory there that, professionally, we just don’t want to even approach mucking about with yet. Let things get a few more minor version numbers under the hood and then we can re-evaluate a PyQt5 upgrade path. (Or maybe even PySide.) But even if Qt’s XML parsing gets a major performance improvement, the API is still just as likely to suck wet donkey fur for being so “flexible”. Seriously, what committee designed that API? It’s everything that you could ever need … without being anything that you’d ever want! Yeesh!

Python – Transitioning from Numeric to NumPy Part 2 – What Exactly IS The Point Of Oldnumeric Again?

Okay, so here’s an update for you. In theory most of the Numeric to NumPy conversion using NumPy’s oldnumeric compatibility layer works just as detailed previously. There are however two exceptions / problems that I’ve found since then.

Problem 1 – Savespace

If you used Numeric array.savespace, you’re SoL. NumPy supports that like it supports returning the USA to British rule. Or in other words, not at all. Not even slightly. And if you ask why, you’ll no doubt catch all sorts of flak for even daring to ask, apparently. As if consuming half of the memory isn’t desirable when you don’t need the accuracy. And as if using graphics cards as GPGPUs isn’t catching on in exactly the kinds of fields you’d want a numerical library, where single precision FLOPs still greatly outperform double precision FLOPs to sickening proportions. So no, no reasons whatsoever for NumPy to support savespace whatsoever. (And yes, I am indeed rolling my eyes here.)

If you heavily used the savespace feature your spacesaver arrays are all upconverted to double now, regardless. Which might not matter to you, other than consuming vast amounts of memory.

Or … if you dared to ever typecheck (I know it’s not really a “Python thing” to typecheck, but sometimes needs must, especially if you dared to C++ Boost your Python library), it just might mean banging your head against a wall for who knows how many lines of code.

If you’re 1) insane and 2) a Python flexibility extremist, you can theoretically create a workaround for this problem, assuming you don’t mind performance penalties. By inheriting from the numpy.ndarray class you can create your own class that does support savespace and spacesaver properly. Which is a lot of work, frankly. Because every math operator needs an override because spacesaver is like a virus, infecting any arrays it comes into contact with. And then all you’ve really done is just fixed the class. The extreme part comes next: You also rewrite every single function that creates an array (though really if you’re converting from Numeric to NumPy, you probably only need to do the ones that Numeric had, and only the arguments that Numeric supported) to return your class instead of a straight ndarray. Oh, and the fun Pythonic step: then replace the class and method pointers in oldnumeric with your fixed versions. If you do that at the very beginning of your code (say, in your own module where you import numpy.oldnumeric), then as long as you keep that modified version of oldnumeric in Python’s memory, Python will “cheat” by loading that one instead of reloading the module, so your fix will affect all of your application. Or you could just fix the oldnumeric import in the NumPy side. Or, if you were especially daring, you could fix NumPy itself to add back this feature that just about anyone, except for the NumPy authors, seems to comprehend having value not just in the past, but also forward-looking towards the days when you CUDA had a V8.
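The “cheat” relies on Python caching imported modules in sys.modules, so a patch applied once is seen by every later import. Here is a minimal sketch of just that mechanism. Since oldnumeric may well not be installed, it uses a made-up module named compat_numeric with a made-up zeros function as a stand-in, and the replacement function is equally hypothetical.

```python
import importlib
import sys
import types

# Hypothetical stand-in for numpy.oldnumeric, registered as already imported.
compat = types.ModuleType("compat_numeric")
compat.zeros = lambda n: [0.0] * n
sys.modules["compat_numeric"] = compat


def zeros_saving_space(n):
    """Hypothetical replacement that would return the space-saving class."""
    return [0] * n


# Patch once, early in the application...
import compat_numeric
compat_numeric.zeros = zeros_saving_space

# ...and any later import gets the same cached, patched module object.
later = importlib.import_module("compat_numeric")
print(later.zeros is zeros_saving_space)
```

The catch is ordering: any code that did "from module import zeros" before the patch ran keeps its reference to the old function.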

Problem 2 – Contiguous

Here’s another one that’ll catch you by surprise, but probably only matter if you foolishly wrote compiled code such as in C++ with Boost to speed up your Python. For some reason there seems to be a bug in NumPy ndarray where even though array data should be contiguous … it just sometimes isn’t. So if you actually check for that / require that in your compiled code, you just might be surprised at failures that by all reasoning shouldn’t be failing that particular check.

Of course there is a nice easy way to work with non-contiguous array data using PyArray_ContiguousFromObject. It’s pretty simple, but does require cleaning up your new Python object with Py_DECREF if you want to prevent memory leaks. Which could mean restructuring your whole method, depending on if you used return to exit before the end of the function. On the plus side though, if your array is contiguous (which if you’re running into this bug, it should have been in the first place) then there’s only an almost negligible performance hit as PyArray_ContiguousFromObject won’t actually copy your data into a new array. Of course if you did run into this bug, then you’re right, you’re going to somewhere hit the performance penalty as this approach copies your array data. But hey, at least it’ll still run. Whereas not doing this workaround could leave you in all sorts of trouble if your NumPy ndarray data isn’t contiguous when it should have been.

So in conclusion … WTF?!?!?!

Can you upgrade your Python code from Numeric to NumPy easily? Err … maybe. Hopefully? Kind of. It all depends on just what features of Numeric you used, because NumPy (and with it oldnumeric) clearly does not encapsulate 100% of Numeric. In the end, you might just find that rewriting a million lines of mixed Python and C++ code to use straight NumPy is about as much work as trying to cheat by using NumPy’s rather sorely incompatible oldnumeric. Why oldnumeric wasn’t written to be a lot more compatible with Numeric is beyond me. You’d expect that kind of incompatibility with straight NumPy, but not from a layer whose sole purpose is to offer you backward compatibility. :( What was even the point?

Still, someone who didn’t use Numeric to extremes might find NumPy’s oldnumeric an easy solution. Maybe. I guess. Though I can’t imagine why you’d have been using Numeric in the first place instead of straight-up Python arrays … err … lists and tuples … if that’s the case.

Python and C++ Boost – Tips For A Numeric To NumPy Conversion

Chances are that if this matters to you, it’s something that you’ve already gone through. After all, anyone still using Numeric in Python in this day and age is working with an incredibly outdated environment. Still, sometimes it happens. Sometimes in business settings validating a new environment is not such an easy thing to do as it is in academic or hobby worlds. So just in case, here are my experiences of upgrading from Numeric to NumPy:

Tip 1 ) Replace Numeric with NumPy’s Old Numeric

The first trick is that NumPy contains most of what you need already in the numpy.oldnumeric module. This saves an awful lot of effort as you don’t have to rewrite random portions of some million lines of code. The incredible vast majority of the work involved is one simple Pythonic twist:

    import numpy.oldnumeric as Numeric

And if you’re concerned about remaining backward compatible with your old environment then, you can even add an exception handler to choose which is the right one to use:

    try:
        import Numeric
    except ImportError:
        import numpy.oldnumeric as Numeric

Now, similarly, if you used some of the modules in Numeric, such as LinearAlgebra, it’s still almost as simple. You can do:

    from numpy.oldnumeric import linear_algebra as LinearAlgebra

Or, again, if you need backward compatibility:

    try:
        import Numeric
        import LinearAlgebra
    except ImportError:
        import numpy.oldnumeric as Numeric
        from numpy.oldnumeric import linear_algebra as LinearAlgebra

Tip 2 ) Fix minor type inconsistencies

There are some differences between Numeric and NumPy’s Old Numeric, and those are primarily in how Old Numeric doesn’t handle types in quite the same way. The biggest is character versus string arrays and floating point arrays. Now, say you used a string as data for your array. In Numeric this results in an array of a character type, AKA Numeric.Character or ‘c’ with a size of the number of characters in your string. But in NumPy this results in an array of a string type with a size of 1! That’s not very compatible. The solution? Just specify the type when constructing your array. So instead of Numeric.array(“datadatadata”) use Numeric.array(“datadatadata”, Numeric.Character). Yep, that one is really that simple.

Slightly less simple, though you may not even realize it is happening, is a related problem. Say you had a Numeric array of a float type using “f” to define it. Something like Numeric.array([1., 2., 3.], “f”). In Numeric this “f” type specifier results in a match to Float64. Something you may or may not have expected. This is because Numeric has an interesting string matching algorithm. In Numeric you can have a Float0, a Float8, a Float16, a Float32, a Float64, or even a Float128. Each could be specified in a string if you desired instead of using Numeric’s constants. Which means that if you specify a string of “float”, it leaves Numeric to try and decide which length float you want. And Numeric, thinking smart, matches the default float type used by Python, so it’s a nice match to your data. Which, by the way, is a Float64, otherwise known as a double. And so in the above example, if you specify the type “f” to Numeric, yep, you guessed it, you end up with an array of Float64.

Where things get tricky is that NumPy doesn’t have the same float types that Numeric does. For everyday use it comes down to float32 and float64, AKA “f” and “d” respectively. So the same “f” that gave you a Float64 in Numeric will give you a float32 in NumPy!

Now this might not be much of a problem to you if you’re sticking with Python-only code. Then again, with less accuracy it might. But where it can really kick your asterisk is if you did something silly like wrote a compiled module (Who would do a silly thing like that?) and pass it your array as one of the arguments. If you added type checking into your compiled code, if for no other reason than just good coding practices, this can end up throwing you for a loop when suddenly your arrays are no longer matching a Float64 type! The fix, of course, is simply to use the more descriptive Numeric.Float64 type instead of “f”. Or if you’re lazy and that’s too much typing, at least switch to “d” which both Numeric and numpy.oldnumeric will interpret as Float64.
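To see what is at stake between the two codes without needing either library, the standard array module happens to use the same “f” and “d” letters for a C float and a C double. This is only an illustration of single versus double precision, not of Numeric’s or NumPy’s own behavior.

```python
from array import array

single = array("f", [0.1])   # C float, normally 4 bytes per item
double = array("d", [0.1])   # C double, normally 8 bytes per item

print(single.itemsize, double.itemsize)
# The double round-trips the Python float exactly; the single does not.
print(double[0] == 0.1)
print(single[0] == 0.1)
```

Any compiled code that reads the buffer assuming one item size while the array holds the other will misread every element, which is why a type check on the C++ side is worth keeping.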

Assuredly, if you’re really into variable typing, chances are there are other places where NumPy’s Old Numeric did not match Numeric as closely as perhaps it should have. For example Numeric.Int16 is “s”, whereas numpy.oldnumeric.Int16 is “h”. I’m not sure what effect that has on anything, having not used it. But I noticed it. Goodness knows what else there may be too. Your best bet is first to not use strings to define your types, but to go to the defined constants, and second to test, test, test.

Tip 3 ) Fix Numeric / NumPy inconsistencies

Besides just minor type inconsistencies when creating arrays are the bigger inconsistencies, namely in the method of types. In Numeric type constants are characters. Your Numeric.Int32 is literally the character ‘i’. Whereas in NumPy a dtype class was created for handling types. You can construct a numpy.dtype(‘i’), but it’s nothing as simple as just a character. But NumPy is even worse than that in practice, because there’s also a ‘type’ type used in NumPy, and that’s what the numpy.int32 constant is a type of, for example. It’s not the same as a dtype. And as you can see, it’s already getting messy when it comes to choosing your type constants.

Oh, wait, it gets worse.

Because in Numeric an array has a .typecode() that defines what type the array is of. And because Numeric just uses characters as type definitions, it simply returns a character of the type. It’s easy and straightforward.

NumPy ndarrays have no such method. Oh, they do have a type specifier built in. But it’s not a method named .typecode(). It’s a variable named .dtype. And it is a numpy.dtype class instance. But the NumPy type constants are ‘type’, not dtype. Yes, NumPy just by itself is messy. But now add in that all of your Numeric code is looking to compare “if array.typecode() == ‘i’:” and you just walked into a whole mess of incompatibility.

If you don’t care at all about backward compatibility, then you’re fairly well set. You can just replace array.typecode() with array.dtype.char. Yes, that’s right, the dtype class has a char member variable that (mostly, except for differences outlined in tip 2) you can compare against. If you’re more brave you can try even replacing that character with a constant so “if array.dtype == numpy.int32:” is a little more descriptive and cleaner. If you’re just moving forward.

However, for those who need to continue to support both environments with Numeric and environments where NumPy has replaced Numeric, using numpy.oldnumeric or not, you’ve walked into a world of hurt where you’re best making your own module of helper functions, because this gross incompatibility will make doing a simple type comparison in a backward-compatible way very messy, especially if your old environment doesn’t have any version of NumPy to help bridge the gap.
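A helper module along those lines might start out like this. It is only a sketch: the function names are made up, and the stand-in classes exist purely so the example runs without Numeric or NumPy installed. It assumes, as described above, that Numeric arrays expose a .typecode() method and NumPy arrays expose .dtype.char.

```python
def typecode_of(arr):
    """Return the one-character type code of a Numeric or NumPy array."""
    if hasattr(arr, "typecode"):      # Numeric arrays: a method
        return arr.typecode()
    return arr.dtype.char             # NumPy ndarrays: an attribute


def is_type(arr, code):
    """Compare an array's type code against a one-character code."""
    return typecode_of(arr) == code


# Stand-in objects so the sketch runs without either library installed.
class FakeNumericArray(object):
    def typecode(self):
        return "i"


class FakeDtype(object):
    char = "d"


class FakeNdarray(object):
    dtype = FakeDtype()


print(is_type(FakeNumericArray(), "i"), is_type(FakeNdarray(), "d"))
```

Because the character codes themselves are not identical between the two libraries (see tip 2), the code passed to a helper like is_type is best taken from whichever library’s constants are actually loaded rather than hard-coded.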

And this is especially true because NumPy is compiled code, so you can’t just sneakily add a .typecode() method to the ndarray class like you normally could in Python. I suppose you could try to do so to the base code and recompile all of NumPy for it. But the bigger question remains why in the world it wasn’t put there in the first place just to remain backward compatible? Welcome to a minor headache. But still, all considered, a pretty small problem compared to what you could be going through right now.

Tip 4 ) Want C++ NumPy? How about a Boost?

If you’ve got that nasty compiled code mixed in with your Python, you just may be using C++, and that means you might even be using Boost. (That’s what I’ve been using anyway.) If so, you might have been disappointed to find that Boost has no native NumPy support built in. And that the authors of NumPy seem to have no interest whatsoever in Boost, preferring Fortran to C++. I guess for a numerical package, I can’t really blame them for that, as that is Fortran’s forte. But it can be awfully inconvenient to the other 99.99% of the world who have forgotten that Fortran is even a programming language. (If they ever even knew it.) Well, cry not, for an unofficial Boost layer for NumPy exists. Enter a GitHub project for ndarray.

Mostly, it’s pretty straightforward and you’ll barely have to change your C++ code at all to use it. One important difference however is that in your module export you’ll have to add a boost::numpy::initialize() call immediately at the top so that Boost knows how to template NumPy in order to match your Python to your C++.

And if you’re maintaining backward compatibility, well, it’s not the cleanest thing to do with C++ being not quite as flexible about that as Python. I think the best way to go about it is to turn a function into a template function on the cpp side, double up on an overloaded declaration on the h side, and then in the cpp side again have the overloaded implementations just call the template function with the array type. But wait, the pain isn’t over yet, because then you have to change your export definitions to specify which of the overloaded methods to use, which gets a wee bit messy. Or if you don’t want that mess, simply append a _NUMERIC and _NUMPY to the ends of your declarations to keep them separate instead of overloading. That makes things a lot easier in the export, but doesn’t look quite as smooth.

But then there’s one more monkey wrench in the works if you use Microsoft Visual C++, in that the Boost.NumPy layer needs some tweaking to work in Microsoft Land because M$ has yet to implement templates correctly. So you’ll have to add a cxxflags=/DBOOST_ALL_NO_LIB to your Boost build or else you’ll get multiply defined functions in your libraries because Microsoft still isn’t smart enough to weed out duplicate definitions when using templates, so Boost.NumPy ends up fighting with Boost.Python on MSVC. Doh!

Oh, wait, it actually gets worse because I almost forgot that you won’t even get that far in the first place. Because Microsoft also hasn’t implemented variable length arrays yet. So you’ll have to fix a couple of places in the Boost.NumPy code where they do Py_intptr_t dims[nd]; to become Py_intptr_t* dims = new Py_intptr_t[nd]; And, of course, not immediately return because if you don’t sneak that delete[] dims; line in there you’re going to have memory leaks. Yay!

The problem being that, well, sadly, not many Python users are on Windows, so the Boost.NumPy authors of that GitHub project just haven’t tested it on MSVC, apparently. But it all can be made to work. Honest.

You can even go the extra step and add the Boost.NumPy code straight into the Boost codebase locally before you build it. I mean you’re going to build it anyway, right? Might as well. To get it to work with the Boost build system I had to rename the src directory to build so that bjam could find the Jamfile in there. No biggie.

Of course if you do try to use Boost.NumPy on Windows, don’t even bother trying to use SCons. It’s not that SCons won’t work on Windows, because it will. That’s the point. It’s that Boost.NumPy’s SCons script won’t. Why doesn’t it work on Windows? Well, you can kind of put that on Python, and kind of on the Boost.NumPy authors. They chose to use the one part of distutils that doesn’t have full functionality on Windows: distutils.sysconfig. Now, looking at the latest documentation on Python, you wouldn’t even know that there’s a problem with distutils.sysconfig on Windows. But if you look at the base sysconfig Python module that distutils (strangely) gets distutils.sysconfig from (why the same Python module is basically duplicated is beyond me) you find this almost non-existent warning in the sysconfig documentation about configuration variables: “Notice that on Windows, it’s a much smaller set.” What that almost-impossible-to-find warning means is basically that on Windows pretty much none of the configuration variables exist, so sysconfig.get_config_var and sysconfig.get_config_vars are pretty much useless on Windows. Thereby causing the Boost.NumPy SCons script to fail horribly. So just don’t even try. Use Boost’s bjam instead on Windows.
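If you ever have to write a build script of your own that survives this, the defensive pattern is simply to treat every configuration variable as optional. A minimal sketch follows, using the standard sysconfig module; the helper name and the "cl" fallback are assumptions for illustration and are not taken from the Boost.NumPy script.

```python
import sysconfig


def config_var_or(name, default):
    """Return a build configuration variable, or `default` if it is unset."""
    value = sysconfig.get_config_var(name)
    return default if value is None else value


# "CC" is commonly set on Unix-like builds and commonly missing on Windows.
compiler = config_var_or("CC", "cl")
print(compiler)
```

get_config_var returns None for a name it doesn’t know, so a script that passes the result straight into string handling is the kind that falls over on Windows.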

Conclusion

So, it’s “just” that easy. Uh-huh. In that there’s pretty much nada in the way of documentation, you can imagine it took me a while to sort some of this out. Hence why I’m putting it on Ye Olde Interwebs now, so that hopefully if you’re stuck doing the same thing you won’t waste nearly so much time coming up with answers. Numeric may be dead, but with NumPy’s Old Numeric your Python code doesn’t need to go through a massive rewrite. A bit more work though if you had C++ code too, but it can be done, and almost as cleanly.

Qt Bugs Rant – Torpid Trolls Or Nokia Noworkniks?

I have to say, I am highly disappointed with the bug resolution team working on Qt. I’ve reported a number of bugs that I’ve found now, and each and every time I get back basically the same response (whenever I finally do get any response at all), which is to hold the bugfix hostage until I provide them with a sample program to demonstrate the bug.

Never mind that each and every time I have provided clear instructions on how to reproduce the bug. Instructions which should make recreating it simplicity itself.

But that might actually entail them doing work. So instead they claim that somehow they can’t reproduce the bug. (Perhaps because they never even bothered to try?)

Because sure, I have all of the time in the world to be doing their jobs for them. No, I clearly have not already gone way above and beyond the call of a typical user by first reporting the bug and second making sure that bug report includes all of my system information and details on how to reliably reproduce the bug on my system. No, a silver platter is clearly just not good enough. If that platter isn’t gold or platinum with diamond studs, then clearly I haven’t given them enough reason to bother fixing the bug.

Disingenuous much?

I’ve seen more dedicated efforts on open source projects with people volunteering their work for free!

I really never had these kinds of problems back when Trolltech owned Qt. Have the Trolls just gotten lazy ever since getting fat off the hog of selling out? Or did Nokia fire the Good Trolls and keep only the lazy ones when they bought out Trolltech?

Either way, I am not impressed.

What is the flirking point of me going through all of the trouble to report bugs if no one is going to even try to fix them? I might as well just stop wasting my time reporting the bugs in the first place!

It’s really sad when I have to honestly contemplate the idea of forking my own version of Qt and fixing things myself because their own quality control team can’t be bothered to do any real work. :( (Which I’ve already kind of started doing anyway because I’ve needed to create my own patches since they’re not on the ball.) That I’m even driven to consider forking Qt just to keep it usable for my professional needs means Really Bad Things.

When a customer takes the time to report a bug, that itself should be lauded. And when that bug report further contains so many important details on how exactly to recreate the bug, that’s freaking gold!

Not a reason to hold a bugfix hostage!

And the one time that I hand them source code containing a fix they can’t even be bothered to respond at all. Platinum platter with gold inlay … nada.

Why is anyone paying Nokia for this?  I sure don’t!

Nuts to the NokiTrolls* then! See if I ever bother reporting another dang bug of theirs. Bloody darn useless lazy sorry sacks.

Work on Qt really has gone downhill ever since the Trolls sold out. It’s really depressing. Back when Qt3 was it and Qt4 was under discussion, there were so many great talks on how the architecture would be cleaned up, every widget component would actually inherit from QWidget, things would be threadsafe, architecture would be consistent, etc.

Then they sold out.

And we got Qt4.

Which was missing some pretty major parts no less, like any kind of a replacement for Qt3’s QCanvas architecture. (That animal, the QGraphics architecture, wasn’t even introduced until version 4.2, and didn’t really work properly until later.) And there are still inconsistent parts. Still not all visual components are based on QWidget. It’s not threadsafe. (But it does have a wonderful mutex lock … if you feel like manually creating your own versions of everything to use it! Why didn’t they just make Qt automatically threadsafe, with a configuration option to turn that off for people who can’t afford to lose the processing time on it?) And basically, while Qt4 is worlds better than Qt3, it’s still not great. Qt4 has fallen very short of the hype, is struggling to stay modern, and if it weren’t for all of the great concepts introduced and work done before the sellout, would likely never have been able to even stand up on its own.

And now we have Qt5 looming, looking to have little or nothing to do with desktop Qt at all. It’s all just mobile-centric ideas. Which is no surprise since Qt was bought out by Nokia, but again is pretty darn useless for anyone using Qt as a Linux or multiplatform desktop GUI. (Umm … KDE much?)

But frankly Qt is just going to hell in a handbasket in my opinion. The new NokiTrolls are not fixing bugs. They’re producing a lot of new pieces to Qt that are more like proof-of-concept test code than polished release code. And Hades, they’re trying to switch it from C++ to JavaScript! It’s a mess, and it’s not getting any better.

Mono isn’t looking as bad as it used to. Heck, neither is wxWidgets…

(I’m still going to avoid Java like the plague as often as I can though. And JavaScript? Yeesh! No thanks! I’m a professional. I’ll take C++ (or a good derivative) any day. And if I want something scripty, or fluid and artistic, I’ll use Python, thanks! A language that Qt still doesn’t officially support, even though clearly someone had enough time to wrap Qt for Python all on their own without any official support!)

You know you’re disgruntled when you’re actually hoping that Nokia drops Qt entirely and forces Qt to become branched into only being open source so that a homogeneous community can form around Qt and get it back on track instead of this heterogeneous hodge-podge ruining what was once something great. Heck, if Qt were only open source, I might actually fix all of the bugs I find myself instead of just reporting them. But I’m surely not going to be taking the time to do that while it’s someone else’s paid job to, for a product being sold for a profit.

*= I chose NokiTrolls because Trokia is too much like Troika (both a game software company and generally meaning “a collection of 3” in Russian, which doesn’t work so well with 2, but might be applicable if Nokia sells Qt to someone else). And Trollkia sounds too much like a derogatory term for tuners ripping on Kia’s cars.

Nokia Drops MeeGo – Long Live Qt!

One of the big stories that was breaking while I was missing internet connectivity was the news that Nokia is ditching Symbian and MeeGo in favor of Microsoft (of all things) and Windows Phone 7. It almost makes you wonder why Nokia even bought Trolltech in the first place if they’re not going to leverage such an amazing UI platform on their cell phones. But honestly, I don’t care about the whys of that purchase. People do weird things, and as long as those things don’t impact me, who cares? Let them do whatever. It takes all kinds.

No, what I do care about is the future of Trolltech and Qt.

Now, there’s been a lot (a lot and a lot and then even more of a lot) of screaming that Qt is dead.

Rubbish!

I can’t even remotely begin to believe such nonsense. I mean, to start with, Qt has such open-source roots on PCs and has been around for many many years. I can’t even imagine the impact to the Linux community should Qt die (I mean hello, KDE?) but I also just can’t see it happening anyway. You just can’t kill something that large and entrenched. And that’s even if you discount the whole concept of open source, which is another reason why the death of Qt just ain’t gonna happen.

But then there’s the fact that Nokia’s bailiwick is cell phones. They may ditch MeeGo (a cellphone-specific, Qt-based OS platform), but that’s an incredibly tiny niche compared to the whole of Qt itself, which revolves around those big clunky boxes we have everywhere in our lives. It’s hardly a death knell for all of Qt to lose Nokia’s cell phone platform when they still have all of those PCs.

And then when you consider the other partners with Nokia on the MeeGo platform, like Intel, who are still interested in MeeGo, it hardly even seems likely that MeeGo itself will even die, let alone all of Qt. Long live Qt for tiny devices!

You also can’t discount the trolls at Trolltech for whom Qt is not just a product but a way of life. Just because Nokia bought them doesn’t mean they’re going to abandon a lifetime of work and four generations of developing the Qt platform just on the whim of their new Nokia overlords. That’d be absolutely absurd. I dare say the only people who could even believe (let alone suggest) such a thing would be Johnny-Come-Latelys who don’t even know who Trolltech is, let alone that Qt existed much longer without Nokia than with them. (That or maybe something like Microsoft toadies trying to spread FUD to kill off Qt, if they even exist. Which when you have a monopoly like Microsoft in the PC world, why would you even need them?)

So I think it’s a sure bet that Qt is still here to stay. I have to say I’m a bit disappointed (if not also greatly confused) by Nokia’s decision. (All I can figure is that it maybe had to involve lots and lots of money or something, but I’m not going to even dignify that with a coherent allegation.) But I still loves me them trolls and have no fears whatsoever about the future of what is in my professional opinion (as a developer who has used Visual Basic, Visual Fortran, VC++/MFC, C#/.NET, C++/Mono, Tcl/Tk, Java, C++/wxWidgets, Python/wxPython, and – of course – C++/Qt and Python/PyQt) the best GUI platform/library ever developed, let alone the best multiplatform solution, which most can’t claim.

(And no, by multiplatform I do not mean Win 9x, Win NT, Win 2K, Win XP, and Win 7. And yes, I intentionally left out Win ME and Win ME2 AKA Windows Vista because they both just really really need to vanish from this world before they can do any more harm.)

Not that other platforms aren’t just peachy in their own right. I mean I grew up on VB and VC++ for GUI development. I’d still happily use them, given a good reason. I’m not 100% sold on .NET (or moreover C#) just yet, but … meh, pay me enough and I’ll convert. Or make it fun to learn and I’ll teach myself. But nothing I’ve run across yet beats the sheer elegance of Qt. (Not to mention that whole multiplatform thing.) Especially on Python!

As for Symbian, well, we all knew that was dying, didn’t we? Dead horse. Beating. Stick. Poking. Yep. Still dead. Whack! ;)