Wednesday, November 14, 2007

Dalvik: Google's workaround for Sun's JVM

Mobile programming suddenly got a lot more interesting

Dalvik: how Google routed around Sun's IP-based licensing restrictions on Java ME

Dalvik is a virtual machine, just like Java's or .NET's, but it's Google's own, and they're making it open source without having to ask anyone's permission. (Well, for now. Expect a shit-load of IP-related lawsuits over this in the future, especially since Sun and Microsoft signed a cross-IP licensing agreement on exactly such virtual machine technologies years ago. And don't forget IBM, which has been writing emulation code for mainframes since the beginning of time.)

The Android SDK does not compile your Java source code into Dalvik bytecode directly. It first uses a regular Java compiler (say, javac or the built-in Eclipse compiler) to generate regular Java bytecode, and then converts that bytecode into Dalvik bytecode (the "dx" tool does this: it converts .class/.jar files into .dex files).


There is no need to ship a Java virtual machine on your Android-powered phone, and you can use your regular Java Standard Edition to develop your phone application (meaning you don't need to use Java ME anywhere at all).

Friday, November 09, 2007

Working / reading retreats

John Carmack says:
Once or twice a year I go on "working retreats", where I lock myself in a hotel room for two weeks with no internet connection for completely focused work.
Many-many years ago when I read about Bill Gates' twice-yearly "think weeks" I immediately realized how much I needed such a thing. Either reading or working would be fine. Just go completely offline, no phone, no emails, no feeds, no internet, no TV.

I'll try to do one full offline weekend for practice sometime. :)

Tuesday, September 04, 2007


For almost 5 years we've been using CruiseControl for continuous builds. The last time I actually fiddled with the setup, it was very buggy for such a widely used open source project, but I learned to live with it because there were no convincing alternatives.

Currently my company is evaluating the commercial Anthill for managing build reports for projects based on both Java and Microsoft technologies. Anthill seems nice, but it's commercial and closed source software and I would prefer something open source.

So I spent some time looking at Hudson. It seems really nice, with tons of features, GUI configuration and support for several issue tracking and build systems.
I especially liked the weather report like overview as seen on

Unfortunately I stumbled into a problem while trying to connect to our SVN server so I can't recommend it for our project just yet.

Thursday, August 16, 2007

Software Engineering Radio and public speaking

I recently discovered Software Engineering Radio and added it to the list of podcasts I regularly listen to during my commute.

There are 2 interviews in the archives with none other than Prof. Doug Schmidt, who works at the same institute as I do (he is the creator of the ACE/TAO framework and the author of the Pattern-Oriented Software Architecture books, among others). Unlike most of us nerds, he is a truly brilliant public speaker. I admit to sitting in on some of his talks where the topic itself was not all that interesting to me, just to listen to him talk.

Well, today I realized one of his secrets. In addition to being smart, of course, and having a good sense of humor. His other secret is: he talks fast. A high words-per-minute count does make you sound more interesting and even smarter.

See also GeekBrief TV's episode 166, where Cali Lewis talks about why learning to talk fast is important. [Update: the archive doesn't go back far enough to see that episode. Oh, well.]

Tuesday, August 14, 2007

Passed the SCJP exam

Many years ago I wanted to take the SCJP exam, but I changed jobs before I could and later somehow forgot about it. Last month I decided to take care of this unfinished business.

It took me a bit less than 2 weeks to prepare, spending about 1 hour a day on average reading the book and solving example questions.

The exam is not hard if you have a lot of Java experience (I started developing in Java back in 1997 with JDK 1.0 and Applets...), but it does have a few trick questions. The format is typically "what does this program print?", but since "compilation fails" and "an Exception is thrown at Runtime" are almost always among the possible choices, you have to look at the code carefully.

Some of the things I learned during the preparation are pretty much useless, like watching out for silly mistakes that modern IDEs such as Eclipse catch for you immediately as you type the code.

Other questions test your knowledge of corner cases in the language that I've never encountered in my many years of coding, such as whether catch (Exception e) will catch AssertionError (it won't: AssertionError is an Error, not an Exception) or whether this abomination compiles:

long[][] a[] = new long[3][][];

(It does.)

So now I'm certified (I got 90%) and know much more about the java.util.Scanner class than I'll ever need to. :)

Tuesday, July 17, 2007

Code reviews, automated testing, static analysis

There was an interview on The Java Posse podcast where Brian Goetz (whose book is great BTW) said something about software QA which made me nod heavily in agreement. (It probably looked strange, standing alone at the bus stop.) I liked it so much that I'll transcribe it here:

"If you ask most developers on the street 'why do we test code' you'll get an answer something like: 'to find bugs!' Finding bugs is good, but I think finding bugs is a happy side effect of writing tests. [...]

Writing tests is one of those necessary things but it exhibits diminishing returns. The first 100 hours you spend writing tests are probably going to be more effective in terms of improving your confidence in the code than the 100 hours from 1000-1100. [...]

A better way to think about the kind of testing that we do is not that we're looking for bugs it's that we're looking to buy confidence. [...]

Writing tests is a form of buying confidence that exhibits diminishing returns.
Code review does the same thing. [...]

And if you try to apply portfolio theory to optimizing the QA budget [and] you observe that both of these buy you confidence with diminishing returns and they're uncorrelated then the optimal portfolio is to have some mix of testing and code review rather than putting all your eggs in one basket or the other.

Static analysis is yet a third thing we can do [...] that tends to find different kinds of bugs than either testing or code review. So it's something that should be part of everybody's development lifecycle, everybody's arsenal, everybody's toolkit."

I'm a huge fan of FindBugs and other static analysis tools. I wish there were more ways (i.e. a richer set of annotations) to make things in Python (and even in Java) more explicit, so that these tools could catch stupid mistakes immediately and almost for free, rather than letting them slip through and manifest as hard-to-find bugs further down the road.

Sunday, July 08, 2007

A nice podcast on Python VMs

I don't have much time and patience for podcasts and I'm really happy when I find one that doesn't make me feel like I'm wasting my time. Here's probably the most informative podcast on Python that I've listened to in a long time:
It's a nice high level overview of CPython, Jython, IronPython, Psyco and PyPy. How they came about, what's the idea behind each of them and how they relate to each other.

The podcast is from April and in my opinion paints a somewhat rosy picture of these technologies but it's still highly recommended.

Tuesday, June 26, 2007

Yet another Python web development framework comparison

If you're still sitting on the fence about the right web framework, here's another comparison post describing Django, Pylons and TurboGears.

Python web development and frameworks in 2007

I don't understand why the poster thinks that the TurboGears community is in 'decline'. All I see is that there are more subscribers on the mailing list than ever (3 times as many as for Pylons).

Monday, June 25, 2007

Google Developer Podcast

The only podcast that I regularly listen to nowadays is The Java Posse. Recently Google launched a developer-oriented podcast called the Google Developer Podcast. Since Dick Wall and Carl Quinn from the Posse are participating in the GDP, it's almost like listening to a special edition of the Java Posse. For me this is a very good thing.

When I started going to the gym more I subscribed to tons of different podcasts: Python411, Drunk and Retired (Java), Digg Nation, Floss Weekly, etc. In the long run, however, I found that most of them are too unprofessional for my taste, or don't have the right topics or, especially, the right signal-to-noise ratio. So I ended up listening to them only occasionally and have stuck with the Posse for the last few months.

Are there any good developer podcasts you recommend?

Thursday, June 21, 2007

Django, TurboGears, Pylons comparison

It doesn't even try to be "scientific" but I still found this comparison of the currently popular Python web frameworks very informative:

Unscientific and biased comparison of Django, Pylons, and TurboGears

Tuesday, May 01, 2007

Guido's Python 3000 Talk

I somehow missed this talk Guido gave at Google about Python 3000. I found it through StumbleUpon now.

Saturday, February 24, 2007

Python web frameworks quote

Reading through the PyCon 2007 notes:

There are more Python frameworks than reserved Python keywords.

Funny because it's true. :)

Saturday, February 03, 2007

Open Office and regular expressions

For the first time in my life, I tried to use Open Office for more than 15 minutes. While converting data from an HTML page into a chart I noticed that regular expressions don't seem to work in the Replace part of the Find/Replace dialog.

This must be a joke. My problem is trivial: convert "8.1 k" into "8100" and "9k" into "9000". I can't do it the straightforward way with 2 regular expressions. On an ideological level I'm a big supporter of Open Office, but today all I see is that it's wasting my time. (The corresponding online help for the dialog is not very useful either.)
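
For comparison, here is a minimal sketch of the same conversion done in Python instead of the Find/Replace dialog. The function name expand_k is made up for illustration, and it assumes every value is a plain decimal number followed by an optional space and a lowercase "k".

```python
import re


def expand_k(text):
    """Replace values like '8.1 k' or '9k' with plain integers."""
    def to_int(match):
        # Multiply the numeric part by 1000 and drop the fractional part.
        return str(round(float(match.group(1)) * 1000))

    # A number, optional whitespace, then a standalone 'k'.
    return re.sub(r"(\d+(?:\.\d+)?)\s*k\b", to_int, text)


print(expand_k("8.1 k"))
print(expand_k("9k"))
```

A single pattern with a replacement function covers both cases here, so the second regular expression isn't even needed.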

Monday, January 08, 2007

Periodic Table of Visualization Methods

An extremely cool collection of visualization methods. You can mouse over to get an example of each.

Friday, January 05, 2007

PEP 8 checker

PEP 8 checker for the anal retentive in all of us. :)

/Users/gergely/twp/twp/ W291 trailing whitespace
JCR: Trailing whitespace is superfluous.

/Users/gergely/twp/twp/ E302 expected 2 blank lines, found 1
def refresh_images(limit=dbbot._default_limit):
Separate top-level function and class definitions with two blank lines.

Method definitions inside a class are separated by a single blank line.

Extra blank lines may be used (sparingly) to separate groups of related
functions. Blank lines may be omitted between a bunch of related
one-liners (e.g. a set of dummy implementations).

Use blank lines in functions, sparingly, to indicate logical sections.

I'm the kind of masochistic guy who actually enjoys this kind of thing, but I need to figure out a better way to integrate it with PyDev. I wonder what's the easiest way to make Eclipse jump to the line where the error happened. Hmm...
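
To give an idea of how little is behind a warning like W291, here is a rough sketch of a trailing whitespace check that reports line numbers. This is not the checker's actual code; the function name is hypothetical, and it simply flags any line that ends in a space or a tab.

```python
def trailing_whitespace_lines(source):
    """Return the 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


sample = "def f():  \n    return 1\n\t\n"
print(trailing_whitespace_lines(sample))
```

Line numbers like these are exactly what an editor integration would need in order to jump to the offending line.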

TurboGears 1.0 and beyond

Kevin Dangoor, the TurboGears project lead, announced 1.0 this week on IRC.

I was there and I have this pretty screenshot to prove it! :)

Maybe even more importantly, TurboGears has a new leader: Alberto Valverde.

I was too busy to stay there for the followup discussions, but the gist of it seemed to be that a heavily WSGI based approach (sounded much like Pylons) will solve all problems including world hunger and the conflict in the Middle East.

Another equally important thing was the direction that is planned for TurboGears 2.0: decentralization and modularization. From what I understand people want to fork off chunks of TurboGears into fairly independent and externally reusable projects and keep TurboGears a small chunk of glue code that connects them together.

On the one hand, this is not new: TurboGears started out by integrating a bunch of preexisting tools, and ToscaWidgets was forked off recently from the TurboGears widget code. I agree that this approach can work to a certain extent. My guess is that in the case of TG the current change of direction (actually a return to its minimalistic roots) was more organizational than architectural. (Not that you can separate the two: see Conway's Law.)

But there are pros and cons to decoupling. Unix command line tools are a good example. They were great, because there were standard interfaces between them which let them develop and be tested independently. But there is also a huge lack of conceptual integrity compared to monolithic frameworks. The naming conventions are inconsistent, different switches are used for the same functionality in different programs, etc.

The big advantage of monolithic frameworks is consistency in design. Modules use the same naming and coding style, have similar layouts. They reuse the low level utility code, the documentation tool, the testing framework, the bug reporting, the build and packaging system. There is one well known place to ask questions, to look for documentation, to download the latest stable release.

Linux distributions are a good example of both the strengths and weaknesses of heavily modularized systems. Probably the biggest advantage is that there is a huge amount of code reuse, and you can decentralize work to thousands of volunteers, maintaining the individual packages which can evolve independently.

On the other hand, some combinations of packages are not tested properly, and only certain combinations are well supported. If you report a bug that has been fixed in the upstream version but not in your distro, you're on your own. Linux and Firefox is a good example.

People who want to support your software have a harder time when instead of a standard way, you have an infinite combination of modules. Just think of LSB and desktop Linux vs Mac OS or Windows.

We'll see how loose coupling works out for TurboGears. Interesting times ahead.

Tuesday, January 02, 2007

7zip is amazing

7zip just blew my pants off. Back in the day I thought I was edgy when I used bzip2 instead of gzip, but this is just amazing.

I downloaded the full edit history of the Hungarian Wikipedia to run some analysis on it and 7z compressed it to 1/87th of its original size.

barcika:~/wp/huwiki$ du -k *
11502112 huwiki-20061205-pages-meta-history.xml
131808 huwiki-20061205-pages-meta-history.xml.7z

Of course this was superverbose XML, but the compression rate is still very impressive. The same original compressed with bz2 is almost 4 times as big. 7zip is going to be my first choice for archiving large log files.
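
The 1/87th figure is easy to double-check from the du output above. A quick sketch, with the two sizes copied from that listing (du -k reports kilobytes) and variable names of my own choosing:

```python
# Sizes in kilobytes, as reported by `du -k` in the listing above.
original_kb = 11502112
compressed_kb = 131808

ratio = original_kb / compressed_kb
print(f"compression ratio: about {ratio:.1f} to 1")
print(f"compressed size: {100 * compressed_kb / original_kb:.2f}% of the original")
```

That works out to a ratio of roughly 87 to 1, so the 7z file is a bit over 1% of the original XML.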