Sneaky Snippets (2): A Spec’s Half-Life Period?

From a specification template of a company I worked for some years ago:

Purpose for states: Draft, Adopted, Deprecated

It kinda speaks for itself.

Java and/or Ruby

Currently I am working on a small prototype application involving lots of web scraping, a database and some sophisticated term indexing and search query expansion. Since some parts of the app are quite independent of the others, sharing data only via the database, I decided to code as much as I could in Ruby. There is nothing better for getting into a new language than working on a real project.

For web scraping (which I always did with Python, my absolute favourite scripting language to date, and the fabulous BeautifulSoup library) I decided to switch to Hpricot (silly name), which does the job of scraping untidy HTML pages well enough for me.

So, what do I use as a cost-effective, well-maintained search and indexing framework? Of course Lucene, because I have grown up with Java and the bold faith that there is a library for everything under the sun - written in Java. But lately I found that there is in fact a Lucene-lookalike implementation for Ruby named Ferret. And it is reportedly even faster than Lucene. So off to new pastures.

All that is left in Java will be the RDF handling, because Ruby has nothing as sophisticated for RDF/OWL ontology handling as the Jena library.

Tune in next time to hear me talk about the pitfalls of dynamically typed scripting languages …

Pavel Mayer on eXtreme Programming

This one is for our German-speaking audience: if you are interested in agile development methods (or eXtreme Programming in particular), but never had the opportunity to try it for real, you should listen to this issue of Chaosradio Express, a German radio show/podcast. It features Pavel Mayer, head of development at art+com.

From the show’s description:

Pavel reports on his many years of largely positive experience applying Extreme Programming in practice within a company, and explains which steps were necessary to make this transition a success and what long-term effects it had.

It’s a great and very insightful show; now I’d like to try XP myself…

The Departed

Back in July 2004, I was shocked to hear that Josh Bloch and Neal Gafter had left Sun. A few months ago, in October 2006, another shock came - Gilad Bracha has left Sun as well.

Not many big names are left at the home of Java. Bill Joy already went away in September 2003. How long will James Gosling and Guy Steele stick around? Makes me wonder what the future will bring for Sun and Java. Especially considering Bracha’s ominous farewell: “Good luck to you all - you’ll need it.” Am I just being paranoid, or do these words and these events forecast a gloomy future for Sun, Java, and everyone involved, that is, me and you, my fellow programmers?

(In case you don’t know, Gilad, Josh, and Neal are Java Gods. Josh wrote the Java Collections Framework and Effective Java, one of the best books about Java. Neal Gafter was in charge of Sun’s Java compiler. Together, Josh and Neal wrote Java Puzzlers, another great book. Gilad Bracha was heavily involved in the specification of Java itself, the JVM, Generics… you name it.)

Ruby & Me: No More Static Typing Zealotry

Note: This is part II of an ongoing series on the programming language Ruby.

In August 2005, I wrote in my personal blog (German):

Now, after hacking PHP for virtually twelve hours a day for the last three weeks (with a few exceptions), I know that this language isn’t suited for people with a sensitive mind like mine. It’s a particularly bad idea to begin by dumping out some quick & dirty code and then refactor this into a clean Model-View-Controller application (I think I have to read up on agile software development). PHP’s type-free variables together with my own web framework, which uses HTTP request and session as a shared hashtable for beans (like Model 2/Struts, naive, I know, …), lead to sheer debugging horror. After this I resolved to return to the world of static typing.

So I actually wanted to get back to Java — or any other statically typed language offering a sufficiently powerful and elegant (!) web framework. Because, yes, I was a static typing zealot, and I wanted to get home and snuggle up in the comfortable warmth of the static typing safety net. But this didn’t happen (for web development), for two reasons.

The first reason is that by developing MyVeryOwnWebFramework™ in PHP I learned an important lesson about dynamically typed scripting languages: under particular circumstances the flexibility of these languages can actually support elegance in a way statically typed languages can’t, e.g. for following the convention over configuration principle. And this is one of the areas where Ruby and Rails just excel, as we will see later in this series. What *I* did was develop PHP code in an idiomatic style borrowed from Java. What did I expect?
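
To make the convention over configuration idea a bit more concrete, here is a minimal Ruby sketch. It is not how Rails actually implements it: the Model base class, the table_name method and the naive "append an s" rule are all assumptions made up for this illustration.

```ruby
# Hypothetical base class, invented for this sketch (not Rails' ActiveRecord).
class Model
  # Derive the table name from the class name by convention,
  # e.g. Article -> "articles". Naive rule: lowercase and append "s".
  def self.table_name
    name.split("::").last.downcase + "s"
  end
end

# No configuration needed: the class name alone determines the table.
class Article < Model; end

# A subclass can still override the convention where it does not fit.
class Person < Model
  def self.table_name
    "people"
  end
end

puts Article.table_name
puts Person.table_name
```

The point is only that the class can inspect its own name at runtime and derive a sensible default from it, so configuration is needed just for the exceptions.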

The other reason is that I discovered the concept of modal web frameworks which appealed to me as a very elegant approach and, again, couldn’t be done in Java. So after doing some research and taking sneak peeks into several languages and frameworks, such as Seaside for Smalltalk and some other framework I can’t remember for Haskell, I decided to learn Ruby because it has a rapidly growing base of supporters and there’s already a modal web framework for Ruby called Wee.

So actually Rails wasn’t even my main reason for making the switch to Ruby, yet it was the first thing I played with, perhaps due to the mass of training material available on the web. And I got stuck with it, because Rails immediately taught me what makes Ruby cool. Thus, in the next episode of this series I will point out some of Ruby’s features that make me love this language.

Beans, Snakes, Gems

This will not be another programming language comparison frenzy. I had just some semi-serious thoughts about something I would call “language marketing”:

If you are starting to design a new programming language, start thinking of language identity. I do not know whether this term existed prior to this posting, but language identity is for programming languages what corporate identity is for enterprises. Of course, the scope of the language and all the nifty little features and whether it is compiled or interpreted and for which platforms the language is available are of some importance - but to make your language known, you need a lot more sexiness. This language sexiness is made by (but not necessarily limited to):

  • A logo. Or better, an allegory. As we see in successful languages, this does not have to be an animal (although this helps a lot with O’Reilly). Java has the coffee, Python has the snake (although the name comes from the British comedy group), Ruby has the gem.
  • The name. C++ is a notable exception, taking its fame mostly from its predecessor C, which did not need a fancy name because it was the programming language sent to us from above in the Old and the New Testament. But most popular languages have names which are easy to remember. Ask five developers how they pronounce “C#” and you will get six different answers.
  • A web site. This serves as a hub for everything about your language. The Python page, for example, is so resourceful that keeping local documentation for the language should never be necessary. There is even a Firefox sidebar for easy access to all the information on the pages.
  • A figurehead. At best, someone as strange and bearded as Larry Wall (Perl), at least some guru whose name is not even mentioned in full (like DHH instead of David Heinemeier Hansson, Rails/Ruby), or only by first name (“Bjarne said …”).

MS Project and Word

A question to all the project managers and other folks concerned with project planning out there: How do you embed Gantt charts taken from MS Project into an MS Word document? Sounds easy? Well, there are a few constraints that must be satisfied:

  • It must be possible to rotate the chart — after embedding it or on the way.
  • The embedded chart must be resizable without quality loss, i.e. it must be transformed into some kind of vector format, or into a very high resolution image.
  • The Word file must be readable & editable on machines where MS Project isn’t installed (in order to collaboratively edit a document).
  • The resulting Word document must not exceed a reasonable size, i.e. it’s okay for a document to gain a few MB by adding a one-page chart.
  • The process of converting/embedding must be reasonably simple so that it can be repeated iteratively, i.e. it shouldn’t take more than a fistful of steps.

In my personal experience, it’s an almost impossible job to do. After researching this problem and doing some tedious trial-and-error work for several hours, I found a way:

  • print the chart as a high-quality PDF with embedded fonts using Adobe’s PDF driver
  • open the PDF in Acrobat
  • rotate
  • save as EPS
  • insert the EPS into the Word doc

This works, but only if I perform these steps on my colleague’s machine. I tried this on my own machine, and it doesn’t work there. Exporting from Acrobat as a Word doc works sometimes, but behaves rather nondeterministically and sometimes leads to a mangled mess. Other approaches, such as going via WMF, EMF or directly embedding the MPP as an object, failed right away.
I assume there is an easier way to do this which works on any machine (with the necessary software) — any suggestions?

Humane Interface Design

Martin Fowler writes about his view of humane interface design, i.e. APIs that are designed for convenient use, in contrast to minimalist APIs. He states that

The essence of the humane interface is to find out what people want to do and design the interface so that it’s really easy to do the common case.

For example, Ruby’s arrays have convenience methods such as first, last, flatten, etc. which tend to be omitted in minimalist interfaces, because they can easily be implemented by clients. And Ruby also aliases method names, i.e. it offers multiple names for the same functionality, such as length and size of arrays. As an example of a minimalist interface, Fowler mentions the Java API for collections.
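
Here is a small Ruby sketch of the array methods mentioned above. The sample data and variable names are invented for illustration.

```ruby
# Sample data, invented for illustration.
nested = [[1, 2], [3, [4, 5]]]

flat = nested.flatten      # recursively flattens nested arrays
first_item = flat.first    # convenience for flat[0]
last_item  = flat.last     # convenience for flat[-1]

# length and size are aliases: two names for the same functionality.
same_count = flat.length == flat.size

puts flat.inspect
puts first_item
puts last_item
puts same_count
```

None of these methods is strictly necessary, since a client could write flat[0] or flat[-1] directly, but they make the common case read the way you would say it.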

I wonder where Python falls in this spectrum. The Python folks state explicitly that

There should be one — and preferably only one — obvious way to do it.

Which sounds like a minimalist approach to me, in the meaning of “reduce stuff to the things that are canonically necessary”. But if you look at the APIs, you’ll find aliases there as well. I don’t know Python terribly well, so perhaps some Python guru can enlighten me on this?

Wrapping things up, I don’t think that one approach, humane or minimalist, is necessarily better than the other. Both have their benefits and drawbacks, but like Martin Fowler I prefer humane interfaces.

Via One Man Hacking

Sneaky Snippets (1): Infinity

I found this beaut in a Delphi system I co-wrote while studying:

const
  UNENDLICH = 99999999;

Please note that (in this case) unendlich is the German word for infinite.

This looks worse than it actually was. The system was an LL(1) parser generator, and in order to increase performance we used fixed-length arrays for all kinds of collections, so we had to set an upper bound for array indices. Maybe we shouldn’t have called it infinity, though.

Ruby & Me: The Beginning

Note: This is part I of an ongoing series on the programming language Ruby.

It all started when I was fed up with developing web applications with PHP or Java (i.e. Struts). I had a rather complex web project ahead and just called it quits with PHP and Java (for web development), because, well, PHP is the shortest path for a web developer to the sanitarium and Struts was simply too cumbersome for my taste. And of course there was Ruby on Rails, this super productive new web framework everybody was drooling over. I just had to try it. So I got myself a printout of Rolling with Ruby on Rails to read. And it annoyed me almost from the beginning.

The main reason I didn’t like Rolling with Ruby on Rails is that it praises Rails to the skies while almost completely failing to mention any of the framework’s cool, elegant or productivity-boosting features. The whole article is mainly concerned with scaffolding, a feature which automatically provides a web-based interface for creating, editing, browsing and deleting objects. It’s nice for getting a first look at the thing, but in my opinion it’s almost useless for serious development. In my Rails projects, for example, scaffolded code amounts to about 1% (or less) of the overall code, because the application interfaces just don’t have much in common with the interfaces provided by scaffolding.

Yet there are a lot of cool things to tell about Rails: its domain-specific languages, the powerful ActiveRecord framework, its built-in support for AJAXification, to name a few. Rails owes the ease of using these features mainly to the Ruby programming language. So what makes Rails cool are basically the same things that make Ruby cool, applied masterfully.

Thus, in the course of this series, I will present the distinctive language features of Ruby that are key for frameworks like Ruby on Rails and made me fall in love with Ruby almost instantly.
