Java and/or Ruby

Currently I am working on a small prototype application involving lots of web scraping, a database, and some sophisticated term indexing and search query expansion. Since some parts of the app are quite independent of the others, sharing data only via the database, I decided to code as much as I could in Ruby. There is nothing better for getting into a new language than working on a real project.

For web scraping (which I have always done with Python, my absolute favourite scripting language to date, and the fabulous BeautifulSoup library) I decided to switch to Hpricot (silly name), which scrapes untidy HTML pages well enough for me.

So, what do I use as a cost-effective, well-maintained search and indexing framework? Lucene, of course, because I grew up with Java and the bold faith that there is a library for everything under the sun - written in Java. But lately I found that there is in fact a Lucene-lookalike implementation for Ruby named Ferret. And it is reportedly even faster than Lucene. So off to pastures new.
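To make "term indexing" and "query expansion" a little more concrete, here is a minimal sketch in plain Ruby of what such a framework does at its core: an inverted index mapping terms to documents, plus a synonym table that widens a query. The names TinyIndex and SYNONYMS and the sample documents are made up for illustration; this is not Ferret's or Lucene's API, and real libraries add ranking, stemming and persistent storage on top.

```ruby
require "set"

# A toy inverted index: maps each term to the set of document ids containing it.
# TinyIndex is a hypothetical name, not part of Ferret or Lucene.
class TinyIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = Set.new }
  end

  # Split text into lowercase alphanumeric tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end

  # Record every term of the text under the given document id.
  def add(doc_id, text)
    tokenize(text).each { |term| @postings[term] << doc_id }
  end

  # Return the sorted ids of documents containing any query term,
  # after expanding each term with its synonyms (if any are given).
  def search(query, synonyms = {})
    terms = tokenize(query).flat_map { |term| [term] + synonyms.fetch(term, []) }
    hits = terms.reduce(Set.new) { |acc, term| acc | @postings.fetch(term, Set.new) }
    hits.to_a.sort
  end
end

# Hypothetical synonym table used for query expansion.
SYNONYMS = { "car" => ["automobile"] }.freeze

index = TinyIndex.new
index.add(1, "A red car on the road")
index.add(2, "Vintage automobile for sale")
index.add(3, "Bicycle repair guide")

plain    = index.search("car")            # matches the literal term only
expanded = index.search("car", SYNONYMS)  # also matches "automobile"
```

Without expansion the query finds only the document that literally contains "car"; with the synonym table it also finds the "automobile" document.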

All that is left in Java will be the RDF handling, because Ruby has nothing as sophisticated for RDF/OWL ontology handling as the Jena library.

Tune in next time to hear me talk about the pitfalls of dynamically typed scripting languages …

Pavel Mayer on eXtreme Programming

This one is for our German-speaking audience: if you are interested in agile development methods (or eXtreme Programming in particular) but have never had the opportunity to try them for real, you should listen to this episode of Chaosradio Express, a German radio show/podcast. It features Pavel Mayer, head of development at art+com.

From the show’s description:

Pavel reports on his many years of mostly positive experience with the practical application of Extreme Programming in a company, and explains which steps were necessary to make the transition a success and what long-term effects it had.

It’s a great and very insightful show; now I’d like to try XP myself…

I hate wizards

As I frequently work with Eclipse, I often stumble upon wizards - small dialog windows that promise to ease my daily developer’s grind (pun intended). Let me state one thing - I loathe wizards.

For example, I had a Java class from which I wanted to create a web service. No problem, I thought, the Web Standard Tools in Eclipse have a wizard for that. Next time I will do it by hand, because of several problems:

  • That specific wizard crashed about two out of three times I ran it.
  • When it did run through, the wizard left me with the feeling that I had not learned anything about the process of creating a web service from a Java class. The only thing I learned was to rely on voodoo and guesswork for the values I had to fill into the wizard’s forms.
  • The wizard did not tell me in advance which source files it would overwrite. So on the first run, some of my laboriously hand-coded classes were overwritten. Thank God for version control.
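The last point is cheap to avoid: a generator can check up front which files it is about to replace and report them before writing anything. A minimal sketch in Ruby, assuming a made-up helper called files_at_risk and made-up file names (none of this is taken from Eclipse or any real tool):

```ruby
require "tmpdir"
require "fileutils"

# Return the planned output paths that already exist under target_dir,
# i.e. the files a generator run would overwrite.
# files_at_risk is a hypothetical helper, not part of any library.
def files_at_risk(target_dir, planned_paths)
  planned_paths.select { |path| File.exist?(File.join(target_dir, path)) }
end

# Demo in a throwaway directory: one hand-written file already exists,
# the second planned file does not.
at_risk = Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "src"))
  File.write(File.join(dir, "src", "Service.java"), "// hand-written")
  files_at_risk(dir, ["src/Service.java", "src/ServiceProxy.java"])
end

at_risk.each { |path| puts "would overwrite: #{path}" }
```

A tool that printed this list and asked for confirmation before generating anything would have spared the trip to version control.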

As a conclusion, I tend to use wizards only when I …

  1. … am sure that the wizard solves a problem faster than (and in the same way as) I could write the code by hand.
  2. … have a rough idea of what the wizard is doing under the hood.