Using Hadoop to Create SOLR Indexes

September 26, 2010

One of the most challenging projects I faced at work recently was to create an Apache SOLR index consisting of approximately 15 million records. This index had been created only once in the history of the company, using a MySQL database and SOLR’s Data Import Handler (DIH). It had not been attempted since because the original indexing process was time-consuming (12-14 hours), required human supervision, and on failure had to be restarted from the very beginning.

For small data sets (say, fewer than 100,000 records) SOLR’s DIH and MySQL work fine. However, for these larger sets it’s just too much of a drain on resources. Some members of our team and the architecture team had had success working with large data sets by leveraging the Apache Hadoop project. One of the most attractive aspects of Hadoop is that the processing is distributed, which should reduce the total time to index. Hadoop also has a robust fail-over system, which would remove the need for human supervision. We architected a data pipeline in which data would be processed by modules. When one module completed its task it would alert the system, and the next module would begin work on the output of the previous module. The SOLR indexing is one module.

Android apps on my phone

August 15, 2010

I was the first one at work to get an Android phone. As word got around, I became the guy to go to when it comes to Android. Coworkers ask me questions about Android before they buy, and again after they’ve bought their new phone. One thing that people seem to appreciate is when I provide them with a list of apps to get them started.

Without further ado, here is a list of most of the apps I have on my phone. I’ve left some out because I think they come pre-installed.

How I managed to install CyanogenMod 6.0.0-Droid-RC2 on my Motorola Droid from a MacBook Pro (OS X v10.6.4)

August 4, 2010

First I read the wiki entry (http://wiki.cyanogenmod.com/index.php?title=Full_Update_Guide_-_Motorola_Droid). As I was reading I noticed that there were no directions for OS X, only for Windows and Linux. It turns out that the flash recovery tools are not available for OS X. That meant I would have to do this from a virtual machine.

Because I am more comfortable with Linux, I decided to use Ubuntu running in a virtual machine. I grabbed a copy of VMware Fusion (http://downloads.vmware.com/) and a copy of Ubuntu (http://www.ubuntu.com/desktop/get-ubuntu/download). I had used VMware with Windows before, so I knew this couldn’t be that difficult, but I skimmed http://svenand.blogdrive.com/archive/56.html just in case there were any gotchas.

R cannot be resolved

January 2, 2010

In the Android Beginners Google group I frequently see messages asking what the error “R cannot be resolved” means in Eclipse.

To Android or to iPhone?

November 15, 2009

I’ve wanted an iPhone for at least a year. Six to eight months ago I decided I was going to make the switch from Verizon to AT&T – just for the iPhone. Our Verizon contract expires next month. Now that December 8th is almost here I have begun counting the days. How exciting it will be to walk into the AT&T store and pick up my new iPhone.

I appreciate the quality of Apple products, from the beauty and elegance of the design to the simplicity of the user interface. It is obvious that they put great thought into the product as a whole. I wouldn’t call myself a “fanboy” but I’m pretty close. This is why it has come as a surprise that I have decided to go with the new Droid that Verizon just made available (http://phones.verizonwireless.com/motorola/droid/).

Multiple cores in SOLR

July 3, 2009

I can’t believe how easily I set up multiple cores on SOLR (http://lucene.apache.org/solr/). It’s literally as simple as creating a solr.xml file in $SOLR_HOME that tells SOLR the path to the core directories and the names of the cores.
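As a rough sketch of what that looks like, the snippet below writes a solr.xml in the legacy multicore format. The core names core0 and core1 are made up for illustration, and the fallback to ./solr is only there so the snippet can be tried without touching a real install. Each core directory still needs its own conf/ with a schema.xml and solrconfig.xml.

```shell
# Assumption: SOLR_HOME points at your Solr home; fall back to ./solr for a dry run.
SOLR_HOME="${SOLR_HOME:-./solr}"

# Hypothetical core names. Each core gets its own directory under SOLR_HOME.
mkdir -p "$SOLR_HOME/core0/conf" "$SOLR_HOME/core1/conf"

# solr.xml lists each core's name and the path (instanceDir) to its directory.
cat > "$SOLR_HOME/solr.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
EOF
```

After a restart, each core should answer under its own path, for example /solr/core0/select and /solr/core1/select.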

OutOfMemoryError in Maven

August 19, 2008

I came across a situation today where Maven threw an OutOfMemoryError. I didn’t think the process would take that much memory, but it clearly did. I was trying to deploy. I then tried it again but skipped the tests. No good. I found out that by setting the MAVEN_OPTS environment variable to something like -Xmx512m I was good to go.
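For reference, a minimal way to set it in a Bourne-style shell. The 512m figure is just the value that worked for me, not a recommendation, and the mvn line is left as a comment because the goal to re-run depends on your build.

```shell
# Raise the maximum heap size of the JVM that runs Maven itself.
export MAVEN_OPTS="-Xmx512m"

# Then re-run whatever goal failed in the same shell, for example:
# mvn deploy
```

To make the setting stick across sessions, the export line can go in a shell profile such as ~/.bash_profile.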