Mac Tips, Tricks and Utilities

I’m always looking for a way to get more out of my setup. In the past year, since moving from Windows to OSX, I’ve accumulated some knowledge and utilities I’d like to share.

Multiple Monitors

These past few years, I’ve grown accustomed to working on a two-monitor setup. I wanted to move forward to three monitors (two external, one internal), but the only way to do this with the MacBook Pro I have (15-inch, Early 2011) was to buy two thunderbolt-enabled Apple monitors. Since I don’t feel like shelling out $1000 for each of those, I started looking for other solutions. The absolute best one I found was using an external USB graphics card. In my case, I use the Plugable UGA-2K-A ($62 at the moment of writing this post) and daisy-chain my two Dell 24″ monitors. It’s based on the DisplayLink DL-195 chipset and has great drivers for OSX, which lately also added the ability to rotate the external monitor 90°. The refresh rate is great even for a resolution of 1920×1200 and I can only imagine what the USB 3.0 based models can do. I’ve had one card die on me and got great customer service from Plugable, who sent me a replacement, no questions asked.

BetterTouchTool (free)

I work with an Apple Magic Mouse, which is based on the great idea of integrating a trackpad on a mouse. Of course, it’s lacking in many respects, but my bigger issue with it is that it doesn’t have a way to middle-click. BetterTouchTool is a utility that lets you define custom actions for your mouse, keyboard and trackpad. I installed it, set three-finger-click to be recognized as middle-click and haven’t looked back since. Later on I started adding more shortcuts, like having the top three points of my trackpad being, from left to right, close, minimize and maximize, in the same order as these actions appear on windows. I also like the author’s SecondBar utility, but personally I don’t find it as useful.

Little Snitch (not free)

Powerful firewall, though it gets kind of annoying if you don’t define broad rules. Already helped me catch one too many applications trying to call home. I don’t think there’s much left to say about it that hasn’t already been said.

Skitch (free)

The ⌘⇧4 shortcut and its friends in OSX let you capture parts of your screen, but this functionality is very basic. Enter Skitch, which sits in your tray and when you press ⌘⇧5, gives you some basic editing functionality and the ability to drag the screenshot as a file (for instance if you want to upload it). I use it heavily when moderating submissions to plaintextoffenders.com. Unfortunately it has the downside of being very pesky about its integration with Evernote (which bought the company last year).

.osx (free)

Always ask for more from your OS. The link is to a Hacker News discussion about tweaks you can perform in OSX. Here’s my list:

defaults write NSGlobalDomain AppleKeyboardUIMode -int 3
defaults write NSGlobalDomain AppleFontSmoothing -int 2
defaults write com.apple.dock showhidden -bool true
defaults write NSGlobalDomain AppleEnableMenuBarTransparency -bool false
defaults write NSGlobalDomain AppleShowAllExtensions -bool true
defaults write com.apple.finder ShowPathbar -bool true
defaults write com.apple.finder ShowStatusBar -bool true
defaults write NSGlobalDomain NSNavPanelExpandedStateForSaveMode -bool true
defaults write NSGlobalDomain PMPrintingExpandedStateForPrint -bool true
defaults write NSGlobalDomain ApplePressAndHoldEnabled -bool false
defaults write NSGlobalDomain KeyRepeat -int 0.02
defaults write NSGlobalDomain InitialKeyRepeat -int 12
defaults write com.apple.NetworkBrowser BrowseAllInterfaces -bool true
defaults write com.apple.finder _FXShowPosixPathInTitle -bool true
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
defaults write com.apple.finder FXEnableExtensionChangeWarning -bool false
/usr/libexec/PlistBuddy -c "Set :DesktopViewSettings:IconViewSettings:arrangeBy grid" ~/Library/Preferences/com.apple.finder.plist
defaults write com.apple.finder WarnOnEmptyTrash -bool false
defaults write com.apple.Safari DebugSnapshotsUpdatePolicy -int 2
defaults write com.apple.iTunes disablePingSidebar -bool true
defaults write com.apple.iTunes disablePing -bool true
chflags nohidden ~/Library
defaults write com.apple.LaunchServices LSQuarantine -bool false
defaults write com.apple.finder AppleShowAllFiles TRUE
defaults write com.apple.archiveutility dearchive-reveal-after -bool TRUE

for app in Safari Finder Dock Mail SystemUIServer; do killall "$app" >/dev/null 2>&1; done

f.lux (free)

Do you find yourself waking up at night sometimes, thinking about a cool idea, opening your laptop’s lid drowsily only to have your pupils catch fire by the glow of the monitor? f.lux automatically sets the temperature of your monitor to match the amount of light outside. Now your eyes can adjust more easily and your brain won’t send out an OMG WAKE UP IT’S MORNING alarm.

TotalFinder (not free)

This is the most recent addition to the list. Though it still lacks tested support for OSX Mountain Lion, it still works and works great. It adds tab support, side-by-side panes, etc. to the Finder. Also valuable is the global key combination to pop-up a finder window. Unfortunately it’s not free unless X.

Homebrew (free)

THE package manager for OSX. In stark opposition to MacPorts, it doesn’t require you to sudo it all the time. Unfortunately, the quality of the packages varies widely, so caveat emptor.

If you have any utilities or tips and tricks you’d like to share, please do. I’d love to get more out of my setup.

Parallelized Streams in Scala – Not What You Thought

I had a Stream of data coming in from an input source and had to do some heavily-CPU bound work. The code looked like this:

stream.map(item => cpuIntensiveWork(item))

Behind the scenes, the input stream is enumerated and items are acted upon one-by-one to create a lazily loaded list.
Wanting to parallelize this, I added par:

stream.par.map(item => cpuIntensiveWork(item))

What I imagined this would do was that it would load the lines into a buffer and have workers read from that buffer an execute the map while it was still writing. Turns out that’s not what happens.
What really happens is that the whole stream is read into memory and then that gets parallelized. Hardly what I was aiming for.

Unit Testing MySQL-Bound Code

I’m currently working on a product that is very tightly bound to MySQL. It’s so tightly coupled, in fact, that unit testing it while mocking the database access is meaningless. Unfortunately, accessing a MySQL database from the build server (or any machine) is not something that we can take for granted, not to mention concurrency (one person running the unit tests in parallel to another, messing each other’s sandbox) and the network I/O to an external server.

The solution I came up with would be to use an in-process, in-memory database to mock MySQL, so I looked to H2 (good) and HyperSQL (better), but neither had the same power as MySQL had.

Browsing through the Interwebs, I stumbled into mysql-je, a hack from back in 2006 which let you run MySQL as an in-process database, but it was so unstable and unmaintained that I kept looking and came across MySQL Connector/MJX. MJX is an in-process database that, unfortunately, needs access to the filesystem to store its files (you can use the temporary files’ directory), but comes in the form of handy (not so little) jars and works well when invoked from the JVM.

We need two things from this experiment:

  1. To be able to isolate and access our own MJX database.
  2. To intercept any and all calls to MySQL and handle them instead of the usual MySQL driver.

Let’s start by adding it to SBT:

val aspectj = "org.aspectj" % "aspectjrt" % "1.5.4" % "test"
val mxj = "mysql" % "mysql-connector-mxj" % "5.0.12" % "test"
val mxjdb = "mysql" % "mysql-connector-mxj-db-files" % "5.0.12" % "test"

And now we reload and wait, because the mysql-connector-mxj-db-files jar is almost 140MB in size (since it contains the binaries to run MySQL on multiple platforms) and takes while to download. While waiting, let’s take a look at the code, which lives in the ScalaTest unit testing environment:

package mysqlmock

import java.lang.String
import org.scalatest._
import java.sql.{Connection, DriverManager}
import java.io.File
import util.Random
import collection.JavaConversions._
import java.util.Properties
import com.mysql.management.{MysqldResourceI, MysqldResource}

trait MySqlIntercepting extends AbstractSuite {
    this: Suite =>

    // Note: This is almost an exact copy from the BeforeAndAfterAll trait, just so we don't force anyone to mix it in
    abstract override def run(testName: Option[String], reporter: Reporter,
                              stopper: Stopper, filter: Filter,
                              configMap: Map[String, Any],
                              distributor: Option[Distributor], tracker: Tracker) {
        var thrownException: Option[Throwable] = None

        // Force registering of mysql driver
        classOf[com.mysql.jdbc.Driver]

        // Kill mysql driver
        val drivers = enumerationAsScalaIterator(DriverManager.getDrivers)
        DriverManager.deregisterDriver(drivers.find(_.isInstanceOf[com.mysql.jdbc.Driver]).head)

        // Register our new faking driver
        val driver = new MockMySqlDriver()
        DriverManager.registerDriver(driver)

        try {
            super.run(testName, reporter, stopper, filter, configMap, distributor, tracker)
        }
        catch {
            case e: Exception => thrownException = Some(e)
        }
        finally {
            try {
                driver.shutdown()
                thrownException match {
                    case Some(e) => throw e
                    case None =>
                }
            }
            catch {
                case laterException: Exception =>
                    thrownException match {
                        // If both run and shutdown throw an exception, report the test exception
                        case Some(earlierException) => throw earlierException
                        case None => throw laterException
                    }
            }
        }
    }

    // This is a shortcut to getting a connection
    protected def createConnection(): Connection = {
        DriverManager.getConnection(MockMySqlDriver.internalConnectionStringPrefix)
    }
}

protected[mysqlmock] object MockMySqlDriver {
    val internalConnectionStringPrefix = "jdbc:test"
}

protected[mysqlmock] class MockMySqlDriver extends com.mysql.jdbc.Driver {
    private val databaseName: String = Random.alphanumeric.take(50).mkString
    private val databaseDir: File = new File(new File(System.getProperty("java.io.tmpdir")), databaseName)
    private val portNumber = (Random.nextInt(10049 - 10011) + 10011).toString // Unused port range
    private val databaseInstance = new MysqldResource(databaseDir)
    private var connectedAtLeastOnce: Boolean = false
    private val username = "test"
    private val password = "test"

    private def initializeDatabase() {
        if (!connectedAtLeastOnce) {
            // Only initialize the database once
            val options = Map(
                (MysqldResourceI.PORT -> portNumber),
                (MysqldResourceI.INITIALIZE_USER -> "true"),
                (MysqldResourceI.INITIALIZE_USER_NAME -> username),
                (MysqldResourceI.INITIALIZE_PASSWORD -> password)
            )

            databaseInstance.start("mysqld-" + databaseName, options)

            if (!databaseInstance.isRunning) {
                throw new RuntimeException("MySQL did not start.")
            }

            connectedAtLeastOnce = true
        }
    }

    def shutdown() {
        if (connectedAtLeastOnce)
            databaseInstance.shutdown()
    }

    override def acceptsURL(url: String) =
        // MySQL mocking
        url.startsWith("jdbc:mysql://") ||
        // Internal database
            url == MockMySqlDriver.internalConnectionStringPrefix ||
        // Or direct calls to mxj
            super.acceptsURL(url)

    override def connect(connectionString: String, props: Properties): Connection = {
        if (props.size() > 0) // If we came here with properties, then it's a call from ourselves
            return super.connect(connectionString, props)

        props.put("createDatabaseIfNotExist", "true")
        props.put("server.initialize-user", "true")
        props.put("--default-character-set", "utf8")
        props.put("--default-collation", "utf8_general_ci")
        props.put("user", username)
        props.put("password", password)

        // Make sure the database is running
        initializeDatabase()

        // Create a connection string to use with our database
        val string: String = "jdbc:mysql:mxj://localhost:" + portNumber + "/" + databaseName

        super.connect(string, props)
    }
}

As you can see from the code above, what we’re doing is creating a database for each test suite and throwing it away in the end. In addition to having the MySqlIntercepting trait intercept any attempt to connect to MySQL and redirect it to the temporary instance, let’s take a look at how we can mix it into our tests:

class MyTests extends Suite with BeforeAndAfter with MySqlIntercepting {
    before {
        val myConnection = super.createConnection()
        try {
            val statement = myConnection.createStatement()
            try {
                statement.execute("some sql here to initialize the database before each test")
            } finally {
                statement.close()
            }
        } finally {
            myConnection.close()
        }
    }

    after {
        val myConnection = super.createConnection()
        try {
            val statement = myConnection.createStatement()
            try {
                statement.execute("some sql here to clean up the database after each test")
            } finally {
                statement.close()
            }
        } finally {
            myConnection.close()
        }
    }

    // tests...
}

We ended up mocking MySQL without really mocking it. Access is extremely fast (with a little bit of time for bootstrapping per suite), and we’ve stripped away all of the unknown variables of using a shared, non-dedicated database server.

Addendum: Stopping in the middle of a test (without letting the test finish), for instance while debugging a test, will leave the mysqld process running, so to kill all dangling processes, use this:

ps -ef | grep MysqldResource.pid | grep -v grep | awk '{print "kill "$2}' | bash

When To Leave

I’ve been using Waze for a long time now and I love how it can predict very accurately how much time a drive somewhere would take. However, I didn’t like the fact that when I want to know when to leave one place to get to another, I had to keep re-opening the app and to check whether I would arrive on time if I left at that exact moment.

So I wrote a little something to find out when I need to leave.

I took the opportunity to learn some bash scripting, since I’m new to Unix. I welcome any critique of my work, if you have any.

The script queries the Google Geocoding API for both addresses and sends a routing request to Waze’s servers (warning: that’s somewhat of an unstable hack there). It then polls every now and then and when you have to leave, it sends a notification using Boxcar to your phone. It then nags you until you Ctrl-C the script (yeah, I didn’t care to make something prettier). If you open the notification in Boxcar and click the link, it opens Waze on the spot you want to go and all you need to do is set that as your destination.

Please note that this script Works On My Machine and I don’t guarantee it would on yours too.

What You Need:

  1. An iPhone with Boxcar (it’s free) installed.
  2. A Unix machine (I’m using Mac OSX Lion).

Setup:

  1. Go to Boxcar and create a new provider. Note down its API key and the email you registered with.
  2. Copy the script below to a wtl.sh file and make it executable. Change the two variables at the top of it to the API key and email you kept when you registered for Boxcar.
  3. Create a file called wtl.presets with presets you want to use, such as home or work. The file’s structure is lines of: [preset name] [address]
    The preset’s name must not include spaces.

Calling It:

./wtl.sh [address/preset of origin] [address/preset of destination] [when to arrive] [minutes head-start]
  • Address / Preset – Either a well-defined address (e.g. 1600 Pennsylvania Avenue, Washington, DC or even The White House – if Google can find it automatically on a map, then you’re good) or a preset from the file (see #3 above).
  • When To Arrive – The time you want to arrive at your destination, formatted as %Y-%m-%d %H:%M (e.g. 2011-12-13 15:00).
  • Minutes Head-Start – How many minutes you want the notifications to start coming before you need to leave.
Example: ./wtl.sh home "The Western Wall" "2011-12-05 10:00" 10

This will notify me 10 minutes before I need to leave home to get to The Western Wall at 10am, December 5th, 2011.

Future Work:

Writing a web application that would replace this bash script is a great direction, since a native iOS application would be somewhat useless here, since it can’t poll the server while closed.

The Source Code:

#!/bin/bash

BOXCAR_EMAIL=#Add your Boxcar email here
BOXCAR_API_KEY=#Add your Boxcar API key here

MINS_BEFORE=$4
EPOCH_TO_ARRIVE=`date -j -f "%Y-%m-%d %H:%M" "$3" "+%s"`

FROM_ADDRESS=`grep "$1" wtl.presets | awk '{ $1=""; gsub(/^ /,""); print }'`

if [ "$FROM_ADDRESS" = "" ]; then
    FROM_ADDRESS=$1
fi

FROM_ADDRESS=`echo $FROM_ADDRESS | sed 's# #+#g'`

TO_ADDRESS=`grep "$2" wtl.presets | awk '{ $1=""; gsub(/^ /,""); print }'`

if [ "$TO_ADDRESS" = "" ]; then
    TO_ADDRESS=$2
fi

TO_ADDRESS=`echo $TO_ADDRESS | sed 's# #+#g'`
FORMATTED_2=`echo $2 | sed 's# #+#g'`

echo Looking up origin address...

ORIGIN=`curl -s --get http://maps.googleapis.com/maps/api/geocode/json --data sensor=false --data address=$FROM_ADDRESS | ./jsawk 'if (this.status != "OK" || this.results.length != 1) return "ERROR"; return "x%3A" + this.results[0].geometry.location.lng + "+y%3A" + this.results[0].geometry.location.lat;'`

if [ "$ORIGIN" = "ERROR" ]; then
    echo Unable to parse origin address
    exit
fi

echo Looking up destination address...

DESTINATION=`curl -s --get http://maps.googleapis.com/maps/api/geocode/json --data sensor=false --data address=$TO_ADDRESS | ./jsawk 'if (this.status != "OK" || this.results.length != 1) return "ERROR"; return "x%3A" + this.results[0].geometry.location.lng + "+y%3A" + this.results[0].geometry.location.lat;'`

if [ "$DESTINATION" = "ERROR" ]; then
    echo Unable to parse destination address
    exit
fi

echo Starting time calculations.

while [ true ]
do
    TIME_TO_DRIVE=`curl -s --get http://www.waze.co.il/RoutingManager/routingRequest -d from=$ORIGIN -d to=$DESTINATION -d returnJSON=true -d returnGeometries=false -d returnInstructions=false -d timeout=60000 -d nPaths=1 | ./jsawk "var total = 0; for (var i = 0; i < this.response.results.length; i++) { total += this.response.results[i].crossTime; } return total;"`

    EPOCH_NOW=`date +%s`

    let "WHEN_TO_LEAVE = EPOCH_TO_ARRIVE - ((60 * $MINS_BEFORE) + $TIME_TO_DRIVE)"
    let "TIME_LEFT = ($WHEN_TO_LEAVE - $EPOCH_NOW) / 60"

    if [[ $TIME_LEFT -le 0 ]]; then
        let "TIME_LATE = 0 - TIME_LEFT"
        LL=`echo $DESTINATION | sed -E 's/x%3A(.+)\+y%3A(.+)/\2,\1/'`
        curl -d "email=$BOXCAR_EMAIL" -d "notification[message]=Leave!+You+are+$TIME_LATE+minute/s+late+to+leave+to+$FORMATTED_2!" -d "notification[source_url]=waze://?ll=$LL" http://boxcar.io/devices/providers/$BOXCAR_API_KEY/notifications
        echo Leave! You are $TIME_LATE minute/s late to leave to $2!
        sleep 60
    else
        echo Not yet. You still have $TIME_LEFT minute/s left.

        if [[ $TIME_LEFT > 10 ]]; then
            let "SLEEP_TIME = $TIME_LEFT / 2 * 60"
            sleep $SLEEP_TIME
        else
            sleep 60
        fi
    fi
done

Euler Project Problems 7-9 in Scala

This is my continuing series on using Project Euler to learn Scala.
You can see the other posts in the series by visiting the Project Euler category.

Problem #7

import collection.mutable.ListBuffer

object problem7 {
  private val primes: ListBuffer[Int] = ListBuffer[Int]()

  private def isPrime(number: Int) =
    !primes.toStream
      .takeWhile(_ <= math.sqrt(number).floor.toInt)
      .exists(number % _ == 0)

  private def findPrimesStream(): Stream[Int] = {
    // Start running from the last prime + 1 (or from 2 if no primes exist) until the first prime
    val startAt = if (primes.isEmpty) 2 else (primes.max + 1)
    val num: Int = Stream.from(startAt).find(isPrime(_)).head

    // Memoize
    primes.append(num)

    // Continue the stream
    num #:: findPrimesStream
  }

  def main(args: Array[String]) {
    // stream.apply at index 10001
    println(findPrimesStream()(10000))
  }
}

I tried to rely on Streams here, along with the obvious memoization that needed to take place.

Problem #8

object problem8 {
  private val number = "7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450"

  private def numbersStream(implicit index: Int = 0): Stream[String] = {
    if (index > number.length() - 5)
      return Stream.empty

    number.substring(index, index + 5) #:: numbersStream(index + 1)
  }

  def main(args: Array[String]) {
    // Get all consecutive fives
    val max = numbersStream().map(
      // Convert each to a stream of bytes
      _.toStream
        // Convert each one to a number and find their product
        .map(c => Integer.parseInt(c.toString, 10))
        .foldLeft(1)(_ * _))
      // Find the max product
      .max

    println(max)
  }
}

Problem #9

object problem9 {
  def main(args: Array[String]) {
    println(
      Range(1, 999)
        .map(a => (a, Range(a + 1, 999 - a).find(b => a + b + math.sqrt(a*a + b*b) == 1000)))
        .collectFirst{case (a, b) if b != None => a * b.get * math.sqrt(a*a + b.get*b.get) }
        .get
        .toInt)
  }
}

This time I found out about the collect/collectFirst methods that are pretty much filter and map in one. Also, the fact that I worked with a tuple meant that it was easier to have a case handle it, rather than fall back to ._1, et al.

Additional Reading:

Euler Project Problems 4-6 (with 1 and 2 revisited) in Scala

This is my continuing series on using Project Euler to learn Scala.
You can see the other posts in the series by visiting the Project Euler category.

Problem #4

val result = Range(999, 900, -1)
        .map(a => (a, Range(999, 900, -1).find(b => {
            val number = a * b
            val digits = number.toString.toCharArray.map(digit => Integer.valueOf(digit - 0x30))
            var found = true

            for (i <- 0 to digits.length / 2)
              if (digits(i) != digits(digits.length - 1 - i))
                found = false

            found
          })))
        .filter(pair => pair._2 != None)
        .map(pair => (pair._1, pair._2.get, pair._1 * pair._2.get))
        .maxBy(tuple => tuple._3)

println(result._1 + " * " + result._2 + " = " + result._3)

Gotta love writing one statement solutions.
I went with the brute-force solution here, because it was easy to write and debug and ran at a decent time.

Problem #5

import collection.mutable.{ListBuffer, HashMap}

object problem5 {
  val primes = new HashMap[Int, Int]()

  def collectPrimeDivisors(n: Int) {
    var current: Int = n
    var divisor: Int = 0

    // Find out the prime divisors and list them in divisorList
    val divisorList = ListBuffer[Int]()

    while (divisor != current && current != 1) {
      divisor = current

      val found = primes.find((prime: (Int, Int)) => current % prime._1 == 0)

      if (found != None) {
        divisorList.append(found.get._1)
        current /= found.get._1
        divisor = found.get._1
      }
    }

    if (divisor == current) {
      divisorList.append(current)
    }

    // Zip up the list of divisors
    val map = HashMap[Int, Int]()
    divisorList.foreach((prime: Int) => { map.put(prime, map.get(prime).getOrElse(0) + 1); map })

    // Collect the prime divisors
    map.foreach((tuple: (Int, Int)) => primes.put(tuple._1, math.max(tuple._2, primes.get(tuple._1).getOrElse(0))))
  }

  def run(number: Int): Int = {
    // Collect prime divisors from 2 to 20
    Range(2, number).foreach((i: Int) => collectPrimeDivisors(i))

    // Multiply all prime divisors
    primes.foldLeft(1)((seed: Int, prime: (Int, Int)) => seed * math.pow(prime._1, prime._2).asInstanceOf[Int])
  }

  def main(args: Array[String]) {
    println(run(20))
  }
}

It took me less time to do this with pen and paper than it did to code the solution. Here’s to math :)

Problem #6

object problem6 {
  def apply(n: Int): Int = {
    val sumOfSquares = n * (n + 1) * (2 * n + 1) / 6
    val sum = n * (n + 1) / 2
    val squareOfSum = sum * sum

    squareOfSum - sumOfSquares
  }

  def main(args: Array[String]) {
    println(apply(100))
  }
}

Another win goes to math. O(1) trumps O(n).

Problem #1 (two more solutions)

object problem1_b {
  def problem1Runner(x: Int): Int = {
    val zeroCase: PartialFunction[Int, Int] = { case 0 => 0 }
    val otherNumbers: PartialFunction[Int, Int] = { case n => p1(n) }
    val divisible: PartialFunction[ Int, Int ] = { case n if (n % 3 == 0) || (n % 5 == 0) => n + p1(n) }

    val runner = zeroCase orElse divisible orElse otherNumbers

    runner(x)
  }

  def p1(x: Int) = problem1Runner(x - 1)

  def main(args: Array[String]) {
    println(p1(1000))
  }
}

This solution is thanks to Tomer who told me about the case if pattern, in which the domain of the function is defined in the if clause instead of having to implement PartialFunction all over again.

However, after reading some more, I came to a more succinct solution:

println(Range(1, 1000).filter(n => n % 3 == 0 || n % 5 == 0).sum)

Gotta love one-liners :)

Problem #2 (another solution)

This solution uses the Tail Recursion optimization in Scala that was pointed out to me by Tomer.

object problem2_b {
  @tailrec
  private def sum(a: Int, b: Int, runningSum: Long): Long = {
    if (b > 4000000)
      runningSum
    else
      sum(b, a + b, if (b % 2 == 0) b + runningSum else runningSum)
  }

  def main(args: Array[String]) {
    println(sum(1, 1, 0))
  }
}

FUSE for HDFS

Working with HDFS has many benefits, but there’s one disadvantage that for me, at least, was very painful and that is that it doesn’t behave like other POSIX file system. If you want to access the filesystem, you have to use it via the hadoop -fs commands.

To work around this limitation, there’s FUSE for HDFS, which exposes HDFS as a mount point on your machine. This means that you can now expose files from HDFS to any service that speaks POSIX and also to the command line. A nice side effect is that you can now use a visual client, like WinSCP, to explore your HDFS.

In my case I’ve gone with Cloudera’s implementation, which is still quite raw (as in “do not use in production” raw), and installed it on a 64-bit CentOS 5.5 server. Here’s a quick guide of how to install it.

First, install Cloudera’s repository:

yum --nogpgcheck localinstall cdh3-repository-1.0-1.noarch.rpm

Now, install Java, Hadoop Client and FUSE:

yum install java-1.6.0-openjdk-devel
yum install hadoop-0.20-native.x86_64
yum install hadoop-0.20-libhdfs.x86_64
yum install fuse-libs.x86_64

Next, install the FUSE implementation for HDFS:

yum install hadoop-0.20-fuse.x86_64

Since JAVA_HOME is hard-coded to be /usr/lib/j2sdk1.6-sun, you need to create a link to it (this gave me quite a bit of grief):

ln -s $JAVA_HOME /usr/lib/j2sdk1.6-sun

Now create the mount-point for HDFS:

mkdir [mount point]
hadoop-fuse-dfs dfs://[server]:[hdfs port] [mount point]

And you’re good to go.

Bonus Reading: What is FUSE?