Mac Tips, Tricks and Utilities

I’m always looking for a way to get more out of my setup. In the past year, since moving from Windows to OSX, I’ve accumulated some knowledge and utilities I’d like to share.

Multiple Monitors

These past few years, I’ve grown accustomed to working on a two-monitor setup. I wanted to move forward to three monitors (two external, one internal), but the only way to do this with the MacBook Pro I have (15-inch, Early 2011) was to buy two thunderbolt-enabled Apple monitors. Since I don’t feel like shelling out $1000 for each of those, I started looking for other solutions. The absolute best one I found was using an external USB graphics card. In my case, I use the Plugable UGA-2K-A ($62 at the moment of writing this post) and daisy-chain my two Dell 24″ monitors. It’s based on the DisplayLink DL-195 chipset and has great drivers for OSX, which lately also added the ability to rotate the external monitor 90°. The refresh rate is great even for a resolution of 1920×1200 and I can only imagine what the USB 3.0 based models can do. I’ve had one card die on me and got great customer service from Plugable, who sent me a replacement, no questions asked.

BetterTouchTool (free)

I work with an Apple Magic Mouse, which is based on the great idea of integrating a trackpad on a mouse. Of course, it’s lacking in many respects, but my bigger issue with it is that it doesn’t have a way to middle-click. BetterTouchTool is a utility that lets you define custom actions for your mouse, keyboard and trackpad. I installed it, set three-finger-click to be recognized as middle-click and haven’t looked back since. Later on I started adding more shortcuts, like having the top three points of my trackpad being, from left to right, close, minimize and maximize, in the same order as these actions appear on windows. I also like the author’s SecondBar utility, but personally I don’t find it as useful.

Little Snitch (not free)

Powerful firewall, though it gets kind of annoying if you don’t define broad rules. Already helped me catch one too many applications trying to call home. I don’t think there’s much left to say about it that hasn’t already been said.

Skitch (free)

The ⌘⇧4 shortcut and its friends in OSX let you capture parts of your screen, but this functionality is very basic. Enter Skitch, which sits in your tray and when you press ⌘⇧5, gives you some basic editing functionality and the ability to drag the screenshot as a file (for instance if you want to upload it). I use it heavily when moderating submissions to plaintextoffenders.com. Unfortunately it has the downside of being very pesky about its integration with Evernote (which bought the company last year).

.osx (free)

Always ask for more from your OS. The link is to a Hacker News discussion about tweaks you can perform in OSX. Here’s my list:

defaults write NSGlobalDomain AppleKeyboardUIMode -int 3
defaults write NSGlobalDomain AppleFontSmoothing -int 2
defaults write com.apple.dock showhidden -bool true
defaults write NSGlobalDomain AppleEnableMenuBarTransparency -bool false
defaults write NSGlobalDomain AppleShowAllExtensions -bool true
defaults write com.apple.finder ShowPathbar -bool true
defaults write com.apple.finder ShowStatusBar -bool true
defaults write NSGlobalDomain NSNavPanelExpandedStateForSaveMode -bool true
defaults write NSGlobalDomain PMPrintingExpandedStateForPrint -bool true
defaults write NSGlobalDomain ApplePressAndHoldEnabled -bool false
defaults write NSGlobalDomain KeyRepeat -int 0.02
defaults write NSGlobalDomain InitialKeyRepeat -int 12
defaults write com.apple.NetworkBrowser BrowseAllInterfaces -bool true
defaults write com.apple.finder _FXShowPosixPathInTitle -bool true
defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
defaults write com.apple.finder FXEnableExtensionChangeWarning -bool false
/usr/libexec/PlistBuddy -c "Set :DesktopViewSettings:IconViewSettings:arrangeBy grid" ~/Library/Preferences/com.apple.finder.plist
defaults write com.apple.finder WarnOnEmptyTrash -bool false
defaults write com.apple.Safari DebugSnapshotsUpdatePolicy -int 2
defaults write com.apple.iTunes disablePingSidebar -bool true
defaults write com.apple.iTunes disablePing -bool true
chflags nohidden ~/Library
defaults write com.apple.LaunchServices LSQuarantine -bool false
defaults write com.apple.finder AppleShowAllFiles TRUE
defaults write com.apple.archiveutility dearchive-reveal-after -bool TRUE

for app in Safari Finder Dock Mail SystemUIServer; do killall "$app" >/dev/null 2>&1; done

f.lux (free)

Do you find yourself waking up at night sometimes, thinking about a cool idea, opening your laptop’s lid drowsily only to have your pupils catch fire by the glow of the monitor? f.lux automatically sets the temperature of your monitor to match the amount of light outside. Now your eyes can adjust more easily and your brain won’t send out an OMG WAKE UP IT’S MORNING alarm.

TotalFinder (not free)

This is the most recent addition to the list. Though it still lacks tested support for OSX Mountain Lion, it still works and works great. It adds tab support, side-by-side panes, etc. to the Finder. Also valuable is the global key combination to pop-up a finder window. Unfortunately it’s not free unless X.

Homebrew (free)

THE package manager for OSX. In stark opposition to MacPorts, it doesn’t require you to sudo it all the time. Unfortunately, the quality of the packages varies widely, so caveat emptor.

If you have any utilities or tips and tricks you’d like to share, please do. I’d love to get more out of my setup.

Parallelized Streams in Scala – Not What You Thought

I had a Stream of data coming in from an input source and had to do some heavily-CPU bound work. The code looked like this:

stream.map(item => cpuIntensiveWork(item))

Behind the scenes, the input stream is enumerated and items are acted upon one-by-one to create a lazily loaded list.
Wanting to parallelize this, I added par:

stream.par.map(item => cpuIntensiveWork(item))

What I imagined this would do was that it would load the lines into a buffer and have workers read from that buffer an execute the map while it was still writing. Turns out that’s not what happens.
What really happens is that the whole stream is read into memory and then that gets parallelized. Hardly what I was aiming for.

Unit Testing MySQL-Bound Code

I’m currently working on a product that is very tightly bound to MySQL. It’s so tightly coupled, in fact, that unit testing it while mocking the database access is meaningless. Unfortunately, accessing a MySQL database from the build server (or any machine) is not something that we can take for granted, not to mention concurrency (one person running the unit tests in parallel to another, messing each other’s sandbox) and the network I/O to an external server.

The solution I came up with would be to use an in-process, in-memory database to mock MySQL, so I looked to H2 (good) and HyperSQL (better), but neither had the same power as MySQL had.

Browsing through the Interwebs, I stumbled into mysql-je, a hack from back in 2006 which let you run MySQL as an in-process database, but it was so unstable and unmaintained that I kept looking and came across MySQL Connector/MJX. MJX is an in-process database that, unfortunately, needs access to the filesystem to store its files (you can use the temporary files’ directory), but comes in the form of handy (not so little) jars and works well when invoked from the JVM.

We need two things from this experiment:

  1. To be able to isolate and access our own MJX database.
  2. To intercept any and all calls to MySQL and handle them instead of the usual MySQL driver.

Let’s start by adding it to SBT:

val aspectj = "org.aspectj" % "aspectjrt" % "1.5.4" % "test"
val mxj = "mysql" % "mysql-connector-mxj" % "5.0.12" % "test"
val mxjdb = "mysql" % "mysql-connector-mxj-db-files" % "5.0.12" % "test"

And now we reload and wait, because the mysql-connector-mxj-db-files jar is almost 140MB in size (since it contains the binaries to run MySQL on multiple platforms) and takes while to download. While waiting, let’s take a look at the code, which lives in the ScalaTest unit testing environment:

package mysqlmock

import java.lang.String
import org.scalatest._
import java.sql.{Connection, DriverManager}
import java.io.File
import util.Random
import collection.JavaConversions._
import java.util.Properties
import com.mysql.management.{MysqldResourceI, MysqldResource}

trait MySqlIntercepting extends AbstractSuite {
    this: Suite =>

    // Note: This is almost an exact copy from the BeforeAndAfterAll trait, just so we don't force anyone to mix it in
    abstract override def run(testName: Option[String], reporter: Reporter,
                              stopper: Stopper, filter: Filter,
                              configMap: Map[String, Any],
                              distributor: Option[Distributor], tracker: Tracker) {
        var thrownException: Option[Throwable] = None

        // Force registering of mysql driver
        classOf[com.mysql.jdbc.Driver]

        // Kill mysql driver
        val drivers = enumerationAsScalaIterator(DriverManager.getDrivers)
        DriverManager.deregisterDriver(drivers.find(_.isInstanceOf[com.mysql.jdbc.Driver]).head)

        // Register our new faking driver
        val driver = new MockMySqlDriver()
        DriverManager.registerDriver(driver)

        try {
            super.run(testName, reporter, stopper, filter, configMap, distributor, tracker)
        }
        catch {
            case e: Exception => thrownException = Some(e)
        }
        finally {
            try {
                driver.shutdown()
                thrownException match {
                    case Some(e) => throw e
                    case None =>
                }
            }
            catch {
                case laterException: Exception =>
                    thrownException match {
                        // If both run and shutdown throw an exception, report the test exception
                        case Some(earlierException) => throw earlierException
                        case None => throw laterException
                    }
            }
        }
    }

    // This is a shortcut to getting a connection
    protected def createConnection(): Connection = {
        DriverManager.getConnection(MockMySqlDriver.internalConnectionStringPrefix)
    }
}

protected[mysqlmock] object MockMySqlDriver {
    val internalConnectionStringPrefix = "jdbc:test"
}

protected[mysqlmock] class MockMySqlDriver extends com.mysql.jdbc.Driver {
    private val databaseName: String = Random.alphanumeric.take(50).mkString
    private val databaseDir: File = new File(new File(System.getProperty("java.io.tmpdir")), databaseName)
    private val portNumber = (Random.nextInt(10049 - 10011) + 10011).toString // Unused port range
    private val databaseInstance = new MysqldResource(databaseDir)
    private var connectedAtLeastOnce: Boolean = false
    private val username = "test"
    private val password = "test"

    private def initializeDatabase() {
        if (!connectedAtLeastOnce) {
            // Only initialize the database once
            val options = Map(
                (MysqldResourceI.PORT -> portNumber),
                (MysqldResourceI.INITIALIZE_USER -> "true"),
                (MysqldResourceI.INITIALIZE_USER_NAME -> username),
                (MysqldResourceI.INITIALIZE_PASSWORD -> password)
            )

            databaseInstance.start("mysqld-" + databaseName, options)

            if (!databaseInstance.isRunning) {
                throw new RuntimeException("MySQL did not start.")
            }

            connectedAtLeastOnce = true
        }
    }

    def shutdown() {
        if (connectedAtLeastOnce)
            databaseInstance.shutdown()
    }

    override def acceptsURL(url: String) =
        // MySQL mocking
        url.startsWith("jdbc:mysql://") ||
        // Internal database
            url == MockMySqlDriver.internalConnectionStringPrefix ||
        // Or direct calls to mxj
            super.acceptsURL(url)

    override def connect(connectionString: String, props: Properties): Connection = {
        if (props.size() > 0) // If we came here with properties, then it's a call from ourselves
            return super.connect(connectionString, props)

        props.put("createDatabaseIfNotExist", "true")
        props.put("server.initialize-user", "true")
        props.put("--default-character-set", "utf8")
        props.put("--default-collation", "utf8_general_ci")
        props.put("user", username)
        props.put("password", password)

        // Make sure the database is running
        initializeDatabase()

        // Create a connection string to use with our database
        val string: String = "jdbc:mysql:mxj://localhost:" + portNumber + "/" + databaseName

        super.connect(string, props)
    }
}

As you can see from the code above, what we’re doing is creating a database for each test suite and throwing it away in the end. In addition to having the MySqlIntercepting trait intercept any attempt to connect to MySQL and redirect it to the temporary instance, let’s take a look at how we can mix it into our tests:

class MyTests extends Suite with BeforeAndAfter with MySqlIntercepting {
    before {
        val myConnection = super.createConnection()
        try {
            val statement = myConnection.createStatement()
            try {
                statement.execute("some sql here to initialize the database before each test")
            } finally {
                statement.close()
            }
        } finally {
            myConnection.close()
        }
    }

    after {
        val myConnection = super.createConnection()
        try {
            val statement = myConnection.createStatement()
            try {
                statement.execute("some sql here to clean up the database after each test")
            } finally {
                statement.close()
            }
        } finally {
            myConnection.close()
        }
    }

    // tests...
}

We ended up mocking MySQL without really mocking it. Access is extremely fast (with a little bit of time for bootstrapping per suite), and we’ve stripped away all of the unknown variables of using a shared, non-dedicated database server.

Addendum: Stopping in the middle of a test (without letting the test finish), for instance while debugging a test, will leave the mysqld process running, so to kill all dangling processes, use this:

ps -ef | grep MysqldResource.pid | grep -v grep | awk '{print "kill "$2}' | bash

When To Leave

I’ve been using Waze for a long time now and I love how it can predict very accurately how much time a drive somewhere would take. However, I didn’t like the fact that when I want to know when to leave one place to get to another, I had to keep re-opening the app and to check whether I would arrive on time if I left at that exact moment.

So I wrote a little something to find out when I need to leave.

I took the opportunity to learn some bash scripting, since I’m new to Unix. I welcome any critique of my work, if you have any.

The script queries the Google Geocoding API for both addresses and sends a routing request to Waze’s servers (warning: that’s somewhat of an unstable hack there). It then polls every now and then and when you have to leave, it sends a notification using Boxcar to your phone. It then nags you until you Ctrl-C the script (yeah, I didn’t care to make something prettier). If you open the notification in Boxcar and click the link, it opens Waze on the spot you want to go and all you need to do is set that as your destination.

Please note that this script Works On My Machine and I don’t guarantee it would on yours too.

What You Need:

  1. An iPhone with Boxcar (it’s free) installed.
  2. A Unix machine (I’m using Mac OSX Lion).

Setup:

  1. Go to Boxcar and create a new provider. Note down its API key and the email you registered with.
  2. Copy the script below to a wtl.sh file and make it executable. Change the two variables at the top of it to the API key and email you kept when you registered for Boxcar.
  3. Create a file called wtl.presets with presets you want to use, such as home or work. The file’s structure is lines of: [preset name] [address]
    The preset’s name must not include spaces.

Calling It:

./wtl.sh [address/preset of origin] [address/preset of destination] [when to arrive] [minutes head-start]
  • Address / Preset – Either a well-defined address (e.g. 1600 Pennsylvania Avenue, Washington, DC or even The White House – if Google can find it automatically on a map, then you’re good) or a preset from the file (see #3 above).
  • When To Arrive – The time you want to arrive at your destination, formatted as %Y-%m-%d %H:%M (e.g. 2011-12-13 15:00).
  • Minutes Head-Start – How many minutes you want the notifications to start coming before you need to leave.
Example: ./wtl.sh home "The Western Wall" "2011-12-05 10:00" 10

This will notify me 10 minutes before I need to leave home to get to The Western Wall at 10am, December 5th, 2011.

Future Work:

Writing a web application that would replace this bash script is a great direction, since a native iOS application would be somewhat useless here, since it can’t poll the server while closed.

The Source Code:

#!/bin/bash

BOXCAR_EMAIL=#Add your Boxcar email here
BOXCAR_API_KEY=#Add your Boxcar API key here

MINS_BEFORE=$4
EPOCH_TO_ARRIVE=`date -j -f "%Y-%m-%d %H:%M" "$3" "+%s"`

FROM_ADDRESS=`grep "$1" wtl.presets | awk '{ $1=""; gsub(/^ /,""); print }'`

if [ "$FROM_ADDRESS" = "" ]; then
    FROM_ADDRESS=$1
fi

FROM_ADDRESS=`echo $FROM_ADDRESS | sed 's# #+#g'`

TO_ADDRESS=`grep "$2" wtl.presets | awk '{ $1=""; gsub(/^ /,""); print }'`

if [ "$TO_ADDRESS" = "" ]; then
    TO_ADDRESS=$2
fi

TO_ADDRESS=`echo $TO_ADDRESS | sed 's# #+#g'`
FORMATTED_2=`echo $2 | sed 's# #+#g'`

echo Looking up origin address...

ORIGIN=`curl -s --get http://maps.googleapis.com/maps/api/geocode/json --data sensor=false --data address=$FROM_ADDRESS | ./jsawk 'if (this.status != "OK" || this.results.length != 1) return "ERROR"; return "x%3A" + this.results[0].geometry.location.lng + "+y%3A" + this.results[0].geometry.location.lat;'`

if [ "$ORIGIN" = "ERROR" ]; then
    echo Unable to parse origin address
    exit
fi

echo Looking up destination address...

DESTINATION=`curl -s --get http://maps.googleapis.com/maps/api/geocode/json --data sensor=false --data address=$TO_ADDRESS | ./jsawk 'if (this.status != "OK" || this.results.length != 1) return "ERROR"; return "x%3A" + this.results[0].geometry.location.lng + "+y%3A" + this.results[0].geometry.location.lat;'`

if [ "$DESTINATION" = "ERROR" ]; then
    echo Unable to parse destination address
    exit
fi

echo Starting time calculations.

while [ true ]
do
    TIME_TO_DRIVE=`curl -s --get http://www.waze.co.il/RoutingManager/routingRequest -d from=$ORIGIN -d to=$DESTINATION -d returnJSON=true -d returnGeometries=false -d returnInstructions=false -d timeout=60000 -d nPaths=1 | ./jsawk "var total = 0; for (var i = 0; i < this.response.results.length; i++) { total += this.response.results[i].crossTime; } return total;"`

    EPOCH_NOW=`date +%s`

    let "WHEN_TO_LEAVE = EPOCH_TO_ARRIVE - ((60 * $MINS_BEFORE) + $TIME_TO_DRIVE)"
    let "TIME_LEFT = ($WHEN_TO_LEAVE - $EPOCH_NOW) / 60"

    if [[ $TIME_LEFT -le 0 ]]; then
        let "TIME_LATE = 0 - TIME_LEFT"
        LL=`echo $DESTINATION | sed -E 's/x%3A(.+)\+y%3A(.+)/\2,\1/'`
        curl -d "email=$BOXCAR_EMAIL" -d "notification[message]=Leave!+You+are+$TIME_LATE+minute/s+late+to+leave+to+$FORMATTED_2!" -d "notification[source_url]=waze://?ll=$LL" http://boxcar.io/devices/providers/$BOXCAR_API_KEY/notifications
        echo Leave! You are $TIME_LATE minute/s late to leave to $2!
        sleep 60
    else
        echo Not yet. You still have $TIME_LEFT minute/s left.

        if [[ $TIME_LEFT > 10 ]]; then
            let "SLEEP_TIME = $TIME_LEFT / 2 * 60"
            sleep $SLEEP_TIME
        else
            sleep 60
        fi
    fi
done

Euler Project Problems 7-9 in Scala

This is my continuing series on using Project Euler to learn Scala.
You can see the other posts in the series by visiting the Project Euler category.

Problem #7

import collection.mutable.ListBuffer

object problem7 {
  private val primes: ListBuffer[Int] = ListBuffer[Int]()

  private def isPrime(number: Int) =
    !primes.toStream
      .takeWhile(_ <= math.sqrt(number).floor.toInt)
      .exists(number % _ == 0)

  private def findPrimesStream(): Stream[Int] = {
    // Start running from the last prime + 1 (or from 2 if no primes exist) until the first prime
    val startAt = if (primes.isEmpty) 2 else (primes.max + 1)
    val num: Int = Stream.from(startAt).find(isPrime(_)).head

    // Memoize
    primes.append(num)

    // Continue the stream
    num #:: findPrimesStream
  }

  def main(args: Array[String]) {
    // stream.apply at index 10001
    println(findPrimesStream()(10000))
  }
}

I tried to rely on Streams here, along with the obvious memoization that needed to take place.

Problem #8

object problem8 {
  private val number = "7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450"

  private def numbersStream(implicit index: Int = 0): Stream[String] = {
    if (index > number.length() - 5)
      return Stream.empty

    number.substring(index, index + 5) #:: numbersStream(index + 1)
  }

  def main(args: Array[String]) {
    // Get all consecutive fives
    val max = numbersStream().map(
      // Convert each to a stream of bytes
      _.toStream
        // Convert each one to a number and find their product
        .map(c => Integer.parseInt(c.toString, 10))
        .foldLeft(1)(_ * _))
      // Find the max product
      .max

    println(max)
  }
}

Problem #9

object problem9 {
  def main(args: Array[String]) {
    println(
      Range(1, 999)
        .map(a => (a, Range(a + 1, 999 - a).find(b => a + b + math.sqrt(a*a + b*b) == 1000)))
        .collectFirst{case (a, b) if b != None => a * b.get * math.sqrt(a*a + b.get*b.get) }
        .get
        .toInt)
  }
}

This time I found out about the collect/collectFirst methods that are pretty much filter and map in one. Also, the fact that I worked with a tuple meant that it was easier to have a case handle it, rather than fall back to ._1, et al.

Additional Reading:

Euler Project Problems 4-6 (with 1 and 2 revisited) in Scala

This is my continuing series on using Project Euler to learn Scala.
You can see the other posts in the series by visiting the Project Euler category.

Problem #4

val result = Range(999, 900, -1)
        .map(a => (a, Range(999, 900, -1).find(b => {
            val number = a * b
            val digits = number.toString.toCharArray.map(digit => Integer.valueOf(digit - 0x30))
            var found = true

            for (i <- 0 to digits.length / 2)
              if (digits(i) != digits(digits.length - 1 - i))
                found = false

            found
          })))
        .filter(pair => pair._2 != None)
        .map(pair => (pair._1, pair._2.get, pair._1 * pair._2.get))
        .maxBy(tuple => tuple._3)

println(result._1 + " * " + result._2 + " = " + result._3)

Gotta love writing one statement solutions.
I went with the brute-force solution here, because it was easy to write and debug and ran at a decent time.

Problem #5

import collection.mutable.{ListBuffer, HashMap}

object problem5 {
  val primes = new HashMap[Int, Int]()

  def collectPrimeDivisors(n: Int) {
    var current: Int = n
    var divisor: Int = 0

    // Find out the prime divisors and list them in divisorList
    val divisorList = ListBuffer[Int]()

    while (divisor != current && current != 1) {
      divisor = current

      val found = primes.find((prime: (Int, Int)) => current % prime._1 == 0)

      if (found != None) {
        divisorList.append(found.get._1)
        current /= found.get._1
        divisor = found.get._1
      }
    }

    if (divisor == current) {
      divisorList.append(current)
    }

    // Zip up the list of divisors
    val map = HashMap[Int, Int]()
    divisorList.foreach((prime: Int) => { map.put(prime, map.get(prime).getOrElse(0) + 1); map })

    // Collect the prime divisors
    map.foreach((tuple: (Int, Int)) => primes.put(tuple._1, math.max(tuple._2, primes.get(tuple._1).getOrElse(0))))
  }

  def run(number: Int): Int = {
    // Collect prime divisors from 2 to 20
    Range(2, number).foreach((i: Int) => collectPrimeDivisors(i))

    // Multiply all prime divisors
    primes.foldLeft(1)((seed: Int, prime: (Int, Int)) => seed * math.pow(prime._1, prime._2).asInstanceOf[Int])
  }

  def main(args: Array[String]) {
    println(run(20))
  }
}

It took me less time to do this with pen and paper than it did to code the solution. Here’s to math :)

Problem #6

object problem6 {
  def apply(n: Int): Int = {
    val sumOfSquares = n * (n + 1) * (2 * n + 1) / 6
    val sum = n * (n + 1) / 2
    val squareOfSum = sum * sum

    squareOfSum - sumOfSquares
  }

  def main(args: Array[String]) {
    println(apply(100))
  }
}

Another win goes to math. O(1) trumps O(n).

Problem #1 (two more solutions)

object problem1_b {
  def problem1Runner(x: Int): Int = {
    val zeroCase: PartialFunction[Int, Int] = { case 0 => 0 }
    val otherNumbers: PartialFunction[Int, Int] = { case n => p1(n) }
    val divisible: PartialFunction[ Int, Int ] = { case n if (n % 3 == 0) || (n % 5 == 0) => n + p1(n) }

    val runner = zeroCase orElse divisible orElse otherNumbers

    runner(x)
  }

  def p1(x: Int) = problem1Runner(x - 1)

  def main(args: Array[String]) {
    println(p1(1000))
  }
}

This solution is thanks to Tomer who told me about the case if pattern, in which the domain of the function is defined in the if clause instead of having to implement PartialFunction all over again.

However, after reading some more, I came to a more succinct solution:

println(Range(1, 1000).filter(n => n % 3 == 0 || n % 5 == 0).sum)

Gotta love one-liners :)

Problem #2 (another solution)

This solution uses the Tail Recursion optimization in Scala that was pointed out to me by Tomer.

object problem2_b {
  @tailrec
  private def sum(a: Int, b: Int, runningSum: Long): Long = {
    if (b > 4000000)
      runningSum
    else
      sum(b, a + b, if (b % 2 == 0) b + runningSum else runningSum)
  }

  def main(args: Array[String]) {
    println(sum(1, 1, 0))
  }
}

FUSE for HDFS

Working with HDFS has many benefits, but there’s one disadvantage that for me, at least, was very painful and that is that it doesn’t behave like other POSIX file system. If you want to access the filesystem, you have to use it via the hadoop -fs commands.

To work around this limitation, there’s FUSE for HDFS, which exposes HDFS as a mount point on your machine. This means that you can now expose files from HDFS to any service that speaks POSIX and also to the command line. A nice side effect is that you can now use a visual client, like WinSCP, to explore your HDFS.

In my case I’ve gone with Cloudera’s implementation, which is still quite raw (as in “do not use in production” raw), and installed it on a 64-bit CentOS 5.5 server. Here’s a quick guide of how to install it.

First, install Cloudera’s repository:

yum --nogpgcheck localinstall cdh3-repository-1.0-1.noarch.rpm

Now, install Java, Hadoop Client and FUSE:

yum install java-1.6.0-openjdk-devel
yum install hadoop-0.20-native.x86_64
yum install hadoop-0.20-libhdfs.x86_64
yum install fuse-libs.x86_64

Next, install the FUSE implementation for HDFS:

yum install hadoop-0.20-fuse.x86_64

Since JAVA_HOME is hard-coded to be /usr/lib/j2sdk1.6-sun, you need to create a link to it (this gave me quite a bit of grief):

ln -s $JAVA_HOME /usr/lib/j2sdk1.6-sun

Now create the mount-point for HDFS:

mkdir [mount point]
hadoop-fuse-dfs dfs://[server]:[hdfs port] [mount point]

And you’re good to go.

Bonus Reading: What is FUSE?

Euler Project Problems 1-3 in Scala

As with when I learned Ruby, I find Project Euler to be an extremely useful way both to learn a language and its idioms and also flex my math muscles every once in a while. Since I’m starting to learn Scala, I decided to start answering the problems there again (the last time I got to 33) and thought I’d share what I came up with here.

Problem #1

object problem1 {
  def run(number: Int): Int = {
    sum(number - 1)
  }

  def sum(number: Int): Int = {
    number match {
      case 0 => 0
      case n => (if(n % 3 == 0 || n % 5 == 0) n else 0) + problem1.sum(n - 1)
    }
  }
}

I had wanted to use PartialFunction‘s functionality for the sake of some mental gymnastics (and the fact that filtering doesn’t occur until the inside of the case in the previous solution), so here’s another solution:

def problem1Runner(x: Int): Int = {
  val zeroCase: PartialFunction[Int, Int] = { case 0 => 0 }
  val otherNumbers: PartialFunction[Int, Int] = { case n => p1(n) }

  val runner = zeroCase orElse new Problem1PF orElse otherNumbers

  runner(x)
}

def p1(x: Int) = problem1Runner(x - 1)

class Problem1PF extends PartialFunction[Int, Int] {
  def isDefinedAt(n: Int): Boolean = {
    (n % 3 == 0 || n % 5 == 0)
  }

  def apply(n: Int): Int = {
    n + p1(n)
  }
}

println(p1(1000))

Obviously, this is the less readable solution, but it taught me more to try and code it.

Problem #2

def fibSum(): Int = fibSum(1, 2)

def fibSum(a: Int, b: Int): Int = {
  var sum = 0
  var x = a
  var y = b

  while (x <= 4000000) {
    if (x % 2 == 0) sum += x

    val z = x
    x = y
    y += z
  }

  sum
}

println(fibSum())

I tried to create an obvious recursive solution, but I quickly understood how it would overflow.

Problem #3

Here I tried the obvious solution first, but it had horrible complexity, so I went and implemented the (unreadable) Shanks’ square forms factorization algorithm (and used Euclid’s greatest common divisor algorithm as part of it).

import collection.mutable.ListBuffer

object problem3 {
  def apply(n: Long): Long = {
    var num = n
    val factors = ListBuffer[Long](problem3.findFactor(num))

    while (factors.last != 1) {
      num /= factors.last
      factors.append(problem3.findFactor(num))
    }

    factors.max
  }

  private def findFactor(n: Long): Long = {
    // initialize
    val k = 1
    var p: ListBuffer[Long] = ListBuffer(math.floor(math.sqrt(k * n)).asInstanceOf[Long])
    var q: ListBuffer[Long] = ListBuffer(1, (k * n - math.pow(p.head, 2.0)).asInstanceOf[Long])
    var b: Long = 0L

    // repeat
    var i = 1

    do {
      b = math.floor((math.floor(math.sqrt(k * n)) + p(i-1)) / q(i)).asInstanceOf[Long]
      p.append(b * q(i) - p(i-1))
      q.append(q(i-1) + b * (p(i-1) - p(i)))
      i += 1
    } while (!isPerfectSquare(q.last))

    // initialize
    b = math.floor((math.floor(math.sqrt(k * n)) - p(i-1)) / math.sqrt(q(i))).asInstanceOf[Long]
    p = ListBuffer((b * math.sqrt(q(i)) + p(i - 1)).asInstanceOf[Long])
    q = ListBuffer(math.sqrt(q(i)).asInstanceOf[Long], ((k * n - math.pow(p(0), 2.0)) / math.sqrt(q(i))).asInstanceOf[Long])

    // repeat
    i = 0

    do {
      i += 1
      b = math.floor((math.floor(math.sqrt(k * n)) + p(i-1)) / q(i)).asInstanceOf[Long]
      p.append(b * q(i) - p(i-1))
      q.append(q(i-1) + b * (p(i-1) - p(i)))
    } while (p(i-1) != p(i))

    gcd(n, p(i))
  }

  private def isPerfectSquare(n: Long): Boolean = {
    val root: Long = math.floor(math.sqrt(n)).asInstanceOf[Long]

    root * root == n
  }

  private def gcd(a: Long, b: Long): Long = {
    // Euclid's algorithm
    b match {
      case 0 => a
      case _ => gcd(b, a - b * math.floor(1.0 * a / b).asInstanceOf[Long])
    }
  }
}

println(problem3(600851475143L))

Notes:

  1. ListBuffers are better than Lists for appending items.
  2. Turns out you can’t do for on a range longer than Int.MAX_VALUE.
  3. The number of non-textual operators in Scala is insane!

A Fresh Start

Since January 2010, I’ve been working at LivePerson, as a Senior Java Developer and ScrumMaster. It’s now time to move on – I will be joining newBrandAnalytics starting November 1st, 2011 and have decided it’s time to bring my professional blog back to life.

As I have not focused on .NET in the past two years and the fact that I’ll be working in Scala at nBA, I’ve decided to move the blog from the old ASP.NET Community Server 2007 installation that has been my home for years to WordPress. The blog will no longer focus on any one thing, so if you’ve come here expecting .NET content, I apologize, but hope you’ll stay and keep reading.

Here’s to new beginnings and fresh starts! :)

The Anti-Social Network

(I’ve returned from my trip a few months ago, but didn’t have enough to blog about; hopefully, this is my returning to a semi-regular posting schedule; also, I’ve decided to veer off .NET in some of my posts, focusing on more issues that I like to talk about and offer my opinions on)

(Update: This article is now featured in Hebrew in Gal Mor’s blog “Holes in the Net”)

CC-BY-NC, Min Jung Kim Social networks have been flourishing in the past few years, and have let us reach more people than we could before them. I, personally, use both Facebook and Twitter on a daily basis. The trend of ‘socialized’ applications is steadily marching on. A simple example of that is e-commerce sites looking to increase their conversion rate with you using recommendations from your friends.

However, there is yet another trend coming up that we should worry about, which is users trying to force anti-social behavior. Here are a couple of examples:

  1. Facebook introduced the ‘Like’ feature in February 2009, making it easier for users to say that they like a certain something: a post, a picture, a link, etc. This feature was immediately put to good use, with people ‘liking’ things, instead of creating long comment chains. ‘Liking’ something was much easier, too, being a single-click action, and increased the amount of participation. The Like feature was a good thing.

    Lately, voices have come up in favor of creating a ‘Dislike’ button. Users feel that since they have an easy way to notify their friend that they like his or her post, that they should equally be able to notify them of not liking it, discouraging the person from continuing this kind of activity.
    To use the ‘Dislike’ feature would be like saying to someone’s face that their joke was not funny or that they are ugly. Face to face, you might only do that to your closest friends and family and even then in jest, but online it’s much easier to be anti-social. The socially acceptable alternative to the ‘Dislike’ button would be to ‘Hide’ that friend’s updates, which unlike ignoring someone in the real world, is something the friend will never know about and will therefore never get offended by.
  2. The way to use Twitter is not something users can agree on. One thing that is widely agreed upon, though, is that following someone means you care what they say. That leads most people to the mistaken belief that the more followers you acquire, the more people will believe you are important and influential.One of the ways in which users are attempting to inflate their follower count is the contemptible habit of expecting auto-follow-back. This means that the user follows you, waits for you to follow them back and then un-follows you if you haven’t. This widely embraced practice causes unsuspecting users to be barraged with a very large number of follow notifications from people, brands and companies they have no interest in.This is downright adamant to spam, which we can all agree is one of the least socially acceptable activities you can partake in online.

These examples just go to show the trend of users wanting to add anti-social behavior to a social platform.

Social networks, on their part, should never allow themselves to be catalysts for anti-social behavior. Anti-social activities, such as hiding users or reporting them or their actions should remain hidden from all other users and must remain in confidence between the user and the platform. The real world parallel to this would be ‘telling on’ someone, which is frowned upon.
Facebook tries to avoid that by making all positive social activities more open, while keeping the anti-social activities on a one-on-one basis between the user and the platform. Unfavorable comments are still prevalent, but are unavoidable.
Twitter is, on the other hand, lagging far behind by keeping their network very much bare of personal filters (no way to hide unwanted tweets) and allowing users to easily find out who un-followed them, to name just a few issues.

The rule-of-thumb I present here does not stop with social networks, but continues to other socially-influenced platforms. Socially-enabled games follow this rule too: FarmVille, which at the time of writing of this post is the most popular and fastest growing game of all time on Facebook, does the same.
Players have a multitude of socially-positive activities to choose from, such as sending each other gifts, taking in animals their friends put up for adoption, helping out on friends’ farms and even racing each other to the top of their local hall of fame in the long run. On the other hand, there are no socially-negative activities in the game: players can not steal crops or animals from friends, mess up their farms, etc.

Social platforms should make it easy to broadcast socially-positive activities and hard to broadcast the negative ones. Once they follow this simple frame of mind, they will reap the psychological and sociological benefits.