Creating a Scalding project with Maven

In this post I am going to show how to start a Scalding project with Maven.

Set up Scala in your IDE

The first step is to make sure your IDE (IntelliJ or Eclipse) supports Scala.

Create the Scalding project

  • Create a new Maven Scala project.
  • Use this pom.xml:
  • <?xml version="1.0" encoding="UTF-8"?>
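    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">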
     <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
     <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
     <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">

Use this Scala class:

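  • package scalding.examples
    import com.twitter.scalding._
    import org.apache.hadoop
    object JobRunner {
      def main(args: Array[String]): Unit = {
        hadoop.util.ToolRunner.run(new hadoop.conf.Configuration, new Tool, args)
      }
    }
    // Counts words in the --input text file and writes (word, count) rows to --output.
    class WordCountJob(args: Args) extends Job(args) {
      TypedPipe.from(TextLine(args("input")))
        .flatMap { line => line.split("""\s+""") }
        .groupBy { word => word }
        .size
        .write(TypedTsv[(String, Long)](args("output")))
    }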
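
To see what the job computes without a cluster, here is a minimal sketch of the same split, group, and count steps using plain Scala collections. It is not Scalding code: the names WordCountSketch, wordCount, and toTsv and the sample input are made up for illustration, and the filter for empty tokens is an extra step that the job above does not have.

```scala
// Plain-Scala sketch of the word count logic; no Scalding or Hadoop required.
// All names here are hypothetical and exist only for this illustration.
object WordCountSketch {
  // Split each line on whitespace, group equal words, and count each group.
  def wordCount(lines: Seq[String]): Map[String, Long] =
    lines
      .flatMap { line => line.split("""\s+""") }
      .filter(_.nonEmpty) // extra step: drop empty tokens caused by leading whitespace
      .groupBy { word => word }
      .map { case (word, occurrences) => word -> occurrences.size.toLong }

  // Render the counts as tab-separated lines, sorted by word for stable output.
  def toTsv(counts: Map[String, Long]): Seq[String] =
    counts.toSeq.sortBy(_._1).map { case (word, n) => s"$word\t$n" }

  def main(args: Array[String]): Unit = {
    val sample = Seq("hello world", "hello scalding") // made-up sample input
    toTsv(wordCount(sample)).foreach(println)
  }
}
```

Running main on the sample prints three tab-separated rows: hello with 2, scalding with 1, and world with 1. The output file of the real job should have the same word and count layout, although the row order there is not guaranteed to be sorted.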

Run the Scalding project

  • Create a run configuration in your IDE
  • Set the Main class to: scalding.examples.JobRunner
  • Add the following program arguments: scalding.examples.WordCountJob --hdfs --input hdfs://<your-hadoop-server>:<hdfs-port (e.g., 8020)>/user/myuser/input_path --output hdfs://<your-hadoop-server>:<hdfs-port (e.g., 8020)>/user/myuser/output_path/someOutputFile.tsv

Run the program and check the file in HDFS: /user/myuser/output_path/someOutputFile.tsv


In this post I showed how to build your first Scalding project as a simple Maven project, with no need for the sbt tool.