SBT and Specs for BDD in Scala

BDD Programming Scala

I had used RSpec earlier for Behaviour Driven Development on a Ruby on Rails project. Today I learnt how to do BDD in Scala. Chapter 4 of Programming Scala introduces Traits, and Specs for testing the code. Here is how I set it up: I already had Scala installed, so the first step was to set up sbt (Simple Build Tool for Scala). Setup was easy (described in detail here):
    $ sudo yum localinstall
Then I created a sample project, which is hosted on GitHub:
    $ git clone git://
    $ cd trait-with-sbt-spec
    $ sbt test
And here is the output ...


Apache Mesos - incubator project; and Spark Project

frameworks mesos spark

Apache Mesos, written primarily in C++, is a cluster resource allocation framework. It is used by the Spark Project (the majority of the Spark project is written in Scala). Apache Mesos - Spark Project - Tech Talk: Matei Zaharia (UC Berkeley) – «Spark: A Framework for Iterative and Interactive Cluster Computing»


Scala script tool seems to be broken on Fedora 17

Fedora programming Scala

I am running Scala on Fedora 17 and I seem to face a problem. I created a /tmp/hello.scala file, as follows:
    println("Hello world")
This works just fine if I run it on Ubuntu 12.04, i.e. it prints Hello world. However, when I try to run it on Fedora, I don’t get any output and the script keeps waiting forever. So I tried to inspect what’s happening here:
    $ bash -x scala /tmp/hello.scala
    + scala_exit_status=127
    + saved_stty=
    + trap onExit INT
    ++ stty -g
    + saved_stty=6d02:5:4bf:8a3b:3:1c:7f:15:4:0:1:ff:11:13:1a:ff:12:f:17:16:ff:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0
    + [[ !


Set up for learning Scala with Emacs

emacs programming scala

Obviously the first step is to install the Scala language:
    $ sudo aptitude install scala
OR
    $ yum install scala
Then I ran my first Scala «Hello world!» program from the CLI. Setting up Scala mode for Emacs was a bit of a pain, so I merged the old scala-mode and the latest into my repo. Here are the very simple steps to set up scala-mode for Emacs:
    $ cd ~/.emacs.d/
    $ git clone git://
    $ cd scala-mode
    $ make
Now add the following startup code to your ~/.emacs file:
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;; START: Scala mode
    (add-to-list 'load-path "/home/tuxdna/.emacs.d/scala-mode")
    (require 'scala-mode-auto)
    ;; END: Scala mode
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
And you are done with the setup!


On building large-scale data processing systems

distributed realtime

I was reading a few blog posts about distributed, large-scale processing of data, be it in batch or real-time. The move is definitely towards real-time now (here and here). In this blog post I am only going to mention the things that I have come across so far. I would like to learn more. All the buzz around large-scale data processing seems, in some way or the other, to be inspired by papers published by Google or the systems they built.


Indexing the documents stored in a database using Apache Solr and Apache Tika

Lucene Solr Tika

Indexing the documents stored in a database
Outline:
    1. Set up a MySQL database [1] containing documents (PDF/DOC/HTML etc.).
    2. Set up Apache Solr / Tika.
    3. Import the documents just by hitting an import URL.
NOTE: Also check the update note at the end of this post.
These steps were done on my machine running Fedora 17. The commands can be easily converted for other distributions.
Set up MySQL database with documents
Install MySQL Server:
    # yum install mysql-server
    # service mysqld start
Also install the Java library for connecting to MySQL (Solr will need it later):
    # yum install -y mysql-connector-java
Set up a MySQL database [1] for storing binary files:
    CREATE DATABASE binary_files;
    USE binary_files;
    CREATE TABLE tbl_files (
      id_files tinyint(3) unsigned NOT NULL auto_increment,
      bin_data longblob NOT NULL,
      description tinytext NOT NULL,
      filename varchar(255) NOT NULL,
      filesize integer NOT NULL,
      filetype varchar(255) NOT NULL,
      PRIMARY KEY (id_files)
    );
    GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER ON binary_files.* TO binary_user@localhost IDENTIFIED BY 'binary_password';
Now let's create a Ruby script to populate the database with documents.
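A minimal sketch of the row-building half of such a script, under some assumptions: `file_row`, `COLUMNS`, `MIME_BY_EXT` and `INSERT_SQL` are hypothetical names, and storing a MIME type guessed from the file extension in `filetype` is only one possible choice. The database call itself is left as a comment because it depends on which MySQL gem is installed.

```ruby
# Columns of tbl_files that a script has to fill (id_files is auto_increment).
COLUMNS = %i[bin_data description filename filesize filetype].freeze

# Assumption: filetype holds a MIME type guessed from the file extension.
MIME_BY_EXT = {
  '.pdf'  => 'application/pdf',
  '.doc'  => 'application/msword',
  '.html' => 'text/html'
}.freeze

# Parameterised statement; placeholders keep the binary data out of the SQL text.
INSERT_SQL = "INSERT INTO tbl_files (#{COLUMNS.join(', ')}) " \
             "VALUES (#{(['?'] * COLUMNS.size).join(', ')})"

# Read one file from disk and return a hash with one value per column.
def file_row(path, description = '')
  data = File.binread(path)
  {
    bin_data: data,
    description: description,
    filename: File.basename(path),
    filesize: data.bytesize,
    filetype: MIME_BY_EXT.fetch(File.extname(path).downcase,
                                'application/octet-stream')
  }
end

# Hypothetical usage with the mysql2 gem (not required by this sketch):
#   client = Mysql2::Client.new(host: 'localhost', username: 'binary_user',
#                               password: 'binary_password',
#                               database: 'binary_files')
#   statement = client.prepare(INSERT_SQL)
#   ARGV.each { |path| statement.execute(*file_row(path).values_at(*COLUMNS)) }
```

Keeping the row construction separate from the database call makes it easy to check the metadata before anything is written to `tbl_files`.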


JCallTracer: Tool to generate Sequence Diagrams for Java programs

Java JCallTracer

For some time now I have been working on a project called JCallTracer. I had a simple problem at hand: generate Sequence Diagrams for a program written in Java. I did try to google such a tool, but I couldn’t find anything that was Open Source and worked on Linux. The closest I could find was Java Call Tracer. This tool was designed for Windows users and didn’t compile on Linux. I fixed that, but then it was apparently designed for Java programs with a small memory footprint.


C 2011 Standard and current FOSS implementations


Today I got to know that C has a new standard, released in 2011. You can find a detailed Dr. Dobb’s article on the subject. So far I haven’t come across any Open Source compiler that fully implements the C11 features. Clang and GCC are yet to fully support this standard. Clang has added support for anonymous structs and anonymous unions: Clang 3.1 adds support for anonymous structs and anonymous unions, added in the latest ISO C standard.


Apache Incubator projects

apache incubator

I was going through the list of Apache Incubator projects and found a few really interesting ones, primarily because I could immediately relate them to some functionality I could readily use. However, I have to say that the layout of the Apache Incubator projects page makes it a daunting task: you have to visit each and every project link to learn which technology or domain a project name relates to. If, instead of a matrix of project names, there were a simple project list with Project Name, Technologies, Domain etc., it would be far easier to identify the relevant projects.


Memory consumption by a .so file for a running process

linux memory process

I wanted to know how much memory is consumed by the C++ standard library for a process running on Linux. I could not find a straightforward way, so I have written a small script to do exactly that.
Script Location:
How to use?
    $ wget
    $ ruby ./mem-usage.rb /proc/5952/maps | grep libstd
    /usr/lib/x86_64-linux-gnu/ 2988.0 KBs
In the above example, 5952 is the PID of the Thunderbird mail client, and the C++ standard library consumes 2988 KB of memory for this process.
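One way such a script could work, as a minimal sketch and not necessarily how mem-usage.rb does it: every line of /proc/<pid>/maps starts with a hexadecimal address range, so subtracting the start from the end and summing per backing file gives the mapped size of each library. The function name `mapped_kb_by_file` is hypothetical, and the totals are mapped address space, which can be larger than the memory actually resident.

```ruby
# Sum the size of every mapping in a /proc/<pid>/maps listing, grouped by
# the file that backs it. Returns a hash of path => size in KB (Float).
def mapped_kb_by_file(maps_text)
  totals = Hash.new(0)
  maps_text.each_line do |line|
    # Fields: address range, perms, offset, device, inode, optional path.
    fields = line.split(' ', 6)
    next if fields.length < 6            # anonymous mapping: no path column
    path = fields[5].strip
    next unless path.start_with?('/')    # skip [heap], [stack], [vdso], ...
    first, last = fields[0].split('-').map { |hex| hex.to_i(16) }
    totals[path] += last - first
  end
  totals.transform_values { |bytes| bytes / 1024.0 }
end

# Command-line use: ruby <this file> /proc/<pid>/maps
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  mapped_kb_by_file(File.read(ARGV[0])).each do |path, kb|
    puts "#{path} #{kb} KBs"
  end
end
```

Piping the output through grep, as in the usage above, narrows it down to a single library.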
