Friday, February 20, 2009

Java Broke Computer Science

So... yeah, Java broke computer science. But first some background, I'm doing agent based modeling these days, and a common thing you have 2+ nested loops. For example I'm working on a project right now where I'll be looping an outer loop somewhere in the range of 100,000+ times and an inner loop maybe 10,000 - 20,000. Suffice to say, I need to make it as fast as I can. I wrote some little tests so I could get a sense of how long it would take, just to run through the loop and do something small.

A class to do stuff with:

public class Foo {
 private int count;
 private boolean inc;

 public Foo(boolean inc) {
  this.inc = inc;
 }

 public void tick() {
  if (inc) {
   count++;
  }
 }
}

My Test:

private int outerLoop = 100000;
private int innerLoop = 10000;

@Test
public void arrays() {
 Foo[] foos = new Foo[innerLoop];
 for (int i = 0; i < innerLoop; i++) {
  boolean inc = i % 2 == 0 ? true : false;
  foos[i] = new Foo(inc);
 }

 long start = System.currentTimeMillis();
 for (int i = 0; i < outerLoop; i++) {
  for (int j = 0; j < innerLoop; j++) {
   foos[j].tick();
  }
 }
 long fin = System.currentTimeMillis();
 System.out.println("arrays took: " + (fin - start));
}

Takes on my machine in java 1.6 ~2,000 ms I'd like to point out here, that Foo[] foos will not change, so I don't care about speed of modifying a data structure, just the fastest way to iterate.

Right, iterate, what's an arrayList looklike?

@Test
public void arrayList() {
 ArrayList arrayList = new ArrayList();
 for (int i = 0; i < innerLoop; i++) {
  boolean inc = i % 2 == 0 ? true : false;
  arrayList.add(new Foo(inc));
 }
 long start = System.currentTimeMillis();
 for (int i = 0; i < outerLoop; i++) {
  for (Foo node: arrayList) {
   node.tick();
  }
 }
 long fin = System.currentTimeMillis();
 System.out.println("ArrayList took: " + (fin - start));
}

This runs in ~3000 ms

Okay the Java Iterators aren't cutting it, but what about a real linked list? not java.util.LinkedList but, Linked_list the data structure. This as far as I can figure should be ideal. All there is to iteration is pointer de-referencing. There are no lookups to find the a given index in an array, just following the pointers.

Here's my implementation:

public class LinkedListNode {
 public T data;
 public LinkedListNode next;

 public LinkedListNode(T data) {
  this.data = data;
 }
}

public class LinkedList {
 private LinkedListNode first;
 private LinkedListNode last;

 public LinkedList() {}

 public LinkedListNode first() {
  return first;
 }

 public void addToEnd(LinkedListNode linkedListNode) {
  if (first == null) {
   first = linkedListNode;
   last = linkedListNode;
  } else {
   last.next = linkedListNode;
   last = linkedListNode;
  }
 }
}

And the test:

@Test
public void linkedList() {
 LinkedList linkedList = new LinkedList();
 for (int i = 0; i < innerLoop; i++) {
  boolean inc = i % 2 == 0 ? true : false;
  linkedList.addToEnd(new LinkedListNode(new Foo(inc)));
 }

 long start = System.currentTimeMillis();
 LinkedListNode startNode = linkedList.first();
 for (int i = 0; i < outerLoop; i++) {
  LinkedListNode node = startNode;
  while (node.next != null) {
   node.data.tick();
   node = node.next;
  }
 }
 long fin = System.currentTimeMillis();
 System.out.println("linkedList took: " + (fin - start));
}

Any guesses to how this performed? try ~5500 ms Like I said Java broke computer science, they must be doing some amazing compiler optimizations, how is it that node.next is slower than foo[499]. It boggles the mind. What's the take away here? When you can use a primitive array do so?

oh, just a note, yes I know ArrayList is backed by a array, and yes I tested a java.util.LinkedList as well, and it was the worst of the bunch.

Monday, February 9, 2009

Test Data Builders

I've been meaning to post something about this for a little bit, not exactly sure what I had to add to the subject besides, "hey that's a good idea", and after about a month thinking about it I think it's safe to say I don't have much to add except for evangelism.

So without further ado, I'd suggest reading a few articles on test data builders:
http://blog.jayfields.com/2009/01/most-java-unit-tests-consist-of-class.html
Test Data Builders

In short Test Data Builders allow you to minimize maintenance cost in your tests and allow you to easily inject objects into your test objects, it's great, great stuff.

I guess I do have an addition, a story:
On my last project a co-worker made a perfectly reasonable api change, and added a parameter to a constructor, after making the change he realized this would effect ~100 tests (whoa), so he used Eclipses magic refactoring tools and initialized the value to null. The problem was his changes really required that new parameter to be non-null (in fact if it was null a NPE would be thrown), and even though the code compiled all the tests broke (with NPEs). He didn't want to fix the tests. Why should he fix all 100 unrelated tests, maybe 5-10 tests but 100, that was just too much. And really I understood that. And I was lucky, I had just read Jay Fields post about Test Data Builders. 1/2 a day later a co-worker and I added all the Builders we needed for our project and started updating our tests to use those builders. With the builders in place, fixing the NPEs was in one place. And now, if there is a similar situation and one change breaks hundreds of tests, there is one place to go and fix them all.

Anyway, I would not start a new project without using Test Data Builders they really pay for themselves when maintaining your tests and your code.

Friday, February 6, 2009

Stack Overflow #38

There's a big old brouhaha regarding the Stack Overflow podcast #38 Joel summarizes the pod cast: http://www.joelonsoftware.com/items/2009/01/31.html He comes across as very anti-unit test, which is fairly surprising to me. I was surprised, because I feel that unit testing has only made me a better developer and allowed me to deliver higher quality software faster. Anyway I just came across Uncle Bob's response: http://blog.objectmentor.com/articles/2009/01/31/quality-doesnt-matter-that-much-jeff-and-joel. From the comments on Object Mentor, looks Uncle Bob will be participating in a stackoverflow podcast, that should be interesting to to hear.

Another blogger I follow also posted a good response: http://blog.jayfields.com/2009/02/thoughts-on-developer-testing.html

Anyone come across anything else interesting?

 
Web Statistics