So... yeah, Java broke computer science. But first some background, I'm doing agent based modeling these days, and a common thing you have 2+ nested loops. For example I'm working on a project right now where I'll be looping an outer loop somewhere in the range of 100,000+ times and an inner loop maybe 10,000 - 20,000. Suffice to say, I need to make it as fast as I can. I wrote some little tests so I could get a sense of how long it would take, just to run through the loop and do something small.
A class to do stuff with:
public class Foo { private int count; private boolean inc; public Foo(boolean inc) { this.inc = inc; } public void tick() { if (inc) { count++; } } }
My Test:
private int outerLoop = 100000; private int innerLoop = 10000; @Test public void arrays() { Foo[] foos = new Foo[innerLoop]; for (int i = 0; i < innerLoop; i++) { boolean inc = i % 2 == 0 ? true : false; foos[i] = new Foo(inc); } long start = System.currentTimeMillis(); for (int i = 0; i < outerLoop; i++) { for (int j = 0; j < innerLoop; j++) { foos[j].tick(); } } long fin = System.currentTimeMillis(); System.out.println("arrays took: " + (fin - start)); }
Takes on my machine in java 1.6 ~2,000 ms I'd like to point out here, that Foo[] foos will not change, so I don't care about speed of modifying a data structure, just the fastest way to iterate.
Right, iterate, what's an arrayList looklike?
@Test public void arrayList() { ArrayListarrayList = new ArrayList (); for (int i = 0; i < innerLoop; i++) { boolean inc = i % 2 == 0 ? true : false; arrayList.add(new Foo(inc)); } long start = System.currentTimeMillis(); for (int i = 0; i < outerLoop; i++) { for (Foo node: arrayList) { node.tick(); } } long fin = System.currentTimeMillis(); System.out.println("ArrayList took: " + (fin - start)); }
This runs in ~3000 ms
Okay the Java Iterators aren't cutting it, but what about a real linked list? not java.util.LinkedList but, Linked_list the data structure. This as far as I can figure should be ideal. All there is to iteration is pointer de-referencing. There are no lookups to find the a given index in an array, just following the pointers.
Here's my implementation:
public class LinkedListNode{ public T data; public LinkedListNode next; public LinkedListNode(T data) { this.data = data; } } public class LinkedList { private LinkedListNode first; private LinkedListNode last; public LinkedList() {} public LinkedListNode first() { return first; } public void addToEnd(LinkedListNode linkedListNode) { if (first == null) { first = linkedListNode; last = linkedListNode; } else { last.next = linkedListNode; last = linkedListNode; } } }
And the test:
@Test public void linkedList() { LinkedListlinkedList = new LinkedList (); for (int i = 0; i < innerLoop; i++) { boolean inc = i % 2 == 0 ? true : false; linkedList.addToEnd(new LinkedListNode (new Foo(inc))); } long start = System.currentTimeMillis(); LinkedListNode startNode = linkedList.first(); for (int i = 0; i < outerLoop; i++) { LinkedListNode node = startNode; while (node.next != null) { node.data.tick(); node = node.next; } } long fin = System.currentTimeMillis(); System.out.println("linkedList took: " + (fin - start)); }
Any guesses to how this performed? try ~5500 ms Like I said Java broke computer science, they must be doing some amazing compiler optimizations, how is it that node.next is slower than foo[499]. It boggles the mind. What's the take away here? When you can use a primitive array do so?
oh, just a note, yes I know ArrayList is backed by a array, and yes I tested a java.util.LinkedList as well, and it was the worst of the bunch.