Concurrency in Java: Hanging By A Thread Part 1

A major difficulty for new developers is the concept of concurrency. Initially, the concept of breaking tasks down to multiple processes seems trivial. After all, many single process tasks can be made faster by leveraging the power of our modern processors, right? Absolutely. Almost all production level software uses concurrency at some level to make things run smoothly. The difficulty comes when a developer requires multi-threaded access to an object or member variable, or perhaps when determining if two threads will become deadlocked. There are many decisions to be made when writing a multi-threaded task not to mention a multi-threaded application. Let’s look at a few examples.

A basic exercise multi-tasking may be to process large amounts of data such as arrays more quickly. Say I have a large array of integers, and I’d like to add one to each of array’s values. Simple enough:

public static void plusOne(int[] someArray){ for(int i = 0; i < someArray.length; i++){ someArray[i]++; } }

Of course this has acceptable performance, but maybe we can speed it up a bit.

public void plusOneConcurrent(int[] someArray){ ArrayList threads = new ArrayList<>(); int indexValue = (int) Math.ceil((double) someArray.length / (double) NUM_OF_WORKERS); for(int i = 0; i < NUM_OF_WORKERS; i++){ int starting = i * indexValue; int finishing = Math.min(someArray.length,starting + indexValue); IncrementWorker aWorker = new IncrementWorker(starting, finishing, someArray); threads.add(aWorker); aWorker.start(); }
for(Thread thread : threads){ thread.join(); } } private class IncrementWorker extends Thread { private int starting; private int finishing; private int[] someArray; public IncrementWorker (int starting, int finishing, int[] someArray) { this.starting = starting; this.finishing = finishing; this.someArray = someArray; } public void run() { if(starting >= someArray.length) return; for(int i = starting; i < finishing; i++){ someArray[i]++; } }

We get a small performance increase on my machine at certain sample sizes, but this won’t be true for all cases. Arrays smaller than a certain amount will actually take a performance large hit due to the higher over head of starting those threads. We can get around this by just calling the original, non-concurrent method for arrays smaller than this size. On my machine the concurrent performance begins at around an array size of 150,000 but slows back down. This is because the task we are doing is fairly trivial, so the overhead of creating these threads is slowing the performance down. Regardless, it’s a worthwhile exercise.

Well this is just fine and dandy, but what if I have to access previously calculated data to calculate my next result? Well this puts you in a bind. If you’re not able to find a closed-form solution for your recurrence relation, then you’ll have to program it out the hard way. Say you’re given an integer array with the first element initialized to a random number. The subsequent index’s value should be one plus the previous value. Let’s say you don’t know the closed form solution for this problem and instead have to code it. This works great for single threading, right?

public static void recurrence(int[] someArrayWhereFirstValueIsRandom){ for(int i = 1; i < someArray.length; i++){ someArray[i] = someArray[i-1] + 1; } }

You’re a wizard! This makes great sense. But what if you got a bug that compelled you to multi-thread this operation. Well you’ll have to do something tricky. You’ve probably heard of race conditions before, but it’s where you can’t determine if some task another task is dependent on will complete before the dependent task starts. If it’s not complete then you’ll either get incorrect output or throw a bunch of run time errors. We can’t guarantee index 3821332 will finish before 3821333 so we can’t start threads at random and let them run wild. Now you’ve probably realized that you could start threads that access the first value and initialized and copied it to the starting index of each thread’s portion. For the sake of learning, let’s pretend that you must practice with locks!

What are locks? Locks simply close access to a resource until the working thread is done, then the lock is released. This is similar to the concept of a mutex or semaphore. This can be helpful when you’re able to control when resources are locked. In java, you normally do not have to use explicit locks, but there is a Lock class provided. Instead probably will used Java’s concurrency keywords synchronized and volatile. Synchronized can be used in a method declaration, which will block access to the method until the current thread is done using said method.

public synchronized void incrementSomething(){something++;}

Volatile can also be useful. It can only be used with variables unlike synchronized. It makes a variable’s value visible to all threads when its value is changed, this may be useful when there are no additional concurrency requirements. This isn’t to be confused with Atomic classes in Java. Volatile guarantees that subsequent reads after a write will show that written value. This doesn’t work with operations such as ++ as ++ requires a read, modify, and write cycle. Atomic classes provide operations to guarantee thread safety, but they won’t be covered in this post.

We’ll cover the solution to this problem in the next post!

Concurrency in Java: Hanging By A Thread Part 1

Leave a Reply Cancel reply