Java Concurrency Notes
Process for JVM to Run a Program
- The JVM (Java Virtual Machine) starts the main thread.
- The JVM verifies the existence of the main() method and initializes the class.
- The JVM uses preemptive scheduling to determine which thread runs first.
- The JVM terminates itself when all user threads finish their execution.
- If only daemon threads are still running, the JVM terminates them and then shuts itself down.
Main Thread
The main thread is the thread that begins running immediately when a Java program starts up.
main() method vs main thread
For each program, a main thread is created by the JVM. The main thread first verifies the existence of the main() method, and then it initializes the class.
Daemon Thread
A daemon thread is a low-priority thread that runs in the background to perform tasks such as garbage collection.
Daemon vs User Threads
Priority: a daemon thread cannot prevent the JVM from exiting when all the user threads finish their execution. When the only remaining threads in a process are daemon threads, the JVM exits.
Usage: daemon threads provide background supporting services to user threads.
Preemptive scheduling
The scheduler chooses the thread that has the highest priority and is in the Runnable state to execute.
Lifecycle of a Thread in Java
Six States
- New
- Runnable
- Blocked
- Waiting
- Timed Waiting
- Terminated
Methods in Thread
run()
We override run() (declared in the Thread class or the Runnable interface) to define the code that should run concurrently.
run() cannot return any result; its return type must be void.
If you need the task to return a result, implement the Callable interface instead of Runnable (see the example under Callable vs Runnable below).
Why do we need start()?
start(): causes the JVM to execute the run() method of this thread. start() creates a separate call stack for the thread, so it can maintain its own local variables and execution state. A sketch of the difference follows.
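A minimal sketch of that difference (class name is illustrative): calling run() directly executes the task on the caller's call stack, while start() makes the JVM create a new call stack and invoke run() there.
```java
public class StartVsRun {
    public static void main(String[] args) {
        Runnable task = () ->
                System.out.println("run() executed on: " + Thread.currentThread().getName());

        task.run();               // runs on "main" -- no new call stack is created
        new Thread(task).start(); // the JVM creates a new call stack and runs the task there
    }
}
```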
yield() vs sleep()
yield(): the current thread hints to the scheduler that other threads with the same or higher priority may run. If such a thread exists, one of them is scheduled; otherwise, the current thread continues running.
sleep(): the current thread definitely stops executing for the given amount of time; if no other thread or process needs to run, the CPU stays idle.
join() vs wait()
join(): blocks the current thread until the thread whose join() was called has terminated.
wait(): this method is inherited from java.lang.Object and is used inside synchronized methods or blocks. It blocks the current thread until notify() or notifyAll() is called from some other synchronized method or block on the same object.
Callable vs Runnable
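The comparison from the run() section above boils down to the return type: Runnable.run() returns void, while Callable.call() returns a value that can be retrieved through a Future. A minimal sketch, assuming a single-threaded executor:
```java
import java.util.concurrent.*;

public class CallableVsRunnable {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        Runnable printTask = () -> System.out.println("Runnable: no result");
        Callable<Integer> sumTask = () -> 1 + 2;                 // Callable returns a value

        pool.submit(printTask);                                  // Future<?> whose get() yields null
        Future<Integer> result = pool.submit(sumTask);
        System.out.println("Callable result: " + result.get()); // blocks until 3 is available

        pool.shutdown();
    }
}
```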
Synchronized Method and Block
Critical section: a code segment that can be accessed by two or more threads and contains shared variables whose consistency needs to be maintained.
Race condition: two or more threads access shared data at the same time and at least one of them tries to modify it, so the result depends on the timing of execution.
Use the synchronized keyword on a method or a code block to ensure the critical section is accessed by only one thread at a time, as in the sketch below.
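A minimal sketch of the two forms (names are illustrative): a synchronized instance method guards the whole method body, while a synchronized block guards only the critical section, here using the same monitor (this).
```java
public class Counter {
    private int count = 0;

    // synchronized method: the whole method is the critical section, the monitor is `this`
    public synchronized void increment() {
        count++;
    }

    // synchronized block: only the critical section is guarded, using the same monitor `this`
    public void incrementWithBlock() {
        // ... non-critical work could happen here without holding the lock ...
        synchronized (this) {
            count++;
        }
    }

    public synchronized int get() {
        return count;
    }
}
```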
Fail-fast
While iterating, a fail-fast iterator accesses the collection's contents directly and keeps a modCount variable. If the collection is structurally modified during iteration, modCount changes. Before each hasNext()/next() call, the iterator checks whether modCount still equals the expected modCount; if it does, iteration continues, otherwise a ConcurrentModificationException is thrown and iteration stops.
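A minimal sketch (list contents are illustrative) that triggers this check: structurally modifying an ArrayList while a for-each loop is iterating it.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FailFastDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));

        for (String name : names) {      // the for-each loop uses the list's fail-fast iterator
            if (name.equals("a")) {
                names.remove(name);      // structural modification changes modCount ...
            }                            // ... so the next call to next() throws ConcurrentModificationException
        }
    }
}
```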
wait(), notify(), and notifyAll()
wait(), notify(), and notifyAll() are used to implement synchronization and interaction between threads in Java. They only work inside a synchronized method or synchronized block.
When a thread has to wait until the values of some resources have been changed by other threads, it calls wait(). It is then blocked from running until another thread calls notify() or notifyAll() on the same monitor, as in the sketch below.
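A minimal guarded-block sketch (names are illustrative): a waiter thread waits on a monitor until the main thread changes the shared state and calls notify().
```java
public class WaitNotifyDemo {
    private static final Object monitor = new Object();
    private static boolean ready = false;

    public static void main(String[] args) {
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                while (!ready) {                 // guard against spurious wakeups
                    try {
                        monitor.wait();          // releases the monitor and blocks
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                System.out.println("resource is ready");
            }
        });
        waiter.start();

        synchronized (monitor) {
            ready = true;
            monitor.notify();                    // wakes up the thread waiting on the same monitor
        }
    }
}
```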
Reentrant
Synchronized blocks in Java are reentrant.
Reentrance means that if a thread already holds the lock on a monitor object, it has access to all blocks synchronized on the same monitor object.
Reentrance is implemented by using an int counter instead of a boolean flag to represent the status of the lock. Every time a thread re-enters the lock, the counter is incremented by one; for every unlock, it is decremented by one. When the counter reaches 0, the resource is unlocked. The sketch below shows reentrance with nested synchronized methods.
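A minimal sketch, with illustrative class and method names: outer() already holds the monitor on this, so calling inner() re-enters the same lock instead of blocking.
```java
public class ReentranceDemo {
    public synchronized void outer() {
        System.out.println("outer acquired the monitor");
        inner();   // re-enters the monitor held by the same thread; the hold count goes 1 -> 2
    }

    public synchronized void inner() {
        System.out.println("inner re-acquired the same monitor");
    }

    public static void main(String[] args) {
        new ReentranceDemo().outer();   // prints both lines; a non-reentrant lock would block itself here
    }
}
```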
Advantages:
- Multithreading: since Java is a multithreaded language, synchronization is a good way to achieve mutual exclusion on shared resource(s).
- Instance and Static Methods: a synchronized instance method and a synchronized static method can execute concurrently, because they lock different objects (the instance and the Class object, respectively).
Limitations:
- Concurrency Limitations: Java synchronization does not allow concurrent reads.
- Decreased Efficiency: synchronized methods run more slowly and can degrade performance.
Thread pool
For highly concurrent programs, if we created a new thread to handle each request, we would spend more time and system resources creating and destroying threads than processing the actual requests, and the system might even run out of memory.
A thread pool solves this problem by reusing previously created threads to execute the current tasks.
Structure of the thread pool
Keywords
- corePoolSize: the number of core threads in the pool
- maximumPoolSize: the maximum number of threads in the pool
- keepAliveTime: the maximum time an idle extra thread may stay alive when the number of active threads exceeds the core pool size
- unit: the time unit of keepAliveTime
- workQueue: the queue that holds pending tasks
- handler: the handler for tasks that exceed the thread limit and the queue capacity
- RejectedExecutionHandler (saturation policy): the action to take when a new task arrives while the pool is saturated; the built-in policies are:
- AbortPolicy: throw an exception directly (the default).
- CallerRunsPolicy: run the task on the thread that submitted it.
- DiscardOldestPolicy: discard the oldest task in the queue and execute the current task.
- DiscardPolicy: discard the task without handling it.
1. If the current number of active threads is less than corePoolSize (a core thread is idle or has not been created yet), create a new worker thread to execute the task; otherwise go to the next step.
2. If the work queue is not full, store the newly submitted task in the work queue; otherwise go to the next step.
3. If the current number of active threads is less than maximumPoolSize, create a new worker thread to execute the task; otherwise hand the task over to the saturation policy.
To use a thread pool, we first create an ExecutorService object and submit a set of tasks implementing the Runnable interface to it. Each submitted task then goes through the steps above: it runs on a core thread if one is available, otherwise it is queued, otherwise it runs on an extra thread up to the maximum pool size, and finally it is handled by the saturation policy, as in the sketch below.
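A minimal sketch wiring the parameters listed above into a ThreadPoolExecutor (the sizes, queue capacity, and task count are illustrative):
```java
import java.util.concurrent.*;

public class PoolDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                    // corePoolSize
                4,                                    // maximumPoolSize
                60L, TimeUnit.SECONDS,                // keepAliveTime + unit for excess idle threads
                new ArrayBlockingQueue<>(10),         // workQueue for pending tasks
                new ThreadPoolExecutor.AbortPolicy()  // saturation policy: reject with an exception
        );

        for (int i = 0; i < 20; i++) {
            final int id = i;
            try {
                pool.execute(() ->
                        System.out.println("task " + id + " on " + Thread.currentThread().getName()));
            } catch (RejectedExecutionException e) {
                // thrown by AbortPolicy once the queue and the extra threads are exhausted
                System.out.println("task " + id + " rejected by the saturation policy");
            }
        }
        pool.shutdown();
    }
}
```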
Risks in using Thread Pools
- Deadlock: occurs when every executing thread is waiting for a result from a blocked task that sits in the queue and cannot run because no worker thread is available. Therefore, do not put tasks that wait for each other's results into the same thread pool.
- Thread Leakage: occurs when a thread is removed from the pool to execute a task but is not returned to it when the task completes, so the pool loses threads that could be reused. Typical causes: the task throws an exception that the pool class does not catch, or the task performs a very long-lived operation.
- Resource Thrashing: if the thread pool size is very large, time is wasted in context switching between threads. Having more threads than the optimal number may also cause a starvation problem, leading to resource thrashing.
The optimum size of the thread pool
To determine the size of the thread pool, we need to know whether the tasks spend more time on CPU or on IO. For IO-heavy tasks, threads spend most of their time waiting for IO results, so we can increase the size of the thread pool to keep the CPU busy. For CPU-heavy tasks, all threads are busy computing on the CPU, so extra threads do not help.
CPU-heavy: size of the thread pool = number of CPU cores + 1 (the +1 makes use of occasional waiting periods)
IO-heavy: size of the thread pool = number of CPU cores * 2 + 1
Best thread pool size = (thread waiting time / thread running time + 1) * number of CPU cores
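For example, assuming (hypothetically) an 8-core machine running IO-heavy tasks that spend about 4 ms waiting for every 1 ms of CPU time, the last formula gives (4 / 1 + 1) * 8 = 40 threads, whereas a purely CPU-heavy workload on the same machine would get 8 + 1 = 9 threads from the first formula.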
Problems to solve in concurrency
Starvation
A thread is not granted CPU time because other threads grab it all.
Causes
- Threads with high priority swallow all CPU time from threads with lower priority.
- Threads are blocked indefinitely waiting to enter a synchronized block, because other threads are constantly allowed access before it.
- Threads waiting on an object (having called wait() on it) remain waiting indefinitely because other threads are constantly awakened instead.
Solution
Give all threads a fair chance to execute. Using locks (which can be configured to be fair) instead of synchronized blocks can improve fairness.
Deadlock -- not a lock, but a problem
Threads are waiting for each other to release shared resources that they are holding.
To avoid a deadlock condition (illustrated in the sketch after this list), we should:
Avoid Nested Locks: deadlock mainly happens when a thread needs several locks at once; avoid acquiring another lock while already holding one.
Avoid Unnecessary Locks: lock only those members that actually require it.
Use Thread.join(): call join() with the maximum time you expect the execution to take, so a thread does not wait forever on a deadlocked thread.
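A minimal sketch of the nested-lock case (lock names are illustrative): two threads acquire the same two monitors in opposite order and end up waiting for each other; acquiring the locks in one fixed order avoids the deadlock.
```java
public class DeadlockDemo {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        new Thread(() -> {
            synchronized (lockA) {                 // thread 1 holds A ...
                sleep(100);
                synchronized (lockB) {             // ... and waits for B
                    System.out.println("thread 1 got both locks");
                }
            }
        }).start();

        new Thread(() -> {
            synchronized (lockB) {                 // thread 2 holds B ...
                sleep(100);
                synchronized (lockA) {             // ... and waits for A -> deadlock
                    System.out.println("thread 2 got both locks");
                }
            }
        }).start();
        // Fix: make both threads acquire lockA before lockB (a consistent lock ordering).
    }

    private static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```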
Locks
A lock can be fair or unfair.
With a fair lock, a thread that wants to acquire the lock checks the waiting queue: only the thread at the head of the queue is woken up by the unlock() of the currently running thread.
An unfair lock simply tries to grab the lock every time the lock is released, regardless of the queue.
Reentrant Lock
Similar to a synchronized block, a ReentrantLock allows a thread to lock a resource more than once. Every time the thread re-enters the lock, the hold count is incremented by one; for every unlock request, it is decremented by one. When the hold count reaches 0, the resource is unlocked.
Reentrant Lock vs synchronized block
- The synchronized keyword does not offer any waiting-queue mechanism: after one thread exits, any thread may take the lock, which can starve some other thread for a very long time. ReentrantLock solves this problem by offering a fairness parameter, which ensures that the thread that has waited longest in the queue gets the lock when it is released. However, the fairness parameter decreases the throughput of the program, so we need to choose between synchronized and ReentrantLock based on our use case.
- ReentrantLock can wake up a specific thread precisely with the help of its Condition class; a synchronized block can only notify() a random waiting thread or notifyAll().
- A thread blocked entering a synchronized block cannot be interrupted, while a thread acquiring a ReentrantLock via lock.lockInterruptibly() can be interrupted by thread.interrupt() while it waits for the lock.
- A ReentrantLock must be unlocked manually, usually in a finally block (see the sketch after this list).
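A minimal sketch of these points (field names are illustrative): a fair ReentrantLock, a lock()/unlock() pair in try-finally, and an interruptible acquisition.
```java
import java.util.concurrent.locks.ReentrantLock;

public class ReentrantLockDemo {
    // true enables the fairness policy: the longest-waiting thread acquires the lock next
    private final ReentrantLock lock = new ReentrantLock(true);
    private int count = 0;

    public void increment() {
        lock.lock();                  // unlike synchronized, unlocking is our responsibility
        try {
            count++;
        } finally {
            lock.unlock();            // always release in finally
        }
    }

    public void incrementInterruptibly() throws InterruptedException {
        lock.lockInterruptibly();     // waiting here can be cancelled by thread.interrupt()
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }
}
```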
Spin Lock
A thread trying to acquire a spin lock is not blocked; it keeps attempting to acquire the lock in a loop.
Advantages: avoids the cost of context switching between threads.
Limitations: the busy-wait loop consumes CPU.
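There is no SpinLock class in the JDK; a minimal sketch of the idea using AtomicBoolean and compareAndSet:
```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // busy-wait (spin) until the CAS from false to true succeeds; the thread is never parked
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait();   // JDK 9+ hint to the CPU that we are in a spin loop
        }
    }

    public void unlock() {
        locked.set(false);
    }
}
```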
ReadWriteLock / ReentrantReadWriteLock
In Java, ReadWriteLock is an interface, and ReentrantReadWriteLock is a class that implements it.
A thread acquiring the read lock succeeds if no thread is writing and no thread has requested write access. If writes were not given priority, the writing threads could starve.
A thread acquiring the write lock succeeds if no threads are reading or writing.
Similar to ReentrantLock, it has a fair mode that is not enabled by default, and a thread waiting for the lock can be interrupted.
Reentrance: this lock allows both readers and writers to reacquire read or write locks in the style of a ReentrantLock. A writer can also acquire the read lock, but not vice versa.
Lock Downgrading: Reentrancy also allows downgrading from the write lock to a read lock, by acquiring the write lock, then the read lock and then releasing the write lock. However, upgrading from a read lock to the write lock is not possible.
It is useful when there is a large amount of read/write traffic on shared data and reads greatly outnumber writes, as in the cache sketch below.
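A minimal cache-style sketch of ReentrantReadWriteLock (class and field names are illustrative), including the downgrade described above: acquire the write lock, then the read lock, then release the write lock.
```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SimpleCache {
    private final Map<String, String> data = new HashMap<>();
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    public String get(String key) {
        rwLock.readLock().lock();            // many readers may hold this at once
        try {
            return data.get(key);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public String put(String key, String value) {
        rwLock.writeLock().lock();           // exclusive: no readers or writers allowed
        try {
            data.put(key, value);
            rwLock.readLock().lock();        // downgrade: take the read lock while still writing
        } finally {
            rwLock.writeLock().unlock();     // release the write lock, keep the read lock
        }
        try {
            return data.get(key);            // read under the downgraded read lock
        } finally {
            rwLock.readLock().unlock();
        }
    }
}
```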
Other Useful Components
volatile
In the Java Memory Model (JMM), there is a main memory plus a working memory for each thread. A thread uses the variable values in its working memory for its operations, and the working memory is synchronized with main memory only periodically. Unfortunately, in highly concurrent situations a variable can be updated more frequently than it is synchronized, so threads may not see the latest values for their operations.
Visibility
Adding the volatile keyword to a variable solves this problem: all writes to a volatile variable are flushed to main memory immediately, which ensures the visibility of the variable.
Ordering
volatile also forbids reordering of reads and writes of this variable at compile time. When a thread wants to read the value, it discards the copy in its working memory and reads the variable from main memory. This is one of the well-known happens-before rules: a write to a volatile variable happens-before any subsequent read of it.
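A minimal sketch of visibility and the happens-before guarantee (names are illustrative): the write of running = false happens-before the worker's next read of the flag, so the loop is guaranteed to stop; without volatile, the worker might keep reading a stale value from its working memory.
```java
public class VolatileStopFlag {
    // volatile: every write is flushed to main memory and every read comes from main memory
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy work; with a non-volatile flag this loop might never observe the change
            }
            System.out.println("worker observed running == false and stopped");
        });
        worker.start();

        Thread.sleep(100);
        running = false;      // happens-before the worker's next read of the flag
        worker.join();
    }
}
```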
Atomicity
Unfortunately, volatile only maintains atomicity for single-step operations. For example, it protects the atomicity of i = 10 and System.out.println(i), but cannot guarantee the correctness of i++ or i = i + 2. If thread A is suspended after it reads i but before it writes the incremented value back, thread B has a chance to update the value of i in main memory in the meantime, and the outcome that A later writes back to main memory will be wrong.
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicityTest {
    public int count = 0;                                        // plain int: no visibility or atomicity guarantees
    public volatile int countVolatile = 0;                       // volatile: visibility only, ++ is still not atomic
    public AtomicInteger atomicInteger = new AtomicInteger(0);   // atomic: incrementAndGet() is a single atomic operation

    public void increase() {
        count++;
        countVolatile++;
        atomicInteger.incrementAndGet();
    }

    public static void main(String[] args) {
        AtomicityTest test = new AtomicityTest();
        Thread[] tlist = new Thread[10];
        for (int i = 0; i < tlist.length; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    test.increase();
                }
            });
            tlist[i] = t;
            tlist[i].start();
        }
        try {
            for (int i = 0; i < tlist.length; i++) {
                tlist[i].join();                                 // wait for all 10 threads to finish
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(test.count);                          // usually < 10000: lost updates
        System.out.println(test.countVolatile);                  // usually < 10000: volatile does not make ++ atomic
        System.out.println(test.atomicInteger.get());            // always 10000
    }
}
Result
9227
9363
10000
If you need atomicity for compound read-modify-write operations such as i++, you should use AtomicInteger.
AtomicBoolean / AtomicInteger
Similar to volatile, but they also guarantee atomicity by implementing the common operations of the type (e.g. i++ via incrementAndGet()) as atomic compare-and-swap (CAS) operations.
For these simple use cases, volatile and the atomic classes achieve the same effect as a synchronized block or a lock, but they are more lightweight and have lower cost.
Mutex
A mutex allows only one thread to access the resource at a time.
Semaphore
A Semaphore controls access to a resource through a counter of permits: acquire() takes a permit (blocking if none is available) and release() returns one. A Semaphore can be released by a thread other than the one that acquired it (semaphores have no notion of ownership), which can be useful in some specialized contexts, such as deadlock recovery.
Mutex vs Semaphore
- A mutex allows only one thread to access the resource at a time. A semaphore controls, through an integer counter of permits, how many threads can access the resource at a time.
- A mutex has no fairness or ordering guarantees. The fairness of a Semaphore can be controlled by a boolean constructor parameter, as in the sketch below.
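A minimal sketch of a counting semaphore limiting concurrent access (the permit count and task count are illustrative); a Semaphore with a single permit behaves like a mutex, except that any thread may release it.
```java
import java.util.concurrent.Semaphore;

public class SemaphoreDemo {
    // at most 3 threads may be inside the guarded section at once; true enables fair ordering
    private static final Semaphore permits = new Semaphore(3, true);

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    permits.acquire();                       // blocks when no permit is available
                    System.out.println("task " + id + " acquired a permit");
                    Thread.sleep(50);                        // simulate work on the shared resource
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    permits.release();                       // return the permit (any thread could do this)
                }
            }).start();
        }
    }
}
```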
CountDownLatch
A CountDownLatch lets one or more threads wait (await()) until a count, decremented by countDown(), reaches zero.
CyclicBarrier
A CyclicBarrier lets a set of threads wait for each other (await()) at a common barrier point; it can be reused after the waiting threads are released.
await() of CountDownLatch/CyclicBarrier vs join()
With join(), the current thread can only resume after the thread whose join() was called has terminated. A thread can call countDown() at any point, however, so a thread blocked in await() does not necessarily have to wait for the threads that call countDown() to finish.
CountDownLatch vs CyclicBarrier
- A CountDownLatch can be used only once in a program (until its count reaches 0). A CyclicBarrier can be used again and again once all the threads at the barrier have been released and the barrier has been reset.
- With a CountDownLatch, one thread waits for other threads to finish some work. With a CyclicBarrier, all threads wait for each other to arrive at some stage (the barrier); after the barrier is broken, they can continue together from this shared stage to do further work, as in the sketch below. By default, no thread is guaranteed to be the last one to arrive at the barrier.
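A minimal sketch of both (counts are illustrative): the main thread awaits a CountDownLatch until three workers have called countDown(), while a CyclicBarrier makes the three workers wait for each other before continuing.
```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;

public class LatchVsBarrier {
    public static void main(String[] args) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(3);
        CyclicBarrier barrier = new CyclicBarrier(3,
                () -> System.out.println("all workers reached the barrier"));

        for (int i = 0; i < 3; i++) {
            final int id = i;
            new Thread(() -> {
                System.out.println("worker " + id + " finished phase 1");
                latch.countDown();               // the main thread is released once the count hits 0
                try {
                    barrier.await();             // wait for the other two workers, then continue
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("worker " + id + " continues after the barrier");
            }).start();
        }

        latch.await();                           // blocks until countDown() has been called 3 times
        System.out.println("main thread resumes after the latch opens");
    }
}
```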
Blocking Queue
A blocking queue is a queue that blocks when you try to dequeue from it while it is empty, or when you try to enqueue items while it is already full. It is commonly used in a producer-consumer model, as in the sketch below.
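A minimal producer-consumer sketch with ArrayBlockingQueue (the capacity and item count are illustrative): put() blocks while the queue is full, take() blocks while it is empty.
```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumer {
    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(5);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    queue.put(i);                                     // blocks while the queue is full
                    System.out.println("produced " + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) {
                    System.out.println("consumed " + queue.take());   // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}
```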
Reference
GeeksforGeeks - Multithreading in Java
GeeksforGeeks - Synchronized in Java
GeeksforGeeks - Lifecycle and States of a Thread in Java
GeeksforGeeks - Inter-thread Communication in Java
GeeksforGeeks - Method and Block Synchronization in Java
GeeksforGeeks - Deadlock in Java Multithreading
Jenkov - Starvation and Fairness
Jenkov - Lock in Java
GeeksforGeeks - Reentrant Lock in Java
ReentrantLock 精确唤醒线程
JavaSE12 Docs - ReentrantWriteReadLock
GeeksforGeeks - Thread Pools in Java
JAVA线程池原理详解一
线程池数目的确定
GeeksforGeeks - Semaphore in Java
JavaSE12 Docs - Semaphore
Mutex
GeeksforGeeks - CountDownLatch in Java
GeeksforGeeks - Java.util.concurrent.CyclicBarrier in Java
GeeksforGeeks - Callable and Future in Java
CountDownLatch和CyclicBarrier的使用和区别
Jenkov - Blocking Queues
面试官最爱的volatile关键字
GeeksforGeeks - What does start() function do in multithreading in Java?
GeeksforGeeks - Joining Threads in Java
GeeksforGeeks - Java Concurrency – yield(), sleep() and join() methods
GeeksforGeeks - Main Thread in Java
GeeksforGeeks - Daemon thread in Java