Princeton Alogorithm COS226 Week
now in before lecture, we already have a concept about stack, queue, randomized queue. we also learn some sorting algorithm, selection sort, insertion sort, shell sort, merge sort, quick sort.
is there a data structure could combine them all.
i mean i want to get a element by order one by one.
this is today's topic. priority queue, below is the API

there are a lot of application used this data structure. in the later lecture, a famous shortest path called Dijkstra use it. and in operating system, load balancing use it. data compression will use it. event driven simulation will use it.
the find first kth problem is a traditional problem using priority queue. we have a log with a lot of transaction, please find the most biggest 3 transaction. we have not enough memory to store all of them. how to solve it?

we can maintain a priority queue with 3 items. if there is a item larger than the min of the queue, then replace the min.

how the priority queue implemented?
before introduce it, we want to introduce a concept of complete tree.

and binary heap could use this structure to work. we only keep a array. the left children is 2 * i. the right children is 2 * i + 1. when the root index is 1.

in heap there are two important operation.
promotion and demotion.
promotion
when child's key becomes larger key than its parent's key


demotion
parent's key becomes smaller than one of its children's


below is some important binary heap consideration
the key should be immutable because change key while they are on the PQ, will break the PQ characteristics.
in java we can use final to make the class immutable.

there are a lot of advantages of immutable class.
simplifies debugging, safer in presence of hostile code, simplifies concurrent programming, safe to use as key in priority queue or symbol table.
Heap sort
so now we get a data structure which could pop element by order in log n time.
so we get another sort algorithm with n lg n in place.
the basic plan for it is
create max-heap with all N keys, then repeatedly remove the max one.
a important thing is heap construction, we should build the heap from bottom to top, then we can get O (n) time to build the heap.


the most difference of heap sort it a inplace sorting algorithm with N log N worst-case
but the disadvantages is that
inner loop longer than quicksort, make poor use of cache memory, not stable.

Molecular dynamics simulation


the solution is event-driven simulation.
change state only when something happen.
we now only focus on times when collisions occur. maintain PQ of collision events, prioritized by time.
remove the min = get next collision

QUIZ
Q1

this question could be solved by two priority queue, one is maxheap, another is minheap.
because we need to find the median in constant time, we need to balance the elements evenly into two pq. the most important is that maxHeap.max <= minHeap.min
then the maxHeap max is the median.
the problem now is remove and insert, when insert we can insert to the max heap first, then check it still meet the invariant, the size diff is <= 1, and maxHeap max <= minHeap min, if not, do some balance action.
remove we can just pop the maxHeap max, then check invariant and do balance.
Q2

because we know the priority queue is the array strucutre, we can know the size of it, and random a number in size, then just the arr[idx]
delRandom, will think more, we need to delete a value in the middle. we random a position and then swap with the array last one, the important thing here is we don't know the key is should promotion or demotion, so we need try both.
Q3

version 1 could be prepare all the a^3 + b^3 pair result, then sort them.
we can know the result are same is neighbor, if there are more than 1 number in the list, this is taxicab number
version 2 , we could use priority queue. the record 1 ~ n-1's pair, pair contains two number, one is a and another is b, pair 1's first number always is 1, second number can be combined with any number large than 1. same as others pair k.
so now we pop the min, check it is same as previous, if it is, we find a tax number. if not, we increase second number in pair.
Project 8 Puzzle

this project have no bonus, but the 100% score code when run the PuzzleChecker, it will cause
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at Board.neighbor(Board.java:126)
at Board.neighbors(Board.java:139)
at Solver.<init>(Solver.java:57)
at PuzzleChecker.main(PuzzleChecker.java:52)
the project with Puzzle check download link
https://coursera.cs.princeton.edu/algs4/assignments/8puzzle/8puzzle.zip
so i do the improvement below
Use a 1d array instead of a 2d array (as suggested above).
Cache either the Manhattan distance of a board (or Manhattan priority of a search node). It is waste to recompute the same quantity over and over again.
Exploit the fact that the difference in Manhattan distance between a board and a neighbor is either −1 or +1.
Use only one PQ to run the A* algorithm on the initial board and its twin.
When two search nodes have the same Manhattan priority, you can break ties however you want, e.g., by comparing either the Hamming or Manhattan distances of the two boards.
Use a parity argument to determine whether a puzzle is unsolvable (instead of two synchronous A* searches). However, this will either break the API or will require a fragile dependence on the toString() method, so don't do it.
after improved above suggestion, the speed is very faster, we can solve all the example except 4x4-50, 4x4-78, 4x4-80 with in 75 seconds.

the good blog for it is
https://medium.com/breathe-publication/solving-the-15-puzzle-e7e60a3d9782
how to check solvable in n^2
http://www.cs.bham.ac.uk/~mdr/teaching/modules04/java2/TilesSolvability.html
if u want to improve more please reference this github.
https://blog.csdn.net/Wood_Du/article/details/89666736
https://github.com/jDramaix/SlidingPuzzle
course quiz and project : https://github.com/yixuaz/algorithm4-princeton-cos226