Princeton Alogorithm COS226 Week

2019-09-28 本文已影响0人西部小笼包

now in before lecture, we already have a concept about stack, queue, randomized queue. we also learn some sorting algorithm, selection sort, insertion sort, shell sort, merge sort, quick sort.
is there a data structure could combine them all.
i mean i want to get a element by order one by one.
this is today's topic. priority queue, below is the API

image.png

there are a lot of application used this data structure. in the later lecture, a famous shortest path called Dijkstra use it. and in operating system, load balancing use it. data compression will use it. event driven simulation will use it.

the find first kth problem is a traditional problem using priority queue. we have a log with a lot of transaction, please find the most biggest 3 transaction. we have not enough memory to store all of them. how to solve it?

image.png

we can maintain a priority queue with 3 items. if there is a item larger than the min of the queue, then replace the min.

image.png

how the priority queue implemented?
before introduce it, we want to introduce a concept of complete tree.

image.png

and binary heap could use this structure to work. we only keep a array. the left children is 2 * i. the right children is 2 * i + 1. when the root index is 1.

image.png

in heap there are two important operation.
promotion and demotion.

promotion

when child's key becomes larger key than its parent's key

image.png

demotion

parent's key becomes smaller than one of its children's

image.png

below is some important binary heap consideration

the key should be immutable because change key while they are on the PQ, will break the PQ characteristics.

in java we can use final to make the class immutable.

image.png

there are a lot of advantages of immutable class.
simplifies debugging, safer in presence of hostile code, simplifies concurrent programming, safe to use as key in priority queue or symbol table.

Heap sort

so now we get a data structure which could pop element by order in log n time.
so we get another sort algorithm with n lg n in place.

the basic plan for it is
create max-heap with all N keys, then repeatedly remove the max one.

a important thing is heap construction, we should build the heap from bottom to top, then we can get O (n) time to build the heap.

image.png

the most difference of heap sort it a inplace sorting algorithm with N log N worst-case
but the disadvantages is that
inner loop longer than quicksort, make poor use of cache memory, not stable.

image.png

Molecular dynamics simulation

image.png

the solution is event-driven simulation.
change state only when something happen.
we now only focus on times when collisions occur. maintain PQ of collision events, prioritized by time.
remove the min = get next collision

image.png

QUIZ

Q1

image.png

this question could be solved by two priority queue, one is maxheap, another is minheap.
because we need to find the median in constant time, we need to balance the elements evenly into two pq. the most important is that maxHeap.max <= minHeap.min
then the maxHeap max is the median.
the problem now is remove and insert, when insert we can insert to the max heap first, then check it still meet the invariant, the size diff is <= 1, and maxHeap max <= minHeap min, if not, do some balance action.
remove we can just pop the maxHeap max, then check invariant and do balance.

Q2

image.png

because we know the priority queue is the array strucutre, we can know the size of it, and random a number in size, then just the arr[idx]
delRandom, will think more, we need to delete a value in the middle. we random a position and then swap with the array last one, the important thing here is we don't know the key is should promotion or demotion, so we need try both.

Q3

image.png

version 1 could be prepare all the a^3 + b^3 pair result, then sort them.
we can know the result are same is neighbor, if there are more than 1 number in the list, this is taxicab number

version 2 , we could use priority queue. the record 1 ~ n-1's pair, pair contains two number, one is a and another is b, pair 1's first number always is 1, second number can be combined with any number large than 1. same as others pair k.
so now we pop the min, check it is same as previous, if it is, we find a tax number. if not, we increase second number in pair.

Project 8 Puzzle

image.png

this project have no bonus, but the 100% score code when run the PuzzleChecker, it will cause

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at Board.neighbor(Board.java:126)
        at Board.neighbors(Board.java:139)
        at Solver.<init>(Solver.java:57)
        at PuzzleChecker.main(PuzzleChecker.java:52)

the project with Puzzle check download link
https://coursera.cs.princeton.edu/algs4/assignments/8puzzle/8puzzle.zip

so i do the improvement below
Use a 1d array instead of a 2d array (as suggested above).

Cache either the Manhattan distance of a board (or Manhattan priority of a search node). It is waste to recompute the same quantity over and over again.

Exploit the fact that the difference in Manhattan distance between a board and a neighbor is either −1 or +1.

Use only one PQ to run the A* algorithm on the initial board and its twin.
When two search nodes have the same Manhattan priority, you can break ties however you want, e.g., by comparing either the Hamming or Manhattan distances of the two boards.

Use a parity argument to determine whether a puzzle is unsolvable (instead of two synchronous A* searches). However, this will either break the API or will require a fragile dependence on the toString() method, so don't do it.

after improved above suggestion, the speed is very faster, we can solve all the example except 4x4-50, 4x4-78, 4x4-80 with in 75 seconds.

image.png

Princeton Alogorithm COS226 Week

promotion

demotion

Heap sort

Molecular dynamics simulation

QUIZ

Q1

Q2

Q3

Project 8 Puzzle

the good blog for it is

how to check solvable in n^2

if u want to improve more please reference this github.

猜你喜欢

热点阅读