Saturday, August 22, 2015

String tokenizer

Given a function which will tell if a string is word or not, and given a string, break it into multiple words. For example: "usaway" can be broken into "us away" or "usa way".

If we think each index of characters in the string is a node in a direct graph, then there is edge from node i to j if substring from index i to j is a word. Start from node at index 0, if we can find a path to node at index of last character, then we find a way to tokenize the string.  To do this we can use depth first search. 

The following is the code. Note if there are multiple ways to break a string into words, then the method will only return one way. 

public static String tokenize(String s){
String[] result = new String[1];
result[0] = "";
dfs(s, 0, result);
return result[0];
}

private static boolean dfs(String s, int index, String[] result){
if(index == s.length())
return true;
for(int i=index+1; i<=s.length(); i++){
String substring = s.substring(index, i);
if(isWord(substring) && dfs(s, i, result)){
result[0] =  substring + " " + result[0];
return true;
}
}
return false;
}


The second tokenize method will return a list. Each item in the return list is a list of indices which form a path from index at 0 to the last index of the string. 

public List<List<Integer>> tokenize2(String input){
/*
* map character index position to the next character index position if they are part of the solution
*/
Map<Integer, List<Integer>> store = new TreeMap<Integer, List<Integer>>();
dfs(input, store, 0);
List<List<Integer>> result = new ArrayList<List<Integer>>();
List<Integer> list = new ArrayList<Integer>();
list.add(0);
result.add(list);
boolean flag = true;
while(flag){
flag = false;
List<List<Integer>> copy = new ArrayList<List<Integer>>();
for(List<Integer> li : result){
int k = li.get(li.size()-1);
if(store.containsKey(k)){
flag = true;
for(int n : store.get(k)){
List<Integer> temp = new ArrayList<Integer>(li);
temp.add(n);
copy.add(temp);
}
}else
copy.add(li);
}
result = copy;
}
return result;
}
private Integer dfs(String s, Map<Integer, List<Integer>> store, int from){
int len = s.length();
if(from == len-1)//we find the leaf node
return from;
for(int i=from+1; i<len-1; i++){
if(isWord(s, from, i)){
Integer ret = dfs(s, store, i);
if(!store.containsKey(from)){
List<Integer> list = new ArrayList<Integer>();
store.put(from, list);
}
//the edge from-ret form the path which is the solution
store.get(from).add(ret);
return ret==null?null:from;
}
}
return null;
}

N-queens puzzle

The n-queens puzzle is a classic dynamic programming puzzle. The problem of placing n queens on an n×n chessboard such that no two queens attack each other.
Given an integer n, return all distinct solutions to the n-queens puzzle.
Each solution contains a distinct board configuration of the n-queens' placement, where 'Q' and '.' both indicate a queen and an empty space respectively.
For example,
There exist two distinct solutions to the 4-queens puzzle:
[
 [".Q..",  // Solution 1
  "...Q",
  "Q...",
  "..Q."],
 ["..Q.",  // Solution 2
  "Q...",
  "...Q",
  ".Q.."]
]


public static void findAllValidQueenPositions(int k){
/*
* an array indicate the queens' column index position on each row
* e.g. if k = 2, array of 0, 1 means queens are placed at 0,0 and 1,1 cells
*/
int[] pos = new int[k];
for(int r = 0; r<pos.length; r++)
pos[r] = -1;
move(pos, 0);
}

private static void move(int[] pos, int row){
int length = pos.length;
if(row == length){//we finish the whole chess board
System.out.println(Arrays.toString(pos));
return;
}

/*
* go through each col on current row
*/
for(int col = 0; col<length; col++){
if(isValidMove(pos, col, row)){
int[] copy = Arrays.copyOf(pos, length);
copy[row] = col;
move(copy, row+1);
}
}

}


private static boolean isValidMove(int[] pos, int col, int row) {
if(row==0)
return true;
else{
for(int r = 0; r<row; r++){
if(Math.abs(pos[r] - col) == Math.abs(r - row) || pos[r] == col)
return false;
}
}

return true;

}

Sunday, August 16, 2015

Longest Valid Parentheses

There is the longest valid parentheses problem definition.

We are given a string containing just parentheses of character ')' and '(' only, find the length of the longest valid (well-formed) parentheses substring. For example, for "(()", the longest valid parentheses substring is "()", output is 2. Anther example is ")()())", the longest valid parentheses substring is "()()", which has length of 4.

Approach I: dynamic programming

As usual many of the string related problems can be solved by dynamic problem. Given a substring from index at i to j, how can we find out the longest valid parentheses substring, which is not easy to construct the dynamic function. But we can convert this problem to another problem, that is, given substring S from index i to j, find out if all of the parentheses in the substring S form valid pair.

Here is how we construct isValid dynamic function based on index i and j, which is to break the substring(i,j) to two part: i,k and k+1, j. in which k = i+1, i+3, i+5, ..., j-2.

if(isValid(i,k) && isValid(k+1,j)) then isValid(i,j) = true.

Here is the java code:

public static int longestValidParentheses(String str){
int len = str.length();
boolean[][] isValid = new boolean[len][len];
int maxLen = 0;
for(int i =0; i< len-1; i++){
if(str.charAt(i)=='(' && str.charAt(i+1) == ')'){
isValid[i][i+1] = true;
maxLen = 2;
}
}
for(int step=3; step<len; step=step+2){ for(int i=0; i+step<len; i++){ int j = i+step;
if(isValid[i+1][j-1]&&isValid(str,i,j)) isValid[i][j] = true; for(int k=i+1; k<=j-2; k=k+2){ if(isValid[i][k] && isValid[k+1][j]) isValid[i][j] = true; if(isValid[i][j]){ maxLen = Math.max(maxLen, j - i + 1); break; } } } }
return maxLen;
}
private static boolean isValid(String str, int i, int j){
return str.charAt(i) == '(' && str.charAt(j) == ')';
}

Approach II: using stack

Stack is great data structure, the difference between stack and queue is that you update and query from the same place: the back, while in queue you have two places to update: insert at back, remove at front. That is the reason we can use stack is constructed data structure like max/min stack and max/min queue, in this case, the queue is built on top of stack, as well.

static class Node {
        int pos;
        char c;
        Node(int p, char c){
            this.pos = p;
            this.c = c;
        }
    }
    public static int maxValidParenthesis(String input){
        int max = 0;
        Stack<Node> s = new Stack<Node>();
        for(int i=0, j= input.length(); i<j; i++){
            char c = input.charAt(i);
            if(s.isEmpty() || c == '(' || s.peek().c == ')')
                s.push(new Node(i, c));
            else{
                s.pop();
                max = Math.max(i - (s.isEmpty()?-1:s.peek().pos), max);
            }
        }
        return max;
    }
 

Trie implementation and related string search problems

Trie is special purpose tree to store set of strings to solve a specific set of problems. In stead of storing the characters in the nodes, it is stored on the edges. The string can be constructed by concatenating all the characters found in the path.

For example it can store a dictionary contains a set of words, then it can answer the questions such as:

1. Does word S exist in the dictionary?
2. Given a string S, find all of the words which has the prefix equals to S. Think about design a data structure to support a auto-complete search box.

The followings are Java implementation of Trie to store the set of words from dictionary and support word search in the dictionary.


/*
* trie node definition
* it assumes the character are ascii chacaters
* each word in the dictionary is stored at leaf level
*/


public class TrieNode {
        private final static ALPHABET_SIZE = 256;
        TrieNode[] children = new SuffixNode[ALPHABET_SIZE];
boolean isLeaf;
String word;
}

/*
* suffix tree class
* it defines couple operations to support string search in a dictionary 
*/
public class Trie {
TrieNode root;
public Trie(){
root = new TrieNode();
}
/*
* insert a string into suffix tree
*/
public void insert(String s){
insertHelper(s, 0, root);
}
private void insertHelper(String s, int index, TrieNode node){
if(index == s.length()){
node.isLeaf = true;
node.word = s;
return;
}
TrieNode trieNode = node.children[s.charAt(index)];
if(trieNode==null)
trieNode = new TrieNode();
insertHelper(s, index+1, trieNode);
}
/*
* check if string is prefix of one of the words contained in the tree
*/
public boolean containsPrefix(String s){
return containsPrefix(s, 0, root);
}
private boolean containsPrefix(String s, int index, TrieNode node){
if(index == s.length())
return true;
TrieNode trieNode = node.children[s.charAt(index)];
if(trieNode != null)
return containsPrefix(s, index+1, trieNode);
else
return false;
}

/*
* check if the string is a word existing in the suffix tree
*/
public boolean isWord(String s){
return isWordHelper(s, 0, root);
}
private boolean isWordHelper(String s, int index, TrieNode node){
if(index == s.length())
return node.isLeaf; //the current node must be leaf node
TrieNode trieNode = node.children[s.charAt(index)];
if(trieNode != null)
return isWordHelper(s, index+1, trieNode);
else
return false;
}
/*
* find if the word is prefix of any words contained in the trie tree
* and return the last node which contains the word
*/
public TrieNode searchFor(String s){
return searchFor(s, 0, root);
}
private TrieNode searchFor(String s, int index, TrieNode node){
if(index == s.length())
return node;
TrieNode trieNode = node.children[s.charAt(index)];
if(trieNode != null)
return searchFor(s, index+1, trieNode);
return null;
}
}

Sunday, August 9, 2015

Generics Design Patterns: part II

Generics were introduced Java 5. First of all, recap of the benefit of Generics. 1) compile-time type safety; 2) eliminate type casting; 3) enable generics algorithm/design pattern implementation.

Continuing from previous blog: generic related design patterns. In this blog I list couple more generics-based design patterns. Please note many of the examples are from the book "Java Generics and Collections" with some modifications.

1. Function design pattern.

Function design pattern allows functions become object and passed around. In below code snippet, a generics interface was used to define a function which takes T type input, return R type and throws X which is subclass of Throwable.

public class FunctionQ {

    interface Function<T, R, X extends Throwable> {
        R apply(T t) throws X;
    }
   
    static <T, R, X extends Throwable>  List<R> applyAll(Function<T, R, X> f, List<T> list) throws X {
        List<R> result = new ArrayList<R>(list.size());
        for(T t : list){
            result.add(f.apply(t));
        }
        return result;
    }
}


2. Strategy design pattern

One of the common patterns in software development is there are a set of domain classes form parent-child class hierarchy, then there are a set of business logic classes which are parallel to the domain classes, also for parent-child class hierarchy and operate on those domain classes.

In the following snippet, we have abstract base class TaxPayer and Individual/Organization extends from this base class. Then we have different tax calculators implement different tax compute strategies for different classes.

public class StrategyQ {

    abstract class TaxPayer {
        float income;
        float getIncome(){
            return income;
        }
    }
   
    class Individual extends TaxPayer {
       
    }
   
    class Organization extends Individual {
        boolean isProfit;
       
    }
   
    abstract class TaxCalculator<P extends TaxPayer> {
        abstract double computeTax(P p);
    }
   
    class DefaultTaxCalculator<P extends TaxPayer> extends TaxCalculator<P>{
        private final double RATE = 0.4;
        @Override
        double computeTax(P p) {
            return p.getIncome() * RATE;
        }       
    }
   
    class CheatingTaxCalculator<P extends TaxPayer> extends TaxCalculator<P>{
        @Override
        double computeTax(P p) {
            return 0;
        }       
    }
   
    class OrganizationTaxCalculator extends DefaultTaxCalculator<Organization>{
        @Override
        double computeTax(Organization p) {
            return p.isProfit?super.computeTax(p):0;
        }   
    }
}

3. Visitor pattern

The visitor pattern allows the decoupling of domain class and its operations. One of the common usage of visitor pattern is to define the set of operations on tree node. Below is the code: 

abstract static class Tree<E> {
        /*
         * list of Tree operations
         */
        abstract public String toString();
        abstract public Double sum();
       
        /*
         * static creator methods
         */
        public static <E> Tree<E>  leaf(final E e){
            return new Tree<E>(){

                @Override
                public String toString() {
                    return e.toString();
                }

                @Override
                public Double sum() {
                    return ((Number)e).doubleValue();
                }
               
            };
        }
       
        public static <E> Tree<E> branch(final Tree<E> left, final Tree<E> right){
            return new Tree<E>(){
            @Override
            public String toString() {
                return "(" + left.toString() + "^" + right.toString() + ")";
            }

            @Override
            public Double sum() {
                return left.sum() + right.sum();
            }
            };
        }
       
    }

Tuesday, August 4, 2015

check if one string has anagram substring of the other string

Most of the anagram related question can be solved by using int array, each element in the array representing the number of appearances of each char in the string.

    /*
     * Given 2 strings, check if any one of them
     * has any anagram of the other string, as a substring of it.
     *
     * here without losing generality s1.length() > s2.length()
     *
     * time complexity: O(L2) + O(L1)*O(1) = O(L1)
     */
    public static boolean hasAnagramSubstring(String s1, String s2){
        assert s1.length() > s2.length();
        final int ALPHA_SIZE = 256;
        int[] store2 = new int[ALPHA_SIZE];
        for(char c : s2.toCharArray())
            store2[c] ++;

        char[] chars = s1.toCharArray();
        int[] store1 = new int[ALPHA_SIZE];
        for(int i=0; i<s2.length(); i++){
            store1[chars[i]]++;
        }

        //count the number of different characters in each string
        int numOfNonZero = 0;
        for(int i=0; i<ALPHA_SIZE; i++){
           //store the difference between two int array in store1
            store1[i] = store1[i] - store2[i];
            if(store1[i]!=0)
                numOfNonZero++;
        }

        if(numOfNonZero==0)
            return true;

        for(int i = s2.length(); i<s1.length(); i++){
            store1[chars[i]]++;
            if(store1[chars[i]]==0)
                numOfNonZero--;
            store1[chars[i-1]]--;
            if(store1[chars[i-1]]==0)
                numOfNonZero--;
            if(numOfNonZero==0)
                return true;
        }
        return false;

    }