Labels

Wednesday, March 4, 2015

Word Break II

Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word.
Return all such possible sentences.
For example, given
s = "catsanddog",
dict = ["cat", "cats", "and", "sand", "dog"].
A solution is ["cats and dog", "cat sand dog"].

Naive Way:  Given word-break has both DP and DFS solutions. I think this question can also be solved by both methods.

First thinking about DP.
Once I got the 1D array denoting whether S[0...i] can be reconstructed by words in dict. After knowing that s[0...i] is re-constructable, need to search for next opt[j] = true (j>i) and dict.contains(s.substring(i,j+1)) = true. A recursive method will be helpful.

This backtrace method didn't get accepted by OJ for TLE. The time consuming for the largest case is really huge. But we still has another backtrace method-> starting from behind rather than starting from beginning.

 public class Solution {  
   public List<String> wordBreak(String s, Set<String> dict) {  
     List<String> rslt = new ArrayList<String>();  
     // edge case   
     if(s==null || s.length()==0) return rslt;   
     // DP   
     boolean opt[] = new boolean[s.length()+1];   
     // base case   
     opt[0] = true;   
     // iteration   
     for(int i = 1;i <= s.length();i++)   
       for(int t = 0;t < i;t++)   
         opt[i] = opt[i] || opt[t] && dict.contains(s.substring(t,i));   
     backtrace(s, opt, 0, dict, new String(), rslt);  
     return rslt;  
   }  
   private void backtrace(String s, boolean[] opt, int index, Set<String> dict, String path, List<String> rslt){  
     // base case  
     if(index == s.length()){  
       rslt.add(path);  
       return;  
     }  
     // recursion  
     StringBuilder str = new StringBuilder();  
     for(int i = index+1; i < opt.length; i++){  
       str.append(s.charAt(i-1));  
       if(opt[i] && dict.contains(str.toString())){  
         String cur = new String(path);  
         cur += path.length()==0?"":" ";  
         cur += str.toString();  
         backtrace(s, opt, i, dict, cur, rslt);  
       }  
     }  
     return;  
   }  
 }  

This starting from end backtrace method get accepted. Why starting from begin have such a difference from starting from end? What I recall from my Algorithm Course is that Michael taught a starting from end backtrace method when backtracing DP. It is because when starting from begin, some middle result, say opt[4] may be true, while starting from 4, no valid result can be generated. We'll waste a lot of time on invalid final result but valid middle result paths. However, starting from end ensures that each path you travel will be a valid path.

 public class Solution {  
   public List<String> wordBreak(String s, Set<String> dict) {  
     List<String> rslt = new ArrayList<String>();  
     // edge case   
     if(s==null || s.length()==0) return rslt;   
     // DP   
     boolean opt[] = new boolean[s.length()+1];   
     // base case   
     opt[0] = true;   
     // iteration   
     for(int i = 1;i <= s.length();i++)   
       for(int t = 0;t < i;t++)   
         opt[i] = opt[i] || opt[t] && dict.contains(s.substring(t,i));   
     backtrace(s, opt, s.length(), dict, new Stack<String>(), rslt);  
     return rslt;  
   }  
   private void backtrace(String s, boolean[] opt, int index, Set<String> dict, Stack<String> path, List<String> rslt){  
     // base case  
     if(index == 0){  
       List<String> list = new ArrayList<String>();  
       StringBuilder str = new StringBuilder();  
       list.addAll(path);  
       for(int i = list.size()-1;i>=0;i--){  
         str.append(list.get(i));  
         if(i!=0) str.append(" ");  
       }  
       rslt.add(str.toString());  
       return;  
     }  
     // recursion  
     StringBuilder str = new StringBuilder();  
     int i = index-1;  
     while(i >= 0){  
       String t = s.substring(i,index);  
       if(opt[i] && dict.contains(t)){  
         path.push(t);  
         backtrace(s, opt, i, dict, path, rslt);  
         path.pop();  
       }  
       i--;  
     }  
     return;  
   }  
 }  


For DFS method, I think the path can be constructed on the fly. Also, doing it from end to start.

 public class Solution {  
   public List<String> wordBreak(String s, Set<String> dict) {  
     List<String> rslt= new ArrayList<String>();  
     // DFS  
     dfs(s, s.length(), dict, new Stack<String>(), rslt);  
     return rslt;  
   }  
   private void dfs(String s, int index, Set<String> dict, Stack<String> path, List<String> rslt){  
     // base case  
     if(index == 0){  
       List<String> list = new ArrayList<String>();  
       StringBuilder str= new StringBuilder();  
       list.addAll(path);  
       for(int i = list.size()-1;i>=0;i--){  
         str.append(list.get(i));  
         if(i!=0) str.append(" ");  
       }  
       rslt.add(str.toString());  
     }  
     // recursion  
     int i = index-1;  
     while(i >= 0){  
       String t = s.substring(i,index);  
       if(dict.contains(t)){  
         path.push(t);  
         dfs(s, i, dict, path, rslt);  
         path.pop();  
       }  
       i--;  
     }  
     return;  
   }  
 }  

It turns out to be the same backtrace method without using DP to generate true or false array. That's nothing weird, it means knowing whether S[0....i] is re-constructable helps little in these test cases. However, that should not be the case, the DP array should be able to provide useful information and deny each path that leads to invalid result. My explanation is the test case for OJ is too weak in Word Break II.

No comments:

Post a Comment