Labels

Tuesday, March 3, 2015

Word Break

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".

Naive Way: Another problem with a dictionary. Since the dictionary is usually large, it's better to do something on the string and left the dictionary to perform dict.contains() function. I came up with a DFS way to cut s and check for the existence of each cut.

DFS first fails for TLE, and then I add a Set to memorize failed index to reduce from O(2^n) to  O(n^2) in run time. This idea is learned from mahdy on his post in Decode Ways

 public class Solution {  
   public boolean wordBreak(String s, Set<String> dict) {  
     // DFS  
     Set<Integer> set = new HashSet<Integer>();  
     return dfs(s, 0, dict, set);  
   }  
   private boolean dfs(String s, int index, Set<String> dict, Set<Integer> set){  
     // base case  
     if(index == s.length()) return true;  
     // check memory  
     if(set.contains(index)) return false;  
     // recursion  
     for(int i = index+1;i <= s.length();i++){  
       String t = s.substring(index, i);  
       if(dict.contains(t))  
         if(dfs(s, i, dict, set))  
           return true;  
         else  
           set.add(i);  
     }  
     set.add(index);  
     return false;  
   }  
 }  

And after a fail test, I found out that each string in the dictionary can be used unlimited times. Which leads a DFS without back trace. If each string in the dictionary can be used only once, a simple modification will solve it.

 public class Solution {  
   public boolean wordBreak(String s, Set<String> dict) {  
     // DFS  
     Set<Integer> set = new HashSet<Integer>();  
     return dfs(s, 0, dict, set);  
   }  
   private boolean dfs(String s, int index, Set<String> dict, Set<Integer> set){  
     // base case  
     if(index == s.length()) return true;  
     // check memory  
     if(set.contains(index)) return false;  
     // recursion  
     for(int i = index+1;i <= s.length();i++){  
       String t = s.substring(index, i);  
       if(dict.contains(t)){  
         dict.remove(t);  
         if(dfs(s, i, dict, set))  
           return true;  
         else  
           set.add(i);  
         dict.add(t);  
       }  
     }  
     set.add(index);  
     return false;  
   }  
 }  

But what's the general approach? If it is not for path memorizing, DFS will fail in run time.There should exists a totally different solution that aims to solve this problem.

I am thinking about DP. Because in Decode Ways, the formal method was DP while a DFS with path memorize can deal with it, too.
the basic logic will be
opt[i] // whether s[0....i] can be formed by words in dict
// base case
opt[0] = true
// iteration
opt[i] = opt[t] && dict.contains(s.substring(t+1,i)) for 0<t<i

 public class Solution {  
   public boolean wordBreak(String s, Set<String> dict) {  
     // edge case  
     if(s==null || s.length()==0) return dict.contains("");  
     // DP  
     boolean opt[] = new boolean[s.length()+1];  
     // base case  
     opt[0] = true;  
     // iteration  
     for(int i = 1;i <= s.length();i++)  
       for(int t = 0;t < i;t++)  
         opt[i] = opt[i] || opt[t] && dict.contains(s.substring(t,i));  
     return opt[s.length()];  
   }  
 }  

No comments:

Post a Comment