Anagram grouping
All credits to the question go to Neetcode.io. I am merely borrowing from him.
Link - Neetcode 150
Given an array of strings strs, group all anagrams together into sublists. You may return the output in any order.
An anagram is a string that contains the exact same characters as another string, but the order of the characters can be different.
Example 1:
Input: strs = ["act","pots","tops","cat","stop","hat"]
Output: [["hat"],["act", "cat"],["stop", "pots", "tops"]]
Approach 1 (Kinda stupid but it worked in theory)
Pseudocode
Have a global hashmap of (k,v) pairs
for every word:
generate a hashmap
check if said hashmap is in the global one
If yes, append the word to the existing one
If no, create a new key and add the word as the value
Simple, right? Just get together a couple of hashmaps an you are done.
WRONG
It is wrong on so many levels, starting with -
- WTF was I thinking to make a hashmap as a key? (Its not even hashable!!!!)
- Seriously, wtf is that line of thought
- WHO RANDOMNLY SETS A HASHMAP AS A KEY!!?
Okay, rant over (and a reminder to never set a data structure as a key).
Logically, it makes sense but this does not translate very well into code. So I learnt another way to do the same
Approach 2
Pseudocode
create a dict
for every word:
Generate an array of size 26 with 0's
for each character in the word:
find the ascii diff of the character and 'a' i.e
find the position of the character in the alphabet
increment the value at the index by 1
Once done for the word, store the array as a tuple in the dict
return the values of the dict
Notice something different? - We did the same thing as before but in a structured manner
- Created a numeric alphabet array and use it as the basis of our "hashmap". This value is the same for a string and its anagrams -> Makes the most sense to capture duplicates instead of "using a hashmap as a key" (Duh!)
- Generating the key as a tuple of the alphabet array (Honestly this is genius in my opinion) -> tuples are more or less unique when
result = defaultdict(list)
for word in strs:
count = [0]*26
for ch in word:
count[ord(ch)-ord('a')] +=1
result[tuple(count)].append(word)
return list(result.values())
Another thing to notice here is the use of defaultdict instead of dict. This makes things simpler, essentially it allows us to define a dict with values as list instead of having to define a condition where we check if the key exists or not.
But yeah, this is what I think makes sense and implemented to solve this problem.