Top K Frequent Elements: Three Approaches
Bucket Sort
Time complexity: O(n)
Build frequency map O(n)
Place numbers into buckets by frequency O(n)
Traverse buckets backwards and collect results. Worst case: O(n)
vector<int> topKFrequent(vector<int>& nums, int k) {
    unordered_map<int, int> m;   // value -> frequency
    for (int num : nums)
        ++m[num];
    // Frequency becomes the bucket index, so the bucket array
    // needs nums.size() + 1 slots (frequencies run from 0 to n).
    vector<vector<int>> bucket(nums.size() + 1);
    vector<int> res;
    for (const auto& pair : m)
        bucket[pair.second].push_back(pair.first);
    for (int i = bucket.size() - 1; i >= 0 && (int)res.size() < k; --i) {
        if (!bucket[i].empty())
            res.insert(res.end(), bucket[i].begin(), bucket[i].end());
    }
    return res;
}
The edge case: if the whole array is one repeated number, that number's frequency equals the array length, so the bucket array must have length nums.size() + 1. This lets every possible frequency serve directly as an index.
Hash Map + Max Heap
Time complexity: O(n log(n − k)) — the heap never holds more than m.size() − k + 1 entries
vector<int> topKFrequent(vector<int>& nums, int k) {
    unordered_map<int, int> m;   // value -> frequency
    for (int num : nums)
        ++m[num];
    priority_queue<pair<int, int>> pq;  // max-heap ordered by frequency
    vector<int> res;
    for (const auto& pair : m) {
        pq.push(make_pair(pair.second, pair.first));
        // Once the heap holds more than m.size() - k entries, every
        // popped maximum is guaranteed to be one of the top k values.
        if (pq.size() > m.size() - k) {
            res.push_back(pq.top().second);
            pq.pop();
        }
    }
    return res;
}
Same idea, and it works well for large datasets: the heap is capped at m.size() − k + 1 entries, so this variant shines when k is close to the number of distinct values. We only start popping from the heap top once fewer than k map entries remain to be pushed; at that point each pop is guaranteed to yield a top-k element.
Takeaways
- STL vector range insert for merging one vector into another
- Buckets for quick classification tasks
#heap #interview #bucket