Top K Frequent Elements: Three Approaches

Bucket Sort

Time complexity: O(n)

  • Build the frequency map: O(n)
  • Place each number into a bucket indexed by its frequency: O(n)
  • Traverse the buckets from highest to lowest and collect results: worst case O(n)

vector<int> topKFrequent(vector<int>& nums, int k) {
    // unordered_map keeps the counting step O(n); std::map would add a log factor
    unordered_map<int, int> m;
    for (int num : nums)
        ++m[num];

    // Frequencies run from 1 to nums.size(), so using frequency as the
    // bucket index requires nums.size() + 1 slots
    vector<vector<int>> bucket(nums.size() + 1);
    for (const auto& p : m)
        bucket[p.second].push_back(p.first);

    vector<int> res;
    for (int i = (int)bucket.size() - 1; i >= 0 && (int)res.size() < k; --i)
        res.insert(res.end(), bucket[i].begin(), bucket[i].end());
    return res;
}

The edge case: if the whole array is one repeated number, its frequency equals the array length, so the bucket array needs nums.size() + 1 slots. This lets every possible frequency (1 through n) serve as a direct index.

Hash Map + Max Heap

Time complexity: O(n log(n − k))

vector<int> topKFrequent(vector<int>& nums, int k) {
    unordered_map<int, int> m;
    for (int num : nums)
        ++m[num];

    priority_queue<pair<int, int>> pq;  // max-heap of (frequency, value)
    vector<int> res;
    for (const auto& p : m) {
        pq.emplace(p.second, p.first);
        // Once the heap holds more than m.size() - k entries, each push
        // evicts the current maximum; exactly k elements get evicted in total.
        if (pq.size() > m.size() - k) {
            res.push_back(pq.top().second);
            pq.pop();
        }
    }
    return res;
}

Same frequency map, but the ranking is delegated to a max-heap of (frequency, value) pairs. Popping begins only once fewer than k map entries remain unpushed, so exactly k elements are evicted; each evicted element is the maximum of the heap at that moment and can be exceeded by at most k − 1 others, so the evicted set is exactly the top k. Capping the heap at m.size() − k + 1 entries is what yields the O(n log(n − k)) bound, an improvement over sorting all frequencies, and the gain is largest when k is close to the number of distinct values.

Takeaways

  • STL vector merge operations
  • Buckets for quick classification tasks

#heap #interview #bucket