Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.

According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”

For example, given citations = [3, 0, 6, 1, 5], the researcher has 5 papers in total, which received 3, 0, 6, 1, and 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two have no more than 3 citations each, the h-index is 3.

Note: If there are several possible values for h, the maximum one is taken as the h-index.
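
To make the definition concrete, here is a minimal sketch that checks it directly by trying every candidate h from N down to 1. The function name hIndexByDefinition is made up for illustration and is not part of the problem statement; it runs in O(N^2) time and is meant only as a reference to compare faster solutions against.

```cpp
#include <vector>

// Hypothetical helper (name chosen for illustration): returns the largest h
// such that at least h papers have h or more citations each.
int hIndexByDefinition(const std::vector<int>& citations) {
    const int n = static_cast<int>(citations.size());
    // Try the largest candidate first so the maximum valid h is returned.
    for (int h = n; h > 0; --h) {
        int atLeastH = 0;  // number of papers with >= h citations
        for (int c : citations) {
            if (c >= h) ++atLeastH;
        }
        if (atLeastH >= h) return h;
    }
    return 0;  // no paper has a citation, or the array is empty
}
```

Calling hIndexByDefinition on the example array {3, 0, 6, 1, 5} returns 3, matching the result above.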

Approach

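The key is a solid understanding of bucket sort (in the one-bucket-per-value form used here, it is often called counting sort). It sorts in time linear in the number of elements plus the size of the value range, at the cost of extra space proportional to that range. This problem puts it to very clever use.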

Suppose we have these five numbers: 2, 5, 8, 5, 3. How do we sort them?

We can build an array a with 9 elements (similar to a simplified hash table), indexed from 0 to 8.

Traverse the five numbers. On reaching 2, a[2] += 1; on reaching 5, a[5] += 1; on reaching 8, a[8] += 1; on reaching 5 again, a[5] += 1, so now a[5] = 2; on reaching 3, a[3] += 1, so now a[3] = 1.

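We then traverse the array a once more to output the sorted sequence: each index is output as many times as its count says the value occurred.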
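
The two passes just described can be sketched as follows. This is a minimal illustration that assumes every input value is non-negative; the function name countingSort is made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (name chosen for illustration).
// Assumes every value is non-negative.
std::vector<int> countingSort(const std::vector<int>& values) {
    if (values.empty()) return {};
    const int maxValue = *std::max_element(values.begin(), values.end());

    // First pass: a[v] counts how many times v occurs.
    std::vector<int> a(maxValue + 1, 0);
    for (int v : values) {
        a[v] += 1;
    }

    // Second pass: output each index as many times as it was counted.
    std::vector<int> sorted;
    sorted.reserve(values.size());
    for (int v = 0; v <= maxValue; ++v) {
        for (int k = 0; k < a[v]; ++k) {
            sorted.push_back(v);
        }
    }
    return sorted;
}
```

For the input 2, 5, 8, 5, 3 the array a has 9 elements and the result is 2, 3, 5, 5, 8.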

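Why can we use bucket sort in this problem? Because the h-index has an upper bound: however large it is, it can never exceed the number of elements in the array. Any paper with more than n citations can therefore be counted in bucket n without changing the answer, which makes bucket sort a natural fit. If the citations array has n elements, the bucket array has n+1 elements.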

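#include <vector>
using std::vector;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        const int n = static_cast<int>(citations.size());
        // hindex[k] counts the papers with exactly k citations;
        // papers with more than n citations are clamped into bucket n.
        vector<int> hindex(n + 1, 0);
        for (int c : citations) {
            if (c > n) hindex[n]++;
            else hindex[c]++;
        }
        // Scan from the largest candidate down; sum is the number of
        // papers with at least i citations.
        int sum = 0;
        for (int i = n; i >= 0; i--) {
            sum += hindex[i];
            if (sum >= i) return i;
        }
        return 0;  // not reached: at i == 0 the test sum >= 0 always holds
    }
};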

Reference: http://book.51cto.com/art/201405/441260.htm