34. 如果用HiveQL
SELECT sourceIP, totalRevenue, avgPageRank
FROM
SELECT sourceIP, sum(adRevenue) as totalRevenue,
avg(pageRank)as avgPageRank
FROM Rankings as R, Uservisits as UV
WHERE R.pageURL = UV.destURL and UV.visitDate
between Date (‘2000-01-15’) and Date (‘2000-01-22’)
GROUP BY UV.sourceIP
ORDER BY totalRevenue DESC limit 1;
34
35. Hive中的表
• 就像在关系型数据库里, 数据存储在表里
• 比SQL更丰富的字段类型
– 基本类型: ints, floats, strings, date
– 复杂类型: associative arrays, lists, structs
例子:
CREATE Table Employees
(
Name string,
Salary integer,
Children List <Struct <firstName: string, DOB:date>>
)
35