B = filter A by date == '20100819' and age < 30;
-- both date and country are partition columns
C = filter A by date == '20100819' and country == 'US';
...
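However, if there is a very large number of partitions and we want to query all of them in a single request through HCatalog, Pig runs into a problem similar to Hive's. In that case, it may be more convenient to express the input with globs and wildcards.
For example: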
Partition-1, Partition-2, Partition-3, ..., Partition-n exist under the location /user/inputLocation/
Using globs we can provide the input to Pig as:
/user/inputLocation/{Partition-1,Partition-2,Partition-3,...,Partition-n}
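
As a minimal sketch of how such a glob might be passed to Pig, the snippet below loads the partition directories directly from the file system with `LOAD`, which accepts Hadoop glob patterns. The tab delimiter, the `name`/`age` schema, and the choice of three partitions are assumptions made for illustration and are not taken from the original setup.

```pig
-- Hypothetical delimiter and schema; adjust to the real layout of the partition files.
-- The alternatives inside the braces are written without spaces after the commas,
-- since a space there would most likely be matched as part of the directory name.
A = LOAD '/user/inputLocation/{Partition-1,Partition-2,Partition-3}'
    USING PigStorage('\t')
    AS (name:chararray, age:int);

-- If every directory under the location shares the same prefix,
-- a wildcard avoids listing the partitions one by one.
A_all = LOAD '/user/inputLocation/Partition-*'
    USING PigStorage('\t')
    AS (name:chararray, age:int);

-- Non-partition predicates are applied as ordinary filters after the load.
B = FILTER A BY age < 30;
DUMP B;
```

Note that when the directories are read this way, rather than through HCatalog, the partition values encoded in the directory names are not exposed as columns, so any filtering on them has to happen through the path pattern itself.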