
Large Data and Performance


Performance Notes:

From Jeff Borror's Q for Mortals.
Separating where clauses with commas in q-sql is usually
faster than joining them with &. Comma-separated clauses are
applied left to right, each one only to the rows that survived
the previous clause, whereas clauses joined with & are each
evaluated against the whole table.
With comma-separated clauses, put the most restrictive clause first.
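
A minimal sketch of the comparison, assuming an in-memory table t whose
column names and sizes are made up for illustration. Both forms return the
same rows; only the amount of work differs. Timings depend on the machine
and the data, so none are quoted here.

```q
/ Assumed illustrative table: names and sizes are not from the notes
n:1000000
t:([] sym:n?`a`b`c`d; price:n?100f; size:n?1000)

/ comma form: price>99 is the most restrictive test, so it goes first;
/ sym=`a is then checked only on the rows that passed it
r1:select from t where price>99, sym=`a

/ & form: both comparisons run over every row, then the boolean masks are combined
r2:select from t where (price>99)&sym=`a

r1~r2      / the two results are identical

/ \t:20 reports total milliseconds for 20 runs of the expression
\t:20 select from t where price>99, sym=`a
\t:20 select from t where (price>99)&sym=`a
```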

Use grouping on non-unique columns in order to get a hash table
for those columns.
Use partitioning to separate the key field. 
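
If "grouping" and "partitioning" here mean the `g# (grouped) and `p#
(parted) attributes, which is an assumption about the notes, applying them
might look like the sketch below. The table and column names are
illustrative only.

```q
/ Assumed illustrative data: sym is a non-unique column
n:1000000
t:([] sym:n?`a`b`c`d; price:n?100f)

/ grouped: q maintains an index from each distinct value to its row positions
tg:update `g#sym from t

/ parted: equal values must be contiguous, so sort on the column first
tp:update `p#sym from `sym xasc t

attr tg`sym    / attribute now carried by the grouped column
attr tp`sym    / attribute now carried by the parted column
```

Timing select from tg where sym=`a against the same query on t and tp (with
\t) is one way to see the effect on lookups.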

Exercise 1: Run the following and see what you can conclude about performance.
	q perfexamp.q

Note that partitioning undoes the effect of grouping.


Large data:
https://code.kx.com/trac/wiki/Startingkdbplus/hdb
https://code.kx.com/trac/wiki/KdbplusForMortals/partitioned_tables

Partitioning is needed when even individual columns are too big to fit
into memory.
Each partition holds the full table schema but only some of the
records.
	q genandpartition.q
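
The contents of genandpartition.q are not shown in these notes. As a rough
sketch of what such a script might do, the code below writes a date-
partitioned rantrade table under partdb using .Q.dpft. The dates, column
names, and row counts are assumptions.

```q
/ Sketch only: not the author's genandpartition.q; dates, columns and sizes are assumed
gen:{[n] ([] sym:n?`ibm`msft`goog; price:n?100f; size:n?1000)}

/ .Q.dpft[dir;partition;field;table name] saves the global table as one splayed
/ partition, enumerating symbols against dir/sym and sorting and parting on field
save1:{[d] rantrade::gen 1000; .Q.dpft[`:partdb;d;`sym;`rantrade]}
save1 each 2024.01.01 2024.01.02 2024.01.03;

key `:partdb      / one directory per date plus the sym enumeration file
```

Each date directory then holds a rantrade directory with one file per
column, which is why a query on one date touches only that slice of the data.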

Then start q and load the database:
	q
	\l partdb
Now you can access rantrade more or less normally,
e.g. using select.
If a query constrains date, put the date clause first so that the
search does not go across other partitions.

Associative aggregates -- avg, sum, count etc. -- work well on partitioned data.
There are many restrictions on updating partitioned data.

select from rantrade where date = ...
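
A self-contained sketch of date-first queries and aggregates. To avoid
touching partdb it builds a tiny throwaway database in scratchdb, which is
a hypothetical directory name. The dates, columns, and values are likewise
assumed.

```q
/ Sketch only: throwaway database in ./scratchdb (hypothetical), columns assumed
{[d] rantrade::([] sym:`a`a`b`b; price:1 2 3 4f; size:10 20 30 40);
  .Q.dpft[`:scratchdb;d;`sym;`rantrade]} each 2024.01.01 2024.01.02;
\l scratchdb

/ date first: only the 2024.01.02 partition is read, then sym is tested within it
r:select from rantrade where date=2024.01.02, sym=`a
count r                                   / 2

/ aggregates are computed per partition and the partial results combined
a:select total:sum size, n:count i by sym from rantrade
```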

Try
	q loadpartition.q

If there are many accesses to a partitioned table, we might want
to spread out the IO. For that, you use segmentation.
https://code.kx.com/trac/wiki/KdbplusForMortals/segments#a1.4.0Overview
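
In a segmented database the root directory holds a par.txt file that names
one segment directory per line, and the partitions are spread across those
directories. A minimal sketch of writing such a file from q follows. The
paths /disk1/segs and /disk2/segs are hypothetical, and the file is written
to the current directory purely for illustration.

```q
/ Sketch only: /disk1/segs and /disk2/segs are hypothetical segment directories.
/ par.txt belongs in the database root; the segment directories themselves
/ live outside that root, ideally on separate devices.
`:par.txt 0: ("/disk1/segs";"/disk2/segs");
read0 `:par.txt      / read the lines back to check what was written
```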

