
Challenge 1:

Please could you advise on the following problem: I have grouped a table
by its id and want to filter out a specific event, `A, with a
corresponding qty of 0 (at the same index position); my grouped
table is similar to this:

mytbl: ([id:til 3]event:(`A`B`C;`B`A`A`C;`A`B`C);qty:(til 3;2-til 4;2 0 3))


From

id| event    qty     
--| -----------------
0 | `A`B`C   0 1 2   
1 | `B`A`A`C 2 1 0 -1
2 | `A`B`C   2 0 3 


We want

id| event qty  
--| -----------
2 | A B C 2 0 3


Here are some possibly helpful hints.
Note that

show select bitres: ((`A = ) event) and ((0 = ) qty) from mytbl

yields

bitres
-----
100b 
0010b
000b

This works because "=" is atomic: although event and qty are columns of
nested lists, comparing a scalar with a nested list works element by
element and returns Booleans of the same shape.
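
As a small illustration of atomicity, here is a sketch using made-up lists (they are not taken from mytbl). Comparing a single symbol with a simple or nested list returns Booleans of the same shape, and "and" pairs up items position by position:

```q
/ an atom compared with a simple list gives one Boolean per item
show `A=`A`B`C                 / 100b
/ the same comparison recurses into a nested list, keeping its shape
show `A=(`A`B`C;`B`A)          / two rows: 100b and 01b
/ "and" is atomic too, so it combines the two tests position by position
show (`A=`A`B`C) and 0=0 1 2   / 100b
```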

Next, note that

show select bitres2: (1b in/: (`A =  event) and (0 =  qty)) from mytbl

yields

bitres2
-------
1      
1      
0 

1. Please try to solve the original problem.
You want to eliminate an entire row (an id and its associated event
and qty entries) if it has an event `A with a qty of 0 at the same position.

q case1.q

Note that we could have solved the above problem in a more SQL-like
fashion using any and each:

show select bitres2: any each ((`A = ) event) and ((0 = ) qty) from mytbl
	/ For each row, this takes that row's Boolean list and asks
	/ whether there is a 1b in it or not.

and then

show select from mytbl where not any each ((`A = ) event) and ((0 = ) qty)
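
To see why the two formulations agree, here is a minimal sketch that applies both to the bitres values shown earlier, held in a plain list (the name b is only for this illustration):

```q
b:(100b;0010b;000b)     / the bitres column from above, as a plain list
show any each b         / 110b : rows 0 and 1 contain a 1b
show 1b in/: b          / 110b : same answer using in with each-right
show not any each b     / 001b : only the last row passes the where clause
```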






Challenge 2:
Hi, we've been trying to figure out if there is an efficient way to
calculate intraday NAV for ETFs, Indices, etc.

(NAV = net asset value. It is the sum of the price
times the qty of each asset underlying a portfolio.)

We have the tick data for all the underlying securities and need to
recompute the NAV with each update of any constituent in the basket
historically in order to simulate a trading strategy.

The different methods we've come up with are:

1) With each quote update, update a dictionary containing the bid/ask of
each constituent and recompute NAV
2) Multiply the bid/ask time series of each constituent by the
constituent weighting and somehow add the time series together (not sure
how)
3) Since our tick data only has millisecond granularity, lj each
constituent's bid/ask time series into a table with all possible millisecond
times and do *fills and mmu*

Historical data only.  We're hoping to run this over a large list of ETFs,
standard indices, customized indices, etc.  There are several hundred ETFs
that we'd like to run this on.  Underlying syms will vary by ETF/Index,
anywhere from 10 to 3,000.  Number of quotes will depend on the trading
day.  For 6/17/11 I see about 72mm quotes for the S&P 500 names.

Just tested lj'ing each ticker in S&P 500 keyed on millisecond times, took
~1 sec per name.  Could probably get each day's calculations done overnight
if we use peach, but wanted to see if there was an even more efficient
method.

First let's do a small scale example.
Here is a table of stocks and their prices over time.


mytbl: ([]time:09:30:00.0 + til 10;sym:`a`a`a`b`b`a`a`b`a`b;price:1 2 3 100 101 2 3 99 2 100)

Now look at it:

show mytbl

time         sym price
----------------------
09:30:00.000 a   1    
09:30:00.001 a   2    
09:30:00.002 a   3     (nav = 3*10)
09:30:00.003 b   100   (nav = (3*10) + (100*20))
09:30:00.004 b   101   (nav = (3*10) + (101*20))
09:30:00.005 a   2    
09:30:00.006 a   3    
09:30:00.007 b   99   
09:30:00.008 a   2    
09:30:00.009 b   100  

You can see that b is much more valuable than a.
Now, let's say that some portfolio has 10 of a and 20 of b
(these are the quantities used in the nav annotations above).

myqty:([sym:`a`b]weight:10 20) / so we have more of the expensive one

show myqty

sym| weight
---| ------
a  | 10    
b  | 20  

Now let's take a look at the join between the two tables.

show select sym,time,price,weight from mytbl ij myqty


sym time         price weight
-----------------------------
a   09:30:00.000 1     10    
a   09:30:00.001 2     10    
a   09:30:00.002 3     10    
b   09:30:00.003 100   20    
b   09:30:00.004 101   20    
a   09:30:00.005 2     10    
a   09:30:00.006 3     10    
b   09:30:00.007 99    20    
a   09:30:00.008 2     10    
b   09:30:00.009 100   20 

At any given time, we can compute the Net Asset Value by multiplying
the last price of each security by its weight and summing the results.
But this value changes over time, and finding the last price of every
security is a linear-time operation if begun from scratch at each tick.
Ideally, with each price change on a security, we would do constant
work to get the running NAV.

Here is the result we want:

time         nav
------------------
09:30:00.000 10   
09:30:00.001 20   
09:30:00.002 30   
09:30:00.003 2030 
09:30:00.004 2050 
09:30:00.005 2040 
09:30:00.006 2050 
09:30:00.007 2010 
09:30:00.008 2000 
09:30:00.009 2020 


2. That is your challenge. Compute the running NAV.
Think about "deltas".

q case2.q
http://kx.com/q/nav.q
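
One possible reading of the "deltas" hint is sketched below. It assumes mytbl and myqty as defined above; the names t and delta are illustrative, and the approach may differ from what case2.q does. Each tick changes the NAV by weight times the change in that sym's price, so a running sum of those changes gives the running NAV.

```q
mytbl:([]time:09:30:00.0+til 10;sym:`a`a`a`b`b`a`a`b`a`b;price:1 2 3 100 101 2 3 99 2 100)
myqty:([sym:`a`b]weight:10 20)
/ per sym, the price change since that sym's previous tick, times its weight
/ (deltas leaves the first price of each sym unchanged, i.e. a change from 0)
t:update delta:weight*deltas price by sym from mytbl ij myqty
/ update ... by keeps the original time order, so a running sum is the NAV
show select time,nav:sums delta from t
```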

2a. Compute the online running NAV (so you can't do a grouping
because the data is coming in continuously).

q case2run.q
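
For the online case, a minimal sketch is to keep the last price per sym and the current NAV as state and adjust them on every tick. The names w, lastpx, nav and upd are hypothetical and are not taken from case2run.q; the state is held in globals purely for brevity.

```q
w:`a`b!10 20          / weight per sym (same values as myqty)
lastpx:`a`b!0 0       / last price seen per sym; 0 before the first tick
nav:0                 / running NAV
/ upd handles one tick: adjust nav by this sym's change, store the new price
upd:{[s;p] nav::nav+w[s]*p-lastpx s; @[`lastpx;s;:;p]; nav}
/ replay the small example one tick at a time
show upd'[`a`a`a`b`b`a`a`b`a`b;1 2 3 100 101 2 3 99 2 100]
```

Each call does a constant amount of work regardless of how many ticks have already arrived.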

2b. Suppose the quantities change over time. So, instead of
myqty:([sym:`a`b]weight:10 20)
you might have a set of quantities that changes over time, so that
sym alone would no longer be a key.
For example, you might have:

myqtyintimeordered:([]time:(2 # 09:30:00.0), (09:30:00.001 + 2*(til 3));sym:`a`b`a`b`b;weight:10 25 35 40 13 )


We have to make sure there is a price at the time of every purchase.
So modify the price table appropriately to fill in prices.

mytbl: ([]time:(09:30:00.0 + til 10), (09:30:00.0 + til 10);sym:(10 # `a),(10 # `b);price:1 2 3 2 3 2 2 2 2 2 100 101 102 99 98 97 103 104 110 108)

and sort it:

mytbl: `time xasc mytbl

What would you do then?

q case2brun.q

Challenge 3:
We have a trading strategy that says
"Go in when the price reaches p1 but exit when the price goes down to p2."
How do we implement that efficiently?



Example: We want to enter when the stock goes up to 12 or more and exit
if it ever goes down to 6 or less.
Here is a price vector in time order.

p:10 11 12 12 13 18 17 14 13 11 9 5 5 5 4 3 5 6 7 8 9 10


We will represent the time we are holding
stock as 1s in a Boolean vector.

0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0

Because this is an operation that depends on history we have to
run through the vector in time order.

Let's look at the following:

0{x+(2*y)}\p

or

foo: {[x;y] x+(2*y)}
0 foo\p

What is happening is that initially 0 is assigned to x
and y is the first element of p, so the first result is 0 plus twice
that element.
After that, x becomes the running result and y is the next element in the
scan.
{x+(2*y)} is just a function that has no name;
foo is the same function given a name.
Try both.
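
To make the roles of x and y concrete, here is the same scan traced on a three-element list (the list 1 2 3 is chosen only for illustration):

```q
/ trace of the scan: each result becomes x for the next step
/   x=0 y=1 -> 0+2*1 = 2
/   x=2 y=2 -> 2+2*2 = 6
/   x=6 y=3 -> 6+2*3 = 12
show 0{x+(2*y)}\1 2 3     / 2 6 12
foo:{[x;y] x+(2*y)}       / the same function, given a name
show 0 foo\1 2 3          / 2 6 12
```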


3a. Ok, now what can you do to generate the Boolean vector we want
based on entry at 12 or more and exit at 6 or less?

q case3.q
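
One possible sketch for 3a, which may differ from case3.q, carries the holding state through a scan. The function name hold is hypothetical; x is the previous state and y is the current price.

```q
p:10 11 12 12 13 18 17 14 13 11 9 5 5 5 4 3 5 6 7 8 9 10
/ enter at 12 or more, exit at 6 or less, otherwise keep the previous state
hold:{[x;y] $[y>=12;1b; y<=6;0b; x]}
show 0b hold\p      / 0011111111100000000000b
```

The initial state 0b says we start out not holding the stock.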

3b. Can you extend it to entry at 12 or more but not greater than 15
and exit at 6 or less or greater than 15?


q case3.q

Challenge 4:
You are designing the coinage for a certain new country.
You want only three coin denominations.
You want to optimize them so that if a customer is paying without
receiving change, the customer uses as few coins as possible (though
perhaps using several coins of the same denomination).
        q bestcoins.q

Challenge 5:
Given a price history from the public markets and given
a set of trades that you have done, how far off were you
from the public market?
 	bestprice.q


Challenge 6: 
Given a table having orderid and timestamps (one or two of them
per orderid), we want to calculate the difference in the timestamps
for each orderid and sort in descending order by the difference.

ravi.q
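
A minimal sketch of one way to approach this, on hypothetical sample data (the table t, the column names ts and diff, and the use of the time type are all assumptions for illustration and are not taken from ravi.q):

```q
/ hypothetical sample data: o2 has a single timestamp, the others have two
t:([]orderid:`o1`o1`o2`o3`o3;ts:09:30:00.000 09:30:05.000 09:31:00.000 09:32:00.000 09:32:01.500)
/ per orderid, latest minus earliest timestamp; a lone timestamp gives 0
r:select diff:(max ts)-min ts by orderid from t
/ unkey the result and sort by the difference, largest first
show `diff xdesc 0!r     / o1 (5 sec), then o3 (1.5 sec), then o2 (0)
```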

Challenge 7:
With respect to nileshdata,
for each column there will be a preference
sequence.

eventclientpref: `a`b`c
eventcomppref: `b`c`a


For each orderid, we want the non-null client corresponding
to the highest eventclientpref.
Separately, we want the non-null compid corresponding
to the highest eventcomppref.

nilesh.q

Challenge 8:
Given a table like this:
orderid parentorderid replacedbyorderid events keyfield
-------------------------------------------------------
x       y                               an     0       
y                                       an     1       
z                     y                 rn     2       
a       b                               rn     3       
c       b                               rn     4    

Note that x and y are linked because of the row having keyfield 0,
z and y are linked because of the row having keyfield 2,
so all the rows having x, y, and z should be linked.
Similarly for a and b.
Thus the output should be:
linkedgroupid| rowids 
-------------| -------
1            | 0, 1, 2
2            | 3, 4 

	vlad.q
