
From chris@kx.com  Tue Mar 12 08:00:42 2013

Hi Dennis

You should be able to use the example hdb and rdb in the wiki tutorial
http://code.kx.com/wiki/Startingkdbplus/contents.

Attached are the latest versions, as both linux and windows zips. These
should be unzipped into a subfolder ws of your q installation, i.e. ~/q/ws;
otherwise, adjust the scripts as required.

After loading q, running

\l ws/buildtaq.q

will build an hdb for 20 days. You can change the parameters in the script
header for a longer run (4 years works fine), or to create a segmented
database.

The scripts in the tick subdirectory run a tickerplant and several rdb
processes. They should work as is, but I set things up so that all
processes run in a single window so as to not clutter the screen, i.e. on
linux using tmux, and on windows using console2.

Regards

Chris

P.S. gmail complained that windows.zip contained an executable file, so it
has been renamed to windows.jpg. Just rename it back.

From jshmain@gmail.com  Tue Mar 12 09:50:39 2013
MIME-Version: 1.0
In-Reply-To: <201303111518.r2BFI0Om002680@crunchy12.cims.nyu.edu>
References: <201303111518.r2BFI0Om002680@crunchy12.cims.nyu.edu>
Date: Tue, 12 Mar 2013 09:50:28 -0400
Message-ID: <CAG5ZFjh1o0svbF9_OGY0c9PQHZtmXQ_yLBkkk8gNE5W-RARszw@mail.gmail.com>
Subject: Re: tickerplant/rdb/hdb
From: Jeffrey Shmain <jshmain@gmail.com>
To: Dennis Shasha <shasha@courant.nyu.edu>
Content-Type: multipart/alternative; boundary=047d7b6dcf4681c86504d7ba92e7

--047d7b6dcf4681c86504d7ba92e7
Content-Type: text/plain; charset=ISO-8859-1

Hi Dennis,

I would be happy to work with you on creating the full course outline and
providing some of the content.  Below are a few points that I thought would
be useful:

1.  I think the best way to introduce students to the Ticker Plant/RDB would
be to start with a theoretical definition of the publish/subscribe model.
Here is a short description of it (
http://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern).  Almost all
financial firms employ some kind of pub/sub model, and this is exactly the
reason why KX has adopted this approach in their Ticker Plant/RDB
implementation.

2.  Here is a tutorial from First Derivatives that has some useful
information about Ticker Plants/RDBs/HDBs (
http://www.foraysintolife.com/wp-content/uploads/2011/08/FDKdbTutorial.pdf)

3.  Another thing is getting the u.q file from KX (Simon?).  I think the
Ticker Plant is their proprietary software, but I don't see why they would
have any problems sharing it for educational purposes.
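
To make point 1 concrete for students, here is a minimal sketch of the
publish/subscribe idea in Python.  It is only a toy, not the kdb+tick
implementation: the class name TickerPlant, its methods, and the sample rows
are all made up for illustration.

```python
from collections import defaultdict


class TickerPlant:
    """Toy in-memory publish/subscribe hub (hypothetical, not kdb+tick)."""

    def __init__(self):
        # table name -> list of (callback, set of syms, or None for all syms)
        self._subs = defaultdict(list)

    def subscribe(self, table, callback, syms=None):
        """Register callback for rows of `table`, optionally filtered by sym."""
        wanted = None if syms is None else set(syms)
        self._subs[table].append((callback, wanted))

    def publish(self, table, row):
        """Forward row to every subscriber whose sym filter matches."""
        for callback, wanted in self._subs[table]:
            if wanted is None or row["sym"] in wanted:
                callback(table, row)


received = []  # stands in for a subscriber's in-memory table
tp = TickerPlant()
tp.subscribe("trade", lambda table, row: received.append(row), syms=["IBM"])
tp.publish("trade", {"sym": "IBM", "price": 210.5, "size": 100})
tp.publish("trade", {"sym": "MSFT", "price": 28.1, "size": 200})
```

The subscriber asked only for IBM, so the MSFT row is never delivered to it.
The publisher does not need to know who is listening, which is the main
point of the pattern.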

Now as far as the outline, you are correct:  For market data systems, I
have seen the systems scaled (partitioned) by security IDs.  For order
management/risk management, usually by TraderId.  The Ticker Plant uses the
sym column to subscribe by a specific field, but the semantics of the sym
column depend on the business.
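
As a rough illustration of partitioning by a key, the Python sketch below
routes rows to a fixed number of partitions using a stable hash of one
field.  The function names, the choice of CRC32 as the hash, and the sample
rows are my assumptions for the example; a real system may use an explicit
key-to-partition map instead.

```python
import zlib


def partition_for(key, n_partitions):
    """Map a key (e.g. a security ID or TraderId) to a partition index.

    CRC32 is used because it is stable across runs, unlike Python's hash().
    """
    return zlib.crc32(key.encode("utf-8")) % n_partitions


def route(rows, key_field, n_partitions):
    """Split rows into n_partitions lists according to row[key_field]."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[partition_for(row[key_field], n_partitions)].append(row)
    return parts


rows = [
    {"sym": "IBM", "trader": "t1", "size": 100},
    {"sym": "MSFT", "trader": "t2", "size": 200},
    {"sym": "IBM", "trader": "t2", "size": 300},
]
by_sym = route(rows, "sym", 2)        # market data style partitioning
by_trader = route(rows, "trader", 2)  # order/risk management style
```

The same routing code serves both cases; only the key field changes, which
mirrors the point that the meaning of the key depends on the business.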

As far as replication, this can be achieved via multiple methods.  The
easiest is just to subscribe another RDB (on a different site) to the same
Ticker Plant.  However, in that case the failure of the Ticker Plant on one
site would mean the interruption of service.  We may wish to consider
replicating the full thread (Ticker Plant/RDB) and giving the publishers an
option to publish to both.
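
The second option (publishers writing to both sites) could be sketched
roughly as follows.  This is an assumption-laden toy: Site stands in for a
whole Ticker Plant/RDB thread, and publish_to_all is a hypothetical helper,
not part of any KX software.

```python
class Site:
    """Stand-in for one Ticker Plant/RDB thread at one site (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.rows = []

    def publish(self, row):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.rows.append(row)


def publish_to_all(sites, row):
    """Send row to every site and return the names of sites that took it."""
    delivered = []
    for site in sites:
        try:
            site.publish(row)
            delivered.append(site.name)
        except ConnectionError:
            # A real system would log this and resynchronize the site later.
            continue
    return delivered


primary, backup = Site("primary"), Site("backup")
publish_to_all([primary, backup], {"sym": "IBM", "price": 210.5})
primary.up = False  # simulate the failure of one site
delivered = publish_to_all([primary, backup], {"sym": "IBM", "price": 210.6})
```

After the simulated failure the backup site still receives every row, so
service continues, at the cost of the failed site needing to catch up once
it returns.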

I like the idea of segregating queries.  Usually the Technology departments
just round robin between the sites, as they do not have a good framework
for query segregation.

I have seen systems combining RDBs with HDBs.  Usually this is achieved
by having a smart gateway that connects to both.  Clients usually do not
have any idea whether the data is in the HDB or the RDB; the gateway itself
breaks the queries up, hits the right place, and then combines the results
before returning them to clients.
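
Here is a small Python sketch of what such a gateway does, assuming the RDB
holds only today's rows and the HDB holds everything before today.  The
Gateway class, its query signature, and the sample data are invented for
the example; plain lists stand in for the two databases.

```python
from datetime import date, timedelta


class Gateway:
    """Toy gateway that hides the HDB/RDB split from clients (hypothetical)."""

    def __init__(self, hdb_rows, rdb_rows, today):
        self.hdb_rows = hdb_rows  # rows dated before today
        self.rdb_rows = rdb_rows  # rows dated today
        self.today = today

    @staticmethod
    def _select(rows, sym, start, end):
        return [r for r in rows if r["sym"] == sym and start <= r["date"] <= end]

    def query(self, sym, start, end):
        """Return rows for sym with start <= date <= end, oldest source first."""
        result = []
        if start < self.today:
            # Historical part of the range goes to the HDB.
            hist_end = min(end, self.today - timedelta(days=1))
            result += self._select(self.hdb_rows, sym, start, hist_end)
        if end >= self.today:
            # Today's part of the range goes to the RDB.
            result += self._select(self.rdb_rows, sym, max(start, self.today), end)
        return result


today = date(2013, 3, 12)
hdb = [
    {"date": date(2013, 3, 11), "sym": "IBM", "price": 210.0},
    {"date": date(2013, 3, 11), "sym": "MSFT", "price": 28.0},
]
rdb = [{"date": today, "sym": "IBM", "price": 210.5}]
gw = Gateway(hdb, rdb, today)
two_days = gw.query("IBM", date(2013, 3, 11), today)
```

The client issues one query over two days and gets one combined answer,
without knowing that the first row came from the HDB and the second from
the RDB.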

Repartitioning will be tough, but achievable.  In the places I worked,
repartitioning required taking the system offline.  The complexity is in
creating an API for clients and publishers to receive the new partition
information and to automatically adjust.
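
One possible shape for such an API, sketched in Python under my own
assumptions: a directory holds a versioned sym-to-RDB assignment and pushes
each new version to registered clients.  PartitionDirectory, Client, and the
RDB names are hypothetical, and the sketch ignores the hard part of moving
the data itself.

```python
class PartitionDirectory:
    """Hypothetical registry of the current sym -> RDB assignment."""

    def __init__(self, assignment):
        self.version = 1
        self.assignment = dict(assignment)
        self._listeners = []

    def watch(self, listener):
        """Register a listener and send it the current assignment at once."""
        self._listeners.append(listener)
        listener(self.version, dict(self.assignment))

    def repartition(self, new_assignment):
        """Install a new assignment and notify every listener."""
        self.version += 1
        self.assignment = dict(new_assignment)
        for listener in self._listeners:
            listener(self.version, dict(self.assignment))


class Client:
    """Client or publisher that follows the directory's updates."""

    def __init__(self, directory):
        self.version = 0
        self.assignment = {}
        directory.watch(self.on_update)

    def on_update(self, version, assignment):
        # Ignore stale updates that arrive out of order.
        if version > self.version:
            self.version, self.assignment = version, assignment

    def target_for(self, sym):
        return self.assignment[sym]


directory = PartitionDirectory({"IBM": "rdb1", "MSFT": "rdb1"})
client = Client(directory)
before = client.target_for("MSFT")
directory.repartition({"IBM": "rdb1", "MSFT": "rdb2"})
after = client.target_for("MSFT")
```

The version number lets a client discard an older assignment that arrives
late, which matters once updates travel over a network rather than through
a direct call.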

When do you want to start teaching the course and what do you think are the
next steps?

Thanks,

Jeff






On Mon, Mar 11, 2013 at 11:18 AM, Dennis Shasha <shasha@courant.nyu.edu> wrote:

>
> Dear Jeff,
> So, the setup I'm imagining works in stages:
>
> 1. feeder client --> tickerplant --> partitioned based on securityids to
> rdbs
>
> 2. feeder client --> tickerplant --> partitioned based on securityids to
> rdbs
> + replication to a single site
>
> 3. queries to partitioned site or replicated site depending on query type.
>
> 4. system combining rdb and hdb.
>
> 5. Ability to repartition if demand is high.
>
> Best,
> Dennis
>

