Computer Science

Description

You are given a bunch of "investment vehicles" which we will unceremoniously call gambles. Before discussing an exact formulation of the problem, let's suppose just for the moment that you have a bunch of possible gambles and each of them pays two to one with 0.7 odds. Should you put all your money in one or spread it out assuming your goal is to maximize expected return and minimize the variance. Obviously spread them out. Expected value doesn't change but variance goes way down. A minimum variance solution that gives a certain expected value is called an efficient frontier .

Now suppose that you have correlations among your gambles. That is, suppose you have class A gambles that have a correlation of 1 with one another, class B gambles that have a correlation of 1 with one another and so on for other classes. You would want to place your gambles in the different classes. Dividing money among two gambles in the same class gives the same variance as putting all the money on one gamble of that class.

We are now ready to state the problem. Each gamble gi has a high return (gihi), a medium return (gimed), and a low return (gilow). If you place d dollars on gi and you get the high return, then you gain d*gihi dollars from that gamble. Each return has an associated initial probability: gihiprob, gimedprob, gilowprob. Initially, the average expected return per gamble is 2 to 1, but that is just an average. Different gambles may give different returns.

After an initial assignment of probabilities, they are modified based on four binary attributes A1, A2, A3, and A4. That is certain combinations of those attribute-values (called favorable combinations) bias gihiprob higher at the expense of gilowprob and certain ones (called unfavorable combinations) make gilowprob higher at the expense of gihiprob. The numbers in each category are equal. The same combinations will have the same effect across all runs. Here is how the bias works: if a gamble's attribute-values are a favorable combination then then halve gilowprob and add that probability to gihiprob. If a gamble's attribute-values are unfavorable then halve gihiprob and add that probability to gilowprob.

Moreover, certain gambles are linked. Gambles are played in an order g1, g2, ... gn, where the order is a random permutation of the gambler ids (different for each run). If it is time to play gi then count all gj where j < i that are linked to i and that got a high return. Call that number Hi. Then medium Mi. Then Li. If Hi > Mi + Li, then halve gilowprob and add that probability to gihiprob. If Li > Hi + Mi, then halve gihiprob and add that probability to gilowprob. Then play gamble gi.

You will be given an initial capital of 1 (which you can divide however you wish), a bunch of possible gambles with various returns and probabilities and some links among the gambles. Your task is to allocate your 1 unit among several gambles. Your goal is to increase your capital to 2 in as many rounds as possible.

The architect will then do the following in five rounds: Assign outcomes to each gamble according to the probability above. Calculate each person's wealth based on their allocation and the outcomes. If a person has 2 units or more (having started each round at 1), then that person gets a point for that round. Also the player will be told what the final return outcome of each gamble was to help the player infer which attribute combinations are favorable and perhaps change allocations for the next round. The player is not told which order the gambles were played in however.

The Learning Problem

Note that you will greatly increase your odds of winning if you can bet more on favorable attribute-values and less on unfavorable attribute-values. So you must determine which are which. For this you will need a learning algorithm. A simple learning algorithm would be bet more on attribute-values associated with the most wins (high payoff). So for each combination of attribute-values, you write down which fraction of results gave a high return. Then you put more money into the attribute-values that gave you high returns and less into the one that gave low returns. Heuristically, this works if the probabilities gihiprob, gimedprob, and gilowprob were initially the same for all gambles, there was bias, and there were no links. But our world is more complicated than that.

Call a gamble "clean" if it has no links. So you might try bet more on clean attribute-values associated with the most wins. But then there might be too few of these. How would you weight the different links?

Other indications of goodness might have to do with what the initial probability is of getting the high payoff. If it's relatively high to begin with then it is less surprising that we get to a high payoff. You want to find the combinations that are favorable. (There will be a prize for this alone.)

Here is a hint. Suppose that you knew that a certain pair of combinations were favorable. Then you could evaluate the expected value of high payoffs for all the gambles ignoring links. You could then look at the combination pair that maximizes the closeness of the actual outcome to the predicted one. That is,
argmin_combinationpair (diff(actual outcomes, predicted outcomes given that some pair of combinations is favorable)) * prob(a given pair).

You might find it useful to look at maximum likelihood estimators.

Architecture Team

Generate the data (roughly 200 gambles). There are three tables that are all visible.

gamble(gambleid, high return, high prob, medium return, med prob, low return, low prob)
(The average expected initial return of any gamble will be 2 to 1, but this can change based on attributes and links.)

gambleatts(gambleid, A1, A2, A3, A4)
(The gambleid and its attributes; these are binary 0/1 values. and is visible to you. Hidden from you is a set of two combinations that are favorable, two that are unfavorable and twelve that are neutral.)