# Talk:Martingale (probability theory)

## tau

I don't think saying that [τ=t] is independent of X_{t+1} etc. is quite what we want, but the issue is a little messy since I've been trying to avoid talking about filtrations. The definition of stopping time I'm used to is with respect to a filtration; the natural filtration to use (given that we aren't talking about filtrations) would seem to be the one generated by the X_{i}. In this case [τ=t] is a function of X_{1} through X_{t}, which is what I was aiming for in the definition given here.

I'm not sure how to make the "independent of the future" style definition work---for example, we could have a martingale where X_{t+1} encodes in some subtle way (e.g. by minute perturbations of the lower-order bits) all of the preceding values. So for the moment I am pulling this paragraph out, since unless I am missing something it looks like the last claim is just plain false:

- To say that "the occurrence or non-occurrence of the event τ=*t* depends only on the values of *X*_{1}, *X*_{2}, ..., *X*_{t}" need not mean that the occurrence or non-occurrence of that event is completely determined by those values. It can mean rather that, although the occurrence or non-occurrence of the event τ=*t* may be correlated with the values of *X*_{1}, *X*_{2}, ..., *X*_{t}, that occurrence or non-occurrence is probabilistically independent of the values of *X*_{t+1}, *X*_{t+2}, *X*_{t+3}, ... .

Populus 03:06, 13 Nov 2003 (UTC)

Just added an article on filtrations, putting the link here in case anybody wants to link to it ... Bryan Barnard 22:40, 14 Jun 2004 (UTC)

"1 = E[Y1] = E[Yτ] = m² - E[τ]. We immediately get E[τ] = m²+1" Seems to me we get E[τ] = m² - 1. Typo? - Gauge 00:49, 21 Aug 2004 (UTC)

Why on earth is a Martingale called a Martingale? One sentence on the historical reason would be interesting Torfason 18:52, 22 October 2005 (UTC)

## Recent addition

I added a general mathematical definition for martingales that take values in general topological vector spaces. They have quite a lot of applications in mathematical finance and in stochastic partial differential equations, so I think it is an important definition.

But I am not sure that the remark is placed in the right position (quite at the beginning). On one hand, it is a definition, and should be written in the beginning. On the other hand, it uses some abstract mathematical tools, and should be put at the end of the article to avoid making it too hostile to non-mathematical readers. gala.martin

- Good points. For now I moved it down. See the article martingale and its history for my explanation. Oleg Alexandrov (talk) 01:51, 4 December 2005 (UTC)

I'm a wikipedia neophyte, so I won't try to sign this or anything like that. Nor am I an expert on sailing vessels, so I won't try to edit this page, either. However-- I do believe there is a type of sail called a martingale, which could benefit from its own page disambiguated from this one.

- I added it to the disambig at the top --14:18, 8 March 2006 (UTC)

## Gambler's Fallacy

"Of course in reality the exponential growth of the bets would quickly bankrupt those foolish enough to use the martingale after even a moderately long run of bad luck."

I think this sentence qualifies as gambler's fallacy as it implies a long run of bad luck would affect future bets badly. --Kurulananfok 21:00, 20 May 2006 (UTC)

- Maybe the sentence should be changed (do we need to say that somebody is foolish?). Anyway, since nobody has infinite worth, after a long run of bad luck there are no *future bets*: the gambler goes bankrupt and cannot afford further betting. --gala.martin (what?) 13:38, 21 May 2006 (UTC)

## Compared to others

I think that we had better put supermartingale, submartingale, and semimartingale together in the article. Jackzhp 16:29, 9 September 2006 (UTC)

## suggestions

In the optional sampling theorem, condition (a) is redundant since you require (b). Second, the link from the word "constant" to mathematical_constant is irrelevant; it's not the same meaning or intention.

In Optional Sampling Theorem there is a reference to undefined condition (c): "a gambler with a finite lifetime (which gives conditions (a) and (b)) and a house limit on bets (condition (c))"

## Martingale systems in investment, and the value of stating the obvious

It seems that more snake oil is being sold, and more ignorance spread, in the form of martingale systems promoted as a sound investment strategy, for stocks or foreign exchange. Shouldn't it perhaps be mentioned that martingale systems have been used, not just by casino gamblers, but by investors who perceive some kind of scientifically proven advantage to the use of such systems?

On the subject of stating the obvious, perhaps a plain language explanation of the fallacy of martingale systems would be suitable. Simply put, if you're betting on a flip of a fair coin, then your expectation is zero (that is, your current status +/- 0), and by the very definition of random outcomes there is no need to refer to previous results to draw this conclusion. When betting in a "supermartingale" situation, such as a bet on black on a roulette wheel with two green spots, your expectation is determined by the odds of one trial, again without reference to previous trials, and this is largely self-evident without lengthy mathematical proof (I'm not saying we shouldn't show the proof! Only that it's fair to make a plain language statement of the obvious fact as well).

A similar plain language explanation might be to point out that, "if it makes sense, after six consecutive losses, to bet 2^6 times your original bet, and if results of each bet are independent, then it makes just as much sense to bet that amount *right now*. If it ever makes sense to bet $64.00 (or $64,000.00), then it *always* makes sense to do so. If it ever *doesn't* make sense to bet that amount, then it *never* makes sense to, unless somehow the conditions of the game itself are changed... or unless you take the rational approach and constrain your bets, not by tying them to a chain of random past events according to an irrational system, but rather by the limits of your own tolerance for risk."

Maybe this doesn't sound scientific enough, but sometimes common sense should be expressed in a common way. zadignose 04:56, 22 August 2007 (UTC)
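
The plain-language point above can also be checked with exact arithmetic. The following is only a sketch under stated assumptions: a fair coin, even-money bets, a base bet of 1, and a gambler who can afford at most `max_rounds` consecutive losses. The function names are made up for illustration and do not come from the article.

```python
from fractions import Fraction


def doubling_outcomes(max_rounds, base_bet=1):
    """List (probability, net result) pairs for the doubling system.

    Assumed setup: fair coin, even-money payout, the bet doubles after
    every loss, and play stops at the first win or after max_rounds
    consecutive losses (the gambler can afford no further bet).
    """
    outcomes = []
    lost_so_far = 0
    for k in range(1, max_rounds + 1):
        bet = base_bet * 2 ** (k - 1)
        # The first win lands on round k with probability (1/2)^k.
        outcomes.append((Fraction(1, 2 ** k), bet - lost_so_far))
        lost_so_far += bet
    # Every one of the max_rounds bets is lost with probability (1/2)^max_rounds.
    outcomes.append((Fraction(1, 2 ** max_rounds), -lost_so_far))
    return outcomes


def expected_net(outcomes):
    """Exact expected net result of the listed outcomes."""
    return sum(p * x for p, x in outcomes)


if __name__ == "__main__":
    for n in (1, 6, 10):
        print(n, expected_net(doubling_outcomes(n)))
```

Under these assumptions every winning branch nets exactly the base bet, and the single losing branch is large enough to cancel all of them, so the expectation stays at zero however many doublings are allowed.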

- Zadi, the idea of doubling your bet until you win can work, but you'd have to wait until the right point in time. If there were already 10 heads in a row and THEN you started the double bets, you ARE INDEED much better off. The reason for that is that 10 in a row is very rare and 15 in a row is even more rare. If you withheld the start of your betting until a major statistical anomaly was already present, then what you are actually doing is betting AGAINST the continuation of the MAJOR anomaly. The key would be to know how often 10, 15 and 20, etc. in a row actually occur. Just by the seat of my pants, I'd venture to say you almost never see 20 in a row - even with millions of flips. 216.153.214.89 (talk) 00:09, 13 September 2008 (UTC)

- Eek! Gambler's fallacy! CRETOG8(t/c) 00:20, 13 September 2008 (UTC)

No - you are wrong. The odds are added differently than you think. If you flip a coin, it's 50/50. If you've flipped and gotten 10 heads in a row, is it still 50/50 on the 11th? No. Why? Because the more you get of the same in a row, the less likely the streak is to continue. If not, then you'd see 5,000 in a row all the time - but you never do. The way to count the odds is the difference between the frequency of each count in a row. For example, 5 in a row would take X number of flips to see one occurrence of 5 in a row. Well it takes a LOT more flips to see a 10 in a row than a 5 in a row. The longer the streak, the less often it occurs. Let's just say for example 10 in a row happens once every 100,000 times and 11 in a row happens once every 110,000 times. The odds are .909 lower (100/110) that 10 in a row will continue to 11 in a row. However, if you took the results of the 11th flip after 1,000 ten-in-a-row runs, it would be 50/50. But we ARE NOT talking about doing 10 in a row many times and then going for the 11 in a row afterwards. We are only talking about 1 series and betting against that 1 series until it reverses. By waiting until there is a higher number of flips in a row (starting at 10 in a row, say), before starting the opposite betting, what you are doing is betting ONLY during a small slice of time, where a larger slice of time is required to get to 11 in a row. You are in effect exploiting the known temporary anomaly in the odds. I've discussed this at length in the past with an oddsmaking math genius - it is correct and could be proven with a qbasic program.

The odds of 50/50 flips run in small peaks and valleys. Imagine a timeline fluctuating up and down as it goes - these are the flips recorded as a fluctuating line. Up is heads, down is tails, left to right are the flips. Up one notch for heads, back to one below the line for tails and vice versa. As long as you get the same in a row, the peak or valley gets larger. As soon as the series breaks, you go to position #1 on the other side of the line. The line would look something like a seismograph printout. But on our paper, no matter how long our sheet was, you'd almost NEVER find a 10 valley or a 10 peak (10 tails or 10 heads in a row). And it's that much more difficult to get to 11, 12. It's not that you are beating the odds, it's that you are only playing when the odds are in your favor. The longer you let the streak get before you bet, the better the odds it will reverse while you still have money. If you don't believe me, challenge yourself to prove it. Let's write the qbasic code for this proof and post it here.

BTW, the reason why no one benefits from this odds anomaly is that you'd have to stand by the roulette wheel all day just to see a goodly enough number in a row that your starting point would be affordable. You might have to wait all week to see one occurrence of 10 in a row (must count 0 and 00 as continuing the streak - so your odds are less than 50/50 to start). So, if you went against it at 10, it would likely reverse by 13. So you bet 100(11-lose), bet 200(12-lose), bet 400(13-win). You lost $300, won $400, net $100 and you worked all week to get it. The anomaly is there, but the risk and time required to exploit it is too great. No matter how many times you double bet before you finally win, in the end, your net winnings equals the amount of your 1st bet. 216.153.214.89 (talk) 05:50, 18 September 2008 (UTC)

- Yep, that's exactly the gambler's fallacy. You wrote, "*If you flip a coin, it's 50/50. If you've flipped and gotten 10 heads in a row, is it still 50/50 on the 11th?*" To which my answer is, yes, it is. CRETOG8(t/c) 06:04, 18 September 2008 (UTC)
- P.S. The talk page of an article isn't really the place to go into this in depth, but if you want to discuss it more, feel free to hit me at my talk page. CRETOG8(t/c) 06:05, 18 September 2008 (UTC)

No, you are wrong. In any lot of odds-smoothing series, there will be outlier events. The longer a continuous run is from the mean, the greater the odds that it must revert to the mean. The only fallacy about the "gambler's fallacy" is that some idiots think that all people who look at this and think "not so", would actually try to make $$ betting on it. Let me ask you a question: If you had just witnessed 100 heads in a row and had a $900 billion reserve, are you going to tell me you couldn't win the even money bet that the next would be tails - even for say $10>20>40>80>160>320>640>1,280>2,560>5,120>10,240>$20,480? Are you going to go from seeing 100 in a row (which NEVER happens) to 110 in a row (which NEVER NEVER happens)? It's not a "fallacy", it's an odds-smoothing anomaly and it could be exploited, but the amount you could bet is too tightly controlled by the demands for a HUGE capital reserve, so even if you waited for a string of many in a row before you start, it wouldn't be worth the effort. To understand this, you need to step back from the absolute of the total odds being spread over all flipping occurrences and see that localized odds-smoothing anomalies do rarely occur and when they do, they are detectable. 216.153.214.89 (talk) 11:28, 19 September 2008 (UTC)

- What are you talking about? To illustrate the point: I just flipped my quarter and it landed heads three times in a row. Is it going to magically be more likely to land on tails when I toss it again? Of course not. If you suggest that it is so, you are contradicting practically all of probability theory and the theory of martingales. When you say "given that we tossed a coin 10 times, what's the probability that it will land on heads?" You calculate the conditional probability of an 11th heads given 10 heads, which by Bayes Rule would amount to the equation: P[11 heads]/P[10 heads] = (1/2)^11/(1/2)^10 = 1/2. There you have it, 1/2 probability of getting a heads on the next toss. I'm not trying to discredit you, but I find it highly unlikely that you spoke at length with a "math genius" on this topic. Furthermore, you cannot "prove" such a thing with a computer simulation (lol @ qbasic). Also, given that you are discussing in the mathematics section, please try to use mathematically precise language. —Preceding unsigned comment added by 207.237.81.59 (talk) 18:07, 11 February 2009 (UTC)
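
The ratio in the comment above can be reproduced by brute-force counting rather than simulation. This is a minimal sketch assuming a fair coin with independent flips, so every sequence of flips is equally likely; `prob_next_head_given_streak` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product


def prob_next_head_given_streak(streak):
    """P(flip streak+1 is heads | the first `streak` flips were all heads).

    Assumes a fair coin with independent flips, so all 2^(streak+1)
    sequences are equally likely and the probability is a ratio of counts.
    """
    sequences = product("HT", repeat=streak + 1)
    # Keep only the sequences that start with `streak` heads in a row.
    given = [s for s in sequences if all(c == "H" for c in s[:streak])]
    # Among those, count the ones whose next flip is also heads.
    both = [s for s in given if s[streak] == "H"]
    return Fraction(len(both), len(given))


if __name__ == "__main__":
    for streak in (3, 10):
        print(streak, prob_next_head_given_streak(streak))
```

Counting gives the same value as the quotient (1/2)^11 / (1/2)^10: only two sequences begin with ten heads, and exactly one of them continues with heads.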

207: Thanks for the feedback. What I am referring to is reversion to the mean. Please read this and after you have done so, please re-consider my comments in light of that information. 216.153.214.89 (talk) 01:09, 23 February 2009 (UTC)

- 216.153.214.89, you're still wrong, and committing the gambler's fallacy by misunderstanding reversion to the mean. Reversion to the mean states that, e.g., a run of 9 heads out of 10 is likely to be followed by the next 10 flips having fewer than 9 heads, closer to 5. Reversion to the mean only comes into play when you're comparing two of the same observations: a run of 100 heads is unlikely to be followed by another identical run of 100 heads, but reversion to the mean says nothing about observing the next 1 or 10 flips, as that's a different variate. In practice, here is what reversion to the mean means:
- 1) Getting eight heads out of ten flips can mean that you're flipping a very biased coin. But it's much more likely that you're flipping a fair coin but got an unusual number of heads. Therefore, we expect that the next ten flips will have a number of heads closer to average, five.
- 2) This applies in say, IQ tests. There are many more people who have an IQ score of 130 than 140. So if you take an IQ test and get a score of 135, it's more likely that you were a person with an IQ score of 130 scoring 5 points high out of luck than a person with a score of 140 scoring 5 points low out of bad luck. So your next IQ test is more likely to be closer to the mean of 100. It's the same thing for height; a 6'10" person is more likely to have 6'6" genes and a great environment than 7'2" genes and a poor environment, simply because many more people have 6'6" genes. —Preceding unsigned comment added by 132.228.195.207 (talk) 21:31, 31 March 2009 (UTC)
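
Point 1) above can be illustrated with a small seeded simulation. This is a sketch, not a proof: it assumes a fair coin, and the function name, trial count, threshold and seed are arbitrary choices made for the illustration.

```python
import random


def mean_heads_after_extreme_block(trials=100_000, threshold=8, seed=0):
    """Average heads in a second block of 10 fair flips, restricted to
    trials whose first block of 10 flips had at least `threshold` heads."""
    rng = random.Random(seed)
    followups = []
    for _ in range(trials):
        # Ten fair flips are ten random bits; the number of 1 bits is the head count.
        first = bin(rng.getrandbits(10)).count("1")
        second = bin(rng.getrandbits(10)).count("1")
        if first >= threshold:
            followups.append(second)
    return sum(followups) / len(followups)


if __name__ == "__main__":
    print(mean_heads_after_extreme_block())
```

The selected first blocks average more than eight heads by construction, while the follow-up blocks should average close to five. That is the whole content of reversion to the mean here: the next block is ordinary, not tilted toward tails.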

## Origin of the name

There is no citation of the etymology of Martingale. In fact, I am reasonably confident that the stochastic process called a Martingale had its origin from the horse collar, not the gambling system. It is to give the imagery that there is a constraint on where the horse's head can move in the next time instant. It was a coincidence that this also happened to be a name of a gambling system. —Preceding unsigned comment added by 124.171.59.106 (talk) 11:37, 24 April 2008 (UTC)

- A coincidence? The gambling system has an obvious connection to the mathematical concept. I find it hard to believe that it would be a coincidence. The entire point of the gambling system is a claim that the optional sampling theorem can be violated. -- Preceding unsigned comment added by 132.228.195.207 (talk) 21:35, 31 March 2009 (UTC)

## Demonstration of R software

I think that the R-programmed simulation of a Brownian motion is just a sequence of independent variables, whereas it should be the sum of independent random variables. I haven't changed it, because I'd like someone to provide a second opinion. Thanks. —Preceding unsigned comment added by Philip Maton (talk • contribs) 16:46, 27 May 2009 (UTC)

- I agree. As it stands, the Excel formula and the R command/graphic are inconsistent. -- Chadhoward (talk) 22:18, 14 March 2010 (UTC)
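
To make the distinction raised above concrete, here is a sketch in Python rather than R (the article's actual R command is not quoted on this page, so nothing below reproduces it). The function names and the time step `dt` are assumptions for illustration: independent Gaussian draws are one thing, and a discretised Brownian path is their running sum.

```python
import math
import random
from itertools import accumulate


def brownian_increments(n, dt, rng):
    """n independent N(0, dt) draws: the steps, not the path itself."""
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]


def cumulative_path(increments):
    """Running sum of the increments, starting from W_0 = 0."""
    return [0.0] + list(accumulate(increments))


if __name__ == "__main__":
    rng = random.Random(1)
    steps = brownian_increments(5, dt=0.01, rng=rng)
    print(steps)                   # independent noise
    print(cumulative_path(steps))  # discretised Brownian path
```

Plotting the first list gives white noise scattered around zero; only the second list wanders the way a Brownian motion should.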

## Polya's urn redirect

Polya's urn redirects here. Wouldn't it be better if it redirected to Urn problem? --Squidonius (talk) 23:20, 23 June 2010 (UTC)

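### A more general definition

One can define a martingale which is an uncountable family of random variables. Also, those random variables may take values in a more general space than just the real numbers.

Let $I$ be a directed set, $V$ be a real topological vector space, and $V^*$ its topological dual (denote by $\langle \cdot, \cdot \rangle$ this duality). Moreover, let $(\Omega, \mathcal{F}, \{\mathcal{F}_\alpha\}_{\alpha \in I}, P)$ be a filtered probability space, that is, a probability space $(\Omega, \mathcal{F}, P)$ equipped with a family of sigma-algebras $\{\mathcal{F}_\alpha\}_{\alpha \in I}$ with the following property: for each $\alpha, \beta \in I$ with $\alpha \le \beta$, one has $\mathcal{F}_\alpha \subseteq \mathcal{F}_\beta \subseteq \mathcal{F}$.

A family of random variables $\{X_\alpha\}_{\alpha \in I}$:

$$X_\alpha : \Omega \to V$$

is called a martingale if for each $f \in V^*$ and $\alpha, \beta \in I$ with $\alpha \le \beta$, the three following properties are satisfied:

- $X_\alpha$ is $\mathcal{F}_\alpha$-measurable.
- $E\left[\,|\langle f, X_\alpha \rangle|\,\right] < \infty$.
- $E\left[\langle f, X_\beta \rangle \mid \mathcal{F}_\alpha\right] = \langle f, X_\alpha \rangle$.

If the directed set $I$ is a real interval (or the whole real axis, or a semiaxis) then a martingale is called a continuous time martingale. If $I$ is the set of natural numbers it is called a discrete time martingale.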

## Coined by Ville

I think the term martingale was coined by Ville, somewhat before Lévy used them. As for the Girsanov theorem, it only permits constructing a measure that makes SOME Ito processes martingales (not any Ito process!). And Balsara and Kleinert seem out of place in the references. Why not link to Bachelier or some other financial applicant instead of referencing Balsara's book "for futures traders"? Likewise, treatises of Doob, Meyer or Neveu are actually about martingales, whereas Kleinert's book seems to be about "a lot of other stuff". 90.27.21.180 (talk) 13:50, 28 April 2010 (UTC)

## Relation to Markov chains ???

Surely, there must be some theorems relating martingales to Markov chains? The definitions are quite similar; one speaks of probabilities, the other of expectations. Surely there must be overlap! Yet the current article breathes nary a word of this. linas (talk) 03:40, 20 November 2010 (UTC)

- They are only related in that they both describe stochastic processes. The Markov property (roughly) states that after observing the prior state of the system, there is no additional information that can be gained by observing earlier states. In a Markov chain, once we enter one link of the chain, the next transitions out of that link only depend on that link; they do not depend on how we got to that link. However, in a Martingale (roughly), the conditional expectation of the next observation given this and *all* prior observations is equal to this observation. So Martingales need not have a Markov property as the next transition can depend on more than the present state of the system. -- **TedPavlic** (talk/contrib/@) 02:15, 15 August 2011 (UTC)

## Lead is gibberish

Please explain in English, someone! -- cheers, Michael C. Price ^{talk} 12:56, 14 August 2011 (UTC)

- The lede gives the standard rough definition of a martingale. In a prototypical martingale stochastic process, a realization (i.e., a "draw") of the current value of the stochastic process is exactly the mean of the next value of the process. For example, imagine a game where you win a dollar every time a fair coin comes up heads and you lose a dollar every time that fair coin turns up tails. Your winnings are a martingale. That is, your winnings now are the expected value (the "mean" in probability) of your winnings after the next coin flip. In other words, the next coin flip will increase your winnings by a dollar with 50% probability and will decrease your winnings by a dollar with a 50% probability, and so your expected *gain* on the next flip is $0, and your expected *total winnings* on the next flip is identical to your current winnings. Because the expected winnings on the next flip are equal to your known winnings on this flip, the process is a **martingale**. Does that help? Do you think there is something that could be added to the lede that could make things more clear? At the moment, the wikilinks should help clarify some of the muddiness. Where that doesn't help, some of the rest of the page should. However, the lede is a little short, and so maybe something (but what?) could be added. -- **TedPavlic** (talk/contrib/@) 02:07, 15 August 2011 (UTC)
- To clarify, I'm describing a "prototypical" Martingale by focusing on only the immediately prior observation. The broader definition of Martingale states that the conditional expectation of the next observation given *all* prior observations (including this one) is equal to this observation. So it's easier to talk about Martingales when they are Markov chains, but the generic Martingale need not have this property. -- **TedPavlic** (talk/contrib/@) 02:33, 15 August 2011 (UTC)
  - Thanks, that helps a great deal. I was struggling with stuff like "value of an observation" which I think means "value of an observa*ble*" or "value of the observables". The term "equal to the observation" needs clarification as well, IMO. -- cheers, Michael C. Price ^{talk} 06:05, 15 August 2011 (UTC)
    - Hm. I think "observation" is probably a better word than observable. This topic doesn't really relate to observability. That is, we aren't measuring an output of a system and trying to estimate the internal states of that system. Here, "observation" is a "realization" of a random variable. A stochastic process is a sequence (i.e., an ordered list) of random variables that are typically indexed by something related to time.
    - For example, a continuous-time stochastic process maps each continuous time t to a random variable X_{t}. So, at time 0, there is a random variable X_{0}, and at time 3.4, there is a random variable X_{3.4}. To "observe" these random variables means to "draw" from them according to their probability distributions just as you would draw from a card deck or flip a coin or pull the arm of a one-armed bandit. Making an observation of X_{0} may tell you nothing about what you would draw from X_{3.4}. On the other hand, if X_{0} is strongly correlated with X_{3.4}, then drawing from X_{0} might allow you to predict a future draw from X_{3.4} with absolute certainty.
    - A martingale (at least the conventional definition of one) is a discrete-time stochastic process (because it requires a notion of "immediately before"). A discrete-time stochastic process maps each discrete time k (i.e., a natural number (or an integer, if you like)) to a random variable X_{k}. In a continuous-time stochastic process, there is no way to describe the "next" random variable at any given time. However, in a discrete-time stochastic process, at time k the "next" random variable is at time k+1. It is not necessary for a discrete-time stochastic process to be a martingale. However, if there is a discrete-time stochastic process with the property that observing ("drawing") X_{k} tells you the (conditional) expected value of the probability distribution at k+1, then that process is a martingale.
    - So martingales are stochastic processes (i.e., ordered lists of random variables) where the conditional expected value of the probability **distribution** at time k+1 given all draws of variables before time k+1 is equal to the value drawn at time k. -- **TedPavlic** (talk/contrib/@) 14:28, 15 August 2011 (UTC)
      - These comments still hold for observables in physics too. "Observation" really is a better term ("realization" probably is even better). -- **TedPavlic** (talk/contrib/@) 17:49, 17 August 2011 (UTC)
        - I've edited the lede to try to make a distinction between observed values and random variables. -- **TedPavlic** (talk/contrib/@) 17:53, 17 August 2011 (UTC)
          - Thanks for your efforts. It all helps! -- cheers, Michael C. Price ^{talk} 20:35, 17 August 2011 (UTC)
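
The coin-flip winnings example from this thread is small enough to verify by exhaustive enumeration. The sketch below assumes a fair coin and a payoff of plus or minus one dollar per flip, as in the example; the function names are invented for illustration.

```python
from fractions import Fraction
from itertools import product


def winnings(history):
    """Total winnings after a history of +1 (heads) and -1 (tails) flips."""
    return sum(history)


def conditional_expected_next(history):
    """E[winnings after the next flip | the entire history so far]."""
    half = Fraction(1, 2)
    return half * winnings(history + (1,)) + half * winnings(history + (-1,))


def is_martingale_up_to(n):
    """Check the martingale identity for every possible history of length <= n."""
    for length in range(n + 1):
        for history in product((1, -1), repeat=length):
            if conditional_expected_next(history) != winnings(history):
                return False
    return True


if __name__ == "__main__":
    print(conditional_expected_next((1, 1, -1)))
    print(is_martingale_up_to(6))
```

In this particular game the conditional expectation depends on the history only through the current total, which is why the example is also a Markov chain. The definition itself conditions on the whole history.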

I've taken a stab at it myself, the philosophy being that anybody should be able to at least understand what the article is about (and those who want details will read further for the rigorous definition). Hence I've put the opening sentence into plain English, avoided defining any symbols, and incorporated mention of a prototypical well-known example, but fundamentally I've tried to preserve all that was being said. Cesiumfrog (talk) 01:12, 30 August 2011 (UTC)

- Hm. I think those edits might go too far. Terms like "likely" and "rise" and "fall" have too much loaded meaning and may communicate the wrong idea to the reader. For example, the conditional probability distribution of the next sample may have significant skew asymmetry. In that case, saying that the mean of that distribution is equivalent to the previous sample does not imply that the process is "equally likely" to "rise" and "fall" unless you put strict qualifications on what it means to rise and fall. You can imagine a probability distribution where 99% of the time it rises but the 1% of the time it falls it falls tremendously. In that case, you still have a Martingale, but the process would spend most of its time rising. So I prefer the old definition perhaps with a little rounding off of edges. This "lay definition" goes too far (IMHO). -- **TedPavlic** (talk/contrib/@) 19:05, 30 August 2011 (UTC)
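
A concrete instance of the skewed case described above, as a sketch: the step sizes +1 and -99 are made-up numbers chosen so that the mean step is exactly zero, and the names are hypothetical.

```python
from fractions import Fraction

# Assumed one-step distribution: rise by 1 with probability 99/100,
# fall by 99 with probability 1/100.
STEP_DISTRIBUTION = {
    Fraction(1): Fraction(99, 100),
    Fraction(-99): Fraction(1, 100),
}


def expected_step(dist):
    """Mean of the next increment; zero is what the martingale property needs."""
    return sum(step * p for step, p in dist.items())


def prob_rise(dist):
    """Probability that the next increment is positive."""
    return sum(p for step, p in dist.items() if step > 0)


if __name__ == "__main__":
    print(expected_step(STEP_DISTRIBUTION), prob_rise(STEP_DISTRIBUTION))
```

A process built from independent steps with this distribution is a martingale, yet it rises on 99% of its steps, so "equally likely to rise and fall" would describe it badly.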

Well, for future reference of anybody wanting simpler, here was my proposal for the lead:

*In probability theory, a* **martingale** *is a stochastic process (i.e., a sequence of random variables) which is likely to rise just as much as it is likely to fall. That is, given all of the previous observed values (i.e., the realizations), the expectation for the next value is always equal to the current value. An unbiased random walk is an example of a martingale. Martingales are models of fair games.*

and I disagree with the current version. Much of this pertains directly to the MOS:

- Since it is the norm for all mathematical articles to have a separate section for the precise definition, I don't think this article has an exceptional necessity for its lead to be cluttered with a self-referential explanation of that fact. Besides, there's already the TOC. (If there are still concerns, perhaps adding "rigorous" to the section title would alleviate them?)
- The lead is always supposed to be informal and "without rigor". (At the very least it should be less rigorous than those "mathematically rigorous definitions" it says are "given below".) Now, the lower section contains the mathematical definition (expressed *E(X_{n+1}|X_1,...,X_n)=X_n*) and also restates the definition into words (rigorously), saying it is a "*stochastic process (i.e., a sequence of random variables) [..for which..] the conditional expected value of the next observation, given all the past observations, is equal to the last observation.*" Alas, the current version of the lead not only preserves just as much rigor but also defines more terms and introduces more jargon: *"..is a stochastic process (i.e., a sequence of random variables) [for which,] given all of the previous observed values (i.e., the realizations), the conditional expected value (i.e., the expectation or mean of all cases that share the same previous observed values) for the next value is equal to the current value."*
- An article should never open with the words *"Very roughly speaking,"*. Furthermore, in cases (such as this) where the lead is really not speaking that inaccurately (ideally it just isn't using strict jargon), a disclaimer isn't warranted at all. (Those seeking a completely rigorous definition will still read down to that section.)
- Yes, it is true that the chance of a martingale rising is not always equal to the chance of it falling. But the expectation for how much it will rise does equal the amount it is expected to fall. Yes, this isn't perfectly unambiguous, which is why my proposal immediately follows up with a clarification of exactly this point. (The approach is to start simple and move toward more technical as the article proceeds. And not to proceed too long before getting to an example and motivation. The concept of martingales really shouldn't be too difficult for someone unfamiliar with expectation values to understand.)
- The use of the (common speech) clause "given .." makes redundant the term *conditional* expectation. Better to stick with what terminology is more likely to be familiar, and save more specialised jargon for outside of the lead.
- But I'm happy we're no longer introducing symbols in the lead. (I used the word "always" to replace the use of "t".)

Lastly, somewhere in the article I think an example should explain more clearly why the definition is not simply E(X_{n+1}|X_n)=X_n since the obvious examples are actually not conditional on X_1,..X_{n-1}. Similarly, a more tangible example (or at least less abstractly explained) of a martingale with respect to another sequence could be beneficial. Cesiumfrog (talk) 02:46, 31 August 2011 (UTC)

- Your qualification of what "always" means suggests a purpose for symbols. In any case, technical jargon can be resolved with Wikilinks as it is done on many other technical pages. I think it's important not to relegate the lead to be an auxiliary example (that is better suited for an "Examples" section). Regarding the conditional expectation and the other realizations, you need more than just the previous realization if the process does not have the Markov property (also see Markov chain). Imagine you had a deck of cards and every time you drew a card off the top of the deck, you did *not* replace the card back into the deck. In that case, the conditional probability distribution for the next card draw depends not only on the previous card drawn but on *all* of the previous cards drawn. Moreover, you could implement a "coin" based on such a card deck and thus the conditional probability of "heads" and "tails" would differ if you knew only the previous realization versus knowing all prior realizations. -- **TedPavlic** (talk/contrib/@) 18:29, 31 August 2011 (UTC)
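
The card-deck argument above can be made concrete with a toy deck. This sketch assumes a four-card deck of two red and two black cards, uniformly shuffled and drawn without replacement; the deck and the function name are illustrative choices only.

```python
from fractions import Fraction
from itertools import permutations

# Assumed toy deck: two red cards and two black cards.
DECK = ("R", "R", "B", "B")


def prob_next_red(condition, position):
    """P(card drawn at index `position` is red | condition on earlier draws).

    `condition` receives the tuple of colours of the first `position` cards.
    All orderings of the deck are treated as equally likely.
    """
    orders = list(permutations(range(len(DECK))))
    matching = [o for o in orders
                if condition(tuple(DECK[i] for i in o[:position]))]
    red = [o for o in matching if DECK[o[position]] == "R"]
    return Fraction(len(red), len(matching))


if __name__ == "__main__":
    # Knowing only that the most recent (second) card was red:
    print(prob_next_red(lambda drawn: drawn[-1] == "R", 2))
    # Knowing the full history of the first two draws:
    print(prob_next_red(lambda drawn: drawn == ("R", "R"), 2))
    print(prob_next_red(lambda drawn: drawn == ("B", "R"), 2))
```

The three probabilities differ even though the most recent card is red in every case, which is why the definition conditions on the full history rather than on the last observation alone.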

- I would suggest the following:

- In probability theory, a
**martingale**is a sequence of random variables (i.e. a stochastic process) where the expectation for the next value is always equal to the most recent value, notwithstanding all of the earlier observed values (i.e., the realizations). - An unbiased random walk is an example of a martingale. Martingales are models of fair games because the expectation of winning on the next round always equals the expectation of losing.

- In probability theory, a
- --agr (talk) 12:13, 31 August 2011 (UTC)
- "Notwithstanding" is not the right word. The conditional expected value is really the expectation conditioned on having knowledge of all prior realizations. If one of those realizations changes, the expectation may change. Moreover, the phrase about martingales being models of fair games is so strong that it seems to suggest that martingales are only applied in the analysis of games. A wikilink from games to a generic definition may solve this problem. —**TedPavlic** (talk/contrib/@) 18:29, 31 August 2011 (UTC)


- The martingale definition restricts the conditional probability distribution of the next sample. Each possible prefix (i.e., each sequence of prior realizations) maps to a conditional probability distribution of the next sample. "Notwithstanding" makes it sound as if the conditional probability distribution is somehow invariant to the history of the system. In fact it's the opposite. The history (i.e., all of the prior realizations) tells you which conditional probability distribution (or at least which conditional expected value) to use to describe the next sample. If it is a martingale, then the conditional expected value (which is parameterized by the entire history of the process at that point) will always be equal to the most recent realization. In other words, if I only knew the current observation, my expected value for the next observation might be different from what it would be if I knew the entire history. Imagine drawing cards out of a deck without replacement. If all you know is the previous card drawn, you have a lot of uncertainty about the next card drawn. However, if you know the entire history (i.e., every card drawn so far), then your expected value of the next card changes substantially. It is that latter expected value that is used in the definition of the martingale. This is why the **conditional** modifier is so important (see conditional expected value). —**TedPavlic** (talk/contrib/@) 03:08, 1 September 2011 (UTC)
- agr: If you'd like to use the "notwithstanding" language, it may be a good idea to take a hint from the definition in Probability and statistics: the science of uncertainty by Evans and Rosenthal (note that they explicitly assume that the martingale is a Markov chain "for simplicity", but they also say that "this is not really necessary"). In particular, you can rephrase the definition in terms of the conditional expectation of the increment X_{n+1} − X_{n} (given all prior observations). In that case, the conditional expectation is 0, notwithstanding the current observation. As they summarize, "on average", the chain's value "does not change, regardless of" what the current value actually is. Again, this definition assumes a Markov chain (explicitly, for simplicity), but a similar form could be used here in the general definition. If a Markov chain is needed to help explain, then it should be stated **explicitly** as a device for simplicity. —**TedPavlic** (talk/contrib/@) 20:07, 1 September 2011 (UTC)

- No, you're misimagining it or else it isn't a martingale. If all you know is the current observation, then that is your expectation value for the next observation, and if it's a martingale then you're correct. If knowing more would have changed your expectation to anything other than that, then it isn't a martingale. Yes, the probability distribution shape may change depending on earlier realisations, but not its center of gravity. For example, a drunkard may lurch with greater amplitudes if it has stumbled into the bar more recently than the coffee shop, but provided the walk stays unbiased it stays a martingale. Likewise, the random idling of a unicyclist can be a martingale despite being likely to go back and forth more than side to side (hence the probability distribution depends on the last two observations, but the expectation value depends only on the last one).
- agr, this suggests a great simplification of the lead. For example, is the definition "a martingale is a sequence of random observations for which the latest realised value is always the expectation value for the following observation" anything short of completely rigorous? Can you fully elucidate a counterexample, TedPavlic? Cesiumfrog (talk) 03:44, 1 September 2011 (UTC)
- Cesiumfrog: You need the whole history because that's the definition of a martingale. For confirmation, see Randomized algorithms by Motwani and Raghavan. Now scroll down to page 87. It says that the knowledge of the past bets does not help to predict the future. That's the essence of a martingale, and perhaps this is where agr's "notwithstanding" came from. In a martingale, if you know all of your past history, you are still clueless about the future outcome. The point of a martingale is not simply that knowing what you have now fixes your expectation of what you have later; it's that **even if you did know** the history of how you got here, it wouldn't help you to predict where you would go next. Thus, the expectation is conditioned on the entire history and does not require a Markov property. As mentioned in my previous edit, some authors add the Markov property for simplicity, but it is not the general definition of a martingale. A martingale is a fair game because there's no way to predict the future based on any knowledge of past events. —**TedPavlic** (talk/contrib/@) 20:21, 1 September 2011 (UTC)
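
The difference between conditioning on the latest value alone and conditioning on the entire history can be made concrete with a small constructed process. The sketch below is an illustration built for this purpose, not taken from any of the cited books: X1 is ±1 with equal probability, X2 is X1 plus an independent ±1 step, and X3 equals X2 unless X2 is 0, in which case X3 repeats X1. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def build_paths():
    """Enumerate the four equally likely paths (x1, x2, x3) of the toy process."""
    paths = []
    for e1, e2 in product((-1, 1), repeat=2):
        x1 = e1                       # first step: +1 or -1
        x2 = x1 + e2                  # ordinary fair step
        x3 = x1 if x2 == 0 else x2    # at 0, the process "remembers" its first step
        paths.append((x1, x2, x3))
    return paths


def cond_mean_x3(paths, key):
    """E[X3 | key(path)], computed exactly by grouping equally likely paths."""
    groups = {}
    for p in paths:
        groups.setdefault(key(p), []).append(p[2])
    return {k: Fraction(sum(v), len(v)) for k, v in groups.items()}


paths = build_paths()

# Conditioning on the most recent value X2 only:
given_last = cond_mean_x3(paths, lambda p: p[1])

# Conditioning on the full history (X1, X2):
given_history = cond_mean_x3(paths, lambda p: (p[0], p[1]))

print(given_last)
print(given_history)
```

In this construction the expectation of X3 given X2 alone always equals X2, yet given the history (X1, X2) = (1, 0) the expectation of X3 is 1 rather than 0. Under the history-based definition discussed above, the process is therefore not a martingale with respect to its own history, even though it satisfies the "latest value only" condition.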



Please review new lede. —**TedPavlic** (talk/contrib/@) 21:01, 1 September 2011 (UTC)

### Biased Coin?

Something about the biased coin as an example of a sub- and supermartingale seems misleading. The expectation value of a biased coin doesn't depend on its history – it's always p. Even in the case where you have a coin and are told that it's biased (but not by how much), you wouldn't know whether it was biased up or down, so still it's not a sub- or supermartingale. And if you were told that it was biased in a particular direction, the initial expectation value probably isn't 0.5.

206.124.141.187 (talk) 19:56, 29 June 2013 (UTC)

- Never mind – I realized that the bankroll is the random variable, not the outcome of the coin flips. 206.124.141.187 (talk) 03:44, 2 July 2013 (UTC)
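
The resolution above, that the bankroll rather than the coin outcome is the random variable, can be sketched numerically. This is a minimal illustration assuming a unit stake per flip, where heads adds the stake to the bankroll and tails subtracts it; the names `expected_next_bankroll` and `classify` are hypothetical.

```python
from fractions import Fraction


def expected_next_bankroll(bankroll, p_heads, stake=1):
    """E[next bankroll | current bankroll] for one bet on a coin with P(heads) = p_heads."""
    return p_heads * (bankroll + stake) + (1 - p_heads) * (bankroll - stake)


def classify(p_heads):
    """Label the bankroll process by the sign of its expected one-step change."""
    drift = expected_next_bankroll(0, p_heads)  # equals (2 * p_heads - 1) * stake
    if drift == 0:
        return "martingale"
    return "submartingale" if drift > 0 else "supermartingale"


for p in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5)):
    print(p, expected_next_bankroll(10, p), classify(p))
```

The expected one-step change does not depend on the current bankroll, so under these assumptions the sign of 2p − 1 alone decides whether the bankroll is a martingale, a submartingale, or a supermartingale.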

## Applications

I would like to suggest adding a section on applications of martingales. The abstract of *The least variable phase type distribution is Erlang* concludes by stating that their "proof is simple and elegant and is a nice example of the power of martingales; it seems intractible without them." Could we collect some examples of other insights given by the theory of martingales? Gareth Jones (talk) 16:50, 12 November 2013 (UTC)