PLoSWiki
http://topicpages.ploscompbiol.org/wiki/Main_Page
MediaWiki 1.17.0
first-letter
Media
Special
Talk
User
User talk
PLoSWiki
PLoSWiki talk
File
File talk
MediaWiki
MediaWiki talk
Template
Template talk
Help
Help talk
Category
Category talk
Talk:Approximate Bayesian computation
105
833
2012-05-15T21:56:28Z
Cdessimoz
12
Created page with " == Comments of Darren Logan on things to do before moving the article to Wikipedia == * The title should probably be "Approximate Bayesian computation" (small "c"). The "...in ..."
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small
"c"). The "...in computational biology" is probably redundant for WP
article. You can always make
a redirection from the long title to the short one.
* In the summary and elsewhere, you use terms like "over the last years"
and "recently". These should be avoided, as WP articles are not dated and
thus non-specific time-frames are not meaningful. If you need to refer to
time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked
examples. That type of content is better suited to Wikiversity or
Wikibooks. There are exceptions, however. The guidance on this can be seen
at WP:NOT, specifically: "An article should not read like a "how-to"
style... the purpose of Wikipedia is to present facts, not to teach subject
matter. It is not appropriate to create or edit articles that read as
textbooks, with leading questions and systematic problem solutions as
examples... Some kinds of examples, specifically those intended to inform
rather than to instruct, may be appropriate for inclusion in a Wikipedia
article." I think your example might be ok, but you should be careful of the
tone to ensure it doesn't seem like a "how-to" guide.
* Beginnings. It might be a good idea to start with the history section at
the very top.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
834
2012-05-15T21:57:58Z
Cdessimoz
12
/* Comments of Darren Logan on things to do before moving the article to Wikipedia */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
843
2012-05-17T06:48:26Z
Spencer Bliven
1
moved [[Talk:Approximate Bayesian Computation in Computational Biology]] to [[Talk:Approximate Bayesian computation]]: author request
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
846
2012-05-18T22:35:36Z
Spencer Bliven
1
/* Comments of Darren Logan on things to do before moving the article to Wikipedia */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
868
2012-05-22T10:41:21Z
Cdessimoz
12
/* Comments of Darren Logan on things to do before moving the article to Wikipedia */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
881
2012-06-04T19:08:02Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximate" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a very minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys[6]". I obviously appreciate very much that the authors advertise our warning about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC...
882
2012-06-04T19:29:53Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximate" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a very minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC...
883
2012-06-04T19:31:39Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximate" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a very minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC...
==Notes==
<references />
884
2012-06-04T19:32:30Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximative" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC...
==Notes==
<references />
885
2012-06-04T19:34:31Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximative" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC...
==References==
<references />
886
2012-06-04T19:34:52Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximative" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC...
==References==
<references/>
887
2012-06-04T19:35:36Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximative" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117 </ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC...
==References==
<references />
889
2012-06-04T19:37:37Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading the readers astray as they forget the "approximative" aspect of this distribution. Further below, I think I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC...
==References==
<references />
890
2012-06-04T19:42:44Z
Xian
20
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which makes it seem likely that it could lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this among the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref>Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
==References==
<references />
964
2012-06-12T09:16:31Z
Dennisprangle
21
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximate" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
965
2012-06-12T09:20:34Z
Dennisprangle
21
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximate" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
1006
2012-06-28T02:05:01Z
Daniel Mietchen
5
/* Comments of Darren Logan on things to do before moving the article to Wikipedia */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximate" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC-based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
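As a rough illustration of the adaptation suggested above, one common approach is to draw a model index from the prior over models, then draw parameters from that model's prior, simulate, and accept or reject as usual; the accepted model indices then approximate the posterior model probabilities. The sketch below is a toy under stated assumptions, not the article's method: the two coin-tossing models, the helper names (<code>simulate</code>, <code>sample_prior</code>, <code>abc_model_choice</code>) and all numerical settings are hypothetical. The data are compared directly rather than through summary statistics, so the caveats about summary statistics in model choice do not arise here.

```python
import random


def simulate(theta, n, rng):
    """Simulate the number of successes in n Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n))


def sample_prior(model, rng):
    """Hypothetical toy models: 0 = fair coin, 1 = theta ~ Uniform(0, 1)."""
    return 0.5 if model == 0 else rng.random()


def abc_model_choice(observed, n, eps, n_draws, rng):
    """ABC rejection over (model, parameter) pairs with equal prior model weights.

    Returns the fraction of accepted draws per model, which estimates the
    posterior model probabilities, or None if nothing was accepted.
    """
    counts = [0, 0]
    for _ in range(n_draws):
        model = rng.randrange(2)          # draw the model index from its prior
        theta = sample_prior(model, rng)  # draw the parameter given the model
        if abs(simulate(theta, n, rng) - observed) <= eps:
            counts[model] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else None


rng = random.Random(1)
probs = abc_model_choice(observed=9, n=10, eps=0, n_draws=20000, rng=rng)
print(probs)
```

With ε = 0 and discrete data this toy targets the exact posterior model probabilities, which for 9 successes in 10 trials favour the uniform-prior model with probability of about 0.90, so the estimate can be checked against a known answer.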
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
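The difference between the two criteria can be seen in a small sketch. It assumes a toy model that is not in the article (a Uniform(0,1) prior on a success probability, with the number of successes in 10 Bernoulli trials as data), and the function names and settings are hypothetical. With discrete data, exact matches have positive probability, so ε = 0 is meaningful under the non-strict criterion.

```python
import random


def simulate(theta, n, rng):
    """Simulate the number of successes in n Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n))


def abc_rejection(observed, n, eps, n_draws, rng, strict=False):
    """Basic ABC rejection sampler with a Uniform(0, 1) prior on theta.

    strict=False accepts when distance <= eps; strict=True when distance < eps.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw from the prior
        distance = abs(simulate(theta, n, rng) - observed)
        ok = distance < eps if strict else distance <= eps
        if ok:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
non_strict = abc_rejection(observed=7, n=10, eps=0, n_draws=5000, rng=rng)
strict = abc_rejection(observed=7, n=10, eps=0, n_draws=5000, rng=rng, strict=True)

print(len(non_strict), len(strict))
print(sum(non_strict) / len(non_strict))  # compare with the exact mean 8/12
```

Under the strict inequality nothing can be accepted at ε = 0, whereas the non-strict version keeps exactly the draws whose simulated data match the observation. In this toy case the exact posterior is Beta(8, 4), so the mean of the accepted draws can be compared with 8/12.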
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
1137
2012-07-05T13:45:41Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a very minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning<ref>Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
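One common way to adapt rejection sampling to model comparison, sketched here only as an illustration of the request above: draw a model index from the model prior, then a parameter from that model's prior, simulate, and accept as usual; the accepted model indices then approximate the posterior model probabilities. Everything in the sketch is an assumption made for illustration (the two toy coin models, the names <code>abc_model_choice</code>, <code>fair</code> and <code>biased</code>, the uniform model prior); it is not taken from the article or from the cited papers.

```python
import random
from collections import Counter


def abc_model_choice(observed, models, epsilon, n_draws, rng):
    """Approximate posterior model probabilities by ABC rejection.

    models maps a model name to a (prior_sample, simulate) pair.
    The prior over models is assumed uniform.
    """
    names = sorted(models)
    counts = Counter()
    for _ in range(n_draws):
        name = rng.choice(names)              # draw a model from the model prior
        prior_sample, simulate = models[name]
        theta = prior_sample(rng)             # draw a parameter from that model's prior
        if abs(simulate(theta, rng) - observed) <= epsilon:
            counts[name] += 1                 # keep the model index of accepted draws
    total = sum(counts.values())
    # Relative acceptance frequencies estimate the posterior model probabilities.
    return {name: counts[name] / total for name in names} if total else {}


def flips(theta, rng, n=10):
    """Hypothetical simulator: number of heads in n flips with head probability theta."""
    return sum(rng.random() < theta for _ in range(n))


models = {
    "fair": (lambda rng: 0.5, flips),             # head probability fixed at 0.5
    "biased": (lambda rng: rng.random(), flips),  # head probability ~ Uniform(0, 1)
}

rng = random.Random(1)
posterior = abc_model_choice(observed=9, models=models, epsilon=0,
                             n_draws=20000, rng=rng)
print(posterior)
```

In this toy case the data are compared directly, so no information is lost to summary statistics; with insufficient summaries the estimated probabilities can be misleading, which is the concern raised in Robert et al. (2011).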
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
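The boundary case can be checked with a minimal sketch of the rejection sampler. The toy model (heads in 10 coin flips, uniform prior on the bias) and all function names are assumptions chosen for illustration, not the example used in the article. The <code>strict</code> flag switches between the two candidate criteria.

```python
import random


def abc_rejection(observed, prior_sample, simulate, distance, epsilon,
                  n_draws, rng, strict=False):
    """Return prior draws whose simulated data are close to the observed data.

    strict=False accepts when rho <= epsilon; strict=True when rho < epsilon.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        rho = distance(simulate(theta, rng), observed)
        if (rho < epsilon) if strict else (rho <= epsilon):
            accepted.append(theta)
    return accepted


# Hypothetical toy model: number of heads in 10 flips of a coin with bias theta.
def prior_sample(rng):
    return rng.random()  # Uniform(0, 1) prior on theta


def simulate(theta, rng):
    return sum(rng.random() < theta for _ in range(10))


def distance(simulated, observed):
    return abs(simulated - observed)


rng = random.Random(0)
exact = abc_rejection(7, prior_sample, simulate, distance, 0, 5000, rng)
never = abc_rejection(7, prior_sample, simulate, distance, 0, 5000, rng,
                      strict=True)
print(len(exact), len(never))
```

With <math>\epsilon=0</math> the non-strict rule keeps exactly the draws whose simulated count equals the observed one, whereas the strict rule can never accept anything because a distance is never negative.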
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Christian P. Robert===
1: The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis H as used in the entry is actually the evidence in favour of the corresponding model.
Comment: We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
2: There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.
Comment: We have corrected the typos and grammatical mistakes found during the revision.
3: When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
Comment: This has been changed.
4: Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
Comment: The title has been changed to “Summary statistics”.
5: And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
Comment:
6: Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys".
Comment: The reference has been removed.
7: I also like the notion of "quality control", even though it should only appear once.
Comment: We have now merged the two sections about quality control.
8: And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
Comment:
9: The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!)
Comment: We have decided to keep this entry. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
10: It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle [2] when envisioning ABC as a non-parametric method of inference.
Comment:
11: I would suggest adding in this section links to relevant software like our own DIY-ABC[3]...
Comment: A section about software has been added, as well as a Table (Table 3) with references to relevant papers.
===Response to the comments by Dennis Prangle===
====Major comments====
1: Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
Comment: The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
2: A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
Comment: We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
3: The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
Comment: We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
4: I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
Comment: See our response to Christian Robert’s comment above (1).
5: and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
Comment: A section about software has been added.
====Minor comments====
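6: The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.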
Comment: This has been changed.
7: "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
Comment:
8: "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
Comment: We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
9: "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
Comment:
10: "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
Comment: This has been changed.
11: "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
Comment: This has been changed.
12: "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
Comment: A sentence was added with a reference to the paper.
13: "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
Comment: This has been changed.
14: "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
Comment: The formulation was changed to “may therefore be misinformative”.
15: "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
Comment: This formulation has been changed.
16: "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
Comment: This has been changed.
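On the parallelisation point in item 16, the draws in ABC rejection sampling are independent, so they can be split into batches that run concurrently and are simply concatenated afterwards. The following is a minimal sketch under stated assumptions: the toy coin-flip model, the names <code>run_batch</code> and <code>parallel_abc</code>, and the one-seed-per-batch scheme are all hypothetical. A thread pool is used only so that the sketch runs anywhere; for a CPU-bound simulator one would more likely substitute a process pool or a cluster scheduler.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_batch(seed, n_draws, observed, epsilon):
    """One independent batch of ABC rejection draws with its own random stream."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                                      # Uniform(0, 1) prior
        simulated = sum(rng.random() < theta for _ in range(10))  # heads in 10 flips
        if abs(simulated - observed) <= epsilon:
            accepted.append(theta)
    return accepted


def parallel_abc(observed, epsilon, n_batches, draws_per_batch, max_workers=4):
    """Run the batches concurrently and concatenate the accepted draws."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_batch, seed, draws_per_batch, observed, epsilon)
                   for seed in range(n_batches)]
        # Collect results in submission order so the output is reproducible.
        return [theta for future in futures for theta in future.result()]


sample = parallel_abc(observed=7, epsilon=1, n_batches=8, draws_per_batch=1000)
print(len(sample))
```

Because each batch owns its seed, the combined sample is the same as running the batches one after another, which makes it straightforward to check a parallel run against a serial one.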
17: "Curse of dimensionality": Some theoretical results have been proved here[7][2].
Comment:
18: "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
Comment: This formulation has been changed.
1139
2012-07-05T13:51:51Z
Mikaelsunnaker
13
/* Response to the reviewers */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
Although this is a very minor point, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning<ref>Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Christian P. Robert===
1: The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis H as used in the entry is actually the evidence in favour of the corresponding model.
'''Comment:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
2: There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.
'''Comment:''' We have corrected the typos and grammatical mistakes found during the revision.
3: When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Comment:''' This has been changed.
4: Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Comment:''' The title has been changed to “Summary statistics”.
5: And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
'''Comment:'''
6: Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys".
'''Comment:''' The reference has been removed.
7: I also like the notion of "quality control", even though it should only appear once.
'''Comment:''' We have now merged the two sections about quality control.
8: And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Comment:'''
9: The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating a ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Comment:''' We have decided to keep this entry. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
10: It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero. As discussed in the recent Read Paper by Fearnhead and Prangle [2] when envisioning ABC as a non-parametric method of inference.
'''Comment:'''
11: I would suggest adding in this section links to the relevant softwares like our own DIY-ABC[3]...
'''Comment:''' A section about software has been added, as well as a Table (Table 3) with references to relevant papers.
===Response to the comments by Dennis Prangle===
====Major comments====
1: Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
2: A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
3: The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
4: I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
5: and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
6: The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if ε = 0 is to correspond to acceptance of exact matches only.
'''Comment:''' This has been changed.
7: "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
8: "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
9: "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
10: "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
11: "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
12: "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
13: "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
14: "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
15: "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
16: "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
17: "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
18: "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1140
2012-07-05T13:54:47Z
Mikaelsunnaker
13
/* Response to the reviewers */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which makes it seem likely that it could lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating a ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero. As discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
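The first minor comment above, on the acceptance criterion, can be illustrated with a minimal sketch of ABC rejection sampling. Everything here is an assumption made for illustration and is not taken from the article: the coin-toss model, the uniform prior, the absolute-difference distance, and all function names (<code>abc_rejection</code>, <code>sample_prior</code>, <code>simulate_heads</code>, <code>abs_distance</code>) are hypothetical. The sketch only shows that with ε = 0 the non-strict test <math>\rho (\hat{D},D) \le \epsilon</math> keeps exact matches, whereas the strict test <math>\rho (\hat{D},D)<\epsilon</math> can never accept anything.

```python
import random


def abc_rejection(observed, prior_sample, simulate, distance, epsilon,
                  n_draws, rng, strict=False):
    """Return the prior draws whose simulated data fall within epsilon of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a parameter from the prior
        simulated = simulate(theta, rng)   # simulate data under that parameter
        d = distance(simulated, observed)
        # strict=True uses rho < epsilon; strict=False uses rho <= epsilon.
        keep = d < epsilon if strict else d <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


# Hypothetical toy model: number of heads in 10 tosses of a coin with bias theta.
def sample_prior(rng):
    return rng.random()                    # uniform prior on [0, 1)


def simulate_heads(theta, rng, n_tosses=10):
    return sum(rng.random() < theta for _ in range(n_tosses))


def abs_distance(simulated, observed):
    return abs(simulated - observed)


observed_heads = 7
exact = abc_rejection(observed_heads, sample_prior, simulate_heads,
                      abs_distance, 0, 5000, random.Random(1))
strict = abc_rejection(observed_heads, sample_prior, simulate_heads,
                       abs_distance, 0, 5000, random.Random(1), strict=True)

print("accepted with rho <= 0:", len(exact))
print("accepted with rho < 0:", len(strict))   # always 0: a distance is never negative
```

With a discrete summary such as a head count, ε = 0 is usable because exact matches occur with positive probability; for continuous data exact matches have probability zero, which is one reason a non-zero tolerance is used in practice.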
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Christian P. Robert===
'''1:''' The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which makes it seem likely that it could lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model.
'''Comment:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
'''2:''' There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.
'''Comment:''' We have corrected the typos and grammatical mistakes found during the revision.
'''3:''' When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Comment:''' This has been changed.
'''4:''' Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Comment:''' The title has been changed to “Summary statistics”.
'''5:''' And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
'''Comment:'''
'''6:''' Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys".
'''Comment:''' The reference has been removed.
'''7:''' I also like the notion of "quality control", even though it should only appear once.
'''Comment:''' We have now merged the two sections about quality control.
'''8:''' And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Comment:'''
'''9:''' The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating a ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Comment:''' We have decided to keep this entry. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
'''10:''' It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero. As discussed in the recent Read Paper by Fearnhead and Prangle [2] when envisioning ABC as a non-parametric method of inference.
'''Comment:'''
'''11:''' I would suggest adding in this section links to the relevant softwares like our own DIY-ABC[3]...
'''Comment:''' A section about software has been added, as well as a Table (Table 3) with references to relevant papers.
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if ε = 0 is to correspond to acceptance of exact matches only.
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1145
2012-07-05T14:52:55Z
Cdessimoz
12
/* Comments of Darren Logan on things to do before moving the article to Wikipedia */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al.... The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which makes it seem likely that it could lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model. (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.) When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution. Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC. (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor! I also like the notion of "quality control", even though it should only appear once. And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution. The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating a ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!) It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero. As discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
At last, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point out to (Don) Rubin on the one hand and to Diggle and Graton on the other. I would suggest adding in this section links to the relevant softwares like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation . Bioinformatics 24 (23): 2713-2719.</ref>...
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
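One way the rejection algorithm can be adapted to model choice is to draw a model index from its prior before drawing the parameter, and to estimate each posterior model probability by the share of accepted draws coming from that model. The sketch below is only an illustration under assumptions that are not taken from the article: two hypothetical models for a count out of 10 trials (a fixed success probability of 0.5 versus a uniform prior on the success probability), an observed count of 9, the full data used instead of a summary statistic, and ε = 0. For this toy case the exact posterior probability of the second model is about 0.90, so the estimate can be checked against it.

```python
import random


def simulate(theta, n, rng):
    """Number of successes in n Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n))


def abc_model_choice(observed, n, epsilon, num_draws, seed=0):
    """ABC rejection over two hypothetical models; returns estimated posterior model probabilities."""
    rng = random.Random(seed)
    accepted = {0: 0, 1: 0}
    for _ in range(num_draws):
        model = rng.randrange(2)  # uniform prior over the two models
        # Model 0: theta fixed at 0.5. Model 1: theta ~ Uniform(0, 1).
        theta = 0.5 if model == 0 else rng.random()
        simulated = simulate(theta, n, rng)
        if abs(simulated - observed) <= epsilon:
            accepted[model] += 1
    total = accepted[0] + accepted[1]
    if total == 0:
        raise ValueError("no draws accepted; increase num_draws or epsilon")
    # Share of accepted draws per model approximates the posterior model probability.
    return {model: count / total for model, count in accepted.items()}


posterior = abc_model_choice(observed=9, n=10, epsilon=0, num_draws=20000)
print(posterior)
```

Because the full data are compared here, the concern about Bayes factors computed from insufficient summary statistics does not arise in this toy case; it would if `simulated` and `observed` were replaced by summaries.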
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
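The difference between the two criteria can be seen in a small rejection sampler. This is a sketch under assumptions that are not from the article: a hypothetical binomial model with a uniform prior on the success probability, an absolute-difference distance on the count, and an observed count of 7 out of 10. With ε = 0 the strict inequality can never accept a draw, since the distance is never negative, while the non-strict one accepts exactly the simulations that reproduce the observed count.

```python
import random


def simulate(theta, n, rng):
    """Number of successes in n Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n))


def abc_rejection(observed, n, epsilon, num_draws, strict, seed=0):
    """Return the parameter draws accepted by ABC rejection sampling."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(num_draws):
        theta = rng.random()  # uniform prior on [0, 1)
        simulated = simulate(theta, n, rng)
        distance = abs(simulated - observed)
        # strict=True uses rho < epsilon, strict=False uses rho <= epsilon
        keep = distance < epsilon if strict else distance <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


strict_sample = abc_rejection(observed=7, n=10, epsilon=0, num_draws=5000, strict=True)
exact_sample = abc_rejection(observed=7, n=10, epsilon=0, num_draws=5000, strict=False)
print(len(strict_sample), len(exact_sample))
```

In this toy model the draws in `exact_sample` follow the exact posterior, a Beta(8, 4) distribution with mean 2/3, so their average gives a quick sanity check.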
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Christian P. Robert===
'''1:''' The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model.
'''Comment:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
'''2:''' There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.
'''Comment:''' We have corrected the typos and grammatical mistakes found during the revision.
'''3:''' When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Comment:''' This has been changed.
'''4:''' Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Comment:''' The title has been changed to “Summary statistics”.
'''5:''' And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
'''Comment:'''
'''6:''' Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys".
'''Comment:''' The reference has been removed.
'''7:''' I also like the notion of "quality control", even though it should only appear once.
'''Comment:''' We have now merged the two sections about quality control.
'''8:''' And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Comment:'''
'''9:''' The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Comment:''' We have decided to keep this entry. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
'''10:''' It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle [2] when envisioning ABC as a non-parametric method of inference.
'''Comment:'''
'''11:''' I would suggest adding in this section links to the relevant software like our own DIY-ABC[3]...
'''Comment:''' A section about software has been added, as well as a Table (Table 3) with references to relevant papers.
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
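'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.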
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
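The regression-based construction referred to in comment 11 can be sketched as follows. This is a simplified illustration, not the procedure used in the article or in the Fearnhead and Prangle paper: it assumes a hypothetical model (20 normal observations with unknown mean and unit variance), a uniform prior on [-5, 5], and a single candidate feature (the sample mean). A pilot set of simulations is used to fit a linear regression of the parameter on the feature, and the fitted value then serves as the summary statistic.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Draw n observations from a normal distribution with mean theta and unit variance."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def feature(data):
    """Hypothetical candidate feature of the data: the sample mean."""
    return statistics.fmean(data)


def fit_linear_summary(num_pilot, n, seed=0):
    """Least-squares fit of theta on the feature, using pilot simulations from the prior."""
    rng = random.Random(seed)
    thetas, feats = [], []
    for _ in range(num_pilot):
        theta = rng.uniform(-5.0, 5.0)  # assumed prior
        thetas.append(theta)
        feats.append(feature(simulate(theta, n, rng)))
    mean_f = statistics.fmean(feats)
    mean_t = statistics.fmean(thetas)
    sxy = sum((f - mean_f) * (t - mean_t) for f, t in zip(feats, thetas))
    sxx = sum((f - mean_f) ** 2 for f in feats)
    slope = sxy / sxx
    intercept = mean_t - slope * mean_f
    return intercept, slope


intercept, slope = fit_linear_summary(num_pilot=2000, n=20)


def summary_statistic(data):
    """Fitted value of the regression, used as a one-dimensional summary statistic."""
    return intercept + slope * feature(data)


print(intercept, slope)
```

With several candidate features the same idea applies with a multiple regression; the single-feature case is used here only to keep the sketch dependency-free.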
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1149
2012-07-05T15:04:44Z
Cdessimoz
12
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
* The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Christian P. Robert===
'''1:''' The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis H as used in the entry is actually the evidence in favour of the corresponding model.
'''Comment:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
'''2:''' There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.
'''Comment:''' We have corrected the typos and grammatical mistakes found during the revision.
'''3:''' When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Comment:''' This has been changed.
'''4:''' Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Comment:''' The title has been changed to “Summary statistics”.
'''5:''' And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.
'''Comment:'''
'''6:''' Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of the Bayes factor (...) originally published by Harold Jeffreys".
'''Comment:''' The reference has been removed.
'''7:''' I also like the notion of "quality control", even though it should only appear once.
'''Comment:''' We have now merged the two sections about quality control.
'''8:''' And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Comment:'''
'''9:''' The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Comment:''' We have decided to keep this entry. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
'''10:''' It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle [2] when envisioning ABC as a non-parametric method of inference.
'''Comment:'''
'''11:''' I would suggest adding in this section links to the relevant software like our own DIY-ABC[3]...
'''Comment:''' A section about software has been added, as well as a Table (Table 3) with references to relevant papers.
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
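'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.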
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1150
2012-07-05T15:05:00Z
Cdessimoz
12
/* Response to the comments by Christian P. Robert */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
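As an illustration of the model comparison comment above, the following is a minimal sketch of how rejection sampling is commonly adapted to compare models: a model label is drawn from the model prior before the parameter is drawn, and posterior model probabilities are estimated from the acceptance counts. The function name <code>abc_model_choice</code>, the two toy binomial models ("free" and "fair") and the observed count of 7 are assumptions made for illustration only; they do not come from the article or from any reviewer.

```python
import random
from collections import Counter


def abc_model_choice(observed, models, distance, epsilon, n_draws, rng):
    """Estimate posterior model probabilities from ABC acceptance counts.

    `models` maps a label to (prior_weight, prior_sample, simulate).
    All names here are hypothetical and chosen for this sketch only.
    """
    labels = list(models)
    weights = [models[m][0] for m in labels]
    counts = Counter()
    for _ in range(n_draws):
        # Draw a model from the model prior, then a parameter from its prior.
        m = rng.choices(labels, weights=weights)[0]
        _, prior_sample, simulate = models[m]
        theta = prior_sample(rng)
        # Accept the model label when the simulated data are close enough.
        if distance(simulate(theta, rng), observed) <= epsilon:
            counts[m] += 1
    total = sum(counts.values())
    return {m: counts[m] / total for m in labels} if total else {}


def binomial(p, rng, n=10):
    """Number of successes in n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


# Toy models (assumed): success probability unknown vs. fixed at 0.5.
models = {
    "free": (0.5, lambda rng: rng.random(), binomial),
    "fair": (0.5, lambda rng: 0.5, binomial),
}
rng = random.Random(1)
posterior = abc_model_choice(7, models, lambda a, b: abs(a - b), 0, 40000, rng)
```

In this toy setting the data are discrete and the tolerance is zero, so the acceptance frequencies target the exact posterior model probabilities, which can be checked analytically (about 0.56 for "fair"). With summary statistics and a positive tolerance, the same counts only approximate them.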
=== Minor comments ===
The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
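The effect of the inclusive comparison can be seen in a minimal rejection-sampling sketch. The toy model (10 Bernoulli trials with a uniform prior), the observed count of 7 and all function names are assumptions made for illustration; they are not the example used in the article.

```python
import random


def abc_rejection(observed, simulate, prior_sample, distance, epsilon, n_draws, rng):
    """Keep prior draws whose simulated data lie within epsilon of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        simulated = simulate(theta, rng)
        # Inclusive comparison: with epsilon = 0 only exact matches are accepted.
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)
    return accepted


def prior_sample(rng):
    """Uniform prior on [0, 1) (assumed for this sketch)."""
    return rng.random()


def simulate(theta, rng):
    """Number of successes in 10 Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(10))


def distance(a, b):
    return abs(a - b)


rng = random.Random(0)
exact = abc_rejection(7, simulate, prior_sample, distance, 0, 20000, rng)
posterior_mean = sum(exact) / len(exact)
```

With the strict comparison <code>distance(...) < epsilon</code> and <code>epsilon = 0</code>, the accepted list would always be empty, since a distance cannot be negative.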
"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
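'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.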
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1158
2012-07-05T20:40:28Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:'''
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:'''
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:'''
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:'''
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:'''
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:'''
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:'''
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:'''
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:'''
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:'''
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:'''
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:'''
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:'''
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:'''
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:'''
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:'''
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:'''
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
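'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.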
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1159
2012-07-05T20:46:19Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:'''
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have added a sentence to point out that this is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:'''
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:'''
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:'''
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:'''
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:'''
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:'''
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:'''
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:'''
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:'''
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:'''
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
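'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.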
'''Comment:''' This has been changed.
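For readers of this thread, the point about the acceptance criterion can be illustrated with a minimal sketch of ABC rejection sampling. It is an illustration only and is not taken from the article: the coin-flip model, the uniform prior, and all function names (<code>abc_rejection</code>, <code>prior_sample</code>, <code>simulate</code>, <code>distance</code>) are assumptions made for the example. With the non-strict criterion <math>\rho (\hat{D},D) \le \epsilon</math>, setting <math>\epsilon=0</math> keeps exact matches; with the strict criterion nothing can be accepted.

```python
import random


def abc_rejection(observed, prior_sample, simulate, distance,
                  epsilon, n_draws, rng, strict=False):
    """Return the prior draws whose simulated data fall within epsilon of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a parameter from the prior
        simulated = simulate(theta, rng)   # simulate a data set under that parameter
        rho = distance(simulated, observed)
        # strict=True uses rho < epsilon; the default uses rho <= epsilon,
        # so that epsilon == 0 corresponds to accepting exact matches only.
        keep = rho < epsilon if strict else rho <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


# Hypothetical toy model (an assumption for this sketch): the data are the
# number of heads in 10 flips of a coin with unknown bias theta.
def prior_sample(rng):
    return rng.random()                    # uniform prior on [0, 1)


def simulate(theta, rng, n_flips=10):
    return sum(rng.random() < theta for _ in range(n_flips))


def distance(a, b):
    return abs(a - b)


rng = random.Random(0)
exact_matches = abc_rejection(7, prior_sample, simulate, distance,
                              epsilon=0, n_draws=5000, rng=rng)
none_accepted = abc_rejection(7, prior_sample, simulate, distance,
                              epsilon=0, n_draws=5000, rng=rng, strict=True)
print(len(exact_matches), len(none_accepted))
```

Because the distance is never negative, the strict variant with <math>\epsilon=0</math> always returns an empty list, whereas the non-strict variant returns the draws that reproduce the observed count exactly.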
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
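As a rough illustration of what "approximated by linear regression based on simulated data" can mean in practice, the sketch below fits a least-squares line of the parameter on a data feature using simulated (parameter, data) pairs and then uses the fitted value as a one-dimensional summary statistic. It is a sketch under stated assumptions, not the procedure of the article or of any particular paper: the coin-flip model, the uniform prior, the single feature (number of heads), and the names <code>fit_linear_summary</code> and <code>summary</code> are all hypothetical.

```python
import random


def simulate_heads(theta, rng, n_flips=10):
    """Hypothetical simulator: number of heads in n_flips tosses with bias theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def fit_linear_summary(n_sim, rng):
    """Least-squares fit of theta on the simulated feature x; returns (intercept, slope)."""
    thetas = [rng.random() for _ in range(n_sim)]     # draws from a uniform prior
    xs = [simulate_heads(t, rng) for t in thetas]     # one simulated data set per draw
    mean_x = sum(xs) / n_sim
    mean_t = sum(thetas) / n_sim
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, thetas))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    return intercept, slope


rng = random.Random(1)
intercept, slope = fit_linear_summary(2000, rng)


def summary(x):
    """Fitted value of the regression, used as the summary statistic for data x."""
    return intercept + slope * x


print(summary(7))
```

In this toy setting the posterior mean of the bias is exactly linear in the number of heads, so the regression recovers it well; in realistic models the linear fit is only an approximation and the choice of features matters.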
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1161
2012-07-05T20:52:30Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
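For readers of this thread, the adaptation of plain rejection sampling to model comparison that the comment asks about can be sketched as follows. This is only a minimal illustration under stated assumptions, not the Toni & Stumpf SMC method: the two toy coin-flip models, the uniform prior over models, and the function name <code>abc_model_choice</code> are all hypothetical. A model index is drawn first, then a parameter from that model's prior, and the approximate posterior probability of each model is the fraction of accepted draws that came from it.

```python
import random
from collections import Counter

def abc_model_choice(observed, n_trials, n_draws, eps, seed=0):
    """Toy ABC rejection sampler over two hypothetical models.

    Model 0: fair coin (theta fixed at 0.5).
    Model 1: biased coin with theta ~ Uniform(0, 1).
    Returns the fraction of accepted draws per model.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_draws):
        model = rng.randrange(2)                      # uniform prior over the models
        theta = 0.5 if model == 0 else rng.random()   # parameter from that model's prior
        simulated = sum(rng.random() < theta for _ in range(n_trials))
        if abs(simulated - observed) <= eps:          # same acceptance rule as for parameters
            counts[model] += 1
    total = sum(counts.values())
    return {m: counts[m] / total for m in (0, 1)} if total else {}

# Hypothetical data: 5 heads in 10 flips.
posterior = abc_model_choice(observed=5, n_trials=10, n_draws=20000, eps=0)
```

With equal prior model probabilities, the ratio <code>posterior[0] / posterior[1]</code> is then an estimate of the Bayes factor for this toy setting.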
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
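The difference between the two criteria only shows up at <math>\epsilon=0</math>, which a small sketch can make concrete. The toy binomial model, the uniform prior, and the function name <code>abc_rejection</code> below are assumptions made for illustration and are not taken from the article.

```python
import random

def abc_rejection(observed, n_draws, eps, strict, seed=0):
    """Toy ABC rejection sampler for a success probability (10 trials).

    strict=True  accepts when rho <  eps,
    strict=False accepts when rho <= eps.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                                      # Uniform(0, 1) prior
        simulated = sum(rng.random() < theta for _ in range(10))  # simulated data set
        rho = abs(simulated - observed)                           # distance to observed data
        if (rho < eps) if strict else (rho <= eps):
            accepted.append(theta)
    return accepted

# With eps = 0 the non-strict rule keeps exact matches only,
# whereas the strict rule can never accept anything.
exact_matches = abc_rejection(observed=7, n_draws=5000, eps=0, strict=False)
never_accepts = abc_rejection(observed=7, n_draws=5000, eps=0, strict=True)
```

Since a distance is never negative, <code>rho < 0</code> is always false, so only the <math>\le</math> form lets <math>\epsilon=0</math> correspond to exact matching.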
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:'''
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:'''
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
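The idea of approximating a summary statistic by linear regression on simulated data can be sketched roughly as below. This is a simplified illustration and not the authors' or the reviewer's implementation: the Normal model, the uniform prior on (-5, 5), the single feature (the sample mean), and all function names are assumptions chosen to keep the example short.

```python
import random

def simulate(theta, rng, n=20):
    """Hypothetical model: n observations from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def features(data):
    """Candidate feature of a data set: here just its mean."""
    return sum(data) / len(data)

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(1)
thetas = [rng.uniform(-5.0, 5.0) for _ in range(2000)]   # pilot draws from the prior
feats = [features(simulate(t, rng)) for t in thetas]     # features of the simulated data
a, b = fit_line(feats, thetas)                           # regress the parameter on the features

def summary(data):
    """Fitted regression value, used as a one-dimensional summary statistic."""
    return a + b * features(data)
```

The fitted value serves as a regression-based estimate of the posterior mean of the parameter, and it is this estimate that would then be compared between observed and simulated data in the ABC step.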
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:'''
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
== Response to the reviewers ==
===Response to the comments by Dennis Prangle===
====Major comments====
'''1:''' Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Comment:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
'''2:''' A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011[4]) for further details.
'''Comment:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
'''3:''' The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009[5]) would also be helpful.
'''Comment:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
'''4:''' I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing,
'''Comment:''' See our response to Christian Robert’s comment above (1).
'''5:''' and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Comment:''' A section about software has been added.
====Minor comments====
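'''6:''' The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.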
'''Comment:''' This has been changed.
'''7:''' "Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Comment:'''
'''8:''' "Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Comment:''' We have modified the example so that only the frequency of switches is known (summary statistics). We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
'''9:''' "Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Comment:'''
'''10:''' "Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Comment:''' This has been changed.
'''11:''' "Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Comment:''' This has been changed.
'''12:''' "Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison[6] (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Comment:''' A sentence was added with a reference to the paper.
'''13:''' "Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Comment:''' This has been changed.
'''14:''' "Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Comment:''' The formulation was changed to “may therefore be misinformative”.
'''15:''' "Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of objective priors might be helpful here.
'''Comment:''' This formulation has been changed.
'''16:''' "Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Comment:''' This has been changed.
'''17:''' "Curse of dimensionality": Some theoretical results have been proved here[7][2].
'''Comment:'''
'''18:''' "Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Comment:''' This formulation has been changed.
1162
2012-07-05T20:56:32Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:'''
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:'''
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:'''
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:'''
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:'''
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:'''
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1196
2012-07-09T13:03:18Z
Mikaelsunnaker
13
Remaining comments marked red
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
<span style="color:red">'''Response:'''</span>
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:'''</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
<span style="color:red">'''Response:'''</span>
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1210
2012-07-09T14:55:50Z
Mikaelsunnaker
13
/* Minor comments */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
<span style="color:red">'''Response:'''</span>
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software, like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:'''</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added a reference to this paper.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1211
2012-07-09T14:56:31Z
Mikaelsunnaker
13
/* Minor comments */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
<span style="color:red">'''Response:'''</span>
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software, like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:'''</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1212
2012-07-09T15:00:22Z
Mikaelsunnaker
13
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
<span style="color:red">'''Response:'''</span>
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:''' JM</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1274
2012-07-16T14:23:37Z
Mikaelsunnaker
13
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large <math>n</math> for the full data and for summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). It also shows the theoretical posterior, based on propagation of uncertainty and sequential incorporation of new information about <math>\theta</math> from the data.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:''' JM</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the inference may be improved with more advanced methods.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
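'''Illustrative sketch (not part of the article or of the review):''' The first minor comment, on <math>\le</math> versus <math><</math> in the acceptance criterion, can be checked on a toy discrete model. The code below is only a sketch under stated assumptions: the function name <code>abc_rejection</code>, the binomial simulator, the uniform prior on <math>\theta</math>, and the absolute difference of counts used as <math>\rho</math> are all hypothetical choices made for illustration. With <math>\epsilon=0</math>, the non-strict criterion keeps exactly those simulations that reproduce the observed count, while the strict criterion cannot accept any draw, because a distance is never negative.

```python
import random


def abc_rejection(observed, n_trials, epsilon, n_samples, strict=False, seed=0):
    """Toy ABC rejection sampler for a binomial success probability.

    Illustrative assumptions: uniform prior on theta, data summarised by the
    number of successes, distance rho = absolute difference of the counts.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        theta = rng.random()  # draw theta from the uniform prior on [0, 1)
        # simulate n_trials Bernoulli(theta) outcomes and count the successes
        simulated = sum(rng.random() < theta for _ in range(n_trials))
        rho = abs(simulated - observed)
        # strict=True uses rho < epsilon, strict=False uses rho <= epsilon
        keep = rho < epsilon if strict else rho <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


# epsilon = 0: only exact matches can pass, and only under the <= criterion
non_strict = abc_rejection(observed=7, n_trials=10, epsilon=0, n_samples=5000)
strict = abc_rejection(observed=7, n_trials=10, epsilon=0, n_samples=5000, strict=True)
print(len(non_strict), len(strict))
```

Under the uniform prior assumed here, the exact posterior for 7 successes in 10 trials is Beta(8, 4), so the mean of the values accepted by the non-strict criterion should lie near 2/3.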
==References==
<references />
1275
2012-07-16T14:24:38Z
Cdessimoz
12
/* Minor comments */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). It also shows the theoretical posterior, based on propagation of uncertainty, and sequential incorporation of new information about <math>\theta</math> from the data.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:''' JM</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1276
2012-07-16T14:26:08Z
Mikaelsunnaker
13
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). It also shows the theoretical posterior, based on propagation of uncertainty and sequential incorporation of new information about <math>\theta</math> from the data.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:''' JM</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
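As an illustrative note on the first part of this comment, the plain rejection sampler can be adapted to model choice by drawing a model index from its prior before drawing parameters, and then estimating each posterior model probability by that model's share of the accepted draws. The sketch below is a minimal toy, not the Toni & Stumpf SMC method: the two binomial models, the uniform model prior and all function names are assumptions made for this example. The data here are a single count, which is used in full, so the caveats about insufficient summary statistics do not arise in this toy.

```python
import random


def simulate(theta, rng, n_trials=20):
    """Toy simulator (assumption): number of successes in n_trials Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n_trials))


# Two hypothetical candidate models that differ only in their prior on theta.
models = [
    (lambda rng: rng.uniform(0.0, 0.5), simulate),  # model 0: theta in (0, 0.5)
    (lambda rng: rng.uniform(0.5, 1.0), simulate),  # model 1: theta in (0.5, 1)
]


def abc_model_choice(observed, models, epsilon, n_draws, seed=0):
    """Estimate posterior model probabilities from ABC rejection acceptance counts."""
    rng = random.Random(seed)
    counts = [0] * len(models)
    for _ in range(n_draws):
        m = rng.randrange(len(models))       # uniform prior over models
        prior_sample, simulator = models[m]
        theta = prior_sample(rng)            # parameter from that model's prior
        if abs(simulator(theta, rng) - observed) <= epsilon:
            counts[m] += 1                   # accepted draw counts towards model m
    total = sum(counts)
    if total == 0:
        return None                          # nothing accepted: no estimate available
    return [c / total for c in counts]


# 15 successes out of 20 trials should favour model 1.
probs = abc_model_choice(observed=15, models=models, epsilon=0, n_draws=20000)
```

With a non-uniform prior over models, the model index would be drawn from that prior instead of `rng.randrange`.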
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
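To illustrate why the non-strict inequality matters, here is a minimal sketch of an ABC rejection sampler on discrete data. The toy binomial model, the uniform prior and all names are assumptions made for this example only. With <math>\epsilon=0</math>, the strict test can never accept a draw, whereas the non-strict test accepts exactly the simulations that reproduce the observed data.

```python
import random


def abc_rejection(observed, epsilon, n_draws, strict=False, seed=0):
    """ABC rejection for a toy model: successes in 10 Bernoulli(theta) trials."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                                  # uniform prior on (0, 1)
        simulated = sum(rng.random() < theta for _ in range(10))
        rho = abs(simulated - observed)                       # distance rho(D_hat, D)
        keep = rho < epsilon if strict else rho <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


# epsilon = 0 with "<=": only exact matches of the observed count are kept.
exact_matches = abc_rejection(observed=7, epsilon=0, n_draws=5000)

# epsilon = 0 with "<": rho is never negative, so nothing is ever kept.
strict_matches = abc_rejection(observed=7, epsilon=0, n_draws=5000, strict=True)
```

In this toy case the accepted values from the non-strict version are draws from the exact posterior, a Beta(8, 4) distribution, because the data are discrete and matched exactly.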
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
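The idea of approximating a summary statistic "by linear regression based on simulated data" can be sketched as follows. This is only a toy illustration under stated assumptions: a normal model with unknown mean, a uniform prior, three hand-picked candidate features, and hypothetical function names. Parameters and data sets are simulated, the parameter is regressed on the features by ordinary least squares, and the fitted linear predictor is then used as the summary statistic.

```python
import random


def features(x):
    """Candidate features of a data set (assumption): intercept, mean, min, max."""
    return [1.0, sum(x) / len(x), min(x), max(x)]


def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def fit_summary(n_sims=2000, n_obs=5, seed=0):
    """Regress theta on features of simulated data; return the fitted predictor."""
    rng = random.Random(seed)
    k = 4
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for _ in range(n_sims):
        theta = rng.uniform(0.0, 10.0)                       # draw from the prior
        f = features([rng.gauss(theta, 1.0) for _ in range(n_obs)])
        for i in range(k):
            xty[i] += f[i] * theta
            for j in range(k):
                xtx[i][j] += f[i] * f[j]
    beta = solve(xtx, xty)                                   # least-squares coefficients
    return lambda x: sum(b * v for b, v in zip(beta, features(x)))


summary = fit_summary()
s_obs = summary([3.1, 4.2, 3.8, 4.9, 4.0])  # one-dimensional summary of a data set
```

The resulting one-dimensional statistic would then replace the raw data in the distance computation of an ABC run.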
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1288
2012-07-16T15:49:06Z
Cdessimoz
12
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as a "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:''' JM</span>
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1289
2012-07-16T15:59:35Z
Cdessimoz
12
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
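For readers of this discussion, one common way of adapting rejection sampling to model comparison can be sketched as follows: draw a model index from the prior over models, draw that model's parameter from its own prior, simulate, and accept when the distance is within the tolerance. The relative acceptance frequencies then approximate the posterior model probabilities. This is a minimal illustration only, not the Toni & Stumpf SMC-ABC method and not code from the article; the function names, the two coin-toss models and the tolerance are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def abc_model_choice(observed, models, model_prior, distance, epsilon, n_draws, seed=0):
    """Estimate posterior model probabilities by ABC rejection (illustrative sketch)."""
    rng = random.Random(seed)
    names = list(models)
    weights = [model_prior[name] for name in names]
    counts = Counter()
    for _ in range(n_draws):
        # Draw a model from the prior over models, then a parameter from that model's prior.
        name = rng.choices(names, weights=weights, k=1)[0]
        prior_sample, simulate = models[name]
        theta = prior_sample(rng)
        # Accept the (model, parameter) pair if the simulated data are close enough.
        if distance(simulate(theta, rng), observed) <= epsilon:
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {name: float("nan") for name in names}
    return {name: counts[name] / total for name in names}


def coin_tosses(theta, rng, n=10):
    """Hypothetical simulator: number of heads in n tosses with head probability theta."""
    return sum(rng.random() < theta for _ in range(n))


# Two hypothetical models: a fair coin, and a coin with unknown bias (uniform prior).
models = {
    "fair": (lambda rng: 0.5, coin_tosses),
    "biased": (lambda rng: rng.random(), coin_tosses),
}
model_prior = {"fair": 0.5, "biased": 0.5}

probs = abc_model_choice(
    observed=9,
    models=models,
    model_prior=model_prior,
    distance=lambda a, b: abs(a - b),
    epsilon=0,
    n_draws=20000,
)
print(probs)
```

Here the full data (the head count) are compared directly. When summary statistics are used instead, the caution raised by Robert et al. about ABC-based Bayes factors applies to estimates of this kind.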
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
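The difference between the two acceptance criteria can be illustrated with a small sketch. It assumes a hypothetical discrete toy model (number of heads in ten coin tosses, uniform prior on the head probability), so that exact matches are possible; the function names and the <code>strict</code> flag are inventions of this example and do not come from the article. With <math>\epsilon=0</math>, the non-strict criterion keeps exact matches, while the strict one can never accept anything.

```python
import random


def abc_rejection(observed, prior_sample, simulate, distance, epsilon,
                  n_draws, strict=False, seed=0):
    """Plain ABC rejection sampler (illustrative sketch)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        d = distance(simulate(theta, rng), observed)
        # strict=True uses rho < epsilon; the default uses rho <= epsilon.
        keep = (d < epsilon) if strict else (d <= epsilon)
        if keep:
            accepted.append(theta)
    return accepted


def prior_sample(rng):
    """Uniform prior on the head probability (assumption of this example)."""
    return rng.random()


def simulate(theta, rng, n=10):
    """Number of heads in n tosses with head probability theta."""
    return sum(rng.random() < theta for _ in range(n))


def distance(simulated, observed):
    return abs(simulated - observed)


observed_heads = 7
with_le = abc_rejection(observed_heads, prior_sample, simulate, distance,
                        epsilon=0, n_draws=5000)
with_lt = abc_rejection(observed_heads, prior_sample, simulate, distance,
                        epsilon=0, n_draws=5000, strict=True)
print(len(with_le), len(with_lt))
```

Since a distance is never negative, the strict version with <math>\epsilon=0</math> returns an empty sample whatever the model, which is the point of the comment above.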
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:''' JM</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1290
2012-07-16T15:59:59Z
Cdessimoz
12
/* Minor comments */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
<span style="color:red">'''Response:'''</span>
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:'''</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1336
2012-07-20T09:48:46Z
Cdessimoz
12
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for a WP article. You can always create a redirect from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try to avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:''' We have toned down the issue of sufficiency. For the sake of clarity, we prefer to defer the discussion of predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC results with large n for the full data and for summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
<span style="color:red">'''Response:'''</span>
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1338
2012-07-20T11:52:06Z
Cdessimoz
12
/* Review by Dennis Prangle */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for a WP article. You can always create a redirect from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try to avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of a hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of a hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:''' We have toned down the issue of sufficiency. For the sake of clarity, we prefer to defer the discussion of predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC results with large n for the full data and for summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as one of the "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to relevant software such as our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
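: For readers of this thread, a minimal sketch of why the non-strict inequality matters. It is not taken from the article: the coin-toss model, the function names (<code>abc_rejection</code>, <code>prior</code>, <code>simulate</code>, <code>distance</code>) and the <code>strict</code> flag are all hypothetical choices made for illustration. With <math>\epsilon=0</math>, the criterion <math>\rho (\hat{D},D) \le \epsilon</math> keeps exact matches, whereas <math>\rho (\hat{D},D)<\epsilon</math> can never accept anything.

```python
import random


def abc_rejection(observed, simulate, prior, distance, epsilon,
                  n_draws, strict=False, seed=0):
    """Toy ABC rejection sampler (illustrative sketch only).

    Draws parameters from the prior, simulates data, and keeps the
    parameter when the distance to the observed data is within epsilon.
    `strict=True` uses `<` instead of `<=` to show the difference.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior(rng)
        simulated = simulate(theta, rng)
        d = distance(simulated, observed)
        keep = d < epsilon if strict else d <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


# Hypothetical toy model: number of heads in 10 tosses of a coin
# with unknown bias theta, under a uniform prior on [0, 1].
def prior(rng):
    return rng.random()


def simulate(theta, rng):
    return sum(rng.random() < theta for _ in range(10))


def distance(a, b):
    return abs(a - b)


observed_heads = 7
non_strict = abc_rejection(observed_heads, simulate, prior, distance,
                           epsilon=0, n_draws=5000)
strict = abc_rejection(observed_heads, simulate, prior, distance,
                       epsilon=0, n_draws=5000, strict=True)

print(len(non_strict), "accepted with <=;", len(strict), "accepted with <")
```

: Under these assumptions the strict version always returns an empty sample at <math>\epsilon=0</math>, since a distance cannot be negative, while the non-strict version returns only draws whose simulated count equals the observed one.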
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
<span style="color:red">'''Response:'''</span>
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1340
2012-07-20T12:10:26Z
Cdessimoz
12
/* Minor comments */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for a WP article. You can always create a redirect from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the Wikipedia version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:''' We have toned down the issue of sufficiency. For clarity, we prefer to defer the discussion on predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
<span style="color:red">'''Response:'''</span>
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:''' This has been changed.
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1341
2012-07-20T12:12:01Z
Cdessimoz
12
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the Wikipedia version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below).
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performance.)
'''Response:''' We have toned down the issue of sufficiency. For clarity, we prefer to defer the discussion on predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been almost entirely rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle<ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:''' This has been changed accordingly.
* Finally, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:''' This has been changed.
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1354
2012-07-21T12:43:45Z
Mikaelsunnaker
13
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the Wikipedia version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historical context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:''' We have toned down the issue of sufficiency. For the sake of clarity, we prefer to defer the discussion of predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:''' This has been changed accordingly.
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:''' This has been changed.
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
==References==
<references />
1382
2012-08-13T16:12:17Z
Dennisprangle
21
Added review of updated article
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:''' We have toned down the issue of sufficiency. For the sake of clarity, we prefer to defer the discussion of predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:''' This has been changed accordingly.
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(nb I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:''' This has been changed.
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have also added a sentence to point out that it is only an example application, and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
=== Review of updated article ===
I have read the revised article and discussion of the amendments, and am happy to accept it for publication.
==References==
<references />
1729
2012-09-22T01:02:50Z
Daniel Mietchen
5
/* Comments of Christian P. Robert on the entry */
== Comments of Darren Logan on things to do before moving the article to Wikipedia ==
* The title should probably be "Approximate Bayesian computation" (small "c"). The "...in computational biology" is probably redundant for WP article. You can always make a redirection from the long title to the short one.
*: Moved. --<span style="font-family:Palatino Linotype, Book Antiqua, Palatino, serif;">[[User:Spencer Bliven|Spencer]] [[User talk:Spencer Bliven|Bliven]]</span> 15:35, 18 May 2012 (PDT)
'''The following comments apply specifically to the wikipedia-version of this article''' --[[User:Cdessimoz|Cdessimoz]] 03:41, 22 May 2012 (PDT)
* In the summary and elsewhere, you use terms like "over the last years" and "recently". These should be avoided, as WP articles are not dated and thus non-specific time-frames are not meaningful. If you need to refer to time, be specific (e.g. "Since 1999..." or "In 2010...")
* Example. In general, Wikipedia articles should not contain worked examples. That type of content is better suited to Wikiversity or Wikibooks. There are exceptions, however. The guidance on this can be seen at WP:NOT, specifically: "An article should not read like a "how-to" style... the purpose of Wikipedia is to present facts, not to teach subject matter. It is not appropriate to create or edit articles that read as textbooks, with leading questions and systematic problem solutions as examples... Some kinds of examples, specifically those intended to inform rather than to instruct, may be appropriate for inclusion in a Wikipedia article." I think your example might be ok, but you should be careful of the tone to ensure it doesn't seem like a "how-to" guide.
* Wikipedia articles do not have conclusion sections.
* Throughout the article you should try and avoid using a narrative voice and remove all self-references. For example:
** "As the previous section suggests..."
** "This section attempts to review important recent developments..."
** "...should be considered with sober caution, as discussed below."
** "Interestingly..."
** "This section discusses these potential risks and reviews possible ways to address them.."
** "As the above makes clear..."
** "This section reviews risks..."
** "This section attempts to review important recent developments."
To follow up on these remarks, the history section in such Wikipedia articles is typically the first after the lead section, as it puts the topic into its historic context. I have thus moved it there. However, this possibly breaks some of the narrative flow and should thus be checked again during the revision. --[[User:Daniel Mietchen|Daniel Mietchen]] 19:05, 27 June 2012 (PDT)
: '''Response''': We have verified the coherence of the narrative flow. --[[User:Cdessimoz|Cdessimoz]] 07:52, 5 July 2012 (PDT)
== Comments of Christian P. Robert on the entry ==
A few comments on the specific entry on ABC written by Mikael Sunnåker et al....
*The entry starts with the representation of the posterior probability of an hypothesis, rather than with the posterior density of a model parameter, which seems likely to lead the novice reader astray. After all, (a) ABC was not introduced for conducting model choice and (b) interchanging hypothesis and model means that the probability of an hypothesis '''''H''''' as used in the entry is actually the evidence in favour of the corresponding model.
'''Response:''' We now first talk only about parameter estimation. We have also rewritten the section about model selection for better coherence of the text.
* (There are a few typos and grammar mistakes, but I assume either PLoS or later contributors will correct those.)
'''Response:''' We have corrected the typos and grammatical mistakes found during the revision.
* When the authors state that the "outcome of the ABC rejection algorithm is a set of parameter estimates distributed according to the desired posterior distribution", I think they are leading some of the readers astray as they forget the "approximative" aspect of this distribution.
'''Response:''' This has been changed.
* Further below, I would have used the title "Insufficient summary statistics" rather than "Sufficient summary statistics", as it spells out more clearly the fundamental issue with the potential difficulty in using ABC.
'''Response:''' The title has been changed to “Summary statistics” (see also Dennis Prangle's comment below)
* (And I am not sure the subsequent paragraph on "Choice and sufficiency of summary statistics" should bother with the sufficiency aspects... It seems to me much more relevant to assess the impact on predictive performances.)
'''Response:''' We have toned down the issue of sufficiency. For the sake of clarity, we prefer to defer the discussion of predictive performance to the "Pitfalls and remedies" section.
* Although this is most minor, I would not have made mention of the (rather artificial) "table for interpretation of the strength in values of [[wp:Bayes factor|the Bayes factor]] (...) originally published by [[wp:Harold Jeffreys|Harold Jeffreys]]". I obviously appreciate very much that the authors advertise our warning <ref> Robert, C.P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. (2011) Lack of confidence in approximate Bayesian computation model choice. PNAS vol. 108 no. 37 15112-15117.</ref> about the potential lack of validity of an ABC based Bayes factor!
'''Response:''' The section on model selection has been rewritten. In the process, the reference to Jeffreys's table has been removed.
* I also like the notion of "quality control", even though it should only appear once.
'''Response:''' We have merged the two sections about quality control.
* And the pseudo-example is quite fine as an introduction, while it could be supplemented with the outcome resulting from a large n, to be compared with the true posterior distribution.
'''Response:''' We have included a new figure (Fig. 3), which shows ABC with large n for full data, and summary statistics (<math>\epsilon = 0</math> and <math>\epsilon = 2</math>). As suggested, it also compares the ABC results with the theoretical posterior.
* The section "Pitfalls and remedies" is remarkable in that it details the necessary steps for validating an ABC implementation: the only entry I would remove is the one about "Prior distribution and parameter ranges", in that this is not a problem inherent to ABC... (Granted, the authors present this as "general risks in statistical inference exacerbated in ABC", which makes more sense!)
'''Response:''' We would like to keep the discussion on prior distribution and parameter ranges. However, a sentence was added under “Pitfalls and remedies” to emphasize that the problem related to “Prior distribution and parameter ranges” is not specific to ABC.
* It may be that the section on the non-zero tolerance should emphasize more clearly the fact that ε should not be zero, as discussed in the recent Read Paper by Fearnhead and Prangle <ref name="FP2012">Fearnhead, P. and Prangle, D. (2012) Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society Series B. Volume 74, Issue 3, pages 419–474.</ref> when envisioning ABC as a non-parametric method of inference.
'''Response:''' This has been changed accordingly.
* Lastly, it is always possible to criticise the coverage of the historical part, since this is such a recent field that it is constantly evolving. But the authors correctly point to (Don) Rubin on the one hand and to Diggle and Gratton on the other. I would suggest adding in this section links to the relevant software like our own DIY-ABC<ref>Cornuet, J.-M., Santos, F., Beaumont, M. et al. (2008) Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation. Bioinformatics 24 (23): 2713-2719.</ref>...
'''Response:''' A section listing ABC software has been added, including a new table with references to the corresponding papers (Table 3).
=== Review after revision ===
Christian Robert wrote:
"I have nothing to add to my earlier review, I am completely happy with the current version!"
--[[User:Daniel Mietchen|Daniel Mietchen]] 18:02, 21 September 2012 (PDT)
== Review by Dennis Prangle ==
This is a well written and accessible introductory article. I particularly like the balance struck between describing the simplicity of implementing ABC and the potential drawbacks.
=== Major comments ===
(N.B. I've included full references only for papers not in the original article.)
* Much of the material in the "recent methodological developments" section is well established and no longer recent relative to the age of the field (e.g. the Marjoram et al paper was published in 2003). I'd suggest at least renaming this section. Alternatively, much of this material could be incorporated into the "approximation of the posterior" section, as regression correction ideas and MCMC / SMC algorithms are tools commonly used to improve the approximation.
'''Response:''' The section has been removed and most of the material has been incorporated into the “approximation of the posterior” section.
* A little more coverage of applications would be nice. One way to do this without increasing the length of the article would be to explicitly reference recent review papers (Beaumont 2010, Bertorelle et al 2010, Csillery et al 2010, Marin et al 2011<ref>Jean-Michel Marin, Pierre Pudlo, Christian P. Robert and Robin J. Ryder (2011) Approximate Bayesian computational methods. Statistics and Computing (published online)</ref>) for further details.
'''Response:''' We have added a sentence about applications of ABC, with references to these review papers, at the end of the “Example” section.
* The model comparison section should explain how the ABC rejection sampling algorithm can be adapted to perform inference between models (or give a reference). A reference to more advanced algorithms (e.g. Didelot et al, Toni and Stumpf 2009<ref>Tina Toni and Michael P. H. Stumpf (2009) Simulation-based model selection for dynamical systems in systems and population biology. Bioinformatics (26) 104-110</ref>) would also be helpful.
'''Response:''' We have added a reference to the Toni & Stumpf SMC-ABC method for model selection.
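As an illustration of the first part of this comment, one common way to adapt ABC rejection sampling to model comparison is to draw a model index from the model prior, draw parameters from that model's prior, simulate data, and keep the draw only if the simulated data are within the tolerance. The sketch below assumes a toy coin-flip setting with two candidate models (a fair coin versus an unknown heads probability with a uniform prior); the names <code>simulate_heads</code> and <code>abc_model_choice</code> and all numerical settings are hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def simulate_heads(theta, n_flips, rng):
    """Simulate the number of heads in n_flips tosses with heads probability theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def abc_model_choice(observed, n_flips, param_priors, model_prior, epsilon, n_draws, rng):
    """Rejection ABC over (model, parameter) pairs.

    Returns the fraction of accepted draws that came from each model,
    which serves as an estimate of the posterior model probabilities.
    """
    counts = Counter()
    models = list(range(len(param_priors)))
    for _ in range(n_draws):
        # 1. Draw a model index from the prior over models.
        m = rng.choices(models, weights=model_prior)[0]
        # 2. Draw a parameter from that model's prior.
        theta = param_priors[m](rng)
        # 3. Simulate data under the chosen model and parameter.
        simulated = simulate_heads(theta, n_flips, rng)
        # 4. Accept if the distance to the observed data is within the tolerance.
        if abs(simulated - observed) <= epsilon:
            counts[m] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {m: counts[m] / total for m in models}


# Hypothetical toy setup: model 0 fixes theta = 0.5, model 1 has theta ~ Uniform(0, 1).
param_priors = [lambda rng: 0.5, lambda rng: rng.random()]
rng = random.Random(1)
posterior = abc_model_choice(
    observed=9, n_flips=10, param_priors=param_priors,
    model_prior=[0.5, 0.5], epsilon=0, n_draws=20000, rng=rng,
)
print(posterior)
```

With equal prior model probabilities, the ratio of the two accepted frequencies is an estimate of the Bayes factor. The more advanced SMC-based algorithms mentioned in the comment are not covered by this sketch.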
* I agree with Christian Robert's comments that the discussion of a hypothesis H in the motivation section is somewhat confusing, and that links to code could be helpful. Some additional suggestions are the "abc" R package and ABC-SysBio.
'''Response:''' See our response to Christian Robert’s comment above.
=== Minor comments ===
* The acceptance criterion should be <math>\rho (\hat{D},D) \le \epsilon</math> not <math>\rho (\hat{D},D)<\epsilon</math> if <math>\epsilon=0</math> is to correspond to acceptance of exact matches only.
'''Response:''' This has been changed.
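The difference between the two inequalities can be seen in a minimal sketch, assuming a toy coin-flip model with a uniform prior on the heads probability and the absolute difference in head counts as the distance <math>\rho</math>. The names <code>simulate_heads</code> and <code>abc_rejection</code>, the <code>strict</code> flag, and the numerical settings are hypothetical choices made for illustration only.

```python
import random


def simulate_heads(theta, n_flips, rng):
    """Simulate the number of heads in n_flips tosses with heads probability theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def abc_rejection(observed, n_flips, epsilon, n_draws, rng, strict=False):
    """Rejection ABC with acceptance rule rho <= epsilon (or rho < epsilon if strict)."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                      # draw from the Uniform(0, 1) prior
        simulated = simulate_heads(theta, n_flips, rng)
        rho = abs(simulated - observed)           # distance between simulated and observed data
        keep = rho < epsilon if strict else rho <= epsilon
        if keep:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
# With epsilon = 0, the non-strict rule keeps exact matches only.
exact = abc_rejection(observed=7, n_flips=10, epsilon=0, n_draws=2000, rng=rng)
# With epsilon = 0, the strict rule can never be satisfied, so nothing is accepted.
never = abc_rejection(observed=7, n_flips=10, epsilon=0, n_draws=2000, rng=rng, strict=True)
print(len(exact), len(never))
```

Under the strict rule with <math>\epsilon = 0</math> the accepted sample is always empty, because a non-negative distance cannot be less than zero. Under the non-strict rule the accepted draws are exact matches, so in this toy model they are samples from the exact posterior, which is Beta(8, 4) for 7 heads in 10 flips with a uniform prior.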
*"Sufficient summary statistics": As Christian writes, it would seem more natural to discuss general summary statistics first, then the special and less practically useful case of sufficient statistics.
'''Response:''' This has been changed.
*"Example": I'd point out that this is an example application only, and more accurate inference is possible here by particle filtering methods. If there were some missing data this would be a more natural ABC application e.g. if only the summary statistic was observed.
'''Response:''' We have added a sentence pointing out that this is only an example application and that the posterior can be computed exactly.
*"Approximation of the posterior": "...has been justified theoretically under some limiting conditions". The word "limiting" doesn't seem (to me) to describe the measurement error case.
'''Response:''' We agree and have reformulated this sentence.
*"Choice and sufficiency of summary statistics": "Sufficient statistics are optimal..." I'd change to "Low dimensional sufficient statistics". For some models (e.g. iid Cauchy) the only sufficient statistics are the full data set, which would be a poor choice.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": "...which is approximated with a pilot run of simulations". Something like "...which is approximated by linear regression based on simulated data" would be more accurate.
'''Response:''' This has been changed.
*"Choice and sufficiency of summary statistics": It might be useful to reference a recent comparison<ref>M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson (2012) A comparative review of dimension reduction methods in approximate Bayesian computation. arxiv.org/abs/1202.3819 </ref> (disclaimer: which I contributed to) between methods of choosing summary statistics.
'''Response:''' A sentence was added with a reference to the paper.
*"Bayes factor with ABC and summary statistics": "...can also be used to..." it might be more accurate to say "...is sufficient to..."
'''Response:''' This has been changed.
*"Bayes factor with ABC and summary statistics": "meaningless" seems too strong as the next sentence suggests a potentially useful alternative way of doing inference.
'''Response:''' The formulation was changed to “may therefore be misinformative”.
*"Prior distribution and parameter ranges": "...based on the principle of maximum entropy". A link to the general topic of [[wp:Prior_distribution#Uninformative_priors|objective priors]] might be helpful here.
'''Response:''' A link has been added.
*"Large data sets": "which may be a tractable approach for ABC based methods". Note it is already easy to parallelise many of the steps in ABC algorithms based on rejection sampling and SMC.
'''Response:''' This has been changed.
*"Curse of dimensionality": Some theoretical results have been proved here<ref>M. G. B. Blum (2010) Approximate Bayesian Computation: a nonparametric perspective. Journal of the American Statistical Association (105) 1178-1187</ref><ref name="FP2012" />.
'''Response:''' We have added references to these papers.
*"Conclusion": "With faster evaluation of the likelihood function..." I'm not sure what this is getting at; in ABC applications the likelihood function typically cannot be evaluated!
'''Response:''' This formulation has been changed.
=== Review of updated article ===
I have read the revised article and discussion of the amendments, and am happy to accept it for publication.
==References==
<references />