Dear Judith,

Please find below the response to the referee report for paper MS# 70721 by Amaya Moro-Martin et al. entitled: "Are Debris Disks and Massive Planets Correlated?"

Thanks for your consideration and thanks to the referee for very useful comments and his/her careful reading of the manuscript.

Cheers,
Amaya

______________________________

REFEREE: The main result of this paper - that there is no significant correlation between planet and debris disk detections - is presented concisely and clearly. The implications are important for those interested in extrasolar planets, debris disks, and the formation of planetary systems. With some clarification of the survival analysis, I recommend this paper for publication.

While the disk-planet correlation is the main subject of the paper, the detection of a new debris disk + giant planet system (HD 38529) is interesting in its own right, and has received a prominent place in the abstract and summary. I would suggest, however, that further attention be given to this system, in light of the IRS spectrum. There is currently very little discussion of this spectrum, mainly just a passing mention of 0.7-sigma excess at 33um (in section 3). I have not seen the spectrum in detail (and I would recommend adding it as a new figure) but from the derived photometry in Table 3 it appears that the excess is much more significant. It is important to note that the claimed 6% calibration uncertainty for the IRS photometry is for the overall spectrum, not for individual wavelengths relative to each other. As such, smaller deviations from the star's wavelength^-2 flux can be detected. Looking at Table 3, the slope from 13um to 24um appears photospheric. The 33um point, meanwhile, is at least 10% higher than a fit to these points, a very significant increase that I would think would stand out very prominently in a plot of observed/model flux (again, a figure would help here). The significance of IRS excess depends on the data analysis.
Please indicate that the spectra have not been pinned to the stellar photosphere and discuss how the various modules are spliced together. If the IRS excess is truly significant, additional analysis of the orbiting dust is warranted, such as an improved fit to the dust temperature and better estimates of the dust location, mass, etc.

RESPONSE: The paper under consideration was originally part of a larger paper that included the debris disk planet correlation plus a detailed analysis of the HD38529 system. For clarity, and because we considered that both topics were interesting in their own right, we decided to split the paper in two. This is the reason why the results for HD38529 are not discussed any further. To inform the reader that these results are presented elsewhere, we have added a reference to the HD38529 paper at the end of the introduction. In that paper, expected to be submitted within the next 3 weeks, we model the disk SED, estimating the possible location of the emitting dust, and we model (analytically and numerically) the dynamical evolution of potential dust-producing planetesimals in the presence of the two known planets, trying to give a self-consistent picture of this interesting system. I very much appreciate the above comments by the referee and I will incorporate them in the HD38529 paper before submission.

_____________________

REFEREE: The most significant comments I have are for the survival analysis. First, please describe the method a bit in section 4.2. At a minimum, indicate that 1) this analysis is ideal for data sets containing upper limits, but 2) it requires some assumptions for the underlying distribution. Explain that the three tests considered (Gehan, logrank, Peto-Prentice) are based on different assumptions for the flux distribution. Is there any reason to think that one of these tests would be most reliable for this particular dataset?
RESPONSE: To address this point we have added the following to the survival analysis section (now section 4.3):

"The planet sample and the control sample are dominated by upper limits, therefore, the K-S test is not sufficient to assess the probability that they could have been drawn from the same parent distribution. To extract the maximum amount of information from the non-detections it is necessary to use survival analysis methods, which make certain assumptions about the underlying distributions. (...) As discussed in Feigelson \& Nelson (\citeyear{feig85}), the logrank test is more sensitive to differences at low values of the variable under consideration (i.e. near the upper limits), while the Gehan test is more sensitive to differences at the high end (i.e. in the detections). The Peto-Prentice test is preferred when the upper limits dominate and the sizes of the samples to be compared differ (as is our case)."

____________________

REFEREE: I strongly question the use of 70um excess flux (in units of mJy) as the variable in consideration. Wouldn't it be more useful to consider the excess luminosity (i.e. L_dust/L_*) rather than flux? Flux will presumably be most highly correlated with distance, rather than other variables such as planets. In several places (4.2 and the conclusion) it is claimed that there is no correlation with excess luminosity, when in fact the test uses flux.

RESPONSE: Thanks for pointing that out. We have changed the survival analysis so that it now uses Ldust/L* as a variable, instead of the excess flux in mJy. The probabilities have changed, but the conclusion is still the same (no evidence of correlation). We have modified the survival analysis section to reflect this change.

________________

REFEREE: Given that the strength of the survival analysis is in dealing with upper limits, please indicate what upper limits are used - 1-sigma or 3-sigma; observed plus noise or simply noise.
RESPONSE: It's 3-sigma, as is now indicated in section 4.

_________________

REFEREE: Again, given the method's ability to handle different detection limits, why were the two samples, FEPS & Bryden, only considered separately? You're explicitly taking into account any selection effects based on distance and background noise, so can't the two samples be combined into one?

RESPONSE: The stars in Bryden's sample are systematically closer than those in the FEPS sample, which is why we originally kept both samples separate. In any case, to check if it would make any difference, we've repeated the survival analysis (with the excess luminosity) with both samples together and the results are now mentioned in the survival analysis section. The conclusions don't change.

_________________

REFEREE: Introduction: Is there a reference for the mass of the Kuiper Belt?

RESPONSE: The mass estimate has been updated in the intro. The references are Bernstein et al. 2004 and Luu and Jewitt 2002.

_________________

REFEREE: Section 2: Uncertainties for the three instruments are quoted from various sources. It would be informative to compare these uncertainties with the dispersion within your own data relative to model values (for IRS-13,24,33 and MIPS-24 in particular). How are the MIPS-70 uncertainties in Table 3 calculated?

RESPONSE: We have added to section 2 a description of how the uncertainties for the 70 micron fluxes were calculated. Thanks for pointing out that this piece of information was missing.
This is what we have added:

"The photometry uncertainty is given by $\sigma$ = $\Omega$$\sigma$$_{sky}$ ($\it{N_{corr}}$)$^{1/2}$$\eta$$_{sky}$$\eta$$_{corr}$(1.0+$\it{N_{ap}/N_{sky}}$)$^{1/2}$, where $\sigma$$_{sky}$ is the standard deviation in the sky annulus surface brightness, $\Omega$ is the pixel solid angle, $\it{N_{sky}}$ and $\it{N_{ap}}$ are the number of pixels in the sky annulus and in the aperture, and $\eta$$_{sky}$ and $\eta$$_{corr}$ are correction factors that account for the presence on the mosaic of non-uniform noise and of correlated noise, respectively. We used $\eta$$_{sky}$ = 2.5 and $\eta$$_{corr}$ = 1.40 (see full description in Carpenter et al. in preparation)."

A very detailed description of the FEPS data (including what the referee suggests) will soon appear in Carpenter et al. This is why we chose not to present this same analysis here. We refer to Carpenter's paper throughout the section.

__________________

REFEREE: Section 4: The second paragraph (on the color-color diagram) seems misplaced. Should it go in section 3?

RESPONSE: Agreed. It's now in the first paragraph of Section 3.

__________________

REFEREE: Section 4.1: The K-S test shows that the control and planet samples are consistent with each other in terms of distance and IR background. Why was age, which was discussed at length in the first half of the paragraph, left out in the second half? Age is certainly the most important variable to consider - survival analysis naturally takes into account different distances and backgrounds, but not ages. A K-S test for age must also be included. Also, please list the number of stars in the two rough age bins, <1 Gyr and >1 Gyr.

RESPONSE: The K-S test for age yields a very low probability Prob(D>obs)~1e-5, i.e. in terms of stellar age the planet and control sample have NOT been drawn from the same distribution (planet stars tend to be older).
This would be a problem if there were a correlation between the 70 micron excess and the age of the system for the ages under consideration. There is indeed a correlation for stars <100 Myr, but our control sample does not include those younger bins, only stars older than 300 Myr. Observations indicate that for these older stars there is no strong correlation between 70 micron excess and age (Bryden et al. 2006 and also Hillenbrand et al. in preparation - the latter are results from FEPS), and based on these results, we decided that it was reasonable to compare both samples, even though they differ in age. To clarify this point we have added this to section 4.1:

"Note that the K-S test for age yields a much lower probability ($\it{Probability(D>observed)}$ $\sim$10$^{-5}$), i.e. both samples are likely not drawn from the same distribution in terms of age. However, given that the observations indicate that for the ages under consideration ($>$ 300 Myr, with approximately half of the stars having ages $>$ 1 Gyr) there is no correlation between the 70 $\mu$m excess and the stellar age, we do not expect to introduce any significant bias by comparing both samples (but keep in mind that the validity of the comparison relies on the observed lack of correlation with age)."

_________________

REFEREE: Section 4.2 (Survival Analysis) The second paragraph has nothing to do with survival analysis. Maybe it should go in 4.1?

RESPONSE: Yes, it was out of place. Section 4.1 describes the sample selection so it probably doesn't belong there either. I've added a new small subsection "4.2 Frequency of debris disks". And for consistency, I've changed the title of old subsection 4.2 (now 4.3) to "4.3 Fractional excess luminosity: survival analysis". This way, the sample selection criteria and the results are in different subsections.
_________________

REFEREE: Section 5.2, 3rd paragraph The caption to figure 3 states that the model might be consistent with the data if the model is allowed to float by a factor of 10. A stronger and perhaps more interesting statement would be to rule out the model based on its predicted time dependence, which is not observed (also see Najita & Williams 2005 who show that the model's predicted evolution of disk radii is not seen in observations).

RESPONSE: Agreed. We have added the following sentences to section 5.2.

"However, this cannot be the only mechanism because if it were to dominate debris production one would expect to see the dust temperature to be correlated with age and this trend has not been observed (Najita \& Williams~\citeyear{naji05}). Similarly, the observations in Fig. 3 could not confirm the time dependence of the fractional 70 $\mu$m excess emission predicted by the models."

_________________

REFEREE: Section 5.2, 4th paragraph Exactly what type of planets does this 12% frequency consider? The text refers to gas giants with mass < 13MJup, but certainly this does not refer to Neptune-mass planets, for which little is known.

RESPONSE: Yes, it cannot refer to Neptune-like planets. We've added a few clarifications to that paragraph about those estimates:

"Firstly, debris disks are more common than massive planets: it is found that $>$7\% of stars have giant planets with M$<$13 M$_{Jup}$ and semimajor axis within 5 AU, but this is a lower limit because the duration of the surveys (6--8 years) limits the ability to detect planets between 3 AU and 5 AU. The expected frequency of gas giant planets increases to $\sim$12\% when RV surveys are extrapolated to 20 AU (Marcy et al.~\citeyear{marc05}), with the distribution of planets following d$\it{N}$/d$\it{M}$ $\propto$$\it{M}$$^{-1.05}$ from M$_{Saturn}$ to 10 M$_{Jup}$ (the surveys are incomplete at smaller masses)."
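
As a rough illustration of the mass distribution quoted above, the following sketch integrates dN/dM ∝ M^-1.05 between M_Saturn and 10 M_Jup and estimates what fraction of planets in that range lie above 1 M_Jup. The Saturn-to-Jupiter mass ratio (taken here as about 0.30) and the function names are assumptions introduced for this sketch, not values or code from the manuscript.

```python
# Fraction of planets in a mass interval for dN/dM ∝ M**(-1.05),
# normalized over M_Saturn .. 10 M_Jup (the range quoted above).
# All masses are in Jupiter masses.

ALPHA = 1.05      # power-law index from the quoted text
M_SATURN = 0.30   # ASSUMED Saturn mass in M_Jup (not from the manuscript)
M_MAX = 10.0      # upper end of the quoted range, in M_Jup


def power_law_integral(m_lo, m_hi, alpha=ALPHA):
    """Integral of M**(-alpha) dM from m_lo to m_hi (valid for alpha != 1)."""
    p = 1.0 - alpha
    return (m_hi**p - m_lo**p) / p


def mass_fraction(m_lo, m_hi, m_min=M_SATURN, m_max=M_MAX):
    """Fraction of planets with m_lo <= M <= m_hi among those in [m_min, m_max]."""
    return power_law_integral(m_lo, m_hi) / power_law_integral(m_min, m_max)


if __name__ == "__main__":
    above_jupiter = mass_fraction(1.0, M_MAX)
    print(f"Fraction with M > 1 M_Jup: {above_jupiter:.2f}")
```

Because the index is close to 1, the distribution is nearly flat per logarithmic mass interval, so the result depends only weakly on the exact value adopted for the Saturn mass.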
_________________

REFEREE: Section 5.2, 5th paragraph Besides the Greaves 2006 reference, there are several papers looking at the correlation with metallicity based on more up-to-date (Spitzer) data (e.g. Beichman 2005, Bryden 2006).

RESPONSE: Of course. Those two references have been added.

_________________

REFEREE: Table 1: Although it is hard to determine statistical trends with just a single detection, it may be worth noting that the one star with excess, HD 38529, is the most evolved, most luminous, and most massive star in the sample. Particularly so, given theories that predict melting of Kuiper Belt material as a star evolves (Jura 2004). At the very least, its unusual placement in the H-R diagram (well above the main sequence) confirms its old age (>Gyr).

RESPONSE: Thanks for pointing that out. We have added the referee's comment to the end of the second paragraph of section 3 (where the HD 38529 detection is discussed).

"Even though it is difficult to identify statistical trends from one detection, it is interesting to note that HD 38529 is the most luminous, most massive, and most evolved of the planet bearing stars in Table 1. Assuming V = 5.95 (Johnson 1966), a Hipparcos distance of 42 pc and no reddening, the object has an absolute visual magnitude of M$_{v}$ = 2.81 and Log(L/L$_{Sun}$) = 0.82, putting the star on the Hertzsprung gap, so it is clearly post-main sequence."

________________

REFEREE: Table 3: Perhaps note that calibration uncertainties are not included in the error bars.

RESPONSE: Done. This is clarified now in the table caption.

_______________

REFEREE: Figure 1: Having excess shown as a subpanel below each SED is a nice way of presenting this type of data that I haven't seen before.

RESPONSE: Thanks.

_______________

REFEREE: Finally, as to the main conclusion of the paper (no correlation between planets and IR excess), is it possible to place a quantitative limit on this lack of correlation?
For example, can you state that the frequency of debris around planet-bearing stars must be within a factor of two of control sample's?

RESPONSE: We have the following detection frequencies:

1/9 for planet stars vs. 9/99 for control sample (FEPS)
1/11 for planet stars vs. 7/69 for control sample (Bryden).

If we assume that the error in the number of stars with excesses is ~ sqrt(N), then we could state that the frequency of debris around a planet-bearing star is within a factor of 3 of the control sample. It's a crude estimate because we are dealing with very small number statistics. We've added this to section 4.2.

_______________

Typos:

>>"`scale" --> "scale" in section 2, paragraph 2
I didn't find this one...

>>"asses" --> "assess" in section 4.1
Done.

>>"cannot not" --> "cannot" in the appendix
Done.

>>The lower left panel of Figure 2 appears to have an error.
>>Where does the point with F24/F8 = 0.12 come from?
Thanks for pointing that out. That point was HD 80606. The reason why it did not appear in the panel above (F33/F24 vs F24/F8) was because its IRS spectrum is very noisy beyond 30 microns, making the 33 micron IRS synthetic photometric point unreliable (this source was flagged by the SSC as non-nominal possibly because of a peak-up failure). However, in checking this out I've realized that there was a small inconsistency between the 24 micron values that I was using to create the color-color plot and the 24 micron values listed in table 3, the latter coming from our latest data reductions with improved 24 micron photometry. The issue was that I had forgotten to update the color-color plot with these new values. The new manuscript now includes the updated color-color plot, including HD 80606 in both panels. I have added a note about this star in the table caption explaining why it's an outlier in the upper left panel. Thanks again for pointing out this inconsistency.

>>"could consistent" --> "could be consistent" in Fig. 3 caption
Done.
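
For reference, the crude frequency comparison given in the last response (1/9 vs. 9/99 and 1/11 vs. 7/69) can be checked with a few lines of arithmetic. The sketch below assumes sqrt(N) errors on the numbers of excess stars, as stated there, and propagates them in quadrature to the ratio of detection frequencies. The first-order error propagation and the function name are assumptions of this sketch; this is only one possible way of reading the "factor of 3" statement, not necessarily the calculation used for the manuscript.

```python
import math


def frequency_ratio(n_planet, tot_planet, n_control, tot_control):
    """Ratio of detection frequencies (planet / control) and its error.

    ASSUMPTION: counting errors of sqrt(N) on the number of excess
    stars in each sample, combined in quadrature (first-order only).
    """
    ratio = (n_planet / tot_planet) / (n_control / tot_control)
    # Relative error on a count N is sqrt(N)/N = 1/sqrt(N)
    rel_err = math.sqrt(1.0 / n_planet + 1.0 / n_control)
    return ratio, ratio * rel_err


# Counts quoted in the response:
# (planet stars with excess, planet stars, control with excess, control stars)
samples = {
    "FEPS": (1, 9, 9, 99),
    "Bryden": (1, 11, 7, 69),
}

if __name__ == "__main__":
    for name, counts in samples.items():
        ratio, err = frequency_ratio(*counts)
        print(f"{name}: ratio = {ratio:.2f} +/- {err:.2f}")
```

With these counts, the ratio plus its propagated error stays below 3 for both control samples. The error is as large as the ratio itself in each case, which reflects the very small number statistics noted in the response.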
*************************************************