The HIDE package has been developed at ETH Zurich in the Software Lab of the Cosmology Research Group of the ETH Institute of Astronomy.
The development is coordinated on GitHub and contributions are welcome. The documentation of HIDE is available at readthedocs.org.
Abstract
As several large single-dish radio surveys begin operation within the coming decade, a wealth of radio data will become available and provide a new window to the Universe. In order to fully exploit the potential of these datasets, it is important to understand the systematic effects associated with the instrument and the analysis pipeline. A common approach to tackle this is to forward-model the entire system, from the hardware to the analysis of the data products. For this purpose, we introduce two newly developed, open-source Python packages: the HI Data Emulator (HIDE) and the Signal Extraction and Emission Kartographer (SEEK) for simulating and processing single-dish radio survey data. HIDE forward-models the process of collecting astronomical radio signals in a single-dish radio telescope instrument and outputs pixel-level time-ordered data. SEEK processes the time-ordered data, removes artifacts from Radio Frequency Interference (RFI), automatically applies flux calibration, and aims to recover the astronomical radio signal. The two packages can be used separately or together depending on the application. Their modular and flexible nature allows easy adaptation to other instruments and datasets. We describe the basic architecture of the two packages and examine in detail the noise and RFI modeling in HIDE, as well as the implementation of gain calibration and RFI mitigation in SEEK. We then apply HIDE & SEEK to forward-model a Galactic survey in the frequency range 990–1260 MHz based on data taken at the Bleien Observatory. For this survey, we expect to cover 70% of the full sky and achieve a median signal-to-noise ratio of approximately 5–6 in the cleanest channels including systematic uncertainties. However, we also point out the potential challenges of high RFI contamination and baseline removal when examining the early data from the Bleien Observatory.
The fully documented HIDE & SEEK packages are available from the HIDE & SEEK page of the Cosmology Research Group at ETH Zurich and are published under the GPLv3 license on GitHub.
1 Introduction
Forward-modeling has become a common approach in various fields of astronomy where mock datasets are simulated and analyzed in parallel with the science data. This has become especially prevalent in cosmology where large datasets are used and high precision is required. Prominent examples are analyses of the cosmic microwave background (Reinecke et al., 2006), spectroscopy (Nord et al., 2016) and weak gravitational lensing (e.g. Bridle et al., 2009, Refregier and Amara, 2014, Bruderer et al., 0000, Peterson et al., 2015). These forward-modeling pipelines simulate the astrophysical signals, the instrument response and the data reduction process in order to understand any systematic biases from hardware or software and to estimate statistical errors in the measurement chain.
In this paper, we implement this forward-modeling approach for single-dish radio surveys. Several single-dish radio surveys are being planned for the next decades with the goal of mapping the neutral hydrogen (HI) in the Universe (Battye et al., 2013, Santos et al., 2015, Bigot-Sazy et al., 2016). We develop two software packages: the HI Data Emulator (HIDE) and the Signal Extraction and Emission Kartographer (SEEK). HIDE forward-models the entire radio survey system chain, while SEEK processes both the simulated data and the observed survey data in a reproducible and consistent way. Various sophisticated simulation and data reduction pipeline packages for radio astronomy exist (e.g., Swinbank et al., 2015, McMullin et al., 2007, Dodson et al., 2016). However, many of them are either not open source or project-specific. HIDE & SEEK are developed from a different angle: the initial functionalities are rather simple but can be expanded easily, as the codes are designed with a high level of modularity, flexibility and transparency, in a pure Python implementation with rigorous testing. Developing the two packages simultaneously has the advantage that the individual components of one pipeline can be cross-validated against their counterparts in the other pipeline.
HIDE & SEEK are developed based on the hardware system and data products from the 7m telescope at the Bleien Observatory as described in Chang et al. (0000, hereafter C16). This framework is then used to forward-model a Galactic survey in the frequency range 990–1260 MHz conducted at the Bleien Observatory for testing and science verification purposes. Such an analysis allows us to forecast the expected power of this survey with the existing hardware system at Bleien. Comparing the results of the forward model and data also helps to identify areas that require improvements in HIDE & SEEK as well as the hardware system.
This paper is organized as follows. In Section 2, we first describe the basic architecture and design of HIDE & SEEK. Detailed implementations for specific functionalities are described in Appendix A (beam convolution on a sphere), Appendix B (RFI masking) and Appendix C (flux calibration). In Section 3, we apply HIDE & SEEK to forward-model a survey based on early data taken at the Bleien Observatory. This includes customizing the various functionalities to this specific survey and providing a forecast for the expected outcome of the survey. Finally, we conclude in Section 4. In Appendix D, we show an example of how we applied SEEK to process part of the early data from the Bleien Observatory and what we learn from comparing these results to the HIDE simulations. Information for downloading and installing HIDE & SEEK and the default file format are described in Appendix E and Appendix F, respectively.
2 The HIDE & SEEK pipelines
HIDE is a simulation pipeline for single-dish radio telescopes and SEEK is a data processing pipeline for observed or simulated radio telescope data. We have developed the two independent software packages simultaneously, which means that one pipeline can be used to cross-validate the other. For example, in HIDE we simulate radio frequency interference (RFI) signals, while in SEEK we detect and mask the RFI signals. The quality of the RFI masking in SEEK can therefore be assessed using simulated data from HIDE. Conversely, the fidelity of the modeling in HIDE can be verified by processing real data with SEEK and comparing the results with the simulation. Both pipelines share a common design. Fig. 1 shows a schematic illustration of both packages.
Fig. 1. Flow diagram of the steps executed in HIDE (left) and SEEK (right). Each box with rounded corners is a plugin and the other boxes indicate external data. The dotted boxes indicate the data format (Healpix map or TOD) along the process. In HIDE, the simulation starts from a Healpix map and outputs data in TOD format. This TOD can be fed directly into SEEK, which aims to project the TOD back onto a Healpix map. The two packages together close the end-to-end loop of this forward-modeling framework. This parallel simulation and analysis setting also facilitates cross-validation during the development of the packages.
To ensure the common design and facilitate switching between HIDE & SEEK, both packages are based on the simple plugin-based workflow engine Ivy, which we introduce in Section 2.1. In Section 2.2, we briefly describe the two main data structures that are used in HIDE & SEEK.
2.1. Ivy: Plugin-based workflow engine
Ivy is a generic workflow engine written in Python and is open-source under the GPLv3 license. Its architecture ensures that individual elements of pipelines (which we refer to as “plugins” here) are self-contained. The interaction between the plugins is done via a context object that is passed from plugin to plugin by the framework. Ivy’s design fosters reusability and maintainability of the code. As every plugin is loosely coupled to the pipeline and the other plugins, unit testing the individual components becomes greatly simplified. Furthermore, new features can easily be added to an existing pipeline in the form of a new plugin without interfering with the other plugins.
A pipeline developed with the Ivy framework is always defined by a configuration file that contains a list of the plugins belonging to the pipeline and the parameters used by those plugins. An Ivy pipeline can be executed in parallel on multiple CPU cores with minimal work. By default Ivy uses Python’s built-in multiprocessing package, but alternative parallelization schemes such as an IPython cluster can be chosen. The workload is automatically distributed among the available CPUs and Ivy executes the plugins in parallel.
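The plugin-and-context pattern described above can be illustrated with a minimal sketch. This is not Ivy's actual API: the class names `Initialize` and `AddOffset`, the function `run_pipeline` and the layout of `config` are all hypothetical and only show how a shared context object lets loosely coupled plugins cooperate.

```python
from types import SimpleNamespace


class Initialize:
    """Hypothetical plugin: copies a configuration value onto the context."""

    def __call__(self, ctx):
        ctx.seed = ctx.params["seed"]


class AddOffset:
    """Hypothetical plugin: reads data from the context and writes it back."""

    def __call__(self, ctx):
        ctx.data = [x + ctx.params["offset"] for x in ctx.data]


def run_pipeline(plugins, params, data):
    """Run the plugins in order, passing one context object between them."""
    ctx = SimpleNamespace(params=params, data=list(data))
    for plugin in plugins:
        plugin(ctx)
    return ctx


# The "configuration": an ordered plugin list plus the parameters they use.
config = {
    "plugins": [Initialize(), AddOffset()],
    "params": {"seed": 42, "offset": 1.5},
}
ctx = run_pipeline(config["plugins"], config["params"], [0.0, 1.0, 2.0])
print(ctx.seed, ctx.data)
```

Because each plugin only touches the context, it can be unit tested in isolation by handing it a small hand-built context.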
2.2. Data structures: time-ordered data and Healpix maps
The two relevant data structures used in HIDE & SEEK are time-ordered data (TOD) and Healpix (Górski et al., 2005) maps. TOD refers to the data type we record at the end of the instrument chain (a bolometer or a spectrometer). Typically, one or multiple values that scale linearly with the signal received by the telescope are recorded over time. In our case, since the measurement instrument is a spectrometer, we have one value per frequency channel per time step, which results in a 2D plane with time and frequency on the two axes (see Fig. 3 for an example of the observed and simulated TOD). This data format itself is agnostic about where in the sky the signal comes from. Only by combining the data with the telescope pointing can we map the TOD to sky coordinates.
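As a rough illustration of this layout, the sketch below stores a toy TOD as one row per frequency channel and one column per time sample, with the pointing kept in a separate per-time list. The sizes, values and the helper `sample` are invented for illustration and do not reflect the on-disk format used by HIDE & SEEK.

```python
# Hypothetical sizes: 3 frequency channels, 4 time samples.
frequencies_mhz = [990.0, 1125.0, 1260.0]
times_s = [0.0, 1.0, 2.0, 3.0]

# tod[i][j] is the value recorded in frequency channel i at time sample j.
tod = [[float(10 * i + j) for j in range(len(times_s))]
       for i in range(len(frequencies_mhz))]

# Pointing is stored separately: one (azimuth, elevation) pair per time sample.
pointing_deg = [(180.0, 45.0), (180.5, 45.0), (181.0, 45.0), (181.5, 45.0)]


def sample(tod, pointing, i_freq, j_time):
    """Return a TOD value together with the pointing at that time sample."""
    return tod[i_freq][j_time], pointing[j_time]


value, (az, el) = sample(tod, pointing_deg, 2, 1)
print(value, az, el)
```

The TOD alone carries no sky position; the position only appears once the time index is used to look up the pointing.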
Fig. 3. The left panel in the top row displays the unprocessed TOD recorded on 21 March 2016. The broadband RFI contamination comes mainly from the nearby airport and is visible in the 1025–1150 MHz frequency band. The TOD also shows the variation between day and night, as the amount of RFI increases at around 6:00 am and decreases at 11:00 pm UT. The lower left panel shows the TOD simulated with HIDE. The right panels show the corresponding TOD with the RFI mask overlaid (orange).
A Hierarchical Equal Area isoLatitude Pixelation (Healpix) map refers to a specific pixelation scheme implemented on a sphere. This format is commonly used in cosmology. Healpix maps have by construction equal-area pixels and the pixelation introduces minimal distortions and errors compared to other projection methods. Healpix maps can be manipulated with the Python wrapper Healpy. In this work, we use Healpix maps whenever the celestial coordinate is relevant. This includes the input Milky Way signal in HIDE and the final reconstructed map in SEEK.
2.3. HIDE architecture
HIDE & SEEK both follow the plugin design concept of the Ivy framework described above. The architecture allows the user to easily add new features or replace existing functionality. In the following, we give a high-level overview of the functionalities in both packages.
HIDE is a package for simulating a single dish radio telescope survey. As such, it takes Healpix maps as inputs and processes them into TOD. The design is flexible and can be customized to different instruments and survey designs. In the following, we describe the setup of the HIDE pipeline. This is not an exhaustive list for generic surveys and can easily be extended. The left half of Fig. 1 shows the structure of the pipeline and includes the following steps:
Initialize
The Ivy configuration is loaded into memory and the random seed is initialized to ensure reproducible results.
Load beam profile
A beam response pattern is loaded according to the configuration. The current design supports parametrized Gaussian or Airy (Airy, 1838) profiles and arbitrary beam patterns specified on a grid. The beam profiles can be frequency-dependent.
Load survey strategy
The pointing of the telescope at a given time in the survey is computed according to the desired survey strategy. The plugin converts the telescope pointing from terrestrial coordinates (azimuth, elevation) into equatorial coordinates (RA, Dec). Different strategies can be chosen such as drift-scan surveys, or a file-based scanning schedule such that a planned survey can be exactly simulated.
Load astronomical signal
The astronomical signal used for the simulation is loaded. Here, we use the Global Sky Model (GSM, de Oliveira-Costa et al., 2008), which is a synthetic model of the Milky Way as a function of frequency based on a large number of radio datasets. The GSM is stored as Healpix maps on a grid of frequencies and interpolated to the desired frequency when needed. The GSM maps are in units of brightness temperature (Kelvin).
Convolve signals with beam
For each telescope pointing defined by the survey strategy, the telescope beam response is convolved with the astronomical signals and appended to the TOD array.
Apply gain
To transform the TOD from units of Kelvin into internal units (Analog-to-digital unit, ADU) as recorded in the instrument, we multiply the TOD by a gain template. This information can come from external calibration or specifications of the instrument.
Add baseline
A frequency- and point-dependent baseline offset is added to the TOD to account for contributions to the overall intensity from the instrument, the environment, etc. (see Section 3.1 for more details).
Add RFI
A simulated RFI signal is added to the TOD array (see Section 3.1 for the specific model derived from data).
Add noise
Gaussian (white) noise and 1/f (pink) noise are added to the data to model noise contributions from the atmosphere and electronics.
Write TOD
The simulations are written to disk in TOD format.
We note that the most computationally demanding step in the HIDE pipeline is the convolution of the signals with the beam. We speed up this step by using quaternions and a KD-tree data structure. The technical details of this operation are described in Appendix A.
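The gain, baseline and noise steps listed above can be sketched as a simple chain applied to an already beam-convolved signal. This is a simplified illustration, not HIDE's implementation: the function `simulate_tod` and its arguments are hypothetical, the baseline is taken to depend on frequency only, and the RFI and 1/f noise components are omitted.

```python
import random


def simulate_tod(sky_kelvin, gain, baseline, noise_sigma, seed=0):
    """Turn a beam-convolved sky signal (K) into instrument units (ADU).

    sky_kelvin: list of rows, one row per frequency channel, one value per time.
    gain, baseline: one value per frequency channel (hypothetical templates).
    noise_sigma: standard deviation of the white noise in ADU.
    """
    rng = random.Random(seed)  # fixed seed for reproducible results
    tod = []
    for row, g, b in zip(sky_kelvin, gain, baseline):
        # Apply gain, add the baseline offset, then add white noise.
        tod.append([g * t + b + rng.gauss(0.0, noise_sigma) for t in row])
    return tod


sky = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
tod = simulate_tod(sky, gain=[2.0, 3.0], baseline=[10.0, 20.0], noise_sigma=0.1)
print(tod)
```

Seeding the generator mirrors the role of the Initialize step: two runs with the same seed give identical TOD.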
2.4. SEEK architecture
SEEK is a flexible and easy-to-extend data processing pipeline for single-dish radio telescopes. It takes the observed (or simulated) TOD in the time-frequency domain as an input and processes it into Healpix maps while applying calibration and automatically masking RFI. The data processing is parallelized using Ivy’s parallelization scheme.
We outline the setup of the SEEK pipeline below. Again, this list is not exhaustive but can easily be extended and modified given a different experiment. The structure of the SEEK pipeline is illustrated in the right half of Fig. 1 and includes the following steps:
Find files
The file system is traversed to find simulated or observed data for a given time period and file-name convention.
Load data
The data is loaded from the file system into memory and smoothed if specified by the user. SEEK is currently able to process both FITS [17] and HDF5 data formats. The design concept of Ivy ensures that the other plugins do not depend on the origin of the data. Therefore extending the support for further file formats can be implemented without interfering with the other functionalities.
Apply gain
A gain factor is computed by using special calibration data that was collected on dedicated calibration days in the survey. Alternatively, an externally provided template can be loaded from the file system. This gain factor is applied to the TOD to convert the instrument-recorded values (ADU) to physical units (Kelvins).
Coordinate transformation
The telescope pointing at each given time is attached to the TOD to give each pixel in the TOD a terrestrial coordinate. These coordinates are then transformed into equatorial coordinates corresponding to a given point on the celestial sphere.
Mask objects and bad frequency channels
The TOD is masked if known bright objects such as the Sun and the Moon are too close to the telescope pointing. Furthermore, frequency bands known to be unusable (e.g., seriously contaminated by satellite communication bands) are masked.
Mask RFI
The TOD is analyzed and pixels identified as contaminated by RFI are masked (see Appendix B for more details on SEEK’s automated RFI masking mechanism).
Baseline removal
The baseline level per frequency is estimated from the median value over time of the cleaned TOD. This baseline is subtracted from the TOD.
Create map
For every frequency channel the TOD is processed into a Healpix map. By default, each pixel in the Healpix map is filled with the mean value of all measurements in that pixel. One can also invoke an outlier-rejection step to avoid uncleaned RFI contaminating the mean.
Write map
The Healpix maps and auxiliary information such as the frequency range and redshift information are written to disk in HDF5 file format.
In the SEEK pipeline, tackling the RFI is the most computationally challenging task. This includes both the RFI masking in the TOD plane and the final outlier-rejection applied at the Healpix map level. We describe our specific treatment of RFI in Appendix B.
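The baseline-removal and map-making steps listed above can be sketched as follows. This is an illustrative simplification rather than SEEK's code: `remove_baseline` and `bin_to_map` are hypothetical helpers, the RFI mask is assumed to be given as booleans (`True` meaning masked), and `pixel_index` stands in for the Healpix pixel number of each time sample, which in practice could be obtained with a call such as healpy's `ang2pix`.

```python
from statistics import median


def remove_baseline(tod, mask):
    """Subtract the per-frequency median over time of the unmasked samples."""
    cleaned = []
    for row, mask_row in zip(tod, mask):
        good = [v for v, m in zip(row, mask_row) if not m]
        base = median(good) if good else 0.0
        cleaned.append([v - base for v in row])
    return cleaned


def bin_to_map(row, mask_row, pixel_index, n_pix):
    """Average all unmasked samples of one channel that fall in each pixel."""
    total = [0.0] * n_pix
    hits = [0] * n_pix
    for v, m, p in zip(row, mask_row, pixel_index):
        if not m:
            total[p] += v
            hits[p] += 1
    # Pixels that were never observed are left as None.
    return [t / h if h else None for t, h in zip(total, hits)]


tod = [[5.0, 6.0, 100.0, 7.0]]            # one channel, four time samples
mask = [[False, False, True, False]]      # third sample flagged as RFI
pixel_index = [0, 0, 1, 1]                # sky pixel hit at each time sample

cleaned = remove_baseline(tod, mask)
sky_map = bin_to_map(cleaned[0], mask[0], pixel_index, n_pix=3)
print(cleaned, sky_map)
```

Using the median rather than the mean for the baseline keeps a few unflagged outliers from shifting the estimate, which is the same motivation as the optional outlier rejection at the map level.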
2.5. Quality assurance
We developed HIDE & SEEK using best practices from software engineering. In particular, we use tools common in the Python community. Both packages follow a standardized packaging scheme, which simplifies development and installation, including the resolution of third-party dependencies. The standard released by the Python Packaging Authority defines the directory structure of a package, which enforces the separation of functionality and makes it easier for new developers to engage in the project. Furthermore, it defines how to store meta-information of the package. In order to maintain a high level of quality and to ensure that newly developed features do not interfere with existing code, we rigorously test the functionality of the packages with unit tests. To develop those tests we use the common testing framework py.test. Finally, both packages are fully documented using the standardized reStructuredText syntax such that we can automatically generate and publish documentation using the Sphinx package.
Appendix A. Implementation of beam convolution on a sphere
In order to convolve the beam response with the simulated astronomical signal, we have to rotate the grid that defines the beam geometry on a sphere. As this step is repeated for every telescope pointing defined by the scanning strategy as well as for every simulated frequency, high efficiency for the operation is crucial.
Conventionally, spherical rotations are implemented using Euler rotation matrices. Applying this rotation scheme to a Healpix map requires transforming the spherical pointing angles (θ and ϕ) into Cartesian coordinates (x, y, z), computing the rotation matrix R, applying this matrix to the coordinate vector, transforming the results back into spherical coordinates, and finally performing the convolution. Typically, rotating about multiple axes requires repeating the above steps for each axis, which is computationally more demanding and numerically less stable.
We have implemented all the coordinate rotations with quaternions, a technique commonly used in 3D computer vision (Shoemake, 1985). Rotations about multiple axes can be combined into one operation with quaternions, which makes them computationally more efficient and stable. Furthermore, they do not suffer from the so-called gimbal lock that affects Euler rotations, where a rotation about one axis may produce coordinates that do not allow further rotations (Shoemake, 1985).
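To make the idea of combining several rotations into one operation concrete, here is a minimal quaternion sketch in plain Python. It is a textbook construction, not the HIDE implementation, and the function names are chosen for illustration only. Two rotations (90° about z, then 90° about x) are merged into a single quaternion that is then applied to a vector once.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation by `angle` about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def quat_multiply(q, r):
    """Hamilton product q * r: as a rotation, r is applied first, then q."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)


def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * v * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, vx, vy, vz = quat_multiply(quat_multiply(q, (0.0, *v)), conj)
    return (vx, vy, vz)


q_z = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2.0)
q_x = quat_from_axis_angle((1.0, 0.0, 0.0), math.pi / 2.0)
combined = quat_multiply(q_x, q_z)       # one quaternion for both rotations
rotated = rotate(combined, (1.0, 0.0, 0.0))
print(rotated)                           # close to (0, 0, 1)
```

The combined quaternion can be reused for every grid point of the beam, so the cost of composing the rotations is paid only once per pointing.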
Additionally, this allowed us to easily implement a lookup table with a binary search for the sine and cosine functions to further speed up the rotations. In order to efficiently find the pixels relevant for the rotation of the beam geometry, we also store all the Healpix pixel information in a KD-tree adapted for spherical coordinates.
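One possible reading of the lookup-table idea is sketched below: sine values are tabulated once on a grid and later retrieved with a binary search instead of being recomputed. The grid resolution `N_GRID` and the helper names are assumptions made for this example; the source does not specify how the table in HIDE is laid out.

```python
import bisect
import math

# Precompute sine on a regular grid over [0, 2*pi] (hypothetical resolution).
N_GRID = 4096
GRID = [2.0 * math.pi * i / N_GRID for i in range(N_GRID + 1)]
SIN_TABLE = [math.sin(a) for a in GRID]


def sin_lookup(angle):
    """Approximate sin(angle) by the tabulated value at the grid point just below."""
    a = angle % (2.0 * math.pi)
    i = bisect.bisect_right(GRID, a) - 1  # binary search for the grid cell
    return SIN_TABLE[i]


def cos_lookup(angle):
    """cos(x) = sin(x + pi/2), so one table serves both functions."""
    return sin_lookup(angle + math.pi / 2.0)


print(sin_lookup(1.0), math.sin(1.0))
print(cos_lookup(1.0), math.cos(1.0))
```

The approximation error is bounded by the grid spacing (about 1.5e-3 here), so the table size trades memory for accuracy. On a regular grid the index could also be computed directly; the binary search is kept to match the description in the text and would also work for an irregular grid.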