- Ablation Study / 部分切除学习
- Accuracy / 精确度
- Activation / 激活函数
- Anchor Box / 锚箱,边界框
- Annotation / 标记
- Annotation Format / 标记格式
- Annotation Group / 标记组
- Architecture / 架构
- AUC / 曲线下面积
- Augmentation / 增加训练集
- AutoML
- Backbone / 主干网络
- Backprop / Back propagation / 反向传播
- Bag of Freebies / 免费赠品
- Batch Inference / 批量推理
- Batch Size / 批大小
- BCCD / 血细胞计数和检测数据集
- Black Box / 黑盒
- Block / 块
- Bounding Box / 边界框
- Channel / 通道
- Checkpoint / 检查点
- Class / 类别
- Class Balance / 类别平衡
- Classification / 分类任务
- COCO
- Colab / Google Colaboratory
- Computer Vision / 机器视觉
- Confidence / 置信度
- Confidence Threshold / 置信度阈值
- Converge / 收敛
- Convert / 转换
- Convolution / 卷积
- Convolutional Neural Network (CNN, ConvNet) / 卷积神经网络
- CoreML
- CreateML
- Cross Validation / 交叉验证
- CUDA
- CuDNN
- Custom Dataset / 自定义数据集
Deep learning papers are full of seemingly “biological” terms such as backbone (the spine) and **head**. What do they mean? This article is a glossary of common deep learning vocabulary.
Ablation Study / 部分切除学习
Removing features from your model one by one to see how much each one individually contributes to performance. Ablation studies are common in research papers about new model architectures that contain many novel contributions, and are a standard way to analyze new models and features.
Accuracy / 精确度
The proportion of a model’s predictions that are correct. It is common for classification models that have a single correct answer (versus object detection, where there is a gradient from “perfect” to “pretty close” to “completely wrong”). Terms such as “top-5 accuracy” are often used, meaning “how often was the correct answer among the model’s 5 most confident predictions?” Top-1 and top-3 accuracy are also common.
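As a minimal sketch of top-k accuracy, the function below counts a prediction as correct when the true label is among the k highest-scoring classes. The name `top_k_accuracy` and the sample scores are made up for illustration.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for class_scores, true_label in zip(scores, labels):
        # Rank class indices from most to least confident.
        ranked = sorted(range(len(class_scores)),
                        key=lambda c: class_scores[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical scores for 4 images over 3 classes, with their true labels.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
]
labels = [0, 1, 1, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

With these numbers only the first image is right at top-1, while every true label appears in the top 2.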
Activation / 激活函数
The function a neural network unit applies to transform data as it passes through the network, typically a nonlinear transformation of the unit’s output. See activation function.
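Two widely used activation functions, ReLU and sigmoid, can be sketched in a few lines of plain Python. This is only an illustration on single numbers; real frameworks apply them to whole tensors.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


print(relu(-2.0), relu(3.0), sigmoid(0.0))
```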
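Anchor Box / 锚箱,边界框
Predefined reference boxes of various sizes and aspect ratios, common in object detection models, that help predict the location of bounding boxes. The model typically predicts offsets relative to these anchors rather than box coordinates from scratch.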
Annotation / 标记
The “answer key” for each image. Annotations are markup placed on an image (bounding boxes and class names for object detection, polygons or a segmentation map for segmentation) to teach the model the ground truth.
Annotation Format / 标记格式
The particular way of encoding an annotation. There are many ways to describe a bounding box’s size and position (JSON, XML, TXT, etc.) and to delineate which annotation goes with which image. In practice the term usually refers to the storage format, and annotations have to be parsed from that format before they can be used.
Annotation Group / 标记组
Describes what types of object you are identifying, for example “Chess Pieces” or “Vehicles”. Classes (e.g. “rook”, “pawn”) are members of an annotation group. The term itself is not widely used; “categories” or “classes” are heard more often.
Architecture / 架构
A specific neural network layout (layers, neurons, blocks, etc.). Architectures often come in multiple sizes whose design is similar except for the number of parameters. For example, EfficientDet ranges from D0 (smallest) to D7 (largest).
AUC / 曲线下面积
Area Under the Curve. An evaluation metric for the efficacy of a prediction system that trades off precision against recall; here the curve is the precision-recall (PR) curve. The PR curve slopes downward as the algorithm’s confidence threshold is lowered, which allows more, but less precise, predictions.
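One simple way to turn a precision-recall curve into a single number is the trapezoidal rule. The sketch below assumes the curve is given as a few (recall, precision) points sorted by recall; the points are invented, and benchmark suites often use other interpolation schemes.

```python
def area_under_curve(xs, ys):
    """Trapezoidal-rule area under a curve given points sorted by x."""
    area = 0.0
    for i in range(1, len(xs)):
        width = xs[i] - xs[i - 1]
        mean_height = (ys[i] + ys[i - 1]) / 2.0
        area += width * mean_height
    return area


# Hypothetical PR curve: precision falls as recall rises.
recall = [0.0, 0.5, 1.0]
precision = [1.0, 0.8, 0.4]

auc = area_under_curve(recall, precision)
print(round(auc, 3))
```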
Augmentation / 增加训练集
Creating more training examples by distorting your input images so your model doesn’t overfit on specific training examples. For example, you may flip, rotate, blur, or add noise.
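A horizontal flip is one of the simplest augmentations. The sketch below treats an image as a list of pixel rows and also mirrors a bounding box given as `(x_min, y_min, x_max, y_max)`, since annotations must be transformed along with the image. Both helper names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def flip_box_horizontal(box, image_width):
    """Mirror an (x_min, y_min, x_max, y_max) box to match the flipped image."""
    x_min, y_min, x_max, y_max = box
    return (image_width - x_max, y_min, image_width - x_min, y_max)


image = [
    [1, 2, 3],
    [4, 5, 6],
]
box = (0, 0, 1, 2)  # covers the left-most column

flipped = flip_horizontal(image)
flipped_box = flip_box_horizontal(box, image_width=3)
print(flipped, flipped_box)
```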
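AutoML
One-click tools for training models that optimize themselves (usually hosted in the cloud). AutoML is as much a concept as a product: the tool ships with many general-purpose building blocks, takes only data and labels from the user, searches for the most suitable model, and trains it without manual intervention. These tools can be a good starting point, a good baseline, and in some cases an “it just works” solution compared with tuning your own models. Most current AutoML offerings are paid, and not cheap.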
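Backbone / 主干网络
An object detection model is made up of three parts: a head, a neck, and a backbone. The backbone is the “base” classification model that the object detection model is built on. In most cases it is the feature-extraction network: its job is to extract information from the image for the later parts of the network to use.
Backbones are usually established networks such as ResNet or VGG rather than custom designs, because these networks have proven to be strong feature extractors on tasks such as classification. When one is used as a backbone, the officially released pretrained weights are typically loaded directly and our own network is attached after it. The two parts are then trained together; because the loaded backbone can already extract features, training only fine-tunes it to better suit our own task.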
Backprop / Back propagation / 反向传播
Back propagation is the way that neural networks improve themselves, that is, how they update their weights. For each batch of training data, the network does a “forward pass”, then works backwards from the end to find the gradient of the loss with respect to each weight in each layer, and adjusts each weight a little in the direction that most reduces the loss function. Over millions of iterations, the network gets better bit by bit, and this is how it “learns” to fit the training data.
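The forward pass, gradient, and small weight update can be illustrated on the smallest possible “network”: a single weight `w` in the model `y = w * x`, trained with mean squared error. This is a toy sketch with an invented dataset and learning rate; with one weight, the backward pass is a single chain-rule step instead of a pass through many layers.

```python
def train(xs, ys, learning_rate=0.05, steps=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = 0.0
        for x, y in zip(xs, ys):
            prediction = w * x                   # forward pass
            grad += 2.0 * (prediction - y) * x   # d(loss)/d(w) for this sample
        grad /= len(xs)
        w -= learning_rate * grad                # step against the gradient
    return w


# Hypothetical data generated by the rule y = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = train(xs, ys)
print(round(w, 4))
```

The learned weight settles close to 2, the value that generated the data.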
Bag of Freebies / 免费赠品
A collection of training-time techniques, chiefly augmentation, that have been shown to improve performance regardless of model architecture and without adding inference cost. YOLOv4 and YOLOv5 have built these techniques into their training pipelines to improve performance over YOLOv3 without dramatically changing the model architecture.
Batch Inference / 批量推理
Making predictions on many frames at once to take advantage of the GPU’s ability to perform parallel operations. This can help improve performance if you are doing offline (as opposed to real-time) prediction. It increases throughput, but it does not lower per-frame latency, so it does not help real-time FPS.
Batch Size / 批大小
The number of images your model trains on in each step. This is a hyperparameter you can adjust. Increasing the batch size has pros (faster training) and cons (increased memory usage). It can also affect the model’s overall accuracy, and there is a bit of an art to choosing a good batch size because it depends on a number of factors. You may want to experiment with larger or smaller batch sizes.
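To make “images per step” concrete, here is a small sketch that slices a dataset into batches. `iterate_batches` is a hypothetical helper, not any framework’s data loader; note that the last batch may be smaller than the rest.

```python
def iterate_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(10))  # stand-in for 10 images
batches = list(iterate_batches(samples, batch_size=4))
print(batches)
```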
BCCD / 血细胞计数和检测数据集
Blood Cell Count and Detection dataset. A set of blood cell images taken under a microscope, commonly used for experimentation.
Black Box / 黑盒
A system that makes it hard to peek behind the curtain to understand what is going on. Neural networks are often described as black boxes because it can be hard to explain “why” they are making a particular prediction. Model explainability is currently a hot topic and field of study.
Block / 块
To simplify their description and creation, many computer vision models are composed of various “blocks”, each describing a set of interconnected neurons. You can think of them a bit like LEGO bricks: they interoperate with each other, and various configurations of blocks make up a layer (and many layers make up a model).
Bounding Box / 边界框
A rectangular region of an image containing an object. It is commonly described by its min/max x/y positions, or by a center point (x/y) plus its width and height (w/h), along with its class label.
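The two box descriptions carry the same information, so converting between them is simple arithmetic. A minimal sketch, with hypothetical function names:

```python
def corners_to_center(x_min, y_min, x_max, y_max):
    """Convert min/max corners to (center_x, center_y, width, height)."""
    width = x_max - x_min
    height = y_max - y_min
    return (x_min + width / 2, y_min + height / 2, width, height)


def center_to_corners(center_x, center_y, width, height):
    """Convert (center_x, center_y, width, height) back to min/max corners."""
    return (center_x - width / 2, center_y - height / 2,
            center_x + width / 2, center_y + height / 2)


center_box = corners_to_center(10, 20, 50, 80)
corner_box = center_to_corners(*center_box)
print(center_box, corner_box)
```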
Channel / 通道
Images are composed of one or more channels. A channel has one value for each pixel in the image. A grayscale image may have one channel describing the brightness of each pixel. A color image may have three channels (for example red, green, and blue, or hue, saturation, and lightness). A fourth channel is sometimes used for depth or transparency.
Checkpoint / 检查点
A point-in-time snapshot of your model’s weights. You will often capture a checkpoint at the end of each epoch so you can go back to it later if your model’s performance degrades because it starts to overfit.
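The idea of going back to an earlier snapshot can be sketched by keeping the checkpoint with the lowest validation loss. The per-epoch weights and losses below are invented, and `best_checkpoint` is a hypothetical helper; real frameworks save checkpoints to disk instead of holding them in memory.

```python
import copy


def best_checkpoint(history):
    """Return the snapshot with the lowest validation loss.

    history is a list of (weights, validation_loss) pairs, one per epoch.
    """
    best = None
    for epoch, (weights, val_loss) in enumerate(history):
        if best is None or val_loss < best["val_loss"]:
            # Deep-copy so later training cannot mutate the snapshot.
            best = {"epoch": epoch, "val_loss": val_loss,
                    "weights": copy.deepcopy(weights)}
    return best


# Hypothetical run: validation loss improves, then rises as the model overfits.
history = [
    ([0.10, 0.20], 0.90),
    ([0.30, 0.25], 0.60),
    ([0.42, 0.31], 0.50),
    ([0.55, 0.40], 0.70),
    ([0.61, 0.47], 0.80),
]

best = best_checkpoint(history)
print(best["epoch"], best["val_loss"])
```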
Class / 类别
A type of thing to be identified. For example, a model identifying pieces on a chess board might have the following classes: white-pawn, black-pawn, white-rook, black-rook, white-knight, black-knight, white-bishop, black-bishop, white-queen, black-queen, white-king, black-king. The Annotation Group in this instance would be “Chess Pieces”.
Class Balance / 类别平衡
The relative distribution of the number of examples of each class. Models generally perform better if there is a roughly equal number of examples for each class. If there are too few examples of a particular class, that class is “under-represented”. If there are many more instances of a particular class, that class is “over-represented”.
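A quick class-balance check is to count the labels and look at each class’s share of the total. The label list here is made up for illustration.

```python
from collections import Counter

# Hypothetical labels from a small chess dataset.
labels = ["pawn"] * 8 + ["rook"] * 2 + ["king"] * 1

counts = Counter(labels)
total = sum(counts.values())
shares = {name: count / total for name, count in counts.items()}
rarest = min(counts, key=counts.get)

for name, share in shares.items():
    print(f"{name}: {counts[name]} examples ({share:.0%})")
print("most under-represented class:", rarest)
```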
Classification / 分类任务
A type of computer vision task that aims to determine only whether a certain class is present in an image (but not its location).
COCO
The Microsoft Common Objects in Context dataset contains roughly 330,000 images (over 200,000 of them labeled) with about 1.5 million object instances in 80 classes (ranging from “person” to “handbag” to “sink”). MS COCO is a standard dataset used to benchmark different models and compare their performance. Its JSON annotation format has also become commonly used for other datasets.
Colab / Google Colaboratory
Google Colaboratory is a free platform that provides hosted Jupyter Notebooks connected to free GPUs.
Computer Vision / 机器视觉
The field pertaining to making sense of imagery. Images are just a collection of pixel values; with computer vision we can take those pixels and gain an understanding of what they represent (for example, estimating depth from a picture or spotting damage on a mechanical part).
Confidence / 置信度
A model is inherently statistical. Along with its prediction, it also outputs a confidence value that quantifies how “sure” it is that its prediction is correct.
Confidence Threshold / 置信度阈值
We often discard predictions whose confidence falls below a certain bar. This bar is the confidence threshold.
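Applying a confidence threshold amounts to a simple filter. The prediction dictionaries and the 0.5 default below are assumptions for illustration; the right threshold depends on the model and the application.

```python
def filter_by_confidence(predictions, threshold=0.5):
    """Keep only predictions whose confidence meets the threshold."""
    return [p for p in predictions if p["confidence"] >= threshold]


# Hypothetical model output.
predictions = [
    {"label": "rook", "confidence": 0.92},
    {"label": "pawn", "confidence": 0.48},
    {"label": "king", "confidence": 0.75},
]

kept = filter_by_confidence(predictions, threshold=0.5)
print([p["label"] for p in kept])
```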
Converge / 收敛
Over time we hope our models get closer and closer to a hypothetical “most accurate” set of weights. The march towards this maximum performance is called converging. The opposite of convergence is divergence, where a model gets off track and gets worse and worse over time.
Convert / 转换
Taking annotations or images in one format and translating them into another format. Each model requires input in a specific format; if our data is not already in that format, we need to convert it with a custom script or a tool like Roboflow.
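A custom conversion script can be quite small. The sketch below turns a COCO-style box (`[x_min, y_min, width, height]` in pixels) into a YOLO-style text line (class id, then center x, center y, width, and height normalized to the 0 to 1 range). The function names, the class id, and the image size are made up for the example.

```python
def coco_to_yolo(bbox, image_width, image_height):
    """Convert [x_min, y_min, width, height] in pixels to normalized
    (center_x, center_y, width, height)."""
    x_min, y_min, width, height = bbox
    center_x = (x_min + width / 2) / image_width
    center_y = (y_min + height / 2) / image_height
    return (center_x, center_y, width / image_width, height / image_height)


def yolo_line(class_id, yolo_box):
    """Format one annotation as a line of a YOLO-style TXT file."""
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in yolo_box)


# Hypothetical annotation on a 400x200 image.
yolo_box = coco_to_yolo([100, 50, 200, 100], image_width=400, image_height=200)
line = yolo_line(0, yolo_box)
print(line)
```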
Convolution / 卷积
A convolution is a type of block that helps a model learn information about relationships between nearby pixels. Mathematically, it is a small matrix operation slid across the data that can extract or strengthen certain features of the input.
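As a rough sketch of what a convolution computes, the function below slides a small kernel over an image and sums the element-wise products at each position. It does not flip the kernel, which is technically cross-correlation but is what deep learning libraries usually call convolution. The image and kernel are toy values.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding and return the sums."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # responds where brightness increases from left to right

edges = convolve2d(image, kernel)
print(edges)
```

The output is nonzero only in the column where the image changes from dark to bright, which is the kind of vertical-edge feature early CNN layers tend to learn.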
Convolutional Neural Network (CNN, ConvNet) / 卷积神经网络
The most common type of network used in computer vision. By combining many convolutional layers, it can learn more and more complex concepts. The early layers learn about things like horizontal, vertical, and diagonal lines and blocks of similar colors; the middle layers learn combinations of those features, like textures and corners; and the final layers learn to combine those features to identify higher-level concepts like “ears” and “clocks”.
CoreML
Apple’s machine learning framework and the proprietary format it uses to encode model weights for Apple devices. It takes advantage of the hardware-accelerated neural engine present on iPhone and iPad devices.
CreateML
A no-code training tool created by Apple that trains machine learning models and exports them to CoreML. It supports classification and object detection along with several types of non-computer-vision models (such as sound, activity, and text classification).
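Cross Validation / 交叉验证
A training and validation procedure commonly used in machine learning to evaluate a model’s accuracy. One extension of it is K-fold cross validation.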
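In K-fold cross validation the data is split into k folds; each fold serves once as the validation set while the remaining folds are used for training. A minimal sketch of the index bookkeeping, with a hypothetical helper name and no shuffling:

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, validation_indices) pairs."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds


folds = k_fold_indices(6, 3)
for training, validation in folds:
    print("train:", training, "validate:", validation)
```

In practice the indices are usually shuffled first, and the k validation scores are averaged to estimate the model’s accuracy.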
CUDA
NVIDIA’s method of creating general-purpose GPU-optimized code. This is how we are able to use GPU devices originally designed for 3D games to accelerate neural networks.
CuDNN
NVIDIA’s CUDA Deep Neural Network library is a set of tools built on top of CUDA specifically for running neural networks efficiently on the GPU.
Custom Dataset / 自定义数据集
A set of images and annotations pertaining to a domain-specific problem, created by users for their own needs, in contrast to a research benchmark dataset like COCO or Pascal VOC.
References
[1] https://blog.roboflow.com/glossary/