Chapter 9: ConvNet architecture patterns
This chapter covers
- The modularity-hierarchy-reuse formula for model architecture
- An overview of standard best practices for building ConvNets: residual connections, batch normalization, and depthwise separable convolutions
- Ongoing design trends for computer vision models
A model’s “architecture” is the sum of the choices that went into creating it: which layers to use, how to configure them, in what arrangement to connect them. These choices define the hypothesis space of your model: the space of possible functions that gradient descent can search over, parameterized by the model’s weights. Like feature engineering, a good hypothesis space encodes prior knowledge that you have about the problem at hand and its solution. For instance, using convolution layers means that you know in advance that the relevant patterns present in your input images are translation-invariant. To effectively learn from data, you need to make assumptions about what you’re looking for.
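The translation-invariance assumption baked into convolution layers can be made concrete with a small sketch. The snippet below (plain NumPy, not the chapter's Keras code; the helper name and toy kernel are illustrative) applies the same kernel at every spatial position, so a pattern produces the same response wherever it appears in the image:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Sliding-window 2D cross-correlation ("valid" padding):
    the SAME kernel weights are reused at every position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.array([[1.0, 0.0],
                   [0.0, 1.0]])      # toy "diagonal edge" detector

img = np.zeros((6, 6))
img[1:3, 1:3] = np.eye(2)            # diagonal pattern near the top left
shifted = np.roll(img, (2, 2), axis=(0, 1))  # same pattern, moved down-right

a = conv2d_valid(img, kernel)
b = conv2d_valid(shifted, kernel)

# Translating the input just translates the response map: the detector
# fires identically at the new location (translation equivariance).
assert np.allclose(np.roll(a, (2, 2), axis=(0, 1)), b)
```

Because the kernel weights are shared across positions, a pattern learned anywhere in the training images is recognized everywhere, which is precisely the prior knowledge a convolution layer encodes into the hypothesis space.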