【哈工大版】Dynamic ReLU:自适应参数化ReLU激活函数(调参记录15)

甄宝玉
• 阅读 1513
自适应参数化ReLU是一种动态ReLU(Dynamic ReLU)激活函数,在2019年5月3日投稿至IEEE Transactions on Industrial Electronics,2020年1月24日录用,2020年2月13日在IEEE官网公布

调参记录14里,只有2个残差模块,结果遭遇欠拟合了。这次增加一个残差模块试试。

自适应参数化ReLU激活函数的基本原理如下:
【哈工大版】Dynamic ReLU:自适应参数化ReLU激活函数(调参记录15)

Keras程序如下:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Apr 14 04:17:45 2020
Implemented using TensorFlow 1.10.0 and Keras 2.2.1

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Shaojiang Dong, Michael Pecht,
Deep Residual Networks with Adaptively Parametric Rectifier Linear Units for Fault Diagnosis, 
IEEE Transactions on Industrial Electronics, 2020, DOI: 10.1109/TIE.2020.2972458,
Date of Publication: 13 February 2020

@author: Minghang Zhao
"""

from __future__ import print_function
import keras
import numpy as np
from keras.datasets import cifar10
from keras.layers import Dense, Conv2D, BatchNormalization, Activation, Minimum
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D, Concatenate, Reshape
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import LearningRateScheduler
K.set_learning_phase(1)

# The data, split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_test = x_test-np.mean(x_train)
x_train = x_train-np.mean(x_train)
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Schedule the learning rate, multiply 0.1 every 1500 epoches
def scheduler(epoch):
    if epoch % 1500 == 0 and epoch != 0:
        lr = K.get_value(model.optimizer.lr)
        K.set_value(model.optimizer.lr, lr * 0.1)
        print("lr changed to {}".format(lr * 0.1))
    return K.get_value(model.optimizer.lr)

# An adaptively parametric rectifier linear unit (APReLU)
def aprelu(inputs):
    # get the number of channels
    channels = inputs.get_shape().as_list()[-1]
    # get a zero feature map
    zeros_input = keras.layers.subtract([inputs, inputs])
    # get a feature map with only positive features
    pos_input = Activation('relu')(inputs)
    # get a feature map with only negative features
    neg_input = Minimum()([inputs,zeros_input])
    # define a network to obtain the scaling coefficients
    scales_p = GlobalAveragePooling2D()(pos_input)
    scales_n = GlobalAveragePooling2D()(neg_input)
    scales = Concatenate()([scales_n, scales_p])
    scales = Dense(channels//8, activation='linear', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(scales)
    scales = BatchNormalization(momentum=0.9, gamma_regularizer=l2(1e-4))(scales)
    scales = Activation('relu')(scales)
    scales = Dense(channels, activation='linear', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(scales)
    scales = BatchNormalization(momentum=0.9, gamma_regularizer=l2(1e-4))(scales)
    scales = Activation('sigmoid')(scales)
    scales = Reshape((1,1,channels))(scales)
    # apply a paramtetric relu
    neg_part = keras.layers.multiply([scales, neg_input])
    return keras.layers.add([pos_input, neg_part])

# Residual Block
def residual_block(incoming, nb_blocks, out_channels, downsample=False,
                   downsample_strides=2):
    
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]
    
    for i in range(nb_blocks):
        
        identity = residual
        
        if not downsample:
            downsample_strides = 1
        
        residual = BatchNormalization(momentum=0.9, gamma_regularizer=l2(1e-4))(residual)
        residual = aprelu(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides), 
                          padding='same', kernel_initializer='he_normal', 
                          kernel_regularizer=l2(1e-4))(residual)
        
        residual = BatchNormalization(momentum=0.9, gamma_regularizer=l2(1e-4))(residual)
        residual = aprelu(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal', 
                          kernel_regularizer=l2(1e-4))(residual)
        
        # Downsampling
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1,1), strides=(2,2))(identity)
            
        # Zero_padding to match channels
        if in_channels != out_channels:
            zeros_identity = keras.layers.subtract([identity, identity])
            identity = keras.layers.concatenate([identity, zeros_identity])
            in_channels = out_channels
        
        residual = keras.layers.add([residual, identity])
    
    return residual


# define and train a model
inputs = Input(shape=(32, 32, 3))
net = Conv2D(16, 3, padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(inputs)
net = residual_block(net, 1, 16, downsample=False)
net = residual_block(net, 1, 32, downsample=True)
# net = residual_block(net, 2, 32, downsample=False)
net = residual_block(net, 1, 64, downsample=True)
# net = residual_block(net, 2, 64, downsample=False)
net = BatchNormalization(momentum=0.9, gamma_regularizer=l2(1e-4))(net)
net = aprelu(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal', kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
sgd = optimizers.SGD(lr=0.1, decay=0., momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

# data augmentation
datagen = ImageDataGenerator(
    # randomly rotate images in the range (deg 0 to 180)
    rotation_range=30,
    # Range for random zoom
    zoom_range = 0.2,
    # shear angle in counter-clockwise direction in degrees
    shear_range = 30,
    # randomly flip images
    horizontal_flip=True,
    # randomly shift images horizontally
    width_shift_range=0.125,
    # randomly shift images vertically
    height_shift_range=0.125)

reduce_lr = LearningRateScheduler(scheduler)
# fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(x_train, y_train, batch_size=1000),
                    validation_data=(x_test, y_test), epochs=5000, 
                    verbose=1, callbacks=[reduce_lr], workers=4)

# get results
K.set_learning_phase(0)
DRSN_train_score = model.evaluate(x_train, y_train, batch_size=1000, verbose=0)
print('Train loss:', DRSN_train_score[0])
print('Train accuracy:', DRSN_train_score[1])
DRSN_test_score = model.evaluate(x_test, y_test, batch_size=1000, verbose=0)
print('Test loss:', DRSN_test_score[0])
print('Test accuracy:', DRSN_test_score[1])

实验结果如下:

Epoch 2575/5000
10s 197ms/step - loss: 0.3505 - acc: 0.9039 - val_loss: 0.4548 - val_acc: 0.8745
Epoch 2576/5000
10s 198ms/step - loss: 0.3571 - acc: 0.9003 - val_loss: 0.4483 - val_acc: 0.8732
Epoch 2577/5000
10s 194ms/step - loss: 0.3536 - acc: 0.9033 - val_loss: 0.4547 - val_acc: 0.8725
Epoch 2578/5000
10s 196ms/step - loss: 0.3514 - acc: 0.9033 - val_loss: 0.4429 - val_acc: 0.8766

程序还没跑完,似乎也没必要跑完了。

训练集上还没拟合得很好,测试集准确率已经低于训练集准确率大约2.5%了。这是同时存在欠拟合和过拟合呀!

Minghang Zhao, Shisheng Zhong, Xuyun Fu, Baoping Tang, Shaojiang Dong, Michael Pecht, Deep Residual Networks with Adaptively Parametric Rectifier Linear Units for Fault Diagnosis, IEEE Transactions on Industrial Electronics, 2020, DOI: 10.1109/TIE.2020.2972458, Date of Publication: 13 February 2020

https://ieeexplore.ieee.org/d...

点赞
收藏
评论区
推荐文章
blmius blmius
3年前
MySQL:[Err] 1292 - Incorrect datetime value: ‘0000-00-00 00:00:00‘ for column ‘CREATE_TIME‘ at row 1
文章目录问题用navicat导入数据时,报错:原因这是因为当前的MySQL不支持datetime为0的情况。解决修改sql\mode:sql\mode:SQLMode定义了MySQL应支持的SQL语法、数据校验等,这样可以更容易地在不同的环境中使用MySQL。全局s
皕杰报表之UUID
​在我们用皕杰报表工具设计填报报表时,如何在新增行里自动增加id呢?能新增整数排序id吗?目前可以在新增行里自动增加id,但只能用uuid函数增加UUID编码,不能新增整数排序id。uuid函数说明:获取一个UUID,可以在填报表中用来创建数据ID语法:uuid()或uuid(sep)参数说明:sep布尔值,生成的uuid中是否包含分隔符'',缺省为
Easter79 Easter79
3年前
swap空间的增减方法
(1)增大swap空间去激活swap交换区:swapoff v /dev/vg00/lvswap扩展交换lv:lvextend L 10G /dev/vg00/lvswap重新生成swap交换区:mkswap /dev/vg00/lvswap激活新生成的交换区:swapon v /dev/vg00/lvswap
Jacquelyn38 Jacquelyn38
4年前
2020年前端实用代码段,为你的工作保驾护航
有空的时候,自己总结了几个代码段,在开发中也经常使用,谢谢。1、使用解构获取json数据let jsonData  id: 1,status: "OK",data: 'a', 'b';let  id, status, data: number   jsonData;console.log(id, status, number )
Wesley13 Wesley13
3年前
mysql设置时区
mysql设置时区mysql\_query("SETtime\_zone'8:00'")ordie('时区设置失败,请联系管理员!');中国在东8区所以加8方法二:selectcount(user\_id)asdevice,CONVERT\_TZ(FROM\_UNIXTIME(reg\_time),'08:00','0
Wesley13 Wesley13
3年前
00:Java简单了解
浅谈Java之概述Java是SUN(StanfordUniversityNetwork),斯坦福大学网络公司)1995年推出的一门高级编程语言。Java是一种面向Internet的编程语言。随着Java技术在web方面的不断成熟,已经成为Web应用程序的首选开发语言。Java是简单易学,完全面向对象,安全可靠,与平台无关的编程语言。
Stella981 Stella981
3年前
Android蓝牙连接汽车OBD设备
//设备连接public class BluetoothConnect implements Runnable {    private static final UUID CONNECT_UUID  UUID.fromString("0000110100001000800000805F9B34FB");
Stella981 Stella981
3年前
Django中Admin中的一些参数配置
设置在列表中显示的字段,id为django模型默认的主键list_display('id','name','sex','profession','email','qq','phone','status','create_time')设置在列表可编辑字段list_editable
Wesley13 Wesley13
3年前
MySQL部分从库上面因为大量的临时表tmp_table造成慢查询
背景描述Time:20190124T00:08:14.70572408:00User@Host:@Id:Schema:sentrymetaLast_errno:0Killed:0Query_time:0.315758Lock_
Python进阶者 Python进阶者
1年前
Excel中这日期老是出来00:00:00,怎么用Pandas把这个去除
大家好,我是皮皮。一、前言前几天在Python白银交流群【上海新年人】问了一个Pandas数据筛选的问题。问题如下:这日期老是出来00:00:00,怎么把这个去除。二、实现过程后来【论草莓如何成为冻干莓】给了一个思路和代码如下:pd.toexcel之前把这
美凌格栋栋酱 美凌格栋栋酱
5个月前
Oracle 分组与拼接字符串同时使用
SELECTT.,ROWNUMIDFROM(SELECTT.EMPLID,T.NAME,T.BU,T.REALDEPART,T.FORMATDATE,SUM(T.S0)S0,MAX(UPDATETIME)CREATETIME,LISTAGG(TOCHAR(