Building a Predictive Maintenance AI Model from Vibration Sensor Data

Date: 2022-08-25 04:30:00

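"Predictive Maintenance (PdM) is condition-based: while a machine is running, its main (or critical) components are monitored and diagnosed periodically (or continuously) to determine the equipment's condition and predict how that condition will develop. A predictive maintenance plan is then drawn up in advance according to that trend and the likely failure modes."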

—

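Vibration sensor basics

Main application areas for sensors:

Civil engineering and bridges; machinery and machine tools; automotive NVH; aerospace; marine and shipbuilding; wind energy and electric power; defense; petrochemicals; shaker-table control; material characterization; education and teaching.

Sensor product types:

Centralized, portable, modular, ruggedized, distributed

Three basic parameters describe vibration amplitude: displacement x, velocity v, and acceleration a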

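The three are related by differentiation and integration. For a sinusoidal vibration, displacement x:

$$x = X\sin(\omega t)$$

velocity v:

$$v = \frac{dx}{dt} = \omega X\cos(\omega t)$$

acceleration a:

$$a = \frac{dv}{dt} = -\omega^2 X\sin(\omega t)$$

The amplitudes of displacement, velocity, and acceleration are therefore related by:

$$V = \omega X, \qquad A = \omega V = \omega^2 X$$

How the waveform changes under differentiation or integration: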

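1) The phase shifts by 90°

2) The amplitude change depends on frequency: for the same displacement, velocity is proportional to frequency and acceleration is proportional to the square of frequency.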
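
These two effects can be checked numerically. The sketch below is only an illustration: it assumes a pure sine displacement with an arbitrary 5 Hz frequency and an amplitude of 2 (values chosen for the example, not taken from the article) and differentiates it with NumPy.

```python
import numpy as np

f = 5.0                       # assumed vibration frequency in Hz
omega = 2 * np.pi * f         # angular frequency in rad/s
X = 2.0                       # assumed displacement amplitude

t = np.linspace(0.0, 1.0, 20001)   # 1 s of signal on a fine time grid
x = X * np.sin(omega * t)          # displacement
v = np.gradient(x, t)              # velocity = dx/dt
a = np.gradient(v, t)              # acceleration = dv/dt

# Skip a few edge samples, where np.gradient falls back to one-sided differences.
core = slice(10, -10)
v_amp = np.abs(v[core]).max()
a_amp = np.abs(a[core]).max()

print(f"velocity amplitude     {v_amp:.2f}  (omega * X   = {omega * X:.2f})")
print(f"acceleration amplitude {a_amp:.1f}  (omega^2 * X = {omega**2 * X:.1f})")
```

Each differentiation multiplies the amplitude by the angular frequency, which is why acceleration emphasizes high-frequency content and displacement emphasizes low-frequency content.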

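In practical engineering, low-order vibration matters more. Because force is proportional to acceleration, assume that under the same force, vibrations at different frequencies have the same acceleration amplitude A = 100. The velocity V = A/ω and displacement X = A/ω² are then:

Frequency f (Hz)	1	10	100	1000
Angular frequency ω (rad/s)	6.28	62.8	628	6280
Acceleration A	100	100	100	100
Velocity V = A/ω	16	1.6	0.16	0.016
Displacement X = A/ω²	2.5	0.025	0.00025	0.0000025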

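As the table shows, for the same vibration acceleration the displacement falls with the square of the frequency. In engineering practice the displacement of high-frequency vibration is therefore very small and does little damage to the structure, so it needs little attention.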
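
The table values can be reproduced with a few lines of plain Python. This is a minimal sketch that assumes a sinusoidal vibration and the same acceleration amplitude of 100 used in the table; the function name is hypothetical.

```python
import math

A = 100.0  # acceleration amplitude shared by every frequency, as in the table

def velocity_and_displacement(freq_hz, accel=A):
    """Return (omega, velocity amplitude, displacement amplitude) for a sine vibration."""
    omega = 2 * math.pi * freq_hz
    return omega, accel / omega, accel / omega ** 2

for f in (1, 10, 100, 1000):
    omega, v, x = velocity_and_displacement(f)
    print(f"f={f:>4} Hz  omega={omega:8.1f}  V={v:8.3f}  X={x:.3g}")
```

Each tenfold increase in frequency divides the velocity by 10 and the displacement by 100.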

—

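Common types of vibration sensors

Acceleration sensors (accelerometers): output a vibration acceleration signal;
Velocity sensors: output a vibration velocity signal;
Displacement sensors: output a vibration displacement signal, e.g. eddy-current sensors;
Force sensors: output a force signal (for example the compressive and tensile force in an impact hammer);
Strain sensors: strain gauges, which output a strain signal through a Wheatstone bridge;
Acoustic sensors: microphones, which output the sound-pressure signal of sound waves in air;
Speed sensors: output machine rotational speed, in various forms (photoelectric, eddy-current, encoder, etc.);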

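How to choose a sensor

Frequency characteristics of the object being measured:
- Low frequency: displacement suits large structures, earthquakes, and similar cases; velocity suits structures, earthquakes, and the like.
- High frequency: acceleration suits mechanical vibration, shock, and the like.
Mounting method: the available mounting method determines which sensors can be used. For example, for the vibration of rotating machine parts:
- If the shaft vibration must be measured without contact, choose an eddy-current displacement sensor
- If the vibration is measured on the bearing housing, an ordinary accelerometer will do
Sensitivity (calibration value)
- The proportionality factor with which the sensor converts a physical quantity into an electrical quantity, e.g. 50 mV/g or 100 pC/N
- Choose the sensitivity according to the expected magnitude of the measurement, so that the signal-to-noise ratio stays high while the signal still fits within the measuring range
Frequency range
- Every sensor has a usable frequency range, which should cover the frequencies of interest for the measured object. Within that range the sensitivity is essentially constant; outside it the sensitivity rises or falls
Amplitude range
- The largest and smallest amplitudes the sensor can measure should match the requirements of the measured object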
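
The trade-off between sensitivity and range can be made concrete with a small calculation. The sketch below assumes a voltage-output accelerometer and an acquisition input limited to ±5 V with 1 mV of electrical noise; both figures and the helper names are illustrative assumptions, not values from the article.

```python
def volts_to_g(voltage_v, sensitivity_mv_per_g):
    """Convert a sensor output voltage (V) to acceleration (g)."""
    return voltage_v * 1000.0 / sensitivity_mv_per_g

def measurable_range_g(full_scale_v, sensitivity_mv_per_g):
    """Largest acceleration (g) that still fits inside the acquisition input range."""
    return volts_to_g(full_scale_v, sensitivity_mv_per_g)

FULL_SCALE_V = 5.0   # assumed input range of the acquisition hardware, +/- 5 V
NOISE_V = 0.001      # assumed electrical noise floor, 1 mV

for sens in (10.0, 50.0, 500.0):   # candidate sensitivities in mV/g
    max_g = measurable_range_g(FULL_SCALE_V, sens)
    noise_g = volts_to_g(NOISE_V, sens)
    print(f"{sens:5.0f} mV/g: range +/-{max_g:6.1f} g, noise floor {noise_g:.4f} g")
```

Under these assumptions a higher sensitivity lowers the noise floor expressed in g but shrinks the largest acceleration that can be recorded without clipping.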

—

Building a model from accelerometer data

1. As usual, start by importing the core machine learning libraries

import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
# from sklearn.preprocessing import StandardScaler
# scaler = StandardScaler()

2. Read the vibration data recorded while the machine was healthy

header_list = ['time','x','y','z']
mHealthy = pd.read_csv('data/m1_vib1.csv', header=None, names=header_list)
# Shift the raw timestamps by a fixed offset and divide by 1000
mHealthy['ds'] = mHealthy['time'].apply(lambda x: float((x - 15840094829302)/1000))

3. Read the data recorded while the machine was faulty

mBroken = pd.read_csv('data/m2_vib3.csv', header=None, names=header_list)
mBroken['ds'] = mBroken['time'].apply(lambda x: float((x - 15840094829302)/1000))

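4. Looking at the data above, there are too few features. What we need now is feature engineering: derive the mean, RMS, standard deviation, peak, and related statistics from the raw signal.

# SD = standard deviation, RMS = root mean square, M = mean, CR = crest factor, K = kurtosis,
# SK = skewness, CF = clearance factor, IF = impulse factor, SF = shape factor
Main = pd.DataFrame(columns=['SDx','SDy','SDz','RMSx','RMSy','RMSz','Mx','My','Mz','CRx','CRy','CRz','Kx','Ky','Kz','SKx','SKy','SKz','CFx','CFy','CFz','IFx','IFy','IFz','SFx','SFy','SFz','label'])
MainT = pd.DataFrame(columns=['SDx','SDy','SDz','RMSx','RMSy','RMSz','Mx','My','Mz','CRx','CRy','CRz','Kx','Ky','Kz','SKx','SKy','SKz','CFx','CFy','CFz','IFx','IFy','IFz','SFx','SFy','SFz','label'])
# Pick one random 100-sample window from the faulty-machine data
num = random.randint(0,99899)
df = mBroken[num:num+100]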

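def RMScalc(df,c):
    # Root mean square of column c over the window (the original divided by a fixed 1000, not the window length)
    return np.sqrt(df[c].pow(2).sum()/len(df))
def Clearance(df,c):
    # Squared mean of sqrt(|x|): the denominator of the clearance factor
    root = (df[c].abs().pow(0.5).sum()/len(df))**2
    return root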

RMScalc(df,'x')

# Faulty machine (label 0): 100 random windows of 100 samples each
for i in range(0,100):
    num = random.randint(0,99899)
    df = mBroken[num:num+100]
    SDx = df.std()['x']
    SDy = df.std()['y']
    SDz = df.std()['z']
    RMSx = RMScalc(df,'x')
    RMSy = RMScalc(df,'y')
    RMSz = RMScalc(df,'z')
    Mx = df['x'].mean()
    My = df['y'].mean()
    Mz = df['z'].mean()
    CRx = float(df['x'].max()/RMSx)
    CRy = float(df['y'].max()/RMSy)
    CRz = float(df['z'].max()/RMSz)
    Kx = df['x'].kurt()
    Ky = df['y'].kurt()
    Kz = df['z'].kurt()
    SKx = df['x'].skew()
    SKy = df['y'].skew()
    SKz = df['z'].skew()
    CFx = float(df['x'].max()/Clearance(df,'x'))
    CFy = float(df['y'].max()/Clearance(df,'y'))
    CFz = float(df['z'].max()/Clearance(df,'z'))
    IFx = float(df['x'].max()/Mx)
    IFy = float(df['y'].max()/My)
    IFz = float(df['z'].max()/Mz)
    SFx = float(RMSx/Mx)
    SFy = float(RMSy/My)
    SFz = float(RMSz/Mz)
    label = 0
    # Named "row" so the built-in list is not shadowed
    row = [SDx,SDy,SDz,RMSx,RMSy,RMSz,Mx,My,Mz,CRx,CRy,CRz,Kx,Ky,Kz,SKx,SKy,SKz,CFx,CFy,CFz,IFx,IFy,IFz,SFx,SFy,SFz,label]
    MainT.loc[i] = row


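# Healthy machine (label 1): 100 random windows of 100 samples each (the original range(101,200) produced only 99)
for i in range(100,200):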
    num = random.randint(0,99899)
    df = mHealthy[num:num+100]
    SDx = df.std()['x']
    SDy = df.std()['y']
    SDz = df.std()['z']
    RMSx = RMScalc(df,'x')
    RMSy = RMScalc(df,'y')
    RMSz = RMScalc(df,'z')
    Mx = df['x'].mean()
    My = df['y'].mean()
    Mz = df['z'].mean()
    CRx = float(df['x'].max()/RMSx)
    CRy = float(df['y'].max()/RMSy)
    CRz = float(df['z'].max()/RMSz)
    Kx = df['x'].kurt()
    Ky = df['y'].kurt()
    Kz = df['z'].kurt()
    SKx = df['x'].skew()
    SKy = df['y'].skew()
    SKz = df['z'].skew()
    CFx = float(df['x'].max()/Clearance(df,'x'))
    CFy = float(df['y'].max()/Clearance(df,'y'))
    CFz = float(df['z'].max()/Clearance(df,'z'))
    IFx = float(df['x'].max()/Mx)
    IFy = float(df['y'].max()/My)
    IFz = float(df['z'].max()/Mz)
    SFx = float(RMSx/Mx)
    SFy = float(RMSy/My)
    SFz = float(RMSz/Mz)
    label = 1
    # Named "row" so the built-in list is not shadowed
    row = [SDx,SDy,SDz,RMSx,RMSy,RMSz,Mx,My,Mz,CRx,CRy,CRz,Kx,Ky,Kz,SKx,SKy,SKz,CFx,CFy,CFz,IFx,IFy,IFz,SFx,SFy,SFz,label]
    MainT.loc[i] = row
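
The two loops above repeat the same per-axis computations. As a rough sketch of how they could be factored out, the hypothetical helper `window_features` below computes the same family of features for one axis with NumPy. Note the assumptions: it uses the common textbook definitions (peak of |x|, and mean of |x| for the impulse and shape factors), which differ from the signed maximum and signed mean used in the article's loops, and its kurtosis and skewness are the simple biased estimates rather than the unbiased ones pandas returns.

```python
import numpy as np

def window_features(signal):
    """Time-domain features of one 1-D vibration window (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    abs_mean = np.mean(np.abs(x))
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)          # biased variance
    return {
        "SD": x.std(ddof=1),                               # sample standard deviation
        "RMS": rms,
        "M": x.mean(),
        "CR": peak / rms,                                  # crest factor
        "K": np.mean(centered ** 4) / m2 ** 2 - 3.0,       # excess kurtosis
        "SK": np.mean(centered ** 3) / m2 ** 1.5,          # skewness
        "CF": peak / np.mean(np.sqrt(np.abs(x))) ** 2,     # clearance factor
        "IF": peak / abs_mean,                             # impulse factor
        "SF": rms / abs_mean,                              # shape factor
    }

# Synthetic check: a unit-amplitude 10 Hz sine sampled over whole periods.
t = np.arange(1000) / 1000.0
feats = window_features(np.sin(2 * np.pi * 10 * t))
print({k: round(float(v), 3) for k, v in feats.items()})
```

Calling the helper once per axis and per window would replace the three near-identical lines per feature in each loop.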

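Let's look at the correlation coefficients between the features, using the heatmap function in seaborn (sns)

# Mainx is not defined anywhere in the original article; it is assumed here to be the feature table MainT built above
Mainx = MainT.astype(float)
fig, ax = plt.subplots(figsize=(20,20))
ax = sns.heatmap(Mainx.corr(),annot=True,cmap='viridis')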
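
With 27 features the annotated heatmap is dense. One way to read it, sketched here as an assumption rather than as the author's method, is to list only the strongly correlated feature pairs. The function name, the 0.95 threshold, and the synthetic demo data are all illustrative.

```python
import numpy as np

def correlated_pairs(X, names, threshold=0.95):
    """Return (name_a, name_b, r) for feature pairs with |Pearson r| >= threshold."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic demo: column "b" is a scaled, slightly noisy copy of "a"; "c" is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = np.column_stack([a, 3.0 * a + 0.01 * rng.normal(size=200), rng.normal(size=200)])
print(correlated_pairs(demo, ["a", "b", "c"]))
```

Pairs reported this way carry nearly the same information, so keeping one feature from each pair is a common first simplification.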

Next we try building the model with an ANN, an SVM, and a random forest in turn

1. ANN Classifier

X = Mainx.drop('label',axis=1).values
y = Mainx['label'].values

XT = MainT.drop('label',axis=1)

import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation,Dropout
from tensorflow.keras.constraints import max_norm
scaler = MinMaxScaler()

Scaled = scaler.fit_transform(MainT['SDx'].values.astype(float).reshape(-1, 1))

df_normalized = pd.DataFrame(Scaled,columns = ['SDx'])

df_normalized['label'] = MainT['label']

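# Cast explicitly: a DataFrame filled row by row can end up with object-dtype columns
X = Mainx.drop('label',axis=1).values.astype(float)
y = Mainx['label'].values.astype(int)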
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=101)

scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = Sequential()


# https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw




# input layer (input_dim must match the number of feature columns; the original hard-coded 4)
model.add(Dense(4, input_dim=X_train.shape[1], activation='tanh'))
model.add(Dropout(0.2))


# hidden layer
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.2))


# hidden layer
model.add(Dense(8, activation='relu'))
model.add(Dropout(0.2))


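# output layer: sigmoid keeps the output in (0, 1), as binary_crossentropy expects
# (the original used tanh, whose range is (-1, 1))
model.add(Dense(units=1,activation='sigmoid'))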


# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam',metrics = ['accuracy'])
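
The loss chosen above, binary cross-entropy, is defined for predicted probabilities strictly between 0 and 1, which is why the output activation matters. The following plain-Python sketch (not Keras code; the function names are hypothetical) shows the formula for a single sample and how sigmoid and tanh differ in range.

```python
import math

def sigmoid(z):
    """Map any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one sample; p must be a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

z = -0.8                                      # an arbitrary raw network output
print(binary_cross_entropy(1, sigmoid(z)))    # valid: sigmoid(z) lies in (0, 1)
print(math.tanh(z))                           # negative, so not a usable probability
```

A tanh output can be negative, in which case it cannot be interpreted as the probability the loss formula requires.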

Model training

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=400, verbose=2,batch_size = 50)

Model evaluation

model.evaluate(X_train, y_train, verbose=0)
from sklearn.metrics import classification_report,confusion_matrix
# predict_classes() was removed from recent TensorFlow releases; threshold the predicted probability instead
predictions = (model.predict(X_test) > 0.5).astype("int32")
print(classification_report(y_test,predictions))

In the original run the accuracy was far too low, only 50%, no better than flipping a coin.

Next, let's try an SVM

from sklearn import svm
from sklearn.pipeline import Pipeline
clf = svm.SVC()
clf.fit(X_train,y_train)
preds = clf.predict(X_test)
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test,preds))

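The accuracy is still only 55%; we might as well keep flipping coins.

Finally, let's try a random forest. In theory, combining many trees should give a somewhat better result. Without further ado, let's take a look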

from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
clf2 = Pipeline([
  ('feature_selection', SelectFromModel(LinearSVC())),
  ('classification', RandomForestClassifier())
])
clf2.fit(X_train, y_train)
print(accuracy_score(y_test,clf2.predict(X_test)))

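The result is 0.825, a clear improvement over the previous two methods, but still far from good enough for practical use; the accuracy needs to improve further.

So where does the problem lie? Perhaps something went wrong in feature extraction. Let's use a genetic algorithm to redo the feature selection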

mBroken['dt'] = mBroken['time'].apply(lambda x: float((x- 15840094829302)/10000))
mHealthy['dt'] = mHealthy['time'].apply(lambda x: float((x-15839399805840)/10000))
Main = pd.DataFrame(columns=['SDx','SDy','SDz','RMSx','RMSy','RMSz','Mx','My','Mz','CRx','CRy','CRz','Kx','Ky','Kz','SKx','SKy','SKz','CFx','CFy','CFz','IFx','IFy','IFz','SFx','SFy','SFz','label'])
Main2 = pd.DataFrame(columns=['SDx','SDy','SDz','RMSx','RMSy','RMSz','Mx','My','Mz','CRx','CRy','CRz','Kx','Ky','Kz','SKx','SKy','SKz','CFx','CFy','CFz','IFx','IFy','IFz','SFx','SFy','SFz','label'])

# Sample the broken-machine data: 100 windows
for i in range(0,100):
    num = random.randint(0,99899)
    df = mBroken[num:num+100]
    SDx = df.std()['x']
    SDy = df.std()['y']
    SDz = df.std()['z']
    RMSx = RMScalc(df,'x')
    RMSy = RMScalc(df,'y')
    RMSz = RMScalc(df,'z')
    Mx = df['x'].mean()
    My = df['y'].mean()
    Mz = df['z'].mean()
    CRx = float(df['x'].max()/RMSx)
    CRy = float(df['y'].max()/RMSy)
    CRz = float(df['z'].max()/RMSz)
    Kx = df['x'].kurt()
    Ky = df['y'].kurt()
    Kz = df['z'].kurt()
    SKx = df['x'].skew()
    SKy = df['y'].skew()
    SKz = df['z'].skew()
    CFx = float(df['x'].max()/Clearance(df,'x'))
    CFy = float(df['y'].max()/Clearance(df,'y'))
    CFz = float(df['z'].max()/Clearance(df,'z'))
    IFx = float(df['x'].max()/Mx)
    IFy = float(df['y'].max()/My)
    IFz = float(df['z'].max()/Mz)
    SFx = float(RMSx/Mx)
    SFy = float(RMSy/My)
    SFz = float(RMSz/Mz)
    label = 0
    # Named "row" so the built-in list is not shadowed
    row = [SDx,SDy,SDz,RMSx,RMSy,RMSz,Mx,My,Mz,CRx,CRy,CRz,Kx,Ky,Kz,SKx,SKy,SKz,CFx,CFy,CFz,IFx,IFy,IFz,SFx,SFy,SFz,label]
    Main.loc[i] = row

# Sample the healthy-machine data: 100 windows
for i in range(0,100):
    num = random.randint(0,99899)
    df = mHealthy[num:num+100]
    SDx = df.std()['x']
    SDy = df.std()['y']
    SDz = df.std()['z']
    RMSx = RMScalc(df,'x')
    RMSy = RMScalc(df,'y')
    RMSz = RMScalc(df,'z')
    Mx = df['x'].mean()
    My = df['y'].mean()
    Mz = df['z'].mean()
    CRx = float(df['x'].max()/RMSx)
    CRy = float(df['y'].max()/RMSy)
    CRz = float(df['z'].max()/RMSz)
    Kx = df['x'].kurt()
    Ky = df['y'].kurt()
    Kz = df['z'].kurt()
    SKx = df['x'].skew()
    SKy = df['y'].skew()
    SKz = df['z'].skew()
    CFx = float(df['x'].max()/Clearance(df,'x'))
    CFy = float(df['y'].max()/Clearance(df,'y'))
    CFz = float(df['z'].max()/Clearance(df,'z'))
    IFx = float(df['x'].max()/Mx)
    IFy = float(df['y'].max()/My)
    IFz = float(df['z'].max()/Mz)
    SFx = float(RMSx/Mx)
    SFy = float(RMSy/My)
    SFz = float(RMSz/Mz)
    label = 1
    # Named "row" so the built-in list is not shadowed
    row = [SDx,SDy,SDz,RMSx,RMSy,RMSz,Mx,My,Mz,CRx,CRy,CRz,Kx,Ky,Kz,SKx,SKy,SKz,CFx,CFy,CFz,IFx,IFy,IFz,SFx,SFy,SFz,label]
    Main2.loc[i] = row

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
Main = pd.concat([Main, Main2])

Model training

def modelGA(train_data,train_labels,test_data,test_labels):
    inputSize = train_data[1].size
    modelG = Sequential()
    # input layer
    modelG.add(Dense(inputSize,input_dim = inputSize,  activation='tanh'))
    #model.add(Dropout(0.2))


    # hidden layer
    modelG.add(Dense(13, activation='relu'))
    modelG.add(Dropout(0.2))


    # hidden layer
    modelG.add(Dense(13, activation='relu'))
    modelG.add(Dropout(0.2))


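    # output layer: sigmoid keeps the output in (0, 1), as binary_crossentropy expects
    # (the original used tanh, whose range is (-1, 1))
    modelG.add(Dense(units=1,activation='sigmoid'))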


    # Compile model
    modelG.compile(loss='binary_crossentropy', optimizer='adam',metrics = ['accuracy'])
    modelG.fit(train_data, train_labels, validation_data=(test_data, test_labels), epochs=150, verbose=2,batch_size = 50)
    acc = modelG.evaluate(test_data, test_labels, verbose=0)[1]
    return acc
    
def reduce_features(solution, features):
    selected_elements_indices = np.where(solution == 1)[0]
    reduced_features = features[:, selected_elements_indices]
    return reduced_features


def classification_accuracy(labels, predictions):
    correct = np.where(labels == predictions)[0]
    accuracy = correct.shape[0]/labels.shape[0]
    return accuracy


def cal_pop_fitness(pop, features, labels, train_indices, test_indices):
    accuracies = np.zeros(pop.shape[0])
    idx = 0


    for curr_solution in pop:
        reduced_features = reduce_features(curr_solution, features)
        train_data = reduced_features[train_indices, :]
        test_data = reduced_features[test_indices, :]


        train_labels = labels[train_indices]
        test_labels = labels[test_indices]


        #SV_classifier = sklearn.svm.SVC(gamma='scale')
        #SV_classifier.fit(X=train_data, y=train_labels)
        
        #predictions = SV_classifier.predict(test_data)
        accuracy = modelGA(train_data,train_labels,test_data,test_labels)
        #accuracies[idx] = classification_accuracy(test_labels, predictions)
        accuracies[idx] = accuracy
        idx = idx + 1
    return accuracies


def select_mating_pool(pop, fitness, num_parents):
    # Selecting the best individuals in the current generation as parents for producing the offspring of the next generation.
    parents = np.empty((num_parents, pop.shape[1]))
    for parent_num in range(num_parents):
        max_fitness_idx = np.where(fitness == np.max(fitness))
        max_fitness_idx = max_fitness_idx[0][0]
        parents[parent_num, :] = pop[max_fitness_idx, :]
        fitness[max_fitness_idx] = -99999999999
    return parents




def crossover(parents, offspring_size):
    offspring = np.empty(offspring_size)
    # The point at which crossover takes place between two parents. Usually, it is at the center.
    crossover_point = np.uint8(offspring_size[1]/2)


    for k in range(offspring_size[0]):
        # Index of the first parent to mate.
        parent1_idx = k%parents.shape[0]
        # Index of the second parent to mate.
        parent2_idx = (k+1)%parents.shape[0]
        # The new offspring will have its first half of its genes taken from the first parent.
        offspring[k, 0:crossover_point] = parents[parent1_idx, 0:crossover_point]
        # The new offspring will have its second half of its genes taken from the second parent.
        offspring[k, crossover_point:] = parents[parent2_idx, crossover_point:]
    return offspring




def mutation(offspring_crossover, num_mutations=2):
    # Pick num_mutations gene positions at random; the same positions are used for every offspring.
    mutation_idx = np.random.randint(low=0, high=offspring_crossover.shape[1], size=num_mutations)
    for idx in range(offspring_crossover.shape[0]):
        # Flip the selected genes (0 becomes 1, 1 becomes 0).
        offspring_crossover[idx, mutation_idx] = 1 - offspring_crossover[idx, mutation_idx]
    return offspring_crossover
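
To make the operators above concrete, here is a toy walk-through on six of the article's feature names. Each chromosome is a 0/1 mask over the feature columns, where 1 means the feature is kept. The two parent masks are invented for the illustration, and the sketch flips a single gene rather than the two used by default above.

```python
import numpy as np

rng = np.random.default_rng(42)

feature_names = np.array(["SDx", "RMSx", "Mx", "CRx", "Kx", "SKx"])
parent_a = np.array([1, 1, 1, 0, 0, 0])   # keeps SDx, RMSx, Mx
parent_b = np.array([0, 0, 1, 1, 0, 1])   # keeps Mx, CRx, SKx

# Single-point crossover at the midpoint: first half from parent_a, second half from parent_b.
point = parent_a.size // 2
child = np.concatenate([parent_a[:point], parent_b[point:]])

# Bit-flip mutation: invert one randomly chosen gene.
flip = rng.integers(0, child.size)
mutated = child.copy()
mutated[flip] = 1 - mutated[flip]

print("child  :", child, feature_names[child == 1])
print("mutated:", mutated, feature_names[mutated == 1])
```

The fitness of each mask is then simply the accuracy of a classifier trained on the selected columns, which is what cal_pop_fitness computes.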


data_inputs = X
data_outputs = y
num_samples = data_inputs.shape[0]
num_feature_elements = data_inputs.shape[1]
print("num samples: ",num_samples)
print("num_feature_elements: ", num_feature_elements)




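# Training uses every 4th sample starting at index 1, testing every 4th sample starting at index 0;
# the two sets are disjoint and each covers about a quarter of the samples.
train_indices = np.arange(1, num_samples, 4)
test_indices = np.arange(0, num_samples, 4)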
print("Number of training samples: ", train_indices.shape[0])
print("Number of test samples: ", test_indices.shape[0])


"""
Genetic algorithm parameters:
    Population size
    Mating pool size
    Number of mutations
"""
sol_per_pop = 5 # Population size.
num_parents_mating = 3 # Number of parents inside the mating pool.
num_mutations = 2 # Number of elements to mutate.


# Defining the population shape.
pop_shape = (sol_per_pop, num_feature_elements)


# Creating the initial population.
new_population = np.random.randint(low=0, high=2, size=pop_shape)
print("shape = ",new_population.shape[1])




best_outputs = []
num_generations = 20
for generation in range(num_generations):
    print("Generation : ", generation)
    # Measuring the fitness of each chromosome in the population.
    fitness = cal_pop_fitness(new_population, data_inputs, data_outputs, train_indices, test_indices)


    best_outputs.append(np.max(fitness))
    # The best result in the current iteration.
    print("Best result : ", best_outputs[-1])


    # Selecting the best parents in the population for mating.
    parents = select_mating_pool(new_population, fitness, num_parents_mating)


    # Generating next generation using crossover.
    offspring_crossover = crossover(parents, offspring_size=(pop_shape[0]-parents.shape[0], num_feature_elements))


    # Adding some variations to the offspring using mutation.
    offspring_mutation = mutation(offspring_crossover, num_mutations=num_mutations)


    # Creating the new population based on the parents and offspring.
    new_population[0:parents.shape[0], :] = parents
    new_population[parents.shape[0]:, :] = offspring_mutation


# Getting the best solution after iterating finishing all generations.
# At first, the fitness is calculated for each solution in the final generation.
fitness = cal_pop_fitness(new_population, data_inputs, data_outputs, train_indices, test_indices)
# Then return the index of that solution corresponding to the best fitness.
best_match_idx = np.where(fitness == np.max(fitness))[0]
best_match_idx = best_match_idx[0]


best_solution = new_population[best_match_idx, :]
best_solution_indices = np.where(best_solution == 1)[0]
best_solution_num_elements = best_solution_indices.shape[0]
best_solution_fitness = fitness[best_match_idx]

Data split

XZ = X[:,best_solution_indices]
XZ_train, XZ_test, yz_train, yz_test = train_test_split(XZ, y, test_size=0.20, random_state=101)

Build the model

modelZ = Sequential()


# https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw


# input layer
modelZ.add(Dense(best_solution_num_elements,input_dim = best_solution_num_elements,  activation='tanh'))
#model.add(Dropout(0.2))


# hidden layer
modelZ.add(Dense(13, activation='relu'))
modelZ.add(Dropout(0.2))


# hidden layer
modelZ.add(Dense(13, activation='relu'))
modelZ.add(Dropout(0.2))


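# output layer: sigmoid keeps the output in (0, 1), as binary_crossentropy expects
# (the original used tanh, whose range is (-1, 1))
modelZ.add(Dense(units=1,activation='sigmoid'))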


# Compile model
modelZ.compile(loss='binary_crossentropy', optimizer='adam',metrics = ['accuracy'])

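Model training

modelZ.fit(XZ_train, yz_train, validation_data=(XZ_test, yz_test), epochs=200, verbose=2,batch_size = 50)

Prediction results

# predict_classes() was removed from recent TensorFlow releases; threshold the predicted probability instead
predictionsZ = (modelZ.predict(XZ_test) > 0.5).astype("int32")
print(classification_report(yz_test,predictionsZ))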

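The results appear to reach 100% accuracy, but performance in real use remains to be verified: the feature subset was chosen using accuracy measured on this same data set, so the figure is probably optimistic. Note also that the current model can only detect that the machine has a problem; it cannot tell which problem it is.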
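
One way to check such a result more strictly is to set aside a test set before any feature selection takes place. The sketch below is a suggestion rather than part of the original workflow; the helper name and the 60/20/20 proportions are assumptions.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into disjoint train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(200)
# The genetic algorithm would score candidate feature masks on val_idx only;
# test_idx stays untouched until the final model is evaluated once.
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the windows in this article are drawn at random from the same recordings, neighbouring windows may still overlap, so data from a separate recording would be an even stronger check.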
