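One way I've done it:
- read a block of samples at a time, say 0.05 seconds' worth
- compute the RMS amplitude of the block (the square root of the mean of the squares of the individual samples)
- if the block's RMS amplitude is greater than a threshold, it's a "noisy block"; otherwise it's a "quiet block"
- a sudden tap would be a quiet block, followed by a small number of noisy blocks, followed by a quiet block
- if you never get a quiet block, your threshold is too low
- if you never get a noisy block, your threshold is too high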
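The steps above can be tried without a microphone by working on a list of per-block RMS values. This is only a rough sketch: `find_taps`, its parameters, and the sample levels are made up for illustration, and "a small number of noisy blocks" is assumed here to mean at most three.

```python
def find_taps(block_rms, threshold, max_tap_blocks=3):
    """Return the start index of each short noisy run that sits between quiet blocks.

    block_rms      -- one RMS amplitude per block (hypothetical input data)
    threshold      -- blocks louder than this count as noisy
    max_tap_blocks -- assumed upper limit on the length of a tap, in blocks
    """
    taps = []
    noisy_run = 0        # length of the current run of noisy blocks
    seen_quiet = False   # a tap must be preceded by at least one quiet block
    for i, rms in enumerate(block_rms):
        if rms > threshold:
            noisy_run += 1
        else:
            # A quiet block ends the run; short runs count as taps.
            if seen_quiet and 1 <= noisy_run <= max_tap_blocks:
                taps.append(i - noisy_run)
            noisy_run = 0
            seen_quiet = True
    return taps


# Made-up levels: a 2-block tap, then a 5-block noise that is too long to be a tap.
levels = [0.004, 0.005, 0.20, 0.15, 0.004, 0.003,
          0.3, 0.3, 0.3, 0.3, 0.3, 0.004]
print(find_taps(levels, threshold=0.010))  # prints [2]
```

If every block comes out noisy, the function never sees a quiet block and reports nothing, which is the "threshold too low" case from the list.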
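My application was recording "interesting" noises unattended, so it would record for as long as there were noisy blocks. It multiplied the threshold by 1.1 if there was a 15-second noisy period ("covering its ears") and multiplied the threshold by 0.9 if there was a 15-minute quiet period ("listening harder"). Your application will have different needs.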
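A rough sketch of that threshold adjustment, kept separate from any audio I/O, might look like the following. `AdaptiveThreshold` and its parameter names are hypothetical, and restarting the counter after each adjustment is an assumption of this sketch rather than something the description specifies.

```python
class AdaptiveThreshold:
    """Hypothetical helper: raise the threshold after sustained noise,
    lower it after sustained quiet."""

    def __init__(self, threshold, block_time=0.05,
                 noisy_limit_s=15.0, quiet_limit_s=15.0 * 60.0):
        self.threshold = threshold
        # Convert the time limits into a number of consecutive blocks.
        self.noisy_limit = noisy_limit_s / block_time
        self.quiet_limit = quiet_limit_s / block_time
        self.noisy_blocks = 0
        self.quiet_blocks = 0

    def update(self, rms):
        """Feed one block's RMS amplitude; return True if the block is noisy."""
        noisy = rms > self.threshold
        if noisy:
            self.quiet_blocks = 0
            self.noisy_blocks += 1
            if self.noisy_blocks > self.noisy_limit:
                self.threshold *= 1.1   # "cover the ears"
                self.noisy_blocks = 0   # assumption: start a fresh window
        else:
            self.noisy_blocks = 0
            self.quiet_blocks += 1
            if self.quiet_blocks > self.quiet_limit:
                self.threshold *= 0.9   # "listen harder"
                self.quiet_blocks = 0   # assumption: start a fresh window
        return noisy
```

Each call to `update` handles one block, so the caller only has to supply RMS values at the block rate.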
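Also, I just noticed some comments in my code about observed RMS values. On the built-in mic of a Macbook Pro, with audio data normalized to +/- 1.0 and the input volume set to max, some data points:
- 0.003-0.006 (-50dB to -44dB): the central heating fan in my house
- 0.010-0.40 (-40dB to -8dB): typing on the same laptop
- 0.10 (-20dB): snapping fingers softly at 1' distance
- 0.60 (-4.4dB): snapping fingers loudly at 1'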
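The dB figures in that list are consistent with the usual amplitude conversion, 20·log10(rms), taken relative to a full-scale value of 1.0. That this is how they were computed is an assumption, and `rms_to_db` is a name chosen for the example.

```python
import math


def rms_to_db(rms):
    """Convert a normalized RMS amplitude (1.0 = full scale) to decibels."""
    return 20.0 * math.log10(rms)


# The RMS values quoted in the list above.
for rms in (0.003, 0.006, 0.010, 0.10, 0.40, 0.60):
    print("%.3f -> %.1f dB" % (rms, rms_to_db(rms)))
```

Rounded to the nearest dB, the results match the quoted ranges; for example, 0.10 gives -20.0 dB and 0.60 gives about -4.4 dB.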
Update: here's a sample to get you started.

    #!/usr/bin/python

    # open a microphone in pyAudio and listen for taps

    import pyaudio
    import struct
    import math

    INITIAL_TAP_THRESHOLD = 0.010
    FORMAT = pyaudio.paInt16
    SHORT_NORMALIZE = (1.0/32768.0)
    CHANNELS = 2
    RATE = 44100
    INPUT_BLOCK_TIME = 0.05
    INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
    # if we get this many noisy blocks in a row, increase the threshold
    OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME
    # if we get this many quiet blocks in a row, decrease the threshold
    UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME
    # if the noise was longer than this many blocks, it's not a 'tap'
    MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME

    def get_rms( block ):
        # RMS amplitude is defined as the square root of the
        # mean over time of the square of the amplitude.
        # so we need to convert this string of bytes into
        # a string of 16-bit samples...

        # we will get one short out for each
        # two chars in the string.
        # (integer division, so the count stays an int on Python 3)
        count = len(block)//2
        format = "%dh"%(count)
        shorts = struct.unpack( format, block )

        # iterate over the block.
        sum_squares = 0.0
        for sample in shorts:
            # sample is a signed short in +/- 32768.
            # normalize it to 1.0
            n = sample * SHORT_NORMALIZE
            sum_squares += n*n

        return math.sqrt( sum_squares / count )

    class TapTester(object):
        def __init__(self):
            self.pa = pyaudio.PyAudio()
            self.stream = self.open_mic_stream()
            self.tap_threshold = INITIAL_TAP_THRESHOLD
            self.noisycount = MAX_TAP_BLOCKS+1
            self.quietcount = 0
            self.errorcount = 0

        def stop(self):
            self.stream.close()

        def find_input_device(self):
            device_index = None
            for i in range( self.pa.get_device_count() ):
                devinfo = self.pa.get_device_info_by_index(i)
                print( "Device %d: %s"%(i,devinfo["name"]) )

                for keyword in ["mic","input"]:
                    if keyword in devinfo["name"].lower():
                        print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                        device_index = i
                        return device_index

            if device_index is None:
                print( "No preferred input found; using default input device." )

            return device_index

        def open_mic_stream( self ):
            device_index = self.find_input_device()

            stream = self.pa.open( format = FORMAT,
                                   channels = CHANNELS,
                                   rate = RATE,
                                   input = True,
                                   input_device_index = device_index,
                                   frames_per_buffer = INPUT_FRAMES_PER_BLOCK)

            return stream

        def tapDetected(self):
            print("Tap!")

        def listen(self):
            try:
                block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
            except IOError as e:
                # dammit.
                self.errorcount += 1
                print( "(%d) Error recording: %s"%(self.errorcount,e) )
                self.noisycount = 1
                return

            amplitude = get_rms( block )
            if amplitude > self.tap_threshold:
                # noisy block
                self.quietcount = 0
                self.noisycount += 1
                if self.noisycount > OVERSENSITIVE:
                    # turn down the sensitivity
                    self.tap_threshold *= 1.1
            else:
                # quiet block.

                if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                    self.tapDetected()
                self.noisycount = 0
                self.quietcount += 1
                if self.quietcount > UNDERSENSITIVE:
                    # turn up the sensitivity
                    self.tap_threshold *= 0.9

    if __name__ == "__main__":
        tt = TapTester()

        for i in range(1000):
            tt.listen()