## Past Project: Frequency Demodulation

Background:

Data obtained from TDO rf penetration depth measurements is essentially a frequency-modulated rf voltage signal (see TDO/Pulsed Field for more details). The data must be demodulated in order to glean information about the penetration depth and superconducting state. After my January 2016 trip to the NHMFL Pulsed Field Facility at LANL (where they have their own demodulation routines), I wanted to take a closer look at the raw data to identify a source of noise. I wrote a short MATLAB script to perform the demodulation for this specific dataset. Below is an example of field vs. time and frequency vs. time for a 30T magnet pulse at LANL with a λ-(BETS)_{2}GaCl_{4} sample, demodulated using the MATLAB script.

There are a couple of obvious features to note. First, the frequency follows the field quite closely. This is due to the voltage induced in the TDO circuit by the high dB/dt in the pulsed field system, and can be worked around either by compensating the circuit so that induced voltages cancel out, or by identifying and subtracting the background signal. Second, there is periodic frequency noise (~300 Hz), most obvious towards the end of the shot. This was the noise I was attempting to identify. (The frequency suggests it is due to mechanical vibration, but the exact culprit was never identified.)

The MATLAB script worked alright for this particular dataset, but there were obvious flaws and this was a very specific data format that I'm unlikely to encounter on a regular basis (not to mention I didn't really want to pay to re-up my MATLAB student license). I wanted to put together a solution that would be:

- More flexible in terms of input data formats and user-set parameters. *In particular, able to handle data from the small pulsed field (SPF) setup.*
- Able to provide better resolution and stability
- Faster and more efficient
- Open source

The problem:

During a pulse, TDO and pickup voltage data from the small pulsed field are read on two channels of the data acquisition oscilloscope. The data is retrieved from the oscilloscope using LabVIEW in the form of two files:

- A single 8MB file containing 4 million points of TDO oscillation data, followed by 4 million points of pickup voltage data, stored as 8 bit unsigned integers
- A small text file containing the time step between points and the vertical scaling factors for both channels of the oscilloscope.

Previously, these files were processed with a chain of three separate programs:

- A simple C program to integrate the pickup data and append a header to the TDO data so that it can be read by the demodulation program. (Required the user to input scaling factors manually.)
- A complex (and very fast) C++ demodulation program, which was originally designed for use with a different data format (hence the need to append a header). (Required the user to input scaling factors manually.)
- An Igor Pro spline fitting/plotting macro

The solution:

*See the full code on GitHub and an example IPython notebook here or at the GitHub link.*

First, we need to import all the necessary modules:

```
import numpy as np
from numpy import fft
import pandas as pd
from scipy import interpolate
from scipy import integrate
from scipy import signal
import matplotlib.pyplot as plt
```

Next, load the integer and scaling data and convert unsigned int data to voltage:
```
def dataload_spf_bin(osc_file):
    """ Load byte data taken w/ LabVIEW on small pulsed field.
    Assumes text file w/ scaling factors named osc_file+'Scale.txt' in same directory as osc_file.
    Returns dictionary containing timebase info, oscillations, and field.
    """
    scale_file = osc_file + 'Scale.txt'
    raw_data = np.fromfile(osc_file, dtype=np.uint8)  # load byte data as uint8
    osc_data = raw_data[:len(raw_data)//2]  # first half of file is TDO signal
    pu_data = raw_data[len(raw_data)//2:]   # second half of file is pickup voltage
    scale_data = pd.read_csv(scale_file, sep='\t')  # load scaling data
    osc = scale_data.yincrement[0]*(osc_data - 128.) + scale_data.yorigin[0]  # convert to voltage
    pu = scale_data.yincrement[1]*(pu_data - 128.) + scale_data.yorigin[1]
    df_osc = pd.DataFrame({'tdo': osc, 'pu': pu})
    osc_dict = {'file': osc_file, 'dt': scale_data.dt[0], 'df_osc': df_osc}
    return osc_dict
```
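As a quick sanity check on the conversion step, the same uint8-to-voltage arithmetic can be run on a few synthetic byte values. (The scale factors here are made up for illustration, not taken from a real `Scale.txt`.)

```python
import numpy as np

# hypothetical scale factors, for illustration only
yincrement = 0.004  # volts per count
yorigin = 0.0       # vertical offset in volts

raw = np.array([0, 128, 255], dtype=np.uint8)
# same conversion used in dataload_spf_bin: mid-scale (128) maps to yorigin
volts = yincrement * (raw - 128.) + yorigin
print(volts)  # [-0.512  0.     0.508]
```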

Then calculate the field from the pickup data. `pu_to_field()` accepts the dictionary returned by `dataload_spf_bin()` as input and returns an updated dictionary:
```
def pu_to_field(osc_dict, Nturns, diameter, plot=False):
    """ Integrates pickup signal to obtain field. Takes output of dataload_spf_bin as input.
    Nturns: number of turns in pickup coil.
    diameter: of pickup coil in meters.
    """
    coilArea = Nturns*np.pi*(diameter/2)**2
    # subtract offset from before pulse begins
    osc_dict['df_osc'].pu -= np.mean(osc_dict['df_osc'].pu[:100000])
    # calculate field
    field = (1/coilArea)*integrate.cumtrapz(osc_dict['df_osc'].pu, x=None, dx=osc_dict['dt'])
    # data is oversampled, so decimate it:
    field = signal.decimate(field, 40)
    tdo = signal.decimate(osc_dict['df_osc'].tdo, 10)
    # get new time steps after decimation
    dt_fast = osc_dict['dt']*osc_dict['df_osc'].tdo.size/len(tdo)
    dt_slow = osc_dict['dt']*osc_dict['df_osc'].pu.size/len(field)
    osc_dict = {'file': osc_dict['file'], 'dt_fast': dt_fast, 'tdo': tdo, 'dt_slow': dt_slow, 'Field': field}
    if plot:
        plt.plot(field)
        plt.xlabel('Index')
        plt.ylabel('Field (T)')
    return osc_dict
```
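The integration step can be sanity-checked on synthetic data: a pickup coil in a linearly ramping field sees a constant induced voltage V = N·A·dB/dt, so integrating V/(N·A) should recover the linear ramp. (The coil geometry and ramp rate below are invented for illustration; `cumulative_trapezoid` is the current SciPy name for `cumtrapz`.)

```python
import numpy as np
from scipy import integrate

# hypothetical pickup coil: 100 turns, 2 mm diameter
Nturns, diameter = 100, 0.002
coilArea = Nturns * np.pi * (diameter/2)**2

dt = 1e-6
t = np.arange(0, 1e-3, dt)
ramp_rate = 3000.0  # dB/dt in T/s for a linear field ramp
pu = coilArea * ramp_rate * np.ones_like(t)  # constant induced voltage

# same integration as pu_to_field: field = (1/NA) * cumulative integral of V dt
field = (1/coilArea) * integrate.cumulative_trapezoid(pu, x=None, dx=dt)
# field[i] should reproduce the ramp B(t) = ramp_rate * t
```

Note that `cumulative_trapezoid` returns one fewer point than its input, since each output point is the integral up to the *next* time step.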

Now we need to calculate frequency vs. time, i.e. demodulate the data. (I also wrote a function, `dataload_lanl_csv()`, to load the LANL data so that it too can be demodulated using `demod()`.) The basic approach is to take an FFT of `L` points, find the fundamental frequency of the TDO signal over those points, move on to another `L` points, etc.:
```
def demod(osc_dict, L=None, nstep=None, plot=True):
    """ Performs demodulation of data loaded via dataload_lanl_csv or dataload_spf_bin+pu_to_field.
    """
    dt_fast = osc_dict['dt_fast']  # time step for TDO signal
    dt_slow = osc_dict['dt_slow']  # time step for field
    tdo = np.array(osc_dict['tdo'])
    field = np.array(osc_dict['Field'])
    file = osc_dict['file']
    npnts = tdo.size  # number of points in TDO signal
    npnts_slow = int(npnts*dt_fast/dt_slow)  # number of points in everything else
    time_fast = np.arange(0, dt_fast*npnts, dt_fast)  # timebase for tdo
    time_slow = np.arange(0, dt_slow*npnts_slow, dt_slow)  # timebase for everything else
    Fs = 1/dt_fast  # sampling frequency
    # program will select a good value for L based on the frequency and Fs
    if L is None:
        init_fft = fft.rfft(tdo[:1024])[:-1]
        f = Fs*np.arange(0, 512)/(512)
        fmax = f[np.argmax(abs(init_fft)[1:])]
        pts_per_T = int(2/(fmax*dt_fast))
        L = 30*pts_per_T
    if nstep is None:  # nstep < L so there is some overlap between FFT chunks
        nstep = 20*pts_per_T
    # heavy zero-padding to improve frequency resolution
    npad = np.power(2, 16) - L  # zero-padding (for speed should be 2^N-L, but probably doesn't matter)
    pad = np.zeros(npad)
    N = int(npnts/nstep)
    freq = np.empty(N)
    amp = np.empty(N)
    time = np.empty(N)
    # main loop: perform sliding FFT
    for i in range(N-nstep):
        fftdata = np.concatenate([tdo[i*nstep:i*nstep+L], pad])
        f = Fs*np.arange(1000, L+npad)/(L+npad)
        #tdo_fft = fft.rfft(df_tdo.tdo[i*nstep:i*nstep+L])  # if no zero-padding
        #f = Fs*np.arange(0, L)/L  # if no zero-padding
        freq[i] = f[np.argmax(abs(fft.rfft(fftdata)[1000:]/L))]
        amp[i] = np.sqrt(2*np.mean(np.square(fftdata[:L])))
        time[i] = dt_fast*nstep*i
    time = time[:-nstep]
    freq = freq[:-nstep]
    amp = amp[:-nstep]
    demod_dict = {'file': file, 'time_tdo': time_fast, 'tdo': tdo, 'time_Field': time_slow, 'Field': field[:len(time_slow)], 'time_freq': time, 'freq': freq, 'amp': amp}
    if plot:
        fig, ax1 = plt.subplots()
        ax2 = ax1.twinx()
        ax1.plot(time, freq*1e-6, 'b', label='Frequency')
        ax2.plot(time, amp*1e3, 'r', label='Amplitude')
        ax2.set_ylabel('Amplitude (mV)')
        ax1.set_ylabel('Frequency (MHz)')
        ax2.set_xlabel('Time (s)')
        lines1, labels1 = ax1.get_legend_handles_labels()
        lines2, labels2 = ax2.get_legend_handles_labels()
        ax2.legend(lines1 + lines2, labels1 + labels2, loc=0)
        ax1.set_zorder(ax2.get_zorder()+1)  # put ax1 in front of ax2
        ax1.patch.set_visible(False)  # hide the 'canvas'
        plt.show()
    return demod_dict
```
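The heavy zero-padding inside `demod()` is what lets the peak frequency be located far more finely than the raw FFT bin spacing of Fs/L. A minimal sketch of the idea on a single chunk of a synthetic tone (the sampling rate, tone frequency, and chunk length are invented for illustration, not the real instrument values):

```python
import numpy as np
from numpy import fft

Fs = 50e6          # hypothetical 50 MHz sampling rate
dt = 1 / Fs
f0 = 24.0e6        # hypothetical TDO frequency to recover
L = 2048           # chunk length

t = np.arange(L) * dt
tdo = np.sin(2 * np.pi * f0 * t)

# zero-pad the chunk out to 2^16 points, as in demod()
npad = 2**16 - L
chunk = np.concatenate([tdo, np.zeros(npad)])

# frequency axis for the padded FFT; bin spacing is Fs/(L+npad) ~ 763 Hz,
# versus Fs/L ~ 24 kHz for the unpadded chunk
freqs = fft.rfftfreq(L + npad, d=dt)
f_est = freqs[np.argmax(np.abs(fft.rfft(chunk)))]
```

With padding, the peak lands within about one fine bin (~763 Hz) of the true frequency, a factor of ~30 better than the unpadded resolution. (Zero-padding interpolates the spectrum; it does not add information, so closely spaced tones still cannot be separated beyond the Fs/L limit.)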

The data (frequency, amplitude, and field) can now be spline fit so that we can plot frequency and amplitude vs. field in order to identify the superconducting phase transition, Shubnikov-de Haas oscillations, and other field-dependent phenomena. (I have included functions to perform the spline fitting and save the data in various forms, but I won't show the code here. See tdo-demod on GitHub.)

Discussion on the speed of `demod()`:

As noted in the example IPython notebook, the `demod()` function takes about 3-5 seconds to crunch data from the small pulsed field and closer to 10 seconds to crunch LANL-style data. The difference in speed between the two data formats is not surprising: I decimate the SPF data from 4 million points to 40,000, while the LANL data is 2.2 million points. I could decimate the LANL data, but as mentioned above, this was likely a one-time analysis.

The original C++ demodulation program (which I had no part in building) was significantly faster: I've never profiled it, but it always seemed to execute in less than a second. This comes as no surprise: the student who built the program put in a lot of work to optimize and parallelize it. I toyed around briefly with speeding up my program using Cython, but got only very modest speedup from incorporating static types. This isn't really a surprise, as all of the most time-consuming computations (array manipulations and FFTs) are already implemented in C/C++ (numpy) and Fortran (numpy/scipy fftpack). I also very briefly tried pyFFTW, a wrapper around FFTW (the fastest widely-available FFT library, as far as I can tell). I found pyFFTW to be more hassle than it was worth for this particular application, and decided that 3-5 seconds is *fast enough* for crunching one dataset at a time in an IPython notebook. (And, taking into account the amount of time needed for the user to switch between programs and input arguments using the old approach, the Python approach is certainly faster.) It's possible to speed up the execution significantly by changing the parameters `L` and `nstep` (the latter in particular), at the expense of resolution in the time domain.

Ultimately I am most concerned with *the readability and useability of the code*, and *minimizing the time and effort it takes to integrate new data formats*. If speed were my main concern, it would make much more sense to switch to a compiled language and/or consider an entirely different approach to the demodulation.