ADC Analysis and Scipy¶
Introduction¶
In an effort to beat the Lockdown Blues, I thought I would dabble in the world of embedded computers. This was sparked by a recommendation on Hacker News for the edX course "Embedded Systems - Shape The World: Microcontroller Input/Output". It has proved quite an interesting exercise.
The course uses a TI ARM microcontroller, the TM4C123GH6PM, to illustrate aspects of embedded design and programming. Even though it is by now about five years old, cheap Evaluation Kits from TI are still readily available. In fact, you can buy all the components needed for the course as a bundle, even here in Australia.
Culture Shock¶
It is quite a change to go from a high-level software engineering viewpoint, where you try to make abstractions of real-world systems, to a world where you must have a deep understanding of the architecture of the TM4C123GH6PM. You are expected to turn modules on and off to save power, and poking bits into arcane registers to modify many aspects of system behaviour simultaneously is a way of life. You can certainly write very concise C programs, where what would be a series of function calls in my world can be compactly expressed as a bit pattern written into some control register.
ADC¶
One of the facilities of the TM4C123GH6PM is a very flexible Analog-to-Digital Converter (ADC). One of my exercises was to input an analog voltage, convert it to a 12-bit value, and send it to a laptop (via the USB serial line) for display. I took a set of observations of the input voltage (via a multimeter), and the uploaded ADC value. In doing this I noted that the output was quite noisy, with jitter in the last decimal digit of the ADC value.
I then repeated the experiment, but this time I configured the TM4C123GH6PM ADC to do hardware averaging of the values read. The jitter seemed to be markedly reduced.
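As a quick back-of-the-envelope check, averaging N independent noisy samples should shrink the noise standard deviation by roughly sqrt(N), so averaging 64 samples should cut random jitter by about a factor of 8. A minimal simulation (the 1 LSB noise level and mid-scale code are assumptions, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 2048  # hypothetical mid-scale ADC code
n_obs = 10_000

# single-sample readings: true value plus ~1 LSB of independent noise
single = true_value + rng.normal(0, 1.0, size=n_obs)

# hardware averaging of 64 samples per reading
averaged = true_value + rng.normal(0, 1.0, size=(n_obs, 64)).mean(axis=1)

print(single.std(), averaged.std())  # ratio close to sqrt(64) = 8
```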
Second Thoughts¶
Then I had second thoughts. Maybe it wasn't noise, but (say) 50Hz ripple I was seeing in the reference voltage (unlikely I know), being sampled at random times. In this case, the averaged ADC values would still drift up and down with the ripple.
Scipy to the Rescue¶
I decided to use scipy to tell whether the jitter really was smaller in the averaged case.
Notebook Setup¶
%load_ext lab_black
%load_ext watermark
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.formula.api import ols
import statsmodels.api as sm
import scipy.stats as stats
import warnings
import sys
import os
import subprocess
import datetime
import platform
from IPython.display import Image
Image("images/TM4C123 ADC_bb.png")
Single Sample ADC¶
We read the single-sample ADC values via pandas, do some data cleanup, and use Seaborn to show the results
data = pd.read_csv('ADCRaw.txt')
data.head()
data.columns
data.columns = ['Voltage', 'ADC']
data.sort_values(by='Voltage', inplace=True)
We use different colors for the raw data and the fitted line. I have no idea why the legend doesn't show all labels. To make a point (and to make the band easier to see), I set the Confidence Interval to the 99% range
fig, ax = plt.subplots(figsize=(10, 10))
sns.regplot(
x='Voltage',
y='ADC',
data=data,
ci=99,
scatter_kws={"color": "black", "label": "ADC Reading"},
line_kws={"color": "red", "label": "OLS Fit"},
)
plt.legend(loc='best')
ax.set_title('Regplot: TM4C123GXL ADC Performance')
Linear Regression¶
We now perform a linear regression, and plot the Observation Confidence Interval (not the mean CI)
res1 = ols('ADC ~ Voltage', data=data).fit()
res1.summary()
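For a 12-bit ADC with a 3.3 V reference (the Tiva board's analog supply voltage), the ideal slope would be 4095 / 3.3 ≈ 1241 counts per volt, so the fitted Voltage coefficient above can be sanity-checked against that figure. A self-contained sketch on synthetic data, showing how to pull the slope out of a fit like this:

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(1)
# synthetic readings mimicking a 12-bit ADC with a 3.3 V reference
volts = np.linspace(0.1, 3.0, 30)
codes = volts * 4095 / 3.3 + rng.normal(0, 2.0, size=volts.size)
df = pd.DataFrame({'Voltage': volts, 'ADC': codes})

fit = ols('ADC ~ Voltage', data=df).fit()
slope = fit.params['Voltage']
print(slope)  # recovers roughly 4095 / 3.3 = 1241 counts per volt
```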
Note that the linear model is a very good one, explaining most of the variation in the data.
Now, we plot the data, with CI lines (using the default alpha of 0.05)
gp = res1.get_prediction({'Voltage': data['Voltage']})
pred_df = gp.summary_frame()
pred_df.head(1)
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(
data['Voltage'], data['ADC'], 'ko', label='ADC Readings'
)
ax.plot(
data['Voltage'], pred_df['mean'], 'g-', label='OLS Fit'
)
ax.plot(
data['Voltage'],
pred_df['obs_ci_upper'],
'y:',
label='Upper CI',
)
ax.plot(
data['Voltage'],
pred_df['obs_ci_lower'],
'y:',
label='Lower CI',
)
ax.fill_between(
data['Voltage'],
pred_df['obs_ci_lower'],
pred_df['obs_ci_upper'],
alpha=0.2,
)
ax.legend(loc='best')
ax.set_title('TM4C123GXL ADC Performance')
ax.set_xlabel('Volts')
ax.set_ylabel('ADC Reading')
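The distinction between the two interval types is worth noting: the obs_ci_* columns bound where a new observation should fall, while the mean_ci_* columns bound the fitted line itself, and the former are always wider. A small synthetic demonstration (the slope and noise level are illustrative assumptions):

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(1)
x = np.linspace(0, 3, 40)
y = 1241 * x + rng.normal(0, 5.0, size=x.size)
fit = ols('y ~ x', data=pd.DataFrame({'x': x, 'y': y})).fit()

sf = fit.get_prediction({'x': x}).summary_frame()
# obs_ci_* (new-observation interval) is always wider than
# mean_ci_* (uncertainty of the fitted line itself)
mean_width = (sf['mean_ci_upper'] - sf['mean_ci_lower']).mean()
obs_width = (sf['obs_ci_upper'] - sf['obs_ci_lower']).mean()
print(mean_width, obs_width)
```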
Hardware Averaged ADC¶
We now read the hardware-averaged (64 samples per reading) ADC values, and repeat the cleanup
data2 = pd.read_csv('ADCAv64.txt')
data2.head(2)
data2.columns = ['Voltage', 'ADC']
data2.sort_values(by='Voltage', inplace=True)
Perform Linear Regression¶
res2 = ols('ADC ~ Voltage', data=data2).fit()
res2.summary()
Plot Regression Results¶
We plot the fit twice, the second time with a wider CI band (alpha=0.005); either way, it is visually clear that the datapoints lie closer to the best-fit line than for the single-sample data
gp = res2.get_prediction({'Voltage': data2['Voltage']})
pred_df = gp.summary_frame()
pred_df.head(1)
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(
data2['Voltage'],
data2['ADC'],
'ko',
label='ADC Readings',
)
ax.plot(
data2['Voltage'], pred_df['mean'], 'g-', label='OLS Fit'
)
ax.plot(
data2['Voltage'],
pred_df['obs_ci_upper'],
'y:',
label='Upper CI (alpha=0.05)',
)
ax.plot(
data2['Voltage'],
pred_df['obs_ci_lower'],
'y:',
label='Lower CI',
)
ax.fill_between(
data2['Voltage'],
pred_df['obs_ci_lower'],
pred_df['obs_ci_upper'],
alpha=0.2,
)
ax.legend(loc='best')
ax.set_title(
'TM4C123GXL ADC Performance (HW Average of 64 Samples)'
)
ax.set_xlabel('Volts')
ax.set_ylabel('ADC Reading')
gp = res2.get_prediction({'Voltage': data2['Voltage']})
pred_df = gp.summary_frame(alpha=0.005)
pred_df.head(1)
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(
data2['Voltage'],
data2['ADC'],
'ko',
label='ADC Readings',
)
ax.plot(
data2['Voltage'], pred_df['mean'], 'g-', label='OLS Fit'
)
ax.plot(
data2['Voltage'],
pred_df['obs_ci_upper'],
'y:',
label='Upper CI (alpha=0.005)',
)
ax.plot(
data2['Voltage'],
pred_df['obs_ci_lower'],
'y:',
label='Lower CI',
)
ax.fill_between(
data2['Voltage'],
pred_df['obs_ci_lower'],
pred_df['obs_ci_upper'],
alpha=0.2,
)
ax.legend(loc='best')
ax.set_title(
'TM4C123GXL ADC Performance (HW Average of 64 Samples)'
)
ax.set_xlabel('Volts')
ax.set_ylabel('ADC Reading')
Plot Residuals¶
We now plot the residuals for the linear regressions on the two datasets
fig, ax = plt.subplots(figsize=(10, 10))
ax.plot(data['Voltage'], res1.resid, 'o', label='Raw ADC')
ax.plot(
data2['Voltage'],
res2.resid,
'r+',
label='ADC, Av of 64',
markersize=15,
)
ax.legend(loc='best')
ax.axhline(0, color='grey', alpha=0.6)
ax.set_title('TM4C123GXL ADC Residuals against Linear Fit')
ax.set_xlabel('Volts')
ax.set_ylabel('Residual')
Statistical Tests¶
We now apply a set of tests to check that the smaller residuals in the Hardware Averaged dataset regression are unlikely to have happened by chance.
Bartlett Test¶
"In statistics, Bartlett's test (see Snedecor and Cochran, 1989) is used to test if k samples are from populations with equal variances."
stats.bartlett(res1.resid, res2.resid)
chi2_crit = stats.chi2.ppf(0.95, 1)
chi2_crit
fstat = stats.bartlett(res1.resid, res2.resid).statistic
x = np.linspace(0.1, 40, 100)
_ = plt.plot(
x, stats.chi2.pdf(x, 1), '-', label='Chi^2(1)'
)
_ = plt.fill_between(x, 0, stats.chi2.pdf(x, 1), alpha=0.5)
_ = plt.axvline(chi2_crit, color='green', label='95% cutoff')
_ = plt.axvline(
fstat, color='red', label='Observed Chi^2 statistic'
)
_ = plt.axhline(0, color='gray')
_ = plt.legend(loc='best')
_ = plt.title('Bartlett Test')
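To see the test working on data where the answer is known, here is Bartlett's test on two synthetic normal samples whose standard deviations differ by a factor of 8 (roughly the contrast we expect between raw and averaged residuals; the sizes and spreads are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
wide = rng.normal(0, 8.0, size=50)    # like the raw-ADC residuals
narrow = rng.normal(0, 1.0, size=50)  # like the averaged residuals

stat, p = stats.bartlett(wide, narrow)
crit = stats.chi2.ppf(0.95, 1)
# the statistic lands far beyond the 95% chi-squared cutoff
print(stat, crit, p)
```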
Levene Test¶
Every statistical test has underlying assumptions: let us drop the assumption of normality of the residuals
"The Levene test tests the null hypothesis that all input samples are from populations with equal variances. Levene’s test is an alternative to Bartlett’s test bartlett in the case where there are significant deviations from normality."
stats.levene(res1.resid, res2.resid, center='mean')
"The test version using the mean was proposed in the original article of Levene ([2]) while the median and trimmed mean have been studied by Brown and Forsythe ([3]), sometimes also referred to as Brown-Forsythe test."
stats.levene(res1.resid, res2.resid, center='trimmed')
stats.levene(res1.resid, res2.resid, center='median')
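Levene's robustness claim can be checked directly: on heavy-tailed (Student-t with 3 degrees of freedom, so clearly non-normal) samples with a genuine five-fold difference in scale, the median-centered version still flags unequal variances. The sample sizes and scale factor here are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# heavy-tailed, clearly non-normal samples with different spread
wide = rng.standard_t(df=3, size=60) * 5.0
narrow = rng.standard_t(df=3, size=60)

stat, p = stats.levene(wide, narrow, center='median')
print(stat, p)  # small p-value despite the non-normality
```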
Plot the result
f_crit = stats.f.ppf(
0.95, 1, len(res1.resid) + len(res2.resid) - 2
)
f_crit
# Levene's W is referred to F(k - 1, N - k); here k = 2 groups,
# so the denominator degrees of freedom are N - 2
fstat = stats.levene(
res1.resid, res2.resid, center='median'
).statistic
x = np.linspace(0.1, 10, 100)
_ = plt.plot(
x,
stats.f.pdf(
x, 1, len(res1.resid) + len(res2.resid) - 2
),
'-',
label=f'F(1, {len(res1.resid) + len(res2.resid) - 2})',
)
_ = plt.fill_between(
x,
0,
stats.f.pdf(
x, 1, len(res1.resid) + len(res2.resid) - 2
),
alpha=0.5,
)
_ = plt.axvline(f_crit, color='green', label='95% cutoff')
_ = plt.axvline(
fstat, color='red', label='Observed F statistic'
)
_ = plt.axhline(0, color='gray')
_ = plt.legend(loc='best')
_ = plt.title('Levene Test')
Fligner Test¶
Apply another test with different assumptions
"Fligner’s test tests the null hypothesis that all input samples are from populations with equal variances. Fligner-Killeen’s test is distribution free when populations are identical [2]."
stats.fligner(res1.resid, res2.resid)
Compute the F statistic, as the ratio of the two sample variances
a, b = (
np.var(res1.resid, ddof=1),
np.var(res2.resid, ddof=1),
)
a, b
fstat = a / b
fstat
Probability of seeing this value or greater
fdist = stats.f(len(res1.resid) - 1, len(res2.resid) - 1)
p_value = 1 - fdist.cdf(fstat)
p_value
f_crit = stats.f.ppf(
0.99, len(res1.resid) - 1, len(res2.resid) - 1
)
f_crit
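The same variance-ratio recipe on two synthetic samples, as a sketch (the eight-fold spread and the sample sizes are assumptions chosen to mimic the two residual sets):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
a = rng.normal(0, 8.0, size=40)  # wider-spread sample
b = rng.normal(0, 1.0, size=40)  # narrower-spread sample

# F statistic: ratio of sample variances, df = (n1 - 1, n2 - 1)
fstat = np.var(a, ddof=1) / np.var(b, ddof=1)
p_value = 1 - stats.f(len(a) - 1, len(b) - 1).cdf(fstat)
print(fstat, p_value)  # very large F ratio, negligible p-value
```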
Plot the results
x = np.linspace(0.1, 80, 100)
_ = plt.plot(
x,
stats.f.pdf(
x, len(res1.resid) - 1, len(res2.resid) - 1
),
'-',
label=f'F({len(res1.resid) - 1}, {len(res2.resid)-1})',
)
_ = plt.fill_between(
x,
0,
stats.f.pdf(
x, len(res1.resid) - 1, len(res2.resid) - 1
),
alpha=0.5,
)
_ = plt.axvline(f_crit, color='green', label='99% cutoff')
_ = plt.axvline(
fstat, color='red', label='Observed F statistic'
)
_ = plt.axhline(0, color='gray')
_ = plt.legend(loc='best')
Conclusions¶
From our test results (all significant at the 5% level), we can be confident that the reduced jitter did not occur by chance, and that Hardware Averaging in the ADC does improve the stability of the readings.
Environment¶
%watermark
%watermark -h -iv
# show info to support reproducibility
theNotebook = 'ADCAnalysis.ipynb'
def python_env_name():
envs = subprocess.check_output(
['conda', 'env', 'list']
).splitlines()
# get unicode version of binary subprocess output
envu = [x.decode('ascii') for x in envs]
active_env = list(
filter(lambda s: '*' in str(s), envu)
)[0]
env_name = str(active_env).split()[0]
return env_name
# end python_env_name
print('python version : ' + sys.version)
print('python environment :', python_env_name())
print('current wkg dir : ' + os.getcwd())
print('Notebook name : ' + theNotebook)
print(
'Notebook run at : '
+ str(datetime.datetime.now())
+ ' local time'
)
print(
'Notebook run at : '
+ str(datetime.datetime.utcnow())
+ ' UTC'
)
print('Notebook run on : ' + platform.platform())