(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww3.picturepush.com%2Fphoto%2Fa%2F8907576%2Foimg%2Fbeanieboy%2Fqlnmap.png&hash=88cad19476ff84764d76d1e86e3e927221a50e9f) (http://picturepush.com/public/8907576)
(click for full size)
Perhaps I can appeal again for collaboration on a project to programmatically analyse QLN data for RF ingress?
The general problem is one of noise peak or spike detection. There is a good survey of peak detection algorithms for mass spectrometry analysis here [1]. The survey was authored by Chao Yang, et al from Hong Kong University of Science & Tech. All of the algorithms they surveyed are open source. And although intended for proteome research, they should be re-usable for our project.
In that survey, the 'winning' peak detection algorithm, in terms of its sensitivity vs its false discovery rate, is an algorithm based on Continuous Wavelet Transforms (cwt). The algorithm was developed by a team at Northwestern University, led by Pan Du. [2] It is written for 'R'. [3]
The algorithm integrates a smoothing filter (to remove background noise), and baseline correction.
Further work would involve mapping the frequencies of the detected QLN noise peaks to the broadcast channels of AM radio. In Europe, AM (MW) radio band is assigned 9kHz channel spacing. Whereas ADSL/VDSL2 uses spacing of 4.3125kHz. So the channels don't immediately coincide. Some 'jiggery pokery' / tolerance / fuzzy logic is involved.
Does that appeal to anyone here?!
cheers, a
[1] http://www.biomedcentral.com/1471-2105/10/4
[2] http://www.bioconductor.org/packages/release/bioc/html/MassSpecWavelet.html
[3] http://www.r-project.org/
A team at Northwestern University has released an open source Wavelet-based Peak Detection algorithm for Mass Spectrometry Analysis. [1][2] It can be re-purposed for this comparatively trivial task of RF ingress noise detection on DSL lines.
We start with qln.txt, a two column text file of QLN tonemap data from a Broadcom chipset modem. It is just a capture from the modem shell command xdslcmd info --QLN.
$ head -20 qln.txt
"index" "QLN"
0 -113.0000
1 -123.0000
2 -123.0000
3 -123.0000
4 -123.0000
5 -123.0000
6 -123.0000
7 -123.0000
8 -123.0000
9 -123.0000
10 -123.0000
11 -123.0000
12 -123.0000
13 -123.0000
14 -123.0000
15 -123.0000
16 -122.0000
17 -122.0000
18 -122.0000
We begin by loading R, the "free software environment for statistical computing and graphics" [3]. Then we install Northwestern's Wavelet-based Peak Detection library. [2].
$ R
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
> source("http://bioconductor.org/biocLite.R")
BiocInstaller version 1.4.7, ?biocLite for help
> biocLite("MassSpecWavelet")
BioC_mirror: http://bioconductor.org
Using R version 2.15, BiocInstaller version 1.4.7.
Installing package(s) 'MassSpecWavelet'
..
* DONE (MassSpecWavelet)
The PeakDet library is loaded, and we read in our QLN dataset before calling the peak-det functions from the library:
> library(MassSpecWavelet)
Loading required package: waveslim
> qlnDAT <- read.table("qln.txt", colClasses = c(rep("NULL",1), rep("numeric",1)), header=TRUE)
> qlnMAT <- data.matrix(qlnDAT)
> wCoefs <- cwt(qlnMAT, scales=seq(1,31,1), wavelet='mexh')
> localMax <- getLocalMaximumCWT(wCoefs)
> ridgeList <- getRidge(localMax)
> majorPeakInfo <- identifyMajorPeaks(qlnMAT, nearbyPeak=TRUE, ridgeList, wCoefs, SNR.Th=1)
> peakIndex <- majorPeakInfo$peakIndex
Which discovers the following Peak Information..
Eight noise peaks in total, using those thresholds. Discovered at DMTs #84 #141 #162 #249 #268 #283 #339 #377, each of a given intensity.
> majorPeakInfo
$peakIndex
1_84 1_141 1_162 1_249 1_268 1_283 1_339 1_377
84 141 162 249 268 283 339 377
$peakValue
1_2 1_30 1_45 1_51 1_66 1_84 1_100
5.9741241 14.6055427 3.4114940 0.0000000 2.2530691 23.2881416 1.9204691
1_112 1_120 1_141 1_162 1_200 1_212 1_226
0.0000000 2.6028832 19.9828253 75.8928215 0.7603451 0.0000000 0.0000000
1_249 1_268 1_283 1_301 1_323 1_339 1_367
14.3018559 52.1779085 10.8258617 0.0000000 0.0000000 12.3137493 1.2914229
1_377 1_399 1_409 1_417 1_429 1_442 1_457
6.7045729 1.1645874 0.0000000 0.0000000 1.4541205 0.0000000 4.2575986
1_477 1_485
1.5450179 0.0000000
$peakCenterIndex
1_2 1_30 1_45 1_51 1_66 1_84 1_100 1_112 1_120 1_141 1_162 1_200 1_212
3 28 48 51 70 60 96 112 117 140 159 198 212
1_226 1_249 1_268 1_283 1_301 1_323 1_339 1_367 1_377 1_399 1_409 1_417 1_429
226 249 270 284 301 323 339 367 389 396 409 417 427
1_442 1_457 1_477 1_485
442 468 478 485
$peakSNR
1_2 1_30 1_45 1_51 1_66 1_84 1_100
1.4794073 3.6031302 0.8416022 0.0000000 0.5558233 5.7450933 0.4737722
1_112 1_120 1_141 1_162 1_200 1_212 1_226
0.0000000 0.6421211 4.9296847 18.7224618 0.1875742 0.0000000 0.0000000
1_249 1_268 1_283 1_301 1_323 1_339 1_367
3.5282118 12.8720857 2.6706977 0.0000000 0.0000000 3.0377537 0.3185890
1_377 1_399 1_409 1_417 1_429 1_442 1_457
1.6539919 0.2872992 0.0000000 0.0000000 0.3587258 0.0000000 1.0503329
1_477 1_485
0.3811499 0.0000000
$peakScale
1_2 1_30 1_45 1_51 1_66 1_84 1_100 1_112 1_120 1_141 1_162 1_200 1_212
5 5 5 0 6 30 5 0 5 5 30 5 0
1_226 1_249 1_268 1_283 1_301 1_323 1_339 1_367 1_377 1_399 1_409 1_417 1_429
0 5 25 5 0 0 5 5 20 5 0 0 5
1_442 1_457 1_477 1_485
0 15 8 0
$potentialPeakIndex
1_84 1_100 1_112 1_120 1_141 1_162 1_200 1_212 1_226 1_249 1_268 1_283 1_301
84 100 112 120 141 162 200 212 226 249 268 283 301
1_323 1_339 1_367 1_377 1_399 1_409 1_417 1_429
323 339 367 377 399 409 417 429
$allPeakIndex
1_2 1_30 1_45 1_51 1_66 1_84 1_100 1_112 1_120 1_141 1_162 1_200 1_212
2 30 45 51 66 84 100 112 120 141 162 200 212
1_226 1_249 1_268 1_283 1_301 1_323 1_339 1_367 1_377 1_399 1_409 1_417 1_429
226 249 268 283 301 323 339 367 377 399 409 417 429
1_442 1_457 1_477 1_485
442 457 477 485
And that data can be plotted with the following command. In the graphs, the noise peaks above our threshold are highlighted in red.
> plotPeak(qlnMAT, peakIndex, main=paste("identified QLN peaks"))
Which produces this graph:
(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww2.picturepush.com%2Fphoto%2Fa%2F8954345%2F480%2Fqln_analysis%2FScreenshot-from-2012-08-12-02%253A36%253A58.png&hash=f2e0371a71e7c22b6773582007469e715416f59c) (http://picturepush.com/public/8954345)
(click for full size)
We can also do a baseline correction to visually examine the SNR of each noise peak:
> plotRange <- c(1,512)
> peakSNR <- majorPeakInfo$peakSNR
> allPeakIndex <- majorPeakInfo$allPeakIndex
> selInd <- which(allPeakIndex >= plotRange[1] & allPeakIndex < plotRange[2])
> plot(allPeakIndex[selInd], peakSNR[selInd], type='h', xlab='DMT Index', ylab='Peak noise to background ratio')
> points(peakIndex, peakSNR[names(peakIndex)], type='h', col='red')
> peakIndex <- majorPeakInfo$peakIndex
> title('CWT Baseline correction - SNR of peaks - majorPeaks in red')
(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww3.picturepush.com%2Fphoto%2Fa%2F8954346%2F480%2Fqln_analysis%2FScreenshot-from-2012-08-12-02%253A48%253A12.png&hash=97198bf3995e68543ad979063a5592d435b7edc8) (http://picturepush.com/public/8954346)
(click for full size)
Hopefully, it is possible to build these Peak Detection functions as a static library, and to use a C language binding to call them from own code. [4]
The next stage is re-sampling(?) the DSL data to align it with the 9kHz channel spacing of AM broadcast radio.
An idea mentioned earlier is to apply a cluster detection algorithm to those noise peak frequencies, for geolocation purposes. [5]
Perhaps the tool could become a plug-in for a DSL line-monitoring application like the one Roseway is developing :) [6]
cheers, a
[1] http://www2.in.tu-clausthal.de/~hammer/lectures/biomlsem/sechs.pdf
[2] http://www.bioconductor.org/packages/release/bioc/html/MassSpecWavelet.html
[3] http://www.r-project.org/
[4] http://stackoverflow.com/questions/7457635/calling-r-function-from-c
[5] http://forum.kitz.co.uk/index.php/topic,11424.msg220565.html#msg220565
[6] http://forum.kitz.co.uk/index.php/topic,11478.0.html