Kitz ADSL Broadband Information
Author Topic: Regex speed-up  (Read 1439 times)

Weaver

  • Senior Kitizen
  • Posts: 11459
  • Retd s/w dev; A&A; 4x7km ADSL2 lines; Firebrick
Regex speed-up
« on: August 14, 2022, 07:35:34 PM »

I have the following data file, courtesy of mr johnson’s custom code in my ZyXEL modems. It is the SNR per tone data:
Code:
xdslctl: ADSL driver and PHY status
Status: Showtime
Last Retrain Reason: 8000
Last initialization procedure status: 0
Max: Upstream rate = 286 Kbps, Downstream rate = 3064 Kbps
Bearer: 0, Upstream rate = 352 Kbps, Downstream rate = 2676 Kbps

Tone number      SNR
   0 0.0000
   1 0.0000
   2 0.0000
   3 0.0000
   4 0.0000
   5 0.0000
   6 0.0000
   7 31.0000
   8 34.5000
   9 36.5000
   10 35.5000
   11 33.5000
   12 32.5000
   13 31.5000
   14 29.5000
   15 28.5000
   16 27.5000
   17 26.5000
   18 24.5000
   19 24.0000
   20 23.5000
   21 21.5000
   22 21.0000
   23 19.5000
   24 20.0000
   25 18.5000
   26 17.0000
   27 18.0000
   28 15.5000
   29 14.5000
   30 15.0000
   31 14.5000
   32 0.0000
   33 29.1875
   34 31.5625
   35 33.1875
   36 34.0000
   37 34.8750
   38 35.9375
   39 36.6250
   40 37.4375
   41 37.9375
   42 38.6250
   43 39.3125
   44 39.6875
   45 39.4375
   46 33.9375
   47 39.0000
   48 38.5625
   49 37.5625
   50 37.0000
   51 37.0000
   52 36.8125
   53 36.8750
   54 37.0000
   55 37.1875
   56 37.3750
   57 37.5000
   58 36.5000
   59 36.5625
   60 37.2500
   61 37.6875
   62 37.5625
   63 37.5625
   64 37.0625
   65 37.3750
   66 37.1250
   67 35.4375
   68 36.7500
   69 36.3750
   70 36.0625
   71 35.8125
   72 35.3125
   73 35.0000
   74 34.3750
   75 34.3125
   76 34.0625
   77 33.7500
   78 33.4375
   79 32.8750
   80 32.6875
   81 32.1250
   82 31.5625
   83 31.5000
   84 31.0000
   85 30.5625
   86 30.1250
   87 29.6875
   88 29.4375

The file is truncated because the rest isn’t relevant to what I’m doing.

Let us call the first field (ASCII decimal number) on a line x and the second y. I’m searching for a given x and then I return the associated y. I use the following regex to do it:
        replace( /^\X*\n[ \t]*<x>[ \t]+([0-9.]+)\X+$/, "$1" );
where "<x>" is to be replaced by the literal ASCII decimal search value x, without the < >. The whole file matches, so the result is just the y value.
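
For readers who want to try the idea outside Shortcuts, here is a minimal Python sketch of the same replace. It is not the Shortcuts action itself: Python's `re` module has no `\X`, so `[\s\S]` stands in for it as an assumption, and `lookup_snr` and `SAMPLE` are hypothetical names, with `SAMPLE` being a shortened copy of the data above.

```python
import re

# Shortened copy of the data file above (a few header lines and tone lines).
SAMPLE = """xdslctl: ADSL driver and PHY status
Status: Showtime

Tone number      SNR
   38 35.9375
   39 36.6250
   40 37.4375
   41 37.9375
   42 38.6250
"""


def lookup_snr(data: str, x: int) -> str:
    """Return the y (SNR) text for tone x, or "" if tone x is absent."""
    # [\s\S] is used in place of \X, which Python's re does not support.
    pattern = r"^[\s\S]*\n[ \t]*" + str(x) + r"[ \t]+([0-9.]+)[\s\S]+$"
    if re.search(pattern, data) is None:
        return ""  # without this guard, a failed replace returns the whole file
    return re.sub(pattern, r"\1", data)


print(lookup_snr(SAMPLE, 40))
```

The guard for a missing tone is an addition for illustration; the original replace would simply leave the text unchanged when nothing matches.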

* My question: do you think I can speed this up by chopping off the first part of the file, up to the line for x=35?

This is possible because the lowest x-valued query I ever make is around x=40 and the search x is certainly always greater than 35. That’s safe.

I certainly can test the speed myself. I’m writing this in iOS Shortcuts and would need to make two enclosing comparison loops with a large loop-count that search the original vs chopped-off data. But I wanted to hear your opinions before I waste some time.
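
The comparison described above could be sketched roughly as follows in Python rather than Shortcuts. Everything here is an assumption made for illustration: the synthetic data from `make_data` (256 made-up tone lines behind 8 non-record lines), the helper names `chop_before` and `lookup`, the loop count, and the use of `[\s\S]` in place of `\X`. Timings from Python's regex engine will not necessarily carry over to the engine behind Shortcuts.

```python
import re
import timeit


def make_data(n_tones: int = 256) -> str:
    """Build a synthetic file: 8 non-record lines, then one line per tone."""
    header = ["header line"] * 6 + ["", "Tone number      SNR"]
    rows = [f"   {tone} {30 + tone % 10}.0000" for tone in range(n_tones)]
    return "\n".join(header + rows) + "\n"


def chop_before(data: str, first_tone: int) -> str:
    """Drop everything ahead of the line for first_tone, keeping its leading newline."""
    m = re.search(rf"\n[ \t]*{first_tone}[ \t]", data)
    return data[m.start():] if m else data


def lookup(data: str, x: int) -> str:
    """Same replace as in the post, with [\\s\\S] standing in for \\X."""
    pattern = rf"^[\s\S]*\n[ \t]*{x}[ \t]+([0-9.]+)[\s\S]+$"
    return re.sub(pattern, r"\1", data)


full = make_data()
chopped = chop_before(full, 35)

# Both inputs must give the same answer before the timings mean anything.
assert lookup(full, 88) == lookup(chopped, 88)

for name, data in (("full", full), ("chopped", chopped)):
    seconds = timeit.timeit(lambda: lookup(data, 88), number=1000)
    print(f"{name:8s} {seconds:.4f} s for 1000 lookups")
```

One thing worth checking with such a harness: the leading `\X*` is greedy, so a backtracking engine may consume the whole file first and then work backwards from the end. If that is what happens, trimming the tail could matter as much as trimming the head, and only measurement will tell.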
« Last Edit: August 14, 2022, 07:42:04 PM by Weaver »

burakkucat

  • Respected
  • Senior Kitizen
  • *
  • Posts: 38300
  • Over the Rainbow Bridge
    • The ELRepo Project
Re: Regex speed-up
« Reply #1 on: August 14, 2022, 10:28:26 PM »

In your example, above, sub-carriers (tones) 7 to 31 (inclusive) are the US band and sub-carriers (tones) 33 upwards are the DS band of your ADSL2 circuit. Why would you not want to consider the entire range?
:cat:  100% Linux and, previously, Unix. Co-founder of the ELRepo Project.

Please consider making a donation to support the running of this site.

Weaver

Re: Regex speed-up
« Reply #2 on: August 15, 2022, 02:00:34 AM »

Because this is only used in HCD detection, and the upstream is not considered. The lowest downstream tone ever searched for is somewhere around tone 40, and the highest so far is 88. I am not sure whether I’d get even a very minor speed increase from chopping off the highest tones, say those above 100 (a bit higher than 88, just to be safe in case of future changes). Since the highest part is not searched, only discarded, I’m not sure there’s much benefit to be gained.

Coming back to the lowest tones, I’m assuming the time cost of searching for a line with tone=x=nn is the cost of matching: searching for newlines, then skipping any white space and checking for a digit, then matching the decimal number x value. How quickly you can scan forwards and reject non-matching lines determines the performance, so I’m thinking that chopping off the first part decreases the number of times around the loop. Even with this initial truncation, one would still have to go round, say, ( 88 - 35 ) times to find the line tone=x=88 if 35 is the first line; that compares to 88 + n_initial_non_record_lines = 88 + 8 for the full file.
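
The arithmetic in that last sentence can be written out as a small Python sketch. It assumes the cost model described in the post, a forward scan that rejects one line per trip round the loop, which may not be how the regex engine actually proceeds. `lines_before` is a hypothetical helper, and the value 8 is the count of non-record lines ahead of tone 0 in the sample file.

```python
N_INITIAL_NON_RECORD_LINES = 8  # lines ahead of "0 0.0000" in the sample file


def lines_before(target_tone: int, first_tone: int = 0,
                 header_lines: int = N_INITIAL_NON_RECORD_LINES) -> int:
    """Number of lines a forward scan passes before reaching target_tone."""
    return header_lines + (target_tone - first_tone)


full = lines_before(88)                                    # 88 + 8
chopped = lines_before(88, first_tone=35, header_lines=0)  # 88 - 35
print(full, chopped, f"{chopped / full:.0%}")              # prints: 96 53 55%
```

Under this model the chopped file needs a little over half the trips for the worst case so far (tone 88), and proportionally fewer for tones nearer 40.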

burakkucat

Re: Regex speed-up
« Reply #3 on: August 15, 2022, 04:17:25 PM »

I now see your reasoning. Thank you.

33 to 100 or 35 to 88? Your choice.  :)