FIX message rate monitoring
I'm a newbie on this site. I have limited scripting skills, but I can pick my way through scripts without a problem. I need to write a script to monitor FIX messages coming through a number of log files in real time, segregated by account and symbol. The rate needs to be calculated on a per-minute basis. At the moment I'm not sure whether it will be a minute-by-minute calculation or a rolling 60-second calculation. I haven't written anything yet; I'm looking to see whether this is possible, and whether anyone can give me pointers on the best scripting language to employ. Thanks.
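To make the two options concrete, here is a minimal sketch of the difference between a minute-by-minute count and a rolling 60-second count, written in Python purely for illustration. The names `minute_bucket_counts` and `RollingRate`, and the account and symbol values, are made up for this example, and it assumes events for each account/symbol pair arrive in timestamp order.

```python
from collections import Counter, defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)


def minute_bucket_counts(events):
    """Count events per (account, symbol, calendar minute)."""
    counts = Counter()
    for ts, account, symbol in events:
        # Truncate the timestamp to the start of its calendar minute.
        minute = ts.replace(second=0, microsecond=0)
        counts[(account, symbol, minute)] += 1
    return counts


class RollingRate:
    """Count events per (account, symbol) over the trailing 60 seconds.

    Assumes timestamps for a given key are fed in non-decreasing order.
    """

    def __init__(self, window=WINDOW):
        self.window = window
        self.times = defaultdict(deque)

    def add(self, ts, account, symbol):
        q = self.times[(account, symbol)]
        q.append(ts)
        # Keep only timestamps inside the half-open window (ts - window, ts].
        while q and q[0] <= ts - self.window:
            q.popleft()
        return len(q)


if __name__ == "__main__":
    start = datetime(2013, 8, 12, 13, 0, 50)
    # Four hypothetical orders straddling a minute boundary.
    events = [(start + timedelta(seconds=s), "ACC1", "SYM1") for s in (0, 5, 15, 20)]

    for key, n in sorted(minute_bucket_counts(events).items()):
        print("fixed  ", key[2].strftime("%H:%M"), n)

    rolling = RollingRate()
    for ev in events:
        print("rolling", ev[0].strftime("%H:%M:%S"), rolling.add(*ev))
```

With these four sample events the fixed buckets each hold 2 messages, while the rolling count reaches 4, so a burst that straddles a minute boundary is only visible in the rolling calculation.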
Here is a brutal solution in gawk. If there is a 35=D on the line, I use regexes to split the interesting parts out: the timestamp (without the seconds, so that entries fall into equivalence classes at the minute level) and the two tags (1 for the account, 55 for the symbol). I then dump them into a 'multidimensional' array, meaning I use these values as the indices of the array. Once I have gone through all the messages, I scan the array, in no particular order, and dump the counters. It is terribly ugly: the three 'match' calls should be written as one, and perhaps the output should be sorted, but that is trivial in the shell with 'sort'.
#!/usr/bin/awk -f
# Requires gawk: the three-argument match(), which fills an array with the
# capture groups, is a gawk extension.
#
# Sample input line:
#out_vec__pwkbvsp-le2__0 [ 601] : timestamp=2013-08-12-13:00:01.235605858 :: latency=1323.3460000000 :: 8=FIX.4.4|9=0253|35=D|34=0000601|52=20130812-13:00:01.235|49=sender|56=receiver|57=sor|50=trader|128=spse|11=orderid1|453=3|448=16|447=D|452=7|448=dma1|447=D|452=54|448=abc|447=D|452=36|1=account123|55=lpsb3|54=1|60=20130812-13:00:00.000|38=6400|40=2|44=17.8700|15=BRL|59=0|10=010| :: aux_len=0,

/35=D/ {
    # Account (tag 1) and symbol (tag 55).
    n = match($0, /.*\|1=([^\|]+)\|.*/, tmp1);
    n = match($0, /.*\|55=([^\|]+)\|.*/, tmp2);
    # Year, month, day, hour, minute; the seconds are dropped on purpose.
    n = match($0, /[^:]+: timestamp=([[:digit:]]+)-([[:digit:]]+)-([[:digit:]]+)-([[:digit:]]+):([[:digit:]]+).*/, ts);
    # print tmp1[1], tmp2[1], ts[1], ts[2], ts[3], ts[4], ts[5];
    # The subscripts are joined with SUBSEP, a non-printing character by default.
    aggr[tmp1[1], tmp2[1], ts[1], ts[2], ts[3], ts[4], ts[5]]++;
}

END {
    for (i in aggr)
        print i, aggr[i];
}
For my sample data I get:
account123pssa3201308121301 3
account123cpfe3201308121301 1
account123lpsb3201308121300 1
account123geti4201308121301 1
which can then be processed further.
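As a rough illustration of the two clean-ups mentioned above (one parsing pass instead of three 'match' calls, and sorted output), here is a sketch in Python rather than gawk. It assumes the log layout of the sample line: a `timestamp=YYYY-MM-DD-HH:MM` prefix followed by '|'-delimited FIX fields. The function names `parse_fix` and `count_per_minute` and the sample lines are hypothetical, not part of the script above.

```python
import re
from collections import Counter

# Year, month, day, hour and minute from e.g. "timestamp=2013-08-12-13:00:01.235".
TS_RE = re.compile(r"timestamp=(\d{4})-(\d{2})-(\d{2})-(\d{2}):(\d{2})")


def parse_fix(line):
    """Return {tag: value} for the '|'-delimited fields of a log line.

    Only purely numeric tags are kept, so the log prefix and suffix are
    ignored. For repeated tags (such as 448) the last value wins.
    """
    fields = {}
    for part in line.split("|"):
        tag, sep, value = part.partition("=")
        if sep and tag.isdigit():
            fields[tag] = value
    return fields


def count_per_minute(lines, msg_type="D"):
    """Count messages of one MsgType per (account, symbol, minute)."""
    counts = Counter()
    for line in lines:
        m = TS_RE.search(line)
        if not m:
            continue
        fields = parse_fix(line)
        if fields.get("35") != msg_type:
            continue
        if "1" not in fields or "55" not in fields:
            continue
        minute = "{}-{}-{} {}:{}".format(*m.groups())
        counts[(fields["1"], fields["55"], minute)] += 1
    return counts


if __name__ == "__main__":
    # Made-up, shortened lines in the same layout as the sample above.
    sample = [
        "x [ 1] : timestamp=2013-08-12-13:00:01.235 :: 8=FIX.4.4|35=D|1=ACC1|55=SYM1|10=010|",
        "x [ 2] : timestamp=2013-08-12-13:00:59.000 :: 8=FIX.4.4|35=D|1=ACC1|55=SYM1|10=011|",
        "x [ 3] : timestamp=2013-08-12-13:01:00.000 :: 8=FIX.4.4|35=D|1=ACC1|55=SYM2|10=012|",
    ]
    for (account, symbol, minute), n in sorted(count_per_minute(sample).items()):
        print(account, symbol, minute, n)
```

Keeping the key as a tuple, rather than a single joined string, keeps account, symbol and minute separate in the output and lets `sorted` order the rows without a trip through the shell.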