FipSeq

FipSeq – header manipulation and system variables


If you want to filter a TYPE of character :

	 a 	 Alphabetic letters 	
	 x 	 Alphanumeric - letters and numbers 	
	 e 	 Alphanumeric plus dot 	
	 z 	 ANPA type - alphanumeric plus dash 	
	 d 	 Decimal numbers plus optional dot, commas 	
	 m 	 Money plus optional dot, commas, plus, minus and dollar 	
	 n 	 Numeric 0,1,2,3,4,5,6,7,8,9 	
	 t 	 Punctuation (!"\043$%^&*()_+=-:;@'~#<>,. | \) 	
	 s 	 Space (space, tab, ff, cr, nl but NOT backspace) 	
	 p 	 Printable (including space) 	
	 c 	 Control

Filters on negatives are also valid using a ‘#’ before the type eg.

	partial:ZE	SN,,,#c

Will give all the characters which are NOT control characters. This will ONLY allow characters matching this type.

For alphabetic characters, you may also force the case :

	 u 	 uppercase 	
	 l 	 lowercase
	partial:QE	SN,1,6,#sL

Will give the first 6 non-space characters, forced to lower case.

Start character is the character that marks the start of valid data.

End character marks the end of the string: the string stops when this character is encountered.


REPEAT – create a new FipHdr field from a subset of an existing FipHdr field:

  • based on a field delimiter like a dash :
repeat:QR   XK,-,3

or

  • based on a type like NOT a number or a SPACE
repeat:QR   XK,,3,U,N

repeat in more detail :

  • To get the last field – or a field from the rightmost – use a negative field number :
; .. in W$:http://mighty-ducks.com/super-kwak/legs/tails/123456.html
repeat:AB W$,/,-1
  • Syntax – there are three different Repeats styles:
1. with a particular chr as separator :
    repeat:QR   XK,-,3

find field XK, using ‘-’ as sep, find the 3rd sub-field

Use this for fields like :

    RR:engine-train-coach-seat-trolley
2. using a type of chr (or NOT a type of chr)

in this case the sep is left blank, and valid types are

	 u - uppercase  	
	 l - lowercase  	
	 x - alphanumeric  	
	 e - alphanumeric plus dot  	
	 t - punctuation  	
	 s - space (space, tab, ff etc)  	
	 n - number  	
	 p - printable  	
	 c - control
    repeat:QR   XK,,3,U

To negate, use the ‘#’ before the type

    repeat:QR   XK,,3,#X

Ie QR is the 3rd field in the K field of SH separated by a non-alphanumeric chr

Use this for fields like :

    repeat:QR   RR,,5,S
    RR:a319 Boeing-747 TigerMoth MothsBros *&^%$ BingBang
3. There is actually a third type of REPEAT code for use with OPTION where it is possible to output all codes :

In IPEDSYS, the keyword ‘before’ adds a String at the top of the data file :

   before:\n\ZU\QU\n\$O\n

; add our option/repeat with a star ..
; if Fip Hdr field \ZU is specified,
; .. check that field \QU has at least 1 chr in it
; .. if there is, output to the \$O flag and then start again with a test for the next field
; .. if there is nothing, strip until the \$O.
option:ZU QU,1
; Hdr Field QU is really EACH sub-field of \NU separated by a comma.
repeat:QU NU+,+*

So if Fip Hdr field NU is :

NU:HOK,DDD,NNR,MSV,SMF,XXX

The top of the file will have :

<MODVER ID="1:00" VER="03"><BR>
<DEST ID="1:05">HOK</DEST><BR>
<DEST ID="1:05">DDD</DEST><BR>
<DEST ID="1:05">NNR</DEST><BR>
<DEST ID="1:05">MSV</DEST><BR>
<DEST ID="1:05">SMF</DEST><BR>
<DEST ID="1:05">XXX</DEST><BR>
<FIFO ID="1:20" FFNO="25"><BR>

… and the rest of the data
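
As a rough illustration only, the delimiter-based REPEAT described above behaves much like splitting a string and picking one piece. The Python sketch below is not FipSeq and not FingerPost code: the function name `repeat_field` is hypothetical, and the handling of a missing sub-field (an empty result) is an assumption.

```python
def repeat_field(value, sep, index):
    """Return the index-th sub-field of value (counted from 1).

    A negative index counts from the rightmost sub-field, so -1 is the last one.
    """
    parts = value.split(sep)
    if index > 0:
        pos = index - 1
    elif index < 0:
        pos = len(parts) + index
    else:
        return ""  # assumption: field number 0 selects nothing
    if 0 <= pos < len(parts):
        return parts[pos]
    return ""  # assumption: a missing sub-field gives an empty result


# Equivalent in spirit to: repeat:QR   XK,-,3
print(repeat_field("engine-train-coach-seat-trolley", "-", 3))  # coach

# Equivalent in spirit to: repeat:AB W$,/,-1
url = "http://mighty-ducks.com/super-kwak/legs/tails/123456.html"
print(repeat_field(url, "/", -1))  # 123456.html
```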


COMBINATIONS – combie – create a field which is either the contents of another field, or if that does not exist, the contents of a second, or if not there a default string :

combie:QZ   ep|na,(0000000)a

ie

  • Stuff QZ with the contents of the EP header field
  • if EP does not exist or has no data, then use the NA field
  • if NA does not exist or has no data, then use the fixed text ‘(0000000)a’.

*combie in more detail*

  Syntax  combie: [newfield]   [tab/space]
    [existing field1] [|] [existing field2]
    [opt comma] [opt default fixed text]

PIPE – ‘|’ separates the HdrFlds to try, in order. The first comma means that the rest of the data is fixed text to use as a default.

Note that if a field is present BUT has no data, it is considered NOT present. eg

Hdr field BB: ‘BB:’ will fail, ‘BB:1’ will pass
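
The fallback rule can be pictured as "first field with data wins, else the default". The Python sketch below only illustrates that rule; the function name `combie` and the dictionary used to stand in for the FipHdr are assumptions made for the example.

```python
def combie(header, field_names, default=""):
    """Return the first listed field that has data, else the default text."""
    for name in field_names:
        value = header.get(name, "")
        if value:  # a field that is present but empty counts as NOT present
            return value
    return default


# EP is present but empty, so NA is used
print(combie({"EP": "", "NA": "north"}, ["EP", "NA"], "(0000000)a"))  # north

# neither field exists, so the fixed text is used
print(combie({}, ["EP", "NA"], "(0000000)a"))  # (0000000)a
```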


OPTION – test for an existing field in order to output some relevant FipSeq IF the condition is TRUE.

option:QT   ep,11,7,s

option in more detail (used in conjunction with the \$O flag):

ie If the EP header field exists, is at least 11 chrs long and has a non-space chr in the 7th position, send this text, else strip text until the \$O flag.

  Syntax  option: [newfield]   [tab/space]
    [newfield] [?] [existing field] [comma] [size]
    [opt comma] [opt posn of test chr]
    [opt comma] [opt 'S' to output data otherwise nothing is output]
    [opt comma] [opt String to test for]

where size is minimum size of field.

Normally this is just a TEST – ie nothing is output – but the ‘send’ parameter can be used to output the contents from the position specified IF THERE.

Note that both size and test are counters from 1 not 0.

A single chr can be tested to be non-space as in the example above.

If either the size or the test is FALSE, all text and subsequent data whether fixed or variable (including more Optionals) is ignored until the EndOpt flag is met – ‘\$O’ (oh, not zero).

How is this used ?

    option:JN   SN,10

    before:StoryName is \SN \JN..and it is 10 or more chrs long at \$h:\$n \$Owhile the ..

If the SN field is less than 10 chrs in length – say SN:lily – the ‘before’ string will output :

StoryName is lily while the ..

ie JN fails so everything (text, FipHdr fields, Sys Variables etc) is stripped up to the \$O

However

SN:bigbigbigbig

gives a ‘before’ string of :

StoryName is bigbigbigbig ..and it is 10 or more chrs long at 12:22 while the ..

Also see a more involved example at the end of the REPEAT section.

Taking four cases where an example field XB is :

  • 1. NOT present
  • 2. present with no data
  • 3. present with data
  • 4. present with data including a space at position 3

Specify

                         (no XB)  XB:    XB:111  XB:11 11
    option:QH XB         fail     fail   pass    pass
    option:QH XB,1       fail     fail   pass    pass
    option:QH XB,4       fail     fail   fail    pass
    option:QH XB,1,1     fail     fail   pass    pass
    option:QH XB,,1      fail     fail   pass    pass
    option:QH XB,,3      fail     fail   fail    fail
    option:QH XB,,4      fail     fail   fail    pass

In a further example we can test the content of a field

eg

option:QP   PR,,,,21

Here we do not care about size, position and output; only whether the field starts with the value ’21’.

Case (lower or upper) is always ignored. To test fields with spaces, special chrs etc, put double quotes around.

eg

option:PF   XF,,,,\023\021
option:PZ   XR,,,,"PASNAP \$d"

Note that the data is compared ONLY for the length of the string specified in the ‘option’. So if the ‘SS’ field contains “12345678”

    option:PS   SS,,,,1    - will be TRUE
    option:PS   SS,,,,12345678   - will be TRUE
    option:PS   SS,,,,123456789  - will be FALSE

*So to test for an EXACT string*, add a comma at the end of the field. (If you want a comma as part of the string, use FipSeq to encode it – \054 (octal).) If the string to test does NOT start at the first chr of the field, put the start posn in the ‘test’ field.

option:PS SS,,3,,34567

*Test a string is NOT equal* using the following syntax:

option:PS #SS,,,,12345

ie if SS is NOT 12345, PS will be valid
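
The string-test rules above (compare only for the length of the test string, ignore case, trailing comma for an exact match, optional start position, ‘#’ to negate) can be sketched in Python as follows. This is an illustration under assumptions, not the real implementation: the function name `option_string_test` and its parameters are hypothetical, and how an exact match combines with a start position is a guess.

```python
def option_string_test(value, test, start=1, negate=False):
    """Mimic the string test of 'option' against one field value."""
    exact = test.endswith(",")  # a trailing comma asks for an exact match
    if exact:
        test = test[:-1]
    candidate = value[start - 1:]  # positions count from 1, not 0
    if not exact:
        candidate = candidate[:len(test)]  # compare only the length of the test string
    matched = candidate.lower() == test.lower()  # case is always ignored
    return (not matched) if negate else matched


ss = "12345678"
print(option_string_test(ss, "1"))               # True
print(option_string_test(ss, "12345678"))        # True
print(option_string_test(ss, "123456789"))       # False
print(option_string_test(ss, "34567", start=3))  # True
print(option_string_test(ss, "12345,"))          # False, exact match requested
```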


STYLE – create a new FipHdr field from an existing field and either truncate or pad left or right.

style:QH   XM,%.02s
style in more detail

– This uses the C ‘printf’ which is nasty but a standard (of sorts).

– On a Unix machine Do a “man printf” for fuller information if you need.

Note it ALWAYS starts with a ‘%’

If the expression does not end with an ‘s’ (‘d’ for integer for example), then the string in the header field is first converted to that type.

Specify one and ONLY one expression (you can not have %s%d%f), as it takes the first only.

Do NOT use for fixed data – use keyword ‘fixed:’ (as explained above)

Types are :

	 string  	 s 	
	 char  	 c 	
	 long  	 d,i,o,p,u,x,X 	
	 float  	 f,e,E,g,G 	
	 %  	 print a % ! 	
	 type n is ignored ??

Examples

	 to trim a string, use a dot 	  %.5s 	
	 to pad a string with spaces 	  %5s 	
	 to pad a string with spaces (left justified) 	 %-5s 	
	 to pad a number with leading zeros 	 %.06d
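
Python's `%` operator follows the same printf conventions, so it can be used to preview what a STYLE expression is likely to produce. The sketch below is only an approximation: the function name `style` is hypothetical, the ‘c’ and ‘p’ types are not covered, and how the real program converts a non-numeric string to a number is not shown here.

```python
def style(value, fmt):
    """Format a header field value with a single printf-style expression."""
    conversion = fmt[-1]
    if conversion in "diouxX":
        return fmt % int(value)    # the string is first converted to an integer
    if conversion in "feEgG":
        return fmt % float(value)  # the string is first converted to a float
    return fmt % value             # 's': treat the value as a string


print(repr(style("abcdefgh", "%.5s")))  # 'abcde'
print(repr(style("ab", "%5s")))         # '   ab'
print(repr(style("ab", "%-5s")))        # 'ab   '
print(repr(style("42", "%.06d")))       # '000042'
```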

REPLACE – copy an existing field and replace some of the strings.

replace:LC  XC   SPO=s INL=i UDL=u ECO=f

replace in more detail

Copy the contents of one field into another and then search and replace characters or strings.

Syntax

    replace:(New FipHdr Fld) (spc/tab) (Existing FipHdr Fld)
       [optional ,(flags to ForceCase, encode etc)]
       (spc/tab)   (Search String) (punctuation) (Replace)

eg

replace:LC   XC   SPO=s INL=i UDL=u ECO=f

There can be up to 50 searches/replace tuplets on a single line.

If the ‘C’ flag is NOT specified (to Ignorecase), the Search String MUST always be in the right case. So always specify all combinations :

    SPO=s spo=s

The output can be forced upper or lowercase by specifying ‘U’ or ‘L’ after the existing field.

eg replace:LC XC,u SPO=s INL=i UDL=u ECO=f

Other flags can be set to :
– Url Encode and/or Url Decode the data : u or e
– use a ‘repeat character’ to strip multiple consecutive occurrences of the same chr : r=+
– use a single-character wild card : s=?
– use a start-of-block or end-of-block chr : b=|

  • ; eg this will strip all spaces from the start and end of the zone, replace any combination of Yuummmy by YUM and heXXo with hello
    replace:LC AB,s=?,r=+,b=| \s+|="" |\s+="" Yu+m+y=YUM he??o=hello
  • Strings with spaces and punctuation MUST be enclosed by double quotes. eg
    replace:K2 YR,u ".Y"=yellow ".C"=cyan ".M"=magenta ".K"=black
  • The search and replacement string may be in FipSeq.

    You must take care to only use small amounts of data – 1000 chrs max

    Pls note you must NEVER call a field from itself.
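
    For the simplest case (plain search=replace pairs with no wild cards, quotes or block characters), REPLACE can be pictured as a series of substitutions applied one after another. The Python sketch below rests on that assumption; the function name `replace_field` is hypothetical, and the order in which the real program applies the pairs is not confirmed by this page.

```python
def replace_field(value, pairs, force_case=None):
    """Apply space-separated search=replace pairs, then optionally force the case."""
    for pair in pairs.split():
        search, _, replacement = pair.partition("=")
        value = value.replace(search, replacement)  # case-sensitive, like no 'C' flag
    if force_case == "u":
        return value.upper()
    if force_case == "l":
        return value.lower()
    return value


pairs = "SPO=s INL=i UDL=u ECO=f"
print(replace_field("SPO", pairs))                  # s
print(replace_field("ECO", pairs, force_case="u"))  # F
print(replace_field("spo", pairs))                  # spo, wrong case so no match
```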


    NEWDATE – create a new, temporary FipHdr field containing date and time fields

       newdate:JS   min-38 hour+8 day+\MD   "\ZD-\ZI-\ZZ"

    newdate in more detail

    Create a new internal field with a date relative to when the program runs

    Syntax :

    newdate: (new FipHdr field)   (differences) (style)

    where differences is a series of keywords, each followed by + or – and a number. The keywords are

    • years
    • months
    • days
    • hours
    • mins
    • secs

    ie

    hours+3

    for plus 3 hours

    days-7

    for the same day a week ago

    If you need to use a date which is relative to a date/time which is NOT the current date, put

    'basedate=YYYYMMDDHHNNBB'

    There are no spaces between the number and the hours etc.

    Several differences can be specified, each separated by a SPC or TAB.

    The contents of ANOTHER fiphdr field may be used for the number :

    hours+\JR

    Only the first (or for ‘m’ first two) letters are considered.

    The default is a PLUS sign meaning in the future.

    ie

    'h3'

    is the same as

    'hours+3'

    The Style MUST be in double quotes and contain any FipSeq. Relevant header fields are

    	 ZD  	 2 digit day of month (leading space) 	
    	 ZG  	 2 digit day of month (leading zero) 	
    	 ZM  	 2 digit month 	
    	 ZY  	 2 digit Year 92 	
    	 ZZ  	 4 digit Year 1992 	
    	 ZW  	 Day of week as in Monday, Tuesday etc 	
    	 ZS  	 3 chr Day of week as in Mon, Tue etc 	
    	 ZN  	 Month as in January, February 	
    	 ZT  	 3 chr Month as in Jan, Feb, Mar etc 	
    	 ZJ  	 Julian day of year 	
    	 ZH  	 Hour 00-23 	
    	 ZI  	 Hour 00-12 	
    	 ZF  	 Minute 00-59 	
    	 ZE  	 Second 00-59 	
    	 ZU  	 1st, 2nd, 3rd, 24th for the day of the month 	
    	 plus ZA, ZB, ZC 	 for Week of year x2 and d.o.w. and ZP for AM/PM

    Note that actual Day and Month names depend on your LOCALE

    Default is

    "\ZW, \ZD \ZN \ZZ"

    Example

    newdate:JS      min-38 hour+8 day+\MD
    "xx\ZD-\ZS-\ZZ or \ZY \ZH:\ZFxx"

    Where MD is a field in the incoming file.

    Gives a result like

    xx01-Sat-2000 or 00 00:17xx
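
    The way the differences accumulate can be sketched in Python with `datetime` and `timedelta`. This is a simplified illustration, not the real parser: the function name `apply_differences` is hypothetical, years and months are deliberately left out (they are not fixed-length intervals), and a fixed base date is passed in so the result is repeatable.

```python
import re
from datetime import datetime, timedelta

# Only the first letter (two for 'm') of each keyword is considered.
UNITS = {"d": "days", "h": "hours", "mi": "minutes", "s": "seconds"}


def apply_differences(base, differences):
    """Apply keywords such as 'min-38 hour+8 day+2' or 'h3' to a base datetime."""
    result = base
    for token in differences.split():
        match = re.fullmatch(r"([A-Za-z]+)([+-]?)(\d+)", token)
        if match is None:
            raise ValueError("cannot parse difference: " + token)
        word, sign, amount = match.groups()
        word = word.lower()
        key = word[:2] if word.startswith("m") else word[:1]
        if key not in UNITS:
            raise ValueError("keyword not handled in this sketch: " + token)
        delta = timedelta(**{UNITS[key]: int(amount)})
        # no sign means PLUS, ie in the future
        result = result - delta if sign == "-" else result + delta
    return result


base = datetime(2000, 1, 1, 12, 0, 0)
print(apply_differences(base, "min-38 hour+8 day+2"))  # 2000-01-03 19:22:00
print(apply_differences(base, "h3"))                   # 2000-01-01 15:00:00
```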

    UNIQUE – create a new FipHdr with the contents of another where the words are unique and separated by a single space or other character. Use this for controlling Metadata, routing codes and Stox tickers.

    unique in more detail

    Control the contents of a single field so that :

    • each element is unique
    • each element is separated by one and only one separator chr
      unique:ZZ   F1,	
    
      unique:(new FipHdr Fld) (space) (old FipHdr Field)
        [(opt) (comma) (separator) ]
        [(opt) (comma) (force Upper or Lower case) ]

    The Separator defaults to a plus sign ‘+’. eg

    unique:Q1   XC,=

    So if XC (ie the C field of SH) was :

    SH:N1234:Sabc:Ccat1 cat1 cat3 cat2 cat3   cat4:P3

    Q1 now becomes

    Q1:cat1=cat2=cat3=cat4
    • The individual elements are considered different if a space, plus ‘+’ or comma ‘,’ is found between them.
    • Multiple spaces etc are ignored.
    • If you want the Sep to be a comma, use another punctuation chr to separate the parameters eg unique:Q1 XC+,

    VALID – create a new FipHdr by checking the contents of another against a standing list of valid entries.

    If there is no match, the first entry is used as the default.

    valid in more detail

      valid:(new FipHdr Fld) (space) (old FipHdr Field)
        [(opt) (comma) (force Upper or Lower case) ]
        (list of values using one or more spaces or tabs as separator)

    If any of the values contain a space or comma, it must be inside double quotes.

    eg

    valid:Q1   XP   4 1 2 3 5

    Create Q1 from XP which must be a value of 1 to 5 and defaults to 4
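
    The rule "use the value if it is in the list, else the first entry" is small enough to show directly. In this Python sketch the function name `valid` is hypothetical and case forcing is left out.

```python
def valid(value, allowed):
    """Return value if it is one of the allowed entries, else the first entry."""
    return value if value in allowed else allowed[0]


allowed = ["4", "1", "2", "3", "5"]  # as in: valid:Q1   XP   4 1 2 3 5
print(valid("2", allowed))  # 2
print(valid("9", allowed))  # 4
print(valid("", allowed))   # 4
```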


    LOOKUP – create a new FipHdr by matching the contents of another against a standing lookup file.

    lookup in more detail

    Note this is for simple, small lookup files NOT massive tables of thousands of entries.

    • Use fipseq REPLACE for a small number of fixed entries – say up to 30.
    • Use fipSeq LOOKUP for between 30 and 100 entries (or if the file needs to be maintained externally). Use program IPLOOKUP for anything else.
    lookup:(new FipHdr Fld) (space) (old FipHdr Field) file:(filename) [(opt) sep:(field separator in FipSeq) ]

    ???ult 1 for 1st)

          $ value-field: (field number of the data or value field - starting at 1 - default 2 for 2nd) 
          $ default-value: (FipSeq - default - none) 
          $ separator: (Fipseq chr - default is '	') 
          $ type-format: (filetype - CSV, TAB-sep, Fixed width - default is blank for TEXT) 
          $ comment-line: (character which signifies a comment line - default is ';') 
          $ numeric-key: signifies the key is numeric. 
          $ sorted: yes/no is the file sorted ? 
          $ case-sensitive-key: yes/no ie Cardiff or cardiff
          $ allow-spaces: normally leading and trailing spaces in both the key and searched-for zone are removed before testing.

    The old field will have the key to search for.

    eg

    lookup:Y1 YP file:Categories sep:	 comment:# key:3 value:5

    Create Y1 from the entry in file /fip/tables/setup/Categories which matches the contents of YP where the key is in field 3 and the value to use is in field 5.

    If using Fixed width fields, the width of each field follows the ‘F’ in type-format. eg

    lookup:Y2 SU file:services key:1 value:4 type-format:F,6,6,2,2,3

    Create Y2 from the entry in file /fip/tables/setup/services which matches the contents of SU where the key is in field 1 (which are characters 1 to 6) and the value to use is in field 4 (characters 15 and 16) .


    PERL – create a new FipHdr from running a Perl Regular expression

    • Take care that the syntax is correct and does not have an infinite loop!
    • Take care to respect Perl quotes – especially for Winnt where for some bizarre reason we need two double quotes around it !

    eg for Winnt

       perl:Z4 ""@hoho = 29 * \F4;""

    *Note* that all ‘\’ should be doubled as the RegExp string is considered to be FipSeq, so other fields may be included.

    The RegExp MUST be on a single line and you are advised to use a ‘print’ to display the result you want.

    Any NewLines and/or Carriage Returns are mapped to Spaces.

      perl:(new FipHdr Fld) (space) (Perl RegExp)

    eg

    perl:E1 $seqno = "\VD"; $bigSeq = 24*$seqno; print "Seqno is $bigSeq";

    Create E1 with (if VD:3) “Seqno is 72”.


    MERGE – create a new FipHdr by merging all the occurrences of another field

    – create a new FipHdr from one or more occurrences of another field. Normally, if a field is repeated, only the last field is accessible, which allows you to overwrite entries quite happily.

    But there are a few occasions where in fact you want the first, or second, or maybe all of them concatenated together, in one long string. This could be when you run ‘ipxchg’ against another formatted file which has duplicated entries for some fields.

      merge:(new FipHdr Fld) (space) (old FipHdr Fld)
       [(optional) , Separator]
       [(optional) , First-occurrence]
       [(optional) , NumberOfTimes]

    In the above syntax the comma may be replaced by any other punctuation except ‘$’.

    • The Separator is to divide two or more fields; it can be in FipSeq and can be a TAB (\t), SPC (\s) or any printable character.
    • The First-occurrence is counted from the beginning or top of the FipHdr.
    • The NumberOfTimes is the number of fields to take into account.

    eg if the FipHdr had :

       AB:blue
       AB:green
       AB:red
    
       merge:BC   AB,+

    gives a new Fiphdr field of BC:blue+green+red
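
    A minimal Python sketch of the same idea, assuming the FipHdr is available as a list of "XX:value" lines. The function name `merge` and its parameters are hypothetical; the separator is passed explicitly because its default is not stated above.

```python
def merge(header_lines, name, sep, first=1, count=None):
    """Join the values of every occurrence of one field, in header order."""
    prefix = name + ":"
    values = [line[len(prefix):] for line in header_lines if line.startswith(prefix)]
    selected = values[first - 1:]  # First-occurrence counts from 1
    if count is not None:
        selected = selected[:count]  # NumberOfTimes limits how many are used
    return sep.join(selected)


header = ["AB:blue", "SN:dpa6722", "AB:green", "AB:red"]
print(merge(header, "AB", "+"))                    # blue+green+red
print(merge(header, "AB", "+", first=2, count=1))  # green
```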

    SUM

    – create a new FipHdr from the result of a calculation and/or output in decimal, hex, octal, base36 or base64

    sum:(new FipHdr Fld) [: optional flags] (space) (the calculation)

    The calculation is in FipSeq, so it can include other FipHdr fields. Calculations can NOT be split over several lines. The default precision for a calculation is 2 decimal places. This can be overridden using the syntax : sum:AB 100*(\Q1/\Q2)

    Flags can be

    • (digit from 0-6) precision ie no of decimal places (default is 2)
    • R – round to nearest decimal (default is NO rounding)
    • O – output in Octal (default is decimal)
    • H – output in Hex (default is decimal)
    • B – output in Base36 (default is decimal)

    Operators can be :
      • + plus
      • - minus
      • * multiply
      • / divide
      • % mod

      Take care when dividing by zero ! so use a ‘COMBIE’ and ‘PARTIAL’ to make sure

      Use round brackets to denote how the calculation should be worked out. Ie the deepest level is calculated first, and then the calculation is worked from left to right.

      For example : (\Q1/\Q2)*(\Q3/(\Q5+\Q4))/100
      will run through the following order :

      • Step 1 (\Q5+\Q4)
      • Step 2 (\Q1/\Q2)
      • Step 3 (\Q3/result of Step 1)
      • Step 4 (result of Step 2 * result of Step 3)
      • Step 5 (result of Step 4 / 100)
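
      The five steps above can be traced with ordinary arithmetic. In this Python sketch the field values are invented for the example, the function names are hypothetical, and "no rounding" is assumed to mean that extra decimal places are simply cut off.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def evaluate(q1, q2, q3, q4, q5):
    """Work through (Q1/Q2)*(Q3/(Q5+Q4))/100 in the order listed above."""
    step1 = q5 + q4        # deepest bracket first
    step2 = q1 / q2
    step3 = q3 / step1
    step4 = step2 * step3
    step5 = step4 / 100
    return step5


def format_result(value, precision=2, rounding=False):
    """Cut (or round) a result to the given number of decimal places."""
    quantum = Decimal(1).scaleb(-precision)  # precision 2 gives 0.01
    mode = ROUND_HALF_UP if rounding else ROUND_DOWN
    return str(Decimal(str(value)).quantize(quantum, rounding=mode))


result = evaluate(10, 4, 9, 1, 2)            # 0.075
print(format_result(result))                 # 0.07
print(format_result(result, rounding=True))  # 0.08
print(format_result(100 * (1 / 3)))          # 33.33
```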

      FILTER

      – create a new FipHdr by filtering the contents of another against a mask or a list of valid entries.

      filter:(new FipHdr Fld) [: optional flags] (space) (existing FipHdr field) (list and/or mask)

    • Entries may be dbl quoted if spaces are needed.
    • Flags are specified in order – first is a single wild character (? in the example below); second is a wild string
                ; allow any value starting MCC: with one or three, ..
                ; .. but NOT two, trailing chrs
                filter:P1       TR,?    MCC:? MCC:???
                filter:P2       AC	Sports News Business


      Notes on FipSeq generated header fields

      • Note these fields are ONLY for use internally to that one program and are NOT added to the existing FipHdr.
      • They can be used in exactly the same way as any other field.
      • If an existing field has the same name as a newly created fixed/partial/repeat/option/style, the OLD is ignored and the NEWLY created one used.
      • So please make sure you do NOT use existing field names with valid data in them as they will NOT be accessible. To be sure, use fields starting Qx, Nx, Jx where x is A-Z, 0-9


      The new fields can be linked together up to 20 times or levels. Example of a snippet of parameter file for IPEDSYS to generate a consistent filename which can be sorted easily on the day of the year :

      Let's have a FipHdr containing 3 normal fields : SN for storyname, HS for History and SH for the Source Header

         SN:dpa6722
         HS:wire_9090_99-6-20_13:15:33_5_070
         SH::Sdpa:N6722:P1:CPIG:KPigs-In-Space

      ..and the relevant part of the IPEDSYS parameter file ..

         ; create a new FipHdr field called QJ using REPEAT
         ;   Get the Day of the year from the HS field (which is the 6th field)
         ; create a new FipHdr field called QR using COMBIE
         ;   this will be the same as QJ or 0 if either HS did not exist or there was no 6th field
         ; create a new FipHdr field called QE using STYLE
         ;   which will force it to 3 digits padded zero.
         ;
         ; Use QE to create a new FipHdr filename for the output file which will be :
         ;   3 digit Julian date, Storyname, Category code (C field in SH) and Keyword (K in SH)
         repeat:QJ   HS,_,6
         combie:QR   QJ,0
         style:QE   QR,%.03d
      
         ; Filename
         name:\QE.\SN.\XC.\XK

      For our example FipHdr, this would generate a filename of :

         070.dpa6722.PIG.Pigs-In-Space
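
      To make the chain of REPEAT, COMBIE and STYLE concrete, the Python sketch below walks the same example header through equivalent steps. It is an illustration only: the function names are hypothetical, and the way the C and K sub-fields are pulled out of SH (split on ‘:’ and match the first letter) is an assumption based on the example header.

```python
def parse_header(lines):
    """Turn 'XX:value' lines into a dictionary keyed on the field name."""
    header = {}
    for line in lines:
        name, _, value = line.partition(":")
        header[name] = value
    return header


def sh_subfield(sh_value, letter):
    """Return the SH sub-field that starts with the given letter (XC, XK ...)."""
    for part in sh_value.split(":"):
        if part.startswith(letter):
            return part[1:]
    return ""


def build_name(lines):
    header = parse_header(lines)
    parts = header.get("HS", "").split("_")
    qj = parts[5] if len(parts) >= 6 else ""  # repeat:QJ   HS,_,6
    qr = qj or "0"                            # combie:QR   QJ,0
    qe = "%.03d" % int(qr)                    # style:QE    QR,%.03d
    xc = sh_subfield(header.get("SH", ""), "C")
    xk = sh_subfield(header.get("SH", ""), "K")
    return ".".join([qe, header.get("SN", ""), xc, xk])  # name:\QE.\SN.\XC.\XK


example = [
    "SN:dpa6722",
    "HS:wire_9090_99-6-20_13:15:33_5_070",
    "SH::Sdpa:N6722:P1:CPIG:KPigs-In-Space",
]
print(build_name(example))  # 070.dpa6722.PIG.Pigs-In-Space
```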

      © FingerPost Ltd. 2013 and years before


      Notes and Examples

      A simple example using lookup.

      Although lookup can be used to lookup data in complex multi-field tables, it is generally used to look things up in simple two column tables – perhaps a two column pipe delimited file that looks like this basic file mapping section names between a Sunday and a Daily publication:

      /fip/tables/setup/SECTIONS:

      txm	Sunday
      tgm	Daily
      tpp	Sunday
      tgs	Daily

      In this case all we want to do is lookup the first three characters of a given filename and generate a new header field containing the string “Daily” or “Sunday”.

      The syntax to do this in a program like ipedsys or ipftp would therefore be …

      ; Now - split the first three characters off the front of the name, and
      ; use them to lookup whether the file is Daily or Sunday ...
      ; uses lookup file /fip/tables/setup/SECTIONS
      
      ; make QN the first three characters of the filename
      partial:QN      SN,,3
      
      ; lookup in our lovely lookup file
      lookup:QL       QN      file:/fip/tables/setup/SECTIONS default-value:Daily

      ; you can then use QL in your output – say in ipftp to change directory before delivering the file

      ; change directory first
      ftpbeffile:cwd /guardian/\QL
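
      The effect of the partial/lookup pair above can be mimicked in a few lines of Python. This sketch assumes a tab-separated two-column table held in a string rather than read from /fip/tables/setup/SECTIONS; the function names `load_table` and `section_for` are hypothetical.

```python
SECTIONS = "txm\tSunday\ntgm\tDaily\ntpp\tSunday\ntgs\tDaily\n"


def load_table(text, sep="\t", comment=";"):
    """Read key/value pairs from a small separated table, skipping comment lines."""
    table = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(comment):
            continue
        fields = line.split(sep)
        if len(fields) >= 2:
            table[fields[0].strip()] = fields[1].strip()
    return table


def section_for(story_name, table, default="Daily"):
    """Look up the first three characters of the name; fall back to the default."""
    return table.get(story_name[:3], default)


table = load_table(SECTIONS)
print(section_for("txm0412", table))  # Sunday
print(section_for("abc0999", table))  # Daily
```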