Fsdb(3)               User Contributed Perl Documentation              Fsdb(3)

NAME
       Fsdb - a flat-text database for shell scripting

SYNOPSIS
       Fsdb, the flatfile streaming database, is a package of commands for
       manipulating flat-ASCII databases from shell scripts.  Fsdb is useful
       to process medium amounts of data (with very little data you'd do it by
       hand, with megabytes you might want a real database).  Fsdb was known
       as Jdb from 1991 to Oct. 2008.

       Fsdb is very good at doing things like:

       +o   extracting measurements from experimental output

       +o   examining data to address different hypotheses

       +o   joining data from different experiments

       +o   eliminating/detecting outliers

       +o   computing   statistics   on   data   (mean,  confidence  intervals,
           correlations, histograms)

       +o   reformatting data for graphing programs

       Fsdb is built around the idea of a flat text file as a database.   Fsdb
       files  (by  convention,  with  the  extension  .fsdb),  have  a  header
       documenting the schema (what the columns  mean),  and  then  each  line
       represents a database record (or row).

       For example:

               #fsdb experiment duration
               ufs_mab_sys 37.2
               ufs_mab_sys 37.3
               ufs_rcp_real 264.5
               ufs_rcp_real 277.9

       This is a simple file with four experiments (the rows), each with a
       description and a run time in the first and second columns.

       Rather  than  hand-code  scripts to do each special case, Fsdb provides
       higher-level functions.  Although it's often easy to throw together a
       custom  script  to do any single task, I believe that there are several
       advantages to using Fsdb:

       +o   these programs provide a higher level interface than plain Perl, so

           **  Fewer lines of simpler code:

                   dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration

               Picks out just one type of experiment and  computes  statistics
               on it, rather than:

                   while (<>) { split; $sum+=$F[1]; $ss+=$F[1]**2; $n++; }
                   $mean = $sum / $n; $std_dev = ...

               in dozens of places.

       +o   the library uses names for columns, so

           **  No more $F[1], use "_duration".

           **  New or different order columns?  No changes to your scripts!

           Thus   if  your  experiment  gets  more  complicated  with  a  size
           parameter, so your log changes to:

                   #fsdb experiment size duration
                   ufs_mab_sys 1024 37.2
                   ufs_mab_sys 1024 37.3
                   ufs_rcp_real 1024 264.5
                   ufs_rcp_real 1024 277.9
                   ufs_mab_sys 2048 45.3
                   ufs_mab_sys 2048 44.2

           Then the previous scripts still work, even though duration  is  now
           the third column, not the second.

       +o   A series of actions is self-documenting (the provenance of
           processing done to produce each output is recorded in comments).

           **  No more wondering what hacks were used  to  compute  the  final
               data, just look at the comments at the end of the output.

           For example, the commands

               dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration

           add to the end of the output the lines
               #    | dbrow _experiment eq "ufs_mab_sys"
               #    | dbcolstats duration

       +o   The library is mature, supporting large datasets (more than 100GB),
           parallelism,  corner  cases, error handling, backed by an automated
           test suite.

           **  No more puzzling about bad output because  your  custom  script
               skimped on error checking.

           **  No  more  memory  thrashing  when  you  try to sort ten million
               records.

           **  Makes use of multiple cores  in  your  computer  when  it  can,
               because  each  pipeline component runs in parallel, and because
               key tools (dbsort, dbmapreduce) run in parallel when possible.

       +o   Fsdb-2.x supports Perl scripting (in addition to shell  scripting),
           with  libraries  to  do Fsdb input and output, and easy support for
           pipelines.  The shell script

               dbcol name test1 | dbroweval '_test1 += 5;'

           can be written in perl as:

               dbpipeline(dbcol(qw(name test1)), dbroweval('_test1 += 5;'));

       (The disadvantage is  that  you  need  to  learn  what  functions  Fsdb
       provides.)
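
       To make the benefit of named columns concrete, here is a minimal
       sketch in plain POSIX shell and awk.  It is not Fsdb itself: the
       helper name fsdb_mean is hypothetical, and it assumes the default
       whitespace separator and a header with no options.  It looks columns
       up by name from the "#fsdb" header, so the same call works on excerpts
       of both the two-column and the three-column logs shown above.

```sh
# Sketch only: emulates the mean reported by
#   dbrow '_experiment eq "ufs_mab_sys"' | dbcolstats duration
# fsdb_mean is a hypothetical helper, not part of Fsdb.
# usage: fsdb_mean KEYCOL KEYVALUE VALCOL < file.fsdb
fsdb_mean() {
    awk -v keycol="$1" -v keyval="$2" -v valcol="$3" '
    NR == 1 {
        # Header: "#fsdb name1 name2 ..."; map each name to its position.
        for (i = 2; i <= NF; i++) pos[$i] = i - 1
        next
    }
    /^#/ { next }                       # skip comment lines
    $(pos[keycol]) == keyval {          # row selection by named column
        sum += $(pos[valcol]); n++
    }
    END { if (n > 0) printf "%.2f\n", sum / n }'
}

two_col='#fsdb experiment duration
ufs_mab_sys 37.2
ufs_mab_sys 37.3
ufs_rcp_real 264.5
ufs_rcp_real 277.9'

three_col='#fsdb experiment size duration
ufs_mab_sys 1024 37.2
ufs_mab_sys 1024 37.3
ufs_rcp_real 1024 264.5
ufs_rcp_real 1024 277.9'

# Both calls print 37.25, although duration moved from column 2 to column 3.
printf '%s\n' "$two_col"   | fsdb_mean experiment ufs_mab_sys duration
printf '%s\n' "$three_col" | fsdb_mean experiment ufs_mab_sys duration
```

       The real tools do considerably more (error checking, provenance
       comments, other separators); the sketch only shows why a header-driven
       lookup survives a change in column order.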

       Fsdb  is built on flat-ASCII databases.  By storing data in simple text
       files and processing it with pipelines it is easy to experiment (in the
       shell) and look at the output.   To  the  best  of  my  knowledge,  the
       original  implementation  of this idea was "/rdb", a commercial product
       described in the book UNIX relational database management: application
       development in the UNIX environment by Rod Manis, Evan Schaffer, and
       Robert Jorgensen (1988 by Prentice Hall,  and  also  at  the  web  page
       <http://www.rdb.com/>).   Fsdb  is an incompatible re-implementation of
       their idea without any accelerated indexing  or  forms  support.   (But
       it's free, and probably has better statistics!).

       Fsdb-2.x  will exploit multiple processors or cores, and provides Perl-
       level support  for  input,  output,  and  threaded-pipelines.   (As  of
       Fsdb-2.44  it no longer uses Perl threading, just processes, since they
       are faster.)

       Installation instructions follow at the end of this document.  Fsdb-2.x
       requires Perl 5.8 to run.  All commands have manual pages  and  provide
       usage  with  the  "--help"  option.   All  commands  are  backed  by an
       automated test suite.

       The  most  recent  version  of  Fsdb  is  available  on  the   web   at
       <http://www.isi.edu/~johnh/SOFTWARE/FSDB/index.html>.

WHAT'S NEW
   3.4, tbd tbd
       ENHANCEMENT
           dbcolsdecimate now has examples in its documentation.

       BUG FIX
           dbcolstats, dbmapreduce, dbcolpercentile, dbfilepivot, and
           dbmultistats now correctly propagate the temporary directory into
           the sort routine, if required.  (Previously, while they collected
           tmpdir, they did not propagate it.  This problem only applied if
           n-tiles were requested.)

README CONTENTS
       executive summary
       what's new
       README CONTENTS
       installation
       basic data format
       basic data manipulation
       list of commands
       another example
       a gradebook example
       a password example
       history
       related work
       release notes
       copyright
       comments

INSTALLATION
       Fsdb now uses the standard Perl build and installation from
       ExtUtils::MakeMaker(3), so the quick answer to installation is to type:

           perl Makefile.PL
           make
           make test
           sudo make install

       Or, if you want to install it somewhere else, change the first line to

           perl Makefile.PL PREFIX=$HOME

       then run the other commands ("make; make test; make install"; but now
       without the sudo), and it will go in your home directory's bin, etc.
       (See ExtUtils::MakeMaker(3) for more details.)

       Fsdb requires perl 5.8 or later.

       A test suite is available; run it with

           make test

       In the past, ports existed for FreeBSD and MacOS.  If someone
       running one of those OSes wants to contribute a new port, please let me
       know.

BASIC DATA FORMAT
       These programs are based on the idea of storing data in simple ASCII
       files.   A  database  is  a  file with one header line and then data or
       comment lines.  For example:

               #fsdb account passwd uid gid fullname homedir shell
               johnh * 2274 134 John_Heidemann /home/johnh /bin/bash
               greg * 2275 134 Greg_Johnson /home/greg /bin/bash
               root * 0 0 Root /root /bin/bash
               # this is a simple database

       The header line must be first and begins with "#fsdb".  There are  rows
       (records)  and  columns  (fields),  just  like  in  a  normal database.
       Comment lines  begin  with  "#".   Column  names  are  any  string  not
       containing spaces or single quotes (although it is prudent to keep them
       alphanumeric with underscore).

       Columns can optionally include type annotations by following the name
       with :t where t is some type.  (Types are not used in Perl, but are
       relevant in Python and Go Fsdb bindings.)  Types use a subset of perl
       pack specifiers: c, s, l, q are signed 8, 16, 32, and 64-bit integers,
       f is a float, d is double float, a is utf-8 string, and > and < can
       force big or little endianness.

       By default, columns are delimited by any amount  of  whitespace.   With
       this  default  configuration,  the  contents  of a field cannot contain
       whitespace.  However, this limitation can be relaxed  by  changing  the
       field separator as described below.

       The  big  advantage of simple flat-text databases is that it is usually
       easy to massage data into this format, and it's reasonably easy to take
       data out of this format into other (text-based) programs, like gnuplot,
       jgraph, and LaTeX.  Think Unix.  Think pipes.  (Or even output to Excel
       and HTML if you prefer.)

       Since no-whitespace in columns was a  problem  for  some  applications,
       there's  an  option which relaxes this rule.  You can specify the field
       separator in the table header with "-F x" where "x" is a code  for  the
       new field separator.  A full list of codes is at dbfilealter(1), but
       two common special values are "-F t" which is a separator of a single
       tab character, and "-F S", a separator of two spaces.  Both allow
       (single) spaces in fields.  An example:

               #fsdb -F S account passwd uid gid fullname homedir shell
               johnh  *  2274  134  John Heidemann  /home/johnh  /bin/bash
               greg  *  2275  134  Greg Johnson  /home/greg  /bin/bash
               root  *  0  0  Root  /root  /bin/bash
               # this is a simple database

       See dbfilealter(1) for more details.  Regardless of what the column
       separator  is  for  the body of the data, it's always whitespace in the
       header.
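
       As a rough illustration of how a reader of this format might honor
       the separator option, the following sketch (plain POSIX shell and awk,
       not Fsdb code) parses the header with whitespace and then splits the
       body according to "-F S" or "-F t".  The helper name fsdb_col is
       hypothetical, and the other separator codes listed in dbfilealter(1)
       are deliberately not handled.

```sh
# Sketch only: print one named column, honoring "-F S" and "-F t" in the
# header. fsdb_col is a hypothetical helper; other -F codes are not handled.
# usage: fsdb_col COLUMN < file.fsdb
fsdb_col() {
    awk -v want="$1" '
    NR == 1 {
        # The header itself is always whitespace-separated.
        n = split($0, h, " ")
        sep = " "                 # awk default: any run of whitespace
        first = 2                 # index of the first column name
        if (h[2] == "-F") {
            if (h[3] == "S") sep = "  "        # two spaces
            else if (h[3] == "t") sep = "\t"   # a single tab
            first = 4
        }
        for (i = first; i <= n; i++) pos[h[i]] = i - first + 1
        if (!(want in pos)) exit 1             # unknown column name
        next
    }
    /^#/ { next }                 # skip comment lines
    {
        split($0, f, sep)
        print f[pos[want]]
    }'
}

# Prints "John Heidemann" and "Greg Johnson", single spaces intact.
printf '%s\n' \
    '#fsdb -F S account passwd uid gid fullname homedir shell' \
    'johnh  *  2274  134  John Heidemann  /home/johnh  /bin/bash' \
    'greg  *  2275  134  Greg Johnson  /home/greg  /bin/bash' \
    '# this is a simple database' |
    fsdb_col fullname
```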

       There's also a third format: a "list".  Because it's often hard to see
       what's in columns past the first two, in list format each "column" is
       on a separate line.  The programs dblistize and dbcolize convert to and
       from this format, and all programs work with either format.  The command

           dbfilealter -R C  < DATA/passwd.fsdb

       outputs:

               #fsdb -R C account passwd uid gid fullname homedir shell
               account:  johnh
               passwd:   *
               uid:      2274
               gid:      134
               fullname: John_Heidemann
               homedir:  /home/johnh
               shell:    /bin/bash

               account:  greg
               passwd:   *
               uid:      2275
               gid:      134
               fullname: Greg_Johnson
               homedir:  /home/greg
               shell:    /bin/bash

               account:  root
               passwd:   *
               uid:      0
               gid:      0
               fullname: Root
               homedir:  /root
               shell:    /bin/bash

               # this is a simple database
               #  | dblistize

       See dbfilealter(1) for more details.
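
       The conversion to list format is simple enough to sketch in a few
       lines of awk.  This is an illustration of the layout only, under the
       assumption of default whitespace-separated input; the helper name
       fsdb_listize is hypothetical, and unlike the real tools it does not
       pad values into an aligned column or append a provenance comment.

```sh
# Sketch only: turn default (whitespace-separated) rows into list form.
# fsdb_listize is a hypothetical helper, not part of Fsdb.
fsdb_listize() {
    awk '
    NR == 1 {
        ncol = NF - 1
        for (i = 2; i <= NF; i++) name[i - 1] = $i
        # Re-emit the header with the list-format option added.
        sub(/^#fsdb/, "#fsdb -R C")
        print
        next
    }
    /^#/ { print; next }          # pass comments through unchanged
    {
        for (i = 1; i <= ncol; i++) print name[i] ": " $i
        print ""                  # blank line ends each record
    }'
}

printf '%s\n' \
    '#fsdb account uid homedir' \
    'johnh 2274 /home/johnh' \
    'root 0 /root' |
    fsdb_listize
```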

BASIC DATA MANIPULATION
       A  number of programs exist to manipulate databases.  Complex functions
       can be made by stringing together commands with shell  pipelines.   For
       example,  to  print  the  home directories of everyone with ``john'' in
       their names, you would do:

               cat DATA/passwd | dbrow '_fullname =~ /John/' | dbcol homedir

       The output might be:

               #fsdb homedir
               /home/johnh
               /home/greg
               # this is a simple database
               #  | dbrow _fullname =~ /John/
               #  | dbcol homedir

       (Notice that comments are appended to the output listing each  command,
       providing an automatic audit log.)

       In  addition  to  typical database functions (select, join, etc.) there
       are also a number of statistical functions.

       The real power of Fsdb is that one can apply arbitrary code to rows  to
       do powerful things.

               cat DATA/passwd | dbroweval '_fullname =~ s/(\w+)_(\w+)/$2,_$1/'

       converts  "John_Heidemann"  into  "Heidemann,_John".  Not too much more
       work could split fullname into firstname and lastname fields.

       Or:

               cat DATA/passwd | dbcolcreate sort | dbroweval -b 'use Fsdb::Support' \
                       '_sort = _fullname; _sort =~ s/_/ /g; _sort = fullname_to_sort(_sort);'
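
       The split of fullname into firstname and lastname fields mentioned
       above might look roughly like the following.  This is a stand-alone
       awk sketch rather than a dbroweval program; the helper name
       fsdb_split_name is hypothetical, and it assumes the default separator,
       a column named fullname, and names of the form First_Last.  It uses
       "-" for a missing last name, which is an assumption of this sketch.

```sh
# Sketch only: append firstname and lastname columns derived from fullname.
# fsdb_split_name is a hypothetical helper, not part of Fsdb.
fsdb_split_name() {
    awk '
    NR == 1 {
        for (i = 2; i <= NF; i++) if ($i == "fullname") col = i - 1
        if (!col) exit 1          # no fullname column in the header
        print $0, "firstname", "lastname"
        next
    }
    /^#/ { print; next }          # pass comments through unchanged
    {
        # Split at the first underscore only.
        u = index($col, "_")
        if (u > 0) print $0, substr($col, 1, u - 1), substr($col, u + 1)
        else       print $0, $col, "-"
    }'
}

printf '%s\n' \
    '#fsdb account fullname' \
    'johnh John_Heidemann' \
    'root Root' |
    fsdb_split_name
```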

TALKING ABOUT COLUMNS
       An advantage of Fsdb is  that  you  can  talk  about  columns  by  name
       (symbolically)  rather than simply by their positions.  So in the above
       example, "dbcol homedir" pulled out  the  home  directory  column,  and
       "dbrow '_fullname =~ /John/'" matched against column fullname.

       In  general,  you  can use the name of the column listed on the "#fsdb"
       line to identify it in most programs, and _name to identify it in code.

       Some alternatives for flexibility:

       +o   Numeric values identify columns positionally, numbering from 0.  So
           0 or _0 is the first column, 1 is the second, etc.

       +o   In code, _last_columnname gets the value from columnname's previous
           row.

       See dbroweval(1) for more details about writing code.

LIST OF COMMANDS
       Enough said.  I'll summarize the commands, and then you can experiment.
       For a summary of each command, run it with the argument "--help" (or
       "-?" if you prefer).  Full manual pages can be found by running the
       command with the argument "--man", or by running the Unix command
       "man dbcol" (or whatever program you want).

   TABLE CREATION
       dbcolcreate
           add columns to a database

       dbcoldefine
           set the column headings for a non-Fsdb file

   TABLE MANIPULATION
       dbcol
           select columns from a table

       dbrow
           select rows from a table

       dbsort
           sort rows based on a set of columns

       dbjoin
           compute the natural join of two tables

       dbcolrename
           rename a column

       dbcolmerge
           merge two columns into one

       dbcolsplittocols
           split one column into two or more columns

       dbcolsplittorows
           split one column into multiple rows

       dbfilepivot
           "pivots" a file, converting multiple rows corresponding to the same
           entity into a single row with multiple columns.

       dbfilevalidate
           check that db file doesn't have some common errors

   COMPUTATION AND STATISTICS
       dbcolstats
           compute statistics over a column (mean, etc., optionally median)

       dbmultistats
           group  rows by some key value, then compute stats (mean, etc.) over
           each group  (equivalent  to  dbmapreduce  with  dbcolstats  as  the
           reducer)

       dbmapreduce
           group rows (map) and then apply an arbitrary function to each group
           (reduce)

       dbrvstatdiff
           compare two sample distributions (mean/conf interval/T-test)

       dbcolmovingstats
           compute moving statistics over a column of data

       dbcolstatscores
           compute Z-scores and T-scores over one column of data

       dbcolpercentile
           compute the rank or percentile of a column

       dbcolhisto
           compute histograms over a column of data

       dbcolscorrelate
           compute the coefficient of correlation over several columns

       dbcolsdecimate
           drop rows selectively, keeping large changes and periodic samples

       dbcolsregression
           compute linear regression and correlation for two columns

       dbrowaccumulate
           compute a running sum over a column of data

       dbrowcount
           count the number of rows (a subset of dbstats)

       dbrowdiff
           compute differences in a column between rows of a table

       dbrowenumerate
           number each row

       dbroweval
           run arbitrary Perl code on each row

       dbrowuniq
           count/eliminate identical rows (like Unix uniq(1))

       dbfilediff
           compare fields on rows of a file (something like Unix diff(1))

   OUTPUT CONTROL
       dbcolneaten
           pretty-print columns

       dbfilealter
           convert  between  column  or  list  format,  or  change  the column
           separator

       dbfilestripcomments
           remove comments from a table

       dbformmail
           generate a script that sends form mail based on each row

   CONVERSIONS
       (These programs convert data  into  fsdb.   See  their  web  pages  for
       details.)

       cgi_to_db
           <http://stein.cshl.org/boulder/>

       combined_log_format_to_db
           <http://httpd.apache.org/docs/2.0/logs.html>

       html_table_to_db
           HTML tables to fsdb (assuming they're reasonably formatted).

       kitrace_to_db
           <http://ficus-www.cs.ucla.edu/ficus-members/geoff/kitrace.html>

       ns_to_db
           <http://mash-www.cs.berkeley.edu/ns/>

       sqlselect_to_db
           the output of SQL SELECT tables to db

       tabdelim_to_db
           spreadsheet tab-delimited files to db

       tcpdump_to_db
           (see man tcpdump(8) on any reasonable system)

       xml_to_db
           XML input to fsdb, assuming it's very regular

       (And out of fsdb:)

       db_to_csv
           Comma-separated-value format from fsdb.

       db_to_html_table
           simple conversion of Fsdb to html tables

   STANDARD OPTIONS
       Many programs have common options:

       -? or --help
           Show basic usage.

       -N or --new-name
           When a command creates a new column like dbrowaccumulate's "accum",
           this option lets one override the default name of that new column.

       -T TmpDir
           Where to put tmp files.  Also uses environment variable TMPDIR, if
           -T is not specified.  Default is /tmp.

       -c FRACTION or --confidence FRACTION
           Specify confidence interval FRACTION (dbcolstats, dbmultistats,
           etc.)

       -C S or --element-separator S
           Specify column separator S (dbcolsplittocols, dbcolmerge).

       -d or --debug
           Enable debugging (may be repeated for greater effect in some
           cases).

       -a or --include-non-numeric
           Compute stats over all data (treating non-numbers as zeros).  (By
           default, things that can't be treated as numbers are ignored for
           stats purposes.)

       -S or --pre-sorted
           Assume the data is pre-sorted.  May be repeated to disable
           verification (saving a small amount of work).

       -e E or --empty E
           Give value E as the value for empty (null) records.

       -i I or --input I
           Input data from file I.

       -o O or --output O
           Write data out to file O.

       --header H
           Use H as the full Fsdb header, rather than reading a header from
           the input.  This option is particularly useful when using Fsdb
           under Hadoop, where split files don't have headers.

       --nolog
           Skip logging the program in a trailing comment.

       When giving Perl code (in dbrow and dbroweval) column names can be
       embedded if preceded by underscores.  (Look at dbrow(1) or dbroweval(1)
       for examples.)

       Most  programs  run  in  constant  memory  and  use  temporary files if
       necessary.  Exceptions are dbcolneaten,  dbcolpercentile,  dbmapreduce,
       dbmultistats, dbrowsplituniq.

   STANDARD SORTING OPTIONS
       A number of programs do sorting, or depend on defining an ordering of
       rows.  Such programs use these standard sorting options:

       -r or --descending
           sort in reverse order (high to low)

       -R or --ascending
           sort in normal order (low to high)

       -t or --type-inferred-sorting
           sort fields by type (numeric or lexicographic), automatically

       -n or --numeric
           sort numerically

       -N or --lexical
           sort lexicographically

ANOTHER EXAMPLE
       Take the raw data in "DATA/http_bandwidth", put a header on it
       ("dbcoldefine size bw"), take statistics of each category
       ("dbmultistats -k size bw"), pick out the relevant fields ("dbcol size
       mean stddev pct_rsd"), and you get:

               #fsdb size mean stddev pct_rsd
               1024    1.4962e+06      2.8497e+05      19.047
               10240   5.0286e+06      6.0103e+05      11.952
               102400  4.9216e+06      3.0939e+05      6.2863
               #  | dbcoldefine size bw
               #  | /home/johnh/BIN/DB/dbmultistats -k size bw
               #  | /home/johnh/BIN/DB/dbcol size mean stddev pct_rsd

       (The whole command was:

               cat DATA/http_bandwidth |
               dbcoldefine size bw |
               dbmultistats -k size bw |
               dbcol size mean stddev pct_rsd

       all on one line.)

       Then post-process them to get rid of the exponential notation by adding
       this to the end of the pipeline:

           dbroweval '_mean = sprintf("%8.0f", _mean); _stddev = sprintf("%8.0f", _stddev);'

       (Actually,  this step is no longer required since dbcolstats now uses a
       different default format.)

       giving:

               #fsdb      size    mean    stddev  pct_rsd
               1024     1496200          284970        19.047
               10240    5028600          601030        11.952
               102400   4921600          309390        6.2863
               #  | dbcoldefine size bw
               #  | dbmultistats -k size bw
               #  | dbcol size mean stddev pct_rsd
               #  | dbroweval   { _mean = sprintf("%8.0f", _mean); _stddev = sprintf("%8.0f", _stddev); }

       In a few lines, raw data is transformed to processed output.

       Suppose you suspect an odd distribution of results for one datapoint.
       Fsdb can easily produce a CDF (cumulative distribution function) of
       the data, suitable for graphing:

           cat DB/DATA/http_bandwidth | \
               dbcoldefine size bw | \
               dbrow '_size == 102400' | \
               dbcol bw | \
               dbsort -n bw | \
               dbrowenumerate | \
               dbcolpercentile count | \
               dbcol bw percentile | \
               xgraph

       The steps, roughly: 1. get the raw input data and turn it into fsdb
       format, 2. pick out just the relevant column (for efficiency) and sort
       it, 3. for each data point, assign a CDF percentage to it, 4. pick out
       the two columns to graph and show them.
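
       The sort, enumerate, and percentile steps can be pictured with a
       small stand-alone sketch in POSIX shell and awk.  It is not how the
       Fsdb tools are implemented: the helper name cdf and the output column
       name are hypothetical, and it assumes that the CDF value of the i-th
       smallest of n points is simply i/n (so tied values each get their own
       step).

```sh
# Sketch only: empirical CDF of a list of numbers, one per line.
# cdf is a hypothetical helper; it assumes CDF(i-th smallest) = i / n.
cdf() {
    sort -n | awk '
    { v[NR] = $1 }                # remember the sorted values in order
    END {
        print "#fsdb bw cdf"
        for (i = 1; i <= NR; i++) printf "%s %.2f\n", v[i], i / NR
    }'
}

# Four made-up measurements; the output pairs each value with 0.25 .. 1.00.
printf '%s\n' 30 10 40 20 | cdf
```

       The two output columns are exactly what a plotting program needs: the
       measurement on one axis and the fraction of points at or below it on
       the other.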

A GRADEBOOK EXAMPLE
       The  first commercial program I wrote was a gradebook, so here's how to
       do it with Fsdb.

       Format your data like DATA/grades.

               #fsdb name email id test1
               a a@ucla.example.edu 1 80
               b b@usc.example.edu 2 70
               c c@isi.example.edu 3 65
               d d@lmu.example.edu 4 90
               e e@caltech.example.edu 5 70
               f f@oxy.example.edu 6 90

       Or if your students have spaces in their names,  use  "-F  S"  and  two
       spaces to separate each column:

               #fsdb -F S name email id test1
               alfred aho  a@ucla.example.edu  1  80
               butler lampson  b@usc.example.edu  2  70
               david clark  c@isi.example.edu  3  65
               constantine drovolis  d@lmu.example.edu  4  90
               debrorah estrin  e@caltech.example.edu  5  70
               sally floyd  f@oxy.example.edu  6  90

       To compute statistics on an exam, do

               cat DATA/grades | dbstats test1 |dblistize

       giving

               #fsdb -R C  ...
               mean:        77.5
               stddev:      10.84
               pct_rsd:     13.987
               conf_range:  11.377
               conf_low:    66.123
               conf_high:   88.877
               conf_pct:    0.95
               sum:         465
               sum_squared: 36625
               min:         65
               max:         90
               n:           6
               ...
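
       Several of these figures can be checked by hand.  The sketch below
       (plain awk, not dbcolstats; the helper name grade_stats is
       hypothetical) recomputes n, sum, sum_squared, mean, and stddev for the
       six test1 scores in DATA/grades using the textbook formulas, with the
       sample (n-1) form of the standard deviation.

```sh
# Sketch only: recompute a few of the statistics listed above.
# grade_stats is a hypothetical helper; it reads one number per line.
grade_stats() {
    awk '
    { n++; sum += $1; sumsq += $1 * $1 }
    END {
        mean = sum / n
        var = (sumsq - sum * sum / n) / (n - 1)    # sample variance
        printf "n: %d\nsum: %d\nsum_squared: %d\n", n, sum, sumsq
        printf "mean: %.1f\nstddev: %.2f\n", mean, sqrt(var)
    }'
}

# The test1 column of DATA/grades.
printf '%s\n' 80 70 65 90 70 90 | grade_stats
```

       This prints n 6, sum 465, sum_squared 36625, mean 77.5, and stddev
       10.84, in agreement with the listing.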

       To do a histogram:

               cat DATA/grades | dbcolhisto -n 5 -g test1

       giving

               #fsdb low histogram
               65      *
               70      **
               75
               80      *
               85
               90      **
               #  | /home/johnh/BIN/DB/dbhistogram -n 5 -g test1

       Now  you  want  to send out grades to the students by e-mail.  Create a
       form-letter (in the file test1.txt):

               To: _email (_name)
               From: J. Random Professor <jrp@usc.example.edu>
               Subject: test1 scores

               _name, your score on test1 was _test1.
               86+   A
               75-85 B
               70-74 C
               0-69  F

       Generate the shell script that will send the mail out:

               cat DATA/grades | dbformmail test1.txt > test1.sh

       And run it:

               sh <test1.sh

       The last two steps can be combined:

               cat DATA/grades | dbformmail test1.txt | sh

       but I like to keep a copy of exactly what I send.

       At the end of the semester you'll want  to  compute  grade  totals  and
       assign  letter  grades.   Both  fall out of dbroweval.  For example, to
       compute weighted total grades with a 40% midterm/60%  final  where  the
       midterm is 84 possible points and the final 100:

               dbcol -rv total |
               dbcolcreate total - |
               dbroweval '
                       _total = .40 * _midterm/84.0 + .60 * _final/100.0;
                       _total = sprintf("%4.2f", _total);
                       if (_final eq "-" || ( _name =~ /^_/)) { _total = "-"; };' |
               dbcolneaten
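
       The arithmetic inside that dbroweval program can be tried on its own.
       The following awk sketch applies the same weighting to a single
       student; the helper name weighted_total and the sample scores are
       hypothetical, and the handling of missing ("-") grades is left out.

```sh
# Sketch only: the weighted-total arithmetic for one student.
# weighted_total is a hypothetical helper: 40% midterm (out of 84 points)
# plus 60% final (out of 100 points), formatted with the same %4.2f.
# usage: weighted_total MIDTERM FINAL
weighted_total() {
    awk -v midterm="$1" -v final="$2" 'BEGIN {
        total = 0.40 * midterm / 84.0 + 0.60 * final / 100.0
        printf "%4.2f\n", total
    }'
}

weighted_total 63 80     # 0.40 * 0.75 + 0.60 * 0.80, prints 0.78
weighted_total 84 100    # full marks on both, prints 1.00
```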

       If  you  got  the  data originally from a spreadsheet, save it in "tab-
       delimited"   format   and   convert   it   with   tabdelim_to_db   (run
       tabdelim_to_db -? for examples).

A PASSWORD EXAMPLE
       To convert the Unix password file to db:

               cat /etc/passwd | sed 's/:/  /g'| \
                       dbcoldefine -F S login password uid gid gecos home shell \
                       >passwd.fsdb

       To convert the group file

               cat /etc/group | sed 's/:/  /g' | \
                       dbcoldefine -F S group password gid members \
                       >group.fsdb

       To show the names of the groups that div7-members are in (assuming DIV7
       is in the gecos field):

               cat passwd.fsdb | dbrow '_gecos =~ /DIV7/' | dbcol login gid | \
                       dbjoin -i - -i group.fsdb gid | dbcol login group

SHORT EXAMPLES
       Which  Fsdb  programs are the most complicated (based on number of test
       cases)?

               ls TEST/*.cmd | \
                       dbcoldefine test | \
                       dbroweval '_test =~ s@^TEST/([^_]+).*$@$1@' | \
                       dbrowuniq -c | \
                       dbsort -nr count | \
                       dbcolneaten

       (Answer: dbmapreduce, then dbcolstats, dbfilealter and dbjoin.)

       Stats on an exam (in $FILE, where $COLUMN is the name of the exam)?

               dbcolstats -q 4 $COLUMN <$FILE | dblistize | dbstripcomments

               cat $FILE | dbcolhisto -g -n 20 $COLUMN | dbcolneaten | dbstripcomments

       Merging the hw1 column from file hw1.fsdb into grades.fsdb assuming
       there's a common student id in column "id":

               dbcol id hw1 <hw1.fsdb >t.fsdb

               dbjoin -a -e - grades.fsdb t.fsdb id | \
                   dbsort  name | \
                   dbcolneaten >new_grades.fsdb

       Merging two fsdb files with the same columns:

               cat file1.fsdb file2.fsdb >output.fsdb

       or if you want to clean things up a bit

               cat file1.fsdb file2.fsdb | dbstripextraheaders >output.fsdb

       or if you want to know where the data came from

               for i in 1 2
               do
                       dbcolcreate source $i < file$i.fsdb
               done >output.fsdb

       (assumes you're using a Bourne-shell compatible shell, not csh).

WARNINGS
       As with any tool, one should (which means must) understand the limits
       of the tool.

       All Fsdb tools should run in constant memory.  In some cases (such as
       dbcolstats with quartiles, where the whole input must be re-read),
       programs will spool data to disk if necessary.

       Most tools buffer one or a few lines of data, so memory will scale with
       the size of each line.  (So lines with many columns,  or  when  columns
       have lots of data, may cause large memory consumption.)

       All Fsdb tools should run in constant or at worst "n log n" time.

       All Fsdb tools use normal Perl math routines for computation.  Although
       I  make every attempt to choose numerically stable algorithms (although
       I also  welcome  feedback  and  suggestions  for  improvement),  normal
       rounding  due  to  computer floating point approximations can result in
       inaccuracies when data spans a large  range  of  precision.   (See  for
       example the dbcolstats_extrema test cases.)

       Any requirements and limitations of each Fsdb tool are documented on its
       manual page.

       If  any  Fsdb  program  violates  these assumptions, that is a bug that
       should be documented on the tool's manual page or ideally fixed.

       Fsdb does depend on Perl's correctness, and Perl (and Fsdb)  have  some
       bugs.  Fsdb should work on perl from version 5.10 onward.

HISTORY
       There have been four major versions of Fsdb: fsdb-0.x was begun in 1991
       for  my  personal use.  Fsdb 1.0 is a complete re-write of the pre-1995
       versions, and was distributed  from  1995  to  2007.   Fsdb  2.0  is  a
       significant  re-write  of  the  1.x  versions  to  systematically use a
       library and threads (although threads were replaced with full processes
       in 2.44).  Fsdb 3.0 in 2022 adds type specifiers to the schema,  mostly
       to  support use in languages with stronger typing (like Python, Go, and
       C).

       Fsdb (in its various forms) has been used  extensively  by  its  author
       since 1991.  Since 1995 it's been used by two other researchers at UCLA
       and several at ISI.  In February 1998 it was announced to the Internet.
       Since then it has found a few users, some outside where I work.

       Major changes:

       0.1 1991: begun for my personal use, to replace awk.
       1.0 1997-07-22: first public release.
       2.0 2008-01-25: rewrite to use a common library, and starting to use
       threads.
       2.12 2008-10-16: completion of the rewrite, and first RPM package.
       2.44 2013-10-02: replacing threads with processes for improved
       performance
       3.0 2022-04-04: adding type specifiers to the schema

   [1mFsdb 2.0 Rationale[0m
       I've  thought  about  fsdb-2.0  for  many  years, but it was started in
       earnest in 2007.  Fsdb-2.0 has the following goals:

       in-one-process processing
           While fsdb is great on the Unix command line as a pipeline  between
           programs,  it  should  [4malso[24m  be  possible  to set it up to run in a
           single process.  And if it does so, it  should  be  able  to  avoid
           serializing  and  deserializing  (converting to and from text) data
           between each module.  (Accomplished in  fsdb-2.0:  see  dbpipeline,
           although still needs tuning.)

       clean IO API
           Fsdb's  roots go back to perl4 and 1991, so the fsdb-1.x library is
           very, very crufty.  More than just being  ugly  (but  it  was  that
           too), this made tasks like reading one file format and writing
           another the application's job, when it should be the library's.
           (Accomplished in fsdb-1.15 and improved in 2.0: see Fsdb::IO.)

       normalized module APIs
           Because  fsdb modules were added as needed over 10 years, sometimes
           the  module  APIs  became  inconsistent.   (For  example,  the  1.x
           "dbcolcreate" required an empty value following the name of the new
           column,  but  other  programs  specify  empty  values with the "-e"
           argument.)   We   should   smooth   over   these   inconsistencies.
           (Accomplished as each module was ported in 2.0 through 2.7.)

       everyone handles all input formats
           Given  a  clean  IO  API,  the  distinction  between  "colized" and
           "listized" fsdb files should go away.  Any program should  be  able
           to read and write files in any format.  (Accomplished in fsdb-2.1.)

       Fsdb-2.0  preserves  backwards compatibility where possible, but breaks
       it where necessary to accomplish the  above  goals.   In  August  2008,
       Fsdb-2.7 was declared preferred over the 1.x versions.  Benchmarking in
       2013  showed that threading performed much worse than just using pipes,
       because Perl's requirements for data that is shared between multiple
       threads are quite heavyweight.  Fsdb-2.44 therefore uses threading
       "style", but implemented with processes (via my "Freds" library).

   [1mFsdb And Multiple Processors[0m
       Fsdb's use of Unix pipelines means Fsdb automatically benefits from
       multiprocessor computers---each pipeline stage can run on a separate
       core.  In addition, compute-intensive  Fsdb  modules  like  dbsort  and
       dbmapreduce  are explicitly multi-process and will use as many cores as
       they can, up to the number of cores on the local computer.

       Although Fsdb takes advantage of as much parallelism as it can, a five
       stage pipeline won't necessarily saturate five cores.  Pipeline stages
       almost always have different amounts of work to do, and some stages are
       often data limited.  (Dbsort attempts as much parallelism as it can,
       and  can run 10-way parallel or more over a large enough input dataset.
       But it cannot sustain high parallelism because of the requirement  that
       it produce one global output.)

   [1mFsdb 3.0 Rationale[0m
       There are two motivations for adding optional typing to Fsdb.  First,
       languages such as Python and Go would really like type information.  As
       of 2022 there are now users of those languages,  so  the  basic  system
       should support them.

       Second, while pure text is flexible, it's very inefficient---converting
       numbers  to  and  from decimal is thousands of instructions, and binary
       encodings are often much smaller than text.  In  the  future,  I  would
       love to have a flag that enables a binary encoding.

       Typing is optional---omitting types is never wrong.

       One  somewhat  odd  thing  about  typing is that we reuse the Perl pack
       definitions of types, so q (for "quadword") stands for 64-bit  integer.
       These  are  perhaps  not the most mnemonic choices in 2022, but I would
       rather pick someone's existing set than try to define my own.

   [1mContributors[0m
       Fsdb     includes     code     ported     from      Geoff      Kuenning
       ("Fsdb::Support::TDistribution").

       Fsdb   contributors:   Ashvin  Goel  [4mgoel@cse.oge.edu[24m,  Geoff  Kuenning
       [4mgeoff@fmg.cs.ucla.edu[24m,  Vikram  Visweswariah  [4mvisweswa@isi.edu[24m,  Kannan
       Varadahan  [4mkannan@isi.edu[24m,  Lars  Eggert  [4mlarse@isi.edu[24m, Arkadi Gelfond
       [4markadig@dyna.com[24m,   David   Graff   [4mgraff@ldc.upenn.edu[24m,    Haobo    Yu
       [4mhaoboy@packetdesign.com[24m,   Pavlin  Radoslavov  [4mpavlin@catarina.usc.edu[24m,
       Graham  Phillips,  Yuri  Pradkin,  Alefiya  Hussain,  Ya  Xu,   Michael
       Schwendt,  Fabio Silva [4mfabio@isi.edu[24m, Jerry Zhao [4mzhaoy@isi.edu[24m, Ning Xu
       [4mnxu@aludra.usc.edu[24m,  Martin  Lukac  [4mmlukac@lecs.cs.ucla.edu[24m,  Xue  Cai,
       Michael  McQuaid,  Christopher  Meng, Calvin Ardi, H. Merijn Brand, Lan
       Wei, Hang Guo, Wes Hardaker.

       Fsdb includes datasets contributed from  NIST  ([4mDATA/nist_zarr13.fsdb[24m),
       from
       <http://www.itl.nist.gov/div898/handbook/eda/section4/eda4281.htm>, the
       NIST/SEMATECH  e-Handbook  of  Statistical  Methods, section 1.4.2.8.1.
       Background and Data.  The source is public domain, and reproduced  with
       permission.

[1mRELATED WORK[0m
       As stated in the introduction, Fsdb is an incompatible reimplementation
       of the ideas found in "/rdb".  By storing data in simple text files and
       processing  it  with  pipelines it is easy to experiment (in the shell)
       and look at the output.  The original implementation of this  idea  was
       /rdb,  a  commercial  product  described  in  the  book [4mUNIX[24m [4mrelational[0m
       [4mdatabase[24m [4mmanagement:[24m [4mapplication[24m [4mdevelopment[24m [4min[24m [4mthe[24m [4mUNIX[24m [4menvironment[24m by
       Rod Manis, Evan Schaffer, and Robert Jorgensen (and  also  at  the  web
       page <http://www.rdb.com/>).

       While  Fsdb  is  inspired by Rdb, it includes no code from it, and Fsdb
       makes several different design choices.  In particular: Rdb attempts to
       be closer to a "real" database, with provision for locking and file
       indexing.  Fsdb focuses on single user use and so eschews these
       choices.  Rdb also has some  support  for  interactive  editing.   Fsdb
       leaves editing to text editors like emacs or vi.

       In August, 2002 I found out Carlo Strozzi extended RDB with his package
       NoSQL  <http://www.linux.it/~carlos/nosql/>.  According to Mr. Strozzi,
       he implemented NoSQL in awk  to  avoid  Perl  start-up  costs  in  RDB.
       Although  I  haven't found Perl startup overhead to be a big problem on
       my platforms (from old Sparcstation IPCs to 2GHz Pentium-4s),  you  may
       want  to  evaluate  his system.  The Linux Journal has a description of
       NoSQL at <http://www.linuxjournal.com/article/3294>.   It  seems  quite
       similar  to  Fsdb.   Like /rdb, NoSQL supports indexing (not present in
       Fsdb).  Fsdb appears to have richer support for statistics, and, as  of
       Fsdb-2.x, its support for Perl threading may support faster performance
       (one-process, less serialization and deserialization).

[1mRELEASE NOTES[0m
       Versions  prior to 1.0 were released informally on my web page but were
       not announced.

   [1m0.0 1991[0m
       started for my own research use

   [1m0.1 26-May-94[0m
       first check-in to RCS

   [1m0.2 15-Mar-95[0m
       parts now require perl5

   [1m1.0, 22-Jul-97[0m
       adds autoconf support and a test script.

   [1m1.1, 20-Jan-98[0m
       support for double space field separators, better tests

   [1m1.2, 11-Feb-98[0m
       minor changes and release on comp.lang.perl.announce

   [1m1.3, 17-Mar-98[0m
       +o   adds median and quartile options to dbstats

       +o   adds dmalloc_to_db converter

       +o   fixes some warnings

       +o   dbjoin now can run on unsorted input

       +o   fixes a dbjoin bug

       +o   some more tests in the test suite

   [1m1.4, 27-Mar-98[0m
       +o   improves error messages (all should now  report  the  program  that
           makes the error)

       +o   fixed a bug in dbstats output when the mean is zero

   [1m1.5, 25-Jun-98[0m
       BUG FIX dbcolhisto and dbcolpercentile now handle non-numeric values like
       dbstats
       NEW dbcolstats computes zscores and tscores over a column
       NEW dbcolscorrelate computes correlation coefficients between two
       columns
       INTERNAL ficus_getopt.pl has been replaced by DbGetopt.pm
       BUG FIX all tests are now ``portable'' (previously some tests ran only
       on my system)
       BUG FIX you no longer need to have the db programs in your path (fix
       arose from a discussion with Arkadi Gelfond)
       BUG FIX installation no longer uses cp -f (to work on SunOS 4)

   [1m1.6, 24-May-99[0m
       NEW dbsort, dbstats, dbmultistats now run in constant memory (using tmp
       files if necessary)
       NEW dbcolmovingstats does moving means over a series of data
       NEW dbcol has a -v option to get all columns except those listed
       NEW dbmultistats does quartiles and medians
       NEW dbstripextraheaders now also cleans up bogus comments before the
       first header
       BUG FIX dbcolneaten works better with double-space-separated data

   [1m1.7,  5-Jan-00[0m
       NEW dbcolize now detects and rejects lines that contain embedded copies
       of the field separator
       NEW configure tries harder to prevent people from improperly
       configuring/installing fsdb
       NEW tcpdump_to_db converter (incomplete)
       NEW tabdelim_to_db converter:  from spreadsheet tab-delimited files to
       db
       NEW mailing lists for fsdb are "fsdb-announce@heidemann.la.ca.us" and
       "fsdb-talk@heidemann.la.ca.us"
           To subscribe to either, send mail to
           "fsdb-announce-request@heidemann.la.ca.us" or
           "fsdb-talk-request@heidemann.la.ca.us" with "subscribe" in the
           BODY of the message.

       BUG FIX dbjoin used to produce incorrect output if there were extra,
       unmatched values in the 2nd table. Thanks to Graham Phillips for
       providing a test case.
       BUG FIX the sample commands in the usage strings now all should
       explicitly include the source of data (typically from "cat foo.fsdb
       |").  Thanks to Ya Xu for pointing out this documentation deficiency.
       BUG FIX (DOCUMENTATION) dbcolmovingstats had incorrect sample output.

   [1m1.8, 28-Jun-00[0m
       BUG FIX header options are now preserved when writing with dblistize
       NEW dbrowuniq now optionally checks for uniqueness only on certain
       fields
       NEW dbrowsplituniq makes one pass through a file and splits it into
       separate files based on the given fields
       NEW converter for "crl" format network traces
       NEW anywhere you use arbitrary code (like dbroweval), _last_foo now
       maps to the last row's value for field _foo.
       OPTIMIZATION comment processing slightly changed so that dbmultistats
       now is much faster on files with lots of comments (for example, ~100k
       lines of comments and 700 lines of data!) (Thanks to Graham Phillips
       for pointing out this performance problem.)
       BUG FIX dbstats with median/quartiles now correctly handles singleton
       data points.

   [1m1.9,  6-Nov-00[0m
       NEW dbfilesplit, split a single input file into multiple output files
       (based on code contributed by Pavlin Radoslavov).
       BUG FIX dbsort now works with perl-5.6

   [1m1.10, 10-Apr-01[0m
       BUG FIX dbstats now handles the case where there are more n-tiles than
       data
       NEW dbstats now includes a -S option to optimize work on pre-sorted
       data (inspired by code contributed by Haobo Yu)
       BUG FIX dbsort now has a better estimate of memory usage when run on
       data with very short records (problem detected by Haobo Yu)
       BUG FIX cleanup of temporary files is slightly better

   [1m1.11,  2-Nov-01[0m
       BUG FIX dbcolneaten now runs in constant memory
       NEW dbcolneaten now supports "field specifiers" that allow some control
       over how wide columns should be
       OPTIMIZATION dbsort now tries hard to be filesystem cache-friendly
       (inspired by "Information and Control in Gray-box Systems" by the
       Arpaci-Dusseaus at SOSP 2001)
       INTERNAL t_distr now ported to perl5 module DbTDistr

   [1m1.12,  30-Oct-02[0m
       BUG FIX dbmultistats documentation typo fixed
       NEW dbcolmultiscale
       NEW dbcol has -r option for "relaxed error checking"
       NEW dbcolneaten has new -e option to strip end-of-line spaces
       NEW dbrow finally has a -v option to negate the test
       BUG FIX math bug in dbcoldiff fixed by Ashvin Goel (need to check
       Scheaffer test cases)
       BUG FIX some patches to run with Perl 5.8. Note: some programs
       (dbcolmultiscale, dbmultistats, dbrowsplituniq) generate warnings like:
       "Use of uninitialized value in concatenation (.) or string at
       /usr/lib/perl5/5.8.0/FileCache.pm line 98, <STDIN> line 2". Please
       ignore this until I figure out how to suppress it. (Thanks to Jerry
       Zhao for noticing perl-5.8 problems.)
       BUG FIX fixed an autoconf problem where configure would fail to find a
       reasonable prefix (thanks to Fabio Silva for reporting the problem)
       NEW db_to_html_table: simple conversion to html tables (NO fancy stuff)
       NEW dblib now has a function [1mdblib_text2html() [22mthat will do simple
       conversion of iso-8859-1 to HTML

   [1m1.13,  4-Feb-04[0m
       NEW fsdb added to the freebsd ports tree
       <http://www.freshports.org/databases/fsdb/>.  Maintainer:
       "larse@isi.edu"
       BUG FIX properly handle trailing spaces when data must be numeric (ex.
       dbstats with -FS, see test dbstats_trailing_spaces). Fix from Ning Xu
       "nxu@aludra.usc.edu".
       NEW dbcolize error message improved (bug report from Terrence Brannon),
       and list format documented in the README.
       NEW cgi_to_db converts CGI.pm-format storage to fsdb list format
       BUG FIX handle numeric synonyms for column names in dbcol properly
       ENHANCEMENT "talking about columns" section added to README. Lack of
       documentation pointed out by Lars Eggert.
       CHANGE dbformmail now defaults to using Mail ("Berkeley Mail") to send
       mail, rather than sendmail (sendmail is still an option, but mail
       doesn't require running as root)
       NEW on platforms that support it (i.e., with perl 5.8), fsdb works fine
       with unicode
       NEW dbfilevalidate: check a db file for some common errors

   [1m1.14,  24-Aug-06[0m
       ENHANCEMENT README cleanup
       INCOMPATIBLE CHANGE dbcolsplit renamed dbcolsplittocols
       NEW dbcolsplittorows  split one column into multiple rows
       NEW dbcolsregression compute linear regression and correlation for two
       columns
       ENHANCEMENT csv_to_db: better error handling, normalize field names,
       skip blank lines
       ENHANCEMENT dbjoin now detects (and fails) if non-joined files have
       duplicate names
       BUG FIX minor bug fixed in calculation of Student t-distributions
       (doesn't change any test output, but may have caused small errors)

   [1m1.15, 12-Nov-07[0m
       NEW fsdb-1.14 added to the MacOS Fink system
       <http://pdb.finkproject.org/pdb/package.php/fsdb>. (Thanks to Lars
       Eggert for maintaining this port.)
       NEW Fsdb::IO::Reader and Fsdb::IO::Writer now provide reasonably clean
       OO I/O interfaces to Fsdb files.  Highly recommended if you use fsdb
       directly from perl.  In the fullness of time I expect to reimplement
       the entire thing using these APIs to replace the current dblib.pl which
       is still hobbled by its roots in perl4.
       NEW dbmapreduce now implements a Google-style map/reduce abstraction,
       generalizing dbmultistats.
       ENHANCEMENT fsdb now uses the Perl build system (Makefile.PL, etc.),
       instead of autoconf.  This change paves the way to better perl-5-style
       modularization, proper manual pages, input of both listize and colize
       format for every program, and world peace.
       ENHANCEMENT dblib.pl is now moved to Fsdb::Old.pm.
       BUG FIX dbmultistats now propagates its format argument (-f). Bug and
       fix from Martin Lukac (thanks!).
       ENHANCEMENT dbformmail documentation now is clearer that it doesn't
       send the mail, you have to run the shell script it writes.  (Problem
       observed by Unkyu Park.)
       ENHANCEMENT adapted to autoconf-2.61 (and then these changes were
       discarded in favor of The Perl Way).
       BUG FIX dbmultistats memory usage corrected (O(# tags), not O(1))
       ENHANCEMENT dbmultistats can now optionally run with pre-grouped input
       in O(1) memory
       ENHANCEMENT dbroweval -N was finally implemented (eat comments)

   [1m2.0, 25-Jan-08[0m
       2.0, 25-Jan-08 --- a quiet 2.0 release (gearing up towards complete)

       ENHANCEMENT: shifting old programs to Perl modules, with the front-end
       program as just a wrapper. In the short-term, this change just means
       programs have real man pages. In the long-run, it will mean that one
       can run a pipeline in a single Perl program. So far: dbcol, dbroweval,
       the new dbrowcount, dbsort, the new dbmerge, the old "dbstats" (renamed
       dbcolstats), dbcolrename, dbcolcreate.
       NEW: Fsdb::Filter::dbpipeline is an internal-only module that lets one
       use fsdb commands from within perl (via threads).
           It also provides perl function aliases for the internal modules, so
           a  string  of  fsdb  commands in perl are nearly as terse as in the
           shell:

               use Fsdb::Filter::dbpipeline qw(:all);
               dbpipeline(
                   dbrow(qw(name test1)),
                   dbroweval('_test1 += 5;')
               );

       INCOMPATIBLE CHANGE: The old dbcolstats has been renamed
       dbcolstatscores. The new dbcolstats does the same thing as the old
       dbstats. This incompatibility is unfortunate but normalizes program
       names.
       CHANGE: The new dbcolstats program always outputs "-" (the default
       empty value) for statistics it cannot compute (for example, standard
       deviation if there is only one row), instead of the old mix of "-" and
       "na".
       INCOMPATIBLE CHANGE: The old dbcolstats program, now called
       dbcolstatscores, also has different arguments.  The "-t mean,stddev"
       option is now "--tmean mean --tstddev stddev".  See dbcolstatscores for
       details.
       INCOMPATIBLE CHANGE: dbcolcreate now assumes all new columns get the
       default value rather than requiring each column to have an initial
       constant value. To change the initial value, use the new "-e" option.
       NEW: dbrowcount counts rows, an almost-subset of dbcolstats's "n"
       output (except without differentiating numeric/non-numeric input), or
       the equivalent of "dbstripcomments | wc -l".
       NEW: dbmerge merges two sorted files. This functionality was previously
       embedded in dbsort.
       INCOMPATIBLE CHANGE: dbjoin's "-i" option to include non-matches is now
       renamed "-a", so as to not conflict with the new standard option "-i"
       for input file.

   [1m2.1,  6-Apr-08[0m
       2.1,  6-Apr-08 --- another alpha 2.0, but now  all  converted  programs
       understand both listize and colize format

       ENHANCEMENT: shifting more old programs to Perl modules. New in 2.1:
       dbcolneaten, dbcoldefine, dbcolhisto, dblistize, dbcolize, dbrecolize
       ENHANCEMENT dbmerge now handles an arbitrary number of input files, not
       just exactly two.
       NEW dbmerge2 is an internal routine that handles merging exactly two
       files.
       INCOMPATIBLE CHANGE dbjoin now specifies inputs like dbmerge2, rather
       than assuming the first two arguments were tables (as in fsdb-1).
           The old dbjoin argument "-i" is now "-a" or "--type=outer".

           A  minor  change:  comments  in the source files for dbjoin are now
           intermixed with output rather than being delayed until the end.

       ENHANCEMENT dbsort now no longer produces warnings when null values are
       passed to numeric comparisons.
       BUG FIX dbroweval now once again works with code that lacks a trailing
       semicolon. (This bug fixes a regression from 1.15.)
       INCOMPATIBLE CHANGE dbcolneaten's old "-e" option (to avoid end-of-line
       spaces) is now "-E" to avoid conflicts with the standard empty field
       argument.
       INCOMPATIBLE CHANGE dbcolhisto's old "-e" option is now "-E" to avoid
       conflicts. And its "-n", "-s", and "-w" are now "-N", "-S", and "-W" to
       correspond.
       NEW dbfilealter replaces dbrecolize, dblistize, and dbcolize, but with
       different options.
       ENHANCEMENT The library routines "Fsdb::IO" now understand both list-
       format and column-format data, so all converted programs can now
       [4mautomatically[24m read either format.  This capability was one of the
       milestone goals for 2.0, so yea!

   [1m2.2, 23-May-08[0m
       Release 2.2 is another 2.x alpha release.  Now [4mmost[24m of the commands are
       ported, but a few remain, and I plan one last incompatible  change  (to
       the file header) before 2.x final.

       ENHANCEMENT
           shifting more old programs to Perl modules.  New in 2.2:
           dbrowaccumulate, dbformmail, dbcolmovingstats, dbrowuniq,
           dbrowdiff, dbcolmerge, dbcolsplittocols, dbcolsplittorows,
           dbmapreduce, dbmultistats, dbrvstatdiff.  Also dbrowenumerate
           exists only as a front-end (command-line) program.

       INCOMPATIBLE CHANGE
           The   following   programs   have   been   dropped  from  fsdb-2.x:
           dbcoltighten,           dbfilesplit,           dbstripextraheaders,
           dbstripleadingspace.

       NEW combined_log_format_to_db to convert Apache logfiles

       INCOMPATIBLE CHANGE
           Options to dbrowdiff are now [1m-B [22mand [1m-I[22m, not [1m-a [22mand [1m-i[22m.

       INCOMPATIBLE CHANGE
           dbstripcomments is now dbfilestripcomments.

       BUG FIXES
           dbcolneaten   better  handles  empty  columns;  dbcolhisto  warning
           suppressed (actually a bug in high-bucket handling).

       INCOMPATIBLE CHANGE
           dbmultistats now requires a "-k" option in front of the  key  (tag)
           field,  or if none is given, it will group by the first field (both
           like dbmapreduce).

       KNOWN BUG
           dbmultistats with quantile option doesn't work currently.

       INCOMPATIBLE CHANGE
           dbcoldiff is renamed dbrvstatdiff.

       BUG FIXES
           dbformmail was leaving  its  log  message  as  a   command,  not  a
           comment.  Oops.  No longer.

   [1m2.3, 27-May-08 (alpha)[0m
       Another  alpha  release,  this  one just to fix the critical dbjoin bug
       listed below (that happens to have blocked my MP3 jukebox :-).

       BUG FIX
           Dbsort no longer hangs if given an input file with no rows.

       BUG FIX
           Dbjoin now works with unsorted input coming from a  pipeline  (like
           stdin).   Perl-5.8.8  has  a  bug  (?)  that  was  making this case
           fail---opening stdin in one thread, reading some, then reading more
           in a different thread caused an lseek which  works  on  files,  but
           fails on pipes like stdin.  Go figure.

       BUG FIX / KNOWN BUG
           The dbjoin fix also fixed dbmultistats -q (it now gives the right
           answer).  However, a new bug appeared, producing messages like:

               Attempt to free unreferenced scalar: SV 0xa9dd0c4, Perl
               interpreter: 0xa8350b8 during global destruction.

           So the dbmultistats_quartile test is still disabled.

   [1m2.4, 18-Jun-08[0m
       Another alpha release,  mostly  to  fix  minor  usability  problems  in
       dbmapreduce and client functions.

       ENHANCEMENT
           dbrow  now  defaults to running user supplied code without warnings
           (as with fsdb-1.x).  Use "--warnings" or "-w" to turn them back on.

       ENHANCEMENT
           dbroweval can now write different format  output  than  the  input,
           using the "-m" option.

       KNOWN BUG
           dbmapreduce  emits warnings on perl 5.10.0 about "Unbalanced string
           table refcount" and "Scalars leaked"  when  run  with  an  external
           program as a reducer.

           dbmultistats  emits  the  warning  "Attempt  to  free  unreferenced
           scalar" when run with quartiles.

           In each case the  output  is  correct.   I  believe  these  can  be
           ignored.

       CHANGE
           dbmapreduce no longer logs a line for each reducer that is invoked.

   [1m2.5, 24-Jun-08[0m
       Another  alpha  release,  fixing  more  minor bugs in "dbmapreduce" and
       lossage in "Fsdb::IO".

       ENHANCEMENT
           dbmapreduce can now tolerate non-map-aware reducers that pass  back
           the key column in output.  It also passes the current key as the last
           argument to external reducers.

       BUG FIX
           Fsdb::IO::Reader, correctly handle "-header" option again.  (Broken
           since fsdb-2.3.)

   [1m2.6, 11-Jul-08[0m
       Another alpha release, needed to fix DaGronk.  One new port, small  bug
       fixes, and important fix to dbmapreduce.

       ENHANCEMENT
           shifting more old programs to Perl modules.  New in 2.6:
           dbcolpercentile.

       INCOMPATIBLE CHANGE and ENHANCEMENTS dbcolpercentile arguments changed,
       use "--rank" to require ranking instead of "-r". Also, "--ascending"
       and "--descending" can now be specified separately, both for
       "--percentile" and "--rank".
       BUG FIX
           Sigh, the sense of the --warnings option in dbrow was inverted.  No
           longer.

       BUG FIX
           I found and fixed the string leaks (errors like "Unbalanced  string
           table   refcount"   and   "Scalars   leaked")  in  dbmapreduce  and
           dbmultistats.  (All  "IO::Handle"s  in  threads  must  be  manually
           destroyed.)

       BUG FIX
           The "-C" option to specify the column separator in dbcolsplittorows
           now works again (broken since it was ported).

       2.7, 30-Jul-08 beta

       The  beta  release  of fsdb-2.x.  Finally, all programs are ported.  As
       statistics, the number of lines of non-library code doubled  from  7.5k
       to 15.5k.  The libraries are much more complete, going from 866 to 5164
       lines.   The  overall number of programs is about the same, although 19
       were dropped and 11 were added.  The number of  test  cases  has  grown
       from 116 to 175.  All programs are now in perl-5, no more shell scripts
       or perl-4.  All programs now have manual pages.

       Although  this  is a major step forward, I still expect to rename "jdb"
       to "fsdb".

       ENHANCEMENT
           shifting more old programs to Perl modules.  New in 2.7:
           dbcolscorrelate, dbcolsregression, cgi_to_db, dbfilevalidate,
           db_to_csv, csv_to_db, db_to_html_table, kitrace_to_db,
           tcpdump_to_db, tabdelim_to_db, ns_to_db.

       INCOMPATIBLE CHANGE
           The following programs have been dropped from fsdb-2.x: db2dcliff,
           dbcolmultiscale, crl_to_db, ipchain_logs_to_db.  They may come
           back, but seemed overly specialized.  The program dbrowsplituniq
           was dropped because it is superseded by dbmapreduce.
           dmalloc_to_db was dropped pending test cases and examples.

       ENHANCEMENT
           dbfilevalidate now has a "-c" option to correct errors.

       NEW html_table_to_db provides the inverse of db_to_html_table.

   [1m2.8,  5-Aug-08[0m
       Change header format, preserving forwards compatibility.

       BUG FIX
           Complete editing pass over the manual, making sure it  aligns  with
           fsdb-2.x.

       SEMI-COMPATIBLE CHANGE
           The  header  of fsdb files has changed, it is now #fsdb, not #h (or
           #L) and parsing of -F and -R are also different.   See  dbfilealter
           for  the  new  specification.   The  v1  file  format will be read,
           compatibly, but not written.

       BUG FIX
           dbmapreduce now tolerates comments  that  precede  the  first  key,
           instead of failing with an error message.

   [1m2.9, 6-Aug-08[0m
       Still in beta; just a quick bug-fix for dbmapreduce.

       ENHANCEMENT
           dbmapreduce  now  generates  plausible output when given no rows of
           input.

   [1m2.10, 23-Sep-08[0m
       Still in beta, but picking up some bug fixes.

       ENHANCEMENT
           dbmapreduce now generates plausible output when given  no  rows  of
           input.

       ENHANCEMENT
           dbroweval: the warnings option was backwards; now corrected.  As a
           result, warnings in user code now default off (like in fsdb-1.x).

       BUG FIX
           dbcolpercentile now defaults  to  assuming  the  target  column  is
           numeric.   The  new  option  "-N" allows selection of a non-numeric
           target.

       BUG FIX
           dbcolscorrelate now includes "--sample" and "--nosample" options to
           compute the sample or  full  population  correlation  coefficients.
           Thanks to Xue Cai for finding this bug.

   [1m2.11, 14-Oct-08[0m
       Still in beta, but picking up some bug fixes.

       ENHANCEMENT
           html_table_to_db  is  now  more  aggressive  about filling in empty
           cells with the official empty value, rather than leaving them blank
           or as whitespace.

       ENHANCEMENT
           dbpipeline now catches failures during pipeline element  setup  and
           exits reasonably gracefully.

       BUG FIX
           dbsubprocess  now  reaps child processes, thus avoiding running out
           of processes when used a lot.

   [1m2.12, 16-Oct-08[0m
       Finally, a full (non-beta) 2.x release!

       INCOMPATIBLE CHANGE
           Jdb has been renamed Fsdb, the flatfile-streaming  database.   This
           change  affects  all internal Perl APIs, but no shell command-level
           APIs.  While Jdb served well for more than ten years, it is  easily
           confused with the Java debugger (even though Jdb was there first!).
           It  also  is  too  generic  to  work  well  in  web search engines.
           Finally, Jdb stands for ``John's database'', and we're a bit beyond
           that.  (However, some call me the ``file-system guy'', so one could
           argue it retains that meaning.)

           If you just used the shell commands, this change should not  affect
           you.   If  you used the Perl-level libraries directly in your code,
           you should be able to rename "Jdb" to "Fsdb" to move to 2.12.

           The jdb-announce list has not yet been renamed, but it will be
           shortly.

           With this release I've  accomplished  everything  I  wanted  to  in
           fsdb-2.x.  I therefore expect to return to boring, bugfix releases.

   [1m2.13, 30-Oct-08[0m
       BUG FIX
           dbrowaccumulate now treats non-numeric data as zero by default.

       BUG FIX
           Fixed  a perl-5.10ism in dbmapreduce that breaks that program under
           5.8.  Thanks to Martin Lukac for reporting the bug.

   [1m2.14, 26-Nov-08[0m
       BUG FIX
           Improved documentation for dbmapreduce's "-f" option.

       ENHANCEMENT
           dbcolmovingstats now computes a moving standard deviation in
           addition to a moving mean.
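
           To make the idea of a moving mean and moving standard deviation
           concrete, here is a small awk sketch over a sliding window.  It is
           not dbcolmovingstats itself: the function name moving_stats, the
           output layout, the use of the sample (n-1) standard deviation, and
           the handling of partial windows at the start are all assumptions
           made for illustration.

```shell
# moving_stats WINDOW: read one number per line and print
# "value mean stddev" computed over up to WINDOW most recent values.
moving_stats() {
    LC_ALL=C awk -v w="$1" '
    {
        buf[NR % w] = $1                   # ring buffer of the last w values
        n = (NR < w) ? NR : w              # number of values in the window so far
        sum = 0
        for (i = 0; i < n; i++) sum += buf[(NR - i) % w]
        mean = sum / n
        ss = 0                             # sum of squared deviations from the mean
        for (i = 0; i < n; i++) { d = buf[(NR - i) % w] - mean; ss += d * d }
        sd = (n > 1) ? sqrt(ss / (n - 1)) : 0
        printf "%s %.3f %.3f\n", $1, mean, sd
    }'
}

printf '1\n2\n3\n4\n' | moving_stats 2
```

           Recomputing each window from scratch is slower than a running
           update, but it avoids accumulating rounding error.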

   [1m2.15, 13-Apr-09[0m
       BUG FIX
           Fix a [4mmake[24m [4minstall[24m bug reported by Shalindra Fernando.

   [1m2.16, 14-Apr-09[0m
       BUG FIX
           Another minor release bug: on some systems [4mprogramize_module[24m loses
           executable permissions.  Again reported by Shalindra Fernando.

   [1m2.17, 25-Jun-09[0m
       TYPO FIXES
           Typo in the [4mdbroweval[24m manual fixed.

       IMPROVEMENT
           There  is no longer a comment line to label columns in [4mdbcolneaten[24m,
           instead the header  line  is  tweaked  to  line  up.   This  change
           restores  the  Jdb-1.x  behavior,  and  means that repeated runs of
           dbcolneaten no longer add comment lines each time.

       BUG FIX
           It turns out   [4mdbcolneaten[24m  was  not  correctly  handling  trailing
           spaces   when  given  the  "-E"  option  to  suppress  them.   This
           regression is now fixed.

       EXTENSION
           [1mdbroweval[22m(1) can now handle direct references to the last  row  via
           [4m$lfref[24m, a dubious but now documented feature.

       BUG FIXES
           Separators  set  with  "-C" in [4mdbcolmerge[24m and [4mdbcolsplittocols[24m were
           not  properly  setting  the  heading,  and  null  fields  were  not
           recognized.  The first bug was reported by Martin Lukac.

   [1m2.18,  1-Jul-09  A minor release[0m
       IMPROVEMENT
           Documentation for [4mFsdb::IO::Reader[24m has been improved.

       IMPROVEMENT
           The package should now be PGP-signed.

   [1m2.19,  10-Jul-09[0m
       BUG FIX
           Internal   improvements  to  debugging  output  and  robustness  of
           [4mdbmapreduce[24m and  [4mdbpipeline[24m.   [4mTEST/dbpipeline_first_fails.cmd[24m  re-
           enabled.

   [1m2.20,  30-Nov-09  (A  collection  of  minor  bugfixes, plus a build against[0m
       [1mFedora 12.)[0m
       BUG FIX
           Logging for [4mdbmapreduce[24m with code refs is now stable (it no longer
           includes a hex pointer to the code reference).

       BUG FIX
           Better  handling of mixed blank lines in [4mFsdb::IO::Reader[24m (see test
           case [4mdbcolize_blank_lines.cmd[24m).

       BUG FIX
           [4mhtml_table_to_db[24m now handles multi-line input better,  and  handles
           tables with COLSPAN.

       BUG FIX
           [4mdbpipeline[24m  now  cleans  up threads in an "eval" to prevent "cannot
           detach a  joined  thread"  errors  that  popped  up  in  perl-5.10.
           Hopefully  this  prevents  a  race  condition  that causes the test
           suites to hang about 20% of the time (in [4mdbpipeline_first_fails[24m).

       IMPROVEMENT
           [4mdbmapreduce[24m now detects and correctly  fails  when  the  input  and
           reducer have incompatible field separators.

       IMPROVEMENT
           [4mdbcolstats[24m,   [4mdbcolhisto[24m,  [4mdbcolscorrelate[24m,  [4mdbcolsregression[24m,  and
           [4mdbrowcount[24m now all take an "-F"  option  to  let  one  specify  the
           output field separator (so they work better with [4mdbmapreduce[24m).

       BUG FIX
           An  omitted "-k" from the manual page of [4mdbmultistats[24m is now there.
           Bug reported by Unkyu Park.

   [1m2.21, 17-Apr-10 bug fix release[0m
       BUG FIX
           [4mFsdb::IO::Writer[24m now no longer fails with  -outputheader  =>  never
           (an obscure bug).

       IMPROVEMENT
           [4mFsdb[24m  (in  the  warnings section) and [4mdbcolstats[24m now more carefully
           document how they handle (and do not  handle)  numerical  precision
           problems,  and  other  general  limits.  Thanks to Yuri Pradkin for
           prompting this documentation.

       IMPROVEMENT
           "Fsdb::Support::fullname_to_sortkey" is now restored from "Jdb".

       IMPROVEMENT
           Documentation for multiple styles of input approaches (including
           performance description) added to Fsdb::IO.

   [1m2.22,  2010-10-31 One new tool [4mdbcolcopylast[24m and several bug fixes for Perl[0m
       [1m5.10.[0m
       BUG FIX
           [4mdbmerge[24m now correctly handles n-way merges.  Bug reported  by  Yuri
           Pradkin.

       INCOMPATIBLE CHANGE
           [4mdbcolneaten[24m now defaults to [4mnot[24m padding the last column.

       ADDITION
           [4mdbrowenumerate[24m now takes [1m-N NewColumn [22mto give the new column a name
           other  than  "count".   Feature  requested by Mike Rouch in January
           2005.

       ADDITION
           New program [4mdbcolcopylast[24m copies the last value of a column into  a
           new  column copylast_column of the next row.  New program requested
           by Fabio Silva; useful  for  converting  dbmultistats  output  into
           dbrvstatdiff input.
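
           The behavior described above can be approximated with a few lines
           of awk.  This is only a sketch, not the dbcolcopylast
           implementation: the function name copylast is hypothetical, and it
           assumes whitespace-separated fields, a plain "#fsdb" header without
           options, and "-" as the value for the first row (which has no
           predecessor).

```shell
# copylast COLUMN_INDEX: append a column holding the previous row's value
# of the given (1-based) column; the first data row gets "-".
copylast() {
    awk -v c="$1" -v last=- '
    NR == 1 && /^#fsdb/ { print $0 " copylast_" $(c + 1); next }  # name the new column
    /^#/                { print; next }                            # pass comments through
                        { print $0 " " last; last = $c }           # emit, then remember
    '
}

printf '#fsdb experiment duration\nufs_mab_sys 37.2\nufs_mab_sys 37.3\n' | copylast 2
```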

       BUG FIX
           Several  tools  (particularly  [4mdbmapreduce[24m  and [4mdbmultistats[24m) would
           report errors like  "Unbalanced  string  table  refcount:  (1)  for
           "STDOUT"  during  global  destruction" on exit, at least on certain
           versions of Perl (for me on 5.10.1), but similar errors  have  been
           off-and-on  for  several  Perl  releases.  Although I think my code
           looked OK, I worked around this problem with  a  different  way  of
           handling standard IO redirection.

   [1m2.23,  2011-03-10  Several  small portability bugfixes; improved [4mdbcolstats[0m
       [1mfor large datasets[0m
       IMPROVEMENT
           Documentation to [4mdbrvstatdiff[24m was changed to use "sd" to  refer  to
           standard  deviation, not "ss" (which might be confused with sum-of-
           squares).

       BUG FIX
           This documentation about [4mdbmultistats[24m was missing the [4m-k[24m option  in
           some cases.

       BUG FIX
           [4mdbmapreduce[24m  was  failing  on  MacOS-10.6.3 for some tests with the
           error

               dbmapreduce: cannot run external dbmapreduce reduce program (perl TEST/dbmapreduce_external_with_key.pl)

           The problem seemed to be only in the error, not in  operation.   On
           MacOS,  the error is now suppressed.  Thanks to Alefiya Hussain for
           providing access to a Mac system that  allowed  debugging  of  this
           problem.

       IMPROVEMENT
           The   [4mcsv_to_db[24m   command   requires   an   external  Perl  library
           ([4mText::CSV_XS[24m).  On computers  that  lack  this  optional  library,
           previously  Fsdb would configure with a warning and then test cases
           would fail.  Now those test cases are skipped  with  an  additional
           warning.

       BUG FIX
           The  test suite now supports alternative valid output, as a hack to
           account for  last-digit  floating  point  differences.   (Not  very
           satisfying :-( )

       BUG FIX
           [4mdbcolstats[24m  output  for confidence intervals on very large datasets
           has changed.  Previously it failed for more  than  2^31-1  records,
           and  handling  of  T-Distributions with thousands of rows was a bit
           dubious.  Now datasets with more than 10000 rows are considered
           infinitely large and hopefully correctly handled.

   [1m2.24,  2011-04-15  Improvements  to  fix  an  old  bug  in dbmapreduce with[0m
       [1mdifferent field separators[0m
       IMPROVEMENT
           The [4mdbfilealter[24m command had a "--correct" option to work around
           incompatible field separators, but it did nothing.  Now it does the
           correct but sad, data-losing thing.

       IMPROVEMENT
           The [4mdbmultistats[24m command previously failed with  an  error  message
           when invoked on input with a non-default field separator.  The root
           cause  was  the underlying [4mdbmapreduce[24m that did not handle the case
           of reducers that generated output with a different field  separator
           than  the  input.   We  now  detect  and  repair incompatible field
           separators.  This change corrects a problem  originally  documented
           and detected in Fsdb-2.20.  Bug re-reported by Unkyu Park.

   [1m2.25, 2011-08-07 Two new tools, [4mxml_to_db[24m and [4mdbfilepivot[24m, and a bugfix for[0m
       [1mtwo people.[0m
       IMPROVEMENT
           [4mkitrace_to_db[24m  now  supports  a [4m--utc[24m option, which also fixes this
           test case for users outside of the Pacific time zone.  Bug reported
           by David Graff, and also by Peter Desnoyers (within a week of  each
           other :-)

       NEW [4mxml_to_db[24m can convert simple, very regular XML files into Fsdb.

       NEW [4mdbfilepivot[24m "pivots" a file, converting multiple rows corresponding
           to the same entity into a single row with multiple columns.

   [1m2.26, 2011-12-12 Bug fixes, particularly for perl-5.14.2.[0m
       BUG FIX
           Bugs fixed in [1mFsdb::IO::Reader[22m(3) manual page.

       BUG FIX
           Fixed  problems  where  dbcolstats  was  truncating  floating point
           numbers  when  sorting.   This  strange  behavior  happens  as   of
           perl-5.14.2  and  it  [4mseems[24m like a Perl bug.  I've worked around it
           for the test suites, but I'm a bit nervous.

   [1m2.27, 2012-11-15 Accumulated bug fixes.[0m
       IMPROVEMENT
           [4mcsv_to_db[24m now reports errors in CSV input with real diagnostics.

       IMPROVEMENT
           [4mdbcolmovingstats[24m can  now  compute  median,  when  given  the  "-m"
           option.

       BUG FIX
           [4mdbcolmovingstats[24m  non-numeric  handling (the "-a" option) now works
           properly.

       DOCUMENTATION
           The internal [4mt/test_command.t[24m test framework is now documented.

       BUG FIX
           [4mdbrowuniq[24m now correctly handles the case where there  is  no  input
           (previously  it  output  a  blank  line,  which is a malformed fsdb
           file).  Thanks to Yuri Pradkin for reporting this bug.

   [1m2.28, 2012-11-15 A quick release to fix most rpmlint errors.[0m
       BUG FIX
           Fixed a number of minor release problems  (wrong  permissions,  old
           FSF address, etc.) found by rpmlint.

   [1m2.29, 2012-11-20 a quick release for CPAN testing[0m
       IMPROVEMENT
           Tweaked the RPM spec.

       IMPROVEMENT
           Modified  [4mMakefile.PL[24m to fail gracefully on Perl installations that
           lack threads.  (Without this fix, I get  massive  failures  in  the
           non-ithreads test system.)

   [1m2.30, 2012-11-25 improvements to perl portability[0m
       BUG FIX
           Removed unicode character in documentation of [4mdbcolscorrelate[24m so pod
           tests will pass.  (Sigh, that should work :-( )

       BUG FIX
           Fixed  test  suite failures on 5 tests ([4mdbcolcreate_double_creation[0m
           was the first) due to Carp's addition of a  period.   This  problem
           was  breaking  Fsdb  on  perl-5.17.   Thanks to Michael McQuaid for
           helping diagnose this problem.

       IMPROVEMENT
           The test suite now prints out the names of tests it tries.

   [1m2.31, 2012-11-28 A release with  actual  improvements  to  dbfilepivot  and[0m
       [1mdbrowuniq.[0m
       BUG FIX
           Documentation fixes: typos in dbcolscorrelate, bugs in dbfilepivot,
           clarification for comment handling in Fsdb::IO::Reader.

       IMPROVEMENT
           Previously  dbfilepivot  assumed  the input was grouped by keys and
           didn't verify that pre-condition.  Now there is no pre-condition (it
           will  sort the input by default), and it checks if the invariant is
           violated.

       BUG FIX
           Previously dbfilepivot failed if the input had comments (oops  :-);
           no longer.

       IMPROVEMENT
           Now  dbrowuniq  has the "-L" option to preserve the last unique row
           (instead of the first), a common idiom.

   [1m2.32, 2012-12-21 Test suites should now be more numerically robust.[0m
       NEW New dbfilediff does fsdb-aware file differencing.  It does  not  do
           smart  intuition of add/removes like Unix [1mdiff[22m(1), but it does know
           about columns, and with "-E", it does numeric-aware differences.
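
           A numeric-aware comparison treats two values as equal when they
           differ only by rounding noise.  The sketch below shows one common
           way to do that with a relative tolerance.  It is not how dbfilediff
           is implemented: the function name nearly_equal and the default
           tolerance of 1e-6 are assumptions chosen for illustration.

```shell
# nearly_equal A B [EPSILON]: succeed (exit 0) if A and B differ by at most
# EPSILON times the larger magnitude; EPSILON defaults to an illustrative 1e-6.
nearly_equal() {
    LC_ALL=C awk -v a="$1" -v b="$2" -v eps="${3:-1e-6}" '
    function abs(x) { return (x < 0) ? -x : x }
    BEGIN {
        a += 0; b += 0; eps += 0                  # force numeric interpretation
        scale = (abs(a) > abs(b)) ? abs(a) : abs(b)
        exit !(abs(a - b) <= eps * scale)         # exit status 0 means "equal enough"
    }'
}

if nearly_equal 0.30000000000000004 0.3; then echo same; else echo different; fi
```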

       IMPROVEMENT
           Test suites that are numeric now use dbfilediff to do numeric-aware
           comparisons, so the test suite should now  be  robust  to  slightly
           different  computers  and  operating  systems  and  compilers  than
           [4mexactly[24m what I use.

   [1m2.33, 2012-12-23 Minor fixes to some test cases.[0m
       IMPROVEMENT
           dbfilediff and dbrowuniq now support the "-N" option to give the
           new column a different name.  (And test cases where this
           duplication mattered have been fixed.)

       IMPROVEMENT
           dbrvstatdiff now shows the t-test breakpoint with a reasonable
           number of floating point digits.

       BUG FIX
           Fixed  a  numerical  stability  problem  in the [4mdbroweval_last[24m test
           case.

[1mWHAT'S NEW[0m
   [1m2.34, 2013-02-10 Parallelism in dbmerge.[0m
       IMPROVEMENT
           Documentation for dbjoin now includes resource requirements.

       IMPROVEMENT
           Default memory usage for dbsort is now  about  256MB.   (The  world
           keeps moving forward.)

       IMPROVEMENT
           dbmerge  now  does  merging  in parallel.  As a side-effect, dbsort
           should be  faster  when  input  overflows  memory.   The  level  of
           parallelism can be limited with the "--parallelism" option.  (There
           is more work to do here, but we're off to a start.)

   [1m2.35, 2013-02-23 Improvements to dbmerge parallelism[0m
       BUG FIX
           Fsdb   temporary   files   are  now  created  more  securely  (with
           File::Temp).

       IMPROVEMENT
           Programs that sort or merge on fields (dbmerge2,  dbmerge,  dbsort,
           dbjoin)  now report an error if no fields on which to join or merge
           are given.

       IMPROVEMENT
           Parallelism in dbmerge should now be more consistent, with less
           starting and stopping.

       IMPROVEMENT
           In dbmerge, the "--xargs" option lets one give input filenames on
           standard input, rather than the command line.  This feature paves
           the way for faster dbsort for large inputs (by pipelining sorting
           and merging), expected in the next release.

   [1m2.36, 2013-02-25 dbsort pipelines with dbmerge[0m
       IMPROVEMENT
           For large inputs, dbsort now pipelines sorting and merging,
           allowing earlier processing.

       BUG FIX
           Since 2.35, dbmerge delayed cleanup of intermediate files, thereby
           requiring extra disk space.

   [1m2.37,  2013-02-26  quick  bugfix  to  support  parallel sort and merge from[0m
       [1mrecent releases[0m
       BUG FIX
           Since 2.35, dbmerge delayed removal of input files given by
           "--xargs".  This problem is now fixed.

   [1m2.38, 2013-04-29 minor bug fixes[0m
       CLARIFICATION
           Configure now rejects Windows since tests  seem  to  hang  on  some
           versions  of  Windows.  (I would love help from a Windows developer
           to  get  this  problem  fixed,  but   I   cannot   do   it.)    See
           [4mhttps://rt.cpan.org/Ticket/Display.html?id=84201[24m.

       IMPROVEMENT
           All   programs   that   use   temporary   files   (dbcolpercentile,
           dbcolscorrelate, dbcolstats, dbcolstatscores)  now  take  the  "-T"
           option and set the temporary directory consistently.

           In addition, error messages are better when the temporary directory
           has problems.  Problem reported by Liang Zhu.

       BUG FIX
           dbmapreduce  was  failing  with external, map-reduce aware reducers
           (when invoked with -M and an external program).   (Sigh,  did  this
           case  ever  work?)   This  case  should  now  work.  Thanks to Yuri
           Pradkin for reporting this bug (in 2011).

       BUG FIX
           Fixed perl-5.10 problem with dbmerge.  Thanks to Yuri  Pradkin  for
           reporting this bug (in 2013).

   [1m2.39, 2013-05-31 quick release for the dbrowuniq extension[0m
       BUG FIX
           Actually  in  2.38,  the  Fedora  [4m.spec[24m  got  cleaner dependencies.
           Suggestion        from         Christopher         Meng         via
           <https://bugzilla.redhat.com/show_bug.cgi?id=877096>.

       ENHANCEMENT
           Fsdb  files  are now explicitly set into UTF-8 encoding, unless one
           specifies "-encoding" to "Fsdb::IO".

       ENHANCEMENT
           dbrowuniq now supports "-I" for incremental counting.

   [1m2.40, 2013-07-13 small bug fixes[0m
       BUG FIX
           dbsort now has more respect for a user-given  temporary  directory;
           it no longer is ignored for merging.

       IMPROVEMENT
           dbrowuniq now has options to output the first, last, and both first
           and last rows of a run ("-F", "-L", and "-B").
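
           The three output modes can be pictured with a short awk sketch that
           collapses runs of adjacent rows.  This is not the dbrowuniq code:
           the function name uniq_runs and its mode words are hypothetical, it
           assumes a run is defined by the first column only, and it prints a
           one-row run once even in "both" mode.

```shell
# uniq_runs MODE: collapse runs of adjacent rows that share the same first
# column.  MODE is "first", "last", or "both" (first and last row of each run).
uniq_runs() {
    awk -v mode="$1" '
    function emit_run() {
        if (count == 0) return
        if (mode == "first" || mode == "both") print first
        if (mode == "last" || (mode == "both" && count > 1)) print last
    }
    /^#/ { print; next }                      # header and comments pass through
    count == 0 || $1 != key { emit_run(); key = $1; first = $0; count = 0 }
    { last = $0; count++ }
    END { emit_run() }
    '
}

printf '#fsdb event n\na 1\na 2\na 3\nb 4\n' | uniq_runs both
```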

       BUG FIX
           dbrowuniq now correctly handles "-N".  Sigh, it didn't work before.

   [1m2.41, 2013-07-29 small bug and packaging fixes[0m
       ENHANCEMENT
           Documentation  to dbrvstatdiff improved (inspired by questions from
           Qian Kun).

       BUG FIX
           dbrowuniq  no  longer  duplicates  singleton  unique   lines   when
           outputting both (with "-B").

       BUG FIX
           Add missing "XML::Simple" dependency to [4mMakefile.PL[24m.

       ENHANCEMENT
           Tests  now  show  the  diff of the failing output if run with "make
           test TEST_VERBOSE=1".

       ENHANCEMENT
           dbroweval now includes documentation for how to output extra  rows.
           Suggestion from Yuri Pradkin.

       BUG FIX
           Several  improvements  to  the Fedora package from Michael Schwendt
           via <https://bugzilla.redhat.com/show_bug.cgi?id=877096>, and  from
           the  harsh  master  that  is [4mrpmlint[24m.  (I am stymied at teaching it
           that "outliers" is spelled  correctly.   Maybe  I  should  send  it
           Schneier's  book.   And  an unresolvable invalid-spec-name lurks in
           the SRPM.)

   [1m2.42, 2013-07-31 A bug fix and packaging release.[0m
       ENHANCEMENT
           Documentation to dbjoin improved to better describe memory usage.
           (Based on
           problem report by Lin Quan.)

       BUG FIX
           The [4m.spec[24m is now [4mperl-Fsdb.spec[24m  to  satisfy  [4mrpmlint[24m.   Thanks  to
           Christopher Meng for a specific bug report.

       BUG FIX
           Test [4mdbroweval_last.cmd[24m no longer has a column that caused failures
           because of numerical instability.

       BUG FIX
           Some  tests  now  better handle bugs in old versions of perl (5.10,
           5.12).  Thanks to Calvin Ardi for help debugging this on a Mac with
           perl-5.12, but the fix should affect other platforms.

   [1m2.43, 2013-08-27 Adds in-file compression.[0m
       BUG FIX
           Changed  the  sort  on  [4mTEST/dbsort_merge.cmd[24m  to   strings   (from
           numerics)  so  we're less susceptible to false test-failures due to
           floating point IO differences.

       EXPERIMENTAL ENHANCEMENT
           Yet more parallelism in dbmerge: new "endgame-mode" builds a  merge
           tree of processes at the end of large merge tasks to get maximal
           parallelism.  Currently this feature is off by default  because  it
           can  hang  for  some inputs.  Enable this experimental feature with
           "--endgame".

       ENHANCEMENT
           "Fsdb::IO" now handles being given "IO::Pipe" objects (as exercised
           by dbmerge).

       BUG FIX
           Handling of NamedTmpfiles now supports concurrency.  This fix  will
           hopefully  fix  occasional "Use of uninitialized value $_ in string
           ne at ...NamedTmpfile.pm line 93."  errors.

       BUG FIX
           Fsdb now requires perl 5.10.  This is a bug fix because  some  test
           cases   used  to  require  it,  but  this  fact  was  not  properly
           documented.  (Back-porting to 5.008 would require removing all "//"
           operators.)

       ENHANCEMENT
           Fsdb now handles automatic compression of  file  contents.   Enable
           compression  with  "dbfilealter  -Z  xz"  (or  "gz" or "bz2").  All
           programs should operate on compressed files and  leave  the  output
           with the same level of compression.  "xz" is recommended as fastest
           and most efficient.  "gz" produces unrepeatable output (and so
           has no output test); it seems to insist on adding a timestamp.

   [1m2.44, 2013-10-02 A major change--all threads are gone.[0m
       ENHANCEMENT
           Fsdb is now thread free and only uses  processes  for  parallelism.
           This  change  is a big change--the entire motivation for Fsdb-2 was
           to exploit parallelism via threading.  Parallelism--good, but  perl
           threading--bad  for  performance.   Horribly  bad  for performance.
           About 20x worse than pipes on my box.  (See perl  bug  #119445  for
           the discussion.)

       NEW "Fsdb::Support::Freds"  provides  a  thread-like  abstraction  over
           forking, with some nice support for callbacks in  the  parent  upon
           child termination.

       ENHANCEMENT
           Details about removing threads: "dbpipeline" is thread free, and
           has new tests to verify each of its parts.  The easy cases are
           "dbcolpercentile",   "dbcolstats",   "dbfilepivot",  "dbjoin",  and
           "dbcolstatscores",  each  of  which   use   it   in   simple   ways
           (2013-09-09).  "dbmerge" is now thread free (2013-09-13), but was a
           significant  rewrite,  which brought "dbsort" along.  "dbmapreduce"
           is partly thread free (2013-09-21), again  as  a  rewrite,  and  it
           brings  "dbmultistats" along.  Full "dbmapreduce" support took much
           longer (2013-10-02).

       BUG FIX
           When running with user-only output ("-n"), dbroweval now resets the
           output vector $ofref after it has been output.

       NEW dbcolcreate will create all columns at the head of  each  row  with
           the "--first" option.

       NEW dbfilecat  will concatenate two files, verifying that they have the
           same schema.

       ENHANCEMENT
           dbmapreduce now passes comments through, rather than eating them as
           before.

           Also,  dbmapreduce  now  supports  a   "--"   option   to   prevent
           misinterpreting sub-program parameters as for dbmapreduce.

       INCOMPATIBLE CHANGE
           dbmapreduce no longer figures out if it needs to add the key to the
           output.   For multi-key-aware reducers, it never does (and cannot).
           For non-multi-key-aware reducers, it defaults to add  the  key  and
           will now fail if the reducer adds the key (with error "dbcolcreate:
           attempt  to  create  pre-existing  column...").  In such cases, one
           must disable adding the key with the new option "--no-prepend-key".

       INCOMPATIBLE CHANGE
           dbmapreduce no longer copies the input field separator by  default.
           For multi-key-aware reducers, it never does (and cannot).  For non-
           multi-key-aware  reducers,  it  defaults  to  [4mnot[24m copying the field
           separator, but it will copy it (the old default) with the
           "--copy-fs" option.

   [1m2.45, 2013-10-07 cleanup from de-thread-ification[0m
       BUG FIX
           Corrected a fast busy-wait in dbmerge.

       ENHANCEMENT
           Endgame  mode  enabled  in  dbmerge;  it  (and  also large cases of
           dbsort) should now exploit greater parallelism.

       BUG FIX
           Test case with "Fsdb::BoundedQueue" (gone since 2.44) now removed.

   [1m2.46, 2013-10-08 continuing cleanup of our no-threads version[0m
       BUG FIX
           Fixed some packaging  details.   (Really,  threads  are  no  longer
           required, missing tests in the MANIFEST.)

       IMPROVEMENT
           dbsort  now  better  communicates  with  the merge process to avoid
           bursty parallelism.

           Fsdb::IO::Writer now can take "-autoflush => 1" for line-buffered
           IO.

   [1m2.47, 2013-10-12 test suite cleanup for non-threaded perls[0m
       BUG FIX
           Removed  some  stray  "use  threads" in some test cases.  We didn't
           need them, and these were breaking non-threaded perls.

       BUG FIX
           Better  handling  of  Fred   cleanup;   should   fix   intermittent
           dbmapreduce failures on BSD.

       ENHANCEMENT
           Improved  test  framework  to  show  output when tests fail.  (This
           time, for real.)

   [1m2.48, 2014-01-03 small bugfixes and improved release engineering[0m
       ENHANCEMENT
           Test suites now skip tests for libraries that are missing.   (Patch
           for missing "IO::Compress::Xz" contributed by Calvin Ardi.)

       ENHANCEMENT
           Removed  references to Jdb in the package specification.  Since the
           name was changed in  2008,  there's  no  longer  a  huge  need  for
           backwards compatibility.  (Suggestion from Petr Šabata.)

       ENHANCEMENT
           Test   suites   now   invoke   the   perl   using   the  path  from
           $Config{perlpath}.  Hopefully this helps  testing  in  environments
           where  there  are  multiple installed perls and the default perl is
           not   the   same   as   the   perl-under-test   (as   happens    in
           cpantesters.org).

       BUG FIX
           Added  specific  encoding  to  this manpage to account for Unicode.
           Required to build correctly against perl-5.18.

   [1m2.49, 2014-01-04  bugfix  to  unicode  handling  in  Fsdb  IO  (plus  minor[0m
       [1mpackaging fixes)[0m
       BUG FIX
           Restored a line in the [4m.spec[24m to chmod g-s.

       BUG FIX
           Unicode  decoding  is  now handled correctly for programs that read
           from standard input.  (Also: New test scripts cover  unicode  input
           and output.)

       BUG FIX
           Fix to Fsdb documentation encoding line.  Addresses test failure in
           perl-5.16  and earlier.  (Who knew "encoding" had to be followed by
           a blank line.)

[1mWHAT'S NEW[0m
   [1m2.50, 2014-05-27 a quick release for spec tweaks[0m
       ENHANCEMENT
           In dbroweval, the  "-N"  (no  output,  even  comments)  option  now
           implies "-n", and it now suppresses the header and trailer.

       BUG FIX
           A few more tweaks to the [4mperl-Fsdb.spec[24m from Petr Šabata.

       BUG FIX
           Fixed  3  uses of "use v5.10" in test suites that were causing test
           failures (due to warnings, not real failures) on some platforms.

   [1m2.51, 2014-09-05 Feature  enhancements  to  dbcolmovingstats,  dbcolcreate,[0m
       [1mdbmapreduce, and new sqlselect_to_db[0m
       ENHANCEMENT
           dbcolcreate  now  has  a  "--no-recreate-fatal"  that  causes it to
           ignore creation of existing columns (instead of failing).

       ENHANCEMENT
           dbmapreduce once again is robust to reducers that output  the  key;
           "--no-prepend-key" is no longer mandatory.

       ENHANCEMENT
           dbcolsplittorows can now enumerate the output rows with "-E".

       BUG FIX
           dbcolmovingstats  is  more  mathematically  robust.  Previously for
           some inputs and  some  platforms,  floating  point  rounding  could
           sometimes cause square roots of negative numbers.
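
           The failure mode is easy to reproduce: when a variance is computed
           from running sums, rounding can leave a result that is a hair below
           zero, and the square root of that is not a number.  One common
           remedy, sketched below, is to clamp the variance at zero before
           taking the root.  Whether dbcolmovingstats uses this exact remedy
           is not stated here; the function name stddev_clamped is
           hypothetical.

```shell
# stddev_clamped: read one number per line and print the sample standard
# deviation, clamping a slightly negative variance to zero before sqrt().
stddev_clamped() {
    LC_ALL=C awk '
    { n++; sum += $1; sumsq += $1 * $1 }
    END {
        if (n < 2) { print "0.000"; exit }
        var = (sumsq - sum * sum / n) / (n - 1)   # sum-of-squares formula
        if (var < 0) var = 0                      # rounding can push this just below zero
        printf "%.3f\n", sqrt(var)
    }'
}

printf '2\n4\n4\n4\n5\n5\n7\n9\n' | stddev_clamped
```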

       NEW sqlselect_to_db converts the output of the MySQL or MariaDB select
           command into fsdb format.

       INCOMPATIBLE CHANGE
           dbfilediff  now  outputs  the  [4msecond[24m row when doing sloppy numeric
           comparisons, to better support test suites.

   [1m2.52, 2014-11-03 Fixing the test suite for line number changes.[0m
       ENHANCEMENT
           Test suites changes to be robust to exact line numbers of failures,
           since  different   Perl   releases   fail   on   different   lines.
           <https://bugzilla.redhat.com/show_bug.cgi?id=1158380>

   [1m2.53, 2014-11-26 bug fixes and stability improvements to dbmapreduce[0m
       ENHANCEMENT
           dbfilediff now supports a "--quiet" option.

       ENHANCEMENT
           Better documentation of dbpipeline_filter.

       BUGFIX
           Added  groff-base  and  perl-podlators  to the Fedora package spec.
           Fixes <https://bugzilla.redhat.com/show_bug.cgi?id=1163149>.  (Also
           in package 2.52-2.)

       BUGFIX
           An  important  stability  improvement  to  dbmapreduce.   It,  plus
           dbmultistats,  and  dbcolstats  now  support controlled parallelism
           with the "--parallelism=N" option.  They default to run with the
           number  of available CPUs.  dbmapreduce also moderates its level of
           parallelism.   Previously  it  would  create  reducers  as  needed,
           causing  CPU  thrashing  if  reducers  ran  much  slower  than data
           production.

       BUGFIX
           The combination of dbmapreduce with dbrowenumerate now works as  it
           should.   (The obscure bug was an interaction with dbcolcreate with
           non-multi-key reducers that output their own key.  dbmapreduce  has
           too many useful corner cases.)

   [1m2.54, 2014-11-28 fix for the test suite to correct failing tests on not-my-[0m
       [1mplatform[0m
       BUGFIX
           Sigh,  the  test suite now has a test suite.  Because, yes, I broke
           it, causing many incorrect failures at cpantesters.  Now fixed.

   [1m2.55, 2015-01-05 many spelling fixes and dbcolmovingstats  tests  are  more[0m
       [1mrobust to different numeric precision[0m
       ENHANCEMENT
           dbfilediff  now  can  be extra quiet, as I continue to try to track
           down a numeric difference on FreeBSD AMD boxes.

       ENHANCEMENT
           dbcolmovingstats  gave  different  test  output  (just   reflecting
           rounding error) when stddev approaches zero.  We  now  detect  and
           handle this case.  See
           <https://rt.cpan.org/Public/Bug/Display.html?id=101220> and thanks
           to H. Merijn Brand for the bug report.

       BUG FIX
           Many, many spelling bugs found by H. Merijn Brand; thanks  for  the
           bug report.

       INCOMPATIBLE CHANGE
           A    number    of    programs   had   misspelled   "separator"   in
           "--fieldseparator" and "--columnseparator" options as  "seperator".
           These are now correctly spelled.

   [1m2.56, 2015-02-03 fix against Getopt::Long-2.43's stricter error checking[0m
       BUG FIX
           Internal argument parsing uses Getopt::Long, but mixed pass-through
           and <>.  Bug reported by Petr Pisar at
           <https://bugzilla.redhat.com/show_bug.cgi?id=1188538>.

       BUG FIX
           Added missing BuildRequires for "XML::Simple".

   [1m2.57, 2015-04-29 Minor changes, with better performance from dbmultistats.[0m

       BUG FIX
           dbfilecat  now  honors  "--remove-inputs"  (previously  it didn't).
           This omission  meant  that  dbmapreduce  (and  dbmultistats)  would
           accumulate files in [4m/tmp[24m when running.  Bad news for inputs with 4M
           keys.

       ENHANCEMENT
           dbmultistats  should be faster with lots of small keys.  dbcolstats
           now supports "-k" to get some of the functionality of  dbmultistats
           (if data is pre-sorted and median/quartiles are not required).

   [1m2.58, 2015-04-30 Bugfix in dbmerge[0m
       BUG FIX
           Fixed a case where dbmerge suffered mojibake in endgame mode.  This
           bug  surfaced when dbsort was applied to large files (big enough to
           require merging) with Unicode in them; the  symptom  was  something
           like:
             Wide character in print at /usr/lib64/perl5/IO/Handle.pm line 420,
             <GEN12> line 111.

   [1m2.59,  2016-09-01  Collect  a  few  small  bug  fixes   and   documentation[0m
       [1mimprovements.[0m
       BUG FIX
           More  IO  is  explicitly  marked  UTF-8 to avoid Perl's tendency to
           mojibake on otherwise  valid  unicode  input.   This  change  helps
           html_table_to_db.

       ENHANCEMENT
           dbcolscorrelate now cross-references dbcolsregression.

       ENHANCEMENT
           Documentation  for  dbrowdiff  now  clarifies  that  the default is
           baseline mode.

       BUG FIX
           dbjoin now propagates "-T" into  the  sorting  process  (if  it  is
           required).  Thanks to Lan Wei for reporting this bug.

   [1m2.60, 2016-09-04 Adds support for hash joins.[0m
       ENHANCEMENT
           dbjoin   now  supports  hash  joins  with  "-t  lefthash"  and  "-t
           righthash".  Hash joins cache a table in memory, but do not require
           that the other table be sorted.  They  are  ideal  when  joining  a
           large table against a small one.
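
           As a rough illustration of the idea, here is a generic sketch  in
           Python.   It is not dbjoin's implementation: the name hash_join and
           the use of dicts as rows are assumptions made  for  this  example.
           The  small  table  is cached in memory, indexed by the join column,
           and the large table is streamed in whatever order it arrives.

```python
from collections import defaultdict


def hash_join(small_rows, big_rows, key):
    """Inner-join two iterables of dict rows on `key`.

    small_rows is cached in memory; big_rows is streamed and need not
    be sorted.  (Illustrative only; this is not Fsdb code.)
    """
    # Build phase: index the small table by its join key.
    index = defaultdict(list)
    for row in small_rows:
        index[row[key]].append(row)
    # Probe phase: stream the big table, emitting a merged row per match.
    for row in big_rows:
        for match in index.get(row[key], []):
            merged = dict(match)
            merged.update(row)
            yield merged


small = [{"sid": "1", "name": "alice"}, {"sid": "2", "name": "bob"}]
big = [
    {"sid": "2", "score": "90"},
    {"sid": "1", "score": "85"},
    {"sid": "3", "score": "70"},  # no partner in small, so it is dropped
]
joined = list(hash_join(small, big, "sid"))
```

           Memory use grows with the cached table only, which is  why  this
           approach suits a large table joined against a small one.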

   [1m2.61, 2016-09-05 Support left and right outer joins.[0m
       ENHANCEMENT
           dbjoin  now  handles  left and right outer joins with "-t left" and
           "-t right".

       ENHANCEMENT
           dbjoin hash joins are now  selected  with  "-m  lefthash"  and  "-m
           righthash" (not the short-lived "-t righthash" option).
           (Technically this change is incompatible with Fsdb-2.60, but no one
           but me ever used that version.)

   [1m2.62, 2016-11-29 A new yaml_to_db and other minor improvements.[0m
       ENHANCEMENT
           Documentation for xml_to_db now includes sample output.

       NEW yaml_to_db converts a specific form of YAML to fsdb.

       BUG FIX
           The test suite now uses "diff -c -b" rather than "diff -cb" to make
           OpenBSD-5.9 happier, I hope.

       ENHANCEMENT
           Comments that log operations at the end of each file now do  simple
           quoting  of  spaces.   (It  is  not  guaranteed  to be fully shell-
           compliant.)

       ENHANCEMENT
           There is a new standard option, "--header", allowing one to specify
           an Fsdb header for inputs that lack it.  Currently it is  supported
           by   dbcoldefine,  dbrowuniq,  dbmapreduce,  dbmultistats,  dbsort,
           dbpipeline.

       ENHANCEMENT
           dbfilepivot now allows the [1m--possible-pivots [22moption, and if  it  is
           provided processes the data in one pass.

       ENHANCEMENT
           dbroweval logs are now quoted.

   [1m2.63,  2017-02-03  Re-add some features supposedly in 2.62 but not, and add[0m
       [1mmore --header options.[0m
       ENHANCEMENT
           The option [1m-j [22mis now a synonym  for  [1m--parallelism[22m.   (And  several
           documentation bugs about this option are fixed.)

       ENHANCEMENT
           Additional  support for "--header" in dbcolmerge, dbcol, dbrow, and
           dbroweval.

       BUG FIX
           Version 2.62 was supposed to have this  improvement,  but  did  not
           (and  now  does):  dbfilepivot  now  allows  the  [1m--possible-pivots[0m
           option, and if it is provided processes the data in one pass.

       BUG FIX
           Version 2.62 was supposed to have this  improvement,  but  did  not
           (and now does): dbroweval logs are now quoted.

   [1m2.64, 2017-11-20 several small bugfixes and enhancements[0m
       BUG FIX
           In  dbroweval,  the  "next row" option previously did not correctly
           set up "_last_fieldname".  It now does.

       ENHANCEMENT
           The csv_to_db converter now has an optional "-F x"  option  to  set
           the field separator.

       ENHANCEMENT
           Finally  dbcolsplittocols  has  a "--header" option, and a new "-N"
           option to give the list of resulting output columns.

       INCOMPATIBLE CHANGE
           Now dbcolstats and dbmultistats produce no output  (but  a  schema)
           when  given no input but a schema.  Previously they gave a null row
           of       output.        The       "--output-on-no-input"        and
           "--no-output-on-no-input" options can control this behavior.

   [1m2.65, 2018-02-16 Minor release, bug fix and -F option.[0m
       ENHANCEMENT
           dbmultistats  and  dbmapreduce now both take a "-F x" option to set
           the field separator.

       BUG FIX
           Fixed missing "use Carp" in dbcolstats.  Also went back and cleaned
           up all uses of croak().  Thanks to Zefram for the bug report.

   [1m2.66, 2018-12-20 Critical bug fix in dbjoin.[0m
       BUG FIX
           Removed old tests from MANIFEST.  (Thanks to Hang Guo for reporting
           this bug.)

       IMPROVEMENT
           Errors for non-existing input files now include  the  bad  filename
           (before: "cannot setup filehandle", now: "cannot open input: cannot
           open TEST/bad_filename").

       BUG FIX
           Hash  joins  with  three  identical  rows  were  failing  with  the
           assertion failure "internal error: confused about overflow" due  to
           a now-fixed bug.

   [1m2.67, 2019-07-10 add support for reading and writing hdfs[0m
       IMPROVEMENT
           dbformmail  now  has  an  "mh"  mechanism  that  writes messages to
           individual files (an mh-style mailbox).

       BUG FIX
           dbrow failed to include the Carp library, leading to  failures  on
           croak.

       BUG FIX
           The dbjoin error message for an unsorted right stream was incorrect
           (it said left); it is now fixed.

       IMPROVEMENT
           All Fsdb programs can now read from and write to HDFS,  when  files
           that start with "hdfs:" are given to -i and -o options.

   [1m2.68,  2019-09-19 All programs now support automatic decompression based on[0m
       [1mfile extension.[0m
       IMPROVEMENT
           The omitted-possible-error test case for  dbfilepivot  now  has  an
           alternative  output  that I saw on some BSD-running systems (thanks
           to CPAN).

       IMPROVEMENT
           dbmerge and dbmerge2 now support "--header".   dbmerge2  now  gives
           better error messages when presented the wrong number of inputs.

       BUG FIX
           dbsort  now works with "--header" even when the file is big (due to
           fixes to dbmerge).

       IMPROVEMENT
           csv_to_db now processes data with the "binary" option, allowing  it
           to handle newlines embedded in quoted fields.

       IMPROVEMENT
           All programs now transparently decompress  input  files  that  are
           given  as  filename arguments ending in a standard extension (.gz,
           .bz2, or .xz).
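
           A minimal sketch of extension-based decompression, in Python rather
           than the Perl that Fsdb itself uses.  The helper name
           open_maybe_compressed and the OPENERS table are hypothetical;  only
           the three extensions come from the text above.

```python
import bz2
import gzip
import lzma

# Filename suffix mapped to the standard-library opener that handles it.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_maybe_compressed(path):
    """Open `path` for reading text, decompressing based on its suffix."""
    for suffix, opener in OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt", encoding="utf-8")
    # No known compression suffix: read the file as plain text.
    return open(path, "r", encoding="utf-8")
```

           The choice is made from the filename alone, so a compressed file
           without one of these extensions would be read as plain text.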

   [1m2.69, 2019-11-22 a small bugfix in dbcolstats[0m
       BUG FIX
           Filled in the test case for autodecompress, which  was  missing
           for the 2.68 release.

       ENHANCEMENT
           The  groff  program  is  required  for build, and the "Makefile.PL"
           fails if groff is missing at build time.  Thanks to Chris  Williams
           for  suggesting  this  check, and the CPAN auto-building system for
           trying many platforms.

       BUG FIX
           The dbcolstats program had a numerical instability  that  sometimes
           resulted  in  failing  with a square root of a negative number when
           many values varied right at the edge of  floating-point  precision.
           We now detect and report that case as 0 stddev.  Thanks to Hang Guo
           for providing a test case.
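
           One common way this kind of failure arises is computing variance
           from running sums, where rounding can push an algebraically
           non-negative value slightly below zero.  The Python  sketch  below
           shows  that  pattern  with the result clamped at zero.  It is not
           the dbcolstats code, and the function name sample_stddev is an
           assumption made for this example.

```python
import math


def sample_stddev(values):
    """Sample standard deviation from running sums, clamped at zero."""
    n = len(values)
    if n < 2:
        return 0.0
    total = sum(values)
    total_sq = sum(v * v for v in values)
    # Algebraically >= 0, but rounding can make it slightly negative
    # when all values are nearly identical.
    variance = (total_sq - total * total / n) / (n - 1)
    # Clamp before the square root so the result is reported as 0
    # instead of raising a math domain error.
    return math.sqrt(max(variance, 0.0))
```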

   [1m2.70,  2020-11-12  Some  small quality-of-life enhancements and corner-case[0m
       [1mbugfixes.[0m
       ENHANCEMENT
           dbcol can now take an option "-a" to include all columns,  allowing
           reordering of certain columns while passing the rest through.

       ENHANCEMENT
           dbrowuniq  and  dbmerge  now buffer comments in a way that the last
           row of data output is no longer in  the  last  block  of  comments.
           (The  data  is  identical,  but  for humans looking at output, this
           change makes it less likely to lose the last row.)

       BUG FIX
           dbmultistats and dbpipeline documentation now indicates  that  they
           support  "--header" (something they have done since version 2.62 of
           2016-11-29, but which is only now documented).

       ENHANCEMENT
           dbcolcreate now supports "--header".

       BUG FIX
           Fixed several spelling errors in deprecated  programs  and  removed
           information  about  the no-longer existing FreeBSD and MacOS ports.
           Thanks to Calvin Ardi for the patch.

       BUG FIX
           dbmerge now handles --xargs when only one  file  is  provided  (and
           passes  the  file through unchanged).  It also throws a clean error
           with --xargs if zero files  are  provided.   (To  support  dbmerge,
           dbcol  now  has an internal "--saveoutput" option.)  Thanks to Yuri
           Pradkin for reporting the unhandled corner-case.

   [1m2.71, 2020-11-16 Fix a race condition breaking test suites.[0m
       BUG FIX
           Suppress a race condition in dbcolmerge that was sometimes throwing
           the  error  "Fsdb::Support::Freds:  ending,  but  running  process:
           dbmerge:xargs" in the dbmerge_0_xargs test case, on exit.

   [1m2.72, 2020-12-01 A small bug and a packaging improvement.[0m
       BUG FIX
           dbcolhisto now handles the degenerate case where everything has the
           same value (previously it would throw "illegal division by zero").

       ENHANCEMENT
           The spec for Fedora now includes "make" as BuildRequires, something
           required for Fedora 34.

   [1m2.73, 2021-05-18 Updates dbcolpercentile with "--weighted", and  with  more[0m
       [1mipv6.[0m
       ENHANCEMENT
           dbcolpercentile now has a "--weighted" option.

       ENHANCEMENT
           The   new   Fsdb::Support::IPv6  package  includes  ipv6_normalize,
           ipv6_zeroize to rewrite ipv6 print addresses in IPv6  normal  form,
           with a 0 in each 4-nybble field.

   [1m2.74, 2021-06-23 More ipv6.[0m
       ENHANCEMENT
           Fsdb::Support::IPv6  package  includes ipv6_fullhex to rewrite ipv6
           print addresses as full, 128-bit hex values.

   [1m2.75, 2022-04-02 New type specifications in the schema  to  better  support[0m
       [1mtype conversions in python.[0m
       ENHANCEMENT
           Add optional type specifications to the schema.  Types are not used
           in  Perl,  but  are relevant in Python and Go Fsdb bindings.  Types
           use a subset of perl pack specifiers: c, s, l, q are signed 8,  16,
           32,  and  64-bit  integers,  f  is a float, d is double float, a is
           utf-8 string, and > and < can force big or little endianness.
           The default type for everything is "a",  that  is,  utf-8  strings.
           Thanks to Wes Hardaker for pushing to get this long-desired feature
           out the door; his Python bindings need types.
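
           To show how a consumer might act on these type codes, here is  a
           hedged  Python  sketch.   It is not taken from the Python bindings:
           the CONVERTERS table and convert_row function are hypothetical, and
           this sketch simply discards any endianness marker.

```python
# Converters for the type codes listed above.
CONVERTERS = {
    "c": int, "s": int, "l": int, "q": int,  # signed 8/16/32/64-bit integers
    "f": float, "d": float,                  # float and double float
    "a": str,                                # utf-8 string (the default)
}


def convert_row(fields, types):
    """Convert a list of text fields using a parallel list of type codes."""
    # strip("<>") drops a leading or trailing endianness marker, if any.
    return [CONVERTERS[t.strip("<>")](f) for f, t in zip(fields, types)]


row = convert_row(["ufs_mab_sys", "37.2", "4"], ["a", "d", "q"])
```

           Since  the  default  type  is  "a",  a file without type
           specifications would leave every field as a string.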

       ENHANCEMENT
           dbcol,  dbcolcreate,  dbcolcopylast, and dbcolrename now understand
           and propagate schema types.  dbsort, dbjoin, dbmerge, dbmerge2  and
           dbfilepivot  all  take  a  new option "-t" to sort by type-inferred
           comparison, if a type is given.

       ENHANCEMENT
           dbcolstats, dbmultistats, and  dbcolmovingstats  now  include  type
           information  in  their output schema.  (They assume input variables
           are floats, not integers.)

       ENHANCEMENT
           Even more IPv6: the functions in  Fsdb::Support::IPv6  package  now
           support  strings  of  hex  digits  as  an alternate encoding for IP
           address (and they are already  the  output  of  ipv6_fullhex),  and
           "ip_fullhex_to_normal"  converts  full  hex-encoded  IPv4  or  IPv6
           addresses to their "normal" form  (dotted-quad  or  IPv6  printable
           format).
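
           The conversions described here can be approximated with  Python's
           standard  ipaddress module.  This is only an analogy, not the
           Fsdb::Support::IPv6 code: the function names fullhex and
           fullhex_to_normal  are  made  up  for this sketch, and it assumes
           that a full-hex IPv4 address is 8 hex digits and a full-hex  IPv6
           address is 32.

```python
import ipaddress


def fullhex(addr):
    """Return an IPv6 printable address as 32 hex digits (128 bits)."""
    return format(int(ipaddress.IPv6Address(addr)), "032x")


def fullhex_to_normal(hexstr):
    """Convert 8 hex digits to dotted-quad, or 32 hex digits to IPv6 text."""
    value = int(hexstr, 16)
    if len(hexstr) == 8:
        return str(ipaddress.IPv4Address(value))
    return str(ipaddress.IPv6Address(value))
```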

   [1m3.0, 2022-04-04 Complete type support and accordingly bump major version.[0m
       NEW The  major  version number is now 3.0 to correspond to the addition
           of types (although they were actually added  in  2.75).   Old  fsdb
           files   are   supported  (Fsdb-3.0  is  backwards  compatible  with
           databases), but older versions will confuse types in new files (new
           Fsdb files are not forward compatible with old versions).

       ENHANCEMENT
           Type  specifications  in   a   few   more   programs:   dbcolhisto,
           dbcolscorrelate,         dbcolsregression,         dbcolstatscores,
           dbrowaccumulate, dbrowcount, dbrowdiff, dbrvstatdiff.

       ENHANCEMENT
           dbcolhisto now puts an empty value on any empty rows.

       NEW dbcoltype redefines column types, or  clears  them  with  the  "-v"
           option.

   [1m3.1, 2022-11-22 A post-3.0 cleanup release with minor fixes.[0m
       ENHANCEMENT
           Type   specifications  in  a  few  more  programs  that  I  missed:
           dbrowuniq, dbcolpercentile.

       ENHANCEMENT
           Minor documentation improvements.

   [1m3.2, 2023-10-11 Add new module dbcolsdecimate[0m
       NEW dbcolsdecimate reduces density in timeseries data  to  make  graphs
           with overly dense points visually similar but smaller.

       ENHANCEMENT
           yaml_to_db  now  flattens  one level of arrays into comma-separated
           lists.

       ENHANCEMENT
           Clearer installation instructions.

   [1m3.3, 2023-10-13 Quickly making dbcolsdecimate more flexible.[0m
       INCOMPATIBLE ENHANCEMENT
           dbcolsdecimate now takes either  relative  ([1m-p[22m)  or  absolute  ([1m-P[22m)
           precision,  and  precision  now  affects  only  subsequent columns.
           Also, if absolute precisions are given for all columns, data is not
           buffered.

[1mAUTHOR[0m
       John Heidemann, "johnh@isi.edu"

       See "Contributors" for the many people who have contributed bug reports
       and fixes.

[1mCOPYRIGHT[0m
       Fsdb is Copyright (C) 1991-2024 by John Heidemann <johnh@isi.edu>.

       This program is free software; you can redistribute it and/or modify it
       under the terms of version 2 of  the  GNU  General  Public  License  as
       published by the Free Software Foundation.

       This  program  is  distributed  in the hope that it will be useful, but
       WITHOUT  ANY  WARRANTY;  without   even   the   implied   warranty   of
       MERCHANTABILITY  or  FITNESS  FOR  A  PARTICULAR  PURPOSE.  See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, write to the Free Software Foundation, Inc.,
       675 Mass Ave, Cambridge, MA 02139, USA.

       A copy of the GNU General Public License  can  be  found  in  the  file
       ``COPYING''.

[1mCOMMENTS and BUG REPORTS[0m
       Any  comments  about  these  programs  should be sent to John Heidemann
       "johnh@isi.edu".

perl v5.38.2                      2024-01-06                           [4mFsdb[24m(3)