The best way to learn about the structure of libmtcelledit is to study the public API header file, and the programs that use it.
I find libmtcelledit a very useful library when developing my own software so I thought it would be helpful to include a few of these programs with the mtCellEdit package to demonstrate what can be done.
This program demonstrates how you can create a small console program to monitor network activity, and then use spreadsheet formulas for accounting. Data is saved as values in a TSV file so its contents can be read at the command line at any time (even while logging). You can run this program at bootup and it will record all network activity. I do this by inserting the following code at the end of file like /etc/rc.d/rc.S or /etc/rc.local :
/usr/bin/mtnetlog -network ppp0 -delay 10 /var/log/mtnetlog_ppp0.tsv & /usr/bin/mtnetlog -network eth0 -delay 10 /var/log/mtnetlog_eth0.tsv &
This program demonstrates how TSV files can be used to index other input files. In this case, mtEleana is a front end to the UK General Election results from 1945 to 2015.
This program demonstrates how you can use a spreadsheet to make the following jobs easier:
mtCellEdit was designed and built on a GNU/Linux system, with the GUI created with the help of GTK+ version 2. It should possible to port it anywhere else (with a little effort) as it is just a C/C++ program. I don't have enough time and energy to do this porting myself but feel free to port the code if you want (according to the GNU GPL of course) and tell me or the world about your experiences.
If you don't like the default menu keyboard shortcuts, it is possible to change them by tweaking the source code. This is a fairly easy thing to do:
cp tk_gtk2_main.c tk_gtk2_main.c.old
diff -up tk_gtk2_main.c.old tk_gtk2_main.c > ~/mtcelledit_shortcuts.diff
patch -p0 < ~/mtcelledit_shortcuts.diff
Because the TSV file format is so simple its very easy to read using the command line or a script. This is completely independent of the mtCellEdit GUI or even the core libmtcelledit library. This ensures that your data is completely accessible on any properly equipped GNU operating system.
Here is a summary of some useful commands for manipulating a TSV spreadsheet:
Command | Description |
awk | General purpose parsing, manipulation, and reporting |
cat | Concatenate various files together |
cut | Remove columns |
head | Output the top rows of a file |
paste | Paste one files data after another files data (line by line) |
sed | Simple regexp's |
sort | Sort rows |
tac | Reverse the order of the rows (vertical flip) |
tail | Output the bottom of a file |
tee | Read from stdin and output to stdout and files |
tr | Translate or delete characters |
uniq | Remove duplicate rows (see also sort -u) |
The shell also provides the usual redirections:
> | Create a new file |
>> | Append rows to a file |
Writing a small awk program to process a TSV sheet offers a few advantages over a C program. For example some simple tasks require just a few awk instructions to complete, compared to a C program that would need to be much larger and therefore more time consuming to create and maintain.
You can use the awk program to process the TSV file created by mtNetLog and present the data in a more human readable form like this:
cat /var/log/mtnetlog_ppp0.tsv | awk -F "\t" 'NR==2{for (i=1; i<=68; i++) printf "-"; printf "\n"} NR!=2{printf "%20s %10.2fMB %10.2fMB %10.2fMB %8.2f\n", $1, $7, $8, $9, 100*$8/$7}'
Which displays the in/out/total columns and a perctange of input to output flow. This will convert the data from this file:
61.555277776904 253420306 30916922 330063404 301.02403259277 13.748958587646 314.77299118042 'Start 'End 'Hours 'Bytes In 'Bytes Out 'Bytes Total 'MB In 'MB Out 'MB Total 2010-11-10 12:10:42 2010-11-11 00:39:11 12.474722223356 36193057 11935767 86299634 73.300669670105 9.0010747909546 82.30174446106 2010-11-14 09:59:13 2010-11-15 01:36:16 15.617500000633 15034107 2117117 70948296 67.253649711609 0.40791797637939 67.661567687988 2010-11-19 11:58:12 2010-11-20 02:43:36 14.756666664965 68171569 6224867 36326937 32.803199768066 1.8408670425415 34.644066810608 2010-11-20 10:09:43 2010-11-21 02:21:46 16.200833332725 84574392 1741061 56100792 52.855527877808 0.6463565826416 53.501884460449 2010-11-21 12:29:45 2010-11-21 15:00:05 2.5055555552244 49447181 8898110 80387745 74.810985565186 1.8527421951294 76.663727760315
Into this:
301.02MB 13.75MB 314.77MB 4.57 -------------------------------------------------------------------- 2010-11-10 12:10:42 73.30MB 9.00MB 82.30MB 12.28 2010-11-14 09:59:13 67.25MB 0.41MB 67.66MB 0.61 2010-11-19 11:58:12 32.80MB 1.84MB 34.64MB 5.61 2010-11-20 10:09:43 52.86MB 0.65MB 53.50MB 1.22 2010-11-21 12:29:45 74.81MB 1.85MB 76.66MB 2.48
You can also use the sort command line program in order to sort according to a given field:
cat /var/log/mtnetlog_ppp0.tsv | tail -n+3 | sort -t $'\t' -nk 3 | awk -F "\t" '{ printf "%s\t%5.2f hours\n", $1, $3 }'
Creates this output:
2010-11-21 12:29:45 2.51 hours 2010-11-10 12:10:42 12.47 hours 2010-11-19 11:58:12 14.76 hours 2010-11-14 09:59:13 15.62 hours 2010-11-20 10:09:43 16.20 hours
Awk is able to process the mtEleana data very easily due to the simplicity of the files. For example here is an awk program to calculate the total number of votes for each political party:
# party_totals.awk # by Mark Tyler 6th December 2010 # Tweaked 22nd August 2011 to be ~25% faster (let sed remove ' at end) # examples: # cat 2010.tsv | awk -f party_totals.awk | sort | sed -e "s/^'//" # This counts up the totals for each political party in the 2010 general election, sorting by name of party. # cat 2010.tsv | awk -f party_totals.awk | sort -t $'\t' -rnk 2 | head | sed -e "s/^'//" # This outputs a top 10 of the most popular political parties from the 2010 general election. BEGIN { FS = "\t" } NR>3 { if ( $5 > 0 ) list[ $4 ] += $5 } END { for ( party in list ) { printf "%s\t%s\n", party, list[ party ] } }
The output from the second example is as follows:
Conservative 10703654 Labour 8606517 Liberal Democrat 6836248 UK Independence Party 919471 British National Party 564321 Scottish National Party 491386 Green 285612 Independent 192963 Sinn Fein 171942 Democratic Unionist Party 168216
If a text file contains a rectangular lump of text with various columns set up using spaces it is possible to chop this up into a TSV file ready for spreadsheet editing:
cat fixed.txt | awk -v FIELDWIDTHS='20 11 2 11 2 11 2 9' -v OFS='\t' '{ $1=$1; print }' > fixed.tsv
This cuts up the output from the mtnetlog example in 5.3.1.1. You may also want to remove spaces using the following sections.
cat fixed.tsv | tr -d " "
This is a very crude and indiscriminate weapon of mass destruction, so care is needed when using it. For example it destroys the space between the date and time in the example from cutting up the text from 5.3.1.1. To avoid this problem use the following example.
You can use sed or awk to chop off leading and trailing spaces in each cell. This is useful if you want to preserve the spaces within the cell, but remove useless spaces after splitting a fixed width file:
cat fixed.tsv | sed -e "s/^[ ]\+//" | sed -e "s/[ ]\+$//" | sed -e "s/\t[ ]\+/\t/g" | sed -e "s/[ ]\+\t/\t/g" cat fixed.tsv | awk '{ sub( /^ +/, "" ); sub( / +$/, "" ); gsub( /\t +/, "\t" ); gsub( / +\t/, "\t" ); print }'
To reverse the operation in 5.3.1.3 and to create a fixed width file from a TSV file:
cat fixed.tsv | awk -v FS='\t' '{ printf "%20s%11s%2s%11s%2s%11s%2s%9s\n", $1, $2, $3, $4, $5, $6, $7, $8 }'
In this example, we want all of the chunks of text to be right justified, but if we wanted to have them left aligned we would simply place a '-' before the number like this:
cat fixed.tsv | awk -v FS='\t' '{ printf "%-20s%-11s%-2s%-11s%-2s%-11s%-2s%-9s\n", $1, $2, $3, $4, $5, $6, $7, $8 }'
The command line program unzip allows you to access the sheets held inside a CED/ZIP file. Here is an example using the example data in the file 'test_suite.tsv.zip' from the mtCellEdit source code distribution:
unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | awk -F "\t" 'NR>340 { printf "%s - %s\n", $1, $2 } NR>350 {exit}'
This outputs the first 2 columns from rows 341 to 350, which results in this being displayed:
'King's Lynn - 'DG Bullard 'Kingston upon Thames - 'JA Boyd-Carpenter 'Kinross and West Perthshire - 'WG Leburn 'Kirkcaldy - 'HPH Gourlay 'Knutsford - 'WH Bromley-Davenport 'Lanark - 'JCM Hart 'Lanarkshire North - 'MM Herbison 'Lancaster - 'HJ Berkeley 'Leeds East - 'DW Healey 'Leeds North East - 'K Joseph 'Leeds North West - 'D Kaberry
As you can see it prints out the ' character at the beginning of each field which is undesirable, so we ask awk to cut this out for us using the sub function:
unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | awk -F "\t" 'NR>340 { sub(/^\x27/, "", $0); gsub(/\x09\x27/, "\t", $0); printf "%s - %s\n", $1, $2 } NR>350 {exit}'
Which creates this:
King's Lynn - DG Bullard Kingston upon Thames - JA Boyd-Carpenter Kinross and West Perthshire - WG Leburn Kirkcaldy - HPH Gourlay Knutsford - WH Bromley-Davenport Lanark - JCM Hart Lanarkshire North - MM Herbison Lancaster - HJ Berkeley Leeds East - DW Healey Leeds North East - K Joseph Leeds North West - D Kaberry
If you wanted to see the last 10 rows in a file you would simply call the tail command first:
unzip -p test_suite.tsv.zip "test_suite.tsv/sheet/Data (1959 GE)" | tail | awk -F "\t" '{ sub(/^\x27/, "", $1); sub(/^\x27/, "", $2); printf "%s - %s\n", $1, $2 }'
Which creates this:
Worcester - GR Ward Worcestershire South - PG Agnew Workington - TF Peart Worthing - OL Prior-Palmer Wrekin, The - W Yates Wrexham - JI Jones Wycombe - J Hall Yarmouth - A Fell Yeovil - JWW Peyton York - CB Longbottom
If you have saved some TSV data inside a compressed GZ file you cat use the zcat command to access this data. This example extracts the first and third columns from a file and stores the results in a new compressed output file:
zcat input.tsv.gz | awk -F "\t" '{ printf "%s\t%s\n", $1, $3 }' | gzip > output.tsv.gz
Occasionally you may come across a file with an unusual field separator such as : ; or @. Assuming these characters only exist as field separators then its very easy to swap the characters to tabs using tr:
cat input.txt | tr "@" "\t" > output.tsv
For exotic multi-character separators like @;@ you can use awk:
cat input.txt | awk '{ gsub( /@;@/, "\t" ); print }' > output.tsv
You can also change a normal TSV file into a file with rather more eccentric field separators:
cat input.tsv | tr "\t" "@" > output.txt cat input.tsv | awk '{ gsub( /\t/, "@;@" ); print }' > output.txt
As mentioned in section 5.3, a GNU system can manipulate a TSV file in a number of ways. However there are some jobs that are not possible to complete quickly and easily so I have assembled a group of command line utilities that can be used by the shell to manipulate spreadsheet files.
These utilities are designed for accessing CSV or TSV files using basic scripts. If you need access to data inside a CED/ZIP file then you should unzip it to a temp directory and access the sheets there. Any complex or CPU intensive tasks should be left to custom built C programs.
These utilities are accessed via the binary program cedutils. You access its functions by calling it via a symlink to execute a single command. Because all commands really exist in a central program, it is also possible to do multiple different operations within the same line by using the argument "-com" followed by the new command. The advantage of this is in terms of speed and fewer file operations. When you call 2 commands separately in two script lines, you action a load and a save operation twice. However, with multiple commands in a single line you only have a single load and a single save operation which saves CPU cycles, and involves fewer file read/write operations. Here is an example which flips a sheet and inserts 10 empty rows:
cedflip input.tsv -com insert -total 10 "" -o output.tsv
Compare this to the two operations done separately:
cedflip input.tsv -o output.tsv cedinsert -total 10 output.tsv -o output.tsv
More information can be obtained by consulting the man page:
man cedutils
mtCedCLI is a program that provides spreadsheet functionality via a Command Line Interface (CLI). It offers exactly the same functions as the GUI based mtCellEdit, but instead uses text commands.
Both GUI's and CLI's have value, and allow particular jobs to be done more efficiently. For example I use mtCedCLI as part of the testing phase before I release the whole mtCellEdit suite. I have created a number of tests in the form of scripts which test each part of the core libraries. This exposes any problems that may have crept into the code, and is a good form of regression testing.
mtCedCLI reads instructions from the user via GNU Readline. This offers useful interactive facilities like using the up and down arrows to access previous commands.
It is also possible to run a script from the command line like these examples:
cat script.txt | mtcedcli mtcedcli < script.txt mtcedcli < script.txt > log.txt 2>&1
Here is the complete list of commands that mtCedCLI accepts:
Command | Arguments | Notes |
about | About the program | |
clear content | Clear cell content in current selection | |
clear prefs | Clear cell preferences in current selection | |
clear | Clear cells in current selection | |
clip flip_h | Horizontal flip of clipboard | |
clip flip_v | Vertical flip of clipboard | |
clip load | <OS FILENAME> | Load clipboard from a file |
clip save | <OS FILENAME> | Save the current clipboard to this file |
clip rotate_a | Rotate the clipboard anticlockwise by 90 degrees | |
clip rotate_c | Rotate the clipboard clockwise by 90 degrees | |
clip transpose | Transpose the clipboard | |
copy output | Copy the output of the currently selected cells to the clipboard | |
copy values | Copy the values of the currently selected cells to the clipboard | |
copy | Copy the currently selected cells to the clipboard | |
cut | Copy the currently selected cells to the clipboard and clear the cells | |
delete column | Delete the currently selected column(s) | |
delete graph | Delete the currently active graph | |
delete row | Delete the currently selected row(s) | |
delete sheet | Delete the currently active sheet | |
duplicate sheet | Duplicate the currently active sheet | |
export graph | <OS FILENAME> | Export the current graph to this file |
export output graph | <OS FILENAME> <FILETYPE> | Export the current graph output to this file and type (as per 5.5.3) |
export output sheet | <OS FILENAME> <FILETYPE> | Export the current sheet output to this file and type (as per 5.5.4) |
export sheet | <OS FILENAME> <FILETYPE> | Export the current sheet to this file and type (as per 5.5.5) |
find | <TEXT> [wild] [case] [value] [all] | Find cells with this text (possibly using the current selection) |
help | [ARG]... | Display help on this command |
import book | <OS FILENAME> | Import a book from a file into the current book |
import graph | <GRAPH NAME> <OS FILENAME> | Import a file as a graph and give it this name |
info | Display current information | |
insert column | [clip] | Insert column(s) according to the current selection (or clipboard width) |
insert row | [clip] | Insert row(s) according to the current selection (or clipboard height) |
list files | List files in the active book | |
list graphs | List graphs in the active book | |
list sheets | List sheets in the active book | |
load | <OS FILENAME> | Load a new book |
new book | Destroy the current book and start a new book with a sheet called "Sheet 1" | |
new sheet | Add a new empty sheet to the book | |
new | Destroy the current book and start a new book with a sheet called "Sheet 1" | |
paste content | Paste clipboard contents onto the current selection | |
paste prefs | Paste clipboard preferences onto the current selection | |
paste | Paste clipboard onto the current selection | |
print cell num | Print the cell number for all the selected cells | |
print cell text | Print the cell text for all the selected cells | |
print cell type | Print the cell type for all the selected cells | |
print prefs book | Print the book preferences | |
print prefs cell | Print the cell preferences in all the selected cells (that are not default) | |
print prefs sheet | Print sheet preferences | |
print prefs state | Print state preferences | |
Print the cell output for all the selected cells | ||
q | Quit the program | |
quit | Quit the program | |
recalc book | Recalculate the book | |
recalc sheet | Recalculate the sheet | |
recalc | Recalculate the sheet | |
redo | Redo next sheet action | |
rename graph | <NEW NAME> | Rename the current graph |
rename sheet | <NEW NAME> | Rename the current sheet |
save | Save the current book | |
save as | <OS FILENAME> [FILETYPE] | Save the current book to this file (optionally changing the type as per 5.5.2) |
select | < all | CELLREF[:CELLREF] > | Set the current cursor selection |
set 2dyear | [ YEAR START ] | Change the currently selected cells if they contain a date with a 2 digit year |
set book | <INTEGER> | Change the current active book (0..4) |
set cell | <CELL CONTENT> | Set the cell text |
set graph | <GRAPH NAME> | Set the current graph |
set prefs book | <KEY> <DATA> | Set book preference (as per A.8.3) |
set prefs cell | <KEY> <DATA> | Set currently selected cell preferences (as per A.8.1) |
set prefs cellborder | <DATA> | Set currently selected cell border preferences (as per 5.5.6) |
set prefs sheet | <KEY> <DATA> | Set the preferences for the current sheet (as per A.8.2) |
set prefs state | <KEY> <DATA> | Set the state preferences (as per A.8.4) |
set sheet | <SHEET NAME> | Set active sheet |
set width | < auto | INTEGER > | Set selected columns width as integer > 0 (0=default=10), or by calculating the current maximum width |
sort column | <EXPRESSION> | Sort selected columns by expression as per 5.5.7 |
sort row | <EXPRESSION> | Sort selected rows by expression as per 5.5.7 |
undo | Undo last sheet action |
tsv_book |
tsv_value_book |
ledger_book |
ledger_value_book |
eps |
png |
ps |
svg |
eps |
html |
pdf_paged |
png |
ps |
svg |
tsv |
tsv_q |
tsv |
tsv_gz |
tsv_noq |
tsv_val |
tsv_val_gz |
tsv_val_noq |
csv |
csv_noq |
csv_val |
csv_val_noq |
ledger |
ledger_gz |
ledger_val |
ledger_val_gz |
-1 | Remove all borders |
-2 | Thin Outside |
-3 | Thick Outside |
-4 | Double Outside |
-5 | Thin Top And Bottom |
-6 | Thick Top And Bottom |
-7 | Double Top And Bottom |
Horizontal | |
0 | Clear Top |
1 | Clear Middle |
2 | Clear Bottom |
3 | Thin Top |
4 | Thin Middle |
5 | Thin Bottom |
6 | Thick Top |
7 | Thick Middle |
8 | Thick Bottom |
9 | Double Top |
10 | Double Middle |
11 | Double Bottom |
Vertical | |
12 | Clear Left |
13 | Clear Centre |
14 | Clear Right |
15 | Thin Left |
16 | Thin Centre |
17 | Thin Right |
18 | Thick Left |
19 | Thick Centre |
20 | Thick Right |
21 | Double Left |
22 | Double Centre |
23 | Double Right |
<INTEGER> , < a | d > [c] [ , ... ]
For example, expression 1,ac,2,d sorts by row/column 1 (ascending, case sensitive), then row/column 2 (descending).
Because the interface to mtCedCLI is just text, this means that shell scripts can control the program.
#!/bin/sh # cedsheetbook.sh - Puts sheet files into a single book file # usage: cedsheetbook.sh BOOKFILENAME SHEETFILE... | mtcedcli # e.g. cedsheetbook.sh ~/output.zip *.tsv *.csv | mtcedcli # by Mark Tyler 19th February 2013 echo "delete sheet" echo "save as \"$1\"" shift while [ "$1" != "" ] do echo "import book \"$1\"" shift done echo "save"
The handy thing with this approach is that you can dry run the script first by using:
./cedsheetbook.sh ~/output.zip *.tsv *.csv
If you are happy with the text output then you can commit the actions with:
./cedsheetbook.sh ~/output.zip *.tsv *.csv | mtcedcli
The following script does the reverse of the first script. It reads a ZIP file and extracts all of the sheets to TSV files in the given directory.
#!/bin/sh # cedbooksplit.sh - Extracts the sheets from a book # usage: cedbooksplit.sh BOOKFILENAME OUPUTDIRECTORY # e.g. cedbooksplit.sh ~/output.zip ~/temp_sheet # by Mark Tyler 19th February 2013 # On error exit set -e mkdir -p "$2" TMP_FILE="$2"/tmp.txt printf "load \"$1\"\n" > "$TMP_FILE" printf "load \"$1\"\nlist sheets" | mtcedcli -q | awk '!/^$/ && !/^mtcedcli/' | while read SHEET do LOWER=$(echo "$SHEET" | tr '[:upper:]' '[:lower:]') case "$LOWER" in *csv ) TYPE=csv;; * ) TYPE=tsv;; esac echo set sheet \"$SHEET\" >> "$TMP_FILE" echo export sheet \"$2/$SHEET\" $TYPE >> "$TMP_FILE" done cat "$TMP_FILE" echo echo Type y and press ENTER to run this script read KEY if [ "$KEY" = "y" ] then cat "$TMP_FILE" | mtcedcli fi echo