Sunday, February 14, 2010

Vim Substitution to make List Items

As you can see from the Hyperhelper code, most of my lists used "-" as the bullet.

I used "substitution" to mark these all up as List Items (ie <li> </li>).
Substitution is a command-mode command with this form:
:<range>s/<from>/<to>/
Pretty simple and we have come across it before.
So to convert "- " lists to HTML, I used:
%s+^- +</li><li>+
That is:
%s - over the whole file substitute
+ - I used "+" as a delimiter instead of the usual "/" because the slash appeared in the body of the substitution (</li>), but I could have escaped it with \/
^- + - (the "from" part of the substitution) the caret is a regular-expression anchor meaning that the pattern (in this case a dash followed by a space, "- ") must be the first characters on the line. The final "+" is the delimiter that ends the "from" part.
</li><li>+ - the "to" part. Putting the closing tag before the opening tag closes the previous list item on each line, and I can then simply replace the first closing tag of each list with the <ul> tag to start the list. Otherwise it is all good.

If you look carefully at the Hyperhelper code, you will notice that I nested lists with spaces (subpoints were indented a space or two). To nest my HTML lists in the same way, I simply added a space to the "from" and "to" fields of the substitution and repeated it.

The interesting thing about substitution is that it shows the number of substitutions it makes. I recorded these, so I can tell you that there are a grand total of 8,034 bulleted list items in the whole of Traditional Hydrotherapy. Even though some lists are not bulleted, I would have expected more. In any case it has taken longer to write about this than it actually took to do it. Substitution is much faster than macros! The whole process was done in under an hour.

So here are my substitutions followed by the number of substitutions in each of the files

%s+^- +</li><li>+
Problems: 1726 Diseases: 2098 Effects: 314 Techniques: 186 Total: 4324
%s+^ - + </li><li>+
Problems: 1259 Diseases: 1096 Effects: 158 Techniques: 348 Total: 2861
%s+^  - +  </li><li>+
Problems: 213 Diseases: 219 Effects: 32 Techniques: 49 Total: 513
%s+^   - +   </li><li>+
Problems: 91 Diseases: 154 Effects: 8 Techniques: 30 Total: 283
%s+^    - +    </li><li>+
Problems: 12 Diseases: 30 Effects: 0 Techniques: 11 Total: 53
Grand Total: 8,034
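
The five substitutions above differ only in the number of leading spaces, so in principle they could be folded into one pattern that captures the indent and puts it back (in Vim, something like :%s+^\( *\)- +\1</li><li>+). What follows is a minimal Python sketch of that idea, not what was actually run here; the function name mark_list_items and the sample text are made up for illustration. re.subn returns the number of replacements, much as Vim reports it.

```python
import re

# Optional leading spaces (captured), then "- " at the start of a line.
BULLET = re.compile(r"^( *)- ", flags=re.MULTILINE)

def mark_list_items(text):
    """Replace each leading "- " with "</li><li>", keeping the indent.

    Returns a tuple: (new text, number of substitutions made).
    """
    return BULLET.subn(r"\1</li><li>", text)

# Hypothetical sample, not taken from the original files.
sample = "- First item\n - Sub-item\n  - Sub-sub-item\nNot a list line\n"
marked, count = mark_list_items(sample)
print(marked)
print(count, "substitutions")
```

Unlike the five separate passes, this gives one combined count rather than a count per nesting level.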

Quotes Headings with Vim Macros and Substitutions

Even though I had completed the Quote Index, in the pages the headings were not marked up. To do this I used both macros and substitute. Once again the substitute was much faster.

What was in the text file was:

#quote Dr Kellogg says...
#qhead Hydrotherapy Departments

The most elegantly equipped establishment for the administration
of hydriatic procedures may be only the means for bungling and
unscientific dabbling with human ailments, unless conducted under
skilled medical direction and by the aid of attendants well
trained in the versatile procedures of hydrotherapy. p 402
#end

I wanted to change it into:
<h4>Dr Kellogg says...</h4>
<h4>Hydrotherapy Departments</h4>
<p>The most elegantly equipped establishment for the administration
of hydriatic procedures may be only the means for bungling and
unscientific dabbling with human ailments, unless conducted under
skilled medical direction and by the aid of attendants well
trained in the versatile procedures of hydrotherapy. p 402</p>
#end

After searching for quotes (with "/#quote"), I would use "@g" if there was a heading, or "@x" if there was no heading (ie. it was the same as the card title).

"@g" - make a H4 heading for the quote IF there is a heading and finish </p> after

j^2dwi<h4>~@@7</h4>^[j^i<p>^[
That is:
j^ - move the cursor down one line and to the first character on the line (ie. to the hash in #qhead)
2dw - delete two words (# and qhead)
i<h4> - change to Insert mode and insert the text <h4>
~@@7</h4> - <End> to move to the end of the line and insert the text </h4>
^[j - <Esc> out of Insert mode, back to Normal mode and move cursor down a line (to the body of the quote)
^i<p>^[ - move to the first character on the line, change to Insert mode and start the paragraph by inserting <p>, then <Esc> back to Normal mode

"@x" - where there is no #qhead given (it finds the card title and inserts it)
o^[ma?#card^Mj"zy$'apI<h4>~@@7</h4><p>^[
o^[ma - open a line below the #quote line, <Esc> to get back to normal mode and set line as "mark a"
?#card^M - search backwards for #card <Enter>
j - move down a line to the page title
"zy$ - into "register z" yank (copy) everything to the end of the line (ie the whole title)
'ap - move back to mark a (under the #quote line) and "put" the copied text (the card title)
I<h4> - before the first character of the line Insert the text <h4>
~@@7</h4><p>^[ - <End>, insert the text "</h4><p>" then <Esc> back to Normal mode

And finally, the "Dr Kellogg says..." line itself was marked up with a straightforward substitution, which was much, much faster. I used "+" as the delimiter again because of the slash in </h4>:
%s+#quote Dr Kellogg says...+<h4>Dr Kellogg says...</h4>+

All up, this process took an hour.
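
For comparison, the whole "what was in the text file" to "what I wanted" conversion can be sketched outside Vim. This is an illustrative Python sketch only, assuming each quote is a list of lines that ends with "#end" and that the body is a single paragraph; the function name mark_up_quote is hypothetical.

```python
def mark_up_quote(lines):
    """Convert one #quote ... #end block (a list of lines) to HTML.

    Assumes the block ends with "#end" and the body is one paragraph.
    """
    out, body = [], []
    for line in lines:
        if line.startswith("#quote ") or line.startswith("#qhead "):
            # Drop the marker word and wrap the rest in an <h4> heading.
            out.append("<h4>" + line.split(" ", 1)[1] + "</h4>")
        elif line == "#end":
            if body:
                body[0] = "<p>" + body[0]
                body[-1] = body[-1] + "</p>"
            out.extend(body)
            out.append(line)
            body = []
        elif line.strip():
            # Any other non-blank line belongs to the quote body.
            body.append(line)
    return out

quote = [
    "#quote Dr Kellogg says...",
    "#qhead Hydrotherapy Departments",
    "",
    "The most elegantly equipped establishment for the administration",
    "trained in the versatile procedures of hydrotherapy. p 402",
    "#end",
]
html = mark_up_quote(quote)
print("\n".join(html))
```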

Vim Macros, Slow Edits

Now that the html links are finished and the indexes are complete, it is time to start getting the other html done.

I started with the headings. To show where I had taken the information from, I had used four headings:
  1. from Hydrothermic Remedies...
  2. from Dr JH Kellogg's Prescriptions...
  3. from Dr JH Kellogg's Hydriatic Techniques...
  4. from Dr GK Abbott's Prescriptions...

Once again I used Vim's keyboard macros:

"@g" To turn all the "#from from Dr JH Kellogg's Prescriptions..." etc. to "<h2>from Dr JH Kellogg's Prescriptions...</h2>".

/#from^M2dwi<h2>~@@7</h2>^[n
That is:
/#from^M - search for #from <Enter>
2dw - delete 2 words (vim thinks "#" and "from" are both words - I could have done it with dW)
i<h2> - change to Insert mode and insert <h2>
~@@7</h2> - <End> of the line and insert </h2>
^[n - <Esc> back to normal mode and search for next occurrence of the search pattern (#from)

I did this in batches again with the command, 100@g. Even so, the whole process took several hours. It has been a long time since I could leave a computer chugging away and come back in 10 minutes and find it still working. I'm using a 1GHz machine so that could explain it.

It ran very slowly. I will use substitute more in the future.

I suspect the reason it ran so slowly was that I had syntax highlighting on and Vim had to recolour the rest of the file after each tag was inserted.
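
As a sketch of what "use substitute more" could look like for these headings: a single pass that captures the text after "#from" and wraps it in <h2> tags. In Vim that would be roughly :%s+^#from \(.*\)+<h2>\1</h2>+ (not tried against the original files). The Python below shows the same idea under that assumption; mark_from_headings is a made-up name.

```python
import re

# "#from " at the start of a line, then the heading text up to the line end.
FROM_LINE = re.compile(r"^#from (.*)$", flags=re.MULTILINE)

def mark_from_headings(text):
    """Turn every "#from <heading>" line into "<h2><heading></h2>".

    Returns a tuple: (new text, number of headings converted).
    """
    return FROM_LINE.subn(r"<h2>\1</h2>", text)

sample = (
    "#from from Dr JH Kellogg's Prescriptions...\n"
    "some card text\n"
    "#from from Hydrothermic Remedies...\n"
)
converted, n_headings = mark_from_headings(sample)
print(converted)
```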

Vim Sort to make an Index

I have shown you how I began creating an Index of Kellogg's Quotes, firstly using Vim keyboard macros that found the quotes and put their URLs in a file. Then I used Vim's search and replace to get a file with just the quote's title and URL, in other words a list of quotes. It was not in order, so:
To sort the index list into alphabetical order:
"@w" to get the displayed text to the front of the line for sorting:
^df>$pj
That is:
^ - move to the beginning of the line
df> - delete to and including the ">"
$p - move to the end of the line and "p"ut or paste what had been deleted
j - move down one line

Once again I did this 20 lines at a time with the command
20@w

Our line now looks like:
Hydrotherapy Departments</a><a href="../Techniques/OtherApplications.html">

The actual sorting is very simple
Place the cursor on the first line of the list
ma - to mark this line
G - to go to the end of the file (and the list)
!'a sort - filters the block back to the mark through the external shell command "sort"

Sorted!

"@q" To get the link back in shape:
^2f<D^Pj
That is:
^ - go to the beginning of the line
2f< - move the cursor to the second "<" character
D - delete to the end of the line
^P - move back to the beginning of the line and "P"ut the deleted text before the cursor
j - move down a line, ready to repeat the procedure.

Once more I repeated it 20 times with:
20@q

Finally, to complete the HTML contents of QuoteIndex.html, I made the title line and marked it up as the appropriate header, then simply put <ul> under the title to start the list and </ul> at the end of the file to end the list. Then:

"@w" Make into list items
I<li>^[A</li>^[j

I<li>^[ - Insert at the beginning of the line the text "<li>" then <Esc> back to normal mode
A</li>^[ - Append at the end of the line the text "</li>" then <Esc> back to normal mode
j - move down a line, ready to repeat

Once again done with a 20@w to do 20 lines at a time. And presto, the file was back to normal HTML.

The body of QuoteIndex.html is now complete.
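
The move-sort-move-back routine above exists only so that "sort" sees the displayed text first. Outside Vim the same result can be had by sorting on a key, which leaves the links untouched. A minimal Python sketch, assuming one link per line; build_index is a hypothetical name, the third sample link is invented, and sorting case-insensitively is a choice made here rather than something taken from the original.

```python
import re

# The displayed text sits between the first ">" and the closing "</a>".
LINK_TEXT = re.compile(r">([^<]*)</a>")

def build_index(links):
    """Sort <a> links by their displayed text and wrap them in a <ul> list."""
    def display_text(link):
        match = LINK_TEXT.search(link)
        return (match.group(1) if match else link).lower()

    items = ["<li>" + link + "</li>" for link in sorted(links, key=display_text)]
    return ["<ul>"] + items + ["</ul>"]

links = [
    '<a href="../Problems/VisceralCongestion.html">Visceral Congestion</a>',
    '<a href="../Techniques/OtherApplications.html">Hydrotherapy Departments</a>',
    '<a href="../Effects/Sample.html">A sample quote</a>',  # invented entry
]
index = build_index(links)
print("\n".join(index))
```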

Vim Search and Replace to edit lines

In the previous post we began making an index of Kellogg's quotes. When all the Sections had been read into QuoteIndex.html I used the following macros:

"@q" searches for #qhead (Kellogg's heading) and replaces the displayed text in the link with this title

/#qhead/e^Mw"ly$2k^f>ldt<"lPn

/#qhead/e^M - search for #qhead and leave the cursor at the "e"nd of the search text (ie. the d of #qhead) <Enter>
w - move to next word to the right (the first word of the heading)
"ly$ - yank into "register l" everything to the end of the line (ie. the title of the quote, "Hydrotherapy_Departments")
2k^ - move cursor up two lines and then to the first character of the line (the #card line)
f> - "find" the first occurrence of ">" on the line and move the cursor onto it
ldt< - move one character to the right, then "d"elete from there up to "<", ie all the displayed text
"lP - "P"ut or paste the contents of "register l" in the displayed text area of the link
n - repeat the search, moving to the next #qhead

The resulting line now looks like:
#card @<a href="../Techniques/OtherApplications.html">Hydrotherapy_Departments</a>

"@w" gets rid of all the #quote and #qhead lines

/#quote^M2dd

This means:
/#quote^M - search for the text "#quote" <Enter>
2dd - delete two lines (the #quote and the #qhead lines)
I did this 20 cards at a time by typing
20@w until the end of the file, leaving nothing but the #card lines.

To get rid of anything but the link use a simple substitution:
%s/#card @//
- for the whole file substitute nothing for the text "#card @", in effect deleting it
The line now looks like:
<a href="../Techniques/OtherApplications.html">Hydrotherapy Departments</a>
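
Taken together, the index macros boil down to: for every #quote, take the link from the nearest #card line above it, prefix the directory, and show the #qhead title instead of the card title. A rough Python sketch of that logic follows. It assumes the #card lines already hold HTML links as in the example above, and that a quote with no #qhead falls back to the card title; quote_index_links is a hypothetical name.

```python
import re

# "#card @<a href="FILE">TITLE</a>" with the file name and title captured.
CARD = re.compile(r'^#card @<a href="([^"]*)">([^<]*)</a>$')

def quote_index_links(lines, directory):
    """Return one index link per #quote found in a section's lines."""
    links = []
    href = title = None
    for i, line in enumerate(lines):
        card = CARD.match(line)
        if card:
            href, title = card.groups()
        elif line.startswith("#quote") and href is not None:
            following = lines[i + 1] if i + 1 < len(lines) else ""
            if following.startswith("#qhead "):
                heading = following[len("#qhead "):]
            else:
                heading = title  # assumption: no #qhead means use the card title
            links.append('<a href="../%s/%s">%s</a>' % (directory, href, heading))
    return links

section = [
    '#card @<a href="OtherApplications.html">Other Applications</a>',
    "Electrotherapy",
    "#quote Dr Kellogg says...",
    "#qhead Hydrotherapy Departments",
    "",
    "The most elegantly equipped establishment for the administration",
    "#end",
]
found = quote_index_links(section, "Techniques")
print(found)
```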

Making an Index page with Vim Macros

As I worked my way through Kellogg's "Rational Hydrotherapy", whenever I came across particularly interesting bits of prose, I added these to a stack, "Quote". When making Traditional Hydrotherapy I appended these quotes to the appropriate cards using similar macros to those described in Vim Keyboard Macros and Split Windows. These quotes will appear in the third (right) column on the final page.
This is how a quote appeared in the file

#card @<a href="OtherApplications.html">Other Applications</a>
Electrotherapy
Massage
Other Baths
Diet
#quote Dr Kellogg says...
#qhead Hydrotherapy Departments

The most elegantly equipped establishment for the administration
of hydriatic procedures may be only the means for bungling and
unscientific dabbling with human ailments, unless conducted under
skilled medical direction and by the aid of attendants well
trained in the versatile procedures of hydrotherapy. p 402
#end

The "#quote" marks the quote; "#qhead" indicates the title if it differs from the page title.

I wanted a page where people could look up quotes on particular topics, so that meant creating QuoteIndex.html... an index page.

To achieve this I used the following macros on each of the four section files:

"@q" searches for #quote and copies it, the heading and the #card (including the link to the page) to quoteindex.txt (which was open in a split screen with the txt file being edited, in this case Techniques.txt)

/#quote^Mmaj$y'a^Wpp^Wp?#card^Myy^WpPG^Wp'ajj
This means:
/#quote^M - search for "#quote" <Enter>
ma - set "mark a" on the line
j$y'a - move down a line and go to the end of the line then yank (copy) everything back to "mark a"
^Wp - move back to the previous window (ie. into quoteindex.txt)
p - put (place) the two yanked lines after the line the cursor is on, in effect appending them (that means we have the #quote line above the #qhead line, just as it is in the Techniques file)
^Wp - return to the previous window (back to Techniques.txt)
?#card^Myy - search backwards for "#card" and yank the line
^WpP - move the cursor back to the previous window (quoteindex.txt) and "P"ut the yanked #card line before the line the cursor is on (in other words, above the #quote line)
G - move the cursor to the last line of the file
^Wp'ajj - move the cursor back to the previous window (Techniques.txt) and move to "mark a" then down two lines (to get ready for the next /#quote search, which was repeated until no more #quote lines were found)

"@w" added the directory to the link (as QuoteIndex.html will be in the home directory); the directory name was changed between Sections
/#card^Mf"a../Techniques/^[
That is while editing quoteindex.txt,
/#card^M - search for next #card <Enter>
f" - find " (which puts the cursor on the double quote before the filename in the link)
a../Techniques/^[ - append into Insert mode (after the ") and insert ../Techniques/ <Esc> back to normal mode. The inserted text was changed between Sections.

For the quote we are looking at, we now have in quoteindex.txt:

#card @<a href="../Techniques/OtherApplications.html">Other Applications</a>
#quote Dr Kellogg says...
#qhead Hydrotherapy Departments

I did this for each quote in each of the Sections (Diseases, Effects, Problems and Techniques) separately, then "read" (appended) the contents of quoteindex.txt into QuoteIndex.html with the command:

:read quoteindex.txt

Vim Substitutions to Clean Up

Now that I didn't need the links to be unique words anymore I did a few substitutions, which are fairly obvious:
%s/H&C/Hot and Cold/ - substitutes "Hot and Cold" for "H&C" through the whole file
%s/ & / and /
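%s/_/ / - (where a line can hold more than one underscore, add the "g" flag, %s/_/ /g, as without it only the first match on each line is replaced)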
etc.
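
The same clean-up can be written as an ordered list of plain-text replacements. A small Python sketch, for illustration only; the list holds just the three substitutions shown above, and clean_up is a made-up name. The order matters: "H&C" has to be handled before the general " & " rule.

```python
# (old, new) pairs, applied in order.
CLEANUPS = [
    ("H&C", "Hot and Cold"),
    (" & ", " and "),
    ("_", " "),
]

def clean_up(text):
    """Apply each plain-text replacement to the whole text, in order."""
    for old, new in CLEANUPS:
        # str.replace changes every occurrence, not just the first on a line.
        text = text.replace(old, new)
    return text

cleaned = clean_up("H&C_Bath & Rest")
print(cleaned)
```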

Vim Find and Substitute to Finish the Links

This took longer than expected. The macros made short work of any Hyperhelper link that was spelt correctly, but if I had not capitalised correctly, or left out the underscore, or just plain misspelled it, then the macro didn't work.

Another problem is that PCWrite didn't put the usual line endings in the code and even though Vim knew it was a DOS file, it maintained the same line wrapping as PCWrite which meant that some links wrapped over what Vim thought was a line break.

So there were many "missing links". In order to get the last 10%, I used the "Sections" in Hyperhelper. A Hyperhelper stack could have sections beginning with the line "#section" and ending with "#ends".

The first card of the section was, by default, an index card for that section.

By searching through each file (/#section) I was able to jump from section to section, checking that all the links were correctly marked up. As the cards were nearby in the same file, it was fairly simple to run the macros for fixing these links.

The last and more labour-intensive method of finding broken links was to write down each one I found as I moved around the files. I would have done about 200 this way.

I used the substitute command on each errant link I found. Some had a dozen, or so, occurrences, some only one. The substitute command looked like:
:%s/^Ro/\=@l/^M

You saw this in the last post but it means:
%s - through the whole file, substitute for...
/^Ro/ - literally <Ctrl>-R o, meaning the contents of "register o" ("o"riginal name, put in the register with "oy2W or "oyf followed by the end letter)
\=@l/ - the contents of "register l" ("l"ink name, put in the register with "ly2f> on the link on the target file's #card line); this is the substitute text
^M - <Enter>

To fill the register in both cases I used "f". The syntax is [count]f{character}, which means: search forward to the [count]th occurrence of the character on the current line and stop on that character. "t" does a similar thing but the cursor stops just before the character specified.
In the case above, "ly2f> means that register l ("l) would consist of the text yanked from the cursor to the second occurrence of ">" on the line. In other words register l would end up containing:


<a href="VisceralCongestion.html">Visceral Congestion</a>

That would take ages to type but was very quick using this method. In any case it took about two months to finish all the links.

Vim Substitution using Registers

In the last post I showed you an example of the large macros I used to create page titles and URLs. The reason they were powerful was because Substitute (:s) was combined with Registers.

Having Vim use the card title to create the contents of the registers, then read them into the new page title, URL and link, all without input from me, meant that there was little possibility of typos and other errors. It was also very quick as I never had to enter anything manually.

The particular section of the macro involved is:

:%s/^Ro/\=@l/

This is the standard form of a substitution in Vim and lots of other places as well. What is interesting is what is being substituted. Rather than typing in the strings, both "from" and "to" strings are contents of registers.

To read in the original link name, "<Ctrl>-R o" (insert the contents of register o) is used. This command works on the command line as well as in Insert mode, and the register's contents are expanded on the command line as if they had been typed there. The "to" string (below) is handled differently.
To read in the new link name, \=@l is used. When the "to" string starts with "\=", Vim evaluates the rest of it as an expression instead of reading it as literal text, and "@l" is the expression for the contents of "register l".

I came across this way of doing things at StackOverflow and Vim Tips Blog. The difference in methods matters because the new link contains a slash (in </a>): read in with <Ctrl>-R, that slash would be taken as the delimiter ending the "to" string, whereas the result of a "\=" expression is inserted as it stands. It works, and quickly too: it took around 5 seconds to complete each card.
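
The same distinction exists in other regular-expression tools, which may make it easier to see. The Python sketch below is only an analogy, not Vim's mechanism: re.escape keeps special characters in the "from" text from being read as a pattern, and passing a function as the replacement keeps the "to" text from being interpreted, much as "\=" does. The name substitute_literal is hypothetical.

```python
import re

def substitute_literal(text, old, new):
    """Replace every occurrence of old with new, treating both as plain text."""
    # re.escape: characters such as "." in old match only themselves.
    # Function replacement: backslashes in new are not read as group references.
    return re.sub(re.escape(old), lambda match: new, text)

link = '<a href="VisceralCongestion.html">Visceral Congestion</a>'
line = substitute_literal("see Visceral_Congestion", "Visceral_Congestion", link)
print(line)
```

For plain text like this, Python's str.replace would do the same job; the regex form is shown only because it mirrors the two halves of the Vim command.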

More on this substitution next time...

HTML Links with Vim Keyboard Macros and Registers

Inserting links turned out to be the most laborious part of the transition; it took about two months of my spare time.

You will notice the first three lines of the card above:
#card @<a href="VisceralCongestion.html">Visceral Congestion</a>
Visceral Congestion
VisceralCongestion.html

The first line has the link to what will become the "Visceral Congestion" page, the second line will be the page's title and the third line is the page's file name.

To achieve this and to search through the rest of the text files and replace the Hyperhelper link with new HTML links I used the "@c" macro after:
  • combining the other three, non-working txt files into a single file entitled htech.txt
  • searching for the next card with "/#card" so the cursor was on the line
  • visually checking what sort of card title it was, there were six different sorts of titles.
Titles with:
  1. just spaces
  2. underscores
  3. simple one word title, probably no doubles
  4. simple one-word titles but probably with doubles (eg there is a card called "Congestion" which would affect the card "Visceral Congestion" on a simple "find and replace")
  5. underscores and spaces
  6. dashes
Visceral_Congestion was the second type; each type had its own macro. "@a" - to change all occurrences of "Visceral_Congestion" to a link to the "Visceral Congestion" page.
f@l"oyW:put^M:s/_/ /g^M0"ty$:put^M:s/ //g^MA.html^[0"fy$I<a href="~@@7">^Rt</a>^[0"ly$kma:%s/^Ro/\=@l/^M'ao^Rf^[j0f"a../Problems/^[0"ld$:sp htech.txt^M:%s/^Ro/\=@l/^M:wq^M:w^M/#card ^M

This means:
f@ - "find @", the cursor moves to the @
l - the cursor moves right one character (so it ends up on the start of the title, in this case the "V")
"oyW - yank (copy) to register "o" (o for "old") the Word (that is, up to the next space; in effect, all of Visceral_Congestion)
:put - open a line below and insert the contents of the "yank"
^M - <Enter>
:s/_/ /g - substitute a space for every underscore on this line (becomes Visceral Congestion)
0 - move cursor to first character on the line
"ty$ - yank to register "t" (t for "title") from the cursor to the last character on the line
:put - open a line below and insert the contents of the "yank"
:s/ //g - substitute "no character" for every space on this line (in effect: delete the spaces to become VisceralCongestion)
A.html - start insert mode after the last character of the line and insert .html
^[ - escape back to normal mode
0"fy$ - move to first character of the line and yank to end of the line to register "f" (f for filename)
I<a href=" - start Insert mode before the first character of the line and insert "<a href="" (to start the link)
~@@7"> - <End> and insert ">
^Rt</a>^[ - <Ctrl>-R to insert "register t" (Visceral Congestion) followed by </a> and <Escape> to normal mode
0"ly$ - move cursor to beginning of the line and yank to "register l" (l for link) everything to end of the line
kma - move cursor up one line and set mark "a" (this is the title line, Visceral Congestion; the filename line is added below it later)
:%s/^Ro/\=@l/^M - over the whole file substitute <Ctrl>-R o (that is the contents of "register o", in this case Visceral_Congestion) with the contents of @l ie "register l", (in this case <a href="VisceralCongestion.html">Visceral Congestion</a>) followed by <Enter>
'a - move the cursor to "mark a" (ie the title line) now that the substitution is complete
o^Rf - open a line below in Insert mode and <Ctrl>-R f to insert the contents of "register f" (VisceralCongestion.html)
^[j0 - <Esc> to normal mode, move cursor down a line and to the beginning of the line (this is the line containing the link)
f" - "find" (move the cursor to) the first occurrence of " on the line (the cursor is now just before the filename in the link)
a../Problems/ - "append" (start insert mode after the cursor) and insert "../Problems/", thus turning the link into <a href="../Problems/VisceralCongestion.html">Visceral Congestion</a> ("../" points up to the parent directory, so "../Problems/" reaches a fellow subdirectory)
^[0"ld$ - escape to normal mode, move to beginning of the line, delete to end of the line to "l (register l). This deletes the contents of the line from beginning to end but the empty line remains there, as you can see in the file.
:sp htech.txt^M - split the screen and open htech.txt in the new window, with the cursor in the window - (this is the file containing, in this case, Diseases, Effects and Techniques - all of which will be put in other subdirectories and need links that point to the ../Problems directory) then <Enter>
:%s/^Ro/\=@l/^M - over the whole file substitute <Ctrl>-R o (the contents of "register o" or in this case Visceral_Congestion) with @l (the contents of register l, ie <a href="../Problems/VisceralCongestion.html">Visceral Congestion</a>) and <Enter>
:wq^M - write (save) and quit htech.txt and <Enter> (this leaves the cursor back in Problems.txt)
:w^M - write Problems.txt
/#card ^M - find the next occurrence of "#card " (ready for the next card macro to be run)

This is a powerful and quick macro (5 seconds per card). It not only creates the title and filename (URL) of the new webpage and puts them in a place where I can call them later, it changes all the links through the whole of "Traditional Hydrotherapy" so they will point to the new URL.
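
Stripped of the keystrokes, the macro derives three strings from the old card name. A minimal Python sketch of that derivation for the underscore type of title; card_names and its directory parameter are hypothetical names used for illustration.

```python
def card_names(old_name, directory=None):
    """Derive (title, filename, link) from an underscore-style card name.

    If directory is given, the link points into ../<directory>/, as needed
    by files that will live in a different subdirectory.
    """
    title = old_name.replace("_", " ")           # like :s/_/ /g
    filename = title.replace(" ", "") + ".html"  # like :s/ //g then A.html
    href = filename if directory is None else "../%s/%s" % (directory, filename)
    link = '<a href="%s">%s</a>' % (href, title)
    return title, filename, link

title, filename, link = card_names("Visceral_Congestion")
print(title)
print(filename)
print(link)
print(card_names("Visceral_Congestion", "Problems")[2])
```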

Monday, February 8, 2010

Vim Keyboard Macros and Split Windows

To produce the card I showed you in the last post, which combines similarly named cards from all the six sections of the original program (Hydro, John, Harvey, George, Quote and Glossary), I used the keyboard macro capabilities of Vim.

So I would:
  • be working with a split screen (:sp), one window open on hprob.txt and the other on jprob.txt (the "John" problem file)
  • search for the card (/@Visceral_Congestion) in jprob.txt and if I found it press <Enter> so the cursor was on the #card line in jprob.txt
  • <Ctrl>-W p which takes me back to hprob.txt (my active text file)
  • put the cursor on the #card @Visceral_Congestion line in hprob.txt
  • call the following macro with "@a"

ofrom from Dr JH Kellogg's Hydriatic Techniques...^[k$^Wpofrom from Dr JH Kellogg's Prescriptions...^[k0ma/#end^Mk$d'a^Wppdd

This is:
o - opens a line below into Insert mode and inserts "from from Dr JH Kellogg's Hydriatic Techniques..." (in practice Vim puts a hash (#) in front of the first "from", making "#from"; apparently the "o" command carries over an unusual line start such as the "#" of the "#card" line above, probably because Vim treats it as a comment leader). This is great for searching later, as we will see. It correctly marks the source of what is already on the card
^[ - is <Esc> and takes us back to Normal mode
k - moves the cursor up a line
$ - moves the cursor to the end of the line ready for the jprob.txt card's contents to be prepended
^Wp - is <Ctrl>-W p which moves the cursor back to the Previous window (jprob.txt) and places the cursor where it was (on the "#card" line)
ofrom from Dr JH Kellogg's Prescriptions... - the "o" opens the line below into Insert mode and inserts "#from from Dr JH Kellogg's Prescriptions..."
^[ - is <Esc> and takes us back to Normal mode
k - takes the cursor up a line
0 - takes the cursor to the beginning of the line
ma - sets the "a" mark on the line
/#end - searches for the next "#end" (end of card)
^M - is <Enter> so the cursor stops on the "#end"
k - moves the cursor up a line
$ - takes the cursor to the end of the line
d'a - is delete to mark "a" (in effect the contents of the whole card)
^Wp - takes the cursor back to the previous window (hprob.txt)
p - puts (pastes) the contents of the delete
dd - deletes the line where the cursor is, as it's empty

I would do this for each section (this one is from "John"); I just changed the "from" areas of the macro before I did each new section.

The result of all this is that now all the information on Visceral Congestion is on just one card. (Which is what you saw in my last post.)

It took just over a month of working in my spare time to combine all the cards from the original six parts into just four sections:

  1. Diseases.txt
  2. Effects.txt
  3. Problems.txt
  4. Techniques.txt

These will eventually end up as four subdirectories in the final Traditional Hydrotherapy web site.