diff --git a/changelog b/changelog
index 5d04a1e..d2a5e23 100644
--- a/changelog
+++ b/changelog
@@ -1,3 +1,6 @@
+20111118 wxh src/axiom-website/patches.html 20111118.01.tpd.patch
+20111118 tpd src/axiom-website/litprog.html added
+20111118 tpd src/axiom-website/documentation.html add litprog.html
 20111117 wxh src/axiom-website/patches.html 20111117.01.tpd.patch
 20111117 tpd src/interp/c-doc.lisp treeshake compiler
 20111117 tpd books/bookvol9 treeshake compiler
diff --git a/src/axiom-website/documentation.html b/src/axiom-website/documentation.html
index f6a5831..c9cbd91 100644
--- a/src/axiom-website/documentation.html
+++ b/src/axiom-website/documentation.html
@@ -123,6 +123,8 @@
 I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be
diff --git a/src/axiom-website/litprog.html b/src/axiom-website/litprog.html
new file mode 100644
index 0000000..04b3383
--- /dev/null
+++ b/src/axiom-website/litprog.html
@@ -0,0 +1,966 @@
+Example of Literate Programming in HTML
+
+1.0 Literate Programming - A Gentle Introduction
+1.1 What is the problem?
+1.2 Is there a better way?
+1.3 Literate Programming -- The basic idea
+1.3.1 Literate Programming is NOT documentation
+1.3.2 Literate Programming is a change in mindset
+1.3.3 Literate Programming reduces bugs
+1.4 Understanding the literate programming tangle tool
+1.4.1 Motivation, or getchunk
+1.4.2 All of the literate programming tools
+2.0 The Tangle Program
+2.1 Understanding the story of the program
+2.2 The main task
+2.3 A complication with HTML
+2.4 Finding the chunk
+2.5 Finding the next line in the buffer
+2.6 At last, finding the chunk
+2.7 Printing the chunk we found
+2.8 The simple case, print the line
+2.9 Checking chunks for getchunk tags
+2.10 Getting a new chunk name from the getchunk tag
+2.11 Finding the end of the chunk we are printing
+2.12 Final cleanup and housekeeping. The preamble
+3.0 The tangle source code
+4.0 What have we learned?
+
diff --git a/src/axiom-website/patches.html b/src/axiom-website/patches.html
index fef254b..91bde71 100644
--- a/src/axiom-website/patches.html
+++ b/src/axiom-website/patches.html
@@ -3686,5 +3686,7 @@
 books/bookvol9 treeshake compiler
+1.0 Literate Programming - A Gentle Introduction
+1.1 What is the problem?
++In the 1970s I worked on a PDP11/40, which is the same class of +machine that was used to develop UNIX. My PDP had 8192 bytes of +memory. The editor used 4096 bytes and left the other 4096 bytes for +an edit buffer. +
+This meant that the largest file I could create at any time was 4096 +bytes. As a result, C programs were limited in size. To get around +this problem we used include files, libraries of code, +linkers, and overlay loaders. Segment registers exist on the x86 +processors to help overlay loaders. +
+People program today the same way I programmed in the 1970s. We write +programs that are made of tiny pieces of sand in hundreds of little +files spread across dozens of directories. In order to find anything +we use grep to search files, or we create tools to pre-search +the files such as Eclipse and IntelliJ. +
+Imagine taking the same approach to calculus. Create a directory for +each chapter. Take each equation in the chapter and put it in a file. +Do this for every chapter. Throw away the textbook. Now try to learn +calculus. +
+That's the way we program today, just like I did in the 1970s. +
+We program to communicate to the machine, not to communicate to people.
+ +
+That's the problem. +1.2 Is there a better way?
+Consider the best possible world. You've been hired at a company and +join a team that is already working on a program. They hand you a book and +tell you to go home and read it over the next two weeks. At the end of +the two weeks you can work on the program as effectively as anyone on the +team. The team has successfully communicated from one human to another.
+What is in the book? +
+Remember our calculus textbook? It started from ideas like limits +and gradually developed the ideas until they could be expressed in +equations. By the time you got to the equations you already understood +the concepts. You could look at the equations and see why they matched +the text. It is the why that is the important part. It is the +part that our programs are missing.
+The book you took home uses the same method. You started with the problem +in chapter 1. Chapter 2 expresses the ideas needed to solve the problem. +The next few chapters expand on each idea, gradually becoming more specific +until the idea is reduced to code. By the time you get to the code it +should be perfectly clear what the code should look like. Any part of +the code you don't understand means that the book needs some additional +words. +
+1.3 Literate Programming -- The basic idea
++It is rather pointless to have code in a book, right? Someone could +change the code and then the book would be out of date. +
+The way to solve this problem is to extract the code directly from the +book. That way, every time you change the code in the book you change +the actual program. +
+How hard is it to write an extraction program that can find the code in +the book and extract it to some files? Well, let's suppose we name each +section of code and call it a chunk.
+We need a program that lets us find a chunk in a document and extract it. +The traditional name for such a program is tangle. The tangle +program takes two arguments, the name of the book and the name of the +chunk: +
+ tangle mybookname chunkname +++With this simple tool you can now write books that contain the actual +source code. +
+1.3.1 Literate Programming is NOT documentation
++The first reaction is that this is a new, painful form of documentation. +It is not. Documentation is a how construct. It is a way of +explaining how a program works or how an application interface should +be used. +
+Literate programming is about why. It explains the ideas. It is +communication from one person to another. +
+The why becomes important when you have to maintain and modify +a program. You can perfectly understand how a subroutine, +module, or class works. You can explain its inputs and outputs. What +you don't understand is why it exists. Nobody writes that +down. So when you join a group and are given a directory +containing thousands of pieces of code, you have no idea why the +program is structured the way it is.
+1.3.2 Literate Programming is a change in mindset +
+In order to do literate programming you have to change the way you +think. You have to write English text to communicate ideas. You have +to move from ideas to implementation in a way that makes the story +clear.
+If programmers were writing a James Bond novel they would create a +file for JamesBond, MissMoneyPenny, Q, and TheBadGuys. They would +create a directory for Scenes with subdirectories for Scene1, Scene2, +and so on. Then they would box up the whole tree, send it to the +audience, and tell them +
+ When this program is run, the bad guy dies +++We live by stories. We tell stories and we remember stories. +Characters do things and the author tries to convey their motivations +(the why). Scenes, flashy cars, trick watches, and other items +are added to solve local problems in the story. The whole story moves +from the beginning to the end. You can remember the story and you can +find places where you think it is weak or cheesy or bad. +
+Subroutines, modules, or classes are like characters in a story. They +need motivation. They need the why. The factory classes, the +singletons, the garbage collection routines, and other "items" exist +to solve local problems in the story of the program. +
+Literate programming is about writing the story of your program. What +are you trying to solve? How are you solving it? Why did you introduce +this piece of code? Is it part of the main story or is it an item that helps +solve a local problem?
+1.3.3 Literate Programming reduces bugs +
+We are all conditioned to follow stories. We do it all the time. +If I present you with a piece of code without the story then you +have to struggle to understand it. Take, for example, this: +
++#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> + +/* forward reference for the C compiler */ +int getchunk(char *chunkname); + +char *chunkbegin = "<pre id=\""; int chunkbeginlen = 9; +char *chunkend = "</pre>"; int chunkendlen = 6; +char *chunkget = "<getchunk id=\""; int chunkgetlen = 14; + +/* a memory mapped buffer copy of the file */ +char *buffer; +int bufsize; + +/* return the length of the next line */ +int nextline(int i) { + int j; + if (i >= bufsize) return(-1); + for (j=0; ((i+j < bufsize) && (buffer[i+j] != '\n')); j++); + return(j); +} + +/* output the line we need */ +int printline(int i, int length) { + int j; + for (j=0; j<length; j++) { putchar(buffer[i+j]); } + printf("\n"); +} + +/* handle <pre id="chunkname"> */ +/* is this chunk name we are looking for? */ +int foundchunk(int i, char *chunkname) { + if ((strncmp(&buffer[i],chunkbegin,chunkbeginlen) == 0) && + (strncmp(&buffer[i+9],chunkname,strlen(chunkname)) == 0) && + (buffer[i+chunkbeginlen+strlen(chunkname)] == '"') && + (buffer[i+chunkbeginlen+strlen(chunkname)+1] == '>')) return(1); + return(0); +} + +/* handle </pre> */ +/* is it really an end? */ +int foundEnd(int i) { + if (strncmp(&buffer[i],chunkend,chunkendlen) == 0) { + return(1); + } + return(0); +} + +/* handle <getchunk id="chunkname"/> */ +/* is this line a getchunk? 
*/ +int foundGetchunk(int i, int linelen) { + int len; + if (strncmp(&buffer[i],chunkget,chunkgetlen) == 0) { + for(len=1; ((len < linelen) && (buffer[i+chunkgetlen+len] != '\"')); len++); + return(len); + } + return(0); +} + +/* Somebody did a getchunk and we need a copy of the name */ +/* malloc string storage for a copy of the getchunk name */ +char *getChunkname(int k, int getlen) { + char *result = (char *)malloc(getlen+1); + strncpy(result,&buffer[k+chunkgetlen],getlen); + result[getlen]='\0'; + return(result); +} + +/* print lines in this chunk, possibly recursing into getchunk */ +int printchunk(int i, int chunklinelen, char *chunkname) { + int j; + int k; + int linelen; + char *getname; + int getlen = 0; + for (k=i+chunklinelen+1; ((linelen=nextline(k)) != -1); ) { + if ((getlen=foundGetchunk(k,linelen)) > 0) { + getname = getChunkname(k,getlen); + getchunk(getname); + free(getname); + k=k+getlen+17; + } else { + if ((linelen >= chunkendlen) && (foundEnd(k) == 1)) { + return(k+chunkbeginlen); + } else { + printline(k,linelen); + k=k+linelen+1; + } + }} + return(k); +} + +/* find the named chunk and call printchunk on it */ +int getchunk(char *chunkname) { + int i; + int j; + int linelen; + int chunklen = strlen(chunkname); + for (i=0; ((linelen=nextline(i)) != -1); ) { + if ((linelen >= chunklen+11) && (foundchunk(i,chunkname) == 1)) { + i=printchunk(i,linelen,chunkname); + } else { + i=i+linelen+1; + } + } + return(i); +} + +void fixHTMLcode() { + int point = 0; + int mark = 0; + int i=0; + for(point = 0; point < bufsize;) { + if ((buffer[point] == '&') && + (strncmp(&buffer[point+1],"lt;",3) == 0)) { + buffer[mark++] = 60; + point = point + 4; + } else + if ((buffer[point] == '&') && + (strncmp(&buffer[point+1],"gt;",3) == 0)) { + buffer[mark++] = 62; + point = point + 4; + } else + buffer[mark++] = buffer[point++]; + } + bufsize = mark; +} + +/* memory map the input file into the global buffer and get the chunk */ +int main(int argc, char *argv[]) { + 
 int fd; + struct stat filestat; + if (argc != 3) { + fprintf(stderr, "Usage: tangle filename chunkname\n"); + exit(-1); + } + fd = open(argv[1], O_RDONLY); + if (fd == -1) { + perror("Error opening file for reading"); + exit(-2); + } + if (fstat(fd,&filestat) < 0) { + perror("Error getting input file size"); + exit(-3); + } + bufsize = (int)filestat.st_size; + buffer = (char *)malloc(bufsize); + if ((buffer == NULL) || (read(fd,buffer,bufsize) != bufsize)) { + perror("Error reading input file"); + exit(-4); + } + fixHTMLcode(); + getchunk(argv[2]); + close(fd); + return(0); +} + +++Odds are good you just skipped over it, right? It is pretty much just a +long pile of noise. It really doesn't matter if you are a C programmer +or not. It is just noise.
+1.4 Understanding the literate programming tangle tool
+We write literate programs to communicate to both people and machines. +Since they are combined in a single document we need a markup language +to distinguish human text from machine text as well as a markup language +to write readable English.
+Literate programs don't care what you use as a documentation markup +language. I prefer latex but for this example we'll use HTML. +
+HTML has a set of markup tags. One of the markup tags is the pre +tag. Anything between the beginning of a <pre> tag and the +trailing </pre> tag is quoted and +printed exactly as written. +
+The pre tag also has an id field which lets us put a +name on the tag. So we can have many named pre tags in a file. +
+The tangle program reads the HTML file and extracts the text in a +pre chunk. That's almost all of the magic. +
+1.4.1 Motivation, or getchunk
++Remember, though, that we wanted to motivate ideas in a program. +Motivations get introduced in an order that humans can understand. +So ideas have to be presented in a particular order for humans and +a different order for machines. We need a way to reorganize the +program into pieces. The trick is getchunk. +
+We introduce a special tag called getchunk. It only occurs +within a pre tag block. The getchunk tag has an +id field which names another chunk. When we find a +getchunk tag, we read the id name and replace the +getchunk tag with the named chunk. +
+This getchunk tag allows us to "clip out" a piece of a +program from the middle of a subroutine, put it into its own chunk, +and wrap some text around it. +
+This is useful if you have some tricky loop in the middle of a large +subroutine that you need to explain. You can clip the loop out, put it +in a pre tag of its own and replace it with a getchunk. +
+1.4.2 All of the literate programming tools
++So we have a single program, called tangle. It works with three +HTML tags, the <pre> tag, the </pre> + and the <getchunk> tag. They +look like: +
+ <pre id="main.c"> + </pre> + <getchunk id="main.c"> +++That's all there is. +
+Now you can embed your subroutines in an HTML file, surround them with +pre tags, replace nasty sections with getchunk tags +so you can explain them separately, and extract the whole +set of subroutines using the tangle program.
+The big loop of program development is now: +
+ do forever { + edit the html file + tangle thehtmlfile sub1.c >sub1.c + tangle thehtmlfile sub2.c >sub2.c + tangle thehtmlfile main.c >main.c + gcc -o myfile sub1.c sub2.c main.c + ./myfile + } +++When you are finished for the day, the week, or the project your +whole program is properly explained as well as properly maintained. +
+If you did it right you can just point someone at the HTML file, +let them go away for a while, and when they return they understand +enough about your program to take over the whole project. +
+But wait, you say... you skipped over the hard part. +
+Indeed, I did. The hard part is that you need to communicate your +ideas to another human. And you have to do it in a way that reduces +your idea to practice. In fact, you need to clearly reduce your +ideas to practice. +
+2.0 The Tangle Program
+2.1 Understanding the story of the program
++We will try to illustrate literate programming using the tangle +program as an example. However, if you want to see a real +literate program I strongly recommend the book: +
+ Lisp in Small Pieces by Christian Queinnec (ISBN 0-521-54566-8) +++We would like to write programs embedded in documents which surround +the program text with explanations. Since the compiler cannot +understand the markup language we need a program, which we will call +tangle, that will extract code from a document in a given markup +language.
+The tangle program is specific to the markup language used, whether it is +latex, HTML, or some other language. For this example we are assuming +that the markup is HTML. +
+The markup language has to have some sort of verbatim quoting +mechanism which allows us to put code inline. It also has to have a way +to name the quoted pieces of code. We call the quoted pieces of code a +chunk. We call the label the chunkname. +
+HTML uses the pre tag as its quoting mechanism. Items between +the opening pre tag and the closing /pre tag are +unchanged by HTML processors. The pre tag has an id +parameter. We will use the id parameter as a chunk name. +
+The tangle program takes two arguments, the name of the document and +the name of the chunk. It must print the chunk on standard output. +
+2.2 The main task
++The main thing we have to do is find a chunk somewhere in our book. +We specified the book and the chunk name on the command line so we +ought to check that we have the right inputs. +
+There can only be two inputs and they are both required so first +we check that condition. If there are not two then the user might not +understand what was required so we print the usual usage message.
+We have huge amounts of memory so we will simply load the file in a +single chunk using the read function. This +requires that we can open the file, which we check. It requires a +handle to the file, which we get from the open call. It requires the +size of the file so it can allocate memory, so we use fstat. +
+Now that we have all of the read parameters we can load the +file into memory. The buffer variable is described as a pointer to +a character which is C-speak for a string. +
+We also need to know the size of the buffer at all times so we +allocate the bufsize variable at global scope. +
++ +/* a memory mapped buffer copy of the file */ +char *buffer; +int bufsize; + +++Since the buffer is being used everywhere we make it global by +defining it at the top of the program, outside the scope of the +functions. +
++/* memory map the input file into the global buffer and get the chunk */ +int main(int argc, char *argv[]) { + int fd; + struct stat filestat; + if (argc != 3) { + fprintf(stderr, "Usage: tangle filename chunkname\n"); + exit(-1); + } + fd = open(argv[1], O_RDONLY); + if (fd == -1) { + perror("Error opening file for reading"); + exit(-2); + } + if (fstat(fd,&filestat) < 0) { + perror("Error getting input file size"); + exit(-3); + } + bufsize = (int)filestat.st_size; + buffer = (char *)malloc(bufsize); + if ((buffer == NULL) || (read(fd,buffer,bufsize) != bufsize)) { + perror("Error reading input file"); + exit(-4); + } + fixHTMLcode(); + getchunk(argv[2]); + close(fd); + return(0); +} + +++If the read succeeds we move on to the problem of finding +the named chunk in the book.
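+A more defensive variant of the loading step might look like the sketch
+below. The helper name load_file is hypothetical and does not appear in
+tangle.c. It assumes a POSIX system, keeps reading until the whole file
+has arrived (read may return fewer bytes than requested), and hands
+ownership of the buffer to the caller.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original tangle.c.
   Reads the whole file into a malloc'ed buffer and stores its size
   in *size. Returns NULL on any failure; the caller must free() the
   result. */
char *load_file(const char *path, size_t *size) {
  struct stat filestat;
  char *buf;
  size_t total = 0;
  int fd = open(path, O_RDONLY);
  if (fd == -1) return NULL;
  if (fstat(fd, &filestat) < 0) { close(fd); return NULL; }
  *size = (size_t)filestat.st_size;
  buf = malloc(*size + 1);        /* +1 leaves room for a trailing NUL */
  if (buf == NULL) { close(fd); return NULL; }
  while (total < *size) {         /* read() may deliver the file in pieces */
    ssize_t n = read(fd, buf + total, *size - total);
    if (n <= 0) { free(buf); close(fd); return NULL; }
    total += (size_t)n;
  }
  buf[total] = '\0';
  close(fd);
  return buf;
}
```

+Returning the size through a pointer keeps the helper free of globals;
+tangle itself stores the same two values in buffer and bufsize.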
+2.3 A complication with HTML
++HTML is not clever about ignoring HTML symbols in <pre> +blocks of code. This causes us three complications which do not normally +occur with a tangle program. +
+The first complication can be seen by viewing the source for this page. +We need to replace the angle brackets with the HTML escape codes +everywhere, including in the verbatim sections. So we have to be careful +using certain language constructs. Reasonable markup languages such +as latex do not have this problem. +
+The second complication is that the string search for the pre, +/pre and getchunk tags needs to use the escape code +sequence for matching.
+The third complication is that the tangle program has to reverse +this translation when printing the program. +
+There is no such thing as a simple job. +
+The trick here will be to walk the buffer and replace every use of +the HTML code with the corresponding character, appropriately adjusting +the space. Since HTML codes are always longer than the character they +replace we can just compress the buffer a bit. And since the buffer is +never used for anything else we can even replace the HTML codes in parts +of the buffer where it might be inappropriate. +
+The point variable is where we are looking in the buffer. +The mark variable is where we are writing in the buffer. +Normally we just walk them in parallel. They diverge when they hit an +HTML code. The point is copied to the mark. In the +absence of any codes this is a copy of the buffer to itself. +
+When we hit a code we place the corresponding character at the mark, +bump the point over the HTML code, and continue the copy. +
+When we hit the end of the buffer we rewrite the bufsize variable +to reflect the new, shorter buffer size. +
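+The point and mark walk can be tried on its own with a sketch like the
+following. The function name unescape_angle_brackets is hypothetical; it
+takes the buffer and length as parameters instead of using globals, and
+it adds a bounds check before looking past an ampersand, which is an
+assumption about what a careful version would want.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone variant of the compaction step.
   Rewrites buf in place and returns the new, shorter length. */
size_t unescape_angle_brackets(char *buf, size_t len) {
  size_t point = 0;   /* where we are reading */
  size_t mark = 0;    /* where we are writing; never ahead of point */
  while (point < len) {
    /* the test is split into '&' plus the three-character tail so this
       source text never contains a complete escape sequence */
    if (buf[point] == '&' && len - point >= 4 &&
        strncmp(&buf[point + 1], "lt;", 3) == 0) {
      buf[mark++] = '<';
      point += 4;                     /* skip the four-character code */
    } else if (buf[point] == '&' && len - point >= 4 &&
               strncmp(&buf[point + 1], "gt;", 3) == 0) {
      buf[mark++] = '>';
      point += 4;
    } else {
      buf[mark++] = buf[point++];     /* plain copy of the buffer to itself */
    }
  }
  return mark;
}
```

+Because every code is longer than the character that replaces it, the
+write position can never overtake the read position, so no second
+buffer is needed.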
+There is another really subtle point here. Note that we have to do the +comparison in two pieces; otherwise the combined string is a valid HTML +code and we end up destroying our own search string. If we break it +into two parts then the substring "lt;" is not a valid HTML code and +will not be replaced. This is the kind of information that +literate programming preserves. A clever programmer would combine the +two tests into a single strncmp and everything would fail.
+void fixHTMLcode() { + int point = 0; + int mark = 0; + int i=0; + for(point = 0; point < bufsize;) { + if ((buffer[point] == '&') && + (strncmp(&buffer[point+1],"lt;",3) == 0)) { + buffer[mark++] = 60; + point = point + 4; + } else + if ((buffer[point] == '&') && + (strncmp(&buffer[point+1],"gt;",3) == 0)) { + buffer[mark++] = 62; + point = point + 4; + } else + buffer[mark++] = buffer[point++]; + } + bufsize = mark; +} + ++2.4 Finding the chunk
++We are given the chunkname as a string and we have to search the +file for the
<pre id="name"> tag. ++Since the string came from the command line we know that it ends +with a null character so we can just call strlen to get +the length. +
+By design, we require the <pre> tag to start in the first +character of the line. We could remove this restriction if we +felt like being clever but code is usually left-aligned in the +output so left-aligning the tag is reasonable. +
+Given this design decision we can walk the book one line at a time. +We delegate the task of finding the line length linelen to +the nextline function. The "magic number" 11 exists because +the smallest pre tag contains 11 characters: +
+<pre id=""> +++If the line is long enough to contain the <pre> tag and its +associated id tag and we find the named chunk then we +print it. +
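+The arithmetic behind the magic number can be checked with a small
+sketch. The names below are hypothetical and are not part of tangle.c;
+the function simply adds the fixed text of the tag to the length of the
+chunk name.

```c
#include <stddef.h>
#include <string.h>

/* The fixed text that surrounds the chunk name in an opening tag. */
static const char pre_open[]  = "<pre id=\"";  /* 9 characters */
static const char tag_close[] = "\">";         /* 2 characters */

/* Minimum length of a line that opens the chunk called chunkname. */
size_t min_chunk_line_length(const char *chunkname) {
  return strlen(pre_open) + strlen(chunkname) + strlen(tag_close);
}
```

+With an empty name this gives 9 + 0 + 2 = 11, the constant used in the
+line-length test.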
+Another design decision is that we will allow multiple chunks with +the same name and we will print them in order. This allows us to +break a big block into smaller blocks and insert text in the middle. +So you can write +
++ <pre id="somename"> + the first part of a function + </pre> + some explanation + <pre id="somename"> + the next part of the function + </pre> + more explanation ++When you run tangle filename somename you will see the output: ++ the first part of a function + the next part of the function +++This is trivial to implement. All we have to do is keep searching the +document and printing the same named chunk. +
+ ++/* find the named chunk and call printchunk on it */ +int getchunk(char *chunkname) { + int i; + int j; + int linelen; + int chunklen = strlen(chunkname); + for (i=0; ((linelen=nextline(i)) != -1); ) { + if ((linelen >= chunklen+11) && (foundchunk(i,chunkname) == 1)) { + i=printchunk(i,linelen,chunkname); + } else { + i=i+linelen+1; + } + } + return(i); +} + ++2.5 Finding the next line in the buffer
+The nextline function just walks the buffer starting at any +character position and returns the number of characters before the next +newline, that is, the length of the line. If the starting position is +already at or past the end of the buffer it returns -1. Otherwise we +only have to check that we don't run off the end of the buffer while +we count.
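+To see the return values concretely, here is a stand-alone sketch of
+the same scan. The names line_length and count_lines are hypothetical,
+and the buffer and its size are passed as parameters rather than read
+from globals.

```c
#include <stddef.h>

/* Returns the length of the line starting at i, not counting the
   newline, or -1 when i is at or past the end of the buffer. */
int line_length(const char *buf, int size, int i) {
  int j;
  if (i >= size) return -1;
  for (j = 0; (i + j < size) && (buf[i + j] != '\n'); j++)
    ;
  return j;
}

/* Counts lines by stepping i = i + length + 1, the same stepping
   pattern the chunk search uses to move from line to line. */
int count_lines(const char *buf, int size) {
  int i = 0;
  int len;
  int lines = 0;
  while ((len = line_length(buf, size, i)) != -1) {
    lines++;
    i = i + len + 1;   /* skip the line and its newline */
  }
  return lines;
}
```

+For the seven-character buffer "ab\n\ncde" the lengths are 2, 0, and 3,
+and the scan stops with -1 once the position reaches 7.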
++/* return the length of the next line */ +int nextline(int i) { + int j; + if (i >= bufsize) return(-1); + for (j=0; ((i+j < bufsize) && (buffer[i+j] != '\n')); j++); + return(j); +} + ++ +2.6 At last, finding the chunk
++We are positioned at the beginning of a line so we are looking at +something that looks like: +
+<pre id="somechunkname"> +++We need to check the format of the line. The first 9 characters must be: +
++<pre id=" +++The next substring must be the chunkname we seek and the rest of +line must be the +
+"> +++So if we find a properly formed line with the chunk name then we +return a 1 indicating success, otherwise we return a 0 indicating failure. +
+In a fit of hasty generality we allocate the chunkbegin +information at the global scope. In theory we would like to be able to +just change the patterns for different markup languages. +
+char *chunkbegin = "<pre id=\""; int chunkbeginlen = 9; +++/* handle <pre id="chunkname"> */ +/* is this chunk name we are looking for? */ +int foundchunk(int i, char *chunkname) { + if ((strncmp(&buffer[i],chunkbegin,chunkbeginlen) == 0) && + (strncmp(&buffer[i+9],chunkname,strlen(chunkname)) == 0) && + (buffer[i+chunkbeginlen+strlen(chunkname)] == '"') && + (buffer[i+chunkbeginlen+strlen(chunkname)+1] == '>')) return(1); + return(0); +} + ++ +2.7 Printing the chunk we found
++So we found our chunk and all we have to do is print it. +We just print each line until we find the </pre> tag, +a task we delegate to the foundEnd function. +
+Of course, there is no such thing as a simple job and, indeed, +there is a complication here. We made a design decision +that we will allow a getchunk tag to have special meaning +within the pre tag. +
+We delegate getting the id from the getchunk tag +to the getChunkname function which we will examine next. +
+Recall that getchunk will search the document for the +pre tag named by the getchunk's id label. +This means that we need to +
+ 1) stop printing the current chunk + 2) find the chunk named by getchunk and print it + 3) continue printing the original chunk where we left off +++We already have a function to do step 2, the +getchunk function mentioned above. +We can just call that function which handles both finding and +printing a chunk. +
+The magic number 17 occurs because we need to account for all of +the characters in the line which are not part of the id +string in the getchunk. So an empty string, including +the trailing newline is: ++<getchunk id="">\n a total of 17 characters +++/* print lines in this chunk, possibly recursing into getchunk */ +int printchunk(int i, int chunklinelen, char *chunkname) { + int j; + int k; + int linelen; + char *getname; + int getlen = 0; + for (k=i+chunklinelen+1; ((linelen=nextline(k)) != -1); ) { + if ((getlen=foundGetchunk(k,linelen)) > 0) { + getname = getChunkname(k,getlen); + getchunk(getname); + free(getname); + k=k+getlen+17; + } else { + if ((linelen >= chunkendlen) && (foundEnd(k) == 1)) { + return(k+chunkbeginlen); + } else { + printline(k,linelen); + k=k+linelen+1; + } + }} + return(k); +} + ++ +2.8 The simple case, print the line
+In the simple case we simply output the line. ++/* output the line we need */ +int printline(int i, int length) { + int j; + for (j=0; j<length; j++) { putchar(buffer[i+j]); } + printf("\n"); +} + ++2.9 Checking chunks for getchunk tags
++Because we treat the getchunk tag as a special case when we +are printing we need a predicate to look for it. We are positioned at +the beginning of a line when we are called so we only need to check +for the string we call chunkget, that is: +
+char *chunkget = "<getchunk id=\""; int chunkgetlen = 14; ++ +This predicate returns the length of the chunkname or 0 if it is not +a getchunk tag. ++/* handle <getchunk id="chunkname"/> */ +/* is this line a getchunk? */ +int foundGetchunk(int i, int linelen) { + int len; + if (strncmp(&buffer[i],chunkget,chunkgetlen) == 0) { + for(len=1; ((len < linelen) && (buffer[i+chunkgetlen+len] != '\"')); len++); + return(len); + } + return(0); +} + + 2.10 Getting a new chunk name from the getchunk tag
+We have encountered a <getchunk> tag. We need to strip out +the id string. We malloc a small piece of memory to contain +the name. Note that this is a potential memory leak since we are +letting allocated memory leave this function. The caller must remember +to free it. If we check the caller, the printchunk function, +we see that the buffer is freed and memory does not leak.
++/* Somebody did a getchunk and we need a copy of the name */ +/* malloc string storage for a copy of the getchunk name */ +char *getChunkname(int k, int getlen) { + char *result = (char *)malloc(getlen+1); + strncpy(result,&buffer[k+chunkgetlen],getlen); + result[getlen]='\0'; + return(result); +} + ++ +2.11 Finding the end of the chunk we are printing
++The printchunk routine delegated the task of finding the +</pre> tag to foundEnd. If we find the +tag we return 1 otherwise we return 0. +
+The chunkend and chunkendlen variables are allocated +in the global scope. +
+char *chunkend = "</pre>"; int chunkendlen = 6; ++ ++/* handle </pre> */ +/* is it really an end? */ +int foundEnd(int i) { + if (strncmp(&buffer[i],chunkend,chunkendlen) == 0) { + return(1); + } + return(0); +} + ++ +2.12 Final cleanup and housekeeping. The preamble
+ +C compilers aren't smart enough to look up symbols everyone uses so we +have to tell them where to find them. We also need to forward reference +the getchunk function because it is used before it is defined. +Compilers in 1970 couldn't handle such complexity. It is terribly +hard, you know. + ++#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/stat.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <fcntl.h> +#include <unistd.h> + +/* forward reference for the C compiler */ +int getchunk(char *chunkname); + ++ +3.0 The tangle source code
++The big chunk of code at the beginning of this book is a working +tangle program. You can clip it out of a web page, save it as +tangle.c, type +
+ gcc -o tangle tangle.c ++and you now have a working tangle program for HTML files. If you want +to improve it just modify this web page and run tangle again to extract +your changes. ++Alternatively you can get the tangle source code from +here +
+When we want the whole program we output this chunk with the command: +
+ tangle litprog.html tangle.c >tangle.c ++and it recreates the tangle.c file. + ++<getchunk id="include"> +<getchunk id="chunkbegin"> +<getchunk id="chunkend"> +<getchunk id="chunkget"> +<getchunk id="buffer"> +<getchunk id="nextline"> +<getchunk id="printline"> +<getchunk id="foundchunk"> +<getchunk id="foundEnd"> +<getchunk id="foundGetchunk"> +<getchunk id="getChunkname"> +<getchunk id="printchunk"> +<getchunk id="getchunk"> +<getchunk id="fixHTMLcode"> +<getchunk id="main"> ++ +4.0 What have we learned?
+We have walked through a fairly simple program in great detail. You +now understand why HTML causes problems, you understand what the magic +numbers in the program mean, and you understand the design decisions +such as always having pre tags in the first column.
+We have learned that we only need a simple 160 line tangle program to +handle all of the literate programming tasks. You can make this more +complex if you like. You could create a tangle plugin for Eclipse. You +could add automatic indexing. Many people have done similar things +with their literate tools. +
+Ultimately, though, the important point is that you need to +write programs to communicate with other people. We do not do that +now. The result is that there are tens of thousands of programs on +Sourceforge and Savannah and Github that are never going to be used +because nobody understands them. +
+We can do better. We can make programs that live. We can communicate +our ideas in ways that others can understand without talking to us. +
+We can be better programmers. +
+Tim Daly, November 18, 2011 +
+
src/interp/i-spec2.lisp fix AN has sqrt: % -> %
20111117.01.tpd.patch books/bookvol9 treeshake compiler
+20111118.01.tpd.patch +src/axiom-website/litprog.html added