merged 1.1 branch into head
diff --git a/doc/developers-guide/producers.xml b/doc/developers-guide/producers.xml
new file mode 100755 (executable)
index 0000000..61d5ecd
--- /dev/null
@@ -0,0 +1,509 @@
+<chapter id="producer_framework">
+<title>The producer framework</title>
+Please read the presentation of the <glossterm linkend="producers">producers</glossterm> concept
+for an introduction.
+
+<section><title>How to use the <filename>producers.xml</filename> file</title>
+  [FIXME: this should be in the admin's guide, not here]
+
+<para>
+Please check the standard <filename>producers.xml</filename> file
+for two fully commented, real-world examples:
+the <code>&lt;nodedefinition name="Language"&gt;</code> node
+and the <code>&lt;producer name="articles"&gt;</code> node.
+</para>
+
+  <section><title>Introduction</title>
+  
+  <para>
+  Mir allows admins to fully configure the producer tasks and set up 
+  arbitrary producers through the <filename>producers.xml</filename> file.
+  </para>
+  
+  <para>
+  Producers consist of "nodes". Every node has a specific function.
+  For example, one node can generate a file from a template,
+  and another node can enumerate over a collection of articles.
+  </para>
+
+  
+
+  <para>
+  A producer is defined using a <code>&lt;producer&gt;</code> tag:
+  <programlisting>    
+    &lt;producer name="content"/&gt;
+  </programlisting>  
+  This would define a producer named <code>content</code>. 
+  
+  </para><para>  
+  In a producer, <emphasis>verbs</emphasis> must be defined. 
+  Verbs are sub-tasks of a producer.
+  
+  <programlisting>
+
+    &lt;producer name="content"&gt;
+      &lt;verbs&gt;
+        &lt;verb name="new"&gt;
+        &lt;/verb&gt;
+        &lt;verb name="all"&gt;
+        &lt;/verb&gt;
+      &lt;/verbs&gt;
+    &lt;/producer&gt;
+  </programlisting>    
+  
+  This would define a producer with verbs named <code>all</code> and <code>new</code>.
+  
+  </para><para>
+  The nodes themselves, and how they relate to each other, are specified in the body:
+  
+  <programlisting>
+
+    &lt;producer name="content"&gt;
+      &lt;verbs&gt;
+        &lt;verb name="new"&gt;
+        &lt;/verb&gt;
+        &lt;verb name="all"&gt;
+        &lt;/verb&gt;
+      &lt;/verbs&gt;
+      &lt;body&gt;           
+        &lt;Generate 
+            generator="/producer/startpage.template" 
+            destination="${config.storageRoot}/index.shtml"/&gt;
+      &lt;/body&gt;  
+    &lt;/producer&gt;  
+  </programlisting>    
+  
+  As we will see later, this producer generates a single file.
+
+  </para><para>
+  Producers can be made to do different things for different verbs: 
+
+  <programlisting>
+
+    &lt;producer name="content"&gt;
+      &lt;verbs&gt;
+        &lt;verb name="new"&gt;
+          &lt;Set key="count" value="3"/&gt;
+        &lt;/verb&gt;
+        &lt;verb name="all"&gt;
+          &lt;Set key="count" value="5"/&gt;
+        &lt;/verb&gt;
+      &lt;/verbs&gt;
+      &lt;body&gt;           
+        &lt;Generate 
+            generator="/producer/startpage.template" 
+            destination="${config.storageRoot}/index.shtml"/&gt;
+      &lt;/body&gt;  
+    &lt;/producer&gt;  
+  </programlisting>    
+  
+  If a producer is called with a specific verb, the nodes of that verb
+  are processed first, and only then the body. In our case, if the producer
+  <code>content</code> is called with verb <code>new</code>, the variable <code>count</code>
+  is first set to <code>3</code>, and after that a file is generated.
+  
+</para>  
+  </section>
+<section><title>Node arguments</title>
+  <para>
+  Nodes can have arguments.
+  Arguments that take integer values can be written as direct expressions.
+  Arguments that take text values can embed expressions between <code>${</code> and <code>}</code>.
+  
+  </para>
+  <para>
+  Some examples:
+  <programlisting>
+    &lt;Log message="This article has the following title: ${content.title}"/&gt;
+
+  </programlisting>    
+  <code>Log</code> has one mandatory argument, <code>message</code>. This argument is text,
+        but can be enriched with expressions enclosed by <code>${</code> and <code>}</code>. In this example,
+        the title of an article is logged.
+
+  <programlisting>
+    &lt;Set key="age" value="34+22*(3+2)"/&gt;
+  </programlisting>    
+  <code>Set</code> has two mandatory arguments: <code>key</code> and <code>value</code>.
+        The <code>key</code> argument is a fixed text;
+        the <code>value</code> argument is a direct expression, in this case an arithmetic one.
+    
+  </para>
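+  <para>
+  The difference between the two argument styles can be sketched by putting a
+  <code>Set</code> node next to a <code>Define</code> node (both are described in the
+  reference table later in this chapter). This is only an illustration: the variable
+  names <code>count</code> and <code>filename</code> are made up for the example.
+  <programlisting>
+    &lt;!-- value is a direct expression: it is evaluated, so count becomes 5 --&gt;
+    &lt;Set key="count" value="3+2"/&gt;
+
+    &lt;!-- value is text: only the part between ${ and } is evaluated --&gt;
+    &lt;Define key="filename" value="/var/www/${content.id}.shtml"/&gt;
+  </programlisting>
+  </para>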
+
+  </section>
+<section><title>Expressions</title>
+
+  
+  <para>
+  Expressions, either direct expressions, or expressions between ${}, can contain the following
+  constructions:
+    </para>
+
+  <para>
+  <table border="1">
+    <title>Expressions</title>
+
+    <tr><td>a string literal</td>       <td><code>'hello'</code></td></tr>
+    <tr><td>a numeric literal</td>      <td><code>2138</code></td></tr>
+
+    <tr><td>a variable reference</td>   <td><code>content.title</code></td></tr>
+    <tr><td>arithmetic operators</td>   <td><code> 3 + 4 *(2+5-2)</code></td></tr>
+    <tr><td>string operators</td>   <td><code> 'hello' ++ ' ' ++ 'Mir'</code></td></tr>
+
+    <tr><td>boolean operators</td>      <td><code>(content.id==3) or (content.id in (5,7,2,8)) and (content.title!='hello')</code></td></tr>
+  </table>
+  
+  </para>
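+  <para>
+  As a hedged sketch, the constructions above could appear in node arguments as follows.
+  It assumes that the <code>condition</code> argument of <code>If</code> is written as a
+  direct expression (without <code>${}</code>), like the <code>value</code> argument of
+  <code>Set</code>; the variable name <code>greeting</code> is made up for the example.
+  <programlisting>
+    &lt;!-- string operators in a direct expression --&gt;
+    &lt;Set key="greeting" value="'hello' ++ ' ' ++ 'Mir'"/&gt;
+
+    &lt;!-- boolean operators in a condition (assumed to be a direct expression) --&gt;
+    &lt;If condition="(content.id==3) or (content.title!='hello')"&gt;
+      &lt;then&gt;
+        &lt;!-- variable references embedded in text with ${} --&gt;
+        &lt;Log message="${greeting}: ${content.title}"/&gt;
+      &lt;/then&gt;
+      &lt;else&gt;
+        &lt;Log message="Skipping article ${content.id}"/&gt;
+      &lt;/else&gt;
+    &lt;/If&gt;
+  </programlisting>
+  </para>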
+  </section>
+<section><title>Node types and statements</title>
+    <para>
+    The following node types can be used inside
+  producer definitions.
+Currently, there is only one statement (<code>&lt;nodedefinition&gt;</code>), which
+is declared outside producer definitions.
+
+
+    Here is an overview:
+  
+    </para>
+    <table border="1">
+    <title>Reference for producer node types and statements</title>
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Set</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Set a variable's value using a free expression</td></tr>
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The variable</td></tr>
+
+      <tr><td>value</td>                      <td>The expression to set the variable to</td></tr>
+      <tr>
+        <td>Example</td>
+        <td>
+          <code>
+            &lt;Set key="data.result" value="3 + 5 * (5-2)"/&gt;
+
+          </code>
+        </td>
+      </tr>
+      
+      
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Define</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Set a variable's value using a string</td></tr>
+      <tr><td colspan="2">
+
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The variable</td></tr>
+      <tr><td>value</td>                      <td>The string to set the variable to</td></tr>
+      <tr>
+        <td>Example</td>
+
+        <td>
+          <code>
+            &lt;Define key="filename" value="/var/www/${content.id}.shtml"/&gt;
+          </code>
+        </td>
+      </tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>If</code></td></tr>
+
+      <tr><td>Purpose</td>                    <td>Create a conditional part of a producer</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>condition</td>                  <td>The expression to test</td></tr>
+      <tr><td colspan="2">
+              Sub tags</td></tr>
+
+      <tr><td>then</td>                       <td>The part to process if the expression evaluates to true</td></tr>
+      <tr><td>else</td>                      <td>The part to process if the expression evaluates to false</td></tr>
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>nodedefinition</code>&nbsp;&nbsp;(statement)</td></tr>
+
+      <tr><td>Purpose</td>                    <td>Acts as a "function" (or "macro") that can be "called" elsewhere in a producer.
+More precisely, it is a way to define a new producer node type
+inside the  <filename>producers.xml</filename> file.
+(check the <code>Language</code> node in that file for an example).
+</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>name</td>                  <td>The name of the newly created node ("function")</td></tr>
+      <tr><td colspan="2">
+              Sub tags</td></tr>
+
+      <tr><td>parameters</td>                 <td>a list describing the arguments the "function" must be given </td></tr>
+      <tr><td>definition</td>                 <td>the actual body of the function, containing the code that should be executed. Note that this may contain a <code>&lt;sub/&gt;</code> tag that is replaced by the child nodes of the calling node. </td></tr>
+
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Log</code></td></tr>
+
+      <tr><td>Purpose</td>                    <td>Log a message in the producer log</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>message</td>                    <td>The message to log</td></tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Enumerate</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Enumerate over the results of a query</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The variable name that receives the enumerated record</td></tr>
+
+      <tr><td>table</td>                      <td>The table that is used to enumerate over</td></tr>
+      <tr><td>selection (optional)</td>       <td>The condition (where clause) of the query.</td></tr>
+      <tr><td>order (optional)</td>           <td>The order in which the results of the query are enumerated.</td></tr>
+      <tr><td>skip (optional)</td>            <td>The number of records to skip</td></tr>
+
+      <tr><td>limit (optional)</td>           <td>The maximum number of records to enumerate</td></tr>
+      <tr><td colspan="2">
+              Sub tags</td></tr>
+      <tr><td colspan="2">
+              This node can have subnodes that will be processed for every enumerated record</td></tr>
+<!--
+      <tr>
+        <td>Remarks</td>
+        <td>
+        </td>
+      </tr>
+      <tr>
+        <td>Example</td>
+        <td>
+          <code>
+            &lt;Generate generator="/producer/content.template" destination="/var/www/${content.date.formatted.yyyy}/${content.date.formatted.MM}/${content.id}.shtml"/&gt;
+          </code>
+        </td>
+      </tr>
+ -->      
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>List</code></td></tr>
+
+      <tr><td>Purpose</td>                    <td>Store the results of a query into a variable</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The variable name that receives the result list</td></tr>
+      <tr><td>table</td>                      <td>The table that is used to select from</td></tr>
+
+      <tr><td>selection (optional)</td>       <td>The condition (where clause) of the query.</td></tr>
+      <tr><td>order (optional)</td>           <td>The order in which the results of the query are put into the list.</td></tr>
+      <tr><td>skip (optional)</td>            <td>The number of records to skip</td></tr>
+      <tr><td>limit (optional)</td>           <td>The maximum size of the list</td></tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Batch</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Divide the results of a query into batches</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>key</td>                        <td>The variable name that receives the batch</td></tr>
+      <tr><td>infokey</td>                    <td>The variable name that receives meta information on the batches</td></tr>
+      <tr><td>table</td>                      <td>The table that is used to select from</td></tr>
+      <tr><td>batchsize</td>                  <td>The size of a batch (the first batch however varies in size)</td></tr>
+
+      <tr><td>selection (optional)</td>       <td>The condition (where clause) of the query.</td></tr>
+      <tr><td>order (optional)</td>           <td>The order in which the results of the query are put into the list.</td></tr>
+      <tr><td>skip (optional)</td>            <td>The number of records to skip</td></tr>
+      <tr><td>process (optional)</td>         <td>The maximum number of batches to process</td></tr>
+
+      <tr><td>minbatchsize (optional)</td>    <td>The minimal size of the first batch</td></tr>
+      <tr><td colspan="2">
+              Sub tags</td></tr>
+      <tr><td>batches</td>                    <td>The part to process for every batch</td></tr>
+      <tr><td>batchlist</td>                  <td>The part to process once with the meta info</td></tr>
+
+
+
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Generate</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Generate a page using a generator (i.e. an abstraction of a template)</td></tr>
+      <tr><td colspan="2">
+
+              Arguments</td></tr>
+
+      <tr><td>generator</td>                  <td>the generator to use</td></tr>
+      <tr><td>destination</td>                <td>the specification of the destination.</td></tr>
+      <tr><td>parameters</td>                 <td>Additional configuration info for the generator (for
+                                                  FreeMarker this currently contains only the desired encoding;
+                                                  leave it empty for the default).</td></tr>
+
+      <tr>
+        <td>Remarks</td>
+        <td>
+           This node is used to have an actual page generated. 
+           The generator parameter usually is the name of a template.
+           The destination is the file to be generated. 
+           Variable references are possible in all arguments, and, especially for the
+                 destination attribute, widely used.
+        </td>
+      </tr>
+      <tr>
+        <td>Example</td>
+
+        <td>
+          <code>
+            &lt;Generate generator="/producer/content.template" destination="/var/www/${content.date.formatted.yyyy}/${content.date.formatted.MM}/${content.id}.shtml"/&gt;
+          </code>
+        </td>
+      </tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>GenerateMedia</code></td></tr>
+      <tr><td>Purpose</td>                    <td>
+The <code>GenerateMedia</code> node instructs the media handler associated with the media to
+"reproduce" the media. In practice this can mean generating an icon from an image,
+writing an image from the database to the web root, or creating an m3u file.
+Media handling is limited at this moment, but a serious
+redesign is planned right after 1.1.
+</td></tr>
+      <tr><td colspan="2">
+
+              Arguments</td></tr>
+
+      <tr><td>key</td>                  <td>FIXME???</td></tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>DeleteFile</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Delete a file</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+      <tr><td>filename</td>                   <td>The filename of the file to delete</td></tr>
+
+
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>SetFileDate</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Set a file's date</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>filename</td>                   <td>The filename</td></tr>
+      <tr><td>date</td>                       <td>The date to use</td></tr>
+      
+      
+      
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Resource</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Make a message resource bundle available. In other words, this defines a function that can be used in expressions to reference a bundle. </td></tr>      
+      <tr><td colspan="2">
+
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The name of the function.</td></tr>
+      <tr><td>bundle</td>                     <td>The bundle to use</td></tr>
+      <tr><td>language (optional)</td>        <td>The specific language to use</td></tr>
+      <tr>
+        <td>Example</td>
+        <td>
+          <code>
+            &lt;Resource bundle="bundles.producer" key="lang" language="en"/&gt;
+          </code>
+          This makes the English-language producer bundle available
+          through the <code>lang()</code> function. One
+          can then use expressions like <code>${lang("page.title")}</code> to
+          refer to a value from that bundle.
+        </td>
+      </tr>
+
+
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>Execute</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Execute a script</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>command</td>                    <td>The command to execute</td></tr>
+
+
+
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>ModifyContent</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Modify a field of an article</td></tr>      
+      <tr><td colspan="2">
+
+              Arguments</td></tr>
+      <tr><td>key</td>                        <td>The variable containing the article</td></tr>
+      <tr><td>field</td>                      <td>The field to modify</td></tr>
+      <tr><td>value</td>                      <td>The value to set the field to</td></tr>
+
+
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>MarkContent</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Mark an article as produced</td></tr>      
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>key</td>                        <td>The variable containing the article</td></tr>
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>IndexContent</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Add content to, or update content in, the search index (search engine)</td></tr>
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>key</td>                        <td>The variable containing the article (FIXME????)</td></tr>
+      <tr><td>pathToIndex</td>                        <td>(FIXME????)</td></tr>
+
+      <tr><td>Name</td>                       <td bgcolor="#eea8a8"><code>UnIndexContent</code></td></tr>
+      <tr><td>Purpose</td>                    <td>Remove content from the search index (search engine)</td></tr>
+      <tr><td colspan="2">
+              Arguments</td></tr>
+
+      <tr><td>key</td>                        <td>The variable containing the article (FIXME????)</td></tr>
+      <tr><td>pathToIndex</td>                        <td>(FIXME????)</td></tr>
+
+    </table>
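+    <para>
+    To show how the node types in the table fit together, here is a minimal sketch of
+    a producer body that enumerates articles and generates one page per article.
+    It is an illustration under stated assumptions, not a copy of the standard
+    <filename>producers.xml</filename> file: the table name <code>content</code>, the
+    variable name <code>article</code> and the fields <code>id</code> and
+    <code>title</code> are assumed for the example.
+    <programlisting>
+    &lt;body&gt;
+      &lt;!-- process at most 10 records; "content" is an assumed table name --&gt;
+      &lt;Enumerate key="article" table="content" limit="10"&gt;
+        &lt;!-- the subnodes are processed once for every enumerated record --&gt;
+        &lt;Generate
+            generator="/producer/content.template"
+            destination="${config.storageRoot}/${article.id}.shtml"/&gt;
+        &lt;Log message="Generated a page for: ${article.title}"/&gt;
+      &lt;/Enumerate&gt;
+    &lt;/body&gt;
+    </programlisting>
+    </para>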
+</section>
+</section>
+
+<section><title>How the producer framework is implemented</title>
+
+<!-- <para>A Producer is a set of tasks, scripted in xml. Producers allow -->
+<!-- mir installations to have their own actions that can be called for -->
+<!-- instance when a new article is posted.  Originally producers were -->
+<!-- mostly used to generate pages, but they are used for a lot of -->
+<!-- other tasks such as pulling rss feeds for the global wire on -->
+<!-- indymedia.org. Producers are added and configured through the -->
+<!-- <filename>producers.xml</filename> file.</para> -->
+
+
+<para>The XML nodes contained within a <code>&lt;producer&gt;</code>
+tag in the <filename>producers.xml</filename> file define
+a small program.
+This program (or script) may contain constructs
+such as <code>if</code> clauses, loops and variables. The program is parsed into a
+tree of <classname>ProducerNode</classname>s (see the figure below). The root of this tree is defined in a
+<classname>NodedProducer</classname> (which is the only class that currently implements
+the <classname>Producer</classname> interface). When the producer is executed, the
+<function>produce()</function> method of each node is called recursively, effectively
+executing the program as it was scripted. </para>
+
+<figure><title>A producer in the <filename>producers.xml</filename> file is parsed into a tree of <classname>ProducerNode</classname>s</title>
+<mediaobject>
+<imageobject>
+<imagedata fileref="figures/producer-node-tree-example.eps" format="EPS"></imagedata></imageobject>
+<imageobject>
+<imagedata fileref="figures/producer-node-tree-example.png" format="PNG"></imagedata></imageobject>
+<textobject>
+</textobject>
+</mediaobject>
+</figure>
+
+<para>
+</para>
+</section>
+</chapter>