Docs:Overview
This document contains an overview of the operation of the development wiki and its supporting tools, and discusses the concepts and practices involved in creating and publishing a course via the wiki.
For details about the decision to overhaul the APEcs course development process, and the justifications for employing a wiki (and specifically MediaWiki in this case) to facilitate collaborative course development, please refer to the Rationale page.
Operational overview
The diagram to the right contains a simplified view of the components in the course development process, and the flow of data between the components. Content roughly flows in the directions of the arrows, and the portion below the box entitled 'Development Wiki' illustrates the process of converting a course stored in the wiki into a published course.
Development wiki
The development wiki - this site - contains a number of resources including documentation, tools, and other files as well as the content for the courses. Courses are organised such that each course has its own namespace, ensuring that there is no possibility of name clashes between courses and it is trivial to determine which course a given page belongs to. In order to ensure that the support tools can operate on the data in the wiki correctly, the contents of each namespace must be structured as specified in the Course Structure specification document.
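As a minimal sketch of why per-course namespaces make page ownership easy to determine: a MediaWiki page title carries its namespace as a prefix before the first colon, so the owning course can be read from the title alone. The function name and the course namespace names below are hypothetical and are not part of the wiki or its tools.

```python
def course_of(title, course_namespaces):
    """Return the course namespace a page title belongs to, or None.

    Titles take the form "Namespace:Page name"; anything without a
    recognised course prefix is treated as not belonging to a course.
    """
    prefix, separator, _page = title.partition(":")
    if separator and prefix in course_namespaces:
        return prefix
    return None


# Hypothetical course namespaces, for illustration only.
courses = {"ExampleCourse", "OtherCourse"}

print(course_of("ExampleCourse:Theme 1", courses))  # ExampleCourse
print(course_of("Docs:Overview", courses))          # None
```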
Using a wiki - especially one as mature and feature-rich as MediaWiki - in this situation has a number of significant advantages over other authoring techniques, as discussed in the Rationale page, not least of which is the immediacy, wide availability, and easy learning curve associated with using a wiki to create content.
A note about access control
Access control in any collaborative development environment is always a somewhat problematic issue: it involves balancing convenience for users against ensuring that users cannot accidentally damage others' work or access material they should not be able to see. The development wiki currently operates a system where:
- anonymous users (users who have not logged in) may not view pages in the wiki at all, other than documentation pages, help pages, and the login page. This ensures that unauthorised users may not access course and development materials.
- new users may only be created by administrators. This prevents anonymous users from self-registering and accessing the material despite the previous restriction.
- several groups of users exist in addition to the groups normally provided by MediaWiki:
  - `user` - may only read pages, and create/edit talk pages.
  - `editonly` - may edit existing pages and create/edit talk pages, but may not create new pages.
  - `editor` - may perform all normal-level wiki operations.
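The group rules above can be summarised as a small permission table. This is only an illustrative sketch of the policy as it applies to course pages: the action names (`read`, `edit`, `create`, `edit_talk`, `create_talk`) are assumptions made for this example, the `editor` entry lists only a subset of "all normal-level operations", and none of this reflects how MediaWiki itself is configured.

```python
# Group names come from the list above; action names are hypothetical.
PERMISSIONS = {
    # Anonymous users cannot view course pages at all.
    "anonymous": set(),
    "user": {"read", "edit_talk", "create_talk"},
    "editonly": {"read", "edit", "edit_talk", "create_talk"},
    "editor": {"read", "edit", "create", "edit_talk", "create_talk"},
}


def may(group, action):
    """Return True if members of `group` may perform `action` on a course page."""
    return action in PERMISSIONS.get(group, set())


print(may("editonly", "edit"))    # True
print(may("editonly", "create"))  # False
```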
The current access control system is provisional and subject to change. In particular, user management and group control are less than intuitive, and per-namespace access control would go a long way towards ensuring that accidents in one part of the wiki cannot have knock-on effects in other areas. User management and access control are currently a major development area for the MediaWiki developers, and research is continuing in APEcs on potential improvements.
Creating courses in the wiki
The first stage of course creation in the development wiki is the creation of a namespace to contain the course. This is currently a job that can only be performed by the system administrator, as it involves the direct manipulation of the wiki's configuration files. The MediaWiki roadmap includes moving namespace configuration to a more accessible web-based control framework, but it is unclear when this will happen, and depending on the timescales involved it may be productive to develop a suitable namespace manager in-house.
Once the namespace is in place, the content creators may begin to construct the course in accordance with the Course Structure specification. Multiple editors may safely work on the course, and the system keeps track of who has edited what and when, so the editors may easily examine the progress of any piece of the course.
It should be noted at this point that the requirement to follow the Course Structure specification is imposed by the current design of the wiki2course.pl and course processor scripts. Those scripts could be modified in future to permit radically different course layouts if desired. This is a subject for discussion and research in the future.
Exporting data from the wiki
Individual courses can be automatically exported from the wiki using the wiki2course.pl script, which is made available as part of the Course Processor package. The wiki2course.pl script expects to be given at least a username and namespace, and when run it prompts for a password, logs into the wiki using the username and password provided, and attempts to export the contents of the namespace to HTML files suitable for passing to the course processor. At present, this means that the contents of themes and modules in the wiki are exported, converted from wiki markup to HTML, and then written to the filesystem in the theme/module/step form the course processor expects for HTML input.
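To make the theme/module/step form concrete, the following sketch writes a list of already-converted HTML steps into a nested directory layout. It is not how wiki2course.pl works internally: the function name, the `stepNN.html` file naming, and the example theme and module names are all assumptions chosen for illustration, and the real layout is defined by the Course Structure specification and the Course Processor documentation.

```python
import tempfile
from pathlib import Path


def write_steps(out_dir, theme, module, steps):
    """Write each HTML step to <out_dir>/<theme>/<module>/stepNN.html.

    The file naming scheme here is hypothetical.
    """
    module_dir = Path(out_dir) / theme / module
    module_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, html in enumerate(steps, start=1):
        path = module_dir / f"step{number:02d}.html"
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


# Usage: export two steps of one module into a temporary directory.
with tempfile.TemporaryDirectory() as out_dir:
    paths = write_steps(out_dir, "theme1", "module1",
                        ["<p>First step</p>", "<p>Second step</p>"])
    for path in paths:
        print(path.relative_to(out_dir).as_posix())
```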
Processing exported data
Once the data has been exported from the wiki, it can be passed through the course processor. The course processor is described in more detail in the Course Processor documentation.
A rough overview of the internal operation of the Course Processor is given here. Some aspects of the system are glossed over - in particular, the complexities of the plugin loading system are skipped over entirely - but it should be enough to gain an understanding of the overall operation of the system.
The course processor software is split into roughly four major pieces: the core code; input handler plugins; output handler plugins; and reference handler plugins.
The core code - largely contained within the processor.pl script - coordinates the operation of the processor, loads the plugins that do the processing work, and invokes the input and output handlers on the course data as needed.
Which input handlers are used is determined automatically by inspecting the source data for the course. The core code asks each input handler whether it understands the source data, and whether it is capable of processing the source data into the standard intermediate format. If the input handler indicates that it knows what to do with the data, it will be run on the source data. This means that you do not need to tell the course processor which input handler plugins it should use on the source data, as it can determine that for itself. If none of the input handlers can run on the course data then the processor will exit with an error. If this happens, you need to check that the necessary input handler plugins are installed, the correct course data directory has been specified, and the data is in the correct format.
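The selection logic described above can be sketched as follows. This is an illustration of the idea only, not the processor's actual Perl code: the class names, the `can_handle` and `process` method names, and the second handler are hypothetical, and the sketch raises an exception where the real processor exits with an error.

```python
class HTMLInput:
    """Hypothetical handler that recognises HTML source files."""
    name = "html"

    def can_handle(self, source_files):
        return any(f.endswith(".html") for f in source_files)

    def process(self, source_files):
        # A real handler would convert to the intermediate format here.
        return [f for f in source_files if f.endswith(".html")]


class LatexInput:
    """Hypothetical second handler, included only to show the probing."""
    name = "latex"

    def can_handle(self, source_files):
        return any(f.endswith(".tex") for f in source_files)

    def process(self, source_files):
        return [f for f in source_files if f.endswith(".tex")]


def run_input_handlers(handlers, source_files):
    """Ask every handler whether it understands the data; run those that do."""
    used = []
    for handler in handlers:
        if handler.can_handle(source_files):
            handler.process(source_files)
            used.append(handler.name)
    if not used:
        raise RuntimeError("no input handler understands the course data")
    return used


print(run_input_handlers([HTMLInput(), LatexInput()], ["intro.html"]))  # ['html']
```

Because each handler is probed in turn, adding support for a new source format only requires installing another plugin; nothing in the calling code changes.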
Once all of the input handler plugins have inspected the source, and possibly done work on it, the output handler plugin is invoked to do its work. Unlike the input handler plugins, it is generally necessary to specify which output handler plugin should be used: this allows the processor to be used to generate a variety of output formats by running it with different output handlers selected. Output handler plugins take the intermediate format data and perform a range of operations on it to create the final course data. For example, the HTML output handler takes the intermediate format data, processes any special tags in the data, and writes out the course content via a template engine to form a completed course.
The reference handler plugins are not invoked directly by the core code but rather by the output handler plugins: during processing of the intermediate data generated by the input plugins, the output handler does a number of special tag substitutions. One of these allows references to be included in the course data and, when one is encountered, the output handler passes processing of the reference to a reference handler selected by the user. This allows the generated references to be presented in a number of different styles depending on the reference handler selected.
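The delegation from output handler to reference handler can be sketched as a tag substitution with a pluggable callback. The `[ref]...[/ref]` tag syntax, the function names, and both reference styles are assumptions invented for this example; the real tag format is described in the Course Processor documentation.

```python
import re

# Hypothetical tag syntax, used only for this illustration.
REF_TAG = re.compile(r"\[ref\](.*?)\[/ref\]")


def numbered_style(text, index):
    """Render a reference as a bracketed number."""
    return f"[{index}]"


def inline_style(text, index):
    """Render a reference inline, in parentheses."""
    return f"({text})"


def render(intermediate, reference_handler):
    """Replace each reference tag using the handler selected by the user."""
    count = 0

    def substitute(match):
        nonlocal count
        count += 1
        return reference_handler(match.group(1), count)

    return REF_TAG.sub(substitute, intermediate)


text = "See [ref]Smith 2005[/ref] and [ref]Jones 2006[/ref]."
print(render(text, numbered_style))  # See [1] and [2].
print(render(text, inline_style))    # See (Smith 2005) and (Jones 2006).
```

The output handler never needs to know how a reference is formatted; swapping the handler changes the presentation style without touching the substitution code.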
The diagram to the right illustrates the operation of the processor when the source 'Course data' contains HTML files (either authored directly, or generated by exporting data from the wiki using wiki2course.pl). The course processor passes the data through the HTMLInputHandler to create the intermediate format data, and then the HTMLOutputHandler takes the intermediate format data, the framework data, and the template files and generates the course CBT.
This is effectively a zoomed-in version of the lower part of the overview diagram shown above.
A note about safety
The development and maintenance of teaching materials is a significant undertaking in both time and effort, and centralising the storage of the material in a wiki may appear to run the risk of 'putting all the eggs in one basket'. In reality, there are a number of safeguards against data loss and corruption in place:
- the wiki software itself is robust and heavily tested (MediaWiki, the software used for this wiki, is the same system used by the Wikipedia project and many others), and it maintains a complete history of all edits. Provided that users save their changes regularly, it is possible to roll back pages to any point in their history or recover accidentally removed content quite trivially.
- the database, uploaded files, and wiki software are backed up on an hourly basis to a dedicated backup server. The backups are performed using software developed by Chris that has been used on numerous systems for several years.
- hourly database backups are stored for 4 days, while backups of uploads are currently stored for up to 16 weeks.
What this means is that users do not need to worry about trying things, or keeping backups: they may edit and test ideas without having to be concerned about breaking anything!
Planned changes
The current version of the course processor only has one target output format: the HTML format used to produce CBT packages. The intention is to extend the processor with a MediaWiki output format. This would allow courses to be authored in the wiki, then exported, processed, and imported into another wiki. The target wiki would be either a course-wide shared wiki, where students could annotate pages directly or on discussion pages, or a private wiki operated by the student, where they could edit pages and annotate locally. The possibility of syncing local changes to some central location is under consideration, and while it has complex requirements, it is not technically impossible. This subject is ongoing research and will be documented on WikiOutputHandler.