One of the tasks that I’ll be working on during my brief stint here in Nepal is researching and (hopefully) implementing a way to organize all the different media objects produced by OLE Nepal as the basis for their E-Paath learning activities. Currently we are talking about several thousand images, sounds, texts and videos but it’s not hard to imagine their repository containing a hundred thousand or more artefacts in the not-too-distant future. Apart from the specific OLE Nepal use-case I also believe that even larger content repositories have to be a core consideration for both the larger OLPC and SugarLabs efforts.
In order to efficiently handle this quantity of material one needs a solid and scalable solution. Let’s just call it Educational Content Management (ECM), shall we?
The basic requirements for such a solution are as follows:
- the ability to handle tens if not hundreds of thousands of multimedia objects
- easy to search so existing objects can be quickly retrieved
- a version control mechanism, especially for text documents which tend to undergo a lot of revisions
- reasonably easy to integrate in the current workflow (I’ll take a closer look at this aspect in just a bit)
- the ability to define workflows with the simplest one of them being the review of an object
- support for metadata that goes beyond what normal file formats offer
- allow for batch processing (upload, download, tagging, etc.)
- preferably based on software people already know, e.g. a browser or file explorer
After doing some research last week I came up with half a dozen solutions that looked reasonably well suited to meet these requirements.
Upon further inspection I decided to give Alfresco a shot since it appeared to be the most versatile solution. Well, two days later and I’m still stuck toying around with Alfresco and not very successful at getting it to do what I want. In particular I’ve been concentrating on two use cases that I’d like to address by utilizing an ECM solution.
Among the most important assets during the design of an E-Paath activity are four text documents:
- Activity document: This is the blue-print for the activity and contains every piece of information the developers need to implement the activity. It can therefore be considered the activity’s DNA. What’s important is that here at OLE Nepal the activity document is created by a curriculum expert and/or teacher. (Other broader collaborations with technical people are of course also possible.)
- Teacher’s note: An extensive document detailing learning goals, links to school book contents, ideas for preceding and follow-up activities, etc.
- Lesson plan: This contains a detailed overview of how teachers can use the activity in the classroom, which homework can be assigned to pupils, etc.
- Help text: E-Paath activities contain an online help-text to facilitate the use of the activities.
With regard to the workflow the documents are initially written by a teacher and/or curriculum expert and then get gradually refined and improved by various people. There are also review stages, especially when it comes to the Nepali text that is being displayed within the activities.
At the moment all of these texts are saved as .docx files and stored on a central fileserver where multiple versions of the same document are saved for archival purposes. People communicate informally about which version is the latest one, which steps need to be taken next, etc.
In my mind this is a clear scenario and use case that could benefit from the use of an ECM that would allow for workflows to be implemented explicitly, for roles to be distributed to different people and as a one-stop solution for saving and retrieving the current and relevant versions of the documents.
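To make "workflows implemented explicitly" a bit more concrete, here is a minimal sketch of the simplest case, a review workflow. This is not how Alfresco models workflows; the state names, the roles ("author", "reviewer") and the rule that a document sent back for changes becomes a new version are all my own assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"


# Allowed transitions and the (hypothetical) role permitted to trigger each.
TRANSITIONS = {
    (State.DRAFT, State.IN_REVIEW): "author",
    (State.IN_REVIEW, State.APPROVED): "reviewer",
    (State.IN_REVIEW, State.DRAFT): "reviewer",  # sent back for changes
}


@dataclass
class Document:
    name: str
    state: State = State.DRAFT
    version: int = 1
    history: list = field(default_factory=list)

    def move_to(self, new_state: State, user: str, role: str) -> None:
        """Apply a transition if it exists and the role is allowed to make it."""
        required = TRANSITIONS.get((self.state, new_state))
        if required is None:
            raise ValueError(
                f"{self.state.value} -> {new_state.value} is not allowed"
            )
        if role != required:
            raise PermissionError(f"only a {required} may do this")
        # Record who moved which version from where to where.
        self.history.append((self.version, self.state, new_state, user))
        # Assumption: a document sent back to draft starts a new version.
        if new_state is State.DRAFT:
            self.version += 1
        self.state = new_state
```

The point of the sketch is that "which version is the latest" and "what happens next" become properties of the document itself rather than something people have to agree on informally.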
Since Alfresco offers a SharePoint Protocol component the idea was to set this up in the backend and allow people to interface with the system via their current software of choice, Word 2007.
The problem here is that after 10 hours of experimenting and reading countless PDFs and forum threads I still haven’t managed to get this running. Using the built-in Office functionality I can create a document workspace and subsequently save a document into it. However whenever I restart Word and try to retrieve documents from that document workspace I end up looking at an error message telling me that the repository URL isn’t valid.
The Office add-ins provided by Alfresco on the other hand allow me to browse existing document workspaces and also create new ones. However once I try to save a document into it I’m presented with an error message that Word isn’t able to save anything there. Similarly trying to check out existing documents from a workspace can result in empty error messages.
This is what I like to call being stuck between a rock and a hard place.
The second major use case I’m trying to address with Alfresco is the management of image files which make up the majority of the assets created for E-Paath activities. Even today with a relatively small team of content developers having worked on activities for two years there are thousands of images that are stored on the fileserver. Even with a decent naming scheme, which is only partially utilized, it’s not hard to imagine that finding existing images is anything but easy. Imagine what the situation will look like 5 years from now when changing teams of dozens of developers and volunteers will be dealing with thousands and thousands of available assets. This isn’t just an issue for the team here in Nepal. Imagine how much it will hamper content sharing on a global scale between Nepal, Uruguay, Peru, Rwanda, Austria, UK, etc.
In order to deal with this issue a solution should have the following capabilities:
- quick search to find existing materials
- batch capabilities for upload, download and tagging
- support for extensive but not mandatory metadata (what this means is that it should be possible to add metadata incrementally at a later date therefore not forcing developers to spend time with tagging content at the time of upload)
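The three capabilities above can be sketched in a few lines. This is a toy in-memory catalog, not an Alfresco API; the class and method names (`Catalog`, `upload`, `tag`, `search`, `untagged`) are made up for illustration. What it tries to show is that upload never asks for metadata, and that tags can be added to many files at once at a later date.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    path: str
    tags: set = field(default_factory=set)     # may stay empty at upload time
    extra: dict = field(default_factory=dict)  # free-form, e.g. {"grade": "2"}


class Catalog:
    def __init__(self):
        self._assets = {}

    def upload(self, paths):
        """Batch upload: register files without requiring any metadata."""
        for path in paths:
            self._assets.setdefault(path, Asset(path))

    def tag(self, paths, *tags, **extra):
        """Batch tagging: add tags and fields to many assets at once."""
        for path in paths:
            asset = self._assets[path]
            asset.tags.update(tags)
            asset.extra.update(extra)

    def search(self, *tags, **extra):
        """Return paths of assets carrying all given tags and field values."""
        wanted = set(tags)
        return sorted(
            a.path
            for a in self._assets.values()
            if wanted <= a.tags
            and all(a.extra.get(k) == v for k, v in extra.items())
        )

    def untagged(self):
        """Return paths that still have no metadata, i.e. the tagging backlog."""
        return sorted(
            a.path for a in self._assets.values() if not a.tags and not a.extra
        )


catalog = Catalog()
catalog.upload(["cow.png", "goat.png", "river.png"])   # no metadata needed
catalog.tag(["cow.png", "goat.png"], "animal", grade="2")
print(catalog.search("animal"))   # ['cow.png', 'goat.png']
print(catalog.untagged())         # ['river.png']
```

The `untagged` helper is there because optional metadata only works if somebody can later find out which files still need attention.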
Since using a single solution for both document and image management seemed like a good idea I again toyed around with Alfresco to see what it had to offer.
Batch uploading worked like a charm, if you use Chrome or Internet Explorer that is. There’s some sort of weird issue going on with a combination of Firefox, Flash, AdBlock extension and Windows Vista which is of course exactly the combination I happen to use. Once the images are in the document library however it’s a pain to add metatags to them as this, to the best of my knowledge, can only be done on a per-picture basis. The search on the other hand works very nicely, however I would definitely love to see a batch download solution there to allow me to download a whole result set with a single click.
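Just to illustrate what I mean by downloading a whole result set in one go: given the list of file paths a search returns, bundling them into a single archive is not much work. The sketch below uses only Python's standard library and has nothing to do with Alfresco's internals; the function name `bundle_results` is hypothetical, and it assumes the files are reachable on a local or mounted path.

```python
import zipfile
from pathlib import Path


def bundle_results(paths, archive_path):
    """Write every existing file in `paths` into one zip archive.

    Returns the list of paths that were skipped because they are not files.
    Files are stored under their bare file name, so two results with the
    same name from different folders would collide; a real solution would
    need to keep part of the folder structure.
    """
    skipped = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for p in map(Path, paths):
            if p.is_file():
                archive.write(p, arcname=p.name)
            else:
                skipped.append(str(p))
    return skipped
```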
At the end of the day what I’m stuck with is a solution that seems to have a lot of potential but currently doesn’t quite have what it takes to be my Educational Content Management system of choice.
Anyway, since I’m still actively toying around with Alfresco I’d appreciate any pointers and information about potential solutions for the issues described above. At the same time I’d also be very interested in your suggestions for and experiences with other Enterprise Content Management solutions that meet the requirements discussed at the beginning of the article.