Fairly Positive

Online Prospectus

I’m currently working on a project to utilise the T4 SiteManager CMS for the production of the online and print versions of the 2015 undergraduate and postgraduate prospectuses for the University of Bristol. It was felt that the default content editing provided by the CMS was unsuitable for use without a training session. A training session was not feasible for academics who might only update programme and course information once a year. We decided to write a separate application for updating the information on programmes via some simplified screens and supporting specific workflows needed in the production of the prospectus. The application will interact with the CMS through its Java API.

Below is a chart of the architecture currently in development.

OP Architecture

The SiteManager CMS is the core platform of the Prospectus, storing templates, assets and content. It also provides two publishing channels - HTML and XML. The HTML is for the web version of the prospectus, while the XML is ingested by Adobe InDesign to create the proofs for the print version.

Staff in support services who are trained in using the CMS will use the editing facilities provided by SiteManager. The academics will update the content through the Online Prospectus (OP) web application via simplified screens - the OP web application will retrieve and update content in the CMS through the Java API. It will also connect to our data warehouse (DataHub) to get details about people, although we might switch to our LDAP service. The OP application also has its own database to support roles beyond those provided by SiteManager and the specific workflows needed in the production of the prospectus.

Using Maven Overlays With SiteManager

At work we use Apache Maven to manage the builds and dependencies of our Java-based projects. I’m looking at using Maven Overlays so we can hold some of our SiteManager CMS development (pieces of functionality to extend that CMS, such as custom navigation objects) and have it easily incorporated into the SiteManager WAR file that is deployed to our test and production servers. Basically, a Maven Overlay merges the contents of a project with a WAR file and then creates a new WAR file of the merged content. Coupled with a suitable tagging and branching policy in Git and the use of Maven profiles, we should be able to manage development and deployments to different environments. Another benefit of this approach is that I can run an instance of SiteManager on my local development machine from the command line.

We have started using Sonatype Nexus OSS internally to mirror the central Maven repositories and provide a place to hold third-party JAR files that aren’t available via public servers. The default Nexus installation provides an anonymous user that can access all of the mirrored and local repositories. Our Nexus installation is IP restricted but that is still too public to store commercially licensed artefacts. I created a restricted repository and a new anonymous user that only has access to specified repositories, i.e. not the restricted repository, which needs full authentication to access. With the restricted repository in place, I could upload the SiteManager artefacts needed for development.

To obtain the artefacts you need to get a copy of the SiteManager WAR file from the TerminalFour extranet. Make a copy of the WAR file and unpack it to find the SiteManager.jar and Acme.jar files in the WEB-INF/lib directory. You can then upload the WAR file and two JARs with an appropriate group ID, artefact ID and version number. If you don’t have a Nexus installation you can install the files to the Maven repository on your local machine:

# The complete web application
mvn install:install-file -Dfile=SiteManager.war -DgroupId=com.terminal4 \
    -DartifactId=SiteManager -Dversion=7.2.0003 -Dpackaging=war

# The API JAR unpacked from WEB-INF/lib
mvn install:install-file -Dfile=SiteManager.jar -DgroupId=com.terminal4 \
    -DartifactId=SiteManager -Dversion=7.2.0003 -Dpackaging=jar

# The supporting JAR unpacked from WEB-INF/lib
mvn install:install-file -Dfile=Acme.jar -DgroupId=com.terminal4 \
    -DartifactId=Acme -Dversion=7.2.0003 -Dpackaging=jar

With the artefacts in place you can add the dependencies to your Maven project’s pom.xml file:

<dependencies>
    <dependency>
        <groupId>com.terminal4</groupId>
        <artifactId>SiteManager</artifactId>
        <version>7.2.0003</version>
        <type>war</type>
        <classifier>SiteManager</classifier>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>com.terminal4</groupId>
        <artifactId>SiteManager</artifactId>
        <version>7.2.0003</version>
        <type>jar</type>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>com.terminal4</groupId>
        <artifactId>Acme</artifactId>
        <version>7.2.0003</version>
        <type>jar</type>
        <scope>provided</scope>
    </dependency>
</dependencies>

The scope is set to provided for the JAR files because we only need them on the classpath when compiling and testing locally; at runtime they are supplied by the SiteManager WAR file. If you are using a Nexus repository to store the SiteManager artefacts you need to tell Maven where to find it:

<repositories>
    <repository>
        <id>restricted</id>
        <name>Restricted Nexus Repository</name>
        <url>https://example/content/repositories/restricted/</url>
    </repository>
</repositories>

If the repository needs authentication then you will need to configure your ~/.m2/settings-security.xml and ~/.m2/settings.xml files with the encrypted authentication details.

To manage the different deployment environments (local development, test, production etc.) we use profiles to manage resources that are needed for each of these environments. For example, I have a local instance of MySQL for development but we use another vendor on test and production systems.

<profiles>
    <profile>
        <id>local-dev</id>
        <activation>
            <activeByDefault>true</activeByDefault>
        </activation>
        <build>
            <resources>
                <resource>
                    <directory>src/main/resources</directory>
                    <excludes>
                        <exclude>local-dev/META-INF/*</exclude>
                    </excludes>
                </resource>
            </resources>
            <plugins>
                <plugin>
                    <groupId>org.codehaus.mojo</groupId>
                    <artifactId>tomcat-maven-plugin</artifactId>
                    <version>1.1</version>
                    <configuration>
                        <contextFile>
                            ${project.basedir}/src/main/resources/local-dev/META-INF/context.xml
                        </contextFile>
                        <path>/terminalfour</path>
                    </configuration>
                </plugin>
            </plugins>
        </build>
        <dependencies>
            <dependency>
                <groupId>mysql</groupId>
                <artifactId>mysql-connector-java</artifactId>
                <version>5.1.21</version>
            </dependency>
        </dependencies>
    </profile>
</profiles>

The local-dev profile (which is active by default) adds a context.xml that refers to a local MySQL database and makes sure the WAR file has the MySQL JDBC driver. It also sets the deployed context to /terminalfour.

Finally, we need to add the war plugin to the build element of the pom.xml file for the Overlay to work:

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-war-plugin</artifactId>
            <version>2.2</version>
        </plugin>
    </plugins>
</build>

So, if we run the following command …

mvn tomcat:run-war

… the embedded Tomcat server will start up and SiteManager will be available at http://localhost:8080/terminalfour :-D.

Working With TerminalFour SiteManager

The University of Bristol has recently bought a license for TerminalFour’s CMS product called SiteManager. SiteManager is used in a number of HE institutions and we will be using it to replace our current Plone implementation. I’m working on a project that is focussing on making sure our SiteManager implementation can support the production of the undergraduate and postgraduate prospectuses (online and print).

The SiteManager product is a Java web application that sits upon a relational database and a filesystem. TerminalFour provides an API for extending the functionality of the product. Documentation for the API is just Javadoc, but I did attend some excellent training by Paul Kelly, TerminalFour’s Software Architect, a few months ago.

I’ve decided to blog about how we are using the API in case other people find it useful. Since it is a commercial project I was worried that TerminalFour might object to code snippets of their API appearing on a public blog, but I contacted them and they are happy for me to go ahead.

Exploring the Internet of Things: Creativity and Learning

Last Thursday (6th September, 2012) I was lucky enough to attend the first of three workshops on Exploring the Internet of Things: Creativity and Learning. The event was organised by Sarah Eagle and the first workshop was facilitated by Karen Abadie, Steve Bullock and Peter Bennett, with support from members of Bristol Hackspace.

The purpose of the workshops is “to explore the relationships between creativity, informality, and future forms and uses of hardware and software …” with a mix of practical and abstract/theoretical sessions. The first workshop was a hands-on session where we got to play with and explore the Arduino. A number of exercises were provided to give a taste of what is possible with the Arduino. I teamed up with Emma Weitkamp from UWE and we eventually created the (rather silly) Cake O’Meter:

Our neighbours created the amazing DJ Shadow and other participants created equally interesting items.

The day ended with some discussion about the types of teaching and learning and issues around privacy and security when things are connected to the internet.

I’ve dusted off the Arduino I bought a couple of years ago and have started experimenting with it again. I’m also looking forward to the next workshop.

Bibliographic Management

Earlier in the week I attended the Mobile technologies in libraries: information sharing event in Birmingham. It was an interesting event and I spoke to a few librarians and information specialists. A Lanyrd page has a full list of details and materials.

I ran a breakout session on ‘Bibliographic management on mobile devices’ which covered the m-biblio project. It was a small group but they were interested in the project and the issues surrounding using mobile devices for managing bibliographic references.

I’ve put slides on Slideshare:

Dev8D, 2012

Last week (14 - 16 February, 2012) I attended the Dev8D developer conference with Damian Steer. The conference is primarily aimed at developers working in Higher Education, but also attracts developers from other sectors and some indies as well. Calling it a conference doesn’t really do it justice, since there is a mix of invited speakers, delegates offering talks, workshops and tutorials. The event is free for the attendee and is funded by JISC and other sponsors. The Professional Development Group of IT Services at the University of Bristol were kind enough to fund my travel, accommodation and subsistence.

This year the format changed slightly, with fewer lightning talks and the ability for people to offer sessions on whiteboards. In the afternoon, those sessions that attracted the most interest went ahead - you put a mark next to a session you were interested in with a marker pen. As you would expect, the quality of the sessions varied but the net gain of learning new technologies and talking to other developers outweighed any minus points. In fact, Dev8D promotes voting with your feet - if a session isn’t what you expected or is too easy, leave and find another session.

On the first day I attended Alex Bilbie’s (@alexbilbie) session on HTML5. I’ve already used some HTML5 with m.bristol.ac.uk but it was good to see an overview of the various changes to HTML5, the tags available and examples of where to use them.

I also attended the Pearson Education session on their Developer API, which includes access to FT Press, DK Eyewitness Travel Guides and Longman Dictionary. The travel guide API looks particularly interesting if you want data on one of the cities they cover. The dictionary also includes multimedia content for certain words, so it could be used in flashcard-type applications for kids. I think Pearson are still working on the pricing framework, since the API call limit doesn’t seem high for some APIs, such as the dictionary, and would become expensive quickly.

On the first day I also attended the Jorum session. I wanted to learn a little more about learning repositories. Mimas are working on providing a RESTful API over Jorum, which uses DSpace. The Jorum team have a challenge to create “applications that demonstrate useful, innovative, original use the Jorum DSpace Read API for the benefit of HE and FE”. I was initially interested in this, but it looks like the team have a lot of work to do to make the API scalable, since a call can return more information than you need. For example, I sent a query for information on a community - it returned ~65,000 lines of JSON. This was far too much data for my poor brain to parse and work out what would be relevant for further API requests.

The end of the first day was marred by breaking my glasses and I missed the morning of the second day due to being at an opticians getting a new pair.

Mike broke his glasses ... idiot

When I got back to the event I attended a session on The JLern Experiment and related programming challenge. This is around a JISC Learning Registry Node which is attempting to create a community of creators, publishers, curators and consumers. I need to read more information about The Learning Registry project and the idea of capturing ‘paradata’ around a learning resource. In this sense, ‘paradata’ refers to activity data around an item, such as feedback, rankings and usage data.

I was interested in the Introductory and Advanced session on CoffeeScript by Jack Franklin. A few slides followed by a programming challenge certainly made me concentrate :-). I never enjoy writing JavaScript and I thought CoffeeScript might be a useful approach. CoffeeScript is a language that compiles to JavaScript; it does away with braces and semicolons, and indentation is significant. So, the following JavaScript …

square = function(x) {
  return x * x;
};

… can be written like this in CoffeeScript:

square = (x) -> x * x

It seems fairly clean, although verbosity in languages doesn’t usually bother me - I like Java and Objective C :-). I’m going to spend some time learning CoffeeScript over the next few weeks and have bought Trevor Burnham’s CoffeeScript: Accelerated JavaScript Development. If I become confident in using the language I’ll offer to talk about it for one of our internal tech talks.

On the second day I also went to a really informative session by Owen Stephens and Thomas Meehan on library data. I’ve started accessing library data for the m-biblio project and they provided a really useful session on MARC and why library catalogues provide the information in a certain format. The session was a rich mine of information on systems, tools and formats.

The third day was busy attending sessions, looking in the project zone and catching up with some developers who hadn’t been able to attend the earlier days of the event. I attended a session on ePub, where the exercises included creating a basic ePub book by hand, but that involved copying XML off a number of PowerPoint slides. It was interesting to see what constitutes the ePub format, but you’d definitely create one with tools such as KindleGen 2 or iBooks Author. Damian and I also managed to have a whirlwind visit to the British Library to see a number of exhibits, including ‘Manga: Professor Munakata’s British Museum adventure’.

One exciting development of the three days was finding out that Wilbert Kraan (@wilm) of Cetis uses Glint, the SPARQL client application that I wrote for OS X. I really need to find some time to fix some bugs and develop the application further!

I would highly recommend Dev8D to other developers in the HEI community. There are many interesting talks and sessions and, with several parallel tracks, the hardest thing is deciding what to attend.

Exhibit at the British Museum, with pointless photoshop filter added

Success in Scanning a University Barcode

Yay! I’ve managed to successfully scan some telepen barcodes (including one in a University library book) with the iPhone.

Successfully scanned barcode

I created a telepen decoder to work with the ZBar barcode reader library and related iPhone SDK. The code needs more testing and improving, but I’m happy with the result.

How the Barcode Numbers Are Encoded and Decoded

In a previous post I noted that Telepen is the proprietary barcode format used in the Library at the University of Bristol. The Telepen symbology is publicly available and this post documents my understanding of the symbology.

The symbology has a number of key characteristics:

  • It represents the full ASCII character set
  • Characters take up the same amount of space
  • Wide to narrow bar ratio is 3:1
  • Four possible combinations of wide and narrow bars and spaces
  • Can be read as a binary sequence; uses 8-bit even-parity characters
  • Has a start character (_), stop character (z) and a check character

Telepen barcodes can represent numeric data (like the numbers used by the University) in a double-density mode. This means an ASCII character is used to represent a pair of numeric characters.

1511075964

For each pair of numbers (15 11 07 59 64), we add 27 to get their ASCII representation (17-26 are reserved for the character series 0X to 9X), which leaves us with:

42 38 34 86 91

These should be prefixed by the start character ‘_’ (ascii value of 95) and postfixed by the check digit (90 in this case) and the stop character ‘z’ (ascii value 122), giving us:

95 42 38 34 86 91 90 122

The values can then be looked up in the symbology character set to create the barcode. Alternatively, for creating barcodes for testing, I used a barcode generator with the unencoded values:

telepen barcode for 1511075964
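The encoding steps above can be sketched in a few lines of Python. This is only an illustration of my reading of the post, not the code used in the project: the function names are made up, the check value is calculated with the modulus 127 rule that the post works through in its decoding section, and the handling of 0X to 9X pairs assumes they map directly onto the reserved values 17 to 26.

```python
START, STOP = ord("_"), ord("z")  # 95 and 122, as given in the post


def encode_pair(pair):
    """Map a two-character pair to its double-density value."""
    if pair[1] in "Xx":
        # Assumption: 0X..9X occupy the reserved values 17..26 in order.
        return 17 + int(pair[0])
    return int(pair) + 27


def check_value(values):
    """127 minus (sum modulo 127); a result of 127 is written as 0."""
    return (127 - sum(values) % 127) % 127


def encode_numeric(number):
    """Return the list of ASCII values for a double-density numeric barcode."""
    if len(number) % 2:
        raise ValueError("double-density mode needs an even number of characters")
    data = [encode_pair(number[i:i + 2]) for i in range(0, len(number), 2)]
    return [START, *data, check_value(data), STOP]


print(encode_numeric("1511075964"))  # [95, 42, 38, 34, 86, 91, 90, 122]
```

The printed list matches the values worked out by hand above.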

To decode the barcode you could analyse the image against the symbology character set. However, more interestingly, the barcode can be read as a bit stream. There are four possible patterns:

  • narrow bar + narrow space (1)
  • wide bar + narrow space (00)
  • wide bar + wide space (010)
  • narrow bar + wide space (01 or 10 - alternates within a byte)

We are dealing with 8-bit even-parity characters which are encoded with least significant bit first.

Looking at the barcode above, the first pattern is a narrow bar and a narrow space (1) giving us the pattern:

1

There are then four more narrow bar and narrow space patterns (1), giving us the following pattern:

11111

The next pattern is then a wide bar + wide space (010), so we now have the following 8 bit pattern:

01011111

The decimal value for this pattern is 95, which represents ‘_’ in ascii which, in turn, is the start character for the barcode.
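As a small sketch of the "least significant bit first" rule, the bits can be collected in the order they are scanned and then reversed before being read as a binary number. The function name below is hypothetical, and it assumes the bar and space patterns have already been turned into a string of 0s and 1s.

```python
def byte_from_scan_bits(bits):
    """Convert 8 bits given in scan order (least significant bit first)."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight '0'/'1' characters")
    # Reversing puts the most significant bit first, ready for int(..., 2).
    return int(bits[::-1], 2)


# Five narrow bar + narrow space patterns (1 each), then wide bar + wide space (010).
scan_order = "1" * 5 + "010"
value = byte_from_scan_bits(scan_order)
print(value, chr(value))  # 95 _
```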

If we continued to look at the next set of bar and space patterns, we would get the following 8 bit pattern:

10101010

So, the decimal value for this pattern is 170. However, the most significant bit is a parity bit and can be discarded to obtain the correct ASCII value. The decoded bytes use even parity, so if the lower 7 bits of the byte contain an odd number of 1s then the most significant bit will be set to 1 - this provides some simple error detection when decoding the barcodes.

We can use a bitwise operation to mask the most significant bit to obtain the ascii value:

int ascii_value = decoded_byte & 0x7F;

In the case of 170 that would leave us with 42. If we deduct 27, that will leave 15, which are the first two digits of the barcode number 1511075964.
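The parity check and the mask can be combined, so that a corrupted byte is rejected rather than silently decoded. The following is a rough sketch with hypothetical function names, not the ZBar decoder itself.

```python
def has_even_parity(byte):
    """True if the 8-bit value contains an even number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 0


def ascii_from_decoded_byte(byte):
    """Check parity, then drop the parity bit to leave the 7-bit ASCII value."""
    if not has_even_parity(byte):
        raise ValueError(f"parity error in byte {byte}")
    return byte & 0x7F


print(ascii_from_decoded_byte(170))  # 42
```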

If we decoded all of the patterns we would have the following numeric values:

95 170 166 34 86 219 90 250

After we mask the most significant bit we are left with:

95 42 38 34 86 91 90 122

We can remove the start character ‘_’ (95) and the stop character ‘z’ (122) and that leaves us with the decoded numbers and the check digit:

42 38 34 86 91 90

To validate against the check digit obtain the sum of the numbers (excluding the check digit):

42 + 38 + 34 + 86 + 91 = 291

We then use modulus 127 to find the remainder:

291 % 127 = 37

Deduct the remainder from 127 to get the check digit:

127 - 37 = 90

So, the check digit matches! If we then subtracted 27 from each of the decoded numbers:

42 - 27 = 15, 38 - 27 = 11, 34 - 27 = 07, 86 - 27 = 59, 91 - 27 = 64

Thus, giving us 15 11 07 59 64. :-)

There are some exceptions to be noted.

  • The barcode numbers can end in X, and so the pair sequence 0X to 9X is represented by the values 17 to 26.
  • When working out the check digit, if the calculated value is 127 then the check digit is actually 0
  • There are other conditions when representing ascii values and not the double-density numeric mode - I won’t worry about those for the moment since I’m interested in only numeric values for this project.
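Putting the decoding steps and the first two exceptions together gives something like the sketch below. It assumes the bytes have already been read from the barcode, and that the pairs 0X to 9X map in order onto 17 to 26; the function name is hypothetical and the full ASCII mode is ignored, as in the post.

```python
def decode_numeric(raw_bytes):
    """Decode double-density numeric Telepen bytes into the barcode number."""
    values = [b & 0x7F for b in raw_bytes]  # drop the parity bit
    if len(values) < 4 or values[0] != ord("_") or values[-1] != ord("z"):
        raise ValueError("missing start '_' or stop 'z' character")
    *data, check = values[1:-1]
    # 127 minus (sum modulo 127); a calculated value of 127 is stored as 0.
    expected = (127 - sum(data) % 127) % 127
    if check != expected:
        raise ValueError(f"check digit mismatch: got {check}, expected {expected}")
    pairs = []
    for value in data:
        if 17 <= value <= 26:
            pairs.append(f"{value - 17}X")    # assumed mapping for 0X..9X
        elif 27 <= value <= 126:
            pairs.append(f"{value - 27:02d}")  # ordinary pair, 00..99
        else:
            raise ValueError(f"value {value} is outside the numeric mode range")
    return "".join(pairs)


print(decode_numeric([95, 170, 166, 34, 86, 219, 90, 250]))  # 1511075964
```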

In future posts, I’ll document how I’m attempting to write a decoder for the ZBar bar code reader so I can decode the telepen barcodes without using pencil and paper. :)

University of Bristol Barcode Numbers

The Telepen barcode used by the Library at the University of Bristol encodes a 10 digit number that uniquely identifies an item of stock such as a book or journal. The barcode number has a number of characteristics:

  • The first digit is the prefix and is always the number 1
  • The second through to the 9th digit will be from the range 0 to 9.
  • The tenth digit is the check digit and can range from 0 to 9 or be the character X

The check digit allows us to test whether or not the barcode number is a valid number used by the University, since we use a specific weighting algorithm. This is independent of the check digit used by the Telepen barcode symbology.

In testing the check digit we ignore the prefix, which is the first digit. Each of the next eight digits is multiplied by the corresponding weighting in the following list: {7, 8, 4, 6, 3, 5, 2, 1}. Modulus 11 is then applied to the sum of the weighted values to get a remainder. The remainder is then subtracted from 11 to get the check digit value. If the value is 10 or 11, then that is represented by the characters X or 0 respectively.

Example 1

1511075964

We ignore the prefix and multiply the next 8 digits against the appropriate number in the weightings list:

(5 x 7) + (1 x 8) + (1 x 4) + (0 x 6) + (7 x 3) + (5 x 5) + (9 x 2) + (6 x 1) = 117

Find the remainder:

117 % 11 = 7

Subtract from 11 to find the check digit:

11 - 7 = 4

Therefore, 1511075964 is a valid University barcode because the last number matches the check digit created by the algorithm.

Example 2

142837074X

Ignore the prefix and multiply the next 8 digits against the appropriate number in the weightings list:

(4 x 7) + (2 x 8) + (8 x 4) + (3 x 6) + (7 x 3) + (0 x 5) + (7 x 2) + (4 x 1) = 133

Find the remainder:

133 % 11 = 1

Subtract from 11 to find the check digit:

11 - 1 = 10

10 is represented by X and this matches the last digit of 142837074X
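The weighting algorithm is short enough to sketch in Python. The function names here are hypothetical; the weights and the X and 0 rules are the ones described above.

```python
WEIGHTS = (7, 8, 4, 6, 3, 5, 2, 1)


def university_check_char(barcode):
    """Calculate the check character from digits 2 to 9 of the barcode number."""
    total = sum(int(digit) * weight for digit, weight in zip(barcode[1:9], WEIGHTS))
    value = 11 - total % 11
    # 10 is written as X and 11 as 0; everything else is the digit itself.
    return {10: "X", 11: "0"}.get(value, str(value))


def is_valid_university_barcode(barcode):
    """Check the length, the prefix of 1 and the final check character."""
    body = barcode[:9]
    return (
        len(barcode) == 10
        and body.isascii()
        and body.isdigit()
        and barcode[0] == "1"
        and barcode[9].upper() == university_check_char(barcode)
    )


print(is_valid_university_barcode("1511075964"))  # True
print(is_valid_university_barcode("142837074X"))  # True
```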

Reading Telepen Barcodes

The Library at the University of Bristol uses Telepen barcodes for stock management. In the m-biblio project I’d like to be able to read the barcodes within the smartphone application we are creating. I’m looking at using the ZBar bar code reader, which also includes an iPhone SDK. ZBar supports a number of barcode symbologies, including EAN-13/UPC-A, UPC-E, EAN-8, Code 128 and QR Codes. However, it doesn’t support the Telepen barcode symbology. I’ve spent far more time than I’d like to admit looking at how easy it would be to add a new decoder to the ZBar SDK to decode the Telepen symbology. It probably would have been a lot easier if I’d written a reasonable amount of C in the last eight years.

I’ve had some success implementing a new decoder that can successfully decode a number of Telepen barcodes of various sizes. For example, the following barcode was decoded by using the command line zbarimg utility:

Barcode


I’m planning to document what I’ve learnt and implemented over a series of blog posts.