
Users and use cases: part two

September 27, 2012

This is a follow-up to an earlier post on users and use cases. That post discussed the needs of our users and the ways in which we have accounted for those needs during this project. This post will consider the requirements that archivists have as users of the cataloguing tool, Alicat (Archival Linked-data Cataloguing).

Alicat, the tool that we have been pilot testing during this project, allows archivists to process catalogue content as Linked Data, as part of the cataloguing process. It enables cataloguers to identify terms within their own descriptions and define each term as a concept, place, person, or organisation. This is done by highlighting the relevant text within a chosen field (e.g. Scope and Content) and when prompted, verifying in which of the four categories the term belongs. The term can then be added to an index of access points.

For example, to tag Hamrin, Iraq as a new place name, simply highlight the word ‘Hamrin’. Alicat will provide a list of suggested locations from Geonames. You can choose one of these suggested place names or, alternatively, define a new place name by pinpointing a location in Google Maps.

Index terms can also be created by other means. Eventually, archivists will be able to use Alicat to import data from a number of external systems (during our pilot test, this function was only available using data stored in AIM25-UKAT and Geonames). This function allows archivists to browse their own descriptions for pre-defined terms that exist as personal, corporate, place, and subject names in AIM25-UKAT. By clicking on the relevant ISAD(G) field, then moving the cursor away from that field and clicking once more, the archivist instructs Alicat to analyse that particular body of text. After a few seconds, those terms that already exist in AIM25-UKAT will be coloured according to their categories (blue for people, brown for organisations, red for concepts, and green for places). To mark up these terms, users can simply click and drag the relevant coloured words from the catalogue description into the index on the right-hand side of the screen.
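As a rough illustration only (this is not Alicat's actual code, which we have not seen), the analysis step can be thought of as scanning a field for terms already held in a vocabulary and colouring each match by category. In the sketch below, the function name `find_known_terms`, the `Match` type, and the three-entry vocabulary are all hypothetical; only the colour scheme comes from the description above.

```python
import re
from typing import NamedTuple

# Colour scheme as described in the post.
CATEGORY_COLOURS = {
    "person": "blue",
    "organisation": "brown",
    "concept": "red",
    "place": "green",
}


class Match(NamedTuple):
    start: int
    end: int
    term: str
    category: str
    colour: str


def find_known_terms(text: str, vocabulary: dict) -> list:
    """Return matches of vocabulary terms in text, in order of appearance."""
    if not vocabulary:
        return []
    # Try longer terms first so a longer term wins over a shorter one
    # that starts at the same position.
    terms = sorted(vocabulary, key=len, reverse=True)
    # Lookarounds rather than \b, so terms ending in ')' still match.
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(t) for t in terms) + r")(?!\w)",
        re.IGNORECASE,
    )
    by_lowercase = {t.lower(): t for t in vocabulary}
    matches = []
    for m in pattern.finditer(text):
        term = by_lowercase[m.group(0).lower()]
        category = vocabulary[term]
        matches.append(
            Match(m.start(), m.end(), term, category, CATEGORY_COLOURS[category])
        )
    return matches


# Hypothetical stand-in for a tiny slice of AIM25-UKAT.
vocabulary = {
    "Gallipoli": "place",
    "Hamrin": "place",
    "29th Division": "organisation",
}
text = "Papers relating to the 29th Division at Gallipoli, 1915."

for match in find_known_terms(text, vocabulary):
    print(match.term, match.category, match.colour)
# 29th Division organisation brown
# Gallipoli place green
```

A matcher of this kind only finds terms that appear in the text exactly as they are written in the vocabulary, which is why abbreviations and alternative names still need an archivist's attention.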

When enriching catalogues with index terms, it is likely that most archivists who use Alicat either will draw on the data found in AIM25-UKAT (or another external CMS), or will use the tool to identify and define new terms. Since AIM25-UKAT does not have an exhaustive set of terms, it is inevitable that archivists will need to spend some time defining new terms.

Archivists who are using this tool during the cataloguing process should find it a great benefit to be able to create authority records either by defining new terms, or by drawing on the vast amount of data that is housed in AIM25-UKAT. Archivists wishing to edit descriptions in existing catalogues will find that Alicat is useful in this regard also. When accessed through Alicat, existing catalogue descriptions are not read-only but in fact can be altered. For instance, inconsistencies such as variations of the same personal, corporate, place and subject names can be amended manually.

The testing of Alicat by archivists has allowed Alicat’s developer to respond to problems and suggestions in order to make the tool both more user-friendly and more effective.

For instance, during our first test, we instructed Alicat to analyse the Scope and Content field of one of our catalogues and to highlight any existing, pre-defined terms. Alicat failed to identify more than a couple of AIM25-UKAT terms that were not already present in the catalogue’s index. We could see that there were a further eight or nine terms that had not been identified – terms that we knew had been added to AIM25-UKAT.

This initial test revealed an issue that was already apparent to Alicat’s developer. He acknowledged that users needed a way to highlight terms that Alicat had missed (i.e., terms known to be in AIM25-UKAT) and to select them from a list of AIM25-UKAT suggestions. This would work in the same way as defining a new place name, where Alicat presents a list of suggested place names from Geonames (and, where applicable, from AIM25-UKAT also).

Clearly, this is a very important function. It is not essential that Alicat finds all of the relevant terms from AIM25-UKAT at the first time of asking (although of course, that would be ideal), but it is essential that archivists can highlight within bodies of text terms that they suspect are in AIM25-UKAT, so that they can then select these terms from AIM25-UKAT and mark them up as index terms.

This is necessary not least because there are some terms (such as abbreviations or alternative names) that only humans (as opposed to machines) could be expected to identify. For instance, in the example pictured above, ‘Gallipoli’ appears highlighted in green in the Scope and Content, denoting it as a place. However, as we pointed out in our earlier post on users, ‘Gallipoli’ also exists in AIM25-UKAT as a concept, as the non-preferred term for ‘Dardanelles’. It is understandable that Alicat did not make this connection, but it is important that at this point, an archivist is able to intervene and select the terms ‘Dardanelles’ and ‘Gallipoli’ from the AIM25-UKAT data. Another example is the abbreviated term ‘29 Div’: only an archivist with the necessary background knowledge would be able to recognise this as referring to ‘29th Division’, a corporate name that we have recently added to AIM25-UKAT.
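The relationship between ‘Gallipoli’ and ‘Dardanelles’ can be sketched as a small thesaurus lookup in which a non-preferred label leads back to its preferred term. This is a minimal sketch under our own assumptions: the `Concept` class and the `build_lookup` and `resolve` helpers are hypothetical names, and we do not know how AIM25-UKAT stores these relationships internally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    preferred: str
    non_preferred: list = field(default_factory=list)


def build_lookup(concepts: list) -> dict:
    """Map every label, preferred or not, to the concept it belongs to."""
    lookup = {}
    for concept in concepts:
        for label in [concept.preferred, *concept.non_preferred]:
            lookup[label.casefold()] = concept
    return lookup


def resolve(label: str, lookup: dict) -> Optional[Concept]:
    """Return the concept for a label, or None if the label is unknown."""
    return lookup.get(label.strip().casefold())


# Hypothetical one-entry thesaurus based on the example in the post.
lookup = build_lookup([Concept("Dardanelles", ["Gallipoli"])])

concept = resolve("Gallipoli", lookup)
print(concept.preferred)      # Dardanelles
print(concept.non_preferred)  # ['Gallipoli']
```

A lookup like this covers alternative names that have been recorded in the vocabulary, but not an abbreviation such as ‘29 Div’ unless someone has entered it as a non-preferred label.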

To overcome this problem, Alicat’s developer added a mechanism that allows archivists to dictate their own search terms. So, when we came to test Alicat again, we found that a box had been added to the search function: the highlighted term appeared in this box, and we were able to edit the term and ask Alicat to search for a word or phrase that was more likely to return the desired term. In the case of ‘29 Div’, we knew that it was expressed in AIM25-UKAT as ‘29th Division’, so we changed the search term accordingly, and Alicat retrieved the correct entry.
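The effect of the editable search box can be sketched as follows. This is an illustration under assumptions rather than a description of Alicat's search: the `search` function, its simple substring matching, and the one-entry vocabulary are hypothetical.

```python
from typing import Optional


def search(query: str, vocabulary: dict, category: Optional[str] = None) -> list:
    """Return vocabulary terms containing the query, optionally within one category."""
    needle = query.strip().casefold()
    if not needle:
        return []
    return [
        term
        for term, term_category in vocabulary.items()
        if needle in term.casefold()
        and (category is None or term_category == category)
    ]


# Hypothetical stand-in for the relevant AIM25-UKAT entry.
vocabulary = {"29th Division": "organisation"}

highlighted = "29 Div"         # the text as it appears in the description
print(search(highlighted, vocabulary))   # []

edited = "29th Division"       # the archivist edits the search box
print(search(edited, vocabulary))        # ['29th Division']
```

The point of the sketch is that the text highlighted in the description and the text sent to the search are kept separate, so the archivist's background knowledge can bridge the gap between the two.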

In the case of our chosen topic, the First World War, this facility has allowed us to locate specific battle names in the AIM25-UKAT data. For instance, the Scope and Content field of one of our collections includes the phrases ‘Battle of the Somme, 1 Jul 1916’ and ‘Battle of the Somme, 4 Jul 1916’. A search for the term ‘Somme’ under the ‘concepts’ category returned the following suggestions from AIM25-UKAT:


Battle of the Somme (1916)

Actions at the Somme Crossings (24-25 March, 1918)

Operations on the Somme (1 July-18 November, 1916)

Thiepval Memorial to the Missing of the Somme

One of these terms, ‘Operations on the Somme (1 July-18 November, 1916)’, was added to the index. However, we also wanted to include the broader term, ‘Battles of the Somme, 1916’. The search edit function made this straightforward: it allowed us to change our search term to ‘Battles of the Somme, 1916’. Alicat retrieved an exact match, and we dragged the term into the index.

A separate issue that we encountered during the testing of Alicat was the problem of updating old index terms. When a catalogue is viewed in Alicat, any existing index terms appear in the ‘Index (access points)’ column on the right-hand side of the screen. In our case, the index terms dated from when the catalogues were first created. We required a function that would allow us to tag these terms so that they appeared on our website with URIs attached. However, the usual method of dragging these same terms from an ISAD(G) field into the index resulted in the creation of duplicates. For instance, the term ‘World War One (1914-1918)’ was already listed in the index of one of our catalogues, but we wanted to create a tagged version of this term, one that would appear on our website with an attached URI. We followed the usual process of highlighting the text and selecting the right match from the list of AIM25-UKAT suggestions. We then dragged the term into our index. The index in Alicat now appeared to have two entries for ‘World War One (1914-1918)’: presumably one with a URI and one without.

We reported this issue to Alicat’s developer and he duly provided a new feature that solved the problem. Those terms in the index that had not yet been tagged now had exclamation marks attached to them. When we clicked on the exclamation marks next to the index term, ‘World War One (1914-1918)’, we were given the option of searching AIM25-UKAT for that term. We could then select the term from the list of suggestions and drag it into the index, thereby replacing the untagged ‘World War One (1914-1918)’ with a tagged version.

The term now appeared in the index without exclamation marks – a sign that it had been tagged.
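The difference between duplicating an index term and replacing it can be sketched as below. This is only a sketch of the behaviour we observed, under our own assumptions: the `IndexEntry` class and `add_tagged` function are hypothetical, the ‘! ’ prefix merely stands in for the exclamation marks shown in Alicat, and the URI is a placeholder rather than a real AIM25-UKAT identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexEntry:
    label: str
    uri: Optional[str] = None  # None means the term has not been tagged yet

    @property
    def tagged(self) -> bool:
        return self.uri is not None

    def display(self) -> str:
        # Untagged terms are flagged, mimicking the exclamation marks.
        return self.label if self.tagged else "! " + self.label


def add_tagged(index: list, label: str, uri: str) -> list:
    """Return a new index where the tagged term replaces any entry with the same label."""
    result = []
    replaced = False
    for entry in index:
        if entry.label != label:
            result.append(entry)
        elif not replaced:
            # Swap in the tagged version at the original position.
            result.append(IndexEntry(label, uri))
            replaced = True
        # Any further entries with the same label are dropped as duplicates.
    if not replaced:
        result.append(IndexEntry(label, uri))
    return result


index = [IndexEntry("World War One (1914-1918)")]
print([entry.display() for entry in index])
# ['! World War One (1914-1918)']

# Placeholder URI for illustration only.
index = add_tagged(index, "World War One (1914-1918)", "http://example.org/concept/world-war-one")
print([entry.display() for entry in index])
# ['World War One (1914-1918)']
```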

By performing the functions described above, Alicat enables archivists to enhance their catalogues simply and efficiently. No doubt, as further refinements are made, additional features will appear.